Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello, gophers and all those who are interested in Go.
Today I am going to talk about how to concentrate on business
logic by hiding transaction management and using the repository
and unit of work patterns. A small piece of information:
you can find the presentation using the QR code at the bottom
left of the slide. All QR codes in
the slides are links. Let's continue.
I work at Avito, one of the most popular classifieds sites in the world.
Our main language is Golang. We use it for more than
a thousand microservices. We use Golang almost
everywhere: in cloud and network services,
command-line interfaces, DevOps and web development,
and we handle about 300 million visits
per month. That's cool, but we write applications
with business logic, not benchmarks.
My team and I work on smartphone reselling.
That complex domain includes more than 20 states in which a
phone passes from a seller to a buyer through verification,
repair, adding a warranty, and delivery. Nevertheless,
I will use another domain as an example on the
slides: the well-known domain of a shop.
Unfortunately, we cannot directly use approaches from other languages
in Go because of the error handling. A small
disclaimer here: most of the error handling on the slides
is hidden instead of full error processing; I will
hide the error checks from the slides. Then,
Go is not a conventionally object-oriented language:
Go has a unique design of interfaces,
no inheritance, no protected access modifiers,
and so on. Also, Go is a young
programming language, and there are no ready libraries
for building an enterprise application. Despite these reasons
and the differences between Go and other languages, we can use some
concepts from them. Let's see the repository pattern
in Golang. The repository has a simple interface and
hides saving, getting and mapping data from
the database, which helps us concentrate on the business
domain instead of the database. We don't add validation
of usernames, product counts in an order,
or other business rules, because the implementation of
the concrete repository is not a part of the domain.
The repository should work with only one model. If we
add several models to the repository, we create a
super-repository, which is difficult to modify,
extend and test.
The sequence diagram for any repository looks like
this. When we call the repository's methods,
we convert data from the domain
view to the database view or vice versa. Then we receive
or save data in the database.
However, the repository can use the data
mapper pattern if we have a complex model,
and then the repository will work with a data mapper
instead of calling the database directly.
The description is simple, but what is inside?
We slightly modify the repository interface
for our model, and that's still simple. Then suppose we
have a user with a username, password and other data, and
each user has a profile with information about the avatar, first
and last names, et cetera, and this
knowledge belongs to the user model. However,
we store this data in the database as two different tables.
The repository includes the machinery to hide the difference and
helps us think more about the domain, not the database.
On the left side of the slide you can see the domain
view of the model, and on the right side you can see
the database view of the model. You can
see that they are different, and the repository helps us
hide it.
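As a sketch of this split, the two database rows and the single domain model could look like this (the field names are my assumption, not taken from the slides); the conversion function is the machinery the repository hides:

```go
package main

import "fmt"

// Database view: two tables, two row structs.
type userRow struct {
	ID       int
	Username string
}

type profileRow struct {
	UserID    int
	FirstName string
	LastName  string
}

// Domain view: one User with a nested Profile.
type Profile struct {
	FirstName string
	LastName  string
}

type User struct {
	ID       int
	Username string
	Profile  Profile
}

// toDomain hides the difference between the two views; this is the
// mapping the repository performs after reading both tables.
func toDomain(u userRow, p profileRow) User {
	return User{
		ID:       u.ID,
		Username: u.Username,
		Profile:  Profile{FirstName: p.FirstName, LastName: p.LastName},
	}
}

func main() {
	user := toDomain(
		userRow{ID: 1, Username: "gopher"},
		profileRow{UserID: 1, FirstName: "Go", LastName: "Pher"},
	)
	fmt.Println(user.Username, user.Profile.FirstName) // gopher Go
}
```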
Let's see an example. This is the getting method. I use
sqlx to reduce the code for mapping data from
the database. After the app receives data,
we have the database view in the user
row and the profile row. Then we convert the data into
the domain view as a single user model with a nested
profile model. In the save method, we convert
the data into the database view and then save the data.
I use an insert and an update in one query
to fit it on one slide. We use two tables,
so we need to add transactions to prevent corrupting
data if a problem happens in the gap between the two
updates. A disclaimer: I
simplify the models to a few fields in the
next slides to fit the code on a slide. Now
we have a user repository. Next, the business tells us
to register a user. We create the
register method, add a validation method,
and save the model with the repository.
Then the product team tells us that we must notify the
new user about registration. For that purpose,
we publish a message in a queue.
However, we can catch an issue when
the queue goes down: we save data in the database but
lose the message. This situation is unacceptable for us.
Let's cover the saving of data and the publishing with a
transaction to prevent it. These are just two lines:
the first one begins the
transaction and the other one commits it. But we
should explicitly pass the transaction to the
repository and the publish method.
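A rough sketch of that explicit passing, with a toy transaction type standing in for sqlx (all names here are my illustration, not the slide code). Note how both the repository and the publisher signatures now know about the transaction:

```go
package main

import "fmt"

// Tx is a stand-in for a database transaction: it buffers writes
// until Commit, so both operations land atomically.
type Tx struct {
	pending []string
	saved   []string
}

func (t *Tx) Commit() error {
	t.saved = append(t.saved, t.pending...)
	t.pending = nil
	return nil
}

type User struct{ Username string }

type UserRepo struct{}

// Save must receive the transaction explicitly; this is the extra
// knowledge that leaks into the repository interface.
func (UserRepo) Save(tx *Tx, u User) error {
	tx.pending = append(tx.pending, "save user "+u.Username)
	return nil
}

// Publisher also needs the transaction to stay atomic with the save.
type Publisher struct{}

func (Publisher) Publish(tx *Tx, msg string) error {
	tx.pending = append(tx.pending, "publish "+msg)
	return nil
}

// Register is the use case: save, publish, commit.
func Register(tx *Tx, repo UserRepo, pub Publisher, u User) error {
	if err := repo.Save(tx, u); err != nil {
		return err
	}
	if err := pub.Publish(tx, "registered "+u.Username); err != nil {
		return err
	}
	return tx.Commit()
}

func main() {
	tx := &Tx{}
	_ = Register(tx, UserRepo{}, Publisher{}, User{Username: "gopher"})
	fmt.Println(len(tx.saved)) // 2: both operations committed together
}
```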
Passing the transaction into the repository complicates the
code with additional knowledge. They say knowledge is power.
However, in this case the additional knowledge is not about
power; it's just complication for complication's sake.
And we change our repository interface by adding knowledge
about the database transaction. That makes our repository non-ideal.
But we get atomic registration. It's not a big
deal. We now have registration, but the app
can only create a user. Businesses want to sell
goods, and we already know the pattern, so we easily make a new
repository with getting and saving methods.
Then we create a new use case with validation,
saving and message publication in a
transaction. And now we see our elegant solution
with transactional registration and buying. While
the developers are daydreaming, being perfectly
content with the result of their work, the business comes
to them and demands a new scenario to increase the conversion
of new users into customers: the purchase
without completed registration. That is a technique where
users can buy things on the site without authorization,
by just typing an email or phone number.
Let's look at the slides. An attentive listener may have
already noticed the issue in the code.
The solution is the same as for the two previous scenarios:
we add transaction control and pass the
transaction into the inner use case. Additionally,
we complicate the transaction control inside the use cases.
The two lines are changed to an if/else to decide
whether the flow is on the top level or it
was called by something else, and that's not good.
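The if/else that nesting forces on us can be sketched like this, with toy DB and Tx types standing in for the real ones (my illustration, not the slide code):

```go
package main

import "fmt"

// Tx and DB are minimal stand-ins for sqlx types.
type Tx struct{ committed bool }

func (t *Tx) Commit() error { t.committed = true; return nil }

type DB struct{ last *Tx }

func (db *DB) Beginx() (*Tx, error) { db.last = &Tx{}; return db.last, nil }

// Buy shows the complication: the two transaction lines become an
// if/else because the use case may run on its own or nested inside
// another scenario that already opened a transaction.
func Buy(db *DB, outer *Tx) error {
	tx := outer
	own := false
	if tx == nil { // top level: open our own transaction
		var err error
		if tx, err = db.Beginx(); err != nil {
			return err
		}
		own = true
	}
	// ... business logic: save the order, publish a message ...
	if own { // commit only if we opened the transaction ourselves
		return tx.Commit()
	}
	return nil // the outer scenario will commit
}

func main() {
	db := &DB{}
	_ = Buy(db, nil) // top-level call
	fmt.Println(db.last.committed) // true

	outer := &Tx{}
	_ = Buy(db, outer) // nested call: the outer tx stays open
	fmt.Println(outer.committed) // false
}
```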
But as we know, with great business logic comes
great legacy. Let me briefly
recap what we have here. First, we have
nested transactional use cases, and that's cool.
However, the transactions spread everywhere.
That forces us to duplicate the routine code to
control them. Duplication adds an increased
chance of making mistakes. Also, if we want
to change the database, we must change the app in many places.
Now I want to rephrase our issues as our wishes.
The first one is an ideal repository for simple
work with the database. Second, the use cases are
transactional and nested: I want to save different models
in a single scenario without thinking about how to
open and close the transaction and roll it back on error.
Also, I want to hide transaction control, and
finally, I want to replace the database easily.
We already have some wishes done, such as nested transactional
use cases, but we need to get the others.
Let's try to hide transaction control.
The simple solution is using a closure, but where
should we place it? We can add the closure in the repository:
we call our scenario in the transaction, get the user
and the order as a result, and
then save them. However, I hope that
you will be skeptical about this solution.
First, we still have to pass transactions into the repositories,
in this case into the order repository.
Another problem is that for each new scenario
we need to code a new closure with a new range of models.
And the last problem is that over time our
scenarios will become more complex, and the user repository
would become too smart: it would know about almost every
model in the app. As a result,
we would need much more time to extend and
test our app's functionality. However,
we can move the closure into a separate function, and at
first sight WithTransaction is just copied code from the
use case. Take a look at the transaction control placed in
the closure and the register scenario.
Let's simplify our scenario. With the closure, we
replace several lines with one and decrease the
chance of creating a bug. The code now
fits entirely on the slide and looks better than it was,
and I hope you agree with me. However, our code
is still bound to the sqlx.Tx transaction.
As a result, we can only change the database by
rewriting all the use cases, and somebody can make
mistakes when handling control of a transaction.
Next, I want to hide passing the transaction into
the repository. To do that, we can use a
factory method on the repository which enriches
the repository with a transaction: the factory method
gets the transaction and saves it in the repository,
and then we can use the saved
transaction in our methods.
The code in the getting and save methods changes slightly:
we replace one if condition with a
single line. Nevertheless, I am eager to
see how our use case looks after the
updates. We call the transactional
repository method, and otherwise the use case is not
changed. Is that better than it was?
Unfortunately, no. We add a time dependency:
we must call repository methods only after the factory method,
and I don't think that's cool. Could we use
reflection in this case to remove the time dependency?
Let's use a function with repositories as arguments
in WithTransaction. We pass it into our closure,
then the reflection retrieves the list
of repositories and calls the method
with the transaction to enrich them.
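A minimal sketch of that reflection trick (the WithTx method name and the types are my illustration): we look up the enriching method on each repository and call it with the transaction.

```go
package main

import (
	"fmt"
	"reflect"
)

// Tx is a stand-in for a database transaction.
type Tx struct{ ID int }

// UserRepo has a WithTx method that reflection will discover and call.
type UserRepo struct{ tx *Tx }

func (r *UserRepo) WithTx(tx *Tx) { r.tx = tx }

// enrichWithTx walks the repositories and, for each one that has a
// WithTx(*Tx) method, calls it with the current transaction.
func enrichWithTx(tx *Tx, repos ...interface{}) {
	for _, repo := range repos {
		m := reflect.ValueOf(repo).MethodByName("WithTx")
		if !m.IsValid() {
			continue // this repository does not support enrichment
		}
		m.Call([]reflect.Value{reflect.ValueOf(tx)})
	}
}

func main() {
	repo := &UserRepo{}
	enrichWithTx(&Tx{ID: 1}, repo)
	fmt.Println(repo.tx.ID) // 1
}
```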
As a result, we remove the argument with a
transaction from the repository methods. However,
reflection is not a silver bullet, and we still
have to pass the transaction into the use cases and the
queue. Also, explicit passing spreads knowledge
about the database through the application. So where
can we store a transaction? Let's see how other
languages solve this issue.
In Python, the passing depends on the library.
In SQLAlchemy, we pass the transaction explicitly
as a function argument, and we already have
that solution. In Django,
transactions are stored in a global variable, because Django
processes one request in one thread.
PHP uses the same approach because it does not have
multi-threaded processing at all. Unfortunately,
it's not our solution at all, because
fortunately Go can work with multiple threads via
goroutines. More enterprise languages such as
Java or C# use thread-local storage, which
is similar to the global variable but limited to a single thread.
But where do we store a transaction in Go? Passing
it as an argument is not our solution: we want to
hide the transaction to provide loose coupling of use
cases with the database. We have goroutines, and
Go doesn't support built-in goroutine-local
storage. Could we create a similar solution
keyed by goroutine ID? The answer is probably no.
Golang does not allow us to get a goroutine ID directly.
Go experts don't recommend using goroutine IDs
because it contradicts the Go way: they
want to prevent building applications that associate
all computation with a single goroutine. So we would have to
use hacks to get an ID. Also, new language
updates can break the hacks, and no one
can guarantee the stability of the solution.
However, some libraries implement goroutine-local
storage, but all of them are built on the
hacks, and most are not maintained.
Fortunately, we are not in 2016, and we
have the context package, which can be used
to store the transaction. However, there are
opinions that storing anything more than primitive types
in a context is a bad idea.
Does that mean we should reinvent the wheel?
Fortunately not: in some expert articles
there is an exception for specific
values that are scoped to the request and destroyed after the
request. That gives us the ability to
use the context to store a transaction inside
the context. Excellent. We have a place
for our transaction. Let's look at our closure.
That's not so bad: we check for a transaction in the context,
and if there is none, we create and put a new transaction into the
context. In the use cases, we replace the transaction argument
with a new context. Then we add
getting the transaction from the context in the repository to work
with the database. Also, we create a
TR interface to replace the direct
database connection with a transaction and vice versa.
Now we have a simple repository interface without the explicit
transaction argument, and nested transactional use cases.
We can forget about transaction control by using the transaction
closure. Additionally, we concentrate the work
with the database in the closure and repository,
and we can replace the database without changing the use
cases. However, we have a problem with the closure:
it is difficult to test, because we cannot create a mock or
stub for a function, and a global variable
works with the database. Also, we need to
rewrite use cases to add a new closure or replace the
current one. Let's fix it. We must
convert the WithTransaction closure into a structure
with an interface. The structure allows
us to create a mock and pass the database as an argument instead
of a global variable. Let's name the interface Manager.
The Do function calls DoWithSettings with default settings,
and DoWithSettings controls a transaction like WithTransaction,
with additional features such as nested
transactions, read-only transactions, timeouts,
et cetera. Then we introduce a
general interface, Transaction, which can commit,
roll back, show the transaction status and return
the actual transaction. Also we have an interface
to create nested transactions if a database supports
them. Also, we add the Settings
interface to store standard configurations for different databases.
We implement new settings for each database transaction,
because different databases have their own configurations and
abilities. We cast the settings to the
specific interface to configure the transaction
in a factory. It's time to see what we have
in the use case. The manager replaces WithTransaction,
and visually almost nothing changes. However, we can
mock the manager or use stubs for testing.
In Java, we can hide the transaction from the use case with an
annotation or XML configuration.
We can do the same in Go with reflection, code
generation, some tricks with generics, or just a decorator.
A simple decorator is on the slide, but we would have to create a new
decorator for each use case. It's not hard,
but it's so boring. Let's try to
use a generic decorator. To get a
generic decorator, we should use a trick. The trick
consists of using a structure for the
arguments of the use case and naming the method
identically in all use cases.
In our case we use the name Handle.
Then we create an interface which matches our
use cases. The structure gives us the ability
to create a generic interface which can work with any
use case. After that, we implement
a decorator by using the interface and generics.
The decorator is not ideally generic,
but we can easily implement it
and remove routine code. The
use of the decorator is simple and
presented on the slide.
Then we can simplify getting the transaction from the context in
the repository. Let's create an
interface to extract the transaction. The default
method returns a transaction by the default context
key, and the by-key method returns one by a
custom key. That is
necessary when the repository processes two transactions simultaneously,
but be careful with that. Also, when we create
the context manager, we set the default key to have the ability
to change the key without changing code in the repository.
However, we then have to cast the transaction to the transaction
structure of the database to work with it, and casting
is not safe. We can create an interface for each
database to skip casting in the repository. In the
first iteration, the saving methods look like that.
Let's use the SQL context manager in the repository.
This simplifies the code a bit and reduces
the chance of error. Let me briefly recap
what we have. We kept the repository interface simple
by hiding the transaction in the context. We got
transactional operations, nested use cases and
hiding of the transaction behind the transaction manager.
And also we can migrate to another database without changing
use cases. And finally, the solution does
not create a problem with testing.
But what did it cost? Fortunately, not
everything. The solution works only from Go
1.13, which was
released in 2019. It was not
so long ago. Nevertheless, Go is updated
by minor versions, and I hope you have already updated, or
it is not a problem for you.
Also, there are a few ready transaction adapters,
such as sql, sqlx, gorm,
mongo and redis, but you can write
a new adapter in about seventy lines of code.
The next disadvantage is losing performance.
The first benchmark showed that the difference with and
without the solution is about 3.4%.
The result was impressive, but the reason was sqlmock, the
library for mocking SQL requests:
sqlmock consumes a lot of resources.
Therefore, I decided to rewrite the benchmark on
SQLite in memory. The result was more natural, and
the difference is about 18%. However,
most applications which I have seen use databases that
store data on disk or are located on another
server. For that reason, I
wrote a benchmark with MySQL on the
same server. The file system added overhead,
and the solution takes the same amount of time as
code without the transaction manager.
The network would consume significantly more time than the file
system, which means the overhead of the
solution would be minor. Another disadvantage
is that we should pass the context everywhere. I hope you
already use context in your applications to store a request ID
or other data for logging, or to cancel a request if a
user closes a connection. Therefore, passing the context
is acceptable for you. The last and most
considerable drawback is that we cannot do a long business
transaction, because the transaction in the database takes
a connection and limited database resources.
That can happen when we call external services inside the transaction.
The simple solution is to request all data before
the transaction, but then we lose the ability to simply
insert one use case into another.
Another solution is the unit of work pattern, which I
will discuss later. Now we can repeat the drawbacks.
The first is the limitation on the
Golang version with which the solution can work.
The next one is that there are only five
adapters, but there are more than five database drivers
and ORMs in Go; however, about seventy new
lines can solve that. Third,
the solution consumes about 17% more time than the code
without it, or five microseconds.
Next is passing the context everywhere. And finally,
the solution does not support long business transactions,
but the unit of work pattern can solve that.
Let's take a look at what was and what is
now in the code. We removed the stored connection
to the database from the repository interface by hiding
the transaction in the context. In the repositories,
we replaced checking for an existing transaction with
the context manager. The code is shorter than it was,
and now it's harder for us to make mistakes.
Then we can skip a massive block of database infrastructure
to focus on business actions in the
use case. The chart on the slide
shows the dependence of the number of additional lines on the
number of nested scenarios. The red line presents the data
without the solution; the green line is for the solution.
The crossing point is 1.8, after which raw code
loses in comparison to the solution
regarding the amount of additional code in a use case.
In addition, not only the amount of code
grows when adding nested use cases; the chance
of making bugs and the cognitive load on
developers increase too. Nevertheless, I give
you a tool, and you decide whether to use it
or not. The link to the library is on
the last slide. Now I want to go back
to the long transaction drawback. I mentioned that
the unit of work could solve it. The interface of
the pattern for Golang is on the slide. Let's describe
each method. RegisterNew
marks a model as new; RegisterDirty marks
a model as dirty, or updated; RegisterClean marks
a model as clean, that is, fetched from the database without changes;
and RegisterDeleted marks a model
as deleted. Then Commit saves the
data atomically in the database in one transaction,
and finally Rollback resets the state
to the initial one or to the previous successful commit.
Okay, we know what each method does,
but how and when can we use it?
Let's see the definition to catch it.
Martin Fowler defines the pattern as maintaining a list
of objects affected by a business transaction and
coordinating the writing out of changes and
the resolution of concurrency problems.
The first fun fact about the definition: it mentions concurrency
problems, which are associated with the database,
but they are resolved by a pessimistic or optimistic
lock, and that is not a part of the unit of work. The second
one: in the original book, Patterns of Enterprise
Application Architecture, where the pattern was described,
there is no rollback method.
Let's understand what the pattern gives us. First,
it gives batch changes in the database,
which can be significantly faster than sending each change
individually. We can optimize inserts with
INSERT INTO ... VALUES with multiple rows, and
other commands, by removing the network overhead of
sending each command. Also, we can update only
the changed data, even if we change the
model several times.
Second, a business transaction can be long and
not depend on a database transaction.
And last, our update is atomic.
What are the disadvantages? First, we can't use a
pessimistic lock. Sorry, we can, but then we lose
the ability to have a long business transaction, because the pessimistic
lock uses database transactions. Second is
complexity. At first glance,
the pattern's interface is simple, but the implementation is
not. Let's look at the UML class diagram
of the pattern. The first block is our pattern.
The next block is the identity map pattern, which stores our
models after they are registered in some
state: new, clean, dirty or deleted.
Further, we have classes to work with the database transaction. I
prefer to replace it with an interface to unchain
from the database: the interface helps us
use the transaction manager which we wrote.
The next part of the pattern is the mapper registry, which
returns a suitable data mapper for a model. And the data mapper
is a pattern which maps data from models to the database
view and vice versa. Some implementations of the data
mapper in other languages use reflection to work
with any model, and they support configuration
by YAML, XML, annotations, et
cetera. Additionally, if you want to implement the pattern,
you should remember the problem with auto-increment
identifiers: the application can get
the identifiers only after saving the data in
the database. You can use generated identifiers
on the app side as a simple solution.
We finish depicting the UML
class diagram. Next, let's see an
example. Order and Product are our models,
and the pattern coordinates the saving of their changes.
The order function is our business transaction.
We check if the user exists in the external service.
Then we create a new order and mark it as
new. After that, we get and mark the product as clean.
Then we write off the product from our warehouse
and mark it as dirty, or updated.
And finally, we commit the changes to save them in the database.
On the slide you see the sequence diagram for our
use case: we get, mark, change and save data.
However, the most interesting part is hidden
in the commit call, because all the magic happens there.
When we call commit, we create a
database transaction. Then we get the data mapper
for a model; in our use case it
is the order. After that, the code executes
batch changes in the database.
Finally, we commit the transaction.
I hope I have explained the unit of work pattern and
shown its complexity.
Fortunately, we can simplify it to the interface
on the slide: we remove the identity map, mapper registry and
data mapper patterns. The implementation still
has a long business transaction and an atomic update,
but we lose batch updates, because we work
with callback functions. Additionally,
the model will be updated multiple times instead
of once. That means if we add three
commands updating the same model, we send
three queries to the database.
However, the original implementation optimizes that.
Nevertheless, this implementation even fits on
two slides. We use callback functions as
commands: the register methods save commands in a
list. The command is
a part of the data mapper, which we simplified.
When our business transaction is finished, we call
all the commands in the Commit method.
Also, we can replace the commands with queries or
operations of some databases, so that we get
batch updates back. Then we add an
interface, DB runner, to send queries to
a database as batch data.
However, the model will still be updated multiple times instead of
just once, and we have a problem with auto-increment
entities. The next step is a complete
unit of work pattern with blackjack: an identity
map, a fully implemented data mapper and a mapper
registry. Unfortunately, there is
no ready library to which we only need to add
the data mapper for our model. However,
a library named work tries to solve that problem
for SQL databases. Also, you can help the Go
community and implement a universal unit
of work pattern. In conclusion, I want to recall
what we have covered. The first is the repository: it is
used when your application grows and the data in the application
differ from the database view. Then we use
the transaction manager to get nested transactional use cases
and hide the knowledge about transactions in them.
And finally, if you want a long business transaction,
the unit of work is your solution.
Thank you for your attention. The source code of the
examples and the libraries are accessible via the second
link. Please press the like button
if the presentation was helpful for you, or
write comments if I missed something. I wish you
simple writing of business logic. Good luck.