Transcript
Hello everyone. My name is Maxim Skov and I work as a lead microfrontend engineer at Yandex. Let's start by thinking about
what comes to mind when you hear the word monolith.
Most likely you think of something huge, indivisible, and heavy. That's totally fine. But when it comes to software development, it's not exactly something you want to work with. You probably remember a project with a large code base that was very difficult to understand and also challenging to make changes to. The complexity of such projects tends to increase rapidly as the code base grows. In other words, these projects don't scale well.
On the other hand, monoliths are often compared to
microfrontends, which are frequently discussed at conferences,
giving the impression that this is the best way to build
large scale applications without turning the development process into a nightmare. In this talk, we'll explore whether that's true or not. We'll also delve into the main issue with monoliths and how microfrontends can help us address it.
The term glorious in the title signifies a good
monolith, suggesting that not all monoliths are equally bad
or bad at all. Next, we will delve into the topic of building an effective and scalable monolith. By the end, the aim is to understand when it becomes necessary to seriously consider adopting microfrontend architecture, and to determine the extent to which we can rely on the monolith architecture without experiencing any development issues. So why did I
decide to make this presentation? Well, it's pretty simple.
I've been in web development for over 13 years,
and during that time I've become proficient in building new
products and maintaining large scale existing code bases.
For the past three years, I've been leading the infrastructure team at
Yandex Direct, which is a major online advertising
platform. One of the challenges we faced was the scalability of the project. And here's the interesting part: we successfully solved it without adopting microfrontend architecture.
We still have a monolith, and I want to share my experience and
what we ultimately came to. So let's dive in and
explore the world of monoliths. But before we start,
let me share some numbers about Yandex Direct so you can better
understand the scale at which our monolith successfully
operates. Yandex Direct's code base is over 20 years
old, and during this time, a massive amount of code
has been written. When I joined the project a couple of years ago,
the application had several million lines of
code, and that was just the front end part. At that
time, every developer, including myself, keenly felt the pain of working with a large monolith that didn't want to get any bigger. In other words, we found that the
project architecture doesn't scale well to the size of
our code base. So what challenges does a team face in such a situation? Usually in such projects, several things happen.
First, the project reaches a point where it becomes too
complex for one developer to fully grasp. It becomes challenging to
understand all the code and system functionality.
This complexity makes developers hesitant to make changes
to the code, as even a minor modification
in one part of the system can potentially break important
functionality in another one. Writing new code becomes
difficult too, as it's not always clear where to edit.
Multiple solutions to the same problem may exist, as it's often easier to write new code than to understand the existing code. Moreover, to solve any task,
developers need to immerse themselves in a vast amount
of code base, which takes a significant amount of time.
As a result, the project becomes difficult to move
forward and the development process slows down,
leading to a decrease in productivity and in product quality
too. The chart illustrates how team productivity
decreases as the code base grows. The main reason for
the decrease in productivity is that the complexity of
development in the project is growing much faster
than the size of the code base. And this happens because
the main contribution to the system complexity is
not complex algorithms or complex user interfaces,
but the high coupling between different parts of the system.
For example, in the worst case, two modules or functions depend on each other, resulting in two connections or dependencies. With three modules, the number of connections can increase to six, and with four modules it can go up to twelve.
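To put rough numbers on it, in the worst case where every module may depend on every other one, the number of possible directed dependencies among n modules grows quadratically:

```latex
% Worst case: every module may depend on every other module
C(n) = n(n-1), \qquad C(2) = 2, \quad C(3) = 6, \quad C(4) = 12, \quad C(10) = 90
```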
This pattern continues until we end up with a big ball of mud, where the boundaries of any functionality are completely blurred.
And to solve any task, we have to delve into the entire code base. So we need to address this issue and aim to reduce the coupling. However, since a monolith is physically a single code base, there are usually no restrictions on importing code from anywhere, leading to a rapid increase in the number of interdependencies. To regain control over the system complexity, we must learn how to manage the coupling in our code base. And here come microfrontends.
You may have already heard a lot about them, but if not, it's the idea of dividing an application into several smaller applications, like embedding a widget that displays currency exchange rates on a news website.
And there are many implementation options from a technical point of view.
But I won't go into detail about that.
What's important is the concept itself, where we divide one big application into several smaller applications that are independent of each other. This separation provides
us with several advantages. Firstly,
since each microfrontend is a separate application with its own code base, often in a separate repository, we can no longer simply import a function from a random file when we need it. These applications usually communicate
with each other according to specific rules, such as through a
public API, events, or something similar. As a result,
the coupling between these applications will be significantly less than
between different parts of a monolith. Also,
dividing the code into multiple applications, on the one hand, leads to lower productivity at the beginning of a new project, as it incurs additional constraints and significant expenses in preparing a more advanced infrastructure.
But on the other hand, as the project evolves, we will
experience fewer scaling problems than we would in a monolith.
Making changes to smaller applications is easier, resulting in a
less significant decrease in the team's productivity over time compared
to a monolith. Another advantage of dividing a monolith
into different applications is the ability to distribute responsibility
for different parts of the project among different teams.
These teams will have the freedom to develop their own processes,
choose technologies, define style guides, and determine release cycles. As a result, teams can work at their own
pace, with some applications being released more frequently
than others. What's more, if a bug happens in one application, it won't require rolling back releases from other teams, which can be quite a painful process. Well, it turns out that microfrontend architecture offers great advantages and scales better than monoliths, and there is even a certain hype around this
topic that makes us want to try it in our projects.
But does this mean that microfrontends are a good
fit for every application? I don't think so, because adopting microfrontends brings not only important benefits,
but also comes at a cost that is important to
be aware of. Firstly, everything becomes more complicated.
To connect all microfrontend applications into a unified application, a complex infrastructure is required. Bundling becomes more challenging, and you need to have a deep understanding of your bundler's capabilities. It will also be necessary to decide how applications communicate with each other and to prevent any conflicts between them on a single page. And if you need server-side rendering, the complexity of the infrastructure increases significantly, as it requires a distributed system of microservices, which has its own set of challenges. It's also necessary
to think more about backward compatibility in releases.
It's important to always remember not to break contracts between microfrontend applications. It will also be more difficult to update shared dependencies. Trust me, if you've ever tried to update the version of React across a bunch of microfrontends, you know it's not the easiest thing to do. And besides, it's not always necessary to split an application among
multiple teams. Maybe you have a relatively small
product that is being worked on by just one
team, or maybe you want to have consistent processes and technologies across all teams so that you can easily transfer engineers between teams and focus them on critical tasks at any given time. So what should we do if the monolith doesn't scale well and microfrontends are complex and often redundant?
Well, there is a solution. We can try to take the
best of both worlds. By combining the advantages of monoliths
and micro front ends, we can create an architecture that can
handle the growth of the code base without incurring significant
infrastructure costs, especially at the beginning of a new project. In this case, productivity at the beginning
might be slightly lower than in a traditional monolith, due to the need to set up the architecture and development tools. But as the project evolves, productivity won't decrease as significantly as with a traditional monolith.
It will be more similar to that of microfrontends.
To achieve this goal, we can keep all the code in one repository,
which allows us to easily reuse our code, have one
release, and not worry much about backward compatibility.
And it also helps us avoid complicating the development,
deployment, build or maintenance of our application.
At the same time, we need to preserve the advantages of microfrontends, which enable us to control coupling and system complexity.
In other words, we need to find a way to divide the code
into smaller and isolated parts. And if you think about it, what prevents us from doing this in a regular monolith? The only thing that actually stops us is the fact that in the code base of a regular monolith, we have complete freedom of action. A glorious monolith differs from a regular one in that it has a well-thought-out architecture, and compliance with this architecture is enforced by automated tools such as linters. So it's important to understand
what architecture exactly is. Software architecture is
mostly about rules and constraints that lead to the creation
of a flexible and scalable system.
So let's talk about the rules and constraints that are
essential for building a good monolith. It's worth starting
with the introduction of the module concept. The main idea is that we need to divide our code base into separate and loosely coupled parts, which we will call modules. The clear boundaries and weak dependencies between these modules are exactly what allow us to create a scalable system.
In such a system, each module will be responsible for a specific
product or technical functionality, depending on the
application. Examples of modules can include a cash balance module, a data filtering module, maybe something bigger like the sidebar of an application, or even an entire page.
Typically, modules responsible for a large amount of
functionality are assembled from smaller modules.
Externally, this looks very similar to a microfrontend architecture, but we are still within the same code base and do not incur the infrastructure tax of microfrontends.
But unfortunately, it won't be sufficient to just distribute the entire code base into different directories and consider those directories modules. To create a good monolith,
modules need to meet certain requirements. These requirements are all about how modules are structured, how they are isolated from each other, and how they communicate with each other.
Let's start with how things are structured inside modules.
Inside a module, there can be everything that exists in a regular
monolith application, for example, UI components,
styles, the business logic of an application,
and even technical things like libraries and frameworks.
In general, all the code that exists in an application should
be contained within one of the modules. Additionally,
it's important that each module is implemented in a similar way,
otherwise the team will have difficulty switching between the development
of different modules. It's a good idea to limit both the set of technologies and the high-level directory structure.
This will allow us to logically separate the code of each module into several segments, each with its own area of responsibility. For example, you can have four segments like
on the slide, which I took from the Feature-Sliced Design methodology.
However, depending on the project needs, you can come up with your
own set of segments; for example, you can add a separate segment for server-side code, which can contain API endpoints, database connections, and so on. The key
is that the entire team clearly understands where to put new
code and where to find existing code.
Unfortunately, it might be difficult to perfectly synchronize everyone's
understanding of what each segment is responsible for.
Each engineer might have a slightly different interpretation.
For instance, let's imagine: can a state manager inside the model segment show the user a pop-up notification about the successful completion of a data saving operation?
Well, someone might say yes, why not? While someone else might say no, that's the responsibility of the UI segment. So we can resolve this ambiguity
by introducing import restrictions between different segments,
like in the layers of the clean architecture. In particular,
what we can do is to put all the segments in
order and prevent lower segments from depending on
higher segments. For example, the UI segment can use both the model and API segments, as well as utility code. On the other hand, the model segment can only use the API segment and utility code, but it cannot use anything from the UI.
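As a small sketch of these restrictions (module and file names here are hypothetical, with a lib segment standing in for utility code):

```ts
// src/modules/payments/ui/PaymentStatus.ts — the UI segment may look "down"
import { selectPaymentStatus } from '../model/selectors'; // OK: ui -> model
import { formatMoney } from '../lib/format';              // OK: ui -> lib (utility)

// src/modules/payments/model/selectors.ts — the model segment may not look "up"
import { fetchPayments } from '../api/payments';           // OK: model -> api
// import { showNotification } from '../ui/notifications'; // forbidden: model -> ui
```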
Alright, these simple rules highlight the code's responsibility and help us put it in the right place in the module. This makes decision-making much easier and also comes with additional benefits.
First, all the related code will be in one place within the
module. That makes it easier to understand what the module
is supposed to do and how it works. Secondly, it prevents the
mixing of business logic and UI, which leads to more flexible,
composable, and easy-to-understand code. All of this makes it way simpler to work with a large code base and switch between different modules with ease during development.
To sum up, the strict structure of modules lets us effectively
solve any task in any part of the system,
even in modules we are not familiar with.
The next step is to isolate a module from the rest of
the system and other modules. This is a key part
of the rules that helps us achieve the two main goals. First, it makes the module loosely coupled to the other modules. We achieve this by allowing other modules to use only the functionality that the module developer has specifically prepared for this purpose. And since other modules won't have access to all the inner workings of the module, the number of dependencies between modules won't grow as quickly as in a regular monolith. At the same time,
the dependencies themselves will be more obvious and controlled.
Which brings us to the second goal, the ability to make changes
to the module safely. The developer can confidently
make changes to one part of the system without worrying about
unexpected bugs in other parts of the system. And to
gain that confidence, modules need to be isolated at every
level from code to styles and data. So when
it comes to code isolation, it's basically about two things,
making sure there aren't any global side effects in your module
and controlling what functionality is exposed to other
modules. What do I mean by global side effects?
Basically, it's anything that can implicitly change the
behavior of other modules. For example, if our
module patches some global objects or loads polyfills, it can cause other modules to rely on this behavior.
And if the loading order of modules suddenly changes, or we decide to remove some legacy modules, these dependent modules will stop working correctly. That's why global side effects are highly undesirable and should be avoided.
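To make this concrete, here is a hedged anti-example of such a side effect: a module that patches a built-in prototype the moment it is imported.

```ts
// Anti-example: importing this file silently changes Array behavior app-wide.
declare global {
  interface Array<T> {
    last(): T | undefined;
  }
}

Array.prototype.last = function <T>(this: T[]): T | undefined {
  return this[this.length - 1];
};

export {}; // makes this file a module so `declare global` is allowed
```

If the module that loads this patch is ever removed or loaded later, every other module calling last() breaks.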
And when I say controlling what functionality is exposed, what I really mean is a set of rules which gives us a way to have one entry point for a module and treat it like a contract for how the module interacts with other modules in the system. We call it the public API of the module. So the first thing we need to do to implement such a
public API is to create an entry point in each
module, like an index file at the root of your module.
In this file, we'll define everything that is available for
use in other modules. Then we can set up a linter that will prevent other modules from importing anything except that index file. For this, we can use ESLint and ready-made plugins such as eslint-plugin-boundaries.
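A minimal sketch of such a setup, assuming eslint-plugin-boundaries with ESLint's flat config and a hypothetical src/modules/<name> layout:

```ts
// eslint.config.ts — allow importing a module only through its index file
import boundaries from 'eslint-plugin-boundaries';

export default [
  {
    plugins: { boundaries },
    settings: {
      // Treat every folder under src/modules as a "module" element
      'boundaries/elements': [
        { type: 'module', pattern: 'src/modules/*', mode: 'folder' },
      ],
    },
    rules: {
      'boundaries/entry-point': [
        'error',
        {
          default: 'disallow', // nothing inside a module is importable...
          rules: [{ target: ['module'], allow: 'index.ts' }], // ...except index.ts
        },
      ],
    },
  },
];
```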
And how can we describe the public API of a module inside that
index file? There are a few options here, mostly depending on the framework you use in your application and your personal preferences.
You can simply use ES modules and re-export part of the module's functionality. Alternatively, you can use the dependency injection principle, especially if you are using Angular or similar frameworks. Or you can use an event-driven architecture
and connect all the modules by communicating with events
sent to some sort of event bus. Each of these options has
its advantages and disadvantages. For example,
DI makes dependencies between modules less strong,
but it does complicate the infrastructure a bit.
On the other hand, the event driven architecture decouples
modules even more, but you need to be careful with the module loading sequence so as not to miss any important events.
Let's say we want to use ES modules and simply re-export some of the module's functionality, as shown on the slide. In this example, we are using the React and Redux stack, so we mainly export React components, Redux selectors and actions.
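For instance, such an entry point might look roughly like this (the module and export names are hypothetical):

```ts
// src/modules/cash-balance/index.ts — the module's entire public API
export { CashBalance } from './ui/CashBalance';    // React component
export { selectBalance } from './model/selectors'; // Redux selector
export { topUpBalance } from './model/actions';    // Redux action creator
// Anything not re-exported here stays private to the module.
```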
Using ES modules in this case allows us to save on infrastructure, since no additional development is required for DI or event-based architectures, and it works well with code analyzers out of the box. For example, we can easily build a dependency graph of the system and use it, for instance, for selective test execution. ES modules can also be loaded both statically and dynamically to implement the code-splitting technique, and this is available out of the box too. And sometimes we can still use event
emitters exported from a module's entry point to make the dependency between two modules as weak as possible, but only in those cases when it's not crucial to handle all the events and it's safe to lose some of them.
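A tiny sketch of such an emitter exported from an entry point (the event shape is hypothetical; note that a late subscriber simply misses earlier events):

```ts
// src/modules/cash-balance/events.ts — a fire-and-forget event channel
type BalanceChanged = { amount: number };

const listeners = new Set<(event: BalanceChanged) => void>();

export const balanceEvents = {
  // Other modules subscribe without touching any of our internals
  subscribe(listener: (event: BalanceChanged) => void): () => void {
    listeners.add(listener);
    return () => listeners.delete(listener); // returns an unsubscribe function
  },
  emit(event: BalanceChanged): void {
    listeners.forEach((listener) => listener(event));
  },
};
```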
One more thing to keep in mind is to control the size of the public API. It's better to keep it as small as possible, because a larger public API increases the chances of additional dependencies between modules, which in turn makes the system more complex. In general, there are a few factors that contribute
to the size of a public API. This includes,
of course, the number of exports from the entry point and
the number of arguments for each export. For instance,
exporting a component with a high number of arguments
makes it more difficult to use and also more difficult to understand its functionality. It's also worth paying
attention to the complexity of the data structures that a module
receives or returns. The more unnecessary data is passed
between modules, the harder it is to make changes to the internal implementation
of a module. And again, it creates more opportunities for
additional dependencies in the system, so it's
better to try to keep the public API as small as possible.
Alright, in most cases these rules should be enough to isolate code between modules, but the main thing is to enforce
these restrictions with automated checks and linters, because it's
almost impossible to remember all the rules and perfectly
synchronize them across the team. So since we are
talking about front end applications, we also have style sheets,
right? It's also important to make sure our styles are isolated, because style sheets have global scope by default and can affect everything on the page.
For instance, two modules might have the same class name,
or one of the modules might add some sort of
reset CSS and mess up the layout of all other
modules in the system. So in order to avoid
any unexpected styling issues, we need to make sure we keep
our styles isolated from each other. There are a bunch of
ways to make it happen, and we've only shown a few on the
slide. And each of these options has its own
advantages and disadvantages, and the one you choose
will mostly depend on the project's requirements and your personal preferences.
But the important thing is all of them
work pretty well for building a good monolith, as long
as we stick to some additional agreements. For example, if we go with CSS Modules, they work really well for isolating styles, but only as long as we use just classes and pseudo-classes to select elements. Using other selectors could easily cause styles to leak onto elements of other modules. Also, it's better not to import CSS files between different modules.
Instead, if we need to override some styles in another module, we can pass, for example, a class name as a component property and add it to the necessary elements.
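A sketch of that approach with React and CSS Modules (the component and class names are hypothetical):

```tsx
// src/modules/sidebar/ui/Panel.tsx
import type { ReactNode } from 'react';
import styles from './Panel.module.css'; // class names stay local to this module

type PanelProps = { className?: string; children?: ReactNode };

export function Panel({ className, children }: PanelProps) {
  // The consumer passes its own class instead of reaching into our styles
  return (
    <div className={[styles.panel, className].filter(Boolean).join(' ')}>
      {children}
    </div>
  );
}
```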
But it's even better to avoid style overrides altogether, as they introduce a dependency on the module loading order and selector specificity.
Also, it's important to be careful with CSS custom properties,
since custom properties have a global scope
as well and can easily conflict between different modules.
So it's better to avoid creating new custom properties
inside module styles to prevent any potential
visual bugs. And finally, we can enforce all of these rules with the help of the Stylelint linter. This isolation allows us to avoid any unexpected visual breakage in an application and safely make changes to module styles. So when it comes to style
sheets, the rules and constraints are quite simple.
But when it comes to data isolation, things can get a bit trickier.
In any system, code and data are highly coupled, because data is basically the main reason why we build almost every application,
right? So if the module doesn't control access to its
data, we will run into a bunch of problems.
First, it will be really tough to make changes to the module, because we could easily break other modules in the system when changing the module's internal data structures. Secondly, when a module depends on data from another module, it's not a really obvious dependency. It's more like a global side effect, and changes to the data can implicitly change the behavior of other modules.
On top of that, when we are developing a new module, it will be challenging to use data from existing modules,
because we would have to dig through the entire code base of the system
to figure out what data is available to use.
And to give you some examples, these two cases on the slide
are both incorrect. In the first case, we have one global
storage with all the data, allowing every module to have full
access to the data of the system. In the second case,
the data storage is inside each module, which is correct.
But the public API of the modules isn't strict enough,
so other modules have full access to the data storage.
And that's a problem. To avoid these kinds of problems,
we just need to follow a few simple rules. First, each module
should have its own data storage, and we don't want one single global storage for all modules. And if one
module needs some data from another module,
it should only obtain it through a public API.
The CQRS pattern is perfect for creating this kind of
API. It lets us provide separate operations for reading and mutating the data without exposing the entire storage. Also,
when it comes to building user interfaces, it's important to respond quickly to data changes, so the public API should let us subscribe to these changes, not just receive the data once. Another thing to consider is protecting our data
from accidental or intentional mutations.
We can do this by simply exporting a read-only version of the data in the public API. In TypeScript-based projects we can simply use the Readonly type for this, or we can freeze the objects using Object.freeze, but that will add some runtime overhead.
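Putting these rules together, a module's data API might look roughly like this sketch (the store shape and names are hypothetical):

```ts
// src/modules/payments/model/store.ts — the data itself stays private
interface Payment {
  id: string;
  amount: number;
}

let payments: readonly Payment[] = [];
const subscribers = new Set<() => void>();

// Queries: read-only access plus a change subscription
export function getPayments(): readonly Payment[] {
  return payments;
}
export function subscribeToPayments(listener: () => void): () => void {
  subscribers.add(listener);
  return () => subscribers.delete(listener);
}

// Commands: the only way to change the module's data
export function addPayment(payment: Payment): void {
  payments = [...payments, payment];
  subscribers.forEach((listener) => listener());
}
```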
It's worth mentioning that all these rules and restrictions
can be applied with almost any state management library,
no matter the framework. The important thing is
controlling which data is accessible for other modules
and limiting how the data can be modified. Alright,
so now we've got these well-isolated modules,
which means we can safely modify the code within them and
control the coupling using the public API.
The next thing we need to do is to introduce a
runtime for these modules. But what is that?
When we are creating a new module, it's crucial to understand
how it will fit into the application. For example,
we need to know what features of a bundler are available,
what environment variables we can use, how to
import images, styles and so on. It's also important
to know which versions of browsers and Node.js the code will run on, and also which library to use, for example, to provide data access in the public API.
So we need some common rules and libraries for
all the modules to make module development easier. And for
this reason we introduce this thing called a runtime for modules.
For example, this is the runtime from Yandex Direct. It consists of specific versions of the most important libraries, like TypeScript, React and Redux. It's important to share
these kinds of libraries across every module because they have
a huge impact on what we can do in the public API.
In fact, communication between modules heavily relies on these libraries. There are also some other libraries that don't affect the public API, but it's handy to have them as part of the runtime, for example, an HTTP client, a router library, a components library, and so on. All of these libraries help us with common development tasks when we are working with modules.
Although it's possible to remove almost everything from this list to create a highly minimal runtime, there would be downsides to doing so. When designing a runtime for modules, it's important to find the balance between the size of the runtime and the convenience of developing modules. Making the runtime bigger will make it more difficult to maintain. In fact, making changes to the runtime is always risky because it affects all the modules at once. On the other hand, making the runtime smaller will make module development more challenging. There might be a lack of functionality, requiring us to repeatedly develop custom solutions for each module.
Let's take React as an example. If we decide to include React in the runtime, we can export components from a module's public API, and they can be easily integrated between modules. But if we decide not to include React in the runtime, then on the one hand each module can use a different framework, but on the other hand connecting multiple modules will be more difficult. So now we have a runtime
for modules, which makes module development much easier.
But we still need to figure out how to organize our code
base into multiple modules. Like when
should we create a new module? How do we locate existing code within modules? Those are the kinds of questions we need to address. Well, there is no one-size-fits-all solution
here, but luckily the problem has been extensively
researched, and there are plenty of methodologies like Domain-Driven Design, Clean Architecture, Feature-Sliced Design and others. All these methodologies suggest some common rules for breaking code into modules.
First and foremost, each module should have just one responsibility
and most of the time it's connected to the product domain,
meaning the module is responsible for a certain product functionality.
For example, it could be a module for handling payments,
user authentication, or a module for editing articles.
Secondly, each module should have high cohesion inside
and low coupling with other modules. High cohesion is
a good sign that we have properly defined the module's responsibility
and that the module contains all the related functionality.
High cohesion also makes it easier to locate the necessary code and
dive into the business logic of a specific part of the application.
It also means that the module will have fewer reasons to
interact with other modules, which greatly reduces the coupling and the overall complexity of the system. And besides,
each methodology suggests dividing all modules into several
meaningful groups and implementing strict rules on how these
groups can depend on each other. These limitations
help achieve a predictable system breakdown and facilitate
faster discovery of existing code. Moreover, it simplifies the decision-making process on where to place new code by reducing the number of options available.
And if you don't want to create your own rules and constraints
for module layout, you can simply choose one of the pre-made options, like the trendy Feature-Sliced Design methodology. This methodology basically outlines an architecture that's pretty similar to a modular monolith when it comes to dividing the code into modules. The main idea of the methodology is to
split all modules into six layers. Each layer has a different
level of understanding about the product domain and a different level
of impact on the system. You can find more information in
the official documentation on the website, which is really useful.
But what is important now is that there are two main rules
to follow. First, the layers are strictly ordered and
imports between layers are unidirectional, from the app layer down to the shared layer. For example, a module in the widgets
layer can use modules from the features,
entities and shared layers, but cannot use
pages. And the second rule is that modules on the same layer cannot import each other. For example, two modules in the features layer are not aware of each other and cannot import anything from each other.
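As a sketch of both rules in terms of imports (the module names are hypothetical):

```ts
// src/widgets/campaign-table/index.ts — a widget composes lower layers
import { CampaignFilter } from '../../features/campaign-filter'; // OK: widgets -> features
import { selectCampaigns } from '../../entities/campaign';       // OK: widgets -> entities
import { formatMoney } from '../../shared/lib/format';           // OK: widgets -> shared

// Both of these would violate the rules:
// import { CampaignsPage } from '../../pages/campaigns'; // rule 1: higher layer
// import { KeywordTable } from '../keyword-table';       // rule 2: same layer
```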
These two rules require careful consideration when dividing code into modules, but they lead to a predictable decomposition of the system, which in turn simplifies navigation and understanding of a large code base. This methodology has been used in Yandex Direct for over two years and has proven to be effective. The only drawback I can mention is that it has quite a steep learning curve. It will be necessary to ensure that
the team has a synchronized understanding of the different layers
and ideally provide documentation with examples
for each specific application. So these were the key principles for building a scalable monolith. It's time to draw some conclusions. First, a good monolith is built on three key principles: highly isolated modules, a runtime for convenient module development, and rules for
predictable decomposition of the system into modules. Secondly,
a good monolith is scalable, maintainable and
adaptable to future changes. It does not become a bottleneck in product development. That is because isolated modules and low coupling between them allow us to work with a small piece of the code base at a time, which can be quickly read and understood. It also makes it possible to safely make changes to modules and avoid unexpected bugs and side effects. Developing a
monolith like this requires some additional effort, but it's still
a lot less effort compared to adopting a microfrontend architecture.
And last but not least, it's highly likely that you don't need microfrontend architecture, because it brings significant expenses for both implementation and ongoing support. Objectively, there is only one reason to adopt it, and that is when you need to completely isolate multiple teams from each other, allowing them to have their own technologies, processes, releases and so on. All the other advantages of microfrontends can
be effectively implemented in a monolithic application. In fact,
a monolith is perfectly suitable for the vast majority of applications, especially in the early stages of development.
And to dive deeper into the topic, you can explore the following subjects. For the modular monolith, there are a lot of articles and presentations available on the Internet. Clean architecture has an excellent book with the same name. Feature-Sliced Design has excellent documentation on its website. And to start building a glorious monolith in your project, you can use the following tools. TypeScript is essential for building any large-scale application. ESLint and Stylelint are used to enforce architecture rules and constraints, and Dependency Cruiser helps in controlling imports within the system. These resources and
tools will assist you in developing and maintaining a well
structured and scalable monolith. And that's all from me. Thank you for joining the presentation. I hope you really enjoyed it. Feel free to leave comments; I'll be happy to answer any questions. Have a nice day.