Transcript
Hello, I'm Rafael, and I'm speaking to you from Poland.
My talk is titled Architectural Caching Patterns for Kubernetes, and I will tell you what different approaches you can use for caching while using Kubernetes, and what the implications are for your system designs. But first, a few words about myself. I'm a Cloud Native Team Lead at Hazelcast, and before Hazelcast I worked at Google and CERN. I'm also the author of the book Continuous Delivery with Docker and Jenkins. And from time to time I do conference speaking and trainings, but on a daily basis I'm an engineer.
A few words about Hazelcast. Hazelcast is a distributed company, and it is distributed in two meanings. The first meaning is that we are a distributed company because we produce distributed software. Our products are Hazelcast In-Memory Data Grid, Hazelcast Jet, and Hazelcast Cloud. But the second meaning is that we are a distributed company because we have always worked remotely. So it was always that way. Our agenda for
today is pretty simple. So there will be a very short
introduction about caching on Kubernetes and in the microservice
world in general. And then we will walk through all possible
caching patterns that you can use in your system. And while I'll
be talking, I would like you to think about two things. First thing is
which of these patterns you use in your system, because you must use one of them; this list is complete. And the second question I would like you to ask yourself is: would it make sense for my system to change to any other pattern? Would it be beneficial to me? And with these questions I leave you to listen to this talk. So we are in the microservice world; in Kubernetes we deploy microservices, and that is a
diagram of a classic microservice system. So we have a
lot of services, they have different versions,
they are written in different programming languages and they
use each other. Now the question for this talk is where
is the right place to put your cache? Is it inside of each
microservice? Or maybe as a separate thing in
your infrastructure? Or maybe we should put a cache in front of each service? And that is what we will discuss. So the
first caching pattern, the first topology that you can use is
embedded cache. Embedded cache is like the simplest possible
thing you can think about. A diagram for this looks as follows. We deploy it on Kubernetes, so, as always, a request goes to our system, it goes to the Kubernetes service, and the Kubernetes service forwards the request to one of the Kubernetes pods in which our application is running. We have a cache inside our application, embedded as a library inside our application. So a request goes to our application, the application checks in the cache: okay, did I already execute such a request? If yes, return the cached value. If no, do some business logic, put the value into the cache, and return the response. This is so simple that we could even think about writing
this caching logic on our own. So if you happen to use Java,
that is how it could look. So we can have some collection like a ConcurrentHashMap, and then, when processing the request: okay, check if the request is in the cache. If yes, return the cached value. If no, do some processing, put the value into the cache, and return the response. If you use some other language, the idea will be the same.
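As a rough illustration, a minimal sketch of that naive approach could look like this (the class and method names are hypothetical, not the actual code from the talk):

```java
// A minimal sketch of the naive embedded cache described above; the class and
// method names are illustrative.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class NaiveCachingService {

    private final Map<String, String> cache = new ConcurrentHashMap<>();

    public String processRequest(String request) {
        // If we already executed such a request, return the cached value.
        String cached = cache.get(request);
        if (cached != null) {
            return cached;
        }
        // Otherwise do the business logic, put the value into the cache, return it.
        String response = doBusinessLogic(request);
        cache.put(request, response);
        return response;
    }

    private String doBusinessLogic(String request) {
        // Placeholder for the real (slow) processing.
        return "response for " + request;
    }
}
```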
Now you can implement it on your own. However, please never do
it. Never do it because a collection or concurrent
collection is not good as a cache. It's not good because it
has no eviction policy, no max size limit, no statistics,
no expiration time, no notification mechanism. It misses a lot of features that you will need from a cache. That is why, if you happen to use Java, there are a lot of good libraries. Guava is one of them, where you can define all these missing features upfront when you create the cache. Ehcache is also another good solution. If you use some other language, you will find a good caching library for every language.
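To give a feel for what such a library offers, a minimal sketch with Guava could look like this (the sizes and timeouts are illustrative, not recommendations from the talk):

```java
// A minimal sketch of a Guava cache with the missing features configured upfront;
// values are illustrative.
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import java.util.concurrent.TimeUnit;

public class GuavaCachingExample {

    private final Cache<String, String> cache = CacheBuilder.newBuilder()
            .maximumSize(1_000)                      // max size limit (eviction)
            .expireAfterWrite(10, TimeUnit.MINUTES)  // expiration time
            .recordStats()                           // statistics
            .build();

    public String processRequest(String request) throws Exception {
        // get() computes and caches the value on a cache miss.
        return cache.get(request, () -> doBusinessLogic(request));
    }

    private String doBusinessLogic(String request) {
        return "response for " + request;
    }
}
```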
Now we can move this idea of caching one level higher and put it into our application framework. So if you again work with Java, your application framework may be Spring. If you would like to cache something with Spring, you don't need to write all this manual code; you just annotate your method with @Cacheable, and then every call to this method will first check: okay, is the given ISBN already in the cache called books? If yes, return the cached value, and only if the value is not found in the cache called books, only then execute the method findBookInSlowSource.
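A minimal sketch of that Spring example could look like this (the class, method, and return types are illustrative, reconstructed from the description above):

```java
// A minimal sketch of the Spring caching example described above;
// class, method, and type names are illustrative.
import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

@Service
public class BookService {

    // Every call first checks the cache named "books" for the given ISBN;
    // the slow lookup runs only when the value is not found in the cache.
    @Cacheable("books")
    public String findBookInSlowSource(String isbn) {
        // ... slow call to a database or an external service ...
        return "Book with ISBN " + isbn;
    }
}
```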
But be careful if you use Spring, because for some reason Spring uses ConcurrentHashMap by default. So you're better off changing your cache manager to something else, for example Guava. So embedded cache is pretty simple.
But there is one problem with embedded cache. So imagine now that request goes
to our service, it's forwarded to the application. Let's say on the top
we do some long-lasting business logic, put the value in the cache, and return the response, all good. Now the second time, the same request
may go to the Kubernetes service, but it's load balanced
to the application at the bottom. And now what happens? The application
needs to do this business logic once again, because these caches
are completely separate, they don't know about each other.
That is why one of the improvements of the embedded cache will be to use an embedded distributed cache. So in terms of the patterns
or topologies, it is still the same. However,
we will just use a different library: not a caching library, but a distributed caching library. We can use, for example, Hazelcast, which is a distributed caching library, so you can embed it into your application, and the flow is the same. But now, no matter which embedded cache instance you use, it doesn't matter, because they both form one consistent caching cluster.
How will you use it in your application? If we stick to the Spring example, the only thing you need to change in your application is actually to specify: I would like to use Hazelcast as my cache manager. All the rest is the same. So the Hazelcast instances embedded in each of your applications will all form one consistent caching cluster and will work fine together.
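A minimal sketch of that change, assuming Spring with the hazelcast and hazelcast-spring dependencies on the classpath, could look like this (names are illustrative):

```java
// A minimal sketch of switching the Spring cache manager to an embedded
// Hazelcast instance; assumes the hazelcast and hazelcast-spring dependencies.
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.spring.cache.HazelcastCacheManager;
import org.springframework.cache.CacheManager;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableCaching
public class CachingConfig {

    @Bean
    public HazelcastInstance hazelcastInstance() {
        // Starts an embedded Hazelcast member inside this application.
        return Hazelcast.newHazelcastInstance();
    }

    @Bean
    public CacheManager cacheManager(HazelcastInstance hazelcastInstance) {
        // @Cacheable("books") now reads and writes a distributed Hazelcast map,
        // shared by all application instances that join the same cluster.
        return new HazelcastCacheManager(hazelcastInstance);
    }
}
```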
Now you may wonder: but how? I mean, you deploy it somewhere, like on Kubernetes, so how do they discover each other? How does one instance of Hazelcast know that it needs to connect to another instance of Hazelcast? So we thought about how to solve this discovery problem, and we came up with the idea of plugins. So for each environment we have a plugin, which is, by the way, auto-detected. So you run on Kubernetes, Hazelcast discovers: okay, I'm running on Kubernetes, I should use the Kubernetes plugin, and it uses the Kubernetes API to discover other members. So you really don't need to do anything, and your Hazelcast cluster will form automatically.
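If you ever want to make it explicit, a minimal sketch of enabling the Kubernetes join (assuming a Hazelcast version with built-in Kubernetes discovery) could look roughly like this:

```java
// A minimal sketch of explicitly enabling Kubernetes discovery for an embedded
// Hazelcast member; with auto-detection this is usually not needed at all.
import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class EmbeddedClusterMember {

    public static void main(String[] args) {
        Config config = new Config();
        // Disable multicast and enable the Kubernetes join mechanism, which uses
        // the Kubernetes API to find the other members of the cluster.
        config.getNetworkConfig().getJoin().getMulticastConfig().setEnabled(false);
        config.getNetworkConfig().getJoin().getKubernetesConfig().setEnabled(true);

        HazelcastInstance hazelcast = Hazelcast.newHazelcastInstance(config);
    }
}
```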
If you are interested in the details of how to configure this, then there
are a lot of resources, we have documentation, we have a lot of blog posts
which you can read. So we ended up with this diagram of
our embedded distributed cache. So let's make a short
summary about embedded caching. So from the good sides, embedded caching is very simple. The configuration is simple, and the deployment is simple because it goes together with our application, so you don't need to do anything. You have very low latency data access, and usually you don't need any separate ops team. From the downsides, the management of your caching is not flexible, because if you would like to scale up the caching cluster, you need to do it together with your application. It's also limited to JVM-based applications in this Hazelcast example; in general, your embedded cache is limited to your language of choice, and for every language you will have a different library. And the data is collocated with the applications, which may or may not be a problem in your case.
Okay, the next pattern, the next topology that you can use
is client-server. Client-server is kind of database style. So we will deploy our caching server separately and then use a cache client to connect to the server. It looks as follows: a request goes again to our Kubernetes service, it goes to one of the applications, and then the application uses a cache client to connect to the cache server, which is deployed separately. Usually in Kubernetes it will be deployed as a StatefulSet, because a cache server is a stateful thing. Now if you compare this
solution, this pattern to embedded caching, there are
two main differences. The first difference is that we have
this thing on a diagram. So this cache server, it requires
some management, some maintenance. That is why in the big enterprises
you usually see even a separate team dedicated to operating not only cache clusters but also databases, all these stateful things for your system. But also, it's deployed separately. It means that you can separately scale it
up or down. You can think about all this management like
backups separately. Now if we compare this diagram
to the embedded mode, there's also a second difference which is very important
and that is this part. So now your application uses
cache client to connect to the cache server. And using cache
client means that you can actually use
a different programming language for your cache server and different programming
language for your applications. Because there is a well defined protocol between
cache client and cache server. So no problem with that. That is a very common
strategy in this microservice world where you usually deploy
your cache server separately or multiple cache servers
and then your applications written in different programming languages,
they can access the server. This is such a common strategy that Redis, for example, supports only this client-server mode, and the same goes for Memcached. So these are the only topologies
they actually support. Now, how to set it up? If you would like to say, okay, I would like to have this client-server setup, how do you do it? So for Kubernetes we provide a Helm chart, and we also provide an operator. So actually, the simplest thing you can do is helm install hazelcast, and you already have your cache server running.
Now the client part. If we stick to this example from Spring, this is how it will look: we need to define, okay, I would like to use the Kubernetes plugin for discovery, so please discover my cache server, and that's it. The client will automatically discover the caching server, connect to it, and that's actually all you have to do.
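A minimal sketch of that client configuration, assuming Hazelcast's Kubernetes discovery and an illustrative Kubernetes service name, could look roughly like this:

```java
// A minimal sketch of a Hazelcast client discovering the cache server on
// Kubernetes; the service name is illustrative, not from the talk.
import com.hazelcast.client.HazelcastClient;
import com.hazelcast.client.config.ClientConfig;
import com.hazelcast.core.HazelcastInstance;

public class CacheClientExample {

    public static void main(String[] args) {
        ClientConfig clientConfig = new ClientConfig();
        // Use the Kubernetes API to find the cache server deployed separately,
        // behind the (hypothetical) "hazelcast" Kubernetes service.
        clientConfig.getNetworkConfig().getKubernetesConfig()
                .setEnabled(true)
                .setProperty("service-name", "hazelcast");

        HazelcastInstance client = HazelcastClient.newHazelcastClient(clientConfig);
    }
}
```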
So let's come back for a moment to this diagram. We separated this: the cache server is a separate thing, and your application runs separately. As I told you, in a
big enterprise, usually this cache server is
managed by a separate team. So we can even go one step
further and move this managing part outside our
organization and move it into the cloud. So cloud is kind of
client server but it's very specific because the server part, it's not
managed inside our organization. So it works like this.
So again request goes to Kubernetes service, it's load balanced
to one of the applications. Now the application uses a cache client to connect to the cache
server and the cache server is deployed somewhere, it's provided
as a service, so you don't need any management, you don't need an ops team, you just need to pay the cloud provider, which is usually cheaper than maintaining it on your own. How does it look in the code? So to start up the cache server, the caching cluster, you just click in the web console or use the CLI, and then, when it's started, you can take the discovery token, put it into your application, and it will discover your cache server automatically.
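A minimal sketch of that client side, with placeholder values for the cluster name and discovery token, could look roughly like this:

```java
// A minimal sketch of connecting to a managed (cloud) Hazelcast cluster using a
// discovery token; the cluster name and token below are placeholders.
import com.hazelcast.client.HazelcastClient;
import com.hazelcast.client.config.ClientConfig;
import com.hazelcast.core.HazelcastInstance;

public class CloudCacheClientExample {

    public static void main(String[] args) {
        ClientConfig config = new ClientConfig();
        config.setClusterName("YOUR-CLUSTER-NAME");
        // The discovery token from the cloud console lets the client find
        // the cache cluster that is managed for you.
        config.getNetworkConfig().getCloudConfig()
                .setEnabled(true)
                .setDiscoveryToken("YOUR-DISCOVERY-TOKEN");

        HazelcastInstance client = HazelcastClient.newHazelcastClient(config);
    }
}
```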
And this is, by the way, not only how Hazelcast Cloud works, but how most cloud solutions provide it to you. Even databases, MongoDB or whatever you use, are usually based on a discovery token. Pros and cons of client-server and the cloud solution: from the good sides, we have our data separated from the application, and we have separate management, so you can scale your cluster up and down separately from your application. And it's programming-language agnostic, because you use a cache client to connect to the cache server. From the downsides, you have a separate ops effort, or you need to pay for the cloud solution, and higher latency. You need to think about latency. This is
something I didn't mention yet, but we need to cover this. So usually
when you have your cache embedded, latency is low, because the cache goes together with your application. However, with the client-server or cloud approach, you need to think about latency. If you deploy client
server on premises then you need
to make sure that they are deployed in the same local network
because remember we are in this domain of caching,
it's a very low latency domain. So even one router hop
is a lot. So make sure that you deploy your cache server inside the
same local network where your application is running.
Now, what about the cloud solution? For the cloud solution it's the same. What do we do in Hazelcast Cloud? When you create your caching server, you obviously need to create it in the same geographical region. We don't provide our own infrastructure in Hazelcast Cloud; you can deploy Hazelcast on AWS, GCP, or Azure. So you should choose the same cloud provider where your application is running, and you should choose the same region where your application is running. But that's not enough. We also provide a way to do VPC peering between the network where we deployed your cache cluster for you and the network where your application is running. So after that VPC peering,
you are like running in the same virtual local network basically.
So there is not even one router hop in between. And that is very important
to keep in mind because otherwise your latency will suffer.
Okay, we covered all the patterns so far: embedded, client-server, cloud. They are quite old, in the sense that you probably know them from databases; they are nothing new. So now there will be a pattern that is quite new, that is very popular, especially on Kubernetes, but it's not limited to Kubernetes; in fact, you see sidecars in other systems as well. So, cache as a sidecar. How does it look? A request goes to our Kubernetes service, and it is forwarded, load balanced, to one of the Kubernetes pods. And now inside each of the pods it is not only the application that is running,
but also a cache server. So the request goes to the application, and the application connects to localhost, where its cache server is running. And all these sidecar cache servers, sidecar cache containers, form one consistent caching cluster. So that is the idea. This solution is somehow similar to the embedded mode, and somehow similar to the client-server mode. It's similar to the embedded mode because Kubernetes will always schedule your caching server on the same physical machine, so you have your cache close to your application, and it scales up and down together with it. So it's kind of like embedded; there is no discovery needed, your cache is always at localhost. That is good. But it's also similar to client-server, because after all, your application uses a cache client to connect to the cache server. So there is no problem with the cache being written in a different programming language than your application. And there is some kind of isolation between the cache and the application; it's on the container level, which may or may not be good enough for you, depending on your requirements. How do you configure
this? So let's stick to this Spring example. In Spring, how do we configure this? This is our client configuration: we connect to the cache server at localhost, because we just know it's there. It looks like a static configuration, but actually the whole system is dynamic; we just know that the cache server is running at localhost. And in the Kubernetes configuration we have two containers: one is our application with our business logic, and the second one is our cache server, in this case Hazelcast.
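A minimal sketch of that client side, assuming the Hazelcast sidecar listens on its default port, could look like this:

```java
// A minimal sketch of the client configuration for the sidecar pattern;
// assumes the cache server sidecar listens on Hazelcast's default port 5701.
import com.hazelcast.client.HazelcastClient;
import com.hazelcast.client.config.ClientConfig;
import com.hazelcast.core.HazelcastInstance;

public class SidecarCacheClientExample {

    public static void main(String[] args) {
        ClientConfig clientConfig = new ClientConfig();
        // The cache server runs as a sidecar container in the same pod,
        // so it is always reachable at localhost; no discovery is needed.
        clientConfig.getNetworkConfig().addAddress("localhost:5701");

        HazelcastInstance client = HazelcastClient.newHazelcastClient(clientConfig);
    }
}
```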
A short summary of the sidecar cache. From the good sides, the configuration is again very simple, it's programming-language agnostic, we have low latency, and there is some isolation of data between the application and the cache. From the downsides, we again do not have flexible management, because your cache scales up and down together with your application, and your data is, after all, collocated with the application in the same pod, which again may or may not be good enough, depending on your use case. Okay, we covered sidecar.
The last caching pattern for today, last caching topology
will be reverse proxy, and reverse proxy will be something completely
different. It will be completely different than what we've seen so far.
It will be different because so far our application
was all the time aware that such a thing as a cache
exists. It was explicitly connecting to the cache server.
However, now we will do something different. We will put the cache in front of our application, so our application will not even be aware that such a thing as a cache exists. How does it look? A request goes to our system, and now, just before the Kubernetes service, after the Kubernetes service, or maybe together with the ingress in Nginx, we put a cache. First, if the value is found in this cache, we just return the response; it does not even go to the application. Only if the value is not found in the cache does the request go to the application. Nginx is
a very good solution because it's very mature, it's well integrated
with Kubernetes, and it's just something you should use if you go with the reverse proxy. How does the configuration for caching look? So that is the simplest thing you can do with Nginx: specify, okay, cache it on the HTTP level, and that is the path for your cache.
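As a rough illustration, a minimal sketch of such an Nginx configuration (with illustrative paths, sizes, and times, and a placeholder upstream service) could look like this:

```nginx
# A minimal sketch of HTTP caching in Nginx; paths, sizes, and times are
# illustrative, and the upstream service name is a placeholder.
http {
    # Where cached responses are stored and how the cache zone is sized.
    proxy_cache_path /var/cache/nginx keys_zone=app_cache:10m max_size=1g;

    server {
        listen 80;

        location / {
            proxy_cache app_cache;
            # How long successful responses stay in the cache.
            proxy_cache_valid 200 10m;
            # Cache misses are forwarded to the application service.
            proxy_pass http://my-application-service;
        }
    }
}
```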
So Nginx is good, it's mature, it's well integrated with Kubernetes, but it has some problems. One, maybe not a problem, but a trait of Nginx: it's HTTP-based. But okay, we are HTTP people, this is fine. Another problem, which is a bigger thing, is that Nginx is not distributed and Nginx is not highly available. And Nginx does not necessarily keep your data in memory; it can offload your data to the disk, which, for example, is not the case in Hazelcast, where you are guaranteed that your data is stored in memory, so the latency is low. That is why you have to accept these trade-offs if you use Nginx. But still, Nginx is a very good solution.
Now the last, last variant of the reverse proxy
will be reverse proxy sidecar caching. So this will be the last
variant of the reverse proxy topology. It looks like this: a request goes to the Kubernetes service, which load balances the traffic to one of the Kubernetes pods. But now it's not the application that receives the request, but something that we will call a reverse proxy cache container. It is like a cache server, but also like a network interceptor which checks what goes to the application, and only if the value is not found in this reverse proxy cache container does the request go to the application. So the application, again, does not even know that such a thing as a cache exists. But okay, all this, that the application is not aware of the caching, is good and bad.
I mean it has some good sides and bad sides,
maybe starting from the good sides. So why is it good that the application is not aware of caching? You remember this diagram from the beginning of the presentation: a lot of services, in different versions, different programming languages, and they use each other. That is, by the way, a very small microservice system; in general, you have way, way bigger ones. So now you can look at the system, you can look at the architecture design, you can look at a diagram like this and say: I would like to introduce caching to service 1 in version 1, and that's it. I mean, you don't need to change the code of the service in order to introduce the caching layer. So you can add the functionality of caching in a declarative manner. And that is the whole beauty of reverse proxy caching. It simplifies a lot of things.
So how does it look in practice? In practice you will usually have, starting maybe from the containers at the bottom, your application, then your cache server and the interceptor, like a caching proxy. And then you need some init container which will basically do some iptables changes, so that the request from outside does not go to the application but goes to this caching proxy. But okay, if we look at this diagram and at this idea of modifying your system in a declarative manner, it may make you think about Istio and all these service meshes. And that is actually true, because it is the same idea. And recently, actually, in Envoy proxy they added support for HTTP caching. So this reverse proxy sidecar caching will become a big thing, together with Envoy proxy and all the service meshes that use Envoy proxy, like, for example, Istio. I really think this will be the way a lot of
people will do caching. But okay, like I said, there is no such thing as a free lunch. There are some bad sides to this whole idea that the application is not aware of caching. And if you think about it, if the application is not aware of caching, there is one thing that becomes way more difficult, and this thing is called cache invalidation. And actually, if you look anywhere on the Internet for what the hardest problem with caching is, it is cache invalidation, meaning when to decide that your cached value is outdated, it's stale, you should not use it anymore but go to the source of truth instead. And this is not a trivial problem. When your application is aware of the cache, then you can have some business logic to evict the cached value, or do basically anything you want, depending on the business logic. However, if your application is not aware of caching, then you are left with what HTTP offers, like timeouts and ETags, basically timeouts, and that's what you can do. So that is the biggest
issue with reverse proxy caching. A short summary of reverse proxy caching. From the good sides, it's configuration based, so you don't need to change your application in order to introduce caching; that is actually the reason why you would use reverse proxy caching. It's also programming-language agnostic, it's everything agnostic, because you don't even touch the code of your application, and it's very consistent with the container and microservice world and with Kubernetes. That is why I really believe it will be the future of caching. However, from the downsides, it's difficult to do cache invalidation. There is no mature solution yet; I mean, it was just implemented in Envoy proxy, so it will be mature soon. And it's protocol based, which is not such a big deal, as we discussed.
So we covered all the caching topologies, all the caching patterns.
So now what I suggest as a short summary, I will try
not to repeat anything I said before because it may be boring.
So what I propose as a summary is a very simple, maybe oversimplified, decision tree which can help you decide which caching pattern is right for you. The first question I would ask is: do my applications need to be aware of caching? If no, am I an early adopter? If no, use a reverse proxy with Nginx. If yes, use reverse proxy sidecar caching, like with Envoy proxy, Istio, or some other prototypes. If your application needs to be aware of the caching, then your next question is: do I have a lot of data or some security restrictions? If no, do I need to be language agnostic? If no, use embedded or embedded distributed caching. If yes, use sidecar caching. Now, if you do have a lot of data, you work in a big organization, or you have some security restrictions, then the last question you need to ask is: is my deployment in the cloud? If no, run your own client-server on premises. If yes, use a cloud solution. So, as I said, it's maybe a little oversimplified, but at least it gives you the direction where to look for the right topology for
your caching. As the last slide of the presentation, I would just like to mention a few resources. If you would like to play with all these patterns and run the code, here are the links. The first link is where you can just run the code samples. The second link is a blog post on how to configure Hazelcast as a sidecar container. The third one is our prototype, nothing that you can use in production, a prototype of this reverse proxy sidecar caching with Hazelcast. And the last link is a very good video talk about Nginx as a reverse proxy cache. And with this last slide, I would like to thank you for listening. It was really a pleasure to speak to you.