Transcript
Hello, I'm Rafael, and I'm speaking to you from Poland.
My talk is titled Architectural Caching Patterns for Kubernetes, and I will tell you what different approaches you can use for caching while using Kubernetes, and what the implications are for your system designs. But first, a few words about myself. I'm a Cloud Native Team Lead at Hazelcast, and before Hazelcast I worked at Google and CERN. I'm also the author of the book Continuous Delivery with Docker and Jenkins. And from time to time I do conference speaking and trainings, but on a daily basis I'm an engineer.
A few words about Hazelcast. Hazelcast is a distributed company, and it is distributed in two meanings. The first meaning is that we are a distributed company because we produce distributed software. Our products are Hazelcast In-Memory Data Grid, Hazelcast Jet, and Hazelcast Cloud. But the second meaning is that we are a distributed company because we have always worked remotely. So it was always that way. Our agenda for
today is pretty simple. So there will be a very short
introduction about caching on Kubernetes and in the microservice
world in general. And then we will walk through all possible
caching patterns that you can use in your system. And while I'll
be talking, I would like you to think about two things. First thing is
which of these patterns you use in your system, because you must use one of them; this list is complete. And the second question I would like you to ask yourself is: would it make sense for my system to change to any other pattern? Would it be beneficial to me? And with these questions I leave you to listen to this talk. So we are in the microservice world; in Kubernetes we deploy microservices, and that is a
diagram of a classic microservice system. So we have a
lot of services, they have different versions,
they are written in different programming languages and they
use each other. Now the question for this talk is where
is the right place to put your cache? Is it inside of each
microservice? Or maybe as a separate thing in
your infrastructure? Or maybe we should put a cache in front of each service? And that is what we will discuss. So the
first caching pattern, the first topology that you can use is
embedded cache. Embedded cache is like the simplest possible
thing you can think about. A diagram for this looks as follows. We deploy it on Kubernetes, so, as always, a request goes to our system, it goes to the Kubernetes service, and the Kubernetes service forwards the request to one of the Kubernetes pods in which our application is running. We have a cache inside our application, embedded as a library inside our application. So a request goes to our application, the application checks in the cache: okay, did I already execute such a request? If yes, return the cached value. If no, do some business logic, put the value into the cache, and return the response. This is so simple that we could even think about writing
this caching logic on our own. So if you happen to use Java,
that is how it could look. So we can have some collection like a ConcurrentHashMap, and then, when processing the request: okay, check if the request is in the cache. If yes, return the cached value. If no, do some processing, put the value into the cache, and return the response. If you use some other language, the idea will be the same.
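As a rough illustration, a minimal sketch of that naive approach could look like this (the class and method names are hypothetical, not the actual code from the talk):

```java
// A minimal sketch of the naive embedded cache described above; the class and
// method names are illustrative.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class NaiveCachingService {

    private final Map<String, String> cache = new ConcurrentHashMap<>();

    public String processRequest(String request) {
        // If we already executed such a request, return the cached value.
        String cached = cache.get(request);
        if (cached != null) {
            return cached;
        }
        // Otherwise do the business logic, put the value into the cache, return it.
        String response = doBusinessLogic(request);
        cache.put(request, response);
        return response;
    }

    private String doBusinessLogic(String request) {
        // Placeholder for the real (slow) processing.
        return "response for " + request;
    }
}
```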
Now you can implement it on your own. However, please never do
it. Never do it because a collection or concurrent
collection is not good as a cache. It's not good because it
has no eviction policy, no max size limit, no statistics,
no expiration time, no notification mechanism. It misses a lot of features that you will need from a cache. That is why, if you happen to use Java, there are a lot of good libraries. Guava is one of them, where you can define all these missing features upfront when you create the cache. Ehcache is also another good solution. If you use some other language, you will find a good caching library for every language.
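To give a feel for what such a library offers, a minimal sketch with Guava could look like this (the sizes and timeouts are illustrative, not recommendations from the talk):

```java
// A minimal sketch of a Guava cache with the missing features configured upfront;
// values are illustrative.
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import java.util.concurrent.TimeUnit;

public class GuavaCachingExample {

    private final Cache<String, String> cache = CacheBuilder.newBuilder()
            .maximumSize(1_000)                      // max size limit (eviction)
            .expireAfterWrite(10, TimeUnit.MINUTES)  // expiration time
            .recordStats()                           // statistics
            .build();

    public String processRequest(String request) throws Exception {
        // get() computes and caches the value on a cache miss.
        return cache.get(request, () -> doBusinessLogic(request));
    }

    private String doBusinessLogic(String request) {
        return "response for " + request;
    }
}
```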
Now we can move this idea of caching one level higher and put it into our application framework. So if you again work with Java, your application framework may be Spring. If you would like to cache something with Spring, you don't need to write all this manual code; you just annotate your method with @Cacheable, and then every call to this method will first check: okay, is the given ISBN already in the cache called books? If yes, return the cached value, and only if the value is not found in the cache called books, only then execute the method findBookInSlowSource.
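A minimal sketch of that Spring example could look like this (the class, method, and return types are illustrative, reconstructed from the description above):

```java
// A minimal sketch of the Spring caching example described above;
// class, method, and type names are illustrative.
import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

@Service
public class BookService {

    // Every call first checks the cache named "books" for the given ISBN;
    // the slow lookup runs only when the value is not found in the cache.
    @Cacheable("books")
    public String findBookInSlowSource(String isbn) {
        // ... slow call to a database or an external service ...
        return "Book with ISBN " + isbn;
    }
}
```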
But be careful if you use Spring, because for some reason Spring uses ConcurrentHashMap by default. So you're better off changing your cache manager to something else, for example Guava. So embedded cache is pretty simple.
But there is one problem with embedded cache. So imagine now that request goes
to our service, it's forwarded to the application. Let's say on the top
we do some long-lasting business logic, put the value in the cache, and return the response, all good. Now the second time, the same request
may go to the Kubernetes service, but it's load balanced
to the application at the bottom. And now what happens? The application
needs to do this business logic once again, because these caches
are completely separate, they don't know about each other.
That is why one of the improvements of the embedded cache will be to use an embedded distributed cache. So in terms of the patterns
or topologies, it is still the same. However,
we will just use a different library: not a caching library, but a distributed caching library. We can use, for example, Hazelcast, which is a distributed caching library, so you can embed it into your application, and the flow is the same. But now, no matter which embedded cache instance you use, it doesn't matter, because they both form one consistent caching cluster.
How will you use it in your application? If we stick to the Spring example, the only thing you need to change in your application is actually to specify: I would like to use Hazelcast as my cache manager. All the rest is the same. So the Hazelcast instances embedded in each of your applications will all form one consistent caching cluster and will work fine together.
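A minimal sketch of that change, assuming Spring with the hazelcast and hazelcast-spring dependencies on the classpath, could look like this (names are illustrative):

```java
// A minimal sketch of switching the Spring cache manager to an embedded
// Hazelcast instance; assumes the hazelcast and hazelcast-spring dependencies.
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.spring.cache.HazelcastCacheManager;
import org.springframework.cache.CacheManager;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableCaching
public class CachingConfig {

    @Bean
    public HazelcastInstance hazelcastInstance() {
        // Starts an embedded Hazelcast member inside this application.
        return Hazelcast.newHazelcastInstance();
    }

    @Bean
    public CacheManager cacheManager(HazelcastInstance hazelcastInstance) {
        // @Cacheable("books") now reads and writes a distributed Hazelcast map,
        // shared by all application instances that join the same cluster.
        return new HazelcastCacheManager(hazelcastInstance);
    }
}
```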
Now you may wonder: but how? I mean, you deploy it somewhere, like on Kubernetes, so how do they discover each other? How does one instance of Hazelcast know that it needs to connect to another instance of Hazelcast? So we thought about how to solve this discovery problem, and we came up with the idea of plugins. So for each environment we have a plugin, which is, by the way, auto-detected. So you run on Kubernetes, Hazelcast discovers: okay, I'm running on Kubernetes, I should use the Kubernetes plugin, and it uses the Kubernetes API to discover other members. So you really don't need to do anything, and your Hazelcast cluster will form automatically.
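If you ever want to make it explicit, a minimal sketch of enabling the Kubernetes join (assuming a Hazelcast version with built-in Kubernetes discovery) could look roughly like this:

```java
// A minimal sketch of explicitly enabling Kubernetes discovery for an embedded
// Hazelcast member; with auto-detection this is usually not needed at all.
import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class EmbeddedClusterMember {

    public static void main(String[] args) {
        Config config = new Config();
        // Disable multicast and enable the Kubernetes join mechanism, which uses
        // the Kubernetes API to find the other members of the cluster.
        config.getNetworkConfig().getJoin().getMulticastConfig().setEnabled(false);
        config.getNetworkConfig().getJoin().getKubernetesConfig().setEnabled(true);

        HazelcastInstance hazelcast = Hazelcast.newHazelcastInstance(config);
    }
}
```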
If you are interested in the details of how to configure this, then there
are a lot of resources, we have documentation, we have a lot of blog posts
which you can read. So we ended up with this diagram of
our embedded distributed cache. So let's make a short
summary about embedded caching. So from the good sides, embedded caching is very simple. The configuration is simple, and the deployment is simple because it goes together with our application, so you don't need to do anything. You have very low latency data access, and usually you don't need any separate ops team. From the downsides, the management of your caching is not flexible, because if you would like to scale up the caching cluster, you need to do it together with your application. It's also limited to JVM-based applications in this Hazelcast example; in general, your embedded cache is limited to your language of choice, and for every language you will have a different library. And the data is collocated with the applications, which may or may not be a problem in your case.
Okay, the next pattern, the next topology that you can use
is client-server. Client-server is kind of database style. So we will deploy our caching server separately and then use a cache client to connect to the server. It looks as follows: a request goes again to our Kubernetes service, it goes to one of the applications, and then the application uses a cache client to connect to the cache server, which is deployed separately. Usually in Kubernetes it will be deployed as a StatefulSet, because a cache server is a stateful thing. Now if you compare this
solution, this pattern to embedded caching, there are
two main differences. The first difference is that we have
this thing on a diagram. So this cache server, it requires
some management, some maintenance. That is why in the big enterprises
you usually see even a separate team dedicated to operating not only cache clusters but also databases, all these stateful things for your system. But also, it's deployed separately. It means that you can separately scale it
up or down. You can think about all this management like
backups separately. Now if we compare this diagram
to the embedded mode, there's also a second difference which is very important
and that is this part. So now your application uses
cache client to connect to the cache server. And using cache
client means that you can actually use
a different programming language for your cache server and different programming
language for your applications. Because there is a well defined protocol between
cache client and cache server. So no problem with that. That is a very common
strategy in this microservice world where you usually deploy
your cache server separately or multiple cache servers
and then your applications written in different programming languages,
they can access the server. This is such a common strategy that Redis, for example, supports only this client-server mode, and the same goes for Memcached. So these are the only topologies
they actually support. Now, how to set it up? If you would like to say, okay, I would like to have this client-server setup, how do you do it? So for Kubernetes we provide a Helm chart, and we also provide an operator. So actually, the simplest thing you can do is helm install hazelcast, and you already have your cache server running.
Now the client part. If we stick to this example from Spring, this is how it will look: we need to define, okay, I would like to use the Kubernetes plugin for discovery, so please discover my cache server, and that's it. The client will automatically discover the caching server, connect to it, and that's actually all you have to do.
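A minimal sketch of that client configuration, assuming Hazelcast's Kubernetes discovery and an illustrative Kubernetes service name, could look roughly like this:

```java
// A minimal sketch of a Hazelcast client discovering the cache server on
// Kubernetes; the service name is illustrative, not from the talk.
import com.hazelcast.client.HazelcastClient;
import com.hazelcast.client.config.ClientConfig;
import com.hazelcast.core.HazelcastInstance;

public class CacheClientExample {

    public static void main(String[] args) {
        ClientConfig clientConfig = new ClientConfig();
        // Use the Kubernetes API to find the cache server deployed separately,
        // behind the (hypothetical) "hazelcast" Kubernetes service.
        clientConfig.getNetworkConfig().getKubernetesConfig()
                .setEnabled(true)
                .setProperty("service-name", "hazelcast");

        HazelcastInstance client = HazelcastClient.newHazelcastClient(clientConfig);
    }
}
```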
So let's come back for a moment to this diagram. We separated this: the cache server is a separate thing, and your application runs separately. As I told you, in a
big enterprise, usually this cache server is
managed by a separate team. So we can even go one step
further and move this managing part outside our
organization and move it into the cloud. So cloud is kind of
client server but it's very specific because the server part, it's not
managed inside our organization. So it works like this.
So again request goes to Kubernetes service, it's load balanced
to one of the applications. Now the application uses a cache client to connect to the cache
server and the cache server is deployed somewhere, it's provided
as a service, so you don't need any management, you don't need an ops team, you just need to pay the cloud provider, which is usually cheaper than maintaining it on your own. How does it look in the code? So to start up the cache server, the caching cluster, you just click in the web console or use the CLI, and then, when it's started, you can take the discovery token, put it into your application, and it will discover your cache server automatically.
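A minimal sketch of that client side, with placeholder values for the cluster name and discovery token, could look roughly like this:

```java
// A minimal sketch of connecting to a managed (cloud) Hazelcast cluster using a
// discovery token; the cluster name and token below are placeholders.
import com.hazelcast.client.HazelcastClient;
import com.hazelcast.client.config.ClientConfig;
import com.hazelcast.core.HazelcastInstance;

public class CloudCacheClientExample {

    public static void main(String[] args) {
        ClientConfig config = new ClientConfig();
        config.setClusterName("YOUR-CLUSTER-NAME");
        // The discovery token from the cloud console lets the client find
        // the cache cluster that is managed for you.
        config.getNetworkConfig().getCloudConfig()
                .setEnabled(true)
                .setDiscoveryToken("YOUR-DISCOVERY-TOKEN");

        HazelcastInstance client = HazelcastClient.newHazelcastClient(config);
    }
}
```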
And this is, by the way, not only how Hazelcast Cloud works, but how most cloud solutions provide it to you. Even databases, MongoDB or whatever you use, are usually based on a discovery token. Pros and cons of client-server and the cloud solution: from the good sides, we have our data separated from the application, and we have separate management, so you can scale your cluster up and down separately from your application. And it's programming-language agnostic, because you use a cache client to connect to the cache server. From the downsides, you have a separate ops effort, or you need to pay for the cloud solution, and higher latency. You need to think about latency. This is
something I didn't mention yet, but we need to cover this. So usually
when you have your cache embedded, latency is low, because the cache goes together with your application. However, with the client-server or cloud approach, you need to think about latency. If you deploy client
server on premises then you need
to make sure that they are deployed in the same local network
because remember we are in this domain of caching,
it's a very low latency domain. So even one router hop
is a lot. So make sure that you deploy your cache server inside the
same local network where your application is running.
Now, what about the cloud solution? For the cloud solution it's the same. What do we do in Hazelcast Cloud? When you create your caching server, you obviously need to create it in the same geographical region. We don't provide our own infrastructure in Hazelcast Cloud; you can deploy Hazelcast on AWS, GCP, or Azure. So you should choose the same cloud provider where your application is running, and you should choose the same region where your application is running. But that's not enough. We also provide a way to do VPC peering between the network where we deployed your cache cluster for you and the network where your application is running. So after that VPC peering,
you are like running in the same virtual local network basically.
So there is not even one router hop in between. And that is very important
to keep in mind because otherwise your latency will suffer.
Okay, we covered all the patterns so far: embedded, client-server, cloud. They are quite old, in the sense that you probably know them from databases; they are nothing new. So now there will be a pattern that is quite new, that is very popular, especially on Kubernetes, but it's not limited to Kubernetes; in fact, you see sidecars in other systems as well. So, cache as a sidecar. How does it look? A request goes to our Kubernetes service, and it is forwarded, load balanced, to one of the Kubernetes pods. And now inside each of the pods it is not only the application that is running,
but also a cache server. So the request goes to the application, and the application connects to localhost, where its cache server is running. And all these sidecar cache servers, sidecar cache containers, form one consistent caching cluster. So that is the idea. This solution is somehow similar to the embedded mode, and somehow similar to the client-server mode. It's similar to the embedded mode because Kubernetes will always schedule your caching server on the same physical machine, so you have your cache close to your application, and it scales up and down together with it. So it's kind of like embedded; there is no discovery needed, your cache is always at localhost. That is good. But it's also similar to client-server, because after all, your application uses a cache client to connect to the cache server. So there is no problem with the cache being written in a different programming language than your application. And there is some kind of isolation between the cache and the application; it's on the container level, which may or may not be good enough for you, depending on your requirements. How do you configure
this? So let's stick to this Spring example. In Spring, how do we configure this? This is our client configuration: we connect to the cache server at localhost, because we just know it's there. It looks like a static configuration, but actually the whole system is dynamic; we just know that the cache server is running at localhost. And in the Kubernetes configuration we have two containers: one is our application with our business logic, and the second one is our cache server, in this case Hazelcast.
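A minimal sketch of that client side, assuming the Hazelcast sidecar listens on its default port, could look like this:

```java
// A minimal sketch of the client configuration for the sidecar pattern;
// assumes the cache server sidecar listens on Hazelcast's default port 5701.
import com.hazelcast.client.HazelcastClient;
import com.hazelcast.client.config.ClientConfig;
import com.hazelcast.core.HazelcastInstance;

public class SidecarCacheClientExample {

    public static void main(String[] args) {
        ClientConfig clientConfig = new ClientConfig();
        // The cache server runs as a sidecar container in the same pod,
        // so it is always reachable at localhost; no discovery is needed.
        clientConfig.getNetworkConfig().addAddress("localhost:5701");

        HazelcastInstance client = HazelcastClient.newHazelcastClient(clientConfig);
    }
}
```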
A short summary of the sidecar cache. From the good sides, the configuration is again very simple, it's programming-language agnostic, we have low latency, and there is some isolation of data between the application and the cache. From the downsides, we again do not have flexible management, because your cache scales up and down together with your application, and your data is, after all, collocated with the application in the same pod, which again may or may not be good enough, depending on your use case. Okay, we covered sidecar.
The last caching pattern for today, last caching topology
will be reverse proxy, and reverse proxy will be something completely
different. It will be completely different than what we've seen so far.
It will be different because so far our application
was all the time aware that such a thing as a cache
exists. It was explicitly connecting to the cache server.
However, now we will do something different. We will put the cache in front of our application, so our application will not even be aware that such a thing as a cache exists. How does it look? A request goes to our system, and now, just before the Kubernetes service, after the Kubernetes service, or maybe together with the ingress in Nginx, we put a cache. First, if the value is found in this cache, we just return the response; it does not even go to the application. Only if the value is not found in the cache does the request go to the application. Nginx is
a very good solution because it's very mature, it's well integrated
with Kubernetes, and it's just something you should use if you go with the reverse proxy. How does the configuration for caching look? So that is the simplest thing you can do with Nginx: specify, okay, cache it on the HTTP level, and that is the path for your cache.
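As a rough illustration, a minimal sketch of such an Nginx configuration (with illustrative paths, sizes, and times, and a placeholder upstream service) could look like this:

```nginx
# A minimal sketch of HTTP caching in Nginx; paths, sizes, and times are
# illustrative, and the upstream service name is a placeholder.
http {
    # Where cached responses are stored and how the cache zone is sized.
    proxy_cache_path /var/cache/nginx keys_zone=app_cache:10m max_size=1g;

    server {
        listen 80;

        location / {
            proxy_cache app_cache;
            # How long successful responses stay in the cache.
            proxy_cache_valid 200 10m;
            # Cache misses are forwarded to the application service.
            proxy_pass http://my-application-service;
        }
    }
}
```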
So Nginx is good, it's mature, it's well integrated with Kubernetes, but it has some problems. One, maybe not a problem, but a trait of Nginx: it's HTTP-based. But okay, we are HTTP people, this is fine. Another problem, which is a bigger thing, is that Nginx is not distributed and Nginx is not highly available. And Nginx does not necessarily keep your data in memory; it can offload your data to the disk, which, for example, is not the case in Hazelcast, where you are guaranteed that your data is stored in memory, so the latency is low. That is why you have to accept these trade-offs if you use Nginx. But still, Nginx is a very good solution.
Now the last, last variant of the reverse proxy
will be reverse proxy sidecar caching. So this will be the last
variant of the reverse proxy topology. It looks like this: a request goes to the Kubernetes service, which load balances the traffic to one of the Kubernetes pods. But now it's not the application that receives the request, but something that we will call a reverse proxy cache container. It is like a cache server, but also like a network interceptor which checks what goes to the application, and only if the value is not found in this reverse proxy cache container does the request go to the application. So the application, again, does not even know that such a thing as a cache exists. But okay, all this, that the application is not aware of the caching, is good and bad.
I mean it has some good sides and bad sides,
maybe starting from the good sides. So why is it good that the application is not aware of caching? You remember this diagram from the beginning of the presentation: a lot of services, in different versions, different programming languages, and they use each other. That is, by the way, a very small microservice system; in general, you have way, way bigger ones. So now you can look at the system, you can look at the architecture design, you can look at a diagram like this and say: I would like to introduce caching to service 1 in version 1, and that's it. I mean, you don't need to change the code of the service in order to introduce the caching layer. So you can add the functionality of caching in a declarative manner. And that is the whole beauty of reverse proxy caching. It simplifies a lot of things.
So how does it look in practice? In practice you will usually have, starting maybe from the containers at the bottom, your application, then your cache server and the interceptor, like a caching proxy. And then you need some init container which will basically do some iptables changes, so that the request from outside does not go to the application but goes to this caching proxy. But okay, if we look at this diagram and at this idea of modifying your system in a declarative manner, it may make you think about Istio and all these service meshes. And that is actually true, because it is the same idea. And recently, actually, in Envoy proxy they added support for HTTP caching. So this reverse proxy sidecar caching will become a big thing, together with Envoy proxy and all the service meshes that use Envoy proxy, like, for example, Istio. I really think this will be the way a lot of
people will do caching. But okay, like I said, there is no such thing as a free lunch. There are some bad sides to this whole idea that the application is not aware of caching. And if you think about it, if the application is not aware of caching, there is one thing that becomes way more difficult, and this thing is called cache invalidation. And actually, if you look anywhere on the Internet for what the hardest problem with caching is, it is cache invalidation, meaning when to decide that your cached value is outdated, it's stale, you should not use it anymore but go to the source of truth instead. And this is not a trivial problem. When your application is aware of the cache, then you can have some business logic to evict the cached value, or do basically anything you want, depending on the business logic. However, if your application is not aware of caching, then you are left with what HTTP offers, like timeouts and ETags, basically timeouts, and that's what you can do. So that is the biggest
issue with reverse proxy caching. A short summary of reverse proxy caching. From the good sides, it's configuration based, so you don't need to change your application in order to introduce caching; that is actually the reason why you would use reverse proxy caching. It's also programming-language agnostic, it's everything agnostic, because you don't even touch the code of your application, and it's very consistent with the container and microservice world and with Kubernetes. That is why I really believe it will be the future of caching. However, from the downsides, it's difficult to do cache invalidation. There is no mature solution yet; I mean, it was just implemented in Envoy proxy, so it will be mature soon. And it's protocol based, which is not such a big deal, as we discussed.
So we covered all the caching topologies, all the caching patterns.
So now what I suggest as a short summary, I will try
not to repeat anything I said before because it may be boring.
So what I propose as a summary is a very simple, maybe oversimplified, decision tree which can help you decide which caching pattern is right for you. The first question I would ask is: do my applications need to be aware of caching? If no, am I an early adopter? If no, use a reverse proxy with Nginx. If yes, use reverse proxy sidecar caching, like with Envoy proxy, Istio, or some other prototypes. If your application needs to be aware of the caching, then your next question is: do I have a lot of data or some security restrictions? If no, do I need to be language agnostic? If no, use embedded or embedded distributed caching. If yes, use sidecar caching. Now, if you do have a lot of data, you work in a big organization, or you have some security restrictions, then the last question you need to ask is: is my deployment in the cloud? If no, run your own client-server on premises. If yes, use a cloud solution. So, as I said, it's maybe a little oversimplified, but at least it gives you the direction where to look for the right topology for
your caching. As the last slide of the presentation, I would just like to mention a few resources. If you would like to play with all these patterns and run the code, here are the links. The first link is where you can just run the code samples. The second link is a blog post on how to configure Hazelcast as a sidecar container. The third one is our prototype, nothing that you can use in production, a prototype of this reverse proxy sidecar caching with Hazelcast. And the last link is a very good video talk about Nginx as a reverse proxy cache. And with this last slide, I would like to thank you for listening. It was really a pleasure to speak to you.