Transcript
So let's talk about what happens when clusters collide.
If you search any search engine for DR for Kubernetes or DR for EKS, you will feel like you're in the song: life could be a dream.
Everything is simple.
You just deploy two Kubernetes clusters in two different regions.
You set up some load balancer or a DNS in front of them.
Maybe you copy some data from one cluster to another.
Everything is super simple. But from working in the last years with many companies, from startups to large enterprises, I have experienced that nothing is simple about doing DR for Kubernetes, and it's actually pretty tricky.
And hi everyone.
I'm Lev, CTO at TerraSky, and TerraSky is a global solution integrator.
We help our customers make sure their Kubernetes is secure, monitored, resilient, and maintainable.
What we see is that Kubernetes provides us a lot of power, right?
It's very self-service focused.
And it's super convenient for developers, but it comes at a price in how easy it is to set up DR, and there's a reason for it.
If you take a look at how things were in the past, someone who was responsible for the infrastructure would probably create the compute, the servers, the disks, the load balancers, configure DNS, and issue the certificates.
So everything was provided for the developer, and it was maybe more manual and slower, but the DR process was more straightforward.
Whoever set up this environment knew how to actually do the entire failover and what was required to do it.
Today, developers can actually ask for many of those things by themselves: they can get load balancers through Ingress, they can get disks through PVs and PVCs, they can issue certificates through cert-manager, and we'll see how things become more complex when we're talking about DR.
So let's set the stage, right?
We will have two AWS regions in this talk, with one EKS cluster in each of the regions, and we want to have disaster recovery.
And I'm not going into the philosophy of RTO and RPO; you decide what those mean for you and what you are looking for. I will just show you a couple of examples.
So we will start with a simple, naive one that probably rarely happens in reality.
Let's say we have a simple stateless deployment, and this deployment is
receiving no traffic from outside.
It just runs, maybe processes something.
Easy stuff.
So I take this YAML of a deployment of a sample app, two replicas.
This is our image.
I will just deploy it to two different regions: us-east-1 and us-east-2.
Two deployments, easy, right?
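Roughly, that deployment manifest might look like the sketch below; the names and the image are placeholders, not the exact ones from the talk.

```yaml
# Hypothetical stateless deployment, two replicas; apply the same file to the
# cluster in each region (for example us-east-1 and us-east-2).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      containers:
        - name: sample-app
          image: nginx:1.25   # placeholder image
          ports:
            - containerPort: 80
```

You would simply apply the same file against each cluster's context.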
We deploy to the first cluster, we deploy to the second cluster.
In case of an emergency, we can switch the traffic or deploy on demand, or we can have them active in both clusters.
Nothing to it.
Let's do a little more.
Let's say we have a simple stateless application or
software with incoming traffic.
No encryption at this point, which probably never happens in a real production environment.
And at this point, we're doing manual DNS management, right?
So what will it look like?
So same deployment.
We will have our container with port 80.
We will have our service that is forwarding the traffic to this port 80.
And we will have the ingress that will actually listen on this FQDN, main.levlabs.link, and it'll forward everything that is going to / into our service.
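A rough sketch of that service and ingress pair, assuming an nginx ingress class; the object names are placeholders.

```yaml
# Hypothetical Service/Ingress pair for the plain-HTTP example.
apiVersion: v1
kind: Service
metadata:
  name: sample-app
spec:
  selector:
    app: sample-app
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: sample-app
spec:
  ingressClassName: nginx          # assumption: an nginx ingress controller
  rules:
    - host: main.levlabs.link      # the FQDN clients will use
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: sample-app
                port:
                  number: 80
```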
Again, easy: we just deploy it to the first cluster, and we can deploy it at the same time to the second cluster.
So in case of an emergency, we can create some kind of a CNAME that will point to one of the clusters, or we can have it active-active with round robin or some kind of weighted DNS.
We just have to adjust DNS.
Not complicated.
Now, if we're talking about persistent volumes, there are
many ways to achieve that.
And this is actually what you will find when you're searching in the search engines.
So here's an example of how you can do it with Velero.
You will just have a cluster, you will create a backup, and put it in your object storage.
Then you can have another cluster that can take it from the object storage and restore it.
Here's a link if you want to check how to do that.
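As a rough illustration of that flow, here is what the Velero objects might look like: a backup in the source cluster and a restore in the target cluster, both pointing at the same backup storage location (names and namespaces are placeholders).

```yaml
# Hypothetical Velero Backup, created in the source cluster.
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: sample-app-backup
  namespace: velero
spec:
  includedNamespaces:
    - sample-app                 # back up only the application namespace
  snapshotVolumes: true          # include persistent volume snapshots
  storageLocation: default       # object storage both clusters can reach
  ttl: 720h0m0s
---
# Hypothetical Velero Restore, created in the target cluster, which pulls the
# backup from the shared object storage and recreates the resources.
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: sample-app-restore
  namespace: velero
spec:
  backupName: sample-app-backup
```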
Another option, you can do it with commercial products like Portworx.
So Portworx will be running continuously.
This is asynchronous replication disaster recovery.
Portworx will take the data, periodically put it in object storage, then another
cluster will pull it from the object storage and continuously restore it.
And then you can execute commands to failover to another cluster
and just launch the workload.
Very convenient, very easy in terms of data.
But that's not why we're here.
We want to talk about Kubernetes and self service, right?
And there are two great projects.
One of them is called ExternalDNS and the other one is cert-manager.
So what do they do?
Imagine a world where you can create DNS records automatically and you
can generate certificates on demand.
Now stop imagining and just create this ingress.
And this ingress, because of this annotation and the cert-manager that is installed, will go and issue the certificate for main.levlabs.link because of this TLS section, it will save the certificate in this secret, and it will also generate the A record in Route 53.
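A minimal sketch of such an ingress, assuming cert-manager and ExternalDNS are already installed and a ClusterIssuer named letsencrypt-prod exists (that name and the secret name are assumptions, not taken from the talk).

```yaml
# Hypothetical self-service ingress: cert-manager issues the certificate for the
# TLS host via the cluster-issuer annotation, and ExternalDNS creates the
# Route 53 record from the rule host.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: sample-app
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod   # assumed ClusterIssuer name
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - main.levlabs.link
      secretName: sample-app-tls       # cert-manager stores the certificate here
  rules:
    - host: main.levlabs.link          # ExternalDNS creates the DNS record for this host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: sample-app
                port:
                  number: 80
```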
Very convenient.
Great self-service for any developer who just wants to deploy their software.
This is what it'll look like: you will create an ingress, the ingress will go to Route 53 and update the record, and it'll go to Let's Encrypt and generate the certificate.
This is great, but with great power comes great responsibility, as we know.
So let's talk about the problems that it creates.
The first one is the traffic routing problem.
I can't deploy the same ingress twice because of what I just showed you.
If I deploy the same ingress twice, external DNS will now fight
over who the record belongs to.
So that's a problem.
So I need an ingress for the active cluster and I need an ingress for the DR cluster, and I need a CNAME for a main record that will actually forward the traffic to the relevant cluster.
But there's a problem.
If I create this CNAME for main and I actually access my clusters, where I only have an ingress for active or an ingress for DR, no one will listen to the main host header.
So what I actually have to do is something that looks like this.
I will have two clusters, with an ingress for primary.levlabs.link and an ingress for dr.levlabs.link, that will auto-configure Route 53.
And I will have a separate record called main that I control, that will point to those ingresses, which are actually forwarding to the same service.
This way I can control which cluster I want to access and when.
Let's see how that looks in terms of configuration.
So here we have an example of the DR side, right?
This is an ingress for the DR app that listens on dr.levlabs.link and forwards to the DR application.
And here we have an ingress for the primary, so it will listen on the host primary.levlabs.link.
So those will create the route 53 records automatically.
And then I will have the ingress for the main.
As you can see, what I'm doing here is saying that the ExternalDNS annotation for this one sets the hostname to empty.
This way, main.levlabs.link will not try to create any records in Route 53.
And this is how I can create my CNAME and manually control which is the active cluster.
And then in each cluster it will point to the relevant service, whether it's active or DR, and it's the same name in both of them.
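Sketched as manifests, the primary cluster might carry an ingress pair like the one below; the DR cluster would be the mirror image with dr.levlabs.link, and the names are placeholders.

```yaml
# Hypothetical ingress pair in the primary cluster. The first lets ExternalDNS
# create primary.levlabs.link in Route 53; the second serves main.levlabs.link
# but tells ExternalDNS to create nothing, so the main CNAME stays manual.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: sample-app-primary
spec:
  rules:
    - host: primary.levlabs.link       # ExternalDNS creates this record
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: sample-app
                port:
                  number: 80
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: sample-app-main
  annotations:
    external-dns.alpha.kubernetes.io/hostname: ""   # empty, so no record is created
spec:
  rules:
    - host: main.levlabs.link          # answered only once the manual CNAME points here
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: sample-app
                port:
                  number: 80
```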
And this is the CNAME in my Route 53, right?
So you will see that it's a CNAME, and it's actually forwarding now to primary.levlabs.link.
So this is the record, and this is what it points to.
And I have to have a very short TTL so I can adjust it and fail traffic over quickly.
Let's talk about the second problem, the certificates.
Now we have an FQDN; this is how our customers will access it, right?
So we have an FQDN called main.levlabs.link, but it doesn't match dr.levlabs.link or primary.levlabs.link, which are auto-generated from our ingresses.
So this is what it looks like.
And those are the ingresses that are creating this certificate.
And we have to figure out some way to solve it.
So here's how I'm doing it.
I will have the ingress of our sample app, let's say in the primary, but inside of the TLS section I will have multiple hosts.
This is how I will create the SAN, the subject alternative name, right?
So I will have multiple FQDNs here.
This will cause cert-manager to go and issue the certificate: one certificate with both FQDNs, and it will place it in this secret name, ingress-cert.
Then I will actually create my main ingress for main.levlabs.link, but it's not actually going to Let's Encrypt at all.
Instead, I have the same TLS hosts, and I'm leveraging the same secret that was issued here.
So this way, I will have an ingress that is listening on main, that actually has a certificate matching its FQDN, and it will be able to receive the traffic.
But it's not trying to generate the Route 53 record or the certificate for itself.
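Put together, the TLS parts might look like this sketch: one ingress asks cert-manager for a certificate with both names as SANs, and the main ingress just reuses the resulting secret (the issuer name and secret name are assumptions).

```yaml
# Hypothetical primary ingress: cert-manager issues one certificate whose SANs
# cover both primary.levlabs.link and main.levlabs.link, stored in ingress-cert.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: sample-app-primary
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod   # assumed issuer name
spec:
  tls:
    - hosts:
        - primary.levlabs.link
        - main.levlabs.link
      secretName: ingress-cert
  rules:
    - host: primary.levlabs.link
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: sample-app
                port:
                  number: 80
---
# Hypothetical main ingress: no cert-manager annotation and no DNS record; it
# only terminates TLS with the secret issued above.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: sample-app-main
  annotations:
    external-dns.alpha.kubernetes.io/hostname: ""      # no Route 53 record
spec:
  tls:
    - hosts:
        - primary.levlabs.link
        - main.levlabs.link
      secretName: ingress-cert       # reuse the certificate issued above
  rules:
    - host: main.levlabs.link
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: sample-app
                port:
                  number: 80
```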
And this is what it will look like.
Here's my Safari that was able to successfully access the site main.levlabs.link.
And you can see that I actually have a DNS name for main and a DNS name for primary.
And if I fail over, I will actually have main and dr.
So here is what this solution actually looks like, if I take everything and just show it in one picture.
I will have two AWS regions, with one EKS cluster in each region.
I will have self-service for certificates, self-service for DNS, and a functional DR where I can control where the traffic is going.
And this is how, right?
So I will have my ingresses that will access Amazon Route 53 and Let's Encrypt.
And I will have my ingresses that are forwarding to a service that is forwarding into my deployment, and potentially I can have a persistent volume with data replication that I can copy continuously with, let's say, Portworx, or periodically with Velero or similar commercial products.
It's important to remember that this is a true Pandora's box, right?
For example, not everything has to be replicated.
There are a couple of gotchas.
One example is that you cannot replicate or copy the same ingress into both places, especially if you're doing things like self-service with cert-manager or ExternalDNS.
Another example is that you probably want to change some of the labels that you're restoring in the other cluster.
Maybe you need to change them, adjust them, something like that.
So you can't just copy-paste.
A lot of people or a lot of companies forget about the dependencies.
They're really focused on just making sure they can switch stuff
from one cluster to another.
But there are things potentially close to the cluster that can fail.
For example, the manifest repository, where you're pulling all your YAMLs, all your Helm charts, all your manifests.
The second thing, obviously, is the image repository: it has to be accessible and available in the second site, or from the second site, when you're doing the failover.
And obviously you have to address the monitoring, the CI/CD, and authentication.
Now, in addition, as you can see, many of the companies are really focused on the data.
They copy the data, and from their perspective, this is all they had to do.
But the reality is that the connectivity aspects are usually ignored, and they're super important to take into consideration.
And you have to drill your DR, do your chaos engineering and just test this.
Thanks a lot.
I hope it was helpful.
If you have any questions, please talk to me.
I will be available in the chat.
Or email me.
Thanks a lot.