Conf42 Kube Native 2024 - Online

- premiere 5PM GMT

Using a cloud native approach to organize data protection

Abstract

Modern cloud applications and platforms make it easy to use multiple data services and tools, resulting in a wide range of deployments, services, and data stores. In this talk we're going to learn how to use K8s tools to organize data protection and make day-2 operations easier using the CNCF project Kanister.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hi, I'm Daniil Fedotov. I'm a senior open source engineer at Kasten by Veeam, and today I'd like to talk about how we can use a cloud native approach and cloud native tools to organize data protection for applications deployed on Kubernetes, using Kanister, a CNCF project and open source tool built for exactly that.

Let's start with why we do this. Kubernetes deployments have come a long way since their first introduction, but there are still some myths around data on Kubernetes. The first is that everything is stateless. That isn't true: Kubernetes itself has state, and that state changes as things are added and removed. People also run stateful workloads on Kubernetes, because their applications need state.

The next myth: why would I need to protect my data when I can just recreate my application? It's true that you can recreate the application, but the application has data, and without that data it could be useless. On top of that, security audit requirements, forensics, and similar obligations require you to actually save your data, and may even require point-in-time snapshots or point-in-time recovery, for disaster recovery for example.

A common myth in cloud development is that my cloud provider is going to do the backups. Again, that's only partially true. Your cloud provider might just lose your account; that has happened. And cloud providers explicitly say, in all their instructions on how to back up your data, that you should do some sort of export of your data, because that's something they can't really do for you. Finally, there are tools which can back up the etcd database in your Kubernetes cluster, but restoring it into something that works is almost impossible, and there's almost no evidence that it actually works. Such backups are still useful for tracing and forensics, though.

With all that said, the amount of data in Kubernetes and cloud native deployments keeps growing: more and more users are adopting data services, volumes, and databases on Kubernetes over time. Some of the most common applications run on Kubernetes are Redis, Postgres, and Elasticsearch, and all of these need some sort of data storage, disks and so on, and people actually run them there. So this is something people want to do, and it really needs management, because all of this data lives on disks which are vulnerable and which need to be protected for many reasons.

Another challenge we face with cloud deployments is following the 3-2-1 backup rule, which says you should have multiple copies of your data, keep them on different media, and keep at least one of them off-site. This is an additional challenge for Kubernetes deployments, because we need to move data outside of the cluster that we actually run on.

There are also some extra challenges of Kubernetes deployments in particular, compared to other types of deployments such as VMs. With how easy it is to create heterogeneous apps, you can run microservices, and they can have access to a variety of data services.
They can span multiple domains, have many moving parts, and use different cloud backends. So managing all these backups, which may or may not happen in your cloud infrastructure, managing disks, and managing workloads and application lifecycles (for example, quiescing an application so a backup can be performed and then bringing it back again) all gets more and more complex, because there are very different ways of deploying Kubernetes applications and very different ways in which those deployments access data. We could have a deployment which accesses some cloud database; we could have a database in the cluster in a separate namespace; or we could have data services co-located with the actual application code. All of these setups exist, and all of them need approaches to backup and restore.

On top of that there's another challenge, which gets more and more pronounced with Kubernetes and microservices: we can't just take a VM with our whole application on it, quiesce it, shut it down, collect the data, and restart it. We have many microservices. So running application-centric backups, where your application is consistent with your data backup and where backups of multiple data stores are all taken at the same moment, becomes a challenge. And a lot of the tools we have in cloud services just do one thing, like an RDS snapshot, or volume snapshots in Kubernetes or in the cloud provider's infrastructure. So how do we manage all that?

To manage that, I'd like to introduce Kanister. Kanister is a project created by Kasten. It is currently a CNCF sandbox project, so it's officially part of the CNCF organization, and it's moving towards the incubating stage. What Kanister contributes is essentially an open source framework which can be used by anyone, which can have integrations with all these different data services and data stores, and whose integrations can be shared and reused between different organizations. It really puts a soft standard on Kubernetes, I would say, similar to how Istio was introduced as a soft standard: it teaches people how to organize their backup and restore process and gives them tools out of the box. Here's your tool for backing up Postgres, for example.

How does it work? How can we achieve application consistency while supporting multiple topologies and multiple services? In your own bespoke scripts, you would have a script which calls all the data service and disk APIs to perform some sort of snapshot operation, then collects all the artifacts from them and presents them in some way. Then you need to run that script somehow, somewhere, at some point in time, and you need to store your artifacts somewhere. What Kanister does is move all of that into Kubernetes.

What we have there is a set of custom resource definitions, which are how you interact with Kanister. The first one is the Blueprint. A blueprint is a description of how you want to perform a backup or restore, such as: scale down the application, back up the disks, back up the database, scale the application back up. These are the same phases you would have in your script; they are just described in a slightly different manner in the blueprint.
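As a minimal sketch of what such a blueprint can look like, here is a backup action with a single phase that uses Kanister's generic KubeTask function to dump a Postgres database to object storage. This is not the RDS blueprint from the demo (that one uses Kanister's RDS-specific functions); the names, image tag, and config map keys here are illustrative assumptions:

```yaml
apiVersion: cr.kanister.io/v1alpha1
kind: Blueprint
metadata:
  name: postgres-blueprint            # illustrative name
  namespace: kanister
actions:
  backup:
    outputArtifacts:
      pgDump:                         # recorded in the ActionSet status on success
        keyValue:
          path: "{{ .Phases.dumpToObjectStore.Output.backupLocation }}"
    phases:
    - func: KubeTask                  # runs a pod with the given image and command
      name: dumpToObjectStore
      args:
        namespace: kanister
        image: ghcr.io/kanisterio/postgres-kanister-tools:0.110.0  # illustrative tag
        command:
        - bash
        - -c
        - |
          set -o errexit -o pipefail
          # "host" and "user" are assumed keys in the config map the
          # ActionSet points at; authentication is omitted for brevity.
          path="backups/pg-$(date +%s).sql.gz"
          pg_dump --host '{{ .Object.data.host }}' --username '{{ .Object.data.user }}' postgres \
            | gzip \
            | kando location push --profile '{{ toJson .Profile }}' --path "${path}" -
          # Record the location as the phase output used by outputArtifacts above
          kando output backupLocation "${path}"
```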
The ActionSet is the custom resource that runs a backup, restore, or cleanup operation. Upon creation it performs the operation, records the status of that operation, and records the artifacts produced by it. So action sets are essentially your log of actions: they are stored in the Kubernetes database and form a record of the operations. Finally, the Profile is a configuration which lets us point to different destinations for data, such as S3 buckets, Google object storage, or Azure object storage, so we can export data to different destinations.

This is roughly how it works. Your blueprint describes what kind of actions need to be taken. You create an action set; the Kanister controller picks this action set up, finds the blueprint, builds an execution plan, and runs all the operations on Kubernetes runners. The operations extract data, say from a MySQL application or some other data store, and save it into some object storage. Finally, they report the artifact and its status back to the action set, so you can read it from the action set.

If we look closer at blueprints: they are custom resources created in your Kubernetes database, and there are some example blueprints in the Kanister repository. These custom resources contain actions, which could be things like backup, restore, cleanup, or delete. Each action executes one or more phases, and those phases execute sequentially; each phase does something like run a pod, create an RDS snapshot, or some other step. What blueprints provide is a template for what needs to be done.

To fill in the template variables, we have the action set. An action set is a command to execute a certain action, and it fills in the values of the template. It can reference an object such as a config map, stateful set, or deployment; it points to a profile, which essentially points to the destination where data will be saved; and it describes which actions need to run and which template values need to be filled in. Action sets also do the important job of tracking the status of operations. If a phase fails, the action set holds the information about which phase failed and with what error; if a phase completes, it holds information about the artifacts that phase produced, and then we have a summary of all the artifacts produced by the phases in the action set. In this way action sets become your log of operations, and blueprints are your definitions of which operations need to be run.
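Concretely, a hand-written ActionSet for a blueprint like the sketch above might look roughly like this (the names are placeholders; the kanctl helper shown later in the demo generates this resource for you):

```yaml
apiVersion: cr.kanister.io/v1alpha1
kind: ActionSet
metadata:
  generateName: backup-       # create with `kubectl create -f` to get a random suffix
  namespace: kanister
spec:
  actions:
  - name: backup              # which action from the blueprint to run
    blueprint: postgres-blueprint
    object:                   # fills the object template values, e.g. .Object.data.host
      kind: ConfigMap
      name: dbconfig
      namespace: rds-demo
    profile:                  # destination for the exported artifacts
      name: s3-profile
      namespace: kanister
```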
Now we can continue with a more concrete example. I'm going to run an RDS backup, and I'm going to describe what I'm doing in the process: setting up Kanister, creating a blueprint, and then performing a backup of an RDS Postgres database. As the setup for this demo, we have an RDS database; we have our Kubernetes cluster with a database client; there's a config map pointing to this RDS database; and we have Kanister installed in the Kubernetes cluster, with access to an S3 bucket. An important note here: although we use S3, it could actually be any object storage; S3 was just easier to set up for the demo. Also, RDS and S3 in this case don't have to be in the same cluster or the same data center or anything like that; they are separate things.

Okay, let's run the demo. I have it pre-recorded, so sorry for the aspect ratio and for the mouse movements. As mentioned, we have an RDS database, a Kubernetes cluster, our test app deployment, and our S3 bucket, which is where we want to save our backups. First, let's populate some data into the database, just some Postgres tables, so we have some data we want to protect.

The next step is to install Kanister, which is pretty straightforward: all it takes is installing a Helm chart. We add the Helm repo, then install the actual Helm release, in the kanister namespace in this case. All the defaults for Kanister are usable out of the box. There are some additional parameters you can configure, such as cluster role bindings or security settings that scope access to namespaces, but it works out of the box with just a helm install. We can see there's an operator running, and we have a few custom resource definitions, blueprints, action sets, and profiles, which we're going to create as part of this demo.

Next we create a blueprint, which in this case we take from the Kanister repository. Ideally you would take the example blueprints as inspiration and create a blueprint for your own infrastructure, because you might have more than one moving part and might want to coordinate them, so that one executes first, another executes next, and so on. The blueprint here was created as a custom resource, and it's there: our cluster now contains the information for how to back up RDS databases. As you can see, it's full of template variables which need to be filled in by the action set.

As the next step, we create the profile, which is essentially a reference to the S3 bucket. We'll need the profile name in order to reference it from the action set; that's why I'm creating the profile here in the demo.

Finally, we create an action set, which performs the backup operation: as soon as the action set is created, the Kanister controller performs the backup. We're using the kanctl tool, which comes with Kanister and is just a helper for creating these custom resources. That's all it does; it's simply more convenient to pass a bunch of parameters than to write a YAML file with all the definitions. That's why we're using it here, but you could write the YAML file with the right spec and create it as a normal Kubernetes resource. We reference the blueprint, we reference the profile, and we reference the object, which in this particular case is our database config map; in other cases it could be a deployment or a stateful set, for example. Once our action set is created, we can get its name, the record in the Kubernetes database by which this action set can be identified. That's useful later, for example to create a restore from this backup operation. Now let's look at what's going on while it's running.
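The demo steps up to this point correspond roughly to the following commands (the bucket, region, and resource names are placeholders, and exact flag shapes can differ between Kanister versions):

```bash
# Install the Kanister operator with Helm
helm repo add kanister https://charts.kanister.io/
helm install kanister kanister/kanister-operator \
  --namespace kanister --create-namespace

# Check the operator pod and the custom resource definitions
kubectl get pods --namespace kanister
kubectl get crds | grep kanister.io

# Create the blueprint (taken from the examples in the Kanister repo)
kubectl create --namespace kanister -f rds-postgres-blueprint.yaml

# Create a profile pointing at the S3 bucket
kanctl create profile s3compliant \
  --access-key "${AWS_ACCESS_KEY_ID}" \
  --secret-key "${AWS_SECRET_ACCESS_KEY}" \
  --bucket my-backup-bucket --region us-west-2 \
  --namespace kanister

# Create an action set that runs the blueprint's backup action
# against the config map describing the RDS database
kanctl create actionset --action backup \
  --namespace kanister \
  --blueprint rds-postgres-blueprint \
  --profile s3-profile-abc12 \
  --objects v1/configmaps/rds-demo/dbconfig
```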
While it's running, we can see the currently running phase and the artifacts in the status. As you can see, those are not populated yet; they get populated as the action is executed, as phases finish and put their artifact information into the status of the action set. When everything is completed (I did some editing here, because this particular action runs for a while), we have the state complete, we have the time when each phase was executed, and we have the artifacts produced by all these phases. This is our completed operation. We could take the artifacts and combine them with a profile and the blueprint to create a restore operation, because this action set contains everything you need to know about the operation: which blueprint was used, which profile, where we sent the data, what the artifacts were named, and when it was executed. It's the full record of the operation. And in the S3 bucket we now have a file, which is the artifact created by this operation.

Now let's say something happens to our table. I'm just going to do a very simple thing: drop the table, which happens more often than disk corruption. In the real world, people drop tables all the time, so that's really one of the main scenarios for recovery, though of course other backup and restore scenarios are possible; I'm just going for the simple one.

In this command, I'm again using the kanctl tool, this time to create one action set out of another action set. This is where kanctl is even more convenient, because otherwise we would need to write a YAML file and fill it in with output from the status of the backup action set we just saw, while kanctl can do that automatically. All it does, again, is create a custom resource; it's just a simpler command. Here we select the restore action and point to the previous action set, from which it gets the blueprint, the profile, and the artifacts.

Now we can take a look at what this restore action set looks like. It exists, it completed, and it has the time when it completed. And of course we can look inside it. You can see that it got input artifacts in the spec, populated from the backup action set; it has a blueprint in the spec, populated from the backup; a profile, also populated from the backup action set; and an action name, restore, which is the one thing we specified differently in the arguments. Then in the status of this action set we see the operation that was performed, some output it produced (not really relevant right now), and the status complete. This is our restore operation.

Now we want to verify that our table is back, so let's just check the table. Here it is: we have our data back. That's pretty much it for the demo. The last thing to note is that we now have both action sets, for backup and restore, in our Kubernetes database. This is our log of operations, and it contains everything you need to know about what was performed: references to blueprints, references to artifacts, references to profiles, and a log of all the phases that were executed.
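For reference, inspecting that log of operations and creating the restore shown above come down to commands like these (the action set name is a placeholder):

```bash
# Action sets are the log of operations; status shows phases and artifacts
kubectl get actionsets --namespace kanister
kubectl describe actionset backup-xyz12 --namespace kanister

# Create a restore action set from the completed backup action set;
# kanctl copies the blueprint, profile, and artifacts over from it
kanctl create actionset --action restore \
  --namespace kanister \
  --from backup-xyz12
```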
Let's go back to the presentation. What did we just do in this demo? We created an RDS snapshot. We exported it to object storage, S3 in this case, but it could be something else. We ran all the operations on Kubernetes runners, so we didn't need to provision any new machines or run anything locally. We have blueprints describing exactly what we did, stored in the Kubernetes database (or they could live in your GitOps repo), where they are documented and can be reviewed and tested. And we have the whole operation log in the form of action sets: the action sets tell you what was run, what its output was, and when it ran. So you really get a holistic view of your backup and restore process, designed specifically for your application using blueprints.

Kanister currently ships a bunch of functions you can use in blueprint phases. KubeExec and KubeTask are quite useful because they run a pod with a command, which is universal enough for a lot of things, such as database dumps. We also support resource lifecycle operations, such as scaling workloads up and down; PVC operations to get files from volumes; volume snapshots; the Amazon RDS functions, which we demonstrated in this demo; and custom functions. And we support a whole range of object storage destinations.

And that's it. Thank you very much for joining this talk. Here are some links to our GitHub repo and our Slack channel. Please take a look, download a blueprint, and try it out yourself. Thank you.
...

Daniil Fedotov

Senior Open Source Engineer @ Kasten by Veeam

Daniil Fedotov's LinkedIn account


