Transcript
Hey everyone, I'm Michael Cade. I'm a senior global technologist
at Kasten by Veeam, where we focus on data management
around Kubernetes. But in particular, my background comes from data management across the board, whether it be physical, virtual, or cloud-based workloads, and now, even more recently, the requirement to protect workloads in a Kubernetes environment. So I just want to debunk some of those myths
as we go through some of this. But then I've also got some demos,
pretty cool demos as well, to actually debunk
those myths. Any questions, please let me know
either in the chat function or find me on social media.
Email address is also down below.
So I think we've all seen a slide that resembles a lot
of this information around data,
whether it be data on social media, whether it be machine
learning from autonomous cars, whether it be photos,
digital footprints from a personal and business point of view.
But ultimately, we know that data is growing and really it
doesn't matter where the platform is or what we're seeing
out there around that. And I think it's also important
to remember that the data is the new lifeblood
of everything that we're doing. It's the common denominator.
So regardless of where your workload resides, whether it's a web page
or whether it's an actual complete system,
databases, et cetera, that build up your whole system
that your company uses or your customers use,
then the data is probably the most important
thing to you at that point, whether it be a virtual machine, a cloud-based solution, SaaS, PaaS, et cetera.
But data is the common denominator, and we're seeing that across
the board. When it comes to failure scenarios, the ecosystem at the moment is dominated by things like ransomware; we're seeing a lot of that. And the
important thing to note is that all of these options
that we have, whether it be physical machines, whether it be virtual machines,
whether it be virtual machines in the cloud, or PaaS, SaaS based solutions,
but then even more so, containers and container orchestration engines,
the data still has to live somewhere, whether that be inside these platforms. So, a virtual machine with SQL Server, MySQL, or
NoSQL, or whether it's a cloud based workload that's running RDS within
AWS, or whether it's a container running a stateful set which has
your data service residing there within the same cluster as
maybe your front end website or front
end application, or whether it's taking advantage of data
outside of a cluster maybe leveraging that RDS approach as well.
But either way, we need to think about the data protection and the management
of that and make the correct choices,
because none of those platforms are going to stop the failure scenarios: the accidental deletions, the malicious activity, both internal and external, the ransomware attacks, the security breaches. The platform itself is not going to protect against those.
Yes, we have high availability, we have fault tolerance across many of
these different platforms, if configured correctly. But one thing that is for sure is that we don't necessarily have backup built into those platforms. And that's where we want to highlight that and raise awareness of what it is and what it looks like.
Another important thing that I've been saying for a long time: if you went onto your favorite search engine and searched for containers and VMs, "containers versus VMs" is what you're going to find. And actually, that shouldn't be the case. The message that should be portrayed in our industry is that none of these systems have gone away. None of these platforms have gone away. We still have the requirement for physical systems, for virtualization, for cloud and containers. It's about having an awareness of what each one brings so that
we can make the right decision for our application and for our data within
our businesses. None of them are going away.
Yes, we saw a massive consolidation of physical
systems into virtual, but ultimately there
is still the requirement around physical systems.
But what that then gives us is this freedom of choice as to where we can store it. And obviously there's a lot of technology built on top of
these platforms or the areas that I just touched on, but we have
to make that decision. And if you only know about virtual
and physical and maybe a bit of cloud, as a systems
administrator, DevOps engineer, platform engineer, et cetera,
you're going to tend to go with what you know,
whereas what we're trying to do is raise awareness of these other platforms
out there in other sessions to let you know
that actually you might be better off using Kubernetes or containerized workloads, or cloud services like RDS from AWS. So one
of the key aspects that I kind of want to get across today, because it's the same but different, is that we're going to focus in on Kubernetes and data management, and in particular on storing and protecting your data via backup and restore. So we've been doing backup
and restore for a long time, obviously way back
from a physical systems point of view with an agent; then virtualization came along and we started hooking into the native APIs; cloud, exactly the same. And now here we are with Kubernetes.
So it's about choosing the right tool for the right job, but also being
able to leverage some of the platforms underpinning APIs
to be able to take a more efficient and fast way of being able
to protect that data, as well as being able to restore.
No one really cares about the backup, you only really care about the restore,
but you have to back up to be able to restore.
Another key area is disaster recovery. I mentioned earlier: oh, we've got high availability, we've got fault tolerance built into these platforms. However, disaster recovery is not built into them. So we have to think about what happens in
the failure scenario of fire, flood and blood, what happens
if we need to bring up that data somewhere else, that mobility
of data. And that leads me on to other use cases that then get highlighted from a data management point of view, such as application mobility: how can we move that data wherever we want?
And whether that is to reduce risk, whether it is to increase
efficiency or whether it is to reduce cost,
one or more of those three things is going to impact the business in a positive way, or in a negative way if you are not able to do it. So one of the things that we've been massively promoting from a Kasten K10 point of view is the ability to move data from A to B, and back again if need be. But also think about things like being able to clone that copy of the data and put it to work. So let's
get straight into the first demo. Now, what I've got here is a
very simple environment, and it actually leads on
from another demo that I did in another session that built
out this environment. But for the purposes of this demo, we have a three node cluster: a control plane, node one, and node two. We have a service within our Kubernetes cluster; we have a web application which is written in Node.js; and we have a database that is using MongoDB. And within that, we have a persistent volume claim that is provisioned via the CSI driver, using the hostpath driver in this instance.
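To make that concrete, here's a rough, hypothetical sketch of what that storage wiring can look like. The names, namespace, storage class, and sizes below are made up for illustration, not taken from the actual demo manifests:

```yaml
# Hypothetical sketch: a claim backed by a CSI hostpath storage class,
# mounted into the MongoDB pod. Names and sizes are illustrative only.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongo-storage
  namespace: pacman
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: csi-hostpath-sc   # backed by the CSI hostpath driver
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongo
  namespace: pacman
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
    spec:
      containers:
        - name: mongo
          image: mongo:4.4
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: mongo-db             # MongoDB writes its data here
              mountPath: /data/db
      volumes:
        - name: mongo-db
          persistentVolumeClaim:
            claimName: mongo-storage     # binds the pod to the claim above
```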
But what this demo is really highlighting is
that everything is built into the same platform, into this one cluster, because the next demo is going to talk about
data outside of the Kubernetes cluster that still needs to be protected.
But how do we concentrate on the whole application? Because it's
all well and good being able to take a copy of that persistent volume claim,
which some other software can do,
and that might be enough for your requirements when it comes to that failure scenario.
But in my opinion, you're going to want to be able to capture that whole
application. You're going to want to be able to restore the service,
the web app, the database, the ingress that goes with that,
the persistent volume claim that goes with that, all of the external
data that lives in that mongo database as well. So let's
get into that. And I'm obviously using my mission critical application, Pacman. For those that want to see this, it's open source; it's out there as a Helm chart now, as well as deployment YAMLs. So: mission critical application; a front end written in Node.js acting as my front end; and a back end database, where all my mission critical high scores are living, on a persistent volume. So let me try and... okay, so it's kicking off.
So what we're going to do is we're going to play a quick game.
Let's rack up one of those high scores. And if you happen
to be watching this and you go to app.vzilla.co.uk, depending on what other demo I'm doing at the time, because I tend to use that DNS name quite a bit, you might find that you can have a play.
As you can see here, I have a lot of very important high scores
across different Kubernetes clusters as well. You can see that
we pick up some of that important information as we go.
Now, I didn't log a score that hit the high leaderboard.
So if I now go into... and this is talking about the application mobility for the rest of the demo. So here we have our Pacman namespace within our Kubernetes environment. But now I'm going to switch to a secondary system, maybe a disaster recovery site; in particular, I'm going to go to AWS EKS. And now, if I go get namespaces, you can see I don't have Pacman here. I have Kasten running, I have Kasten running in both clusters, but I don't have the Pacman namespace. And what I want is to be able
to bring that data, that important data, I want to bring that to my
EKS cluster. Now that could be a migration, that could be disaster recovery,
and it could also be a clone: there might be a service within EKS or AWS that I want to take advantage of, and that data could really do with some of the services that are native there, versus it being in GKE or AKS.
Now another thing that we can do with that restore policy is being
able to transform what that looks like. Because what we had in
the primary cluster, we might be using storage type A and we might be using a load balancer; but when we go to EKS, I want to change that, because I want it to come up on a different storage tier, and so on and so forth. So I'm
now running through this restore policy that can be scheduled and
we can automate that. You saw the frequency on there and then we've got this
import policy that we're now going to run within
that. So if I jump into my EKS cluster and that was
a snippet just before where you could see all of my clusters; that's a snippet of K10 and its multicluster capabilities. So now I'm in my EKS cluster, and I'm running that import policy to bring that Pacman, that whole application, in a consistent fashion over into my EKS cluster with all of the transformation
that I need to get it over there. So if I now go and look
at that namespace, which by magic has now been created,
but also now you can start to see the restore configuration,
you see a deployment for both Pacman and Mongo.
And now if I go and check the services that we have
within there: now, this won't be changed, but it could have been. If this was a migration, I could decide to change that DNS entry for app.vzilla.co.uk, which is pointing at another forward-facing IP address or DNS record on the Internet. Now we're going to an AWS session; you can see up at the top it says AWS. And if I
go into this... so, I've restored this now and it's running in EKS. Now, the most important thing is: do I have those mission critical high scores? Yes, I do. Everything's in there. All good. And that's exactly what
we wanted to show. So this just highlights a few of those
areas. Yes, backup and recovery is super important; it's kind of table stakes, but you've got to do it in the right way. K10 lives within the Kubernetes cluster, so it leverages the Kubernetes API, which makes us more efficient at capturing the whole application. You can see there that we've just shown the completed, successful run of that.
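Everything we just drove through the dashboard is also just Kubernetes objects under the hood, because K10 policies are custom resources that can be written and versioned as YAML. As a rough sketch only (the field names follow my reading of the K10 Policy CRD, and the profile name and receive string are placeholders, so check the K10 documentation for your version), an import policy like the one we ran can look something like this:

```yaml
# Approximate sketch of a K10 import policy as a custom resource.
# Profile name and receiveString are placeholders, not real values.
apiVersion: config.kio.kasten.io/v1alpha1
kind: Policy
metadata:
  name: pacman-import
  namespace: kasten-io
spec:
  comment: Import the Pacman application exported from the primary cluster
  frequency: '@hourly'              # how often to pick up new restore points
  actions:
    - action: import
      importParameters:
        profile:
          name: aws-s3-profile      # location profile pointing at the shared bucket
          namespace: kasten-io
        receiveString: '<migration token generated by the export policy>'
```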
Let me jump in. So what we just spoke about was very much storage within Kubernetes, leveraging persistent volumes, persistent volume claims, and an external storage volume. Now, this hasn't always been the case from a Kubernetes point of view. This is what we just did: we had a stateful set (in fact, I think it was a deployment), but it was using a persistent volume claim, a persistent volume, and our external storage volume.
I'll work backwards on this. So we have the Container Storage Interface. The CSI driver enables all storage vendors to write against the framework that Kubernetes, or rather the community, has developed, so that we're marrying up Kubernetes functionality with storage vendor functionality. That's better than the old in-tree provisioner, which had to wait for a full code release every single time Kubernetes was released. So, without going into too much of this detail (all of these slides will be available afterwards), basically what this means is that whatever storage vendor you're choosing here, we've got the ability to use it within our Kubernetes environment if we have a CSI-compliant driver, as well as things like volume snapshot classes, which is what we're going to use to take a very efficient point of recovery. I wouldn't say that snapshots are the only point of recovery we should have, and we should also have an export out into object storage or another storage layer, but they give us a way of recovering super quick into the live production system if accidental deletion or something very small was to happen in that failure scenario.
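As a minimal sketch of that CSI snapshot plumbing (the class name, driver, and claim name here are assumptions carried over from the earlier example, not the demo's actual values), a volume snapshot class and an on-demand snapshot of the MongoDB claim look like this:

```yaml
# Standard Kubernetes CSI snapshot objects; names are illustrative.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-hostpath-snapclass
driver: hostpath.csi.k8s.io        # must match the CSI driver serving the PVC
deletionPolicy: Delete
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: mongo-snapshot
  namespace: pacman
spec:
  volumeSnapshotClassName: csi-hostpath-snapclass
  source:
    persistentVolumeClaimName: mongo-storage   # the claim to take a point-in-time copy of
```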
Now, the next thing that we want to talk about is: what if we've got a data service that is actually running external to the Kubernetes cluster? Maybe our Node.js front end is running in Kubernetes, but we've got a database server that lives on a virtual machine, or in a cloud virtual machine, or we're leveraging RDS. How do we get access to that? We can do that as well with Kubernetes, using config maps and secrets. What that allows us to do is use the Kubernetes API to marry up the cluster with that external service, thus giving our Node.js application access to that database. We're actually seeing this quite a lot within environments.
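As a minimal sketch of that pattern, assuming a hypothetical endpoint and made-up credentials (only the rds-app namespace name comes from the demo), the wiring can look like this, with the application consuming these values as environment variables:

```yaml
# Illustrative only: endpoint, database, and credentials are placeholders.
apiVersion: v1
kind: ConfigMap
metadata:
  name: rds-config
  namespace: rds-app
data:
  DB_HOST: mydb.abc123.eu-west-1.rds.amazonaws.com   # hypothetical RDS endpoint
  DB_NAME: dvdrental
---
apiVersion: v1
kind: Secret
metadata:
  name: rds-creds
  namespace: rds-app
type: Opaque
stringData:                  # stringData avoids hand-encoding base64
  DB_USER: app_user
  DB_PASSWORD: change-me
```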
Okay, so as you can see here, I have the RDS. This is actually
a Postgres database within my RDS cluster; you can see where it's located, et cetera. So think of this as where all my mission critical application data is going to be living. And in our Kubernetes cluster we have a namespace called rds-app. If we then go and take a look inside of that, we have a config map that says how we connect our application to our RDS instance, and we should also have a secret in there as well, which gives us the DB creds, as you saw on the slide before. So I've also got
Kasten K10 deployed, and in here I've already got a policy created. Now I'll hit this run once, and I'll come back to what that looks like shortly. So if we
now go into this, basically what this is doing is we're giving it
a name; we're giving it our comments, if we want to give it a description; and we're saying what we want to do with it, which is a snapshot action. And we're saying when we want to run it, so I could have it on a backup frequency or just have it on demand. Then I'm saying how: we can just take a snapshot, but we can also export that out into a separate location, an object storage location. In this instance I'm sending it to AWS S3. And there's a few more options around this. So we get
to choose what application we actually want to protect; we can do this by namespace or we could do it via labels. And then, what application resources do you want to specifically capture? I want to just do everything in here. And we also have something where we dive into the Postgres database, or any data service, that enables us to quiesce that workload and its application data so that we've got a consistent copy of it. So that should be running, and
it takes a little bit of time. So I'll probably speed this up.
But if we then go back in here and we do a refresh,
we should start to see that we've got this backing up
status. And then what we'll do is we'll just
wait for this to complete.
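For reference, what we just clicked through in the dashboard corresponds roughly to a policy resource like the sketch below. Again, the field names are based on my reading of the documented K10 Policy CRD, and the location profile name is a placeholder, so treat this as approximate:

```yaml
# Approximate sketch of a K10 backup policy with an export to object storage.
apiVersion: config.kio.kasten.io/v1alpha1
kind: Policy
metadata:
  name: rds-app-backup
  namespace: kasten-io
spec:
  comment: Snapshot the rds-app namespace and export it to S3
  frequency: '@daily'
  retention:
    daily: 7                        # keep a week of daily restore points
  actions:
    - action: backup
    - action: export
      exportParameters:
        frequency: '@daily'
        profile:
          name: aws-s3-profile      # location profile for the S3 bucket
          namespace: kasten-io
        exportData:
          enabled: true             # copy the volume data, not just metadata
  selector:
    matchExpressions:
      - key: k10.kasten.io/appNamespace
        operator: In
        values:
          - rds-app
```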
And then we've got the ability to use that data to recover from, like we saw in the previous demo, in exactly the same fashion: this dvdrental database. And I can actually restore this into an EKS stateful set. And again, we've got that same database that we've recovered into.
But let's just make sure that all of this works. So, the backup initially takes that first snapshot, which is here, and then what we're doing after is exporting that out into our object storage. This is also running across clouds: my Kubernetes cluster is a GKE cluster, and we're connecting our application into
RDS. So there's a freedom of choice when
it comes to where you want to run your workloads and where
you want to run your data as well. For that use case where
data doesn't always have to reside in the same location or on the same platform as your application. It might be that you're using Kubernetes from a compute point of view in whichever geo or cloud provider you wish, but then you're leveraging something like a PaaS solution, such as RDS, to give the best option for the data.
Okay, so that's everything complete. And if we go
back into our RDS, we should see that we're now
back to being fully available, although we didn't take anything offline. And again, in Postgres we've got the ability to see that database, and the one from a previous restore; we've got that in an EKS cluster. Now, from a K10 point of view, if we go back to the dashboard, we go
into our applications, we have several restore
opportunities, either from a snapshot or from the exported
copy, where we can say: okay, something bad has happened, I need to restore this, and I want to restore everything that goes with it back into our environment.
So I think from that point of view, obviously data can reside
anywhere, but you still need to be able to protect that. If you just used
a point solution that was protecting Amazon RDS,
then you wouldn't have any idea about the whole application.
So if you had to maybe capture some of that dvdrental front end application: maybe it's not just built of a front end and a back end database. Maybe there are other microservices that build up that application. Maybe there's some sort of metrics and logging that we also want to capture in
a consistent fashion. So just capturing that point or that
RDS instance is not going to be enough. And that's kind of the same ethos as protecting the LUN that's coming out of the storage system: you could just protect that, take a NAS backup, do something with that. But when it comes to restore, what does a restore look like then? I'd rather have everything as an application, recover it as an application, and then get granular about how you recover it.
I think another misconception is that stateful workloads within Kubernetes are the only ones that need to be protected. However, I could argue that many of the applications you might consider to be stateless workloads still have some sort of data that you would like to retain, whether that be simply logs or visualizations of those, and also complex environments, being able to protect those.
And if you've got more than 100 or 200 namespaces
full of different applications in your Kubernetes cluster,
that's another use case: yes, you might have the actual source code and be able to recover very quickly, but what if you don't know which one it is? And we've got customers doing the same thing there.
This is me going back to that point about all of those different platforms still being available. And when we get to Kubernetes, there's no shining light, no shiny green button to say: oh, we don't have to back up stuff anymore, everything's sorted, Kubernetes has fixed it all. That's not the case. We still need to recover failed applications.
There's still accidental deletion, it's still a database
at the end of the day. It still requires that application consistency
so that we can recover that. And more to the point, that data
is still the most important asset probably within your business as well. So we
want to make sure that we're covered so that we can recover from any
of those failure scenarios. These are just some of the challenges that we have: protecting persistent storage, complex stateless environments, individual stateful workloads, logs, and other areas. Application consistency is one that I don't have on here. But also things like stateless configurations
around load balancing, that IP address that I first mentioned in the
first demo could be a huge savior if you're
having to recover across different geos or across different clouds, just making sure that we can update the DNS
as part of that process. I've mentioned
this all the way through. The approach has to be on the application.
Now, it says Kasten's approach, but this should just be any data management approach; if you don't use Kasten, that's absolutely fine. I mean, we have open source tool sets as well that look after the application data, in Kanister. But we need to be looking at the whole application. So that includes the ingress, the pods, the services, the stateful sets, the config maps, the secrets, et cetera, and all of the persistent volumes. It needs to contain all of them together so that we can recover them all together,
and then there's the freedom of choice: Kubernetes can pretty much run anywhere, so we need to be able to protect all of those different areas and be able to run on all of them. But also,
no database is the same. We might be running Postgres; you saw me running Postgres and MongoDB in this demo. But maybe we're using Elasticsearch for our logging and metrics, or maybe we're using different tool sets for other areas of our data services. So being able to protect all of those, across different distributions and on different underlying storage, gives us that freedom of choice when it comes to it.
And then finally: how to get hands on with Kasten K10. We have lots of learning resources at learning.kasten.io.
I believe that QR code should take you there. If it doesn't, it will take
you to a hands on lab that is very similar, and that means that you
don't have to go and spin up your own cluster so
that you can get hands on and see what it does.
But yeah, with that, I'd just like to say thank you very much for
sticking with us. Hopefully those demos were useful.
But yeah, please reach out if there's questions at all and
enjoy the show. Thank you.