Transcript
This transcript was autogenerated. To make changes, submit a PR.
Are you an SRE,
a developer, or a quality
engineer who wants to tackle the challenge of improving reliability
in your DevOps? You can enable your developers for reliability
with Chaos Native. Create your free account at Chaos
Native Litmus Cloud. Hello,
welcome to Conf42 Site Reliability Engineering
edition. My name is Ricardo Castro and I'm a Senior Site Reliability Engineer
at Farfetch. And today we're going to talk a little bit about GitOps.
So what we'll be covering today: we're going to cover what GitOps
is and some of the things that GitOps can do for us.
We're going to see GitOps implementations in terms of software
agents focused on delivering applications to Kubernetes clusters.
We're going to see a demo of one of those agents in practice,
and if we have some time we're going to see some extras.
So let us start with a story. It is a beautiful day.
I decided to make a change to our production environment, and it's
a little bit tricky. So as usual I submit my
pull request, and my colleague Joe decides to take a look and see
if everything is supposed to work okay.
He looks at it attentively, he looks around, and it seems fine.
So everything's good, right? We hit enter,
some pipeline triggers, and all hell breaks loose.
So we had a Kubernetes cluster that suddenly
disappeared. So at this point this is us,
right? We had a production cluster that suddenly vanished.
We had applications running there. We have clients already complaining
that they can't access our services. So we're in the
midst of a big confusion. Of course this
story has a second part. Thankfully, we
have everything in Terraform, so we just need to relaunch the cluster and we're back
to square one, right? So cool, we relaunched the cluster.
So now we get to the part that we need to figure out.
We launched the cluster, so we somehow need to deploy our
applications and configure everything in there. But because we
have a pipeline for mostly everything, we just need to trigger those pipelines.
Long story short, we started uncovering some
things that weren't supposed to be like that. We have a pipeline that
was deactivated, and we have no idea why, so we need to go talk
with a particular team to find out why that was. We also
have another pipeline that said it was successful, but the application
actually doesn't work, and we don't have any idea why, so we need to talk
to a specific team. Someone suddenly remembers that there were
some manual changes done there that were needed for the application
to actually work. So yeah, long story short,
the idea is that everything wasn't fine: we had
infrastructure as code, but there were some pieces here that actually
didn't fit our model. So what
is GitOps and how will it help us in these kinds of situations?
The idea behind GitOps is that the entire system is described
declaratively. What does this mean? Kubernetes is just
one example of many modern cloud native tools that are declarative
and treat everything as code. Declarative
means that the configuration is guaranteed by a set of facts instead of
a set of instructions. This means that instead of
me saying launch this server, put this package here,
start this service, I just declare state.
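As a quick illustration of declaring state rather than issuing instructions, a Kubernetes manifest like the following only states what should exist, and the platform converges on it (a generic sketch, not from the talk; all names and the image are hypothetical):

```yaml
# Declarative: we state the desired facts (3 replicas of this image),
# not the commands needed to get there.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      containers:
        - name: sample-app
          image: ghcr.io/example/sample-app:1.0.0
```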
And because we have everything declared in Git, we have a single source
of truth. Our apps can easily be deployed and rolled back
to and from a Kubernetes cluster. And even more importantly, when we
have a disaster like the one we just described, our cluster infrastructure can
also be independently deployed and quickly
reproduced. Another advantage of GitOps is that the
canonical desired state of the system is versioned in Git.
With the declaration of our system stored in a version control system
and serving as our canonical source of truth, we have a single source
from which everything is derived and driven. This trivializes
rollbacks: we can use git revert to go back to a previous
application state. And because of Git's excellent security guarantees,
we can also sign commits and enforce strong
security guarantees about the authorship and provenance of
our code. Also, approved changes can be automatically
applied to the system. So once we have a state declared
and kept in Git, the next step is to allow those changes
to be automatically applied to the system. What's significant
about this is that we don't need cluster credentials to make a change. With
GitOps, we can have specific agents that look at and
interpret the desired state and know what to do in that
particular system. And those are the
software agents that will ensure correctness and alert on divergence.
Once the state of our system is kept under version control,
those software agents can inform us whenever the described state
doesn't match reality.
Those agents can alert us on something like Slack and then
go the extra mile of actually reconciling the state.
So on a day-to-day basis, what can GitOps do for us? It
can increase productivity: continuous deployment, automation with
an integrated feedback control loop, and a faster time to deployment.
This means that our teams can ship a lot faster, shipping
more and more changes, and it increases our overall output
several times over. Also, we get an enhanced
developer experience: we push code, not containers.
Developers can use familiar tools like Git to manage updates
and features to Kubernetes clusters more rapidly and without
having to know the internals of Kubernetes,
so newly onboarded developers can get up to speed
and be productive within days instead of months. We also gain
improved stability: when we use Git workflows to manage our
clusters, we automatically gain a convenient way to
have audit logs of all the changes that were applied to a Kubernetes cluster.
An audit trail of who did what and when to
a cluster can be used to meet compliance requirements.
We also gain higher reliability. With Git's
capability to revert, roll back, or even fork,
we gain stable and reproducible rollbacks.
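To make that concrete, here is a minimal, self-contained sketch of such a rollback in a throwaway config repository (all names, emails, and file contents are illustrative); in a real setup the GitOps agent would then converge the cluster on the reverted state:

```shell
set -e
# Simulate a config repo with a good state and then a bad change.
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.email sre@example.com && git config user.name "SRE Demo"
echo "replicas: 2" > deployment.yaml
git add . && git commit -qm "good state"
echo "replicas: 0" > deployment.yaml
git commit -qam "bad change"
# Rolling back is just a Git operation:
git revert -n HEAD && git commit -qm "revert bad change"
cat deployment.yaml   # back to the good state: replicas: 2
```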
Because our entire system is versioned in Git, we have a
single source of truth from which to recover after a meltdown,
and that reduces our mean time to recover. We also gain
consistency and standardization. Because
GitOps provides one model for making changes to infrastructure,
apps, and Kubernetes add-ons, we have a consistent
end-to-end workflow for the entire organization.
Not only are our continuous integration and continuous deployment pipelines
all driven by pull requests, but our operations tasks are also
fully reproducible through Git. We also gain strong security
guarantees. Git's strong correctness and security guarantees,
backed by the strong cryptography used to track and manage changes,
as well as the ability to sign those changes to prove authorship and
origin, are key to a secure definition of the desired state
of the cluster. And of course, with all of this we gain easier
compliance and auditing. Since changes are tracked and logged in
a secure manner, compliance and auditing are made trivial.
We can use tools like kubediff, terradiff, or Ansible's diff mode to
alert us that something isn't actually in the state
that is described in Git, and then agents can go the extra step and actually
apply those changes. So now we are going to focus a
little bit on software agents for delivering applications to Kubernetes
clusters. What do those agents need to have to
actually work? They need to be declarative: they need to actually
find a Git repository and understand some syntax
that is there that describes state. They also need to be
automated, so whenever they see a change on a Kubernetes
cluster they can alert on the change, but we also need a way
to make them apply those changes automatically if we want to.
They also need to be auditable, so we need to have a way
to actually track what changes were applied to the Kubernetes cluster and then trace
that back to the changes that were in Git. For our particular use case,
we need them to be designed for Kubernetes, because that's where we
are going to focus. Also, they need to have out-of-the-box integrations:
we don't want to be reinventing the wheel every time we want to
do something. But of course we all have specific use cases
in our infrastructure, so those tools need to provide a way for us to actually
add our custom things particular to our infrastructure.
So one of the first GitOps continuous
deployment tools is Flux. Flux was
developed by the company that coined the term GitOps, which is Weaveworks. And Flux
is a tool for keeping Kubernetes clusters in sync with sources of
configuration, like Git for example, and it automatically updates
configurations when there is new code to deploy. So basically
Flux will be looking at one or more Git repositories.
It has a particular syntax that it knows how to interpret, and then it will
know how to interact with the Kubernetes API to make deployments.
Flux knows how to interact with both Kustomize
and Helm, so we can use our usual workflows to actually manage
those kinds of deployments. Another tool that
is gaining a lot of traction recently is Argo, and Argo CD is
a declarative GitOps continuous delivery tool for Kubernetes.
On a very high level it works similarly to the
way that Flux works. It knows how to look at one or
more Git repositories and then knows how to interact with the Kubernetes cluster
to actually make those changes in a Git repository a reality.
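For comparison, an Argo CD Application object expressing that idea might look roughly like this (a sketch; the repository URL, paths, and names are assumptions, not from the talk):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: webapp
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/webapp-config  # hypothetical config repo
    targetRevision: HEAD
    path: deploy
  destination:
    server: https://kubernetes.default.svc
    namespace: webapp
  syncPolicy:
    automated:
      prune: true   # let Argo CD remove resources that are deleted from Git
```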
It also has support for
both Helm and Kustomize, and it has a lot of
other features that Flux doesn't have; for example, it has a nice
UI and many more integrations that come out of the box.
And last but not least, we have Jenkins X.
Jenkins has had a bad rep for several
years, and Jenkins X is the next evolution
of Jenkins, which was built with the cloud
in mind and with a GitOps approach. It's essentially
pipeline automation, built-in GitOps,
and preview environments, to help teams collaborate and accelerate their
software delivery at any scale. So again, on a very high level,
it works very similarly to the other two tools that we've just seen.
It knows how to look at a Git repository, knows how to interpret those
changes, and then knows how to apply those changes to
a Kubernetes cluster. So on a very high level,
these are just three tools that we can use to quickly get up to speed
with a GitOps automation workflow to deliver applications to a Kubernetes
cluster. So how would a GitOps pipeline work?
Imagine we have a sample application: we make some
changes, we open a PR, someone reviews that code and merges the request.
Some pipeline will see that change, will produce a
container image, and will put it on a container registry.
Then some tool, which could be running inside the Kubernetes
cluster or outside, will automatically make a commit into a Git
repository with that particular change, saying that there is
a new version to be deployed. And then some agent that
runs inside the Kubernetes cluster will eventually pick up that change
and know: okay, so now I have a new application to deploy, I know what
to do, the version is X, so I'm going to deploy this application automatically.
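That automated commit is typically tiny: in practice it often just rewrites an image tag in a manifest fragment like the one below (an illustrative sketch; the app name and registry are hypothetical):

```yaml
# Fragment of a Deployment manifest in the config repo.
# The automation only rewrites the image tag; the in-cluster agent
# notices the new commit and rolls out the new version.
spec:
  template:
    spec:
      containers:
        - name: sample-app
          image: registry.example.com/sample-app:v1.2.3  # was v1.2.2
```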
So let's see some of this in practice with
a demo. We're going to be using a
Git repository that has a sample application, and we're going to do a quick
overview of what's inside and what tools
we'll be using. We're going to use Flux to deploy
an application called podinfo. It is an application that is
widely available on the Internet. That application
has Kustomize files to actually deploy
the application, and we're going to make use of that. Inside Flux
there's one component called the Kustomize controller
that is going to be used to actually look at these Kustomizations and
know what to do with them. So let's start by creating a cluster
that is going to be used to deploy our service.
So this just creates a Kubernetes cluster that's going to
run locally, and we'll use it as the deployment target.
So it's just going to take a few seconds. It should
be running now; we're just going to get the kubeconfig.
So if we do kubectl get nodes, we should
have a Kubernetes cluster running. Okay, so we're almost there.
In a few minutes we should have everything up and running.
Next we're going to export a few variables
just to have access to our cluster.
I have done that already. So we're going to start and see all of
this in action. This first command will
deploy the Flux agent to the cluster.
It will take those variables so that the Flux agent
can actually find the
code and know what to do. We're telling Flux to look at the master
branch, at a folder called staging cluster, and that's
it. So before we deploy that, let's see what exists
inside this staging cluster folder. Inside this folder we have
a few files. This folder here is just something that Flux uses to
store some state if it requires it. So let's start by seeing
what this web app source is. Inside the
web app source we are declaring something that is called a GitRepository,
and we are pointing to a Git repository on GitHub,
which is the application that we've just seen that we are going to deploy.
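A GitRepository object of that shape might look roughly like this (a sketch; the exact manifest isn't shown in the transcript, so the URL, names, and interval are assumptions):

```yaml
apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: GitRepository
metadata:
  name: webapp
  namespace: flux-system
spec:
  interval: 30s                          # how often Flux polls the repo
  url: https://github.com/example/webapp # hypothetical application repo
  ref:
    branch: master
```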
So we're just telling Flux that there is a Git repository
and that it is at this location. Next we have
something called web app common, so it makes sense that we look at this first.
Here we're declaring that we have a Kustomization so that Flux can
look at it, and we're saying that it will use that Git repository.
And inside that Git repository it should look at deploy
web app common. So let's go to that repository and see what we have under
deploy web app common. If
we come here, we start to see that we have normal
Kubernetes files. Here
we have the declaration of a namespace, here we have a declaration of a service
account, and so forth. If we go back and look at the other files,
we see that we have something here called back end. Inside
the back end we once again have a Kustomization.
We're again pointing at this repository,
but now we're looking at a different folder. One curious thing
to look at here is that we can specify a dependsOn:
we're saying that we just want to deploy this Kustomization once web
app common has actually finished.
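Such a Kustomization with a dependency might look roughly like this (a sketch with assumed names and paths; the real manifest isn't reproduced in the transcript):

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1beta1
kind: Kustomization
metadata:
  name: webapp-backend
  namespace: flux-system
spec:
  interval: 5m
  sourceRef:
    kind: GitRepository
    name: webapp           # the GitRepository declared earlier
  path: ./deploy/webapp/backend
  prune: true
  dependsOn:
    - name: webapp-common  # only reconcile after the common resources are ready
```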
So if we go here inside web app, we go into back end,
and here we see that we have regular
Kubernetes manifests. If we look here we have a deployment;
it's going to deploy a specific application. We have here the container
image that we want and a bunch of other configurations. And as expected,
we have here a web app front end, and that web app front end is
exactly the same thing: it's looking at the specific Git repository,
it is looking at the front end folder,
and in that front end folder we have a dependsOn, and this one
depends on the back end. Again, just very quickly, we
look at the front end and again it's the same thing: a deployment,
a horizontal pod autoscaler, and a service. So let's see all
of this in practice. Here we're going to bootstrap
Flux. Here we go. Flux is connecting to GitLab,
it has already cloned the repository, and it's installing components in the
flux namespace. Because we didn't say which components we
want, it's going to basically install all of them. It's going to install
the source controller, which is the component that actually pulls Git repositories
to see if there are changes, and then we have other components like the Helm
controller, the Kustomize controller, and the notification controller.
The only one that is going to be used is the Kustomize controller, because it's
the one that understands what is inside
that Git repository. So we're going to watch for Kustomizations,
we're going to watch the logs for that particular component,
and we're going to see applications just showing up. Here
we can see that at this point we don't have anything yet,
but Flux has already recognized that it needs to deploy something called web
app common, web app back end, and web app front end.
And as we see, it is already
done with the web app common. If we remember, the web app
common was creating a namespace and creating a
service account. Next it passes to the next component,
which is web app back end. We can see here that the reconciliation
is in progress, and here we can see the log that it is actually applying.
Eventually it finishes, the back end is actually deployed, and
then it will start, in a few seconds, the front end:
again the same thing, doing the same reconciliation. We can
see that it's already trying to apply the front end
application. It's almost up,
it should take just a few seconds to be up, and it is already
up. So if we do a port forward here
and we go to our browser, we can see that we
actually have our application. We see that it is working,
it has a metrics endpoint, and we can see
that it exports metrics for Prometheus. So just by pointing
Flux at a particular cluster, we actually deployed namespaces
inside a Kubernetes cluster and deployed
several applications with dependencies among them. So it's quite easy,
if we have a disaster recovery scenario, to just spin up a Kubernetes
cluster with the configurations that we want, pointing to a specific Git
repository, and it will do everything for us.
So now that we have our demo concluded, we have
a few extras here to talk about. A question
that usually arises is about secrets. People
usually say: okay, that's all well and good, I have everything in Git,
but now I actually need secrets. So if the
application is not fetching secrets directly from
somewhere, we need to somehow inject those secrets. There are several projects
that actually deal with this. One is Sealed Secrets and the other
one is SOPS. On a high level they work in a very similar
way: we encrypt those secrets and put them alongside our
code. They are encrypted, and then an agent that lives inside the cluster,
once it receives that encrypted
secret, knows how to decrypt it and knows what application it
actually needs to deliver that secret to.
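For instance, with Sealed Secrets the object committed to Git looks roughly like this (a sketch; the names are hypothetical and the ciphertext below is a truncated placeholder that would normally be produced by the kubeseal CLI):

```yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: db-credentials
  namespace: webapp
spec:
  encryptedData:
    # Opaque ciphertext (placeholder); only the in-cluster controller
    # holds the private key needed to decrypt it into a regular Secret.
    password: AgBy3i4OJSWK+PiTySYZZA9rO43cGDEq...
```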
Another option is the Vault Agent sidecar injector. It's a Vault
agent that can run alongside our applications and knows how to
fetch secrets from Vault and actually deliver them to
specific applications. So there are several options to actually deal with this.
Another question that usually arises is
GitOps versus infrastructure as code. One
of the main differences between infrastructure as code and GitOps
is the use of immutable containers as deployable
artifacts that can be converged on by a suitable orchestration
tool, for example Kubernetes, as we've seen in the example. Also, all
the desired state is kept under source control, and this isn't always
the case with some infrastructure as code tools. Infrastructure
as code implementations vary, and sometimes the
source of truth is split between a Git repository and a database,
and sometimes spread between a weakly linked union of
multiple Git repositories. And while infrastructure as code is
usually used to manage only
infrastructure, and doesn't manage the whole cloud native stack,
GitOps can. Here we've just seen the use case of deploying
an application, but we can go a step further and use
the GitOps principles to actually do this
for the whole stack.
And another question that usually arises
is push versus pull. We usually push changes, for example
to a Kubernetes cluster: we say, deploy this application. But with
GitOps we are actually using a pull-based
approach: we make a change
to a Git repository, and then an agent pulls
that change down and applies it. That has some advantages in terms
of security, because now we don't have to actually open the cluster
to an outside tool or an outside person to do a deployment,
and that comes with real benefits in terms of security.
And that's all for my part. I hope this talk was informative
for all of you. It was a gentle introduction to what GitOps
is, and it also showed
a few tools, and showed one of them actually doing a deployment in practice.
You can find me at those addresses, and don't
hesitate to ping me if you want to discuss this topic further
and get into more detail about how these tools
work. So thank you very much, and thank you for attending
my talk.