Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello and welcome to my session
for this year's Conf42 DevOps.
Just to be fully transparent, I have a dog here and I'm home alone,
so there might be noise coming. She's currently lying about and
being a good girl, but we'll see what happens.
There might be some noise and stuff like that, so we'll see how we'll deal
with that. Unfortunately, that's one of the things that comes with working from home and not
having a proper, actual studio. So anyway,
this talk is called How to Achieve Actual GitOps
with Terraform and Kubernetes, and it's a little bit of a cheeky title. I'm going
to get back to why it's written in this
fashion. But just to clarify, in case you
didn't know, that's me. My name is Robert.
I work as a principal cloud engineer at Amesto Fortytwo.
I've been a Microsoft MVP and HashiCorp Ambassador
for the last couple of years, and hopefully
I'll continue to be that
this year as well. We'll see. I'm very active in the Cloud
Native Computing Foundation, where I'm one of the co-chairs of the platforms working group,
where we're creating a lot of resources and helping
establish the platform engineering parts and get that
into the CNCF space, so to speak.
And I'm also one of the maintainers of the OpenGitOps project, which is
a project that came out of the GitOps working group under the CNCF,
where we kind of define what GitOps is. And that's one of the reasons
why I'm giving this talk. I'm one of
the founders of, well, a lot of things, but Cloud Native
Norway is one of the newest ones. It's so new that it's actually not
fully founded yet, but we're going to do
live events and things like that. The whole
aim here is to run Kubernetes Community
Days here in Norway. I'm also in the Azure Cloud Native
User Group, where we're doing a lot of meetups, so feel free to join. If you
look on Meetup, you'll probably find us there. We're doing everything online; I
know it says a location on the meetup, because that's just how Meetup.com
works, but it's fully online.
I'm also the founder of the Norwegian PowerShell User Group, which has
been a little bit neglected lately, but I'm going to get back to that too.
I always try to put something interesting here, like talking
about myself. I just don't know how to do it properly. But my
interest is in music. I'm a former musician.
Well, I still play a little bit, but I'm a former actual musician,
and I'm into the metal and progressive
music space, so a wide area of interest there.
Also gaming: retro gaming, rhythm games
(because I like music), shoot 'em ups, and stuff
like that. And I'm a film buff. That means I don't actually watch films
that are new, but I watch a lot of old films. So if any of
those things interest you as well, feel free to reach out and talk about that
as well. But anyway, our agenda for today is,
first of all, a quick introduction to GitOps and
actually a quick introduction to Terraform as well, just to make sure that everyone's on
the same playing field. Then we'll talk about how we can combine
Terraform and GitOps, and then do a little demo of it,
because obviously we want to see how it works. And hopefully
the demo gods this time are on my side and
we won't have any issues.
So I usually start with a statement like this:
Terraform stored in Git and automated with pipelines is not
GitOps. The reason why I say that is that when
GitOps became a term, it was actually a description of
an actual operational model. It has meaning
behind it. A lot of people think that if you put stuff
in Git and you run your pipelines, you can call that GitOps.
You can call it whatever you want, but in all
honesty, it's just CI/CD at
that point. It's just normal automation.
The idea behind GitOps is to take that further.
It's about the continuous deployment part of this.
When you're done with your CI/CD and you have something
that you want to deploy, do you need to
actually sit there and define imperative
steps that say, I want to deploy this to this
location, and so on and so forth? Probably not. You just want it to be
deployed, right? So that is what GitOps is. The term GitOps
was coined by Weaveworks
back in 2017. And the idea
is that this is actually what we wanted to do all along,
but we haven't had the proper technology to do it. In
the GitOps working group, or the OpenGitOps project,
this was one of the things that we started off with first:
creating some principles, so that if a system follows these
principles, it is actually GitOps. I know my head is blocking the slide,
and I've got a thick head, but I'll try to get out of the way when
I get to point four. But we have four principles,
and not to just read them off: the first one is about being declarative.
The thing is that we want our system state to be
defined in a declarative fashion, meaning we don't want to say, this
is what you should do to get this to run. We just want to say,
this is how we want our system to look. And based
on that, it should just be
deployed by itself. The second is that we want it stored in some place
that is versioned and immutable, which is why Git works
well for this. But that doesn't mean that Git is the only
place you can store stuff and still be doing GitOps.
The point is that you have versioned and immutable as
a concept, where you define your desired
state once, and instead of
tweaking the state in place, if something is going to
change, you want an entire new version to
supersede the previous one. This means
that you have a complete version history, so that if something
fails in your deployment, you can always roll back to the previous
version and it will work, right? So this
works well with Git in our case, but that
doesn't mean it couldn't be an S3 bucket or an Azure
storage account with blob storage; this actually
works with those as well. The third principle is that we want the state to be pulled automatically,
so we don't want to push, we don't want to say, this is now
ready to be deployed. We want software agents to make
sure that the desired state is always up to date. And then we have
continuous reconciliation, which is the fourth principle: we're pulling
in the state as is, and then we
want the software agents to
make it so that the observed state matches the desired
state in the actual system itself.
This is best done in the
world of Kubernetes because of these principles. That doesn't mean that
that has to be the case, but obviously this came
from a Kubernetes perspective,
and the tools were just there to do this
in a proper fashion.
Speaking of tools, there are two that are frequently
referred to: the Argo CD and Flux projects.
Both of these are in the CNCF, and both of them are
graduated projects. So they've gone through the entire loop; they're officially
graduated from the CNCF,
which has been a long journey. Argo CD works
slightly differently than Flux. We're going to look at Flux this
time, because the tools that we're using in this demonstration
are more Flux-centric. But that
doesn't mean that you can't do Terraform with Argo CD.
It just means that the tools I'm showing off here are probably not going
to work there. Speaking of
Flux, this is a high-level drawing that I just stole
from the Flux CD website. This just shows all
the different components of Flux. And as you can
see here, we have some controllers, and these are the controllers
that are actually making everything work. You have your source
controller: this is the controller that actually looks
at the state in your version
control, pulls in
the newest data, and writes that directly into the Kubernetes API
through the source and Kustomization custom resource
definitions. Based on that, you have a kustomize controller,
which takes basically just your manifests,
right? It can either be plain manifests,
or it can be using kustomizations,
which is just a way of
templating your Kubernetes manifests.
There's more power there, but at the same time it gets more confusing
if you're totally new to it, so we're probably going to look at both in
this talk. There's also a Helm controller. So when you're doing Helm
deployments, it can actually keep
control over your Helm releases and everything like that. You don't
need to use that, but it's a different process than kustomize, because kustomize
is basically just built into kubectl,
your CLI tooling, and Helm is a different tool. So that's why those two
are there. There's also a notification controller and things
like that, but that's something we're not going
to look at. These are basically what we're going to be using.
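As a rough sketch of how those controllers fit together, a minimal source-plus-kustomization pair might look like this. The repository URL, names, and paths here are placeholder assumptions, not values from the demo:

```yaml
# Hypothetical example: the source controller watches this GitRepository,
# and the kustomize controller applies the manifests it points at.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: my-repo
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/example/my-repo
  ref:
    branch: main
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: my-app
  namespace: flux-system
spec:
  interval: 10m
  path: ./manifests      # directory of manifests inside the repo
  prune: true            # delete cluster objects removed from Git
  sourceRef:
    kind: GitRepository
    name: my-repo
```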
So Terraform, in case you don't know what that is: it is
seen as an infrastructure-as-code tool.
However, it actually works with basically everything that has an API.
It is declarative. So again, you
don't say, I want to do step one through step
five. You define your resources in a declarative fashion, and
then Terraform works with the API to make it so.
It's also modular. Everything's a module. When you're
writing Terraform, you have files in a folder and you run Terraform on that.
That's what we call a root module. And then you have child
or resource modules, depending on what you want to call them, that let you
reuse code across different root modules.
These root modules and resource modules
take inputs and produce outputs. So you can pass
in information that you need to use, for instance a deployment
name, a location, and so on
and so forth. And then you can output values which can be used by
other modules. That's how modules talk to each other.
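To make that input/output flow concrete, here's a minimal sketch. The module path and variable names are illustrative assumptions, not taken from the demo repo:

```hcl
# modules/resource_group/main.tf — hypothetical child module

variable "deployment_name" {
  type = string
}

variable "location" {
  type    = string
  default = "norwayeast"
}

resource "azurerm_resource_group" "this" {
  name     = "rg-${var.deployment_name}"
  location = var.location
}

# Output that other modules (or the TF controller) can consume.
output "resource_group_name" {
  value = azurerm_resource_group.this.name
}
```

A root module could then call it with `module "rg" { source = "./modules/resource_group" deployment_name = "demo" }` and reference `module.rg.resource_group_name` elsewhere.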
Terraform is also stateful. It has what's called a Terraform
state file. That's something you need to take care of,
which is part of why the TF
controller from Weaveworks works so well for this. The TF controller from
Weaveworks is a project
building on top of Flux, which is, again, why this
wouldn't work with Argo CD in an easy fashion.
But it works with Flux. The source
controller is used as the way of
getting the information that the TF controller acts on.
And it adds a custom resource for
Terraform, so you can define your Terraform as a
resource in the Kubernetes API. It also does a
lot of lifecycle management. For instance, if you don't
specify that your Terraform state is supposed to be somewhere
else, it will actually keep it in the Kubernetes cluster as a secret,
and thus it can manage state in
that sense. Obviously, in production you probably want your
state to be somewhere else. But for testing and
demos like I'm doing now, this is perfect.
It also has, for instance, a dependency
attribute. We're going to look specifically at
that: how we can build up deployments that run
in succession and reference each other.
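As a sketch of what such a Terraform custom resource looks like — the API version and field names follow the tf-controller documentation as I recall it, so treat the exact values as assumptions:

```yaml
apiVersion: infra.contrib.fluxcd.io/v1alpha2
kind: Terraform
metadata:
  name: greet-folks
  namespace: flux-system
spec:
  interval: 1m
  approvePlan: auto           # plans are applied automatically
  path: ./terraform/greetings # root module inside the source
  sourceRef:
    kind: GitRepository       # reuses the Flux source controller
    name: flux-system
  writeOutputsToSecret:
    name: greeting-output     # Terraform outputs land in this secret
```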
All right, so that was about it. I'm going to jump
right into Visual Studio Code. Hopefully everything
works fine there. Let's see if I can find it. Here we go.
So, just an explanation of what you're seeing here.
I have Visual
Studio Code up here with a terminal. The terminal is running
k9s, which is
basically a visual CLI tool for Kubernetes. So what you're looking at
here is a relatively blank Kubernetes
cluster running with kind on my local machine.
Flux and the other GitOps tools
work with any Kubernetes cluster, so that doesn't matter. This
is just for demonstration purposes, so we can visually look at
what's happening. This Git repository
is on my account on GitHub.
It's called gitops terraform, or terraform gitops... no,
gitops terraform. So everything that I'm doing here
is available there. It's a little bit of a mess, because I've been using this
back and forth in different presentations, and hopefully at some
point I'll get to clean it up a little bit.
But if you can follow along here,
then you can go in there and get the code if you want to.
So I have some Terraform files.
I have one called greetings, which is a really simple,
hello-world type thing, where I take an input
called greeting and an input called subject.
As you can see here, it defaults to "Hello, GitOps Days",
and it formats and outputs the
message. This
was actually written while doing a demonstration of this for
GitOps Days. We're going to override those with
actual inputs, but this is just a simple test to
make sure that everything works. And what I'm going to do
in the background here is bootstrap Flux and
point it towards this particular folder,
where I have a flux-system
directory. This usually gets overwritten,
but there's a kustomization here that
takes in what are called the GitOps
Toolkit components, which is the original name of Flux
version two. But I'm also adding
a TF controller YAML file, and if you look at
that one, it's setting up a Helm repository
and a Helm release for the TF controller.
So what this will do is not only bootstrap Flux,
but also install the Terraform controller.
And we have a greeting YAML here.
Sorry, I just need to make sure that I get this in the right order
so we're all paying attention. I
have a Kustomization, but this one depends on the flux-system one,
because we want our Terraform controller to get in first,
since that has the CRD. If
we didn't do that, it would start complaining about
the CRD missing. Let's see,
let me clean up so you don't see all the mess that I've been doing.
So the idea here is that we'll bootstrap
this cluster with Flux, and the TF
controller will also be part of the deployment, along with the greetings one.
So we're running the flux bootstrap command
for GitHub, pointing it towards the repository, saying that I'm the owner,
and we have to put in a little flag that
says this is a personal one, because this is not owned by an organization;
this is an actual user's Git repository.
And then, like I said, pointing it at the path.
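The command I'm running looks roughly like this; the owner, repository name, and path shown here are placeholders rather than the exact values from the demo:

```shell
# --personal marks this as a user-owned repo rather than an organization's.
flux bootstrap github \
  --owner=<github-username> \
  --repository=gitops-terraform \
  --personal \
  --path=<path-to-cluster-config>
```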
So if we do that, it will do a lot of things, and it
says it will install the components. What I want to do is just jump in
here and see these controllers starting to run.
We then look at the kustomizations.
Let's not break k9s.
If you look at the kustomizations, we can see that the
flux-system kustomization has already been pulled in,
and it says applied revision, and then it references a Git
hash, or commit hash. That
means that we should now have a Terraform resource,
which we do, called greet-folks.
That is this one.
And as we can see here, we have it set to approve plan
auto. You can have this set to manual approval.
However, the way that I see it,
if you don't have it set to automatically approve,
then we're kind of not doing GitOps. Again,
that is a topic to be discussed, but
I feel that the entire idea behind GitOps is that
deployment should be continuous and it
should just work, right? Anyway,
that's beside the point. So here we're putting in variables:
changing the greeting to "Salutations" and the subject to "folks".
And then we're writing the output to a secret. So if we
look now, we have a secret. Well, we have several
secrets; I created some in the background without you noticing. I'll get back to
that. But we have one called greeting-output, we have one called
tfplan-default-greet-folks, and we have
tfstate-default-greet-folks.
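If you'd rather not use k9s, you can inspect the same secrets with kubectl. This is a hedged sketch: the namespace and the output key name (`message`) are guesses based on the demo, not confirmed values:

```shell
# List the secrets the TF controller created, then decode the output value.
kubectl get secrets --all-namespaces | grep greet
kubectl get secret greeting-output \
  -o jsonpath='{.data.message}' | base64 -d
```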
So when you're doing a terraform plan, what you usually
do in a pipeline is this: if
you're going to have an approval
step that says, this
is what's going to happen, do you want to approve it?, then if you don't put
the plan into a plan file, you could potentially get a different result
the next time you run it with terraform apply.
So you usually put the plan into a file. In this case, the TF
controller puts that in as a
secret. So if you look into the TF plan secret,
you have the TF plan data. You also have the
state. Like I said, if you don't do anything, it will actually keep
the state file itself in the
Kubernetes cluster as a secret,
which you could then export and put somewhere else if
you want to take a backup of it, and so on and so forth. But more
importantly, we have this greeting-output secret. I'm going to press x to
decode it. Sorry.
And now it says "Salutations, folks". Right,
so this
is basically the workflow. You can have
these run automatically, you can
put the outputs into a secret, and then you can reference that
back and forth. Now, this is not a great example.
This is just literally running
the most basic Terraform ever: taking inputs
and then exporting them as an output.
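For reference, the whole greetings module amounts to something like this sketch; the variable defaults are reconstructed from what's shown on screen, so treat them as assumptions:

```hcl
variable "greeting" {
  type    = string
  default = "Hello"
}

variable "subject" {
  type    = string
  default = "GitOps Days"
}

# Format the two inputs into a single message output.
output "message" {
  value = format("%s, %s!", var.greeting, var.subject)
}
```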
What we want to do now, to show off a few more capabilities, is put in a new
kustomization which has an example of
using dependencies. And if we
look at the examples, we'll see we've got a bunch of
files here. First of all, look at the Terraform ones. We have
a root module for shared resources.
This is where we're going to create some Azure resource groups
to have stuff in. We're going to put in a virtual network,
we're going to add some security groups for that virtual network, and then
we are going to export from that. We're exporting the subnet ID and
the resource group name, because
with these shared resources, we want several virtual
machines, in this example, to be deployed into
the same resource group, in the same subnet, et cetera. So we're creating
these resources, and then we have a workload, which is basically just
a VM.
We create a network interface that is using that subnet ID,
we have a virtual machine that just gets deployed, and
we have an admin password that we're
going to be using. So in the background here,
I put in two things. I put in a secret for the workload IDs,
because we want these secrets to come in from somewhere else.
We don't want to store them in Git. You could do that
if you're using SOPS or something like that to encrypt
your secrets, but for demonstrations
I'm just putting the secrets into the cluster manually while you're not
looking. And then we're going to run the
shared-resources one. We're going to disable the backend
config, so we're not going to store state in the Kubernetes cluster.
I have put in a secret for, well, it says here terraform enterprise
CLI config; it's basically just a token for Terraform
Cloud. So I have a token for Terraform Cloud, which means that
I can use that as my backend.
And we're putting in the variables, deployment name and location,
and we're going to write our outputs to a secret.
And this secret is then going to be used by the workload
one, which is doing exactly the same thing as the other one,
but it has a dependsOn. So it
will actually wait until the shared-resources
one is deployed in a proper fashion.
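Put together, the workload resource might look roughly like this. The field names follow the tf-controller API as I recall it, and the resource and secret names are taken loosely from the demo, so treat the specifics as assumptions:

```yaml
apiVersion: infra.contrib.fluxcd.io/v1alpha2
kind: Terraform
metadata:
  name: workload-one
  namespace: flux-system
spec:
  interval: 1m
  approvePlan: auto
  path: ./examples/workload
  sourceRef:
    kind: GitRepository
    name: flux-system
  dependsOn:
    - name: shared-resources       # wait until that Terraform resource is ready
  varsFrom:
    - kind: Secret
      name: shared-resources-output  # subnet ID and resource group name
```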
So if we look at the kustomizations
here, and if we go into our
kustomization file and uncomment it.
It's kind of fun using conventional commits to stylize,
in a proper fashion, something that's basically not
really that important. But I'm
just doing this because... I'm sorry,
hold on to your hats, I need
to fix that. I'm on a new computer, sorry. And now
I'm going to sync the changes up. As soon as I do,
well, after a while, the
source controller will see that there's a change here,
and it will actually pull it down and apply it. Which means that our kustomization,
this one, will then pop in. If that takes
a little bit of time, never mind. It can take a little bit of time,
because it runs on an interval, but
you can force a reconcile.
But it's already done that. So if
we now look at the Terraform resources, we can see that we have two
more: we have shared-resources and workload.
And the workload one specifically says that dependency flux-system/shared-resources
is not ready, so
it can't do anything. And if you see, the other one is now
saying that it's doing a plan. Every
time you put in one of these, it actually spins up a pod that is a runner,
and it runs the code for you. Let me just check my time
here. And what we can do is
we can open this, hopefully. There you
go, and see the plan run in Terraform Cloud. But
in itself, just like if you did this through a CLI,
you will also get this output in
your CLI tool. So as you can see here,
it started, it did a check. These resources already exist,
because I didn't want to take any chances with the demo gods.
But it runs through and makes sure that everything's good, and
when that's done, it destroys that
pod, because it's done. However, now if
you look at the Terraform resource,
it wrote the outputs to a secret,
and the ready state is now set to true.
So as soon as the reconciliation process for workload
one triggers, which I think is set to one minute,
I hope... yes. So this should happen pretty quickly. It will
look and see: oh, my dependent Terraform
resource is ready, so everything's good, so I
can start doing my stuff. And
like I said, hopefully that will happen pretty soon. There we go. So this
now does the same thing. Now it spins
up a new workload runner, or a Terraform runner,
sorry, for the workload one resource.
In this case, like I mentioned, I have deployment
name and location, but it also takes in the
workload secrets, which is more or less... I'm not
even sure what that is. It's the admin password. Let me double-check.
Yeah, it's the admin password and deployment name. So I'm overriding the
deployment name, because why not?
And then it also pulls in, as variables, the
subnet ID and the resource group name
from the shared-resources one, and it's using those to do the deployment.
And I think this one got deleted, so it might take a little while.
All right. I don't know, it's just refreshing state; no changes. Cool.
So as you can tell, this is how we can
build a dependency chain
here. You have a certain root module that is dependent on
different resources, so you can build up a slightly more complex
scenario in that sense.
But one of the main reasons why I really like this
way of doing Terraform is that I'm
building platforms on Kubernetes, and
while I can program in Go, and I
can make all of these slightly more complicated things work
and create APIs and so on and so forth, for many people
that I work with who are doing infrastructure as code,
it's a big learning curve to start learning to program. If you're a more
traditional infrastructure or cloud operations person,
doing Terraform comes relatively easily;
it's not really that hard. So by doing
it in this fashion, like I usually
say to people: if you can put your things into
a Terraform file and make sure that you can repeat it
over and over again, so you have the proper inputs, and,
if there's a dependency down the line, you do
your proper outputs — if you just do that, we can put it into
this setup, and then we can create basically a platform for
automatic deployment of infrastructure and more, using Terraform,
which is kind
of a standard in our industry, as one of the base tools. So while
the Terraform part of this is just plain, straight Terraform,
we as platform engineers can facilitate
this automation in a more proper fashion.
There are plenty of things that you can do with this. I highly
suggest that you look into it.
If you are doing Terraform,
give this a try. It's relatively easy to get running;
I basically did the entire thing from scratch here. And if there is anything
that you're thinking about or want
to discuss, feel free to just
yell at me, preferably on LinkedIn or through email, and
I'll make sure to help as much as possible.
Thanks for having me here for Conf42 DevOps 2023.
I hope this was valuable, and I
hope to be back someday. Thank you.