Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello and welcome to my session
for this year's Conf42 DevOps.
Just to be fully transparent, I have a dog here and I'm home alone,
so there might be noise coming. She's currently lying about and
being a good girl, but we'll see what happens.
There might be some noise and stuff like that, so we'll see how we'll deal
with that. Unfortunately, that's one of the things that comes with working from home and not
having a proper, actual studio. So anyway,
this talk is called How to Achieve Actual GitOps
with Terraform and Kubernetes, and it's a little bit of a cheeky title. I'm going
to get back to why it's written in this
fashion. But just to clarify, in case you
didn't know, that's me. My name is Robert.
I work as a principal cloud engineer at Amesto Fortytwo.
I've been a Microsoft MVP and HashiCorp Ambassador
for the last couple of years, and hopefully
I'll continue to be that
this year as well. We'll see. I'm very active in the Cloud
Native Computing Foundation, where I'm one of the co-chairs of the platforms working group,
where we're creating a lot of resources and helping
establish the platform engineering parts and get that
into the CNCF space, so to speak.
And I'm also one of the maintainers of the OpenGitOps project, which is
a project that came out of the GitOps working group under the CNCF,
where we kind of define what GitOps is. And that's one of the reasons
why I'm giving this talk. I'm one of
the founders of, well, a lot of things, but Cloud Native
Norway is one of the newest ones. It's so new that it's actually not
fully founded yet, but we're going to do
live events and things like that. The whole
aim here is to run Kubernetes Community
Days here in Norway. I'm also in the Azure Cloud Native
User Group, where we're doing a lot of meetups, so feel free to join. If you
look on Meetup, you'll probably find us there. We're doing everything online; I
know it says a location on the meetup, because that's just how Meetup.com
works, but it's fully online.
I'm also the founder of the Norwegian PowerShell User Group, which has
been a little bit neglected lately, but I'm going to get back to that too.
I always try to put something interesting here, like talking
about myself. I just don't know how to do it properly. But my
interest is in music. I'm a former musician.
Well, I still play a little bit, but I'm a former actual musician,
and I'm into the metal and progressive
music space, so a wide area of interest there.
Also gaming: retro gaming, rhythm games
(because I like music), shoot 'em ups, and stuff
like that. And I'm a film buff. That means I don't actually watch films
that are new, but I watch a lot of old films. So if any of
those things interest you as well, feel free to reach out and talk about that
as well. But anyway, our agenda for today is,
first of all, a quick introduction to GitOps and
actually a quick introduction to Terraform as well, just to make sure that everyone's on
the same playing field. Then we'll talk about how we can combine
Terraform and GitOps, and then do a little demo of it,
because obviously we want to see how it works. And hopefully
the demo gods this time are on my side and
we won't have any issues.
So I usually start with a statement like this:
Terraform stored in Git and automated with pipelines is not
GitOps. The reason why I say that is that when
GitOps became a term, it was actually a description of
an actual operational model. It has meaning
behind it. A lot of people think that if you put stuff
in Git and you run your pipelines, you can call that GitOps.
You can call it whatever you want, but in all
honesty, it's just CI/CD at
that point. It's just normal automation.
The idea behind GitOps is to take that further.
It's about the continuous deployment part of this.
When you're done with your CI/CD and you have something
that you want to deploy, do you need to
actually sit there and define imperative
steps that say, I want to deploy this to this
location, and so on and so forth? Probably not. You just want it to be
deployed, right? So that is what GitOps is. The term GitOps
was coined by Weaveworks
back in 2017. And the idea
is that this is actually what we wanted to do all along,
but we haven't had the proper technology to do it. In
the GitOps working group, or the OpenGitOps project,
this was one of the things that we started off with first:
creating some principles, so that if a system follows these
principles, it is actually GitOps. I know my head is blocking the slide,
and I've got a thick head, but I'll try to get out of the way when
I get to point four. But we have four principles,
and not to just read them off: the first one is about being declarative.
The thing is that we want our system state to be
defined in a declarative fashion, meaning we don't want to say, this
is what you should do to get this to run. We just want to say,
this is how we want our system to look. And based
on that, it should just be
deployed by itself. The second is that we want it stored in some place
that is versioned and immutable, which is why Git works
well for this. But that doesn't mean that Git is the only
place you can store stuff and still be doing GitOps.
The point is that you have versioned and immutable as
a concept, where you define your desired
state once, and instead of
tweaking the state in place, if something is going to
change, you want an entire new version to
supersede the previous one. This means
that you have a complete version history, so that if something
fails in your deployment, you can always roll back to the previous
version and it will work, right? So this
works well with Git in our case, but that
doesn't mean it couldn't be an S3 bucket or an Azure
storage account with blob storage; this actually
works with those as well. The third principle is that we want the state to be pulled automatically,
so we don't want to push, we don't want to say, this is now
ready to be deployed. We want software agents to make
sure that the desired state is always up to date. And then we have
continuous reconciliation, which is the fourth principle: we're pulling
in the state as is, and then we
want the software agents to
make it so that the observed state matches the desired
state in the actual system itself.
This is best done in the
world of Kubernetes because of these principles. That doesn't mean that
that has to be the case, but obviously this came
from a Kubernetes perspective,
and the tools were just there to do this
in a proper fashion.
Speaking of tools, there are two that are frequently
referred to: the Argo CD and Flux projects.
Both of these are in the CNCF, and both of them are
graduated projects. So they've gone through the entire loop; they're officially
graduated from the CNCF,
which has been a long journey. Argo CD works
slightly differently than Flux. We're going to look at Flux this
time, because the tools that we're using in this demonstration
are more Flux-centric. But that
doesn't mean that you can't do Terraform with Argo CD.
It just means that the tools I'm showing off here are probably not going
to work there. Speaking of
Flux, this is a high-level drawing that I just stole
from the Flux CD website. This just shows all
the different components of Flux. And as you can
see here, we have some controllers, and these are the controllers
that are actually making everything work. You have your source
controller: this is the controller that actually looks
at the state in your version
control, pulls in
the newest data, and writes that directly into the Kubernetes API
through the source and Kustomization custom resource
definitions. Based on that, you have a kustomize controller,
which takes basically just your manifests,
right? It can either be plain manifests,
or it can be using kustomizations,
which is just a way of
templating your Kubernetes manifests.
There's more power there, but at the same time it gets more confusing
if you're totally new to it, so we're probably going to look at both in
this talk. There's also a Helm controller. So when you're doing Helm
deployments, it can actually keep
control over your Helm releases and everything like that. You don't
need to use that, but it's a different process than kustomize, because kustomize
is basically just built into kubectl,
your CLI tooling, and Helm is a different tool. So that's why those two
are there. There's also a notification controller and things
like that, but that's something we're not going
to look at. These are basically what we're going to be using.
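As a rough sketch of how those controllers fit together, a minimal source-plus-kustomization pair might look like this. The repository URL, names, and paths here are placeholder assumptions, not values from the demo:

```yaml
# Hypothetical example: the source controller watches this GitRepository,
# and the kustomize controller applies the manifests it points at.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: my-repo
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/example/my-repo
  ref:
    branch: main
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: my-app
  namespace: flux-system
spec:
  interval: 10m
  path: ./manifests      # directory of manifests inside the repo
  prune: true            # delete cluster objects removed from Git
  sourceRef:
    kind: GitRepository
    name: my-repo
```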
So Terraform, in case you don't know what that is: it is
seen as an infrastructure-as-code tool.
However, it actually works with basically everything that has an API.
It is declarative. So again, you
don't say, I want to do step one through step
five. You define your resources in a declarative fashion, and
then Terraform works with the API to make it so.
It's also modular. Everything's a module. When you're
writing Terraform, you have files in a folder and you run Terraform on that.
That's what we call a root module. And then you have child
or resource modules, depending on what you want to call them, that let you
reuse code across different root modules.
These root modules and resource modules
take inputs and produce outputs. So you can pass
in information that you need to use, for instance a deployment
name, a location, and so on
and so forth. And then you can output values which can be used by
other modules. That's how modules talk to each other.
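To make that input/output flow concrete, here's a minimal sketch. The module path and variable names are illustrative assumptions, not taken from the demo repo:

```hcl
# modules/resource_group/main.tf — hypothetical child module

variable "deployment_name" {
  type = string
}

variable "location" {
  type    = string
  default = "norwayeast"
}

resource "azurerm_resource_group" "this" {
  name     = "rg-${var.deployment_name}"
  location = var.location
}

# Output that other modules (or the TF controller) can consume.
output "resource_group_name" {
  value = azurerm_resource_group.this.name
}
```

A root module could then call it with `module "rg" { source = "./modules/resource_group" deployment_name = "demo" }` and reference `module.rg.resource_group_name` elsewhere.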
Terraform is also stateful. It has what's called a Terraform
state file. That's something you need to take care of,
which is part of why the TF
controller from Weaveworks works so well for this. The TF controller from
Weaveworks is a project
building on top of Flux, which is, again, why this
wouldn't work with Argo CD in an easy fashion.
But it works with Flux. The source
controller is used as the way of
getting the information that the TF controller acts on.
And it adds a custom resource for
Terraform, so you can define your Terraform as a
resource in the Kubernetes API. It also does a
lot of lifecycle management. For instance, if you don't
specify that your Terraform state is supposed to be somewhere
else, it will actually keep it in the Kubernetes cluster as a secret,
and thus it can manage state in
that sense. Obviously, in production you probably want your
state to be somewhere else. But for testing and
demos like I'm doing now, this is perfect.
It also has, for instance, a dependency
attribute. We're going to look specifically at
that: how we can build up deployments that run
in succession and reference each other.
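As a sketch of what such a Terraform custom resource looks like — the API version and field names follow the tf-controller documentation as I recall it, so treat the exact values as assumptions:

```yaml
apiVersion: infra.contrib.fluxcd.io/v1alpha2
kind: Terraform
metadata:
  name: greet-folks
  namespace: flux-system
spec:
  interval: 1m
  approvePlan: auto           # plans are applied automatically
  path: ./terraform/greetings # root module inside the source
  sourceRef:
    kind: GitRepository       # reuses the Flux source controller
    name: flux-system
  writeOutputsToSecret:
    name: greeting-output     # Terraform outputs land in this secret
```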
All right, so that was about it. I'm going to jump
right into Visual Studio Code. Hopefully everything
works fine there. Let's see if I can find it. Here we go.
So, just an explanation of what you're seeing here.
I have Visual
Studio Code up here with a terminal. The terminal is running
k9s, which is
basically a visual CLI tool for Kubernetes. So what you're looking at
here is a relatively blank Kubernetes
cluster running with kind on my local machine.
Flux and the other GitOps tools
work with any Kubernetes cluster, so that doesn't matter. This
is just for demonstration purposes, so we can visually look at
what's happening. This Git repository
is on my account on GitHub.
It's called gitops terraform, or terraform gitops... no,
gitops terraform. So everything that I'm doing here
is available there. It's a little bit of a mess, because I've been using this
back and forth in different presentations, and hopefully at some
point I'll get to clean it up a little bit.
But if you can follow along here,
then you can go in there and get the code if you want to.
So I have some Terraform files.
I have one called greetings, which is a really simple,
hello-world type thing, where I take an input
called greeting and an input called subject.
As you can see here, it defaults to "Hello, GitOps Days",
and it formats and outputs the
message. This
was actually written while doing a demonstration of this for
GitOps Days. We're going to override those with
actual inputs, but this is just a simple test to
make sure that everything works. And what I'm going to do
in the background here is bootstrap Flux and
point it towards this particular folder,
where I have a flux-system
directory. This usually gets overwritten,
but there's a kustomization here that
takes in what are called the GitOps
Toolkit components, which is the original name of Flux
version two. But I'm also adding
a TF controller YAML file, and if you look at
that one, it's setting up a Helm repository
and a Helm release for the TF controller.
So what this will do is not only bootstrap Flux,
but also install the Terraform controller.
And we have a greeting YAML here.
Sorry, I just need to make sure that I get this in the right order
so we're all paying attention. I
have a Kustomization, but this one depends on the flux-system one,
because we want our Terraform controller to get in first,
since that has the CRD. If
we didn't do that, it would start complaining about
the CRD missing. Let's see,
let me clean up so you don't see all the mess that I've been doing.
So the idea here is that we'll bootstrap
this cluster with Flux, and the TF
controller will also be part of the deployment, along with the greetings one.
So we're running the flux bootstrap command
for GitHub, pointing it towards the repository, saying that I'm the owner,
and we have to put in a little flag that
says this is a personal one, because this is not owned by an organization;
this is an actual user's Git repository.
And then, like I said, pointing it at the path.
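The command I'm running looks roughly like this; the owner, repository name, and path shown here are placeholders rather than the exact values from the demo:

```shell
# --personal marks this as a user-owned repo rather than an organization's.
flux bootstrap github \
  --owner=<github-username> \
  --repository=gitops-terraform \
  --personal \
  --path=<path-to-cluster-config>
```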
So if we do that, it will do a lot of things, and it
says it will install the components. What I want to do is just jump in
here and see these controllers starting to run.
We then look at the kustomizations.
Let's not break k9s.
If you look at the kustomizations, we can see that the
flux-system kustomization has already been pulled in,
and it says applied revision, and then it references a Git
hash, or commit hash. That
means that we should now have a Terraform resource,
which we do, called greet-folks.
That is this one.
And as we can see here, we have it set to approve plan
auto. You can have this set to manual approval.
However, the way that I see it,
if you don't have it set to automatically approve,
then we're kind of not doing GitOps. Again,
that is a topic to be discussed, but
I feel that the entire idea behind GitOps is that
deployment should be continuous and it
should just work, right? Anyway,
that's beside the point. So here we're putting in variables:
changing the greeting to "Salutations" and the subject to "folks".
And then we're writing the output to a secret. So if we
look now, we have a secret. Well, we have several
secrets; I created some in the background without you noticing. I'll get back to
that. But we have one called greeting-output, we have one called
tfplan-default-greet-folks, and we have
tfstate-default-greet-folks.
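If you'd rather not use k9s, you can inspect the same secrets with kubectl. This is a hedged sketch: the namespace and the output key name (`message`) are guesses based on the demo, not confirmed values:

```shell
# List the secrets the TF controller created, then decode the output value.
kubectl get secrets --all-namespaces | grep greet
kubectl get secret greeting-output \
  -o jsonpath='{.data.message}' | base64 -d
```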
So when you're doing a terraform plan, what you usually
do in a pipeline is this: if
you're going to have an approval
step that says, this
is what's going to happen, do you want to approve it?, then if you don't put
the plan into a plan file, you could potentially get a different result
the next time you run it with terraform apply.
So you usually put the plan into a file. In this case, the TF
controller puts that in as a
secret. So if you look into the TF plan secret,
you have the TF plan data. You also have the
state. Like I said, if you don't do anything, it will actually keep
the state file itself in the
Kubernetes cluster as a secret,
which you could then export and put somewhere else if
you want to take a backup of it, and so on and so forth. But more
importantly, we have this greeting-output secret. I'm going to press x to
decode it. Sorry.
And now it says "Salutations, folks". Right,
so this
is basically the workflow. You can have
these run automatically, you can
put the outputs into a secret, and then you can reference that
back and forth. Now, this is not a great example.
This is just literally running
the most basic Terraform ever: taking inputs
and then exporting them as an output.
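For reference, the whole greetings module amounts to something like this sketch; the variable defaults are reconstructed from what's shown on screen, so treat them as assumptions:

```hcl
variable "greeting" {
  type    = string
  default = "Hello"
}

variable "subject" {
  type    = string
  default = "GitOps Days"
}

# Format the two inputs into a single message output.
output "message" {
  value = format("%s, %s!", var.greeting, var.subject)
}
```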
What we want to do now, to show off a few more capabilities, is put in a new
kustomization which has an example of
using dependencies. And if we
look at the examples, we'll see we've got a bunch of
files here. First of all, look at the Terraform ones. We have
a root module for shared resources.
This is where we're going to create some Azure resource groups
to have stuff in. We're going to put in a virtual network,
we're going to add some security groups for that virtual network, and then
we are going to export from that. We're exporting the subnet ID and
the resource group name, because
with these shared resources, we want several virtual
machines, in this example, to be deployed into
the same resource group, in the same subnet, et cetera. So we're creating
these resources, and then we have a workload, which is basically just
a VM.
We create a network interface that is using that subnet ID,
we have a virtual machine that just gets deployed, and
we have an admin password that we're
going to be using. So in the background here,
I put in two things. I put in a secret for the workload IDs,
because we want these secrets to come in from somewhere else.
We don't want to store them in Git. You could do that
if you're using SOPS or something like that to encrypt
your secrets, but for demonstrations
I'm just putting the secrets into the cluster manually while you're not
looking. And then we're going to run the
shared-resources one. We're going to disable the backend
config, so we're not going to store state in the Kubernetes cluster.
I have put in a secret for, well, it says here terraform enterprise
CLI config; it's basically just a token for Terraform
Cloud. So I have a token for Terraform Cloud, which means that
I can use that as my backend.
And we're putting in the variables, deployment name and location,
and we're going to write our outputs to a secret.
And this secret is then going to be used by the workload
one, which is doing exactly the same thing as the other one,
but it has a dependsOn. So it
will actually wait until the shared-resources
one is deployed in a proper fashion.
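Put together, the workload resource might look roughly like this. The field names follow the tf-controller API as I recall it, and the resource and secret names are taken loosely from the demo, so treat the specifics as assumptions:

```yaml
apiVersion: infra.contrib.fluxcd.io/v1alpha2
kind: Terraform
metadata:
  name: workload-one
  namespace: flux-system
spec:
  interval: 1m
  approvePlan: auto
  path: ./examples/workload
  sourceRef:
    kind: GitRepository
    name: flux-system
  dependsOn:
    - name: shared-resources       # wait until that Terraform resource is ready
  varsFrom:
    - kind: Secret
      name: shared-resources-output  # subnet ID and resource group name
```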
So if we look at the kustomizations
here, and if we go into our
kustomization file and uncomment it.
It's kind of fun using conventional commits to stylize,
in a proper fashion, something that's basically not
really that important. But I'm
just doing this because... I'm sorry,
hold on to your hats, I need
to fix that. I'm on a new computer, sorry. And now
I'm going to sync the changes up. As soon as I do,
well, after a while, the
source controller will see that there's a change here,
and it will actually pull it down and apply it. Which means that our kustomization,
this one, will then pop in. If that takes
a little bit of time, never mind. It can take a little bit of time,
because it runs on an interval, but
you can force a reconcile.
But it's already done that. So if
we now look at the Terraform resources, we can see that we have two
more: we have shared-resources and workload.
And the workload one specifically says that dependency flux-system/shared-resources
is not ready, so
it can't do anything. And if you see, the other one is now
saying that it's doing a plan. Every
time you put in one of these, it actually spins up a pod that is a runner,
and it runs the code for you. Let me just check my time
here. And what we can do is
we can open this, hopefully. There you
go, and see the plan run in Terraform Cloud. But
in itself, just like if you did this through a CLI,
you will also get this output in
your CLI tool. So as you can see here,
it started, it did a check. These resources already exist,
because I didn't want to take any chances with the demo gods.
But it runs through and makes sure that everything's good, and
when that's done, it destroys that
pod, because it's done. However, now if
you look at the Terraform resource,
it wrote the outputs to a secret,
and the ready state is now set to true.
So as soon as the reconciliation process for workload
one triggers, which I think is set to one minute,
I hope... yes. So this should happen pretty quickly. It will
look and see: oh, my dependent Terraform
resource is ready, so everything's good, so I
can start doing my stuff. And
like I said, hopefully that will happen pretty soon. There we go. So this
now does the same thing. Now it spins
up a new workload runner, or a Terraform runner,
sorry, for the workload one resource.
In this case, like I mentioned, I have deployment
name and location, but it also takes in the
workload secrets, which is more or less... I'm not
even sure what that is. It's the admin password. Let me double-check.
Yeah, it's the admin password and deployment name. So I'm overriding the
deployment name, because why not?
And then it also pulls in, as variables, the
subnet ID and the resource group name
from the shared-resources one, and it's using those to do the deployment.
And I think this one got deleted, so it might take a little while.
All right. I don't know, it's just refreshing state; no changes. Cool.
So as you can tell, this is how we can
build a dependency chain
here. You have a certain root module that is dependent on
different resources, so you can build up a slightly more complex
scenario in that sense.
But one of the main reasons why I really like this
way of doing Terraform is that I'm
building platforms on Kubernetes, and
while I can program in Go, and I
can make all of these slightly more complicated things work
and create APIs and so on and so forth, for many people
that I work with who are doing infrastructure as code,
it's a big learning curve to start learning to program. If you're a more
traditional infrastructure or cloud operations person,
doing Terraform comes relatively easily;
it's not really that hard. So by doing
it in this fashion, like I usually
say to people: if you can put your things into
a Terraform file and make sure that you can repeat it
over and over again, so you have the proper inputs, and,
if there's a dependency down the line, you do
your proper outputs — if you just do that, we can put it into
this setup, and then we can create basically a platform for
automatic deployment of infrastructure and more, using Terraform,
which is kind
of a standard in our industry, as one of the base tools. So while
the Terraform part of this is just plain, straight Terraform,
we as platform engineers can facilitate
this automation in a more proper fashion.
There are plenty of things that you can do with this. I highly
suggest that you look into it.
If you are doing Terraform,
give this a try. It's relatively easy to get running;
I basically did the entire thing from scratch here. And if there is anything
that you're thinking about or want
to discuss, feel free to just
yell at me, preferably on LinkedIn or through email, and
I'll make sure to help as much as possible.
Thanks for having me here for Conf42 DevOps 2023.
I hope this was valuable, and I
hope to be back someday. Thank you.