Lessons Learned from Writing Thousands of Lines of IaC

Abstract

Immutable architecture is the backbone of infrastructure as code: it ensures production environments cannot be changed at runtime. While this brings inherent safety benefits, it can also be restrictive, and it creates new challenges for security. Still, immutable concepts are much more effective when it comes to securing cloud native environments and infrastructure, which is becoming an increasingly complex task.

This talk will focus on some of the fundamentals of immutable architecture, best practices and recommended design patterns to work around its limitations and enhance security, as well as what you most certainly should not be doing when running immutable architecture both from an infrastructure and security perspective.

This will be demonstrated through a real-world example of deploying a single-tenant SaaS in an automated pipeline, covering the typical challenges encountered and what was learned along the way, using Terraform, Kubernetes, and Step Functions.

Summary

Eran Bibi is the co-founder and chief product officer of Firefly. His talk covers some of the lessons learned while writing infrastructure as code in Terraform and embracing the immutable infrastructure concept. Split your code into reusable modules and try to minimize duplication when you write it.
The next few tips concern the Terraform state file, which is the persistent record of the current state of the cloud. Use Terraform data blocks to get information out of cloud provider APIs.
Don't bypass the infrastructure as code pipeline. Even if you put safety measures in place and do everything you can for education and behavioral change in your team, you still need to make sure that you are monitoring for infrastructure drift.

Transcript

This transcript was autogenerated. To make changes, submit a PR.

Hello. Welcome, and thank you for joining my session. My name is Eran Bibi, and I'm the co-founder and chief product officer of Firefly. For the past decade I have been doing DevOps, both as a hobby and as my profession. Today's talk is about some of the lessons learned while writing infrastructure as code, specifically in Terraform, by embracing the immutable infrastructure concept.

So let's start with understanding what immutable infrastructure is. The concept says that you cannot change or alter any configuration after you provision a server. That is very different from the traditional way of provisioning infrastructure, where you provision a server and then keep patching it and changing its configuration over time. With immutable infrastructure you enjoy a lot of benefits, like keeping everything consistent, and you also get more predictability about what is going on, because everything is the same. If you have an issue, you just create new infrastructure, and you don't need to worry about differences between the servers or the settings you have in place. And as Barney says, new is always better.

Okay, let's dive into the details. In this section I'm going to share a few tips with you. I'm going to use Terraform as the main infrastructure as code language in my examples, but some of the tips are relevant for other frameworks like Pulumi and CDK.

The first one is a very basic pattern: using modules. Instead of having a huge Terraform file describing all of your application and infrastructure in a single place, you can split it into smaller parts and then reuse them. If I need to describe modules in a few words: modules are smaller pieces of infrastructure as code that you can share between projects. If you want to get an idea of what modules look like, you can see it in this diagram.
So you have the main project. This is the root module, and it is what you get by default. But if you would like to have smaller pieces and reuse them in a few separate use cases, you just create a directory called modules, and inside it you can create child modules. A child module can be the networking piece or the storage piece, anything that you can reuse in other projects.

By embracing that practice you are basically following my next suggestion, which is to use the DRY pattern: don't repeat yourself. Try to minimize duplication in the code you write. Even if you are going deep into one project and you think you will be done once it is complete, think ahead. You can assume that infrastructure changes happen only at a certain level or on certain resources, so create a logical separation between the resources to minimize the risk when you make some change. You don't want a typo, or something you missed in your pipeline, to damage the entire application. You want the blast radius to be as small as possible, so create smaller pieces and reuse them whenever you can.

Another tip that I have for you is to keep everything consistent. Before you create your first project, do the reading, understand the best practices for naming conventions, and keep everything in the same convention. If you are using variables, which I highly recommend, make sure you know the right order in which to use them. If you are inheriting resources from place to place, make sure you do it in the right place. The same goes for modules: you can use out-of-the-box modules from the Terraform Registry, and you can also create new ones. So before writing a lot of code yourself, check whether that code is already written and available on the web.
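As a rough illustration of the root module and child module layout described above, a root `main.tf` might look something like the following. This is a minimal sketch, not the configuration from the talk: the directory layout, the child module names (`networking`, `storage`), and their input variables (`vpc_cidr`, `bucket_prefix`) are all hypothetical. The last block shows a module pulled from the public Terraform Registry instead of being written by hand.

```hcl
# main.tf (root module)
#
# Assumed layout, for illustration only:
#   main.tf
#   modules/
#     networking/   (main.tf, variables.tf, outputs.tf)
#     storage/      (main.tf, variables.tf, outputs.tf)

# Child module kept in the local modules directory.
module "networking" {
  source = "./modules/networking"

  # Hypothetical input, assumed to be declared in modules/networking/variables.tf
  vpc_cidr = "10.0.0.0/16"
}

# A second child module, reusable from other projects in the same way.
module "storage" {
  source = "./modules/storage"

  # Hypothetical input, assumed to be declared in modules/storage/variables.tf
  bucket_prefix = "example-app"
}

# An out-of-the-box module from the Terraform Registry, pinned to a version range.
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = "example-vpc"
  cidr = "10.1.0.0/16"
}
```

Keeping each piece in its own module means a change to storage only touches the storage module, which is one way to keep the blast radius small.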
My next few tips are about the Terraform state file. If you are familiar with the Terraform architecture, then you know that the state file is the persistent record of the current state of your cloud, so Terraform knows how to create a new plan once you introduce a change. If you are a team with multiple people running Terraform against the same infrastructure, you should use remote state, and Terraform provides backends that help you save the state in a remote location. I personally prefer an S3 bucket as the place to keep the tfstate file.

That takes me to the next point: make sure you back up the state file. Because the state file is so crucial to a healthy Terraform deployment, you need to always keep a backup of it. If you are using S3, as I did, turn on versioning in the S3 bucket. Then you have the peace of mind that you can always go back to a previous revision of the state file in case some disaster happens to it.

The next tip is to use state locking. We started with the understanding that Terraform is something you collaborate on with other team members, so you need to know how to handle a situation where multiple people are trying to provision and introduce changes to the infrastructure at the same time. This is why there is a feature called state locking. If you are using S3, you also need to use DynamoDB to manage the lock. The entire locking mechanism is there to prevent a situation where two people try to write to the same file at the same time. While Terraform is writing the state file for one person's change, it locks the file, and the other team member gets a message that the state file is locked.
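The three state tips above (remote state in S3, bucket versioning as a backup, and DynamoDB locking) could be wired together roughly as follows. This is a sketch under assumptions: the bucket name, state key, region, and table name are placeholders, and the bucket and lock table are usually created in a separate bootstrap configuration because they must exist before `terraform init` can use the backend. Newer Terraform releases also offer S3-native locking, so check the S3 backend documentation for the version you run.

```hcl
# Backend configuration in the project that stores its state remotely.
terraform {
  backend "s3" {
    bucket         = "example-terraform-state" # placeholder bucket name
    key            = "prod/terraform.tfstate"  # placeholder state path
    region         = "us-east-1"               # placeholder region
    encrypt        = true
    dynamodb_table = "example-terraform-locks" # placeholder lock table name
  }
}

# --- Bootstrap resources, assumed to live in a separate configuration ---

resource "aws_s3_bucket" "state" {
  bucket = "example-terraform-state"
}

# Versioning keeps previous revisions of the state file for recovery.
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# The S3 backend expects a string partition key named LockID for state locks.
resource "aws_dynamodb_table" "locks" {
  name         = "example-terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```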
So there is never a situation where two people write to the same file at the same time.

The next tip is to avoid manually changing the state file. As I mentioned, the state file is just a JSON file. It is human readable, and you have the power to edit it, remove lines, and add lines. But this is bad practice, and manually changing the state file opens the door to a lot of errors. If you are in a situation where you would like to bring in resources that Terraform is not yet managing, just use the terraform import command. I have not faced any situation that required manual modification of the state file, but I have heard about a lot of people who tried it in some cases and ended up with a corrupted Terraform deployment.

Next, I highly recommend using Terraform data blocks. The data block is basically a fantastic way to get information out of the cloud provider APIs. It can be the cloud provider or any other provider that Terraform uses. Just think about it as a query language that you can use to get information from the providers. I can give you a very quick example of a great use of data calls. Say you have a module with a hard-coded list, for example a list of the availability zones that I put in the file. That list is handled statically, and each time there is a change in the availability zones I need to make a manual change. But if I use a data block, I basically make a call to the cloud through the provider. In this case I will use a data source called aws_availability_zones and give it a property, state, set to available. This data block gets the list of AWS availability zones that are available, dynamically, through the API of the cloud provider.
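The availability zones example might be sketched like this. It assumes the AWS provider is already configured; the names `available` and `availability_zones` are illustrative choices. Because a Terraform input variable cannot take its default from a data source, the sketch stores the result in a local value instead.

```hcl
# Ask the AWS provider for the zones that are currently available,
# instead of maintaining a hard-coded list.
data "aws_availability_zones" "available" {
  state = "available"
}

locals {
  # Dynamic replacement for a static list such as ["us-east-1a", "us-east-1b"]
  availability_zones = data.aws_availability_zones.available.names
}

# Exposed as an output so the list can be inspected after terraform apply.
output "availability_zones" {
  value = local.availability_zones
}
```

Anything that previously read the hard-coded list can reference `local.availability_zones`, so a change in the provider's zones no longer needs a manual edit.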
So the only thing I need to do is create that variable, the availability zones, and set its value from the data call. This is just one example. Data calls also allow you to share information between projects. If you use them wisely, you will find that the data call is a powerful feature in Terraform.

My next tip for you is: don't bypass the infrastructure as code pipeline. When you create infrastructure as code, you are basically committing to a certain way of provisioning infrastructure and making infrastructure changes. While it is very easy to go to the cloud console, or to use the CLI or API of the cloud, all changes from now on have to go through the infrastructure as code pipeline. So do your best and put safety measures in place to make sure no one can alter the infrastructure directly in the cloud.

That brings me to the next point. Even if you put those safety measures in place and do everything you can for education and behavioral change in your team, you still need to monitor for infrastructure drift. Just to make sure everybody here understands what infrastructure drift is: drift is when your infrastructure becomes different from the one you describe in your infrastructure as code manifests. It is mainly because somebody made a manual change, but it can also be caused by a third-party application, like a CSPM tool, that also creates things in the cloud. So you need the right tooling in place to continuously evaluate your Terraform state and Terraform HCL configuration against the actual deployment in your cloud. There are a few projects that can help you do that.

And I think the most important tip I have for you today is: treat your infrastructure as code the same as any other code.
What I mean is that if you have a very good CI/CD pipeline, with reviews, static code analysis, linters, scanners, and all the good stuff that you can shift left into CI, make sure you have them for your infrastructure as code as well. For example, there are great projects for security scanning, like tfsec and Checkov. There is a very good project that can even give you a cost projection, infracost.io. So make sure you put all of those in place for your Terraform code, or for any other infrastructure as code language. And don't forget about having your peers review and approve your pull requests. This is something we tend to be looser about. But from my experience, once we became more mature in understanding that infrastructure as code is just like any other code, we saw the quality become much, much better over time.

To conclude: I think that if you take even just a few of these items, not all of them, and implement them in your journey, you will be in a very good position and you will have a better experience using infrastructure as code. Thank you very much. I will be available for any questions after the talk. Thank you, and see you again.

Conf42 DevSecOps 2021 - Online

December 02 2021

Eran Bibi

CPO @ Firefly
