Transcript
This transcript was autogenerated. To make changes, submit a PR.
Are you an SRE, a developer, or a quality engineer who wants to tackle the challenge of
improving reliability in your DevOps? You can enable your
DevOps for reliability with ChaosNative. Create your
free account at ChaosNative Litmus Cloud.
Afternoon, morning or evening, everybody. Hope you're having a good conference so far.
My name is Andrew Kirkpatrick and I'm here to talk to you today about self-service,
PR-based, automated Terraform. So maintaining your whole infrastructure using
Terraform and reusable modules makes most of our lives that little bit easier.
But when those less familiar with DevOps want to create or update resources,
you usually either have to train and enable them to use Terraform or handle
the request yourself. However, what if you could offload the execution of those changes to
a centralized tool and just review both the code and output
being submitted for review? Atlantis, Terraform Cloud,
or env0 can act as a PR-based feedback loop for a hosted Terraform
executor to make self-service a little bit easier. So infrastructure
as code solves some problems, but not all problems.
So having a codified representation of everything in your infrastructure,
whether that be cloud or on premise, is great, means you
can point to exactly what line of code represents what thing.
But on the other hand, that doesn't stop people continually bugging you
with: I need this change made, or can you look at this? Or something's
not quite right here? And there are lots of legitimate reasons why people
submit change requests. So they need like a new virtual machine for
increasing capacity for an existing application, they need to test
out a new application, they might need to make changes to a
database configuration, all kinds of things. So what is
important to keep track of? Do you actually need infrastructure engineers to
make these changes, or are there very specific things in their day to day that
are actually the more important parts to make note of?
So do we need to make sure the changes being made are performed safely?
To make sure that production infrastructure isn't accidentally misconfigured
or deleted, or making sure that any changes, say,
to network access are performed in compliance with
whatever network policies you have? Making sure that any changes are actually tracked
against specific individuals or particular teams or
against specific projects. Making sure that the changes
that you made are codified in a way that makes them reproducible so you
can duplicate them or roll back in case of accidental misconfiguration.
But most importantly, from a PR-based perspective: are we instituting a proper
peer review process, similar to a pull request workflow for regular code changes? And do we
have approval by the correct chain of command, making sure that any changes that
hit X, Y or Z are approved by the people they should be run past?
So why automate Terraform? The HashiCorp Configuration Language
is a great way to represent all kinds of different parts of infrastructure
with many different vendors like AWS,
GCP, Azure, plenty more. And one of the advantages
of it is being able to bundle up more complex
concepts in modules, abstracting away
some of the complexity of "I need this specific set of resources
to go out in this exact configuration each time" and making
that tweakable. So using pre-built building blocks
like that, would you be able to hand those
over to developers or other stakeholders to
roll out mostly cookie-cutter bits of infrastructure,
provided you give them the guardrails to do so? And if
you did that, how would you actually validate that those changes are going to be
correct, and make sure the approval process is there so that
only the changes you want, and only the changes
that should go out, actually go out? So validation is kind of a key point.
So terraform plan on the command
line is: these are the kinds of changes that I'm proposing to make, and what it is
actually going to do based on the difference between the code that I've
got and what's in remote state. So being able to validate
and revalidate that what's being put up, say in a pull request
in this example, is accurate to what
the developer originally intended, versus what you as
an SRE would want to double-check; just making sure that
matches up. And from an approval perspective, say if someone's making
changes to core DNS records, making sure that
say via a code owner's file or some kind of other validation that the correct
people are getting notified to make sure they approve. Just in case, say we're making
a change that could take down everything in production to make sure that the
right checks and balances are in place, not just from an audit standpoint, but from
a safety standpoint and making sure, say, if you've got
integration between something like Jira and GitHub, that the right kind of
workflows happen in other tools such as project management. So what are we going to
talk about today? So this is just going to be a brief touch point on
each of these topics: running through Terraform,
how it sits in a self-service infrastructure concept,
evaluating some of the tools that are out there, and going through a few examples
of how that might work. So Terraform at a glance,
for anyone that's not familiar: it's a kind of domain-specific
language, represented in what's called the HashiCorp Configuration Language, of:
these are things that I wish to exist, typically in
the cloud, but also for on-premise infrastructure. We also use it for identity and
access management. So essentially any kind of thing that you want to
create, update, or delete. Fundamentally they just abstract underlying
APIs behind Terraform providers, which essentially translate
how this code looks into API calls under the hood,
and it works with many different providers. This is kind of the way that I
typically think about it, which is possibly not strictly correct, but I
think of code as the things that I want to be true: if I've
added, then I want these things to be created; if I've deleted, then I want
these things to be removed. State is what either I
or somebody else last caused to be true; this is,
as of the last time we modified state, what
was true. And then the APIs represent what is actually
true as of this moment: if I'm going to try and make something,
I'm going to ask you, and you're going to tell me either whether it was
successful or whether it failed. And remote state is kind of key to working with
multiple engineers, which is typical for a team-based setup,
but also key to a pull-request-based workflow. It means that you
can continue to use your local development workflow to work on all
Terraform projects as you typically do, syncing state changes to
an AWS S3 bucket or Google Cloud Storage bucket for example,
and your central executor also syncs its changes
there. So you can collaborate using pull-request-based
workflows for some projects, but continue to use a local workflow for others.
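As a rough sketch of what that looks like, assuming a GCS bucket (the bucket name and prefix here are hypothetical; an AWS S3 backend block is very similar):

```hcl
# Hedged example: point this project at shared remote state in a
# hypothetical GCS bucket, so local runs and the central executor
# both read and write the same state.
terraform {
  backend "gcs" {
    bucket = "my-terraform-state"   # hypothetical bucket name
    prefix = "projects/example"
  }
}
```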
So as a basic example, just in case anyone hasn't seen: say this is how I would
create a bucket. I would type this out, giving some basic details,
with some defaults provided by the resource. I would
initialize it, which, because Terraform is modular, just downloads
the plugins I need to communicate with
the backend APIs. I could plan, which says: these are the API
calls that are going to get sent; I would like to create this bucket.
I would then apply, which actually creates the underlying bucket, as
you can see in the UI, and then writes to state saying the bucket
got created. And subsequently, if I deleted that code, it would
say: this exists in state but not in code, so I must need to delete it.
And then you see that become reflected in the cloud.
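To make that walkthrough concrete, here's a hedged sketch of roughly what that example might look like; the resource and names are stand-ins, not the exact slide contents:

```hcl
# Hypothetical bucket definition (Google Cloud shown; any provider works).
resource "google_storage_bucket" "example" {
  name     = "my-example-bucket"  # made-up name
  location = "US"
}

# The corresponding lifecycle on the CLI:
#   terraform init    # downloads the provider plugins
#   terraform plan    # shows the API calls it intends to make
#   terraform apply   # creates the bucket and records it in state
# Deleting the resource block and re-running plan/apply removes the bucket.
```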
So that's great and all, but how does that hook into self-service infrastructure?
Infrastructure engineers usually write infrastructure code, and application engineers
usually write application code, but that often results in the thrown-over-the-wall
antipattern: "I've built this application,
I don't want to care about how it's run, you go and figure it out,"
or "I manage these servers, why don't you run applications
that don't use too much memory?" And that's the typical clash
of historical dev and ops days that the whole DevOps ethos is trying to
break down. That creates a couple of problems: infrastructure
engineers want applications to be changed so that they don't use too
much memory or too much CPU, or application engineers want
to make a change to infrastructure because they need a different database,
more servers, more capacity. We won't focus on the former;
today we'll focus more on the latter. How do these requests
typically come in? A lot of these arrive either via a Jira ticket,
a Slack message, a tap on the shoulder, or some other
pop-up of: hey, I would like this thing, can you do
it for me? And that just represents the typical toil that
infrastructure engineers, or SREs, have to deal with on a day-to-day basis. So self
service is designed to allow these people to do these things for themselves.
That being said, writing terraform from scratch is pretty daunting. There's a
reason why infrastructure engineers are generally the people that write all these
complex configurations for, say, how load balancers are supposed to
hook up with specific firewall rules, security groups, all of that jazz.
So how do we kind of encapsulate that complexity away so that,
say, an application developer can just throw up a server that automatically
is in the right VPC, hooked up to the right security groups,
all of that stuff, and it just kind of works transparently. So I
tend to think of terraform modules as kind of classes of HCL,
in that you should be able to configure attributes of specific things,
but not everything. And all of the other stuff should kind of happen automatically out
of the box. So as an example, using a DigitalOcean VM: there are lots
of different attributes that you can configure on the resource, but if you abstract
that behind a module, you can either make things directly configurable via,
say, the name, or you can use interpolation, or even
ternary statements or hashmap lookups. So say I'm using
a string for environment to configure whether I want backups or monitoring to be in
place; I'm interpolating CPUs and memory into a specific
string; and then I'm forcing things like: I want
this VM to be of this type, in this region, always on
this image, to make it consistent across all invocations
of this module, or in particular all VMs, say if you wanted
to use a pre-baked Packer image, for example.
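A minimal sketch of that module idea, assuming the DigitalOcean provider; the variable names, region and image are hypothetical:

```hcl
# modules/vm/main.tf (hypothetical): a "class" wrapping a droplet.
variable "name"        { type = string }
variable "environment" { type = string } # e.g. "production" or "development"
variable "cpus"        { type = number }
variable "memory"      { type = number }

resource "digitalocean_droplet" "vm" {
  name = var.name

  # Interpolation: build the size slug from CPUs and memory.
  size = "s-${var.cpus}vcpu-${var.memory}gb"

  # Ternary statements: only production machines get backups/monitoring.
  backups    = var.environment == "production" ? true : false
  monitoring = var.environment == "production" ? true : false

  # Forced values: every VM from this module gets the same region and
  # (e.g. Packer-built) image, keeping all invocations consistent.
  region = "tor1"
  image  = "ubuntu-22-04-x64"
}
```

A developer consuming it then only touches the knobs you expose:

```hcl
module "app_server" {
  source      = "./modules/vm"
  name        = "app-1"
  environment = "production"
  cpus        = 2
  memory      = 4
}
```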
So what options are there to facilitate this from a pull request standpoint?
Atlantis was one of the first options out there;
Terraform Cloud subsequently came along as quite a fully featured solution
from HashiCorp themselves; then env0 is a relatively
new player that takes a slightly different twist on the concept. We'll dive into each
in a moment. So, Atlantis versus some of the alternatives.
Atlantis works purely on the basis of pull
requests, whether it be GitHub, GitLab or BitBucket. You can
run it in a container typically, but it's just a Go application, so it will run
anywhere, and it just responds to webhooks. The scope
of what it does is very limited and specific; of the tools in this presentation,
the others are somewhat more flexible in places. You can configure it
which is based on directory structure, and you can implement kind of
custom workflows which we'll dive into in a bit. Terraform cloud on
the other hand, is a solution that runs either entirely
in the cloud, entirely on premise, or using a hybrid model where you
can run the control plane in the cloud and then executers within your own environment.
It uses an enhanced backend. So if you're familiar with remote
state for Terraform, say if you store it in an S3 bucket:
it uses a special backend where
local development and development in Terraform Cloud both communicate
with a backend that lives in Terraform Cloud, with some additional features.
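Configuring that enhanced backend looks roughly like this (the organization and workspace names are hypothetical):

```hcl
# Hedged sketch: Terraform Cloud's "remote" enhanced backend, shared by
# local runs and runs inside Terraform Cloud itself.
terraform {
  backend "remote" {
    hostname     = "app.terraform.io"
    organization = "my-org"            # hypothetical

    workspaces {
      name = "dns-production"          # hypothetical
    }
  }
}
```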
But it does have some other slight limitations, in
that how it works with workspaces is slightly different. So whereas Atlantis
is only PR-based, Terraform Cloud offers many different ways that
you can work with it: there's a REST API that you can interact
with, there's a CLI tool,
it's much more fully featured. The actual confirmation screen
for manual approvals for specific things actually happens within the interface
itself, so it's not kind of triggered via pull
requests, as we'll see later with Atlantis, but the UI is
fairly simple to use and self-explanatory. When you confirm those steps,
you can see everything happen exactly as you would normally see it on the
CLI. It has some handy features, like being
able to block destroying things accidentally, and it has a
lot of integrations in terms of notifying you when certain things have occurred
within Terraform Cloud as an overall platform. Some of the gotchas that
I came across are that it doesn't support symlinks; I
use some trickery to link tfvars files
into auto.tfvars files, and that's not supported. It's just one
of those things, and there may be others. And one thing that got me initially is
that it only supports enhanced backends. It's not just that they're what it's
recommended to work with; it actually would not read remote state from my GCS
bucket. env0, on the other hand, uses what they call organization
templates, and that's essentially kind of like a one
to n carbon copy of any project that you have in a
specific directory. So kind of the idea is more along those lines of
ephemeral environments. Say if you wanted to spin up a dev environment
based on a specific template, this is kind of a tool that's built around that
kind of workflow and uses terraform workspaces to do it. It's a relatively new
tool, so they're probably adding more features, and they probably have since I originally
wrote this presentation, but it's definitely looking promising so far. So the
project templates, as you can see, let you create a workspace by name. And this
differs from a typical workflow in that you don't pre-create workspaces.
Say like if you wanted everything to be a carbon copy, you have development,
staging, production, you can just create these workspaces ad hoc,
and that's kind of the intention, or at least what I took away from
trying to use it. So those environments kind of pop up and you get
what is intended to be like cookie cutter environments of,
I want say this load balancer with these three
application servers, one DB, like that, and just repeat,
repeat, repeat. It makes that quite easy to do. So some
of the neat features is that it's got cost limitations built in. So say,
if you've got a team of 20 developers making sure that
they don't just spin up infinite amounts of environments to test things
so you don't run out of money, having things be truly ephemeral.
So say if someone spins up a workspace to test something out,
you can set how long it's supposed to last and it can be automatically destroyed
after. And you can limit the number of environments per user.
So I think it's a great fit for ephemeral environments,
and there's a lot of features that really help support that.
As for the gotchas: it doesn't actually support remote state. It literally
copies state files out of a working directory within env0 itself,
because it runs entirely in the cloud and there's no other way to run it.
And it uses workspaces to manage that in a way
that you never really see. So everything that happens in env0 stays
in env0, which is useful if you are just using it to create environments
on the fly, but for more long-running infrastructure it might not necessarily be
the right fit. So how does this relate to pull request workflows?
So some people have kind of asked, why don't you just use CI?
Like, why don't you use something like Circle CI, hook it into that? You can
do it. But there are a few gotchas that HashiCorp themselves actually highlight
quite well in their own documentation. One is making sure that
when you plan something, state hasn't changed
in the meantime: commits haven't been added, PRs haven't
been opened with plans run against them for the same project
somewhere else. Then there's how that gets approved,
and actually trying to figure out which directory, or which workspace
in the same directory, to work on. These are tricky things
that you take for granted when you're working locally, but from an automated
standpoint there have to be ways for the tool to identify them. So the
plan and apply synchronization issue is just: if you're running it on CI and the
plan can happen on any given machine, how do you get that plan
output file, make sure that it relates to a specific commit, and then have
the plan and the subsequent apply happen on the exact
same commit? It's a slightly odd
workflow from a CI perspective, which is supposed to validate each commit as
being golden and good; you'd have to write the plan file somewhere and load it back up.
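A rough sketch of the hand-off you'd have to build yourself on generic CI (the artifact storage and commit pinning are left up to you):

```sh
# On the CI machine that plans: save the plan, tied to a commit.
terraform plan -out="plan-${COMMIT_SHA}.tfplan"
# ...upload the file somewhere, wait minutes/hours/days for approval...

# Later, on the exact same commit, fetch the file back and apply
# precisely what was planned, nothing more.
terraform apply "plan-${COMMIT_SHA}.tfplan"
```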
And then there's the issue of the approval step.
So if I've planned something out, how do I decide, after
potentially minutes, hours or days, whether that's something that I
want to do, and know that things haven't changed in the meantime?
For something that's supposed to be continuously integrating,
those are potential hiccups. So it's not that you can't; it's just
nice having a "this is what I said I wanted to do, I'm sure
this has been approved, I'm now going to let it go
ahead" step. You could do it automatically, but that has dangers. So in
the context of a PR, how do you actually get feedback on what Terraform is
doing in the background? Atlantis has a couple of comment commands:
you essentially comment on the PR, and it will trigger atlantis plan and atlantis
apply, which run terraform plan and terraform apply
correspondingly. It will show the feedback of those commands as
comments itself, if you assign it a machine user in, say,
GitHub, GitLab or BitBucket. Terraform Cloud will
only provide the feedback in its own user interface. env0 will
also comment back, but doesn't have a corresponding status check, whereas Atlantis
has both. So you can see it churning away in the
background, and it'll eventually give you feedback on what's going on.
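The comment commands themselves are just PR comments, roughly like this (the directory, workspace and project names are hypothetical):

```sh
atlantis plan                      # plan every project changed in this PR
atlantis plan -d dns -w default    # or target one directory and workspace
atlantis apply -p dns-production   # apply one named project from the config
```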
In terms of locking: one of the issues I mentioned before is
if you're making changes to one project and someone else also makes changes
to a project at the same time, how do you decide who goes first?
Especially if you've both branched off of master or main? Atlantis has a
concept of project locking, and this is
separate from terraform state locking. So it will keep separate
track of this and go. If I have planned out a pr over here
and someone else tries to make changes to, say, development DNS,
it'll go: okay, you're trying to make the same change to
the same thing; this person was first, so they get to go first. And you'll
see those locks pop up in the UI. You'll get a notification on
the pull request that will basically say: this plan has failed because
someone else is first in the queue. That will then become
unlocked, and it will show you: if you want to get this pushed
through, you have to get the owner of this other PR to go first.
Apply requirements essentially come back to your version control system
workflow. So if you're used to using GitHub, whatever approval
workflow you use there applies similarly, which is kind of nice, because if you
use things like code owners, and people are very familiar with a GitHub,
GitLab or BitBucket type workflow, this is essentially the
same. If you need two approvals before it's good to go,
or code owners for specific people on specific files, all of that works
exactly the same. And then mergeable requirements basically make
sure that it's not going to cause a code conflict: exactly
the same as most people are used to.
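For example, a hypothetical CODEOWNERS file that forces the right reviewers onto sensitive paths might look like:

```
# Changes under dns/ need the network team's approval before the PR
# counts as approved. The teams and paths here are made up.
dns/       @example-org/network-team
modules/   @example-org/infrastructure
```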
And that, I think, is one of the comforting things about it: this is a very
similar workflow to what people are already used to in a lot of cases.
That's all good and well, but where does this actually happen? Where is
terraform actually running? In the case of Atlantis, it's deployed
into your infrastructure, so it runs from within; webhooks are sent to
some exposed endpoint, so it
takes a little bit more configuration to set up. Terraform Cloud, as I mentioned before,
can be run entirely in the cloud, as a hybrid model with the control plane in the cloud
and agent pools that run in your infrastructure,
or entirely on premise if you pay for the enterprise plan.
env0, on the other hand, runs entirely in the cloud. But one
thing that some people don't necessarily consider to start with is
that when you're normally working with terraform, you typically identify
as yourself. So I am an SRE.
I have these elevated credentials that work in
say Google Cloud, AWS, GitHub,
pagerduty, whatever provider that you're working with, you'll identify as
you and you get elevated permissions on the things that you have access to control.
Whereas if you're using a central executor, it typically has to be a service or
a robot account that you give permissions to in
one place, and then everyone essentially tells
this one thing, the central executor, what to do.
So that can be good and bad, depending on your viewpoint.
But creating these service accounts is just one consideration; you also need to
figure out how to get those credentials in, which is easier
in the cloud solutions like Terraform Cloud and env0. With
Atlantis, you're going to have to figure out how to inject them, bearing in
mind that these work exactly the same as the providers normally do.
So the configuration for that on your desktop is going to be exactly the
same as, say, if you run it in Kubernetes: how are you going to inject
those credentials into the Kubernetes pod? If you're running it in Fargate,
which I have done, it's the same kind of thing. You need to figure out how
to get those credentials securely in there,
which entirely depends on your security posture and how strict you
need to be about it. So this is a very basic example
of how you can inject various secret keys
into an Atlantis pod in Kubernetes.
You can either fetch these from Vault, given the correct integration,
or, say, you could use Kubernetes External Secrets; there are lots of different
mechanisms you can use. Anything you're normally using to make sure your secrets
are secure applies here, but this is a bit more of a manual approach.
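A minimal sketch of that, pulling hypothetical AWS credentials out of a Kubernetes Secret into the Atlantis container's environment:

```yaml
# Hedged example: secret and key names are made up; the provider reads
# these standard AWS environment variables as it would on your desktop.
apiVersion: v1
kind: Pod
metadata:
  name: atlantis
spec:
  containers:
    - name: atlantis
      image: ghcr.io/runatlantis/atlantis:latest
      env:
        - name: AWS_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef:
              name: atlantis-aws-creds
              key: access-key-id
        - name: AWS_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              name: atlantis-aws-creds
              key: secret-access-key
```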
In terms of how that runs in the background, you can see
this is how Atlantis figures out what to do. It has
a YAML file, similar-ish to the other two tools,
where you basically say: I want you to track these projects in these
places, and when you see changes, do these things.
You can then apply customized workflows on top of that, but otherwise
it runs pretty much as you'd run it on the desktop: go to this directory,
init, plan and apply, and then print out the results.
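That repo-level YAML looks something like this; the project name, path and requirements are hypothetical:

```yaml
# atlantis.yaml (sketch): which projects to track and what to require.
version: 3
projects:
  - name: dns-production
    dir: dns/production
    autoplan:
      when_modified: ["*.tf", "*.tfvars"]
    apply_requirements: [approved, mergeable]
```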
If you do want custom execution, there are certain ways you can do
that in Atlantis; it's much more limited in Terraform Cloud,
but env0 also supports custom flows, which is kind of nice.
So as a potentially weird example of this:
say you needed to get special credentials
for AWS. In this odd example,
you can run custom scripts. I can inject scripts into the pod
and run essentially arbitrary commands before and after every corresponding
Terraform command for the plan and apply. So if I need to generate
special tokens, modify tokens, or do any
kind of homegrown weirdness as part of the workflow,
you can get that in there. So if any provider doesn't do what you want
out of the box and you're doing anything funky, you can pretty much get
that all set up.
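As a hedged sketch, an Atlantis custom workflow with arbitrary commands wrapped around the built-in steps might look like this; the token script is entirely hypothetical:

```yaml
workflows:
  default:
    plan:
      steps:
        - run: ./scripts/fetch-aws-token.sh   # homegrown credential step
        - init
        - plan
    apply:
      steps:
        - run: ./scripts/fetch-aws-token.sh
        - apply
```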
To actually show you how this looks and works: we're going to edit a zone file, and I'm going to delete one of the
records that we've got, just as an example.
So let's pull that out of there. So this is Google Cloud DNS.
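For context, the record being deleted might look something like this in the code (the zone and addresses are made up):

```hcl
# Hypothetical stand-in for the DNS record removed in this demo.
resource "google_dns_record_set" "demo" {
  managed_zone = "example-zone"
  name         = "demo.example.com."
  type         = "A"
  ttl          = 300
  rrdatas      = ["203.0.113.10"]
}
```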
So you can see the Atlantis UI on the left hand side. Doesn't really do
an awful lot. So I'm going to commit that change to version control,
push that up, and then I'm going to create a pr
off the back of that to say this is what I want to delete.
I want my automation to make this change for me. So you see that pop
up. I'm just following the link, put in a comment, explain to my team members
what's going on, and then hope
that the relevant people will come along and review it. Obviously I don't technically
have reviews enforced in this example, but you get the point. You'll see the Atlantis plan
will happen in the status check. Eventually it will comment back and say this is
what I plan to do. I'm going to delete this DNS
record because that's what you said you wanted. So I'm like okay, great,
Atlantis apply, let's get this DNS record blasted.
You go to apply that and the comment you'll get back is you
need someone to approve this. I'm not going to do it,
so go and get someone. So special person
comes over, reviews, everything looks good. So let's comment
Atlantis apply again. The way that I've got my
workflow set up, once everything's good and
golden it'll apply and I set it to automatically
merge the branch for me and then delete it off the back of that.
In terms of how the locking looks: say we've
already got a pr up that says I want to delete this
record. I'm going to add the change back in,
pop the record back in. So let's say:
yeah, I want this record back. Why is it gone? I needed it.
So we'll put up the branch, create a PR just
like before,
open that up and then we'll let Atlantis work away in
the background, and you get an error message which will say: this project is
currently locked by an unapplied plan. You take a quick look in the UI and you
can see that pop up. It's for this repository,
this project. So you can click through onto that and go: ah,
okay, so they need to go first; that's locked.
Once that goes through, then I get my turn.
So as for env0: say I want to create a project environment.
I will go to a project template which is going to be essentially a project
somewhere in my repository, in a version control system. I'll say
I want you to make a new workspace, give me a new project environment that's
based off this project. So it's going to go through, clone out the repository,
go through very similar steps, initialize everything and then it will
give me a plan, and it will wait for approval. It says: I'm
going to make these resources that are in this project, this is what I
plan to do, do you want to do it? So there's a manual approval step in the
UI here; you'll go through there, think about
it some more and eventually apply it and create the environment for you.
You see that everything was created, and then we're good to go.
In terms of a pull request workflow: say we're going to jump into
our zone file here again, delete a record, and then we're going
to commit that and push that up. Same workflow as usual:
put up a pull request.
So once the change is up, go through the same process as
before, add a comment for clarity.
And then once we've created it, you'll see that env0
will eventually comment back saying: this is the plan of things that
I intend to do.
It takes a little while, and then success. We flip back, and you'll see that
it essentially comments back similar to how Atlantis did, saying: I'm going to create
these resources for you.
Once that's all approved by somebody, I'll be able to get that merged
in. Then that will trigger an actual apply. So this is kind
of the difference between this and, say, Atlantis, and in fact Terraform
Cloud: you will get the plan, kind of
like a preview of what changes are going to be made ahead of time.
And then once those have actually been merged back into your master or main branch,
that's when the apply happens. So just kind of a difference of what
happens before or after. You'll get the manual approval step here,
which appears in the UI itself saying this has all been merged into
main. This is what I'm going to now create
for this project environment for this workspace for this project. Are you sure?
It rolls ahead; terraform apply creates that and then just gives
you the output. In terms of Terraform
cloud, you can run a plan manually, say via the user interface
which looks fairly similar to the other two in terms of what it shows:
you'll see a plan of what it intends to do, you'll get a
manual confirmation step, similar to env0, which you have to agree to in
the Terraform Cloud UI, and the apply finishes.
All good. And that's pretty similar. In terms
of how that works in a pull request workflow. Very kind of similar.
I'm going to delete a record from a DNS server again.
Let's delete it, get that committed, get that pushed up to a pull request.
So you'll see the change comes up. Let's create that,
add another comment for clarity.
And then you'll see that terraform
cloud will show the outputs of what it plans to
do off the back of that. So you'll see that show up as a status
check, but it won't actually comment back on the PR itself. Once I've
got an approval from someone, I'll get that merged in. So that's merged into main,
and I can then see, similar to env0 but more similar
to how Atlantis works, the apply actually
come up here, with the manual confirmation step
in the same way. Let's say: okay, these changes are now in main,
it's good to go, let's run that apply. Let's get that approved,
and then that's now applied and out into the wild.
So what are some of the advantages and drawbacks of a PR based workflow?
So some of the advantages are that, say, if you're working with
people that need to make infrastructure changes from time to time, but who don't necessarily
have everything checked out, set up, and good to go,
because that's not what they do day to day. It allows people to dip
in and out of making infrastructure contributions, which is nice for people that
need to make changes now and then. It also kind of adds a proper
peer review process before execution, which is nice. I
imagine myself, and probably many other people, like to watch
certain things actually terraform apply to make sure they happen. Because as
anyone that's used more early-stage Terraform providers knows, sometimes
just because something looks good in a plan doesn't necessarily mean it will work.
Also, there can potentially be conflicting things in,
say, AWS or GCP that aren't necessarily apparent until
you try and make changes via the API; sometimes it's nice to catch things like
this. In this case, say with Atlantis, you'll catch that
pre-merge, which is kind of nice. It ties in nicely
with other workflow automation tools. So anything else that hooks up to
your version control system, say, if you want to hook things into Jira,
like for full auditability, making sure that fires off this,
that and the other to any other systems, just to make sure that checks and
balances are in place. It potentially decreases your exposure to credential theft,
though the flip side of that is obviously that your
credentials are now all in one place. And it can alleviate some
of your toil bottlenecks. But a lot of that is going to depend
on people's familiarity and comfort with a peer
review process. So how kind of streamlined are your
PRs flowing through normally? How used are
people to that kind of workflow as it is?
Also sort of how well documented is not only your own code
base, but say the providers that you're using. So if you're using some of
the providers that are less well used, less well known,
it can be a bit more difficult for people who are not familiar with them day to
day to drop code in. It also matters whether
the code that's already been written actually is easy enough to work
with. So if you're not making efficient use of modules,
if you haven't segregated your projects up into small
enough chunks, it can be a bit unwieldy to work with. And that can be
scary for someone seeing feedback on a
pull request: I didn't ask to delete 100 servers,
what's going on? One of the disadvantages, though,
is that, especially if you're developing a module, the feedback cycle can be
really slow if you're fully reliant on a central, or rather remote,
executor to be previewing what you want to do.
Me committing code, pushing it up, and waiting for something in the cloud to
churn away and tell me if that's good or not is a lot slower than
me just developing it locally. So I typically develop reusable modules
locally, and then, once they're tested and good
to go, start hooking the projects that use those modules
up to self-service. One somewhat
controversial point, depending on how identity and access
management is handled at your company, is that you then move some
security controls from, say,
my AWS account or my GCP account, to version control:
you delegate power to my GitHub account or
my GitLab account. That may or may not be desirable, or
it may be completely analogous, depending on your security model.
As mentioned, having kind of a skeleton key, one thing that has credentials to
everything can potentially be a risk, but really
you can apply those same kind of secrets management technologies
that you would to the rest of your infrastructure, so in theory it's no more
or less vulnerable. It also depends on making sure you don't host it in places
that are a big attack vector, so not putting it in production;
we run it in a separate cluster. And you've got yet
another thing to maintain: Atlantis is pretty easy to set up, but at the end of
the day it's something we do have to manage and keep an eye on.
And it can be problematic if it runs on some of the same infrastructure that
it controls. So trying to make cluster updates to the same cluster
that it's on, it's not really much of an issue, but it can catch
you out if you're not paying attention. So why did we choose Atlantis?
Open source and being free to use was a big one. It's quite
well maintained and there have been a lot of contributions to it, so it's not
a dead project; it's pretty active. To be honest, it's pretty easy
to use. There's not a huge amount of functionality in it, but of
the functionality that we needed, it covers most of our bases. And the flexibility
of custom workflows, and being able to inject custom configurations,
say for managing multiple Kubernetes clusters, is kind of useful,
which was less obvious to do with the other tools. But as I mentioned before,
it's been a little while since I evaluated Terraform Cloud and env0,
and they've been moving at very rapid pace, both of them.
So definitely worth checking out, sort of where their feature set lies at the moment.
So in summary, Atlantis is
great. It runs entirely on premise, which is quite nice for some people,
but you are going to have to set it up yourself, figure out how to
run it, and you're going to have to manage how secrets get into there.
So there's a little bit more to be concerned about; it's not out of the box.
env0 seems like a great solution for: I want to allow
people to make sort of environments on the fly. I want developers
to quickly test out features with production like environments. I think it's an excellent,
or it looks like an excellent tool for that. Terraform cloud on the other hand,
is kind of the big-guns enterprise solution. It covers
everything that Atlantis does, all the
way up to managing big-company stuff. So I
think the feature set there is endlessly growing
and I think for a lot of companies that's probably going to be the logical
one to choose. But it really depends on what you're after,
especially given their level of support; you know there are plenty of people at
HashiCorp that can help you out with that. Thank you very
much for listening. I'm Andrew Kirkpatrick. If anybody's interested in engineering
at Partnerstack, please take a look at our job vacancies.
The link to this slide deck is on SlideShare, and if you have any
questions, I'm at magic, but on the internet;
just come and ping me and ask me a question. Thank you very much, and I
hope you have a lovely day.