Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hi there, thank you for joining my session.
Today we'll be talking about pragmatic security
automation and DevSecOps in the cloud.
So let me quickly introduce myself. I am Joshua Arvin
Lat. People call me Arvs. I am the chief technology
officer of Nuworks Interactive Labs. I am
also an AWS Machine Learning Hero and I'm the author of
the books Machine Learning with Amazon SageMaker Cookbook
and Machine Learning Engineering on AWS.
So feel free to check these out on Amazon,
especially if you're interested in performing machine
learning experiments and deployments in the cloud.
For today, our talk will be
composed of three parts. The first one would be a quick introduction
to the different concepts which we will use throughout the talk.
The second part will talk about the different attacks
possible depending on the type of system
that you're dealing with in the cloud. And lastly,
the exciting part, we will talk about a variety of security
automation strategies which would be useful for you
and your career. So let's get started. So let's
start with the introduction. So why are we
starting with this one? We would like to start with the reality
where a lot of business owners often would deprioritize
security and compliance, because generally
a lot of business owners think that hey, there's nothing to
secure if there's no business. So let's just prioritize the short
term and long term financial objectives first. And then of
course we need to prioritize also the client and
customer happiness. And then finally, that's the time they
focus on the other things like operational compliance along
with security governance and other compliance requirements.
So as you can see here on the screen, the priority level
for most businesses for security would definitely be
low, and you might be surprised because this
may not be what you're thinking of. However,
again, most companies often have this sort of
priority level.
That said, a lot of DevSecOps teams and
even engineers dealing with the applications and systems
hosted in cloud environments are often not
trained to handle and mitigate
different types of security risks and challenges. For example,
say a developer in a company is working with a team of
data scientists and engineers
to build and manage ML-powered systems.
That developer may often use libraries
such as this one, where if we were to load
a model with a payload inside it,
then there's a big chance that the system would get
compromised. So the challenge there is that developers often
just focus on the business requirements and they often try to
hit the deadlines. So even if there are warnings or security issues
like this, again, developers will deprioritize this
and focus on getting the application working first.
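To make the risk of loading an untrusted model file concrete, here is a minimal sketch. The transcript does not name the library shown on the slide, so this assumes Python's standard `pickle` module, which tools such as joblib build on. The `Payload` class and the `SafeUnpickler` policy are hypothetical names for illustration, and the payload is deliberately harmless: it only asks for the current working directory.

```python
import io
import os
import pickle


class Payload:
    """Stands in for a 'model' file crafted by an attacker (hypothetical)."""

    def __reduce__(self):
        # pickle calls the returned callable with these arguments at load time.
        # A real attacker would put a harmful command here; this one is benign.
        return (os.getcwd, ())


class SafeUnpickler(pickle.Unpickler):
    """Refuses every global lookup, so no callable can be smuggled in."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def safe_loads(data: bytes):
    """Load plain data (dicts, lists, numbers, strings) and nothing else."""
    return SafeUnpickler(io.BytesIO(data)).load()


# The "model file" an attacker might publish.
malicious_blob = pickle.dumps(Payload())

# Plain loading runs the attacker's chosen function instead of returning a model.
result_of_plain_load = pickle.loads(malicious_blob)
```

Calling `safe_loads(malicious_blob)` raises `pickle.UnpicklingError` instead of running anything, while plain data still loads. Blocking every global is a strict policy chosen for this sketch; a real system would more likely avoid pickle-based formats for untrusted files altogether.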
So what if a malicious user
decides to upload a model to
the Internet and the team decides to use
that model? Not knowing that there are
certain types of attacks which could take advantage of these
types of vulnerabilities, that team would
fall victim to that sort of exploit,
where when the application loads the model,
the server where the application is running would be compromised.
Then after a couple of minutes, or maybe an
hour or so, the entire account would be compromised as
well. So again, the challenge right now
is that security is often deprioritized, which means that
in terms of training, in terms of budget, there's generally
no one in the team who would be able to manage the
security risks involved when building different
types of applications in the cloud.
So if you were the attacker, how would you
try to attack that system? Of course,
from a builder's point of view,
you'll be telling yourself, okay,
it's not a good idea to use that library. However,
if you were the attacker, you would think of
the offensive side of things, and you would think of
the different strategies on how to compromise the system. So again,
the way the defenders think and the way the attackers think
are generally different. And it's important for security
engineers to be aware of both mindsets.
So if you were the attacker, you would
have a malicious file, which is in this case, maybe a
machine learning model. And that malicious file would contain
a payload, which contains instructions. So when the
application loads that file, the application would
also be running the commands
in that payload. So in this case,
the payload instructs the server to connect back to an
attacker machine. So that attacker machine is actually a different
machine. So the resource affected would connect back to the
attacker machine using something like a reverse shell.
So again, I'll repeat that part, a reverse shell.
So what do I mean by that? So, once the reverse shell has been
triggered and the attacker machine is connected to the victim
machine or resource, the attacker could easily just perform
any sort of operations and actions in the victim machine
or resource as if it were his or her own machine.
So similar to how SSH works, we connect to a machine
and then we basically type in the commands that we need to run.
So once the attacker is able to compromise certain
resources and machines, the attacker can technically do
anything. So the attacker can convert that machine
into another attacker machine and attack
other systems. The attacker could also check
if there are valuable resources and let's
say, database records, which could be used to
compromise other accounts or even steal money.
So again, the goal of the defenders would be to ensure
that the system that they're taking care of would not fall
victim to these types of attacks.
So in the previous slide, we talked about a single
move or exploit, and it's basically a single step where a
reverse shell was triggered. In reality,
this is just one of the different steps involved
when dealing with security attacks. And there is
something called the cybersecurity attack chain,
where the attacker makes use of a variety of
different attacks in sequence in order
to accomplish what needs to be done from an offensive side
of things. So for example, maybe there would be
some sort of scanning happening at the start.
And then once certain vulnerabilities have been,
let's say, verified, then certain services in
the application may be exploited. And then once
those services get exploited, then maybe
the next step would be privilege escalation, or maybe lateral movement,
where other resources and other accounts
may be checked and used in order to take over
the entire cloud account.
So once certain resources have been accessed,
then again records and even databases
can be downloaded and other sorts of
moves can be performed by the attacker. So again,
it's not just one step, it's actually a sequence of steps
where attackers take advantage of vulnerabilities which
can be exploited.
So from a defender's
standpoint, how can cloud defenders
and security engineers prevent
these types of attacks, given that
a lot of the company's
budget and resources are actually spent on the building
part? So let's say that you had 40 hours
in a week. The company would allocate maybe
35 to 38 hours of your time to
building the systems, and maybe just one to
two hours for other things. So you
would tell yourself, hey, isn't that too little
time for us to check and audit the
security aspect? And if you're asking that question,
then yes, that's actually the right question to ask,
because it's true. It's true that the
attacks cannot be prevented by just spending one
hour on reviewing what types of security attacks
are possible. So the best move in order for you to save time and
help you manage the different competing needs in your company, is to
use automation to your advantage.
And automation helps us speed things up,
where instead of us doing things manually, we're able to
focus on the work that we need to do manually and
have the rest performed by automation. So what's
an example of this? So for example, we have an
automated pipeline. This automated pipeline basically performs
a sequence of programmed steps where,
let's say you push in some new code, you push in
some new updates, the automated pipeline would run.
And let's say in the second step, different sorts of tools
and libraries would be used to assess the code.
So maybe the different libraries used would be checked.
Maybe if there are existing vulnerabilities and weaknesses
in the packages that you used, those would be flagged by
the second step. The third step may be
a manual approval step where additional checks may be performed
by a security engineer, maybe using
a set of tools. And those types of checks need
to be performed after the second step. So even
if the second step has been automated,
the third step can be manual, and
then the final step would be the deployment step. So once all checks have been
performed, then the application can now be deployed
to production. So again,
these types of pipelines are helpful, and this is
about combining the DevOps CI/CD process
with that security component in order to ensure
that before we actually deploy the changes in production,
we're able to ensure that the system is also
secure. In some cases,
especially for companies without a security pipeline
like this one, teams would be deploying
changes to production only to find out, after
running a vulnerability assessment test, that a certain component
in their application is vulnerable. So they would have to
roll that back and spend maybe one to two weeks resolving
and remediating those vulnerabilities and then perform the
deployment maybe a few days later. So again, that's not
the recommended process. One of the best ways would
be to incorporate security as early as possible
and use automated pipelines in order to
reduce the amount of time spent on security checks.
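The pipeline described earlier (push, automated assessment, manual approval, deployment) can be sketched roughly as follows. This is only an illustration under assumptions: the `Change` type, the stage functions, and the `KNOWN_VULNERABLE` advisory table are all hypothetical, and a real pipeline would call an actual scanner and a real approval workflow.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical advisory data: package name -> versions with known issues.
KNOWN_VULNERABLE = {"examplelib": {"1.0.0", "1.0.1"}}


@dataclass
class Change:
    dependencies: Dict[str, str]   # package name -> pinned version
    approved_by_security: bool     # outcome of the manual approval step


def scan_dependencies(change: Change) -> Tuple[bool, str]:
    """Automated step: flag packages with known vulnerabilities."""
    flagged = [
        f"{name}=={version}"
        for name, version in change.dependencies.items()
        if version in KNOWN_VULNERABLE.get(name, set())
    ]
    if flagged:
        return False, "vulnerable packages: " + ", ".join(flagged)
    return True, "no known vulnerable packages"


def manual_approval(change: Change) -> Tuple[bool, str]:
    """Manual step: a security engineer signs off after extra checks."""
    if change.approved_by_security:
        return True, "approved"
    return False, "waiting for security engineer approval"


def deploy(change: Change) -> Tuple[bool, str]:
    """Final step: only reached when every earlier stage passed."""
    return True, "deployed to production"


STAGES: List[Tuple[str, Callable[[Change], Tuple[bool, str]]]] = [
    ("scan", scan_dependencies),
    ("approval", manual_approval),
    ("deploy", deploy),
]


def run_pipeline(change: Change) -> List[str]:
    log = []
    for name, stage in STAGES:
        ok, message = stage(change)
        log.append(f"{name}: {message}")
        if not ok:
            break  # stop before any later stage, including deploy
    return log
```

The point of the ordering is that a failed scan or a missing approval stops the run before the deploy stage, so the vulnerable change never reaches production and there is nothing to roll back.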
So now that we're done talking about certain concepts
which are very relevant to the other slides in this
talk, let's now proceed with part two,
understanding what attacks are possible. So a lot
of us, when we're trying to learn about security automation and DevSecOps,
we try to learn the different concepts like DevOps,
we try to learn the different automation techniques and we try
to force the usage of these concepts in
our everyday lives. We try to copy what other companies
are doing, but in reality, what we should be doing is we
should try to customize the solution to what our company needs.
That's why it's important for us to understand what attacks
are possible in our own system.
So for example, for this one, this is a system utilizing
Kubernetes inside our AWS account.
So here it's easy for us to ask, hey,
isn't the Cloud9 EC2 instance high
risk? Meaning, given that it's in the public subnet,
is it prone to attacks? So yes, it might be true that it might
be prone to attacks, but at the same time, a lot of us are
unaware that Kubernetes,
especially when not configured properly, can also
be prone to attacks. So security engineers must
be aware that developers and DevOps engineers trying
to use Kubernetes might be overwhelmed by the
combination of steps needed to manage that
framework or tool. So security engineers must
be aware that maybe the one deployed in production might be
using the standard security
defaults, and the impact of that would be
that the setup is insecure,
given that the DevOps team didn't configure
the production setup properly. A lot of teams
actually encounter that problem. They would try to
use that shiny tool and they would try to include that
in their resumes, and then after one month, two months,
their system would be attacked because they forgot
to secure the production configuration.
So again, tools and services like
this can easily be secured as long as we properly
perform the steps to secure them. So again, teams forget
that crucial step.
So let's say that we have this type of system. We have to be aware
of the different types of attacks possible on
cloud resources as well. So in addition to Kubernetes,
as you can see, we're using EKS, Elastic Kubernetes Service,
and we're using EC2 instances there.
So the different types of attacks and risks involved there would include,
let's say, ineffective logging and monitoring. It may also include
sensitive data exposure, especially for misconfigured
resources. In terms of security
misconfiguration, similar to what I said earlier,
there might be security misconfigurations in the container
orchestration tool itself. And in addition to
that, maybe the network security configuration has
not been set up well either.
And in terms of the inclusion of, let's say, CI/CD pipelines,
especially when these accounts have these types of
resources as well, a lot of teams forget the
security of that also. And then in terms of secret
storage, a lot of teams have no idea where to store the
security credentials. So for example,
they'll tell themselves, oh, the best practice for this one is
to put everything in environment variables. So again,
there's a debate going around when talking about these
types of best practices, but the real best practice
here is looking for a solution
which would work best even if a malicious actor would
try to attack that system. So again,
best practices from an engineering end may not
really be best practices, especially if proven wrong by
real attacks.
In addition to the example shared earlier,
there are also applications utilizing the serverless
mindset, where different types of
serverless services allow developers and
engineers to build applications without managing servers.
So of course in reality there's actually a server there,
but that's the responsibility of the cloud provider.
So here we only need to worry about the custom part inside
the services. For example, the serverless application makes
use of a function as a service such
as Lambda, and it utilizes other services like
API Gateway and CloudWatch.
So in terms of the surface area, the surface
area would be different and the attacks on serverless applications
would be different as well. So given that the team
would focus more on the code part, then a big
portion of the security of serverless applications
would focus more on the code part. So attacks there
would focus mainly on code injection, and it may include
SQL injection as well. So you might think, hey,
isn't SQL injection already old school,
something which is an attack from the past?
But if a developer trying to build serverless
applications has no idea about SQL injection,
then they might try to simply load
the request input and put it
directly inside an SQL statement. So the
danger there is that developers would think, hey,
isn't all of this secure already? There's nothing to worry about when it comes
to security because we're utilizing serverless resources.
The unfortunate part there is that developers have no idea
that serverless applications are also vulnerable,
and the security level of these applications and resources
depends on the developers implementing the code.
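As a small sketch of the SQL injection problem mentioned above, compare pasting request input into the SQL text with passing it as a parameter. The table, the function names, and the use of an in-memory `sqlite3` database are assumptions made for illustration; the talk does not say which database or function code was involved.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database with two example rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str):
    # Vulnerable: the request input becomes part of the SQL text itself.
    query = f"SELECT name, secret FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str):
    # Parameterized: the driver treats the input purely as a value.
    query = "SELECT name, secret FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


conn = make_db()
crafted_input = "' OR '1'='1"

leaked_rows = find_user_unsafe(conn, crafted_input)  # every row in the table
safe_rows = find_user_safe(conn, crafted_input)      # no user has that name
```

With the crafted input, the unsafe version returns every row in the table, while the parameterized version returns nothing. This holds whether the code runs on a server or inside a serverless function.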
So there's also something called a denial of wallet attack.
So if you're using something like Lambda, then a
well-crafted attack would be able to use
that Lambda function, that compute resource, and have
it running an infinite number of times.
So that would mean that instead of that company paying,
let's say, $50 per month, the company would be
surprised that they would be paying, let's say, $1
million. So there are a lot of examples
of that happening all around the world, and some examples involve
developers making mistakes. However, in some cases attackers
would do this where
if they're not able to perform code injection, if they're not
able to have some sort of remote code execution,
especially on your setup, if they see that they can perform attacks
like denial of service and denial of wallet attacks, then they'll do
that as well. And in some cases, if you're not aware,
there are also broken authentication mechanisms which
can be taken advantage of. There are some companies who are
not aware that when they use and have credentials
in the front end part of a serverless application,
then the correct set of steps will allow
the attacker to get the credentials and
perform privilege escalation, especially if the credentials used
are bound to an account which has excessive
permissions. So again, the attacks
on serverless applications are a bit different, but again, when you're
using serverless, it does not mean that your
system is automatically secure.
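Going back to the denial of wallet scenario, a rough way to see how the bill grows is to treat cost as proportional to the number of invocations. The sketch below is only an illustration: the two unit prices are hypothetical placeholders, not real provider pricing, and `monthly_cost` is a made-up helper.

```python
from decimal import Decimal

# Hypothetical unit prices for illustration only, NOT real provider pricing.
PRICE_PER_REQUEST = Decimal("0.0000002")
PRICE_PER_GB_SECOND = Decimal("0.00002")


def monthly_cost(invocations: int, avg_seconds: Decimal, memory_gb: Decimal) -> Decimal:
    """Estimate a month's bill as request charges plus compute charges."""
    request_cost = invocations * PRICE_PER_REQUEST
    compute_cost = invocations * avg_seconds * memory_gb * PRICE_PER_GB_SECOND
    return request_cost + compute_cost


# Normal traffic versus the same function being hammered 1000 times as often.
normal = monthly_cost(10_000_000, Decimal("0.2"), Decimal("0.5"))
under_attack = monthly_cost(10_000_000 * 1000, Decimal("0.2"), Decimal("0.5"))
```

Because every term scales with the invocation count, an attacker who multiplies the traffic multiplies the bill by the same factor. That is why invocation limits and billing alarms matter even when no code can be injected.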
How about machine learning applications and systems?
So let's say you're using a machine learning service,
and the input parameters to your service would include
let's say, the model artifacts, some custom code, Docker container
images, a bit of configuration and a bit
more. And that machine learning service is supposed to
produce a server, a machine learning inference
endpoint, which is used to perform predictions and inference.
So if you have a machine learning system like that,
you should be aware that in addition to the infrastructure attacks,
there are also attacks possible when it comes to data
privacy and model privacy. So again,
even if the infrastructure has been secured properly, it does
not mean that you're already protected. A well crafted
input can easily take advantage
of the weaknesses in your implementation.
And again, this would affect data privacy and model privacy.
So these attacks would include membership inference attack,
model inversion attack, attribute inference attack, and more.
So they're kind of scary, right? So if this is the first
time you're hearing about these types of attacks, then you would be
surprised that attackers actually know more, and they have actually
specialized in this type of knowledge.
Now let's focus on the third part,
the exciting part, security automation strategies.
Now that we have a good idea of what we need to do
whenever we're dealing with security requirements, let's now focus
on the different tips and techniques when dealing with security
automation requirements. So the first question we need to ask ourselves is,
do we need to automate everything? The answer to
this one is whenever it makes sense,
whenever we're doing something over and over again, and we're spending
a lot of time doing something repetitive, when in fact
we could have been doing something more important,
then yes, it may be a good idea to automate
certain parts of your processes. However,
it does not mean that we have to automate everything.
Okay, so let's say that you have a team of five
engineers, and they would be automating,
let's say, a certain
part of the process, and they would be spending, let's say, one month
to automate that one. You would have to compute
the cost associated with automating that
part of the process. So of course you would need to take into account the
salaries of the team members performing
that automation work. So again, that includes
a bit of financial knowledge, because in addition to the salaries,
you would have to think about the other expenses
of the company. So again, there's a formula there.
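The talk does not spell the formula out, so the following is only one plausible version of it: a simple break-even estimate. All figures and function names are hypothetical, and the monthly cost per engineer is assumed to already include salary plus the other company expenses mentioned above.

```python
from typing import Optional


def automation_cost(engineers: int, months: float, monthly_cost_per_engineer: float) -> float:
    """Total cost of having the team build the automation."""
    return engineers * months * monthly_cost_per_engineer


def months_to_break_even(
    build_cost: float, hours_saved_per_month: float, hourly_cost: float
) -> Optional[float]:
    """Months until the time saved pays back the build cost (None if it never does)."""
    monthly_saving = hours_saved_per_month * hourly_cost
    if monthly_saving <= 0:
        return None
    return build_cost / monthly_saving


# Hypothetical numbers: five engineers for one month at 4000 per engineer-month,
# saving 100 manual hours per month valued at 25 per hour.
cost = automation_cost(5, 1, 4000)
payback = months_to_break_even(cost, 100, 25)
```

With these made-up numbers the automation pays for itself after eight months. If the manual task were rare enough that the payback period ran to many years, that would be a sign it is not the right thing to automate.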
But as mentioned, make sure that we automate the
right stuff. And speaking
of automation, the next question here is, is it
a good idea to use automated pipelines, especially when
trying to secure the output of the development team?
So in most cases, the answer to this
one would be a yes. Because instead of
your team trying to learn all the types of security attacks,
trying to specialize in security while they're learning how to code properly,
there might be a better way to do that by incorporating
security in the CI/CD pipeline, where after the
code has been pushed, but before the application
gets deployed to production, a lot of security checks would
be running.
So the only risk to this one is that a lot of companies forget
that automated pipelines need to be secured also,
because there is something called poisoned pipeline
execution, where an attacker will take
advantage of the weaknesses of the CI/CD
pipelines. So what happens here is that a
well-crafted attack would
supply commands, and these commands would be
run inside that pipeline. So let's
say that the code is pushed to a repository,
and then maybe the configuration is included there also. What if
the configuration has been poisoned by the attacker?
So, yes, this may involve a sequence
of steps in order to perform that attack. But again,
that's one of the dangers when you're using an automated pipeline.
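One possible guard against a poisoned pipeline configuration is to refuse to run the pipeline automatically when an untrusted change touches the files that define what the pipeline executes. This is a sketch under assumptions: the protected path patterns are examples of common CI configuration locations, and both function names are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List

# Example paths that define what the pipeline itself runs (adjust per project).
PROTECTED_PATTERNS = [
    ".github/workflows/*",
    "buildspec.yml",
    "Jenkinsfile",
    "scripts/ci/*",
]


def pipeline_files_touched(changed_files: Iterable[str]) -> List[str]:
    """Return the changed paths that match a protected pattern."""
    return sorted(
        path
        for path in changed_files
        if any(fnmatchcase(path, pattern) for pattern in PROTECTED_PATTERNS)
    )


def may_run_automatically(changed_files: Iterable[str], author_is_trusted: bool) -> bool:
    """Untrusted changes to pipeline definitions need a human review first."""
    if author_is_trusted:
        return True
    return not pipeline_files_touched(changed_files)
```

A check like this only narrows one entry point. The pipeline's own credentials and permissions still need to be limited, since an attacker may find other ways to get commands executed.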
So make sure that your pipeline has been secured as
well. So how do we do
it? We incorporate the principle of least privilege.
And when you're dealing with cloud resources,
there are generally different types of
IAM entities. So for example,
one IAM entity would correspond or map to a real
human, but there are IAM entities which
would map to cloud resources.
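Since some IAM entities map to cloud resources rather than people, one way to apply least privilege is to scan the policies attached to them for wildcards. The sketch below assumes the standard IAM policy document layout (`Statement`, `Effect`, `Action`, `Resource`). The helper names and the choice of what counts as "too broad" are hypothetical, and the check is intentionally crude.

```python
from typing import Any, Dict, List


def _as_list(value: Any) -> List[Any]:
    """IAM allows a single item or a list in several fields; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def overly_permissive_statements(policy: Dict[str, Any]) -> List[int]:
    """Return indexes of Allow statements using wildcard actions or resources."""
    findings = []
    for index, statement in enumerate(_as_list(policy.get("Statement"))):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action"))
        resources = _as_list(statement.get("Resource"))
        wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
        wildcard_resource = "*" in resources
        if wildcard_action or wildcard_resource:
            findings.append(index)
    return findings


example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "*", "Resource": "*"},
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-bucket/*",
        },
    ],
}

flagged = overly_permissive_statements(example_policy)
```

Here only the first statement is flagged. The second one names a single action on a single bucket, which is closer to granting only what the resource needs.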
So instead of your cloud resources being assigned
IAM roles and permissions, which are overly permissive,
the best move there would be to limit the
privileges and permissions there only to what is needed
by that resource. So let's say that the attack.