Transcript
This transcript was autogenerated.
Hi everyone, thanks for joining me for my talk about
automated serverless security testing. My name is Tal.
Let's start. So why am I giving you this
talk? About five years ago I joined labs,
a startup based in Israel that developed a
serverless runtime protection solution. Before that,
I had never heard about the topic of serverless.
After two years on
the road, we got acquired by Check Point,
and after that I decided to
shift further left and create my own company called CloudEssence,
a cloud native security testing
company that got acquired by Contrast Security last year.
You can find me on social media, typically with my handle,
appsec with a four.
Why is this talk interesting?
Forrester predicts that one out of four
of you by the end of the year will use serverless regularly.
So I think it's an important topic to talk about,
and it's something we should be aware of. The security implications
and challenges in serverless and serverless testing
might be different. Let's see.
Typically, we already know that the cloud transformation
has begun. We know everyone has started talking about cloud, cloud development,
cloud native. Of course, there are companies like iRobot
and Skyscanner which are maybe pioneering this type
of development, and they are ahead of the
curve. But we can see even big organizations coming
along with cloud native development. And even if now they are
in a kind of hybrid solution between monolith and cloud
native applications, they're going to move
more and more, and we're going to see more and more adoption of cloud
development. Okay, but what is serverless
really? Serverless is a lot of things. First of
all, it's a different architecture. No more big monoliths,
one big flow application. Instead, many, many resources.
Independent resources, which are configured
to talk to one another and to other services in order to create
the logic of the application. But each of these resources is
independent, and we have to take care of that.
Cycles are different. DevOps, DevSecOps,
you might have heard of hyper-agile
development cycles. No more waterfalls,
quick iterations and quick
time to value. Processes are different.
Typically in the cloud, you'd better automate everything because
you do not own the infrastructure. And if
you don't really automate things,
automate everything, it's really hard to get visibility and get
information from the right place at the right time.
So mostly we're automating things to get
the information and the data that we need.
The decision making also changes, so it's less top
down: this is the time, this is the place
to do something. Developers get more and more responsibility.
Part of it is also security, and we'll touch on that. That drives
bottom-up decision making, letting the developers take more and more
responsibility.
Serverless architecture. Well, this is a big picture, a small
picture, sorry, of a medium sized, maybe even a
small medium sized application that is built in serverless.
You can see maybe a couple of dozen APIs
and certain functions and some other resources
maybe summing up to, I don't know, 200, 300 resources.
This is a small application. I've seen customers
with millions of functions and resources.
I cannot even start to imagine how it would look. The problem
is that if you don't have this kind of look that we provide or
some kind of visibility into what you have, it's really hard to
understand what is connected to what, who is talking
to whom and what my risks are. Where are my risks here? Because if
I'm a security expert or even a development
manager, it's hard for me to really understand what
is going on in my system. I mean, lambda functions and other
functions pop up on a daily basis, if not
more, even into production. So it's really hard to follow that.
That also means that we have to take care of each of these resources
in a separate way. Sometimes for security reasons.
We'll have to monitor each one, to authenticate each one,
to perform zero trust,
or any type of security that we
want to apply into each of those resources separately.
Well, serverless, just to understand. It's less
about a synchronous flow,
it's more about an event driven architecture. Something happens
inside your cloud. It could be a file
that was uploaded, downloaded, deleted, an API call, which is
kind of common, a table or entry
inside your database that was changed, an IoT rule,
a log, an analytic, everything that you can imagine,
almost anything that you can imagine.
Something happened. It runs your code and your code interacts
with other services and resources in the cloud. The problem is
that where your code is, your mistakes
could also be. And if you're making big mistakes,
that could really end up in a cloud disaster, data breaches
and whatnot. In general, AWS lambdas
work like this. Something happened and triggered the function.
AWS spins up a container for you, and we'll talk about it.
It's not really a full container. It's like a runtime
environment ready for you. It runs your code.
When the code finishes,
the container dies. There is no more container.
What about some security aspects of it? I'll
touch mostly on AWS and Lambda here because it's
the most common one. But you can think about some of the aspects in
other cloud providers as well, like Azure, GCP,
Alibaba and some others. So in
AWS Lambda, the environment is a read-only environment.
If you need to write somewhere, it's going to be /tmp.
It's the only directory with write permission,
and that also has a security aspect. We'll talk
about it in just a few seconds. The environment
is not really wired to the interfaces. I mean you can connect to
services, APIs, whatever you want, you can connect to the outside
world. Getting inside access is
not possible. I mean you cannot SSH to the runtime,
you can just run your code inside, that's it.
The data is temporary, meaning that when the execution
ends and your runtime dies, the data that was
there is deleted. That is
true, but for performance reasons,
the cloud provider, in this case AWS, recycles environments.
So in order to save the time of
spinning up new environments, if a request comes in or an
event happens, it executes your code. Then another one happens.
That could be even dozens or even hundreds at the same
time. Randomly, the cloud provider will take
a runtime environment that already ran, because it's up, and just
give it to the next incoming event.
That means that the data that was there before, if it was not deleted by
you, by the code, is still there.
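To make the reuse point concrete, here is a minimal Python sketch that is not taken from the talk. It assumes a handler that needs scratch space: it keeps its files in a per-invocation directory under the temp directory and removes that directory before returning, so a recycled environment has nothing left over. The handler name, the event shape and the `payload` field are assumptions for illustration.

```python
import os
import shutil
import tempfile


def handler(event, context=None):
    """Process a payload through a scratch file, then remove the scratch data."""
    # Assumption: the default temp directory is the writable one (/tmp on Lambda).
    workdir = tempfile.mkdtemp(prefix="invocation-", dir=tempfile.gettempdir())
    try:
        path = os.path.join(workdir, "payload.txt")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(event.get("payload", ""))
        with open(path, encoding="utf-8") as fh:
            length = len(fh.read())
        return {"length": length, "workdir": workdir}
    finally:
        # Runs on success and on error, so nothing survives into the next
        # invocation that lands on the same recycled environment.
        shutil.rmtree(workdir, ignore_errors=True)
```

Calling `handler({"payload": "secret"})` returns the length of the payload, and the directory named in `workdir` no longer exists afterwards. The cleanup sits in `finally` so that a crash in the middle of the function does not leave data behind either.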
And if you have some security issues
in your code, someone can even access this data and exfiltrate
it. What other interesting security
aspects can we think about? The code itself
or resides in the environment. So in order for your
lambda to run, the container comes up, your code is inside and
it runs your code. So the code is there. If I have access to
the runtime, I have access to your code. The keys
are also inside. That means the keys that are basically
the permission keys for the lambda inside the
cloud. This is what lets your lambda function communicate
with other services and resources.
This is a big challenge in security specifically for
lambda and in the cloud in general.
And they're inside the environment. That means that if I
get access to the environment, I get access to the keys.
I can do some, maybe even bad things inside
your cloud.
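As a small illustration of the keys point, and not something shown in the talk: the Lambda runtime exposes the execution role's temporary credentials to the function as environment variables, so any code that runs inside the function can read them. The Python sketch below only reports which of those variables are present and redacts the values. The function name `exposed_credentials` is made up for this example.

```python
import os

# Environment variable names the Lambda runtime uses for the execution
# role's temporary credentials.
CREDENTIAL_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN")


def exposed_credentials(env=None):
    """Return the credential variable names present in env, with values redacted."""
    env = os.environ if env is None else env
    return {name: "<redacted>" for name in CREDENTIAL_VARS if env.get(name)}
```

Passing a plain dictionary instead of `os.environ` makes it easy to try locally. Inside a real function, injected code would see the actual values, which is why the permissions attached to those keys matter so much.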
Is serverless security testing a thing? Well, this is
pretty old, but you can see a continuous rise in the serverless computing
trend. This is a Google Trends graph.
It goes and it continues to go up.
Maybe during the COVID
period there was less,
but it's picking up and growing
with time. But you can see that for three, four years
the serverless security search on Google was
between zero and one, or two at the top
one time, and I'm pretty sure
that from here, around here, all the
times that you see one is me. So no
one really talks about it. And I think we should have more awareness about serverless
security, because even though it still contains
some of the previous security aspects, application security,
there are some challenges that we need to discuss.
Okay, so let's say it's a thing. Can we apply the
traditional application security to serverless? If so,
we don't really need this talk because we can do whatever we've
been doing before. Right? Let's see. Let's inspect
that. So what are the biggest challenges? The first
one is the
permission or the policy given to the function,
which is a huge thing, a big challenge. If
we're talking about one function, it's relatively easy because you can look
inside the code, do some code reviews, see the API
calls, right, put item, and translate it into PutItem.
Actually this is the good
one. The PutItem here, the X is because the slide was animated
before it was converted into PDF.
So before the PutItem here, there was a wildcard,
the star, the asterisk, which means the function can
do any action inside your,
inside the cloud, which means that if someone has access to
this function or the code or the runtime,
they can do whatever they want inside your databases,
even if it's a database that is not even related
to your application. Because here too there was
a wildcard, a star, meaning access to any
table. Whereas least privilege, as we
call it, which is the right permission set, is to set PutItem
as the specific action. That means that
even if the function is vulnerable and someone can run your code
or execute your code, or maybe give arbitrary
code for the function to run, it will be blocked
by the cloud if it's not the specific action
that was allowed. So if I'm going to change that into delete or
scan, even in the code, even if
I change the code, if the policy states just PutItem,
that means I will be blocked. And you should do it also on the resource
level. So instead of putting a wildcard here,
you should specify the exact table,
in this case taken from the environment variable. So we
should take the environment variable value and put it inside
the policy resource. Well, when you do
it for one function, it's really easy, but when you have to do it at
scale, it becomes a problem. What happens,
if you don't trust me, is this. The developer
will go to Stack Overflow or any other website
and look for "my
lambda is unauthorized to perform dynamodb:Scan". Okay,
I'll put this error in some forums and
I'll get: hey, I worked with an Amazon engineer and
it turns out the problem was the policy configuration. It should be dynamodb:*.
No, it should not be dynamodb:*.
Of course this gives the function too many permissions.
It just needs to do a scan. Let's see another example.
This was taken from Stack Overflow, right?
I don't even remember the question, but someone said: I
solved this by adding the AWSLambdaFullAccess permission
to the lambda. Just go to the lambda IAM role,
blah blah, specify everything and add the permission.
That should do it. No, that should not do it. You know why?
Because the AWSLambdaFullAccess policy looks
like this, and this is a
truncated version of it. There are more permissions, I just put the
important ones or the risky ones: cloudwatch:*, dynamodb:*,
events:*, lambda:*, so you can
execute any code, logs:*, s3:*,
on resource *. Lucky for us,
AWS deprecated this. So no,
AWSLambdaFullAccess should not be the solution for
your lambda code.
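The difference between the wildcard policy and the least-privilege one can be sketched in a few lines of Python. This is an illustration under assumptions, not the tooling from the talk: the table ARN is a made-up example, and the check simply flags any `Allow` statement whose action or resource contains a star.

```python
def least_privilege_statement(table_arn):
    """One statement: only PutItem, only on the given table."""
    return {"Effect": "Allow", "Action": ["dynamodb:PutItem"], "Resource": [table_arn]}


def wildcard_findings(policy):
    """List findings for Allow statements that use a wildcard action or resource."""
    findings = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM also accepts a single statement object
        statements = [statements]
    for index, stmt in enumerate(statements):
        if stmt.get("Effect") != "Allow":
            continue
        for field in ("Action", "Resource"):
            values = stmt.get(field, [])
            if isinstance(values, str):
                values = [values]
            for value in values:
                if "*" in value:
                    findings.append(f"statement {index}: {field} '{value}' uses a wildcard")
    return findings


# Hypothetical policies for illustration.
BROAD = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "dynamodb:*", "Resource": "*"}],
}
TIGHT = {
    "Version": "2012-10-17",
    "Statement": [
        least_privilege_statement("arn:aws:dynamodb:us-east-1:123456789012:table/orders")
    ],
}
```

`wildcard_findings(BROAD)` reports both the action and the resource, while `wildcard_findings(TIGHT)` returns an empty list. A real check would need more nuance, because some wildcards inside a resource ARN are legitimate.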
Okay, we talked about the policy,
let's talk about other security challenges. In
a monolith application, usually we have a synchronous
API request coming in through
the load balancer, through the API gateway, whatever that is.
We can put all our security
tools and security capabilities at this point.
So whatever comes in goes through input validation,
through output filtering, through DLP,
through IPS, through firewalls.
So you're almost always protected, right? At least
90%. When you talk about serverless,
you lose the perimeter. That means that the attacker
can really get into your code from different
places that you haven't thought about before. It can be through an API?
Yeah, of course. But it can also be from someone uploading
a file, someone performing analytics,
code commits, log processing, database changes,
and it just executes your code. And there is nothing in the middle between
the database and your code. So if someone
changed the database, you cannot say, hey, before you run my code,
transfer this data to me. No, this is not controlled by
you, so you have to put the security inside the
code. You remember this? Now
protect each and every one of these. Well,
if it's not automated, it's not going to happen.
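Putting the security inside the code can look like the following Python sketch for a function triggered by a file upload. It is an assumption-laden example rather than anything from the talk: the `uploads/` prefix and the allowed characters are invented rules, and the event layout follows the S3 notification format (`Records`, then `s3`, `bucket`, `object`), where object keys arrive URL-encoded.

```python
import re
from urllib.parse import unquote_plus

# Assumption: uploads are expected under uploads/ with a simple file name.
ALLOWED_KEY = re.compile(r"uploads/[A-Za-z0-9_.-]{1,128}")


def validated_objects(event):
    """Return (bucket, key) pairs from an S3 event, rejecting unexpected keys."""
    objects = []
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        raw_key = s3.get("object", {}).get("key")
        if not isinstance(bucket, str) or not isinstance(raw_key, str):
            raise ValueError("malformed S3 record")
        key = unquote_plus(raw_key)  # keys in S3 notifications are URL-encoded
        if not ALLOWED_KEY.fullmatch(key):
            raise ValueError(f"unexpected object key: {key!r}")
        objects.append((bucket, key))
    return objects
```

The function would call this before doing anything with the object name, for example before building a shell command or a file path from it. The same idea applies to every other event source, which is why it has to be repeated, and ideally automated, per function.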
There are several other serverless security risks.
We're not going to talk about all of them. I'll refer you to information
about them, I'll just mention some of them. So event injection is
basically someone attacking your function with arbitrary code.
Broken authentication is functions that are not performing
any type of authentication, just relying on the incoming data.
Sensitive data exposure: of course, Lambda contains sensitive
data like keys and code and
some secrets in your environment variables. So if
you're not testing yourself, you might be at risk.
Over-privileged functions, we talked about it. Vulnerable dependencies,
well, that's not new, just now it's in a
lambda function. Insufficient logging and monitoring: well,
AWS logs and monitors pretty much everything. You just need to connect
to the right location and collect the right data and metrics.
Open resources are lambdas and other services, like S3
buckets and APIs, that are unprotected, unconfigured,
misconfigured, allowing anyone to access them.
Denial of service and denial of wallet
are the ability for someone to either block
your lambda functions because of the limitations they have,
or make you pay for any lambda executions.
Insecure shared space: we discussed this earlier, about the
data in /tmp that is shared between random
executions. And of course insecure secret management, because your
lambda can have keys and secrets inside the code or
the configuration which are not protected. Well,
can security scale on serverless?
Well, it can, but there are some challenges.
There are a lot of services, lambda is just one of them, but it connects
to many others. There is frequent development, it's not a monolith
application with downtime. Serverless functions go
to production on a daily basis. What is
connected to what? We discussed this, it's hard to know, if
you're not the developer that wrote the specific function,
what it is connected to. And in this case,
of course, even before, there were many developers and fewer AppSec
and security teams. So it's hard to follow. What's
important? Well, my lambda could have permission to
do something meaningful in my cloud, but it might not
be connected to anything that allows an attacker to access it,
or vice versa. I have a lambda which has a code injection.
Okay, very risky,
but if the lambda permissions allow it just to write into the
logs, that means that even
if someone accesses the code and runs arbitrary code,
the function can only write logs. Not that it's not important,
but it's less risky than someone reading
data or modifying my files
on an S3 bucket. So it's hard to
know what's important. Is the security the same?
Well, we talked about some aspects, it's not exactly the same.
We'll see some other aspects in a few minutes.
And there is another question. Who takes care of
the infrastructure? Lambdas are in the
cloud, permissions are configuration. Could be
an AppSec team, a security team, could be the developer,
could be the DevOps team, could be cloud engineering.
I've seen basically everything, so it's
just hard to understand who
takes responsibility in these cases.
All right, so we talked about the security aspects
of serverless,
but how do you test for security?
We want to shift left, right? We don't want to just say, hey, we have a
tool in production, runtime protection, we're good. No, we want
to know that we're shipping secure code. So how
do we test security in a modern CI/CD
pipeline? Well, let's take the traditional one.
I want a SAST here that runs on
every commit. I want an IAST, maybe something
more accurate, that runs on integration
tests. So some security tests in the integration, in the E2E tests.
And I want a test like a DAST test.
When the product is ready, it's shipped, I have a
website, staging, whatever it is, and I want to test
it. Well, these are the traditional tools and I'd
say those are not working well for serverless
for several reasons. The normal ones,
that happened even before,
are that SAST, or static analysis,
gives a lot of false positives because it doesn't have context.
It just sees text.
So it's hard to understand what's important and what's not. That also
means that the developer needs to work a lot to filter them and
let the security team configure what is important and what
is not. Because if I'm going to test for all the
vulnerabilities or the security policies, I'm going to get
thousands of results. Meaningless, really. All right,
so let's do an IAST, interactive application security
testing. Well, that is good. The coverage is
a problem because you have to write tests in order to get
coverage. And the security teams need a lot of work to instrument
your code in order for it to work. But then
something doesn't work and you don't know if it's the IAST
plugin, the instrumentation, your code, or latency.
So it's not really working so well,
especially when we talk about cloud native environments.
DAST. Okay, those are good.
They're not really CI/CD tools.
It's really hard to operate them inside the pipeline.
It requires a lot of work from both the engineering and the security teams.
Usually they don't find anything meaningful. Sometimes they
do, but their coverage is relatively low. There is
a lot of communication work
that needs to be done between the development, the engineering
and the security, because they need to know when they
can test, what they can test, on what environments.
Basically you need to keep the environment alive with new data
all the time, then freeze it, let the security team test
and give you the results. The developer will go over the results,
try to fix or understand what's going on, continuing to
talk with the security team. The developer fixes,
goes back to the security team and says, hey, I fixed this,
can you retest? Yes, next week we have another cycle,
let's retest everything together. And then
time has passed and the testing was not
done. And you need to ship your lambda today.
So it doesn't really work. What we need is something else.
The problem is that if we want to use those security testing tools
in a cloud native environment, we're going to get more problems.
First of all, it is not just code.
All the tools pretty much are ignorant of the environment and
the context. What is the environment and the context?
Lambda is not just code. It is connected to an infrastructure,
to the cloud. And the cloud means a lot of things that we need to
know, like configuration and resources and services.
And you need to understand that a lambda is
not an app. It starts somewhere,
runs the code, it finishes. Then there is another
service that picks up. Maybe the lambda writes into the database.
But then when it writes into the database, you have a
configuration that runs another function, that pulls the data from the database,
performs some action, and submits a report.
So it's not something that you can really test
like this. Also, tools are completely blind
to non-HTTP event sources. All the security tools are built
to support synchronous
applications with some kind of an entry point, HTTP or some kind
of traffic coming in. So if you want to test
or fuzz your code, let's say take a DAST, right, a dynamic
tester, you need to give it an endpoint to start working.
Some lambdas don't have entry
points. Not entry points, sorry. Some lambdas don't have endpoints, they don't
have URLs, they don't have APIs.
I've seen a system with, let's say something small,
maybe 200 functions. Yeah, ten of them had APIs.
Most of them, I'd say 80, 90% of
them, don't have APIs. So you cannot test them in a traditional way.
All of the issues that
we talked about really block the development and disrupt the
CI/CD. That means it's very hard to scale at the pace
of cloud native development, and they're not
good enough. And when we're in the cloud, we should get better.
So how should we do security testing for serverless?
Let's take an example. This is a
tiny application, really just three lambda functions
taken from Amazon.com. It's based on the iRobot
Roomba, right? It's just the registration service.
You bought an iRobot, you open it for
the first time and it sends one request,
register your robot, and there is a process.
There is a lambda that processes something with IoT, writes
to the logs, puts data into the queue,
and then another function picks up from this queue,
continues to run, sends it to another lambda, and this
lambda communicates with other services, IoT services.
Really very simple. Let's see how we
should test this. Easy,
right? We can scan the image, right?
If you do have an image and
you want to just run an SCA on it, right,
it's going to give you 10% coverage,
50% coverage, maybe even less, I don't know.
And you just find potential problems
that you imported because it doesn't really mean you're vulnerable.
You just imported some issues. I'm not saying it's not important, it is
important, but it doesn't give you any coverage
for your code, your configuration, your cloud,
zero. Usually those things are
even provided out of the box by the cloud provider so
you can use them. And I think you should, but it's not enough.
So what should we do? I know,
infrastructure as code. We all use infrastructure as code now,
right? Terraform, Pulumi, Serverless Framework, whatever that
is. That's great. Shift left
as far as you can go. But again,
you get limited visibility, right?
Because you just see configurations, zero code
coverage. No one will tell you that you have a problem with your code.
It will just tell you this line is vulnerable because you did not add
encryption, which is good, I'm very happy for
it, but it's not enough. You get no logic, no prioritization,
and it's IaC dependent. So you really need a
solution that is built for your infrastructure as code. And again you
get zero code coverage. That's not enough,
right? To get code coverage, let's start using
IAST, a modern AppSec tool, maybe the most
accurate and reliable one, really enables DevSecOps,
but there are no servers to instrument, right? We're running on
a lambda function. Trying to run an IAST on a lambda function is really
an overkill and hasn't worked before. So let's
try another solution. Let's run a SAST,
static application security testing. Well,
looking into this, really the
SAST will see three different apps,
just three, because there are three lambdas, because it cannot have the
full flow, because the pieces of code are not connected to each other.
This code does not continue here
inside the code. It needs to understand that there is
a configuration that says to write to this queue and
this function reads from this queue, and then connect them together.
But it's not possible because it's not in the code, it's in the
configuration. So it doesn't see a source
or a sink, doesn't understand where the databases
or the queues or anything are. So really SAST will give
you bad results, false positives and
really false negatives, because it doesn't see things.
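To show why the flow only becomes visible once the configuration is taken into account, here is a small Python sketch. Everything in it is hypothetical: the function and queue names are invented, and the two dictionaries stand in for what a deployment template would declare. It follows the function-to-queue-to-function links that a code-only scanner never sees.

```python
# Hypothetical deployment configuration: which queue each function writes to,
# and which function each queue triggers. None of this is visible in the code.
WRITES_TO = {"register": "registration-queue", "process": "iot-queue"}
TRIGGERS = {"registration-queue": "process", "iot-queue": "notify"}


def flow_from(function, writes_to, triggers):
    """Follow function -> queue -> function links until the chain ends."""
    chain = [function]
    seen = {function}
    while True:
        queue = writes_to.get(chain[-1])
        nxt = triggers.get(queue)
        if nxt is None or nxt in seen:  # end of the chain, or a cycle
            return chain
        chain.append(nxt)
        seen.add(nxt)
```

With the sample configuration, `flow_from("register", WRITES_TO, TRIGGERS)` links the three functions into one flow, which is the view needed to reason about sources and sinks across the whole application.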
So what should we do? Run a DAST,
dynamic application security testing? Well,
you could, and you'll be able to test this API
specifically because that's the only one with a URL
or an API.
So this function maybe, but I'm not sure what
you will actually get from it because, if I
understand this correctly, it's not a synchronous flow,
right? So the Roomba will send an API request which
will return 200 OK,
403 unauthorized, 404, whatever that is.
But the rest of the application and the process hasn't happened yet.
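The asynchronous gap can be reproduced locally with a toy Python model. It is only a sketch under assumptions: a `deque` stands in for the managed queue, and the two functions are invented stand-ins for the API-facing function and the one that runs later.

```python
from collections import deque

pending = deque()  # stands in for a managed queue service


def api_handler(event):
    """Front function: enqueue the body and answer immediately."""
    pending.append(event.get("body", ""))
    return {"statusCode": 200, "body": "accepted"}


def worker(message):
    """Downstream function: the part of the flow a URL-based scanner never observes."""
    if "../" in message:
        raise ValueError("path traversal attempt reached the worker")
    return f"processed {message}"
```

Sending `{"body": "../../etc/passwd"}` to `api_handler` still returns 200, and the problem only shows up when `worker` later takes the message off the queue. A scanner that judges the request by the HTTP response alone has no way to notice that.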
So the API fuzzer
will always get either OK or unauthorized.
You can test some things, I'm not saying you cannot,
but most of the flows and the coverage will not be able to
run. But there is a solution.
Because we're in the cloud, we should do things differently
and better.
For example, what we do is we build something that
connects into the cloud. So once you connect into the cloud,
you can get all the information from the cloud, you don't need to
do anything. So three clicks: you get your
template, connect your CloudFormation template, or whatever infrastructure
as code you're using, you run it, you connect
to the cloud, you get the right permission to do it. You can run
discovery, get all the information, all the resources, all the relationships,
all the interfaces, the policies of services in the environment, and connect
them together into the graph that I showed you before. Then you
can start analyzing your weaknesses,
your risky points, your code, your attack surfaces, and try to
understand where there might be problems.
And because the cloud is built in a way
that every service or everything
works with something that is pre built by the cloud,
you can also simulate those things. So what we do is we automate
security simulations on lambda functions in this case.
So let's say your lambda gets an API call
and writes into the database. This is what we're going to tell the function
to do. Take this input, it's an API call.
It's not, but we can simulate that and try to
write into the database. And let's see what happens. Maybe I can write a file,
maybe I can access different tables, maybe I can delete
data and then I can also check if I actually did
it because I'm inside the cloud, right? So let's say I'm
trying to upload a file, I can
check if the file was uploaded because I'm there.
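The simulate-then-verify idea can be sketched in Python as follows. This is not the speaker's tool: `FakeTable`, `register_handler` and the event shape are all invented for the example, and an in-memory table replaces the real database so the check can run anywhere.

```python
class FakeTable:
    """In-memory stand-in for a database table (assumption, not a real SDK class)."""

    def __init__(self):
        self.items = {}

    def put_item(self, item):
        self.items[item["id"]] = item


def register_handler(event, table):
    """Hypothetical function under test: stores the robot id from an API-style event."""
    table.put_item({"id": event["body"]["robot_id"]})
    return {"statusCode": 200}


def simulate(handler, payloads):
    """Invoke the handler with crafted events and report what was actually written."""
    report = []
    for payload in payloads:
        table = FakeTable()
        event = {"httpMethod": "POST", "body": {"robot_id": payload}}
        try:
            handler(event, table)
            outcome = "ok"
        except Exception as exc:  # a crash is also a finding worth reporting
            outcome = f"error: {type(exc).__name__}"
        report.append({"payload": payload, "outcome": outcome, "written": sorted(table.items)})
    return report
```

Running `simulate(register_handler, ["robot-1", "../admin"])` shows that the unvalidated value `../admin` was written as-is. The point is that the report is based on the observed side effect in the data store and not only on the response.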
If I have the right permission, then of course, if something happened,
we can report. And the best thing
here, other than the three clicks instrumentation,
let's call it, is that you can do it continuously.
You can continue to monitor the environment. You don't need to run a synchronous
or point-in-time scan. We continuously monitor
the cloud. If now you're going to push
or deploy new code or a new configuration, we'll pick
it up and we'll test it automatically so you don't have
to do anything else.
Everything happens autonomously in the background.
New code, new scan. New configuration, new scan.
You fixed your vulnerabilities, we'll retest it
automatically, and if you fixed it, we'll just eliminate the issue
so you don't have to even interact with the security team on that.
This is an illustration of what we're going to do. So the developer
deploys new code, a new API with a new function.
We're going to test this specific flow and identify
potential vulnerabilities. And once we do, we'll know
also what else in your cloud is
at risk if someone managed to do that.
We'll know down to the specific table,
the specific action inside the table. There is
also a nice example here, which I showed
at Black Hat two years ago,
where I hacked a lambda function with my voice,
talking to an Alexa device. Well, this is
something that developers did not expect, right? Because it's
not something that you were used to before, but you should take
that into consideration now. All right,
we're getting close. So our tool can automatically
give you vulnerabilities and, maybe
even better, policies out of the box. Copy, paste
into your environment, get a least-privilege permission for each of your
functions without doing anything. We're scanning the code,
we're emulating the code, we're looking into the policy, we're seeing
what the function actually needs, and that's what we give to you.
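A very rough idea of how needed permissions could be derived from code is sketched below in Python. It is not how the speaker's product works, which the talk does not detail. It only walks the syntax tree of a function's source and maps a few SDK-style method names to IAM actions. The mapping is hand-written and deliberately tiny, and matching by method name alone is an assumption that would produce mistakes on real code.

```python
import ast

# Small hand-written mapping from SDK-style method names to IAM actions.
# Assumption: a real tool would need a far more complete mapping.
METHOD_TO_ACTION = {
    "put_item": "dynamodb:PutItem",
    "get_item": "dynamodb:GetItem",
    "scan": "dynamodb:Scan",
    "delete_item": "dynamodb:DeleteItem",
}


def required_actions(source):
    """Return the sorted IAM actions implied by method calls found in source."""
    actions = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            action = METHOD_TO_ACTION.get(node.func.attr)
            if action:
                actions.add(action)
    return sorted(actions)


# Hypothetical function source to analyze.
SAMPLE = '''
def handler(event, context):
    table.put_item(Item={"id": event["id"]})
    return table.get_item(Key={"id": event["id"]})
'''
```

`required_actions(SAMPLE)` yields the two actions the sample code uses, which could then replace a wildcard in the function's policy. Combining this with what the function is observed to do at run time, as described above, is what would make the result trustworthy.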
Okay. Of course, I cannot cover everything. So what can you
do to learn more? First, there is the OWASP Serverless Top 10
project. OWASP, if you're unfamiliar
with it, is an open organization, the most famous one for
application security. And there is a project I lead
together with some colleagues around the serverless
world, or industry, trying to
identify the top risks
for serverless. Right now there is an open call, so if
your organization works with serverless and you have someone with
some insight into security issues, security risks,
please click this or go to this address and fill in
the form. We'll take it into consideration. All the data that
is sent is anonymous, of course. We'll collect it and it's public, so
we want to get the best results from the industry.
Lastly, there is another open source project which
you can deploy on your cloud with just
three clicks. It's DVSA, the Damn Vulnerable Serverless Application
that I created, completely serverless, and
you can install it with just three clicks. Really,
you just need an AWS account and the right permissions to install.
Just make sure you do not install it into production or any account with
sensitive information, because it's a vulnerable application and
it's potentially going to give someone access to your data.
Go here. Learn more. There are videos, tutorials and you
can learn how to secure and attack your serverless applications.
That's it. Thank you very much for participating in this
talk, and you're welcome to shoot me an email anytime.
Thanks.