Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hi, everyone. It's great to be back at
Conf 42. My name is Josh Stella.
I am the co-founder and CTO of Fugue.
We are a cloud security software company.
I have spent at this point, pretty much the last decade
of my life, both at AWS and at Fugue,
focused on cloud security from
a practical perspective, and what I mean by practical
is, what are hackers actually doing and how does
it actually hurt you? That's all that really matters in cloud security.
So today's focus is going to be minimizing the blast
radius of a cloud breach.
So you may or may not be familiar with this term.
I don't know where I picked it up. I've been using it for many years.
Blast radius refers to how much damage
has been done by the breach. So you can focus
on the penetration aspect of the breach, and we often do that. We want
to prevent attackers from gaining access
to our resources.
But more importantly, how much damage do
they do? And damage can be expressed
in lots of different ways.
There's the amount of data stolen, sensitivity of
the data stolen, what can be done with the stolen data.
Those all assume a data exfiltration or
data theft kind of breach. There are other kinds
of blast radius that I would argue are typically much
worse. So, for example, ransomware has
one of the ugliest blast radius profiles
of any category of breach,
depending on how resilient you are to it.
So today we're going to examine,
we've only got about a half hour, so we're going to examine
one real-world breach
that I would argue had a very large blast radius.
The details about exactly what was accessed
are still unclear, but we have some clues that we'll go through.
The reason we're going to do that, and by the way, I am going to
name names because real world breaches are the only way
to actually understand cloud security topics.
Generalization is
basically useless. You really have to look at what hackers
are actually doing to have a practical, useful approach
to cloud security. So we'll be looking at
the Twitch breach that happened, I believe,
last month at this point. And in that Twitch
breach, there's quite a large quantity
of data and quite a few types of data that
were stolen and then posted on
4chan, so we'll get into that in some detail.
The reason we're going to do that is to talk about
how to make sure that doesn't happen to you.
Okay, so three slides today.
This is number one, and let's
go to slide number two. If I have my...
There we go. And the last slide just has my email address on it.
We're not going to be doing slideware today. We are going to be spending
most of our time at a whiteboard
and reviewing some content online
around that particular breach. The reason we're going to
be spending so much time at the whiteboard today and not at a terminal
or something similar, is because
minimizing the blast radius is not a
function of one trick, it is
a function of the design of the system. It is a function
of how you are organizing your
computing resources and what API keys
have access to what kinds of data or other
resources. So it's very architectural. All right,
but before we really drill into that, I have
to watch time. That's why I'm looking over here. Before we really drill
into that, I want to show you some data from our
latest state of cloud security survey.
We do this every year at Fugue. This year we did it with our
good friends and partners over at Sonatype.
And what we do is we go ask 300 organizations
that are operating at scale on the
cloud what they're thinking about, what they're seeing,
their concerns, et cetera. So this particular set of
data are around what our respondents
replied to, and we asked what are the most common cloud misconfiguration
incidents? An incident in this case typically
means not like a hacker breaking in necessarily,
although that would be one. But more typically it
would be when
misconfigurations happen in these services and hopefully
are detected prior to a hacker exploiting them.
So I was really pleased to see IAM top the
list this time. That is
a good thing. This is the first year where it has been
the most popular response, and we track
all of these cloud breaches and how they're done. And I can tell you that
IAM is a major factor in most,
if not all of them. It is
the network in the cloud. You need to think of it
as a network in the cloud. It is how your resources communicate
with each other. And as we're thinking about blast radius,
what we're talking about is limiting the amount of access
from any one point. Right? And IAM is central
to that. Other things on here, like security group and firewall rules,
that's typically more oriented toward
talking about techniques to get penetration for hackers than
blast radius issues, although it can be related.
Encryption is another area on here that's talked about
at rest and in transit in the cloud.
In-transit encryption is actually much less
important within the application than it was
in the data center days. I mean, you should turn it on,
you should use it. But the attackers are not
like reading packets off of your network.
I have yet to see a real world cloud
breach where those kinds of network data center
oriented approaches were being taken. It's really
about the control plane APIs in the cloud and getting
access via API calls to the
data at rest much more than watching
data in transit. And at rest, there are
a lot of mistakes you can make. And actually the cloud providers
make it kind of easy to have a false sense of security about
your at rest encryption. And we have classes on that
you can find on our YouTube channel and so on. So when you're talking about
blast radius, you're also going to be thinking about your encryption techniques,
right, and how you're managing your keys and
how much data any given set of keys can decrypt
that is picked up from an at rest data source, like for example,
a database snapshot or an S3 bucket.
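
To make the key-scope point concrete, here is a minimal sketch, assuming a hypothetical inventory that maps each at-rest data store to the key that encrypts it. The store names and key names are invented for illustration and do not describe any real environment. The idea is simply to ask, for each key, how much an attacker could decrypt if that one key were stolen.

```python
from collections import defaultdict

# Hypothetical inventory: which encryption key protects which at-rest store.
# Every name below is made up for illustration.
STORE_KEYS = {
    "s3://customer-uploads": "key-app",
    "s3://payout-reports": "key-finance",
    "rds-snapshot/users-2021-10": "key-app",
    "rds-snapshot/payments-2021-10": "key-finance",
    "s3://build-artifacts": "key-app",
}


def stores_by_key(store_keys):
    """Group data stores by the key that can decrypt them."""
    grouped = defaultdict(list)
    for store, key in store_keys.items():
        grouped[key].append(store)
    return {key: sorted(stores) for key, stores in grouped.items()}


def key_blast_radius(store_keys, stolen_key):
    """Return every store an attacker holding `stolen_key` could decrypt."""
    return stores_by_key(store_keys).get(stolen_key, [])


if __name__ == "__main__":
    for key, stores in sorted(stores_by_key(STORE_KEYS).items()):
        print(f"{key}: {len(stores)} stores -> {stores}")
```

A key that unlocks many unrelated stores is a sign that the encryption is not doing much to contain damage, even if every store is technically encrypted.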
Okay, let's jump into
the whiteboard for a second here. Let me see if I have my
screens arranged, how I think I do. No, wrong way. Okay,
so we're going to go to the whiteboard, and I want to talk a little
bit about conceptually,
the notion of blast radius containment or damage
containment. And I'm going to change to a
non computing type of engineering,
or in this case architecture, to maybe
give you a mental model that isn't confused by the devils
that live in the details of computing that we all live in,
day in, day out. So the person who designs
a ship is called a naval architect.
And so I'm going to design a ship here. It's going to
be a bad ship because I'm just in Photoshop
and I'm not a naval architect, but let's say I've got a ship
here. I don't know. It's an ugly
ship, but it's a ship. Now,
one of your principal goals in architecting
a vessel is to not let so much water
in that it sinks, right. The ship is
a hull of air with a
typically steel skin that floats in
water. So you don't want water in. That's pretty obvious.
So let's draw our water here.
So how do naval architects address this?
They do it by segmenting the
interior of the vessel. Okay,
so if you imagine that you've got bulkheads
here segmenting the compartments
in the ship such that if,
I don't know, an iceberg gets struck and
it's struck nose-on, this compartment will fill
with water. But the remainder of these will
not. And so you have happy.
Well, maybe not happy, but happier than otherwise living
people on the ship. And it doesn't sink, right? That's the
idea. When you think about the USS Cole,
which was, of course, attacked by suicide bombers
in port, a large
explosive was set off against the hull.
That explosive did breach a significant
section of the hull, but the vessel did not sink,
because the blast radius,
if you will, that is, the damage effect, was
limited. And that's what we're trying to do when we
are talking about building computer systems
that limit blast radius, okay? We're trying to limit it
to a controllable part of
our computing environments. All right,
let's take a look real quick at this
twitch breach. This is a really interesting breach because
of the multiplicity of data. You know
what, that text is small. I'm going to go to screen read mode.
All right, so this is from Mitnick Security,
their analysis of it. You can see here that on
October 6, Twitch announced that
they were, in fact, breached. As is often the case with
cloud breaches, I suspect Twitch
found out when the world found out, when hackers
proudly posted all the data that they took on
4chan, and that is not atypical when you're
talking about cloud breaches, to not have an
understanding of what happened until the
hackers who pulled off the hack actually
do something with it. We see this a lot. Not in our customers.
None of our customers actually have been hacked since using Fugue,
but we see it a lot in the industry where
folks are just unaware that they've been breached,
even major breaches like this until the
hackers brag about it. So in this case, more than one
hundred gigabytes of leaked data was publicly
posted online to 4chan on Wednesday. All right,
let's look at what kinds of data.
So, so far, it's 100 gigs of data. That might not be
a big deal, right? We're not going to measure blast radius just
by size of bytes. If I manage to steal
all product images
from Amazon.com, who cares?
That's going to be massive quantities of data.
But it doesn't matter because it's not sensitive data.
It's intended to be public facing, et cetera.
On the other hand, I personally was affected by
this when the Chinese broke
into the database in which all of my personal details
were kept for my security clearances that I've had in the
past, and all that stuff was stolen. That's small data,
but they know everything about me and just about everyone
else who has carried a clearance over probably
a decade or two. So small data, huge blast
radius. An interesting story on that one.
I was at a conference speaking with a very
senior NIST person.
I guess he's a scientist, maybe an engineer, but he's one of
the people who developed a lot of
the NIST controls for security. Okay. He is central
to that. He might be retired now, I don't know.
And he quipped that, and I don't know if this is because he had
specific factual information to this end or it was a speculation
on his part, but he quipped that the reason he thought
the Chinese had stolen
that data and then a few weeks later, someone broke into
a website called Ashley Madison where people
were cheating on their partners was
to tie those data together. Because the way you get spies,
the way you flip people often, is by having incriminating
information about them. So when you want to talk about blast radius,
if his hypothesis is correct,
we probably won't know for a decade or more if a
number of Americans with access to sensitive information are
being bribed into sharing it. Its size doesn't
matter. It's the content. And what can
be done with the content that matters when you're talking about blast radius.
So here, it's, I think it was 128 gigs.
It says over 100 here, but let's look at what it was.
So amongst the posted data included three years of
payment information showing how much Twitch compensated
its elite gamers, which caused quite a stir online
over the high earnings of a few select top streamers.
If you didn't know that streamers with millions of followers make
a lot of money, you haven't been paying attention. I would
argue that. Who cares? This is meaningless data.
Tiny blast radius folks
making a million dollars or $5 million a year. Now more people know the
exact figure. We kind of all knew. Who cares?
Yeah. The leak revealed that Twitch paid more than 108,000
annually to 13. Okay, well, I won't read all that.
I think, as a security practitioner, I don't really
care about this. I would not lose sleep
if a client were to call me and say, oh, my God,
our payment information to top streamers was leaked.
What should we do? Probably nothing, right? The world will
continue, but it gets much more interesting.
All right,
let's see if I can find.
Oh, this isn't the article. Let's go to this other article I found
that goes into more of what was actually stolen.
Where are we?
Yeah, this is interesting. So the Twitch leak,
which apparently motivated the disclosure and
allegedly contains the source code from almost 6000
internal git repositories, including the
entirety of Twitch.tv, various Twitch clients.
We know that these are mobile clients,
console clients, desktop operating system clients,
proprietary SDKs,
internal red teaming tools, and then
creator payout reports dating back to 2019 and more.
Okay, this is much more interesting
and really ugly.
Let's go back to our whiteboard and let's erase this guy.
So what we've got here, we don't have exactly
all the details on this breach. I got to watch time because
I tend to enjoy talking about this stuff and
I will go over. All right, we're still good. So the
breach here included payment information,
right, for their streamers.
Apparently most or all of this
data, we don't know, was stored in
git repositories. We know there were, according to
that article anyway, approximately 6000 git
repos.
Okay, so you had payment info. We had proprietary
SDKs, source code.
So we know from other reporting,
or it has been reported, that these included
AWS service
internal APIs and SDKs for AWS
services. So not just Twitch here. And by the way, Twitch is owned,
of course, by Amazon. So if anyone
should be really good at security on the cloud,
Twitch should be right up there. And I don't doubt that they are
in terms of a lot of the things most people
think about. They clearly were not thinking though about blast
radius as it related to this vulnerability that was exploited.
So you've got SDK code and source
code, including AWS services in
the piece that I just pulled up, it didn't have this in there.
But in other reporting on this,
it has been claimed that there
was a leak of Amazon source code for
a competitor for Steam called Vapor.
Okay, so here we've got business intelligence,
here we've got highly sensitive proprietary business information,
all in the same leak.
And we know from the reporting on
this and what Twitch said happened is
that they had a server and that server
was misconfigured and a
bad actor, we'll give them some devil horns here,
a bad actor broke into that
server and from there got payment info
back to 2019. Why is that
in a git repo? I don't think that was in a git repo. I'm skeptical
that that was in a git repo. Okay.
Proprietary SDKs, all of Twitch.tv,
all of
the clients, right. You see where
I'm going with this Amazon source code
for still secretive projects that haven't launched
yet. So this is blast radius,
right? This is breadth of access.
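
One way to reason about breadth of access is to treat it as reachability in a graph. The sketch below is an illustration only: the node names and both graphs are hypothetical, and they are not a description of how Twitch's systems were actually wired. An edge from A to B means "whoever controls A can reach B", and the blast radius of a foothold is everything reachable from it.

```python
from collections import deque

# Hypothetical "flat" layout: one set of credentials on a public server
# reaches every kind of asset.
FLAT = {
    "public-server": ["shared-credentials"],
    "shared-credentials": [
        "payout-reports",
        "web-source",
        "client-source",
        "sdk-source",
    ],
}

# Hypothetical segmented layout: the public server only holds credentials
# for the application data it genuinely needs.
SEGMENTED = {
    "public-server": ["web-app-credentials"],
    "web-app-credentials": ["payout-reports"],
    "client-team-credentials": ["client-source"],
    "sdk-team-credentials": ["sdk-source"],
}


def blast_radius(graph, start):
    """Return everything reachable from `start` via breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    seen.discard(start)  # the foothold itself is not counted as damage
    return seen


if __name__ == "__main__":
    print("flat:     ", sorted(blast_radius(FLAT, "public-server")))
    print("segmented:", sorted(blast_radius(SEGMENTED, "public-server")))
```

The same foothold yields very different damage in the two layouts, which is the bulkhead idea from the ship drawing expressed as a graph.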
So a couple questions I would ask
is why in the world does this one server
have access to all of these different kinds
of things, right? Not just different quantities of things, but different
kinds of things. It doesn't make much sense. There's not
much segmentation there. Now there are a couple of ways this
could have played out. Maybe they really
do put all of this in git repos. A lot of this is source code.
Why a server that was Internet facing in any capacity
had any access to source code repos
is another question. It probably, by the way, didn't. I
mentioned earlier that the initial penetration
is less interesting. That's the story of this server.
This is how the hacker penetrated, right? This is the interesting
part of the story. It's all this blast radius stuff.
It's not how they got in. So very often what
we see in these scenarios is
the hacker will get in through a misconfigured server or
they'll find some API keys in an external repo or in
a disk image or something. But then once they're on the server,
they don't really care about that server. Okay, nobody cares
about your servers anymore. They don't mean anything.
Protecting your operating systems and so on. The only reason you're doing that anymore
is to prevent people from perching on those to get to cloud control
plane APIs. In this case certainly
at least the ability to get into 6000
git repos. So how might we segment this?
The first thing is let's pick out the things that are...
It's like when you're a kid and there's five animals and
a car, and which one doesn't belong? Okay, which one doesn't belong?
Definitely this payment info leaps out. Now to me,
this is the one data type here, the one
collection of data that's kind of understandable that
a server that was public facing
would have any access to, right? Because let's say
you're one of those streamers and you want to log into Twitch.tv
and see how much money you
made over the last year because you've got to pay taxes.
That's wrong because you're going to get a tax document. But you take my point.
This is application data, right?
This is understandable to be something breachable
via a public facing IP address. In my opinion,
it's not great, but it's also not the end of the world. And I've mentioned
earlier, I think if this had been the only thing that
was leaked and stolen, I personally
as a practitioner, wouldn't give this much thought. I'd probably look
again at how we're doing encryption,
right? Because, for example,
I'd want to look at data
that were truly sensitive, particularly PII data, things like
that. I mean, the risk in this is more
lawsuit honestly than it is anything practical.
Okay? You could say getting sued is bad, that is true.
But in terms of just the ugliness of the other data
that were stolen here, that's the lowest item on the list.
Okay, source code for Twitch.tv.
Well, let's just start there. We've got source code here
for Twitch.tv and also for clients.
Now I suppose it is possible,
and probably likely in some scenarios, that
certain engineers, certain programmers working
on, say, the Xbox client, would also
need access to the main web host source code.
I don't know what their architecture is,
but I could see it.
It feels a little bit of a stretch to me. I mean, honestly,
if you're doing a modern services architecture,
everything is exposed as APIs. You shouldn't be reading the source,
necessarily needing to read the source code behind
the internal APIs you're using to put your
application together. I would argue even that if you have to,
you've done a really bad job of developing APIs
and documenting them, and that you probably shouldn't do
that if you have lots of teams, they should live and die by
the contract of their APIs, right? And documentation thereof.
Let's assume for a moment that from this server,
our bad guy, our hacker, got access to just
one set of credentials.
Probably, given that this is an Amazon company hosted on AWS,
likely those are IAM credentials that have access to
these repositories or something similar. Let's say they got a
hold of one set that had access to both the clients and
Twitch.tv. Why? You
could have segmentation there. If you're not thinking about segmentation
within your engineering team, what you're doing
is creating a really attractive vector.
And that's another interesting thing about the Twitch breach.
None of this is an attack on any production systems or databases that
we know of. A lot of times when we're
working with folks who are just really getting their
heads around cloud security, they think, well, my production environment
is the one the hackers are going to go for. Very often it's
not. The attackers prefer dev, they prefer
non-prod, because production,
because people think the way I was just describing, well, I have
to protect production more.
Dev has really nasty blast radius effects,
depending on what kind of data you have in these dev environments. And it doesn't
just have to be source code. Very often it's copies of databases.
Okay, but back to our ship that we don't want to have sink.
We should be segmenting these things. If
Team A works
on clients, and Team B, and of course, there's a
bunch of folks here, I'm just drawing one person per team here
for now, Team B is working on
the web application, why are they allowed
to see each other's source code? I would say don't do that.
Segment. And by the way, when you're segmenting,
you're not going to predict everything a hacker might
think to do. So the lesson
from this isn't, oh, man, we should have separate
repos for source code for different parts of the system. There are
companies that have a single git repo, okay, so there are different kinds
of segmentation you can practice, but what
you'll want to be doing is figuring out specifically how to
perform that segmentation. Now, we're running low
on time here. I only have a couple more minutes.
So these are the two
that really are ugly, in my view. And I
only have one color pen here. I'll get
my whiteboard better next time with multiple colors.
But why in the world would
there be source code for AWS services?
Not twitch services, but AWS services?
And by the way, if you're pointing back at the Mitnick article and saying,
well, they didn't say that, I've been researching this a lot, and I've read a
lot of sources, and I can't tell you where every single one is reporting from.
I personally did not download the archive from
4chan. But you can do that. I don't know if it's still on 4chan,
but it's out there on the Internet. You can download it and look at it,
and a lot of people have. So why in the world
would that access to git repos get me
a parent company's highly prized
source code for public facing services that
every hacker in the world wants to understand how they operate
in order to try to breach them. That's madness. That should
not be accessible with this one breached server
and some API keys. Right? So this is now
an iceberg that is sliding along the
side of our ship, right? This is how the Titanic sank: they had a
kind of early form of this highly
segmented hull structure, but because
it dragged along the iceberg, it penetrated enough of
the compartments that it sank the ship. All right, Amazon will be fine.
They're not sinking. Twitch will be fine. But my
point here is the diversity of these data
create. In this case, I would argue, one of the largest potential
blast radii of breaches in recent years.
Because who knows what state
actors are going to do with this source code. Who knows what
sophisticated hacking consortia are going to do with these things?
And what's the recovery,
right? I'm assuming if you're here, you write
software or are involved in the creation of software.
It's not easy, right? It's hard. And so it's
really hard to say, well, we're going to have to throw all that
away. I seriously doubt that's what happened here. So that's what we
kind of have time for today. So the moral of the story,
and I hope I touched on a number of things you should be thinking about.
The first thing is, and let me
just see if I can make some notes here,
I got to get my whiteboard game up on this machine.
All right. The first is understand what
you have that's valuable.
Understand potential
blast radius or potential damage,
and then the things that are high in potential damage.
Segment.
Segment access. Segment
important things.
All right? You can't protect everything perfectly.
If we did, we wouldn't have functional transactional systems.
And so this gets into data classification to
some degree, but it also gets into thinking like a hacker. Like, what's a
hacker really going to do with that data?
I'll go back to, if you think about the
Ashley Madison breach, right? The Ashley Madison people think, well, the data
is of a bunch of people who are
lying to their partners. That's kind of sensitive.
But you combine that with potentially the
security clearance data for folks,
clearance holders, and that becomes a truly
ugly kind of blast radius.
And hackers are clever like that. And of course, by the way, I've never
used Ashley Madison. I've been married for 30 years, so that's
not me. So three,
you need to think about this architecturally,
by which I mean your developers should not
be employing
designs and techniques like, for example, the ability to list S3
buckets in production environments. They shouldn't be doing
that as a design part of the system that breaks segmentation.
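
As a rough illustration of checking for that kind of design choice, here is a small sketch that scans an IAM-style policy document for statements that allow S3 enumeration. The policy documents are invented for the example, and the check is deliberately simplified: it looks only at `Allow` statements and their `Action` patterns, and it ignores `NotAction`, `Deny` statements, conditions, and resource scoping. It also uses Python's `fnmatch` as an approximation of IAM wildcard matching, which is an assumption rather than an exact reimplementation.

```python
from fnmatch import fnmatchcase

# S3 actions that let a caller enumerate buckets or their contents.
ENUMERATION_ACTIONS = ("s3:ListAllMyBuckets", "s3:ListBucket")


def _as_list(value):
    """IAM allows a single item or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def enumeration_grants(policy):
    """Return (pattern, action) pairs where an Allow statement's Action
    pattern matches one of the enumeration actions."""
    findings = []
    for statement in _as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        for pattern in _as_list(statement.get("Action", [])):
            for action in ENUMERATION_ACTIONS:
                # Action names are compared case-insensitively here.
                if fnmatchcase(action.lower(), pattern.lower()):
                    findings.append((pattern, action))
    return findings


# Hypothetical policies for illustration only.
BROAD_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["s3:GetObject", "s3:List*"], "Resource": "*"}
    ],
}
NARROW_POLICY = {
    "Version": "2012-10-17",
    "Statement": {
        "Effect": "Allow",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-app-bucket/*",
    },
}

if __name__ == "__main__":
    print("broad: ", enumeration_grants(BROAD_POLICY))
    print("narrow:", enumeration_grants(NARROW_POLICY))
```

An application that already knows the exact bucket and key it needs has no reason to hold list permissions, so any finding here is worth a design conversation.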
Okay? You want to limit. And so it's
very easy to do intentional design choices
that expand blast radius. Very often you
see this, and this is number four,
worry about Dev.
Not just prod, worry about Dev and stage and test,
because very often this is where we do things like use long
lived, highly privileged
accounts and tokens and so on and keys
for convenience. Well, hackers like getting
into Dev. Okay? So you need to think about segmentation in Dev as well.
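
A simple starting point for the long-lived credential problem is just to find the old ones. The sketch below assumes a hypothetical list of access keys with creation dates and a 90-day maximum age; both the records and the threshold are assumptions for illustration, not a recommendation from the talk. In a real environment the records would come from whatever credential inventory you have.

```python
from datetime import datetime, timedelta, timezone

# Assumed rotation policy for this example.
MAX_AGE = timedelta(days=90)

# Hypothetical key inventory; ids, environments, and dates are invented.
KEYS = [
    {"id": "dev-ci-admin", "env": "dev",
     "created": datetime(2020, 1, 15, tzinfo=timezone.utc)},
    {"id": "dev-alice", "env": "dev",
     "created": datetime(2021, 10, 1, tzinfo=timezone.utc)},
    {"id": "prod-deployer", "env": "prod",
     "created": datetime(2021, 9, 20, tzinfo=timezone.utc)},
]


def long_lived_keys(keys, now, max_age=MAX_AGE):
    """Return the ids of keys older than `max_age`, in input order,
    regardless of which environment they belong to."""
    return [key["id"] for key in keys if now - key["created"] > max_age]


if __name__ == "__main__":
    now = datetime(2021, 11, 1, tzinfo=timezone.utc)
    print(long_lived_keys(KEYS, now))
```

Note that the check treats dev and prod the same on purpose: the point is that an old, powerful key in dev is just as interesting to an attacker.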
All right, I am pretty sure I am all up
on time here, so we're
going to go to the last of these. I am
Josh Stella, josh@fugue.co. I do try hard
to, I think I, like everyone, get spammed a bazillion times a day in email.
But if you want to reach out and talk to me or ask a question
or get a conversation going, feel free. And we
are at www.fugue.co.
Enjoy the rest of your conference and thank you for your time.