Conf42 DevSecOps 2021 - Online

Minimizing the Blast Radius of a Cloud Breach

Abstract

Today’s cloud attacks don’t exploit a single misconfiguration, but rather a series of them. Josh will walk through a process for understanding the blast radius of potential security events in your environment, and steps you can take to prevent minor ones from becoming catastrophic breaches.

The recent Twitch breach may have begun with a lone server misconfiguration, but its blast radius reached everything from sensitive customer data to source code for yet-to-be-released applications. Today’s cloud attacks don’t exploit a single misconfiguration, but rather a series of them.

In this session, Josh Stella will walk through a process for understanding the blast radius of a variety of potential security events in your environment, and steps you can take to prevent minor ones from becoming catastrophic breaches.

You’ll walk away from this session with an understanding of how to:

  • Evaluate your Identity and Access Management (IAM) resources for weaknesses that attackers can exploit
  • Employ penetration testing methodologies to assess the blast radius of public-facing resource misconfigurations
  • Harden your cloud security posture using policy as code to address complex, multi-resource “blast radius” risks

Summary

  • Josh Stella: Today's focus is minimizing the blast radius of a cloud breach. Blast radius refers to how much damage has been done by the breach. Real-world breaches are the only way to actually understand cloud security topics.
  • We're not going to be doing slideware today. We are going to spend most of our time at a whiteboard and reviewing some content online around that particular breach. The reason we're going to be spending so much time at the whiteboard today is that blast radius is not a function of one trick; it is a function of the design of the system.
  • Fugue releases data from its latest State of Cloud Security survey. IAM is a major factor in most, if not all, cloud breaches. It is the network in the cloud. Other things like security group and firewall rules can also be related.
  • Encryption is another area on here that's talked about, at rest and in transit in the cloud. In-transit encryption is actually much less important within the application than it was in the data center days. I have yet to see a real-world cloud breach where those kinds of network, data center oriented approaches were being taken.
  • A naval architect's goal is to not let so much water in that the ship sinks. How do naval architects address this? They do it by segmenting the interior of the vessel. That's what we're trying to do when we are talking about building computer systems that limit blast radius.
  • More than one hundred gigabytes of leaked data was publicly posted online to 4chan on Wednesday. The posted data included three years of payment information showing how much Twitch compensated its elite gamers. We see it a lot in the industry where folks are just unaware that they've been breached until the hackers brag about it.
  • The leak allegedly contains the source code from almost 6000 internal git repositories. It includes the entirety of twitch.tv and various Twitch clients, as well as creator payout reports dating back to 2019 and more. We don't have exactly all the details on this breach.
  • The breach included payment information, right, for their streamers. Apparently most or all of this data (we don't know which) was stored in git repositories. Why a server that was Internet facing in any capacity had access to source code repos is another question.
  • Why in the world would there be source code for AWS services? Not Twitch services, but AWS services. If you're not thinking about segmentation within your engineering team, what you're doing is creating a really attractive vector. None of this is an attack on any production systems or databases that we know of.
  • The first thing is understand what you have that's valuable. Understand potential blast radius or potential damage. Segment: segment access, segment important things. And worry about Dev as well. Hackers like getting into Dev.
  • Josh Stella: If you want to reach out and talk to me or ask a question or get a conversation going, feel free. Enjoy the rest of your conference and thank you for your time.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hi, everyone. It's great to be back at Conf42. My name is Josh Stella. I am the co-founder and CTO of Fugue. We are a cloud security software company. I have spent, at this point, pretty much the last decade of my life, both at AWS and at Fugue, focused on cloud security from a practical perspective, and what I mean by practical is: what are hackers actually doing, and how does it actually hurt you? That's all that really matters in cloud security. So today's focus is going to be minimizing the blast radius of a cloud breach. So you may or may not be familiar with this term. I don't know where I picked it up. I've been using it for many years. Blast radius refers to how much damage has been done by the breach. So you can focus on the penetration aspect of the breach, and we often do that. We want to prevent attackers from gaining access to our resources. But more importantly, how much damage do they do? And damage can be expressed in lots of different ways. There's the amount of data stolen, the sensitivity of the data stolen, what can be done with the stolen data. Those all assume a data exfiltration or data theft kind of breach. There are other kinds of blast radius that I would argue are typically much worse. So, for example, ransomware has one of the ugliest blast radius profiles of any category of breach, depending on how resilient you are to it. So today we're going to examine, we've only got about a half hour, so we're going to examine one real-world breach that I would argue had a very large blast radius. The details about exactly what was accessed are still unclear, but we have some clues that we'll go through. The reason we're going to do that, and by the way, I am going to name names, is that real-world breaches are the only way to actually understand cloud security topics. Generalization is basically useless. 
You really have to look at what hackers are actually doing to have a practical, useful approach to cloud security. So we'll be looking at the Twitch breach that happened, I believe, last month at this point. And in that Twitch breach, there's quite a large quantity of data and quite a few types of data that were stolen and then posted on 4chan, so we'll get into that in some detail. The reason we're going to do that is to talk about how to make sure that doesn't happen to you. Okay, so three slides today. This is number one, and let's go to slide number two. If I have my. There we go. And the last slide just has my email address on it. We're not going to be doing slideware today. We are going to be spending most of our time at a whiteboard and reviewing some content online around that particular breach. The reason we're going to be spending so much time at the whiteboard today, and not at a terminal or something similar, is that blast radius is not a function of one trick; it is a function of the design of the system. It is a function of how you are organizing your computing resources and what API keys have access to what kinds of data or other resources. So it's very architectural. All right, but before we really drill into that (I have to watch time, that's why I'm looking over here), I want to show you some data from our latest State of Cloud Security survey. We do this every year at Fugue. This year we did it with our good friends and partners over at Sonatype. And what we do is we go ask 300 organizations that are operating at scale on the cloud what they're thinking about, what they're seeing, their concerns, et cetera. So this particular set of data is about what our respondents replied when we asked what the most common cloud misconfiguration incidents are. An incident in this case doesn't necessarily mean a hacker breaking in, although that would be one. 
But more typically it would be when misconfigurations happen in these services and hopefully are detected prior to a hacker exploiting them. So I was really pleased to see IAM top the list this time. That is a good thing. This is the first year where it has been the most popular response, and we track all of these cloud breaches and how they're done. And I can tell you that IAM is a major factor in most, if not all, of them. It is the network in the cloud. You need to think of it as a network in the cloud. It is how your resources communicate with each other. And as we're thinking about blast radius, what we're talking about is limiting the amount of access from any one point. Right? And IAM is central to that. Other things on here, like security group and firewall rules, are typically more oriented toward techniques hackers use to get penetration than toward blast radius issues, although they can be related. Encryption is another area on here that's talked about, at rest and in transit in the cloud. In-transit encryption is actually much less important within the application than it was in the data center days. I mean, you should turn it on, you should use it. But the attackers are not, like, reading packets off of your network. I have yet to see a real-world cloud breach where those kinds of network, data center oriented approaches were being taken. It's really about the control plane APIs in the cloud and getting access via API calls to the data at rest, much more than watching data in transit. And at rest, there are a lot of mistakes you can make. And actually the cloud providers make it kind of easy to have a false sense of security about your at-rest encryption. And we have classes on that you can find on our YouTube channel and so on. 
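One way to make the "IAM is the network" idea concrete is to treat access grants as edges in a graph and ask what is reachable from a single compromised point. The sketch below is illustrative only: the resource names and the `GRANTS` mapping are hypothetical, and a real environment would have to derive these edges from its actual IAM policies and role relationships.

```python
from collections import deque

# Hypothetical access graph: each key can reach the listed targets,
# e.g. through an attached role, an assumable role, or a resource policy.
GRANTS = {
    "web-server": ["role/app"],
    "role/app": ["s3://app-assets", "role/ci"],
    "role/ci": ["git/all-repos", "s3://db-snapshots"],
    "batch-worker": ["s3://reports"],
}


def blast_radius(start, grants):
    """Return everything reachable from `start` by following grants (breadth-first)."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for target in grants.get(node, []):
            # Skip nodes already visited, and never count the entry point itself.
            if target not in seen and target != start:
                seen.add(target)
                queue.append(target)
    return seen


if __name__ == "__main__":
    for entry_point in ("web-server", "batch-worker"):
        reachable = sorted(blast_radius(entry_point, GRANTS))
        print(entry_point, "->", reachable)
```

In this made-up graph, compromising `web-server` reaches five other nodes, including every repository, while `batch-worker` reaches one. Limiting "the amount of access from any one point" amounts to keeping each of those reachable sets small.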
So when you're talking about blast radius, you're also going to be thinking about your encryption techniques, right, and how you're managing your keys and how much data any given set of keys can decrypt that is picked up from an at-rest data source, like, for example, a database snapshot or an S3 bucket. Okay, let's jump into the whiteboard for a second here. Let me see if I have my screens arranged how I think I do. No, wrong way. Okay, so we're going to go to the whiteboard, and I want to talk a little bit, conceptually, about the notion of blast radius containment or damage containment. And I'm going to change to a non-computing type of engineering, or in this case architecture, to maybe give you a mental model that isn't confused by the devils that live in the details of computing that we all live in, day in, day out. So the person who designs a ship is called a naval architect. And so I'm going to design a ship here. It's going to be a bad ship because I'm just in Photoshop and I'm not a naval architect, but let's say I've got a ship here. I don't know. It's an ugly ship, but it's a ship. Now, one of your principal goals in architecting a vessel is to not let so much water in that it sinks, right? The ship is a hull of air with a typically steel skin that floats in water. So you don't want water in. That's pretty obvious. So let's draw our water here. So how do naval architects address this? They do it by segmenting the interior of the vessel. Okay, so if you imagine that you've got bulkheads here segmenting the compartments in the ship, such that if, I don't know, an iceberg gets struck, and it's struck nose-on, this compartment will fill with water. But the remainder of these will not. And so you have happy, well, maybe not happy, but happier than otherwise, living people on the ship. And it doesn't sink, right? That's the idea. 
When you think about the USS Cole, which was, of course, suicide-attacked in port, with a large explosive set off against that hull: that explosive did breach a significant section of that hull, but the vessel did not sink, because the blast radius, if you will, that is, the damage effect, was limited. And that's what we're trying to do when we are talking about building computer systems that limit blast radius, okay? We're trying to limit it to a controllable part of our computing environments. All right, let's take a look real quick at this Twitch breach. This is a really interesting breach because of the multiplicity of data. You know what, that text is small. I'm going to go to screen read mode. All right, so this is from Mitnick Security, their analysis of it. You can see here that on October 6, Twitch announced that they were, in fact, breached. As is often the case with cloud breaches, I suspect Twitch found out when the world found out, which was when hackers proudly posted all the data that they took on 4chan. And it is not atypical, when you're talking about cloud breaches, to not have an understanding of what happened until the hackers who pulled off the hack actually do something with it. We see this a lot. Not in our customers. None of our customers actually have been hacked since using Fugue, but we see it a lot in the industry where folks are just unaware that they've been breached, even in major breaches like this, until the hackers brag about it. So in this case, more than one hundred gigabytes of leaked data was publicly posted online to 4chan on Wednesday. All right, let's look at what kinds of data. So, so far, it's 100 gigs of data. That might not be a big deal, right? We're not going to measure blast radius just by size in bytes. If I manage to steal all product images from Amazon.com, who cares? That's going to be massive quantities of data. But it doesn't matter because it's not sensitive data. It's intended to be public facing, et cetera. 
On the other hand, I personally was affected when the Chinese broke into the database in which all of my personal details were kept for the security clearances that I've had in the past, and all that stuff was stolen. That's small data, but they know everything about me and just about everyone else who has carried a clearance over probably a decade or two. So small data, huge blast radius. An interesting story on that one. I was at a conference speaking with a very senior NIST, I guess he's a scientist, maybe an engineer, but he's one of the people who developed a lot of the NIST controls for security. Okay. He is central to that. He might be retired now, I don't know. And he quipped, and I don't know if this is because he had specific factual information to this end or it was speculation on his part, that the reason he thought the Chinese had stolen that data, and then a few weeks later someone broke into a website called Ashley Madison, where people were cheating on their partners, was to tie those data together. Because the way you get spies, the way you flip people, often, is by having incriminating information about them. So when you want to talk about blast radius, if his hypothesis is correct, we probably won't know for a decade or more if a number of Americans with access to sensitive information are being bribed into sharing it. Size doesn't matter. It's the content, and what can be done with the content, that matters when you're talking about blast radius. So here, I think it was 128 gigs. It says over 100 here, but let's look at what it was. So the posted data included three years of payment information showing how much Twitch compensated its elite gamers, which caused quite a stir online over the high earnings of a few select top streamers. If you didn't know that streamers with millions of followers make a lot of money, you haven't been paying attention. I would argue: who cares? This is meaningless data. 
Tiny blast radius. Folks are making a million dollars or $5 million a year. Now more people know the exact figure. We kind of all knew. Who cares? Yeah. The leak revealed that Twitch paid more than 108,000 annually to 13. Okay, well, I won't read all that. I think, as a security practitioner, I don't really care about this. I would not lose sleep if a client were to call me and say, oh, my God, our payment information to top streamers was leaked. What should we do? Probably nothing, right? The world will continue. But it gets much more interesting. All right, let's see if I can find. Oh, this isn't the article. Let's go to this other article I found that goes into more of what was actually stolen. Where are we? Yeah, this is interesting. So the Twitch leak, which apparently motivated the disclosure, allegedly contains the source code from almost 6000 internal git repositories, including the entirety of twitch.tv and various Twitch clients. We know that these are mobile clients, console clients, desktop operating system clients, proprietary SDKs, internal red teaming tools, and then creator payout reports dating back to 2019 and more. Okay, this blast radius is much more interesting and really ugly. Let's go back to our whiteboard and let's erase this guy. So what we've got here, we don't have exactly all the details on this breach. I got to watch time because I tend to enjoy talking about this stuff and I will go over. All right, we're still good. So the breach here included payment information, right, for their streamers. Apparently most or all of this data, we don't know, was stored in git repositories. We know there were, according to that article anyway, approximately 6000 git repos. Okay, so we had payment info. We had proprietary SDKs, source code. So we know from other reporting, or it has been reported, that these included internal APIs and SDKs for AWS services. So not just Twitch here. 
And by the way, Twitch is owned, of course, by Amazon. So if anyone should be really good at security on the cloud, Twitch should be right up there. And I don't doubt that they are in terms of a lot of the things most people think about. They clearly were not thinking, though, about blast radius as it related to this vulnerability that was exploited. So you've got SDK code and source code, including for AWS services. The piece that I just pulled up didn't have this in there, but in other reporting on this, it has been claimed that there was a leak of Amazon source code for a competitor for Steam called Vapor. Okay, so here we've got business intelligence, here we've got highly sensitive proprietary business information, and that's all in the same leak. And we know from the reporting on this, and what Twitch said happened, that they had a server, and that server was misconfigured, and a bad actor, we'll give them some devil horns here, a bad actor broke into that server and from there got payment info back to 2019. Why is that in a git repo? I don't think that was in a git repo. I'm skeptical that that was in a git repo. Okay. Proprietary SDKs, all of twitch.tv, all of the clients, right. You see where I'm going with this. Amazon source code for still secretive projects that haven't launched yet. So this is blast radius, right? This is breadth of access. So a couple of questions I would ask are: why in the world does this one server have access to all of these different kinds of things, right? Not just different quantities of things, but different kinds of things. It doesn't make much sense. There's not much segmentation there. Now there are a couple of ways this could have played out. Maybe they really do put all of this in git repos. A lot of this is source code. Why a server that was Internet facing in any capacity had any access to source code repos is another question. 
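The "different kinds of things" question can be asked mechanically. The following sketch assumes every resource carries a data-classification label and flags any credential whose reach spans more than one class, regardless of how many bytes are involved. The labels, resource names, and credential names are hypothetical and only loosely modeled on the categories discussed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    data_class: str  # hypothetical label, e.g. "app-data" or "source-code"


# Hypothetical inventory of what each credential can read.
ACCESS = {
    "server-credential": [
        Resource("payout-reports", "app-data"),
        Resource("web-app-repo", "source-code"),
        Resource("client-repos", "source-code"),
        Resource("parent-company-sdks", "third-party-source"),
    ],
    "asset-reader": [
        Resource("product-images", "public"),
    ],
}


def classes_reachable(resources):
    """Return the distinct data classes present in a list of resources."""
    return {r.data_class for r in resources}


def poorly_segmented(access, max_classes=1):
    """Return credentials whose reach spans more than `max_classes` data classes."""
    return {
        cred: sorted(classes_reachable(resources))
        for cred, resources in access.items()
        if len(classes_reachable(resources)) > max_classes
    }


if __name__ == "__main__":
    for cred, classes in poorly_segmented(ACCESS).items():
        print(f"{cred} spans {len(classes)} kinds of data: {classes}")
```

Counting classes rather than bytes is a deliberate simplification: it ignores how sensitive each class is, which a fuller model would weight.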
It probably, by the way, didn't. I mentioned earlier that the initial penetration is less interesting. That's the story of this server: this is how the hacker penetrated, right? This is the interesting part of the story, all this blast radius stuff. It's not how they got in. So very often what we see in these scenarios is the hacker will get in through a misconfigured server, or they'll find some API keys in an external repo or in a disk image or something. But then once they're on the server, they don't really care about that server. Okay, nobody cares about your servers anymore. They don't mean anything. Protecting your operating systems and so on, the only reason you're doing that anymore is to prevent people from perching on those to get to cloud control plane APIs. In this case, certainly, at least the ability to get into 6000 git repos. So how might we segment this? The first thing is let's pick out the things that are. It's like when you're a kid and there's five animals in a car and which one doesn't belong? Okay, which one doesn't belong? Definitely this payment info leaps out. Now to me, this is the one data type here, the one collection of data, where it's kind of understandable that a server that was public facing would have any access to it, right? Because let's say you're one of those streamers and you want to log into twitch.tv and see how much money you made over the last year because you've got to pay taxes. That's wrong because you're going to get a tax document. But you'll take my point. This is application data, right? This is understandable to be something breachable via a public facing IP address. In my opinion, it's not great, but it's also not the end of the world. And as I've mentioned earlier, I think if this had been the only thing that was leaked and stolen, I personally, as a practitioner, wouldn't give this much thought. I'd probably look again at how we're doing encryption, right? 
Because, for example, I'd want to make sure about data that were truly sensitive, particularly PII data, things like that. I mean, the risk in this is more lawsuit, honestly, than it is anything practical. Okay? You could say getting sued is bad, that is true. But in terms of just the ugliness of the other data that were stolen here, that's the lowest item on the list. Okay, source code for twitch.tv. Well, let's just start there. We've got source code here for twitch.tv and also for clients. Now I suppose it is possible, and probably likely in some scenarios, that certain engineers, certain programmers working on, say, the Xbox client, would also need access to the main web host source code. I don't know what their architecture is, but I could see it. It feels like a little bit of a stretch to me. I mean, honestly, if you're doing a modern services architecture, everything is exposed as APIs. You shouldn't necessarily need to read the source code behind the internal APIs you're using to put your application together. I would argue even that if you have to, you've done a really bad job of developing APIs and documenting them, and that you probably shouldn't do that. If you have lots of teams, they should live and die by the contract of their APIs, right? And documentation thereof. Let's assume for a moment that from this server, our bad guy, our hacker, got access to just one set of credentials. Probably, given that this is an Amazon company hosted on AWS, those are likely IAM credentials that have access to these repositories, or something similar. Let's say they got a hold of one set that had access to both the clients and twitch.tv. Why? You could have segmentation there. If you're not thinking about segmentation within your engineering team, what you're doing is creating a really attractive vector. And that's another interesting thing about the Twitch breach. 
None of this is an attack on any production systems or databases that we know of. A lot of times when we're working with folks who are just really getting their heads around cloud security, they think, well, my production environment is the one the hackers are going to go for. Very often it's not. The attackers prefer Dev, they prefer non-prod, because people think the way I was just describing: well, I have to protect production more. Dev has really nasty blast radius effects, depending on what kind of data you have in these dev environments. And it doesn't just have to be source code. Very often it's copies of databases. Okay, but back to our ship that we don't want to have sink. We should be segmenting these things. If team A works on clients, and team B, and of course, there's a bunch of folks here, I'm just drawing one person per team here for now, team B is working on the web application, why are they allowed to see each other's source code? I would say don't do that. Segment. And by the way, when you're segmenting, you're not going to predict everything a hacker might think to do. So the lesson from this isn't, oh, man, we should have separate repos for source code for different parts of the system. There are companies that have a single git repo, okay, so there are different kinds of segmentation you can practice, but what you'll want to be doing is figuring out specifically how to perform that segmentation. Now, we're running low on time here. I only have a couple more minutes. So these are the two that really are ugly, in my view. And I only have one color pen here. I'll get my whiteboard better next time with multiple colors. But why in the world would there be source code for AWS services? Not Twitch services, but AWS services? 
And by the way, if you're pointing back at the Mitnick article and saying, well, they didn't say that: I've been researching this a lot, and I've read a lot of sources, and I can't tell you where every single one is reporting from. I personally did not download the archive from 4chan. But you can do that. I don't know if it's still on 4chan, but it's out there on the Internet. You can download it and look at it, and a lot of people have. So why in the world would that access to git repos get me a parent company's highly prized source code for public facing services that every hacker in the world wants to understand, how they operate, in order to try to breach them? That's madness. That should not be accessible with this one breached server and some API keys. Right? So this is now an iceberg that is sliding along the side of our ship, right? This is how the Titanic sank: they had a kind of early form of this highly segmented hull structure, but because it dragged along the iceberg, it penetrated enough of them, and it sank the ship. All right, Amazon will be fine. They're not sinking. Twitch will be fine. But my point here is the diversity of these data creates, in this case, I would argue, one of the largest potential blast radii of breaches in recent years. Because who knows what state actors are going to do with this source code? Who knows what sophisticated hacking consortia are going to do with these things? And what's the recovery, right? I'm assuming if you're here, you write software or are involved in the creation of software. It's not easy, right? It's hard. And so it's really hard to say, well, we're going to have to throw all that away. I seriously doubt that's what happened here. So that's what we kind of have time for today. So the moral of the story, and I hope I touched on a number of things you should be thinking about. 
The first thing is, and let me just see if I can make some notes here, I got to get my whiteboard game up on this machine. All right. The first is understand what you have that's valuable. Understand potential blast radius or potential damage, and then, for the things that are high in potential damage, segment: segment access, segment important things. All right? You can't protect everything perfectly. If we did, we wouldn't have functional transactional systems. And so this gets into data classification to some degree, but it also gets into thinking like a hacker. Like, what's a hacker really going to do with that data? I'll go back to the Ashley Madison breach, right? With Ashley Madison, people think, well, the data is of a bunch of people who are lying to their partners. That's kind of sensitive. But you combine that with, potentially, the security clearance data for clearance holders, and that becomes a truly ugly kind of blast radius. And hackers are clever like that. And of course, by the way, I've never used Ashley Madison. I've been married for 30 years, so that's not me. So three, you need to think about this architecturally, by which I mean your developers should not be employing designs and techniques like, for example, the ability to list S3 buckets in production environments. They shouldn't be doing that as a designed part of the system; that breaks segmentation. Okay? You want to limit. And so it's very easy to make intentional design choices that expand blast radius. Very often you see this, and this is number four: worry about Dev. Not just prod, worry about Dev and stage and test, because very often this is where we do things like use long-lived, highly privileged accounts and tokens and keys and so on for convenience. Well, hackers like getting into Dev. Okay? So you need to think about segmentation in Dev as well. All right, I am pretty sure I am all up on time here, so we're going to go to the last of these. 
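Lessons three and four lend themselves to simple policy-as-code checks. The sketch below is an illustration under stated assumptions, not Fugue's tooling: it inspects a policy document in the standard IAM JSON shape for `Allow` statements that grant S3 listing actions, and it flags access keys older than a chosen threshold. The `MAX_KEY_AGE_DAYS` value, the function names, the key record layout, and the sample data are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Actions that let a caller enumerate buckets or their contents.
# Exact-match only: patterns such as "s3:List*" and NotAction are not handled here.
LISTING_ACTIONS = {"s3:ListAllMyBuckets", "s3:ListBucket", "s3:*", "*"}
MAX_KEY_AGE_DAYS = 90  # hypothetical threshold; choose one that fits your environment


def as_list(value):
    """IAM JSON allows a single string or a list; normalize to a list."""
    return [value] if isinstance(value, str) else list(value)


def statements(policy):
    """Return the policy's statements as a list (a lone statement may be a dict)."""
    stmts = policy.get("Statement", [])
    return [stmts] if isinstance(stmts, dict) else stmts


def allows_bucket_listing(policy):
    """True if any Allow statement grants one of the S3 listing actions."""
    for stmt in statements(policy):
        if stmt.get("Effect") != "Allow":
            continue
        if LISTING_ACTIONS & set(as_list(stmt.get("Action", []))):
            return True
    return False


def stale_keys(keys, now, max_age_days=MAX_KEY_AGE_DAYS):
    """Return IDs of access keys created more than `max_age_days` before `now`."""
    cutoff = now - timedelta(days=max_age_days)
    return [k["id"] for k in keys if k["created"] < cutoff]


if __name__ == "__main__":
    prod_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": ["s3:GetObject", "s3:ListBucket"], "Resource": "*"}
        ],
    }
    now = datetime(2021, 11, 1, tzinfo=timezone.utc)
    dev_keys = [
        {"id": "dev-key-1", "created": datetime(2020, 1, 15, tzinfo=timezone.utc)},
        {"id": "dev-key-2", "created": datetime(2021, 10, 20, tzinfo=timezone.utc)},
    ]
    print("prod policy allows listing:", allows_bucket_listing(prod_policy))
    print("stale dev keys:", stale_keys(dev_keys, now))
```

Checks like these only catch the patterns they are written for. Their value is in running the same rule against Dev, stage, and test as against production.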
I am Josh Stella, josh@fugue.co. I do try hard to, I think I, like everyone, get spammed a bazillion times a day in email. But if you want to reach out and talk to me or ask a question or get a conversation going, feel free. And we are at www.fugue.co. Enjoy the rest of your conference and thank you for your time.

Josh Stella

CEO & CTO @ Fugue
