Who Goes There? Actively Detecting Intruders With Honeytokens

Abstract

Ever wish you could set traps for intruders in your environment? While you can’t rig explosions or rolling boulders when someone attacks your servers, you can set up false credentials that trigger alarms you can act against. That is the whole idea behind honeytokens! This talk will get you started.

Summary

Who goes there? Actively detecting intruders with cyber deception tools, here at Conf42. I hope you enjoy all the amazing content from all the creators and all the providers out there. Let's go ahead and get started.
Dwayne McDaniel: Attackers want your credentials. We're up against a lot of threats out there in the world, and if they get those credentials, some bad things can happen. He tells a couple of breach horror stories that might be a little disconcerting, a little scary.
Deception has a long history. Cliff Stoll's honeypot caught an intruder who was stealing data from open networks, government networks, and military networks, and honeypots later became a mainstream conversation in computer security. In 2016, Thinkst added AWS tokens to its Canarytokens system, which the talk calls a definitive moment in the history of honey tokens.
A honey token is a decoy credential that doesn't allow any real access. It looks identical to a real credential, to an attacker or to anybody else. If it's used, it exposes that use through an alert. The talk shows one way to build them.
Put honey tokens in private environments: anywhere someone outside of your organization shouldn't have access. Think in terms of scaling this with automation. It's not a one-off exercise.
If you're going to deploy honey tokens, think long term and think about how to do it at a regular pace. If you add one new honey token per sprint, you'll get a lot of coverage in a year's time.
It's really hard to win if you only play defense. What we need to do is start actively kicking people out faster. Check out the Security Repo podcast. Highly recommend it.

Transcript

This transcript was autogenerated. To make changes, submit a PR.

Hey everybody, welcome to my talk, Who Goes There? Actively Detecting Intruders with Cyber Deception Tools, here at Conf42. I'm very excited to be part of the lineup. I hope you enjoy all the amazing content from all the creators and all the providers out there. Let's go ahead and get started.
So, I'm Dwayne. I live in Chicago, Illinois. I've been a developer advocate since 2016. You can go out and hear me co-host a podcast called the Security Repo podcast. We have some really awesome guests (some awesome hosts too, I should say) telling the world about great things in the world of security, from physical security to pen testing to API security and, of course, code security. Hit me up on the Internet: McDwayne at most places, including GitHub. And feel free to email me at dwayne.mcdaniel@gitguardian.com.
I work for GitGuardian. We are a code security platform focused on helping companies eliminate the problem of hard-coded credentials: finding where those plaintext credentials appear and giving you a path to do something about it. We also make honey tokens, which will come up later.
But real quick, before I go any further, I need to deploy something. So I'm going to go ahead and copy these credentials and go to GitHub, and we're going to edit this and paste those in. And just because I feel like it, I'm going to go ahead and make those a comment, and we'll commit that change and we'll come back to that later.
Attackers want your credentials. We know this. We all know this. We're up against a lot of threats out there in the world, and if they get those credentials, some bad things can happen. I'm going to tell a couple of horror stories and they might be a little disconcerting, a little scary. And if you get a little scared, and this is true of anytime you get a little scared, feel free to recite the Bene Gesserit litany against fear.
I'm a huge Dune fan, and this is one of the greatest things that came out of that series. I think you will remain; only the fear will be gone at the end. So just take a deep breath. And this is true, again, anytime you do leak something or you think that you are being breached.
So, Uber. Last year they had an attack: a super admin got phished. Now, they have MFA, so it's not like they weren't taking security seriously. They think that with that flood of MFA requests, multi-factor authentication requests, to that admin, his thumb slipped, he got tired, and he eventually just clicked the wrong button. Once the attacker was in, he found a bunch of PowerShell scripts chock-full of credentials to everything else, including their Thycotic PAM, which allowed access to HackerOne and Slack and their Google Drive and everything else. We don't know exactly what this attacker took, but we do know this story because they didn't take him seriously. They thought it was some prankster, and the next person he talked to was the New York Times. And you can go read that story from the New York Times.
AstraZeneca: here's an interesting one, where a hard-coded credential caused a problem for them. A developer pushed a test environment credential out to a public GitHub repo, where it was discovered and used by outsiders. And you might be thinking, okay, well, what's the big deal? It's in a test environment. Well, another developer had pushed actual customer data into that test environment. Perfect storm, because they don't know exactly what was stolen. They don't know exactly which customers were affected over the year-long period where this was true. This was in public GitHub, so it was very easy to detect. And those credentials were used, well, they don't know exactly how many times or by whom.
CircleCI: maybe you lived through this. They had a remote developer who had an insecure system. He had a Plex server that had never been patched on his remote working box.
An attacker is just broadly attacking Plex that day, finds that's the vulnerability into that particular computer, and realizes, hey, this computer can access the CircleCI internal network. He plants some malware, and it starts stealing credentials anywhere it can find them: from heap dumps, from memory, from anywhere they're pasted in plain text. He uses those credentials to then get into customer applications and start planting the same malware, which starts stealing things. The same day that they announced, hey, customers, we had to rotate all of these API keys (this was January 3, I believe, of 2023), security researchers said, hey, all of my honey tokens went off inside CircleCI. Something's gone wrong. And that's what we're talking about today.
Now, all of these stories involved hard-coded credentials, because we know that that is what attackers want. If they're following the standard attack path, then it's that initial breach; live off the land; figure out what's there; laterally expand; escalate privileges; find what data you can; exfiltrate that out; and, well, do whatever nasty business you're going to do with it. We know they're acting faster than ever before because they know we're defending faster than ever before. We know how they behave, though. We know that path, and we can start using that against them. That's the important takeaway from all of the data. The Verizon DBIR, the Sophos reporting, the CISO reporting, all of the other acronyms out there in security: all that reporting says they behave generally the same way.
We know exactly what they want, too. They want your data so they can ransom it or sell it out there on the Internet. They also want your machine resources, either to crypto mine or to sell access to those machines to other malicious people to do other things with, like DDoS attacks or their own crypto mining, and anything that leads back to those abilities and those data or systems.
We know this is a problem: we as developers aren't making the attackers' lives harder, we're making them easier, because we keep leaving plaintext credentials around. Again, lateral expansion and escalation are general themes we see in almost all attacks. Last year, we found over 10 million hard-coded credentials added to GitHub public repos. Here at GitGuardian, we look at every single commit that hits public GitHub through the API, and last year it was over a billion commits. Out of that we found 10 million hard-coded credentials, and we found that about one out of every ten developers has done this. You can read the full report to get all the fine details of what was exposed and what potential attackers could have stolen and used as leverage to get in and attack. But the point is, we know that this is what they're after. We know that once they're inside, they're always going to be looking for those hard-coded credentials, and that's what we can use against them. That is our advantage as blue teamers, as defense.
We have been using cyber deception for a long time. We actually have been using deception for a long time. Let's look back through history. Let's start a little bit before the Internet existed and go back to that first famous story of deception, where the Trojans were fighting the Achaeans. Homer details all this out in the Odyssey, or the Iliad. I'm sorry, the Iliad. The Trojan horse. We all kind of know this, and we're still living with Trojan horses today: it looks like it's something, but inside is something else, and it's malicious.
Fast forward a little bit and this becomes a very common tactic in war: we'll appear strong where we're weak and weak where we're strong, and lure people to lower their defenses when they shouldn't, and attack them that way. Sun Tzu might or might not have said this, but it is in The Art of War.
The Ghost Army: one of my favorite stories from World War II.
We didn't have enough tanks and bombs and planes at the beginning of the war. We just didn't. The beginning of the US involvement in the war, I should say. We were building them and mobilizing as fast as we could. So the US military turned to Hollywood and said, hey, Hollywood, can you build us a bunch of props that look like planes and tanks from a distance? Remember, in 1942, reconnaissance relies on binoculars and high-flying planes, not on getting up close. We don't have radar, we don't have satellites, we don't have drones. So this looks good enough from a distance. So Hollywood built us a bunch of balloons, inflatable tanks and planes, so we could position them and play a lot of loud noises so it sounded like they were staging in one direction. Meanwhile, we snuck the actual planes and tanks that we built around in another direction and, well, eventually won World War II. Great documentary about that, by the way.
Speaking of great documentaries, one of my favorites I've seen recently is The KGB, the Computer, and Me. It's a documentary from 1990. That's the actual name of it. That's a screenshot from the opening credits. It's a Nova special, Nova from PBS, the public broadcasting system here in the United States. It's based on Cliff Stoll's book The Cuckoo's Egg, which is a really good book unto itself, and I highly recommend checking it out. Long story short, this is where we get the term honeypot. Cliff Stoll, still alive, still an awesome dude, is working at the Lawrence Berkeley National Laboratory, and he's investigating a missing $0.75 in billing. It costs $300 an hour to rent these machines from Lawrence Berkeley National Laboratory so you can do your research on them. Long story short, he ends up finding that it's somebody in Eastern Europe, East Germany, I should say, who is stealing any data they can find on open networks: US government networks, military networks, and in this case, a university network.
The intruder won't stay on the line long enough for them to get a good trace on exactly who this person is. So Cliff Stoll's girlfriend at the time suggests, hey, what if we put a bunch of fake data on the system and lure them in? Cliff Stoll said, that's a great idea. He does this and calls it a honeypot because it's sticky. Download speeds in 1985 are very slow, so by the time this person figures out, hey, this is all just a bunch of junk, they've already been caught. I won't spoil the ending, but go watch that documentary. It's absolutely amazing and fascinating. And his book's pretty good, too.
Fast forward a little bit in time, and honeypots kind of take off as a concept. We get to Fred Cohen's Deception Toolkit, '91, which is the first description of how to build a honeypot system inside of your network. You can go find this documentation today. Basically, the idea is, if it's a system that's not in use, let's turn it on and wait for people to try to access it. We'll catch those people, and they won't know what they're supposed to be getting into and what they're not. A really big step in the history of computer security.
Fast forward a little bit further, and this idea keeps catching on and people keep reinventing the wheel. But then someone finally says, hey, here's a commercial version of this. Enterprise IT, honeypots are a great, proven idea; here's one off the shelf. And I think this marks a really important point in the history of hacking, because of something he says. Alfred Huger says hackers aren't kids on a digital joyride; it's clear their motive is financial gain. That's as true today as it was when he said it. But it marks this turning point. The term hacking comes from MIT. It originally referred to engineering students who played elaborate pranks, mostly harmless. They built a car on top of the roof of this building, and nobody to this day knows exactly how they did it. Very clever.
They hooked a fire hydrant up to a drinking fountain. Hilarious, just fun little pranks, hijinks. Well, this is the point where we've gone from phone phreaks and kind of victimless crimes to, hey, they're starting to steal our stuff for real. They're not free riders. They're not getting a free long-distance phone call. They are actually stealing data. They're stealing money.
Fast forward a little bit more, and honeypots have become a mainstream conversation in computer security. Augusto Paes de Barros in 2003 writes on a message board, and this is the exact message, that he's playing with this idea called honey tokens. So instead of an entire system, a honeypot, it's just information that shouldn't be flowing over the network. It's a piece of data that shouldn't move, in other words. And that's where honey token comes from. It changes that part of the conversation from a honeypot to a subpart of that: a token that shouldn't be touched.
Fast forward a little bit further, and I think we've now reached the modern definition, which I'll properly define a little bit later in the slide deck. Thinkst, a company out of South Africa, builds a system called Canarytokens, and in 2016 they add AWS tokens into their system. I think this is a very definitive moment in the history of what we're talking about, where a token goes from this idea of a piece of data that shouldn't move to really combining with tokens like JWTs or bearer tokens, in this case an AWS token, to be something that sets off an alert when someone tries to get in using it.
Fast forward to 2023 at RSA. I was fortunate enough to see Kevin Mandia talk about second-line defense. The whole point of his presentation is that we can build elaborate walls, we can build these elaborate defenses and WAFs, but they're going to get in. We just know this. We have to assume that we can be breached and assume that breach is happening all the time.
So we need early warning signs. And you can see it at the very bottom of this picture: he says honey tokens are your early warning signs. We have now reached the point where this is mainstream. This is Google Cloud saying, this is how you protect yourself. And that gets us to where we're at today, and what I'm going to talk about for the rest of this session.
What exactly is a honey token? Well, we talked about the original definition from Augusto, and we watched that merge with other tokens. Here's where I think we are today. This is the definition we use internally at GitGuardian: a honey token is a decoy credential that doesn't allow any real access. Importantly, it looks identical to a real credential, to an attacker or to anybody else. If it's used, it exposes that it's being used through an alert, giving you at least the IP address of the person trying to use it.
This is how you build them. This is one way to build them; this is an approach. This is ggcanary, an open source repository that GitGuardian built that uses Terraform and AWS. But the concept is very straightforward. Let's create users in the system that have no rights whatsoever. If you are going to use this, I would advise building it in an entirely different region than the other tools or deployments you have on AWS, just for safety. Really isolate it as much as possible. But you want a list of users who have no credentials. Let's take those users and build a Lambda function that uses CloudTrail to watch for those credentials being used. The logging creates that event in an S3 bucket, and from that S3 bucket the Lambda does the triangulation: does that name match one of our honey token credentials from the list? If it does, send either a Slack message or an email using SES or SendGrid. This is the product ggcanary, out of the box, open source; this is exactly how it works. Quick note: you will not find this exact diagram inside the repository.
This comes from a blog post. If you just google GitGuardian ggcanary, it's the first blog post that pops up about it from the GitGuardian blog. But this is the idea. Is this the only way to build them? Absolutely not. This is how we built this one.
And that gets me to my next point. Honey tokens can be built by hand, there are a lot of open source repos, which we're going to talk about, and then there's a lot of stuff off the shelf. Just really quickly: there are a lot more open source options than what I'm showing here, but these are the main ones that I drew inspiration from for this talk. Before that, though, there's the idea that you can just build these now that you have the concept in your head. There are a lot of ways to approach this. If you have some kind of logging and some kind of alert system to tell you someone's trying to use it, you can build it to your imagination.
We showed you the diagram we used for ggshield, and you can go see the code for ggshield. Or not ggshield, I'm sorry, ggcanary. Look at ggcanary and tear it apart and see how it works. If you like Terraform and AWS, maybe that's the right one for you. SpaceSiren is another one with a very interesting history. It's forked off of something called SpaceCrab, a very interesting project. I'd highly recommend going out and checking that history for fun, if you like researching security histories. But it turned into SpaceSiren, which is the modern, still-maintained thing today. It uses AWS directly. If you have a little bit of AWS know-how, you'll do fine with it; it's a jumping-off point, I would say. And then Thinkst Canarytokens: if you can deploy it in Docker, and you want to maintain your own infrastructure and run this yourself, that's a good one too. They all work. It's just a matter of what you want to support and how you want to build it.
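To make the detection step concrete, here is a minimal Python sketch of the matching logic described above: take log records, check whether the access key on each one belongs to a known decoy, and produce an alert that includes the source IP. It is not the ggcanary code. The field names (`userIdentity.accessKeyId`, `sourceIPAddress`, `userAgent`, `eventName`) follow the CloudTrail record format; the `HONEYTOKENS` inventory, the key IDs, the location labels, and the `find_honeytoken_use` helper are all made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical inventory: decoy access key ID -> the single place it was planted.
HONEYTOKENS = {
    "AKIAEXAMPLEDECOY0001": "repo:payments-service",
    "AKIAEXAMPLEDECOY0002": "ci:build-secrets",
}


@dataclass
class Alert:
    planted_in: str
    access_key_id: str
    source_ip: str
    user_agent: str
    event_name: str


def find_honeytoken_use(records, honeytokens=HONEYTOKENS):
    """Return an Alert for every log record that used a decoy access key."""
    alerts = []
    for record in records:
        key_id = record.get("userIdentity", {}).get("accessKeyId")
        if key_id not in honeytokens:
            continue  # normal traffic, or a key we don't track
        alerts.append(
            Alert(
                planted_in=honeytokens[key_id],
                access_key_id=key_id,
                source_ip=record.get("sourceIPAddress", "unknown"),
                user_agent=record.get("userAgent", "unknown"),
                event_name=record.get("eventName", "unknown"),
            )
        )
    return alerts


if __name__ == "__main__":
    # Made-up record shaped like a CloudTrail entry (documentation IP range).
    sample = [
        {
            "eventName": "GetCallerIdentity",
            "sourceIPAddress": "203.0.113.7",
            "userAgent": "example-scanner",
            "userIdentity": {"accessKeyId": "AKIAEXAMPLEDECOY0001"},
        }
    ]
    for alert in find_honeytoken_use(sample):
        print(f"Honey token from {alert.planted_in} used by {alert.source_ip}")
```

In a real deployment, something like a Lambda would feed this function the records parsed from each new log object and forward the resulting alerts to Slack or email. Keeping the matching as a pure function makes it easy to test without any cloud account.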
And that question of how you want to support it is a very important one, because if the answer is I don't want to support this, I just want to use it, well, then you're going to start using the commercial options, and there are a lot of them out there.
The free one that I think everybody should start with, if you're new to this idea and you've never seen a honey token in action, is canarytokens.org. Go make a honey token, a one-off honey token, and see how it works. You can produce not just an AWS credential; you can make a fake credit card, a fake SQLite server, or SQLite file, a fake PDF, a fake email. They're not real, but if someone tries to use one for any reason, you get an alert, and you can see what that alert looks like through their system. It's really cool, but it's a one-off. If you're working on one or two projects, or there are only one or two places you'd ever want to put a canary token, it's a really good free option. If you want to do that at scale because you're an enterprise, they sell that. It's called canary.tools. You can go to that website and Thinkst will gladly sell you a commercial version of this at scale.
If you're a GitGuardian customer, or you're planning to be one, or it sounds like a good idea to you and you want to use GitGuardian, we make one too. It's called the GitGuardian honey token module. We have a platform play, so the module is the add-on. It does require you to have a GitGuardian account, which is free for individual users, for teams up to 25, and for open source use. But this isn't a good fit for open source use, and I'll talk about why in a few slides.
If you're a Microsoft fan, they have this built into Sentinel, if you're using Azure. I don't know a lot about it other than that there are documents for it, so your mileage may vary. Go dig through the documents and talk to your rep if it sounds interesting and you're already using Sentinel. I wouldn't say go use Sentinel just for this yet, though.
There are a lot of great reasons to use Sentinel, though. If you're a CrowdStrike customer, they've got one too. Go talk to your reps. I have no idea what it actually looks like, and I don't know what it's called internally, but I do know there was a blog post about them having honey tokens. Proofpoint, which I have talked to, is a really interesting company with a very broad play. One of their many, many tools is Identity Threat Defense Shadow, which is a honey token play. It's more of a honeypot play, but you can use it as a honey token system as well. But like I said, there are tons of options. If you're already a customer of any of these companies, go talk to your rep. It might even be a free add-on up to a limit, but your mileage may vary on all that.
So now we've talked about how to get them, how to build them, generally what they are, and how to architect them. How do you use these things? Well, we think there are some best practices around this.
Put honey tokens in private environments: anywhere someone outside of your organization, or anyone outside of a team, shouldn't have access. That's a good place. So, your private code repositories: then you know that if someone gets in, you have a breach on your hands, and that's not good. Same thing with your CI environments, like we saw in the real-world example of CircleCI. That's a real tweet; you can go and find it. A third-party researcher says, hey, all my honey tokens went off, something's going on with CircleCI, and he knew before the announcement came out. Your messaging systems, your project management systems, anywhere internal: there's no legitimate use for these things. So if someone internally does find one and uses it just to use it, that's a whole different conversation than a breach, but it's still a security concern. Like, why is this person doing that? It's an educational moment at best, and it's a breach at worst.
Put them in your vault systems, because if someone breaches your vault, that's a very bad day. That means they have access to literally everything, and you don't want that.
Back to my point earlier on open source use: you don't want to put these in public places, and the whole nature of open source is that it's public. The main reason why is that there are a lot of platforms in the world, good and bad, and a lot of bots out there that are constantly scanning the Internet, trying to find hard-coded credentials that they can harvest. They're looking for other things too, but a big thing they're doing now is finding and validating these credentials. If you put one in a private repository and all of a sudden a public scanner hits it, you know you've got a leak on your hands. If you put it in a private environment and all of a sudden a public scanner hits it, something's gone horribly wrong, and you know you need to deal with that right now and respond very quickly. That's the whole point of this: we can respond faster and cut those dwell times, those breach times and leak times, down as much as possible, and mitigate the situation as best we can, as fast as we can.
Use a one-to-one ratio. This is another huge time saver. You might be tempted to say, I have this one honey token, I'm going to put it everywhere. Then if it goes off, you don't know what triggered it or what specific repo or environment has been infiltrated or leaked. If you have 100 repositories and a Jira instance, and you put the same honey token everywhere and it goes off, now you have to triangulate which of those caused it. Keep it real simple: put it in one place, and then create a new one to put somewhere else. Pretty straightforward.
Do think in terms of scaling this with automation, rather than doing this as a one-off exercise. Again, if you have only a couple of small places to put them and you're done, good.
But if you're thinking, I have a whole enterprise to secure, I have hundreds of repos and thousands of developers and way too many internal systems to count, then you're going to start thinking, how do I spin these up and put them somewhere? That example down below, I'm not sure how useful it is. I keep meaning to document it better, but it's just a simple script that shows, hey, here's a tool to create a honey token, and here's some logic to insert it into a Git repo. That's all it is, just a jumping-off point for how we can think about automation.
You are probably on the blue team if you're watching this, or you might be someone who's just interested generally in security. Unless you are specifically a law enforcement agent, don't go after these people. When these go off, you'll get an IP address (I'll show these going off in a little bit), but know that your job is really to protect your stuff. So think in terms of: I've got to get this IP address out of here, I've got to make sure the breach is stopped, I've got to make sure anything someone got into is secured, and any credentials that got leaked get rotated. Think in those terms, not, I'm going to go hunt these people down and stop them. Because the truth is, that's not your job, unless you are specifically tasked with doing that. Then good luck.
And like with everything else in computer science, like with every other technology, this is a journey. It's not a one-off exercise. If you treat it like a one-off exercise, you'll get some return on investment, but you'll burn yourself out trying to do it all at once, or you'll do a couple of places and just never think about this again. So if you're going to deploy honey tokens, if you're going to embrace this as a strategy, which I highly encourage you to do, think long term and think, how do we do this at a regular pace? Is this a once-a-week thing? Is it a once-a-quarter thing?
Is it a once-a-sprint thing? If you added one new honey token per sprint, and your average sprint is three weeks, then you're going to get a lot of coverage in a year's time. And hopefully they never go off. Once they're set, they're set until that AWS instance completely vanishes or until you take them down.
It's a good time to go back and check and see what happened to that honey token, because that's what I did earlier. I didn't explain myself, but what I did is I took a honey token I created through the GitGuardian honey token module, which is an AWS token, an AWS key ID and access key pair, and I added it as a comment to a public repo that I originally created back in July. I'm recycling an old repo, I'll be honest with you. So over here in my email (I know it's a little hard to see, so I'll make it a little bit bigger), see that? I have these alerts that Conf42 was triggered: my honey token for Conf42, the one I created. That's literally what I deployed, and it got triggered 26 minutes ago. I deployed this 26 minutes ago.
Let's go look at the honey token itself. I can see that twelve system scans so far have triggered this. Let's look at the first one. The first one was AWS itself. AWS knows this is a problem. They are constantly scanning the Internet trying to see, has anyone leaked one of our credentials? And that has fired off some internal stuff at AWS. I might get an email about that; an email will have been sent to someone about that. Well, that's us, that's GitGuardian. We are constantly scanning as well, because we put this on public GitHub. And then we see, oh, here are two people in the US that used TruffleHog as the user agent, and someone in India did the same. They scanned and validated. Just the act of reading isn't going to trigger this; they tried to validate it, they tried to use it. They were calling GetCallerIdentity, which is a very common call to make, just a who-am-I, to see if this worked.
Don't worry, these all came back invalid, but they were read from here immediately. So that's what that looks like. So what do I do about that? Well, in this case, I'm simply going to revoke the honey token. It's been triggered, so it's not that useful anymore. People are going to mark it as invalid if they're building a list. And then I'm going to clean up after myself and just get rid of it here. I could rewrite my history and remove all instances of it, but it was never good anyway, so there's no harm in leaving it in my code base, especially my code history. If it was a real credential, I would have cleaned it out of my code base and my Git history. But again, this is a honey token; it doesn't really matter.
So in conclusion, wrapping things up: it's really hard to win if you only play defense. What we need to do is start actively kicking people out faster, cut the dwell time, put the attacker on their heels, and make them think, wow, I don't know what to do next. Because the second we do that, we win. As soon as they don't know what to do, as soon as it no longer matches the playbook they downloaded for the attack, they're going to run out of ideas and go somewhere else where that playbook will work. So we need to act quicker, kick them off the network, and make them think, I don't know how to further engage. If it's an advanced persistent threat, like the Lazarus Group or North Korea attacking you, I don't know if this will work. But I'd say for the vast majority of attacks, this is going to work great. It's figuring out who they are and booting them out.
Honey tokens are decoys. They look real because they are real, but they just don't give any access. They do give you an alert. You should use them wherever you've got private data and wherever you've got private code. There are a lot of options.
You can build them yourself. There are tons of great articles and resources out there if you're going to go the DIY route. And of course there are commercial options. The ones I showed are just the tip of the iceberg; there are a lot more solutions out there. I'm personally kind of skewed toward GitGuardian because I work here, but there are a lot of good options. And don't put them out publicly unless you are specifically trying to just gather IP addresses from people scanning you. There's not a lot of use to that in all reality because, again, your job should be to defend, not to actively go after people that are scraping the Internet. There are a lot of people scraping the Internet.
Anyway, I've been Dwayne. I live in Chicago. I've been a developer advocate since 2016. Check out the Security Repo podcast. Highly recommend it. We have a lot of great guests and talk about a very wide range of security subjects. Hit me up on the Internet: McDwayne pretty much everywhere, including GitHub. And if you ever want to hit me up about anything, you can reach out to me at dwayne.mcdaniel@gitguardian.com. Thank you so much for listening. I had a blast making this, and I hope that everybody has a great rest of your day and enjoys all the amazing content here at Conf42. Thank you.
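The one-to-one guidance and the automation idea from the talk can be sketched together. This is not the script shown on the slide. It is a hypothetical Python sketch that assumes each location is a local directory (for example a checked-out repo) and that some backend can issue a fresh decoy on request. Here `fake_issue_token` is only a stand-in that produces random AWS-looking strings which alert on nothing; `plant_tokens`, `DECOY_FILENAME`, and the inventory file are likewise invented for illustration. Committing and pushing the bait file is left out.

```python
import json
import secrets
from pathlib import Path

DECOY_FILENAME = "aws_credentials.bak"  # hypothetical bait file name


def fake_issue_token(label):
    """Stand-in for a real honey token backend; these values alert on nothing."""
    return {
        "label": label,
        "access_key_id": "AKIA" + secrets.token_hex(8).upper(),
        "secret_access_key": secrets.token_urlsafe(30),
    }


def plant_tokens(locations, inventory_path, issue_token=fake_issue_token):
    """Plant one unique decoy per location and record which key went where."""
    inventory_path = Path(inventory_path)
    inventory = {}
    if inventory_path.exists():
        inventory = json.loads(inventory_path.read_text())
    already_covered = set(inventory.values())

    for location in map(Path, locations):
        if str(location) in already_covered:
            continue  # one-to-one: never a second token for the same place
        token = issue_token(location.name)
        location.mkdir(parents=True, exist_ok=True)
        (location / DECOY_FILENAME).write_text(
            "[default]\n"
            f"aws_access_key_id = {token['access_key_id']}\n"
            f"aws_secret_access_key = {token['secret_access_key']}\n"
        )
        # The inventory is what lets an alert point at exactly one place.
        inventory[token["access_key_id"]] = str(location)

    inventory_path.write_text(json.dumps(inventory, indent=2))
    return inventory
```

Running something like `plant_tokens(["./repo-a", "./repo-b"], "inventory.json")` once per sprint would add coverage only for locations that don't have a token yet, and because every key ID maps to a single location, a triggered token needs no triangulation.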

Conf42 DevSecOps 2023 - Online

November 30 2023

Dwayne McDaniel

Developer Advocate @ GitGuardian
