Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hey everybody, welcome to my
talk, "Who Goes There? Actively Detecting Intruders with
Cyber Deception Tools," here at Conf42. I'm very excited
to be part of the lineup. I hope you enjoy all the amazing content from
all the creators and all the providers out there. Let's go ahead and get started.
So, I'm Dwayne. I live in Chicago, Illinois. I've been a
developer advocate since 2016. You can go out and hear me
co-host a podcast called the Security Repo Podcast. We have some really
awesome guests (and some awesome hosts, too)
telling the world about great things
in the world of security, from physical security to pen testing to
API security and, of course, code security. Hit me up
on the Internet: I'm mcdwayne at most places, including GitHub.
And feel free to email me at dwayne.mcdaniel@gitguardian.com.
I work for GitGuardian. We are a code security platform focused
on helping companies eliminate the problem of hard-coded
credentials by finding where those plaintext credentials appear
and giving you a path to do something about it.
We also make honey tokens, which will come up later. But real
quick, before I go any further, I need to deploy something. So I'm going to
go ahead and copy these credentials and go to GitHub,
and we're going to edit this and paste
those in. And just because I feel like it, I'm going to go ahead and
make those a comment, and we'll commit that
change and we'll come back to
that later. Attackers want your credentials. We know this.
We all know this. We're up against a
lot of threats out there in the world, and if they get those credentials,
some bad things can happen. I'm going to tell a couple of
horror stories and they might be a little disconcerting, a little
scary. And if you get a little scared (and this is true anytime you
get a little scared), feel free to recite the Bene Gesserit Litany Against
Fear. I'm a huge Dune fan, and this is one of
the greatest things that came out of that series, I think. You will remain.
Only the fear will be gone at the end.
So just take a deep breath. And this is true of, again, anytime you
do leak something or you think that you are being breached.
So, Uber: last year they had an attack.
They had a super admin who got phished. Now, they
have MFA, so it's not like they weren't taking security
seriously. They think that with
that flood of MFA requests, multi-factor authentication
requests, to that admin, his thumb slipped, or he got
tired and eventually just clicked the wrong button. Once the
attacker was in, he found a bunch of PowerShell scripts
chock full of credentials to everything else, including their Thycotic
PAM, which allowed access to HackerOne and
Slack and their Google Drive and everything else.
We don't know exactly what this attacker took, but we do know this story because
they didn't take him seriously. They thought it was some prankster,
and the next person he talked to was the New York Times. And you can go
read that story from the New York Times.
AstraZeneca: here's an interesting one, where a
hard-coded credential caused a problem for them.
A developer pushed a test environment credential out
to a public GitHub repo, where it was discovered and used
by outsiders. And you might be thinking, okay, well, what's the big deal?
It's a test environment. Well, another developer had pushed
actual customer data into that test environment.
Perfect storm, because they don't know exactly what was stolen.
They don't know exactly which customers were affected over the year-long
period where this was true. This was in public
GitHub, so it was very easy to detect. And those
credentials were used, well, they don't know exactly how
many times or by whom. CircleCI: maybe you lived through this.
They had a remote developer who had
an insecure system. He had a Plex server that had never been
patched on his remote working box.
An attacker is just broadly attacking Plex that day,
finds that vulnerability into that particular computer, and realizes,
hey, this computer can access the CircleCI internal
network. He plants some malware, and it starts
stealing credentials anywhere it can find them: from heap dumps,
from memory, from anywhere it can find them
pasted in plain text. He uses those credentials to
then get into customer applications and start planting the same malware, which
started stealing things. The same day that they announced,
hey, customers, we had to rotate all of these API keys
(this was January 3, I believe, of 2023),
that same day, security researchers said,
hey, all of my honey tokens went off inside CircleCI.
Something's gone wrong. And that's
what we're talking about today. Now, all of these stories involved hard-coded credentials
because we know that that is what attackers want. If they're following
the standard attack path, then it's that initial breach,
live off the land, figure out what's there, laterally expand, escalate
privileges, find what data you can, exfiltrate that out,
and, well, do whatever nasty business you're going to do with it.
We know they're acting faster than ever before because they know we're
defending faster than ever before. We know how they behave,
though. We know that path, and we can start using that against them.
That's the important takeaway from all of the data. All of
the Verizon DBIR, the Sophos reporting, the CISO
reporting, all of the other acronyms out there in security.
All that reporting says they behave generally the same way.
We know exactly what they want, too.
They want your data so they can ransom it or sell it out there on
the Internet. They also want your machine resources to either crypto
mine or to sell access to those machines
to other malicious people to do other things
with, like DDoS attacks or their own crypto mining and
anything that leads back to those
abilities and those data or systems.
We know this is a problem: we
as developers aren't making the attackers' lives harder, we're making them easier,
because we keep leaving plaintext credentials around.
Again, lateral expansion and escalation
are general themes we see in almost all attacks.
Last year, we found over 10 million hard-coded credentials added
to GitHub public repos. Here at GitGuardian,
we look at every single commit that hits GitHub public through the
API. And last year it was over a billion commits.
And out of that we found 10 million hard-coded credentials, and we found that
about one out of every ten developers has done
this. You can read the full report to get all the fine details of what
was stolen and what was exposed and what
potential attackers could have
used as leverage in an attack.
But point is, we know that this is what they're after.
We know that once they're inside, they're always going to be looking for those hard
coded credentials, and that's what we can use against them. That is
our advantage as blue teamers, as defense.
We have been using cyber deception for a long time. We actually have been
using deception for a long time. Let's look
back through history. Let's start a little bit before the Internet existed
and go back to that first famous story of deception, where
the Trojans were fighting the Achaeans.
Homer details all this out in the
Odyssey, or the Iliad. I'm sorry, the Iliad.
The Trojan horse. We all kind of know this, and we're still living with
Trojan horses today. It looks like it's something, but inside is
something else, and it's malicious. Fast forward a little bit
and this becomes a very common tactic in war:
we'll appear strong where we're weak, and weak where we're strong, and
lure people to lower their defenses when they shouldn't lower
their defenses and attack them that way.
Sun Tzu might or might not have said this, but it is in The Art
of War. The Ghost Army: one of my favorite
stories from World War II. We didn't have enough tanks and bombs
and planes at the beginning of the war. We just didn't.
The US involvement in the war, I should say. We were building them and mobilizing
as fast as we could. So the US military turned to Hollywood and said,
hey, Hollywood, can you build us a bunch of props that look like
planes and tanks from a distance? Remember, in 1942,
reconnaissance relies on binoculars and high
flying planes, and not up close.
We don't have radar, we don't have satellites, we don't have drones. So this looks
good enough from a distance. So Hollywood built us a
bunch of balloons, and that's inflatable tanks and
planes so we could position them and play a lot of
loud noises so it sounded like they were staging in one direction.
Meanwhile, we snuck the actual planes and tanks that we built
around in another direction and, well, eventually won World War II.
Great documentary about that, by the way. Speaking of great documentaries,
one of my favorites I've seen recently is
The KGB, the Computer, and Me. It's a documentary from
1990. That's the actual name of it. That's a screenshot from the
opening credits. It's a Nova special, Nova from PBS,
the Public Broadcasting
Service here in the United States.
It's based on Cliff Stoll's book The Cuckoo's Egg, which is a really good book unto
itself, and I highly recommend checking it out. Long story short,
this is where we get the term honeypot. Cliff Stoll, still alive,
still awesome dude. He is working at the Lawrence Berkeley
National Laboratory, and he's investigating this missing
$0.75 in billing. It costs $300
an hour to rent these machines from Lawrence Berkeley National Laboratory,
so you can do your research on it. Long story short, he ends
up finding that it's somebody in eastern Europe,
eastern Germany, I should say, who is stealing
any data they can find on open networks: US government
networks, military networks, and in this case, a university
network. The intruder won't stay on the line long enough for them
to get a good trace on exactly who this person is.
So Cliff Stoll's girlfriend at the time suggests,
hey, what if we put a bunch of fake data on the system and
lure them in? Cliff Stoll said, that's a great idea. He does this and
calls it a honeypot because it's sticky.
Download speeds in 1985 are very slow.
So by the time this person figures out, hey, this is all just a bunch
of junk, they've already been caught. Won't spoil the ending of this,
but go watch that documentary. It's absolutely amazing and fascinating.
And his book's pretty good, too. Fast forward a little bit in time,
and honeypots kind of take off as a concept. We get to
Fred Cohen's Deception Toolkit, 91, which is the
first description of how to
build a honeypot system inside
of your network. You can go find
this documentation today. Basically, the idea is, if it's a
system that's not in use, let's turn it on and
wait for people to try to access it. And we'll catch those people, and they
won't know what they're supposed to be getting into and what they're not. A really
big step in the history of computer security.
Fast forward a little bit further and this idea keeps
catching on and people keep reinventing the wheel. But then someone finally
says, hey, here's a commercial version of this.
Enterprise IT, honeypots are a great,
proven idea. Here's one off the shelf. And I think this marks
a really important point in the history of
hacking, because he says something. Alfred Huger
says hackers aren't kids on a digital joyride. It's clear
their motive is financial gain. That's as true today as it
was when he said it. But it marks this turning point. The term hacking
comes from MIT. It originally referred to engineering
students who played elaborate pranks,
mostly harmless. They put a car on top
of the roof of a building, and nobody to this day knows exactly how they
did it. Very clever. They hooked a fire hydrant
up to a drinking fountain.
Hilarious.
Just little fun pranks, hijinks.
Well, this is the point where we've gone from
phone phreaks and kind of victimless crimes
to, hey, they're starting to steal our stuff for real.
They're not free riders. They're not getting a free long-distance
phone call. They are actually stealing data. They're stealing money.
Fast forward a little bit more, and honeypots
have become a mainstream conversation in computer
security. Augusto Paes de Barros in 2003
writes on a message board. This is
the exact message. He says he's playing with this idea called honey tokens.
So instead of an entire system, a honeypot,
it's just information that shouldn't be flowing over
the network. It's a piece of data that shouldn't move, in other words. And that's
where honey token comes from. And it changes
that part of the conversation from a honeypot to a subpart
of that, a token that shouldn't be touched. Fast forward
a little bit further. And I think we've now reached the modern
definition, which I'll properly define a little bit later
in the slide deck. But Thinkst,
a company out of South Africa, builds the
system called Canarytokens, and in 2016 they
add AWS tokens into
their system. And I think this is a very definitive moment in the
history of what we're talking about, where a token goes from
this idea of a piece of data that shouldn't move to really
combining with tokens like JWTs or bearer
tokens, in this case
an AWS token, to be something
where someone trying to get in using it
sets off an alert. Fast forward to 2023 at
RSA. I was fortunate enough to see this talk, to see Kevin
Mandia talk about second-line
defense. The whole point of his presentation is, we
can build elaborate walls, we can build these elaborate defenses and WAFs,
but they're going to get in. We just know this. We have to
assume that we can be breached and assume that breach is happening all the time.
So we need early warning signs. And you can see it at the
very bottom of this picture. He says honey tokens are your early warning
signs. We have now reached the point where this is mainstream.
This is Google Cloud saying, this is how you protect yourself.
And that gets us to where we're at today. And what I'm going to talk
about for the rest of this session. What exactly is a honey token?
Well, we talked about the original definition from Augusto and
the way we watched that merge with other tokens.
And here's where I think we are today. This is the definition we use
internally at GitGuardian. A honey token is a decoy credential
that doesn't allow any real access.
Importantly, it looks identical to a real
credential, to an attacker or to anybody else.
If it's used, it exposes that it's
being used through an alert, giving
you at least the IP address of the person trying to
use it. This is how you build them. This is one way to build
them. This is an approach. This is ggcanary. This is an
open source repository that GitGuardian built that
uses Terraform and AWS. But the concept is very straightforward.
Let's create users in the system that have no
rights whatsoever. If you are going to use this, I would advise building
this in an entirely different region than your
other tools or other
deployments on AWS, just for safety. Really isolate it as
much as possible. But you want a list of
users who have no permissions.
Let's take those users and build a Lambda
function that uses CloudTrail to watch for
those credentials being used. CloudTrail logs that event
in an S3 bucket, and from that S3 bucket the
Lambda does the triangulation:
does that name match one of
our honey token credentials from the list? If it
does, send either a Slack
message or an email using SES or SendGrid.
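
The matching step described here can be sketched in a few lines of Python. This is a minimal illustration, not the ggcanary code: it assumes the log file has already been read from the S3 bucket into a dictionary, and the registry contents, function names, and alert text are all hypothetical. The field names (`Records`, `userIdentity.accessKeyId`, `sourceIPAddress`, `userAgent`, `eventName`) follow the CloudTrail record format.

```python
# Hypothetical registry: access key IDs of decoy users that have no permissions,
# mapped to the single place each one was planted.
HONEY_TOKENS = {
    "AKIAIOSFODNN7EXAMPLE": "ci-pipeline",
}


def find_honey_token_hits(cloudtrail_log, honey_tokens):
    """Return one hit per CloudTrail record whose access key is a known decoy."""
    hits = []
    for record in cloudtrail_log.get("Records", []):
        key_id = record.get("userIdentity", {}).get("accessKeyId")
        if key_id in honey_tokens:
            hits.append({
                "access_key_id": key_id,
                "placement": honey_tokens[key_id],
                "event_name": record.get("eventName"),
                "event_time": record.get("eventTime"),
                "source_ip": record.get("sourceIPAddress"),
                "user_agent": record.get("userAgent"),
            })
    return hits


def format_alert(hit):
    """Build the message text; delivery (Slack, SES, SendGrid) is left out."""
    return (
        f"Honey token {hit['access_key_id']} planted in {hit['placement']} "
        f"was used: {hit['event_name']} from {hit['source_ip']} "
        f"({hit['user_agent']})"
    )
```

In a real deployment the function would run inside the Lambda handler and hand the formatted message to whatever notification channel is configured.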
This is ggcanary, the product, out of the box, open source. This is
exactly how it works. Quick note, you will not find this exact diagram
inside the repository. This comes from a blog post. If you
just google GitGuardian ggcanary, it's the first
blog post that pops up about it from the GitGuardian blog. But this
is the idea. Is this the only way to build them? Absolutely not.
This is how we built this one. And that gets me to my next point.
Honey tokens can be built by hand. There are a
lot of open source repos we're going to talk about,
and then there's a lot of stuff off the shelf. Just
really quickly, there are a lot more open source options
than what I'm showing here. But these are the main ones that I
drew inspiration from for this talk. But before that,
there's the idea that you can just build these, now that
you have the concept in your head. There are a lot of ways
to approach this. If you have some kind of logging and
some kind of alert system to tell you someone's trying to use it,
you can build it to your imagination. We showed you the
diagram we used for ggshield, and you can go see the code
for ggshield. Not ggshield, I'm sorry: ggcanary.
Look at ggcanary and tear it apart and see how it works. And if
you like Terraform and AWS, maybe that's the right one for you.
SpaceSiren is another one, with a very interesting history.
It's forked off of something called SpaceCrab. Very interesting project.
I'd highly recommend going out and checking that history for fun,
if you like researching security histories. But it turned into
SpaceSiren. That's the modern, still-maintained thing
today. It uses AWS directly.
If you have a little bit of AWS know-how, you'll do fine with it,
but it's a jumping-off point, I would say. And then Thinkst Canarytokens:
if you can deploy it in Docker, and if you want to maintain your own
infrastructure and run this yourself, that's a
good one too. They all work. It's just a matter of
what do you want to support and how do you want to build it.
And that how do you want to support it is a very important question,
because if the answer is I don't want to support this, I just want to
use it, well, then you're going to start using the commercial
options, and there's a lot of them out there. The free one that I
think everybody should start with, if you're new to this idea and you've never seen
a honey token in action, is
canarytokens.org. Go make a honey token,
a one-off honey token, and see how it works.
You can produce not just an AWS credential,
but you can make a fake credit card, a fake SQL server or
SQLite file, a fake PDF, a fake email,
and they're not real. But if someone tries to use it for any reason,
you get an alert, and you can see what that alert looks like through their
system. It's really cool, but it's a one-off.
And if you're working on one or two projects, or there are one or two places you'd
ever want to put a canary token, it's a really good free option. If you
want to do that at scale because you're an enterprise, they sell that. It's called
canary.tools. You can go to that website and Thinkst will
gladly sell you a commercial version of this at scale.
If you're a GitGuardian customer or you're planning to be a GitGuardian customer,
or it sounds like a good idea to you and you want to use GitGuardian,
we make one too. It's called the GitGuardian
honey token module. We have a platform play, so the module is
the add-on. It does require you to have a GitGuardian account,
which is free for individual users, teams up to 25,
and for open source use. But this isn't a good fit for open source use,
and I'll talk about why here in a few slides.
If you're a Microsoft fan, they have this built into Sentinel,
if you're using Azure. I don't know a lot about it
other than there are documents for it, but your mileage may vary.
Go dig through the documents and talk to your rep if it sounds interesting
and you're already using Sentinel. I wouldn't say go use Sentinel just for this yet,
though there are a lot of great reasons to use Sentinel. If you're a
CrowdStrike customer, they've got one too. Go talk to your reps. I have no
idea what it actually looks like, and I don't know what it's called internally,
but I do know there was a blog post about them having honey tokens.
Proofpoint, which I have talked to: really interesting company,
very broad play that they have. But one of their many,
many tools is Identity Threat Defense Shadow,
which is a honey token play. It's more of a honeypot
play, but you can use it as a honey token system as well.
But like I said, there's tons of options. If you're already a customer of any
of these companies, go talk to your rep. It might even be a free
add-on up to a limit, but your mileage may vary on
all that. So now we've talked about how to get them, how to build them,
generally what they are, how to architect them. How do
you use these things? Well, we think there are some best practices around this.
Put honey tokens in private environments:
anywhere where someone outside of your organization shouldn't
have access, or anyone outside of a team shouldn't have access.
That's a good place. So, your private code repositories:
then you know if someone gets in, you have a breach on your hands,
and that's not good. Same thing with your CI environments, like we saw in the real-world
example of CircleCI. That's a real tweet.
You can go and find that. A third-party researcher says,
hey, all my honey tokens went off. Something's going on with CircleCI.
And he knew before the announcement came
out. Your messaging systems,
your project management systems, anywhere internal:
there's no legitimate use for these things. So if someone internally does find one
and just uses it just to use it, that's a whole different conversation
than a breach. But it's still a security concern. Like, why is this person doing
that? It's an educational moment at best,
and it's a breach at worst. Put them
in your vault systems, because if someone breaches your
vault, that's a very bad day. That means they have access to
literally everything, and you don't want that. Back to my point earlier on
open source use: you don't want to put these in public places, and the whole
nature of open source is that it's public. The main reason
why is that not just these
platforms, but a lot of platforms in the world, good and bad,
and a lot of bots out there are constantly scanning the Internet,
trying to find hard-coded credentials that they can
harvest. They're also looking for other things, but a big
thing that they're doing now is, let's find and validate these credentials.
And if you put them in a private repository and
all of a sudden a public scanner hits it, you know you've got a leak
on your hands. If you put it in a private environment and all of a
sudden a public scanner hits it, something's gone horribly wrong,
and you know you need to deal with that right now and respond very quickly.
And that's the whole point of this: we can respond faster and cut
those dwell times, or those breach times and leak times, down
as much as possible and mitigate the situation as
best we can, as fast as we can. Use a one-to-one ratio.
This is another huge time saver. You might be
tempted to say, I have this one honey token, I'm going to put it everywhere.
Then if it goes off, you don't know what triggered
it or what specific repo or environment has been
infiltrated or leaked. So if you have 100 repositories
and a Jira instance, and you put the same honey token everywhere and it goes
off, now you have to triangulate which of those caused
that. Keep it real simple: put it in one place and
then create a new one to put somewhere else. Pretty straightforward.
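
One way to picture the one-to-one rule is a small registry that refuses to reuse a token or a location, so that any alert resolves to exactly one place. This is a hedged sketch only: the class and method names are hypothetical and do not come from any of the tools mentioned in the talk.

```python
class HoneyTokenRegistry:
    """Tracks which single location each honey token was planted in."""

    def __init__(self):
        self._placement_by_token = {}

    def register(self, token_id, placement):
        # Reusing a token in a second place would make alerts ambiguous.
        if token_id in self._placement_by_token:
            raise ValueError(f"{token_id} is already planted elsewhere")
        # One token per place keeps the mapping strictly one-to-one.
        if placement in self._placement_by_token.values():
            raise ValueError(f"{placement} already has a honey token")
        self._placement_by_token[token_id] = placement

    def locate(self, token_id):
        """Answer the question an alert raises: where was this token planted?"""
        return self._placement_by_token.get(token_id)
```

With a registry like this, an alert for a given token ID answers "which repo or system leaked?" with a single lookup instead of a search across a hundred repositories.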
Do think in terms of scaling this with automation rather than
doing this as a one-off exercise. Again, if you
have only a couple of small places to put them and
you're done, good. But if you're thinking,
I have a whole enterprise to secure, I have hundreds of repos and I have thousands
of developers, and I have way too many internal systems to
count, then you're going to start thinking, how do I spin
these up and put them somewhere? That example down below,
I'm not sure how useful it is. I keep meaning to document it better,
but it's just a simple script that shows, hey, here's a tool to create a
honey token, and here's some logic to insert it into
a git repo. That's all it is, but it's just a jumping-off point
to, okay, that's how we can think of automation.
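
The script shown on the slide is not reproduced in this transcript, so the following is an independent sketch of the same two steps, create a token and insert it into a repository checkout, under stated assumptions. `create_decoy_key_pair` is a hypothetical stand-in for a call to a real honey token backend: the strings it generates only look like an AWS key pair and are not registered anywhere, so on their own they would never raise an alert. The file path and function names are also assumptions, and committing the change is left out.

```python
import secrets
import string
from pathlib import Path


def create_decoy_key_pair():
    """Hypothetical stand-in for a honey token service call."""
    alphabet = string.ascii_uppercase + string.digits
    key_id = "AKIA" + "".join(secrets.choice(alphabet) for _ in range(16))
    secret_key = secrets.token_urlsafe(30)
    return key_id, secret_key


def plant_token(repo_dir, relative_path="config/settings.py"):
    """Append a fresh decoy key pair, as comments, to one file in a checkout."""
    key_id, secret_key = create_decoy_key_pair()
    target = Path(repo_dir) / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("a", encoding="utf-8") as handle:
        handle.write(f"# aws_access_key_id = {key_id}\n")
        handle.write(f"# aws_secret_access_key = {secret_key}\n")
    return key_id


def plant_in_all(repo_dirs):
    """Plant one new token per repository and return token ID -> repository."""
    return {plant_token(repo_dir): str(repo_dir) for repo_dir in repo_dirs}
```

The returned mapping is the piece worth keeping: it records which token went where, which is what makes a later alert actionable.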
You are probably on the blue team if you're watching this, or you might be
someone who's just interested generally in security.
Unless you are specifically a law enforcement agent,
don't go after these people. When these go off, you'll get an
IP address and I'll show these going off here in a little
bit, but know that your job is really to protect
your stuff. So think in terms of I got to get
this IP address out of here, I got to make sure that the breach is
stopped, I got to make sure that anything someone got into is secured,
and any credentials that get leaked get rotated. Think in those terms,
not I'm going to go hunt these people down and stop them.
Because the truth is, that's not your job, unless specifically
you are tasked with doing that. Then good luck.
And like with everything else in computer science, like with every other technology,
this is a journey. It's not a one off exercise.
If you treat it like a one off exercise, you'll get some return on investment,
but you'll burn yourself out trying to do it all at once,
or you will do a couple of places and just never think about this again.
So start thinking about honey tokens. If you're going to deploy them,
if you're going to embrace this as a strategy, which I highly encourage you
to do, think long term and think, how do we
do this at a regular pace? Is this a
once-a-week thing? Is it a once-a-quarter thing? Is it a once-a-sprint thing?
If you added one new honey token per
sprint, and your average sprint is three weeks, then you're
going to get a lot of coverage in a year's time. And hopefully they never
go off. Once they're set, they're set until that AWS
instance completely vanishes or until you take them down.
It's a good time to go back and check and see what happened to that
honey token, because that's what I did earlier. I didn't explain myself, but what I
did is I took a honey token I created through the GitGuardian
honey token module, which is an AWS token,
an AWS key ID and access key pair,
and I added it as a comment to a public repo that
I originally created back in July. I'm recycling an old repo, I'll be honest with
you. So over here in my email, I know it's a little hard
to see, so I'll make it a little bit bigger. See that? Hey, I have
these alerts that Conf42 was triggered.
My honey token for Conf42 that I created, that's literally
what I deployed, got triggered 26
minutes ago. I deployed this 26 minutes ago.
Let's go look at the honey token itself. And I can see
all of these, twelve so far:
system scans have triggered this. If we look at the
first one, the first one was AWS itself.
AWS knows this is a problem. They are constantly scanning
the Internet trying to see, has anyone leaked one of our credentials?
And that has fired off some internal stuff at AWS.
I might get an email about that. An email will
have been sent to someone about that. Well, that's us,
that's GitGuardian. We are constantly scanning as well, because we put this on GitHub public.
And then we see, oh, here's two people in the US that
use TruffleHog as the user agent, and there's
a person in India who did the same. They scanned
and validated. So just the act of reading
isn't going to trigger this. They tried to validate, they tried to use it.
They were calling GetCallerIdentity, which is a very common
call to make, just to ask, who am I? Just
to see if this worked. Don't worry, these all came back invalid,
but they were read from here immediately.
So that's what that looks like. So what do I do about that? Well,
in this case, what I'm going to do about it is I'm simply going to
revoke the honey token. It's been triggered,
so it's not that useful anymore. People are going to mark it
as invalid if they're trying to put it on a list, and then I'm going
to clean up after myself and just get rid of it here. But this is
what I'm doing. I could rewrite my history here
and remove all instances of that, but it was
never good anyway. So there's no harm in leaving it in my code base,
especially my code history. If it was a real credential, I would have cleaned it
out of my code base, my history, my git history. But again, this is
a honey token, it doesn't really matter. So in conclusion, wrapping things up,
if we just think in terms of defense, it's really hard
to win if you only play defense. What we need to
do is start actively kicking people out faster,
cut the dwell time, put the attacker on their heels,
make them think, wow, I don't know what to do next because the second we
do that, we win. Because as soon as they don't know what
to do, as soon as it no longer matches the playbook they downloaded that they're
using for an attack, they're going to run out of ideas and
go somewhere else where that playbook will work.
So we need to act quicker and kick
them off the network and make them think, I don't know how to
further engage. If it's an advanced persistent threat, like
the Lazarus Group or North Korea attacking you, I don't know if this
will work, but I'd say for the vast majority of attacks, this is
going to work great. It's figuring out who they are and booting them out.
Honey tokens are decoys. They look real because they are real, but they just
don't give any access. But they give you an alert. You should use
them wherever you've got private data and wherever you've
got private code. There's a lot of options.
You can build them yourself. There's tons of great articles and
resources out there if you're going to go the DIY route. And of course there's
commercial options. The ones I showed are just the tip of the iceberg. There's a
lot more solutions out there. I am personally kind
of skewed toward GitGuardian because I work here. But there's a lot of good
options. And don't put them in public unless you are specifically trying to
just gather IP addresses from people scanning you. And there's
not a lot of use to that in all reality, because again, your job should
be to defend, not to actively go after people that are scraping
the Internet. Because there's a lot of people scraping the Internet.
Anyway, I've been Dwayne. I live in Chicago. I've been a developer advocate
since 2016. Check out the Security Repo Podcast. Highly recommend it.
We have a lot of great guests and talk about a very wide
range of security subjects. Hit me up on the Internet: I'm mcdwayne
pretty much everywhere, including GitHub. And if you ever want
to hit me up about anything, you can reach out to me at dwayne.mcdaniel@gitguardian.com.
Thank you so much for listening. I had a blast making this, and
I hope that everybody has a great rest of your day and enjoys all the
amazing content here at Conf42.
Thank you.