Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hey everybody, welcome to my talk, Stop Committing Your Secrets:
Git Hooks to the Rescue, here at Conf42 DevSecOps
2022. So, I'm Dwayne, I live
in Chicago, been a developer advocate since 2016, and you can
find me on Twitter at mcDwayne. I'm happy to talk to you about anything we
talk about today and other stuff besides, like improv,
karaoke, and rock and roll. I work for a company called GitGuardian.
They'll come up later in this talk as well. But we are a
security platform mostly concerned with secret detection,
secret sprawl detection and remediation, as well as infrastructure as
code and some other areas as we've progressed.
Really what we're talking about today is the eternal
cat-and-mouse game of hackers trying to get at your data
and your information. One of the ways that
they have to get in is through finding hard coded
secrets laying around. And just to put this in perspective,
a few case studies here. Uber, they got
owned by a teenage hacker earlier in 2022.
He phished for initial access rights,
then immediately found a lot of hard
coded credentials in PowerShell scripts. Then he went to the New York Times and said,
hey, look what I did. Toyota. They had
a secret data server key
in a public repo by accident for about five years because
a subcontractor pushed something public that shouldn't have been public.
296,000 customers have been affected.
Samsung, 160 gigs of stolen data. And when
that was pushed out to the public, it was discovered that over 6,000 keys,
API keys, passwords, credentials, over 6,000 secrets had been hard
coded throughout their code base. No proof that it
did lead to a second data breach reported
in September. But odds are it didn't help things.
AstraZeneca, this is a very recent case.
They pushed hard coded credentials to a test environment,
and then through user error, as they're calling it, actual patient
data ended up on that test environment.
It's kind of a nexus of a lot of bad things happening at
once. The credentials were exposed for over a year. We're not sure
at this point in time, as of this recording, exactly the ramifications
of this and exactly how many customers were affected.
You can have the best security setup in the world and the best security
teams, but if you just leave those keys out there, it's pretty easy
to get in. And we probably wouldn't do
this. That's a little bit of a silly example. Whatever's behind that door can't be
that valuable. Like maybe staplers or binders or, I don't know,
but while we don't do this, developers
do this, and not for malicious reasons. We need to test to
see if that API endpoint is up. We need to test if that credential actually
works. So we hard code it and if we immediately
take it back out then there's no issue.
And the problem is though that we do leave these in
and by the time we think to take them out, it's far too late.
Far too late meaning that we've already shared it out there on GitHub or
GitLab or wherever our remote servers are.
In 2021, at GitGuardian, we found over 6
million keys, credentials,
secrets just laying in public repos. This is a
huge increase over the previous year. We put this
out in our State of Secrets Sprawl report. You can read this if you
want. The disturbing thing here is that
this has actually increased year over year.
We're finding about three out of every thousand commits contain something
that shouldn't probably be in there, some kind of secret. And this is just in
the public. So who's responsible for all this? Well,
ultimately everyone is. If you touch the software development lifecycle,
you are responsible for making sure your secrets don't end
up in the wrong place. And if it's just the security team,
we're never going to win this fight, even in the best organizations.
Alex Rice from HackerOne tells us it's 100 to one.
The security team is outnumbered by developers. So we
really need to shift it to the developers who are at the forefront of this
battle. But do it with a tool they already have access to that they
already love. Developers love and hate git,
but we use it day in and day out. I say almost all developers there
with the asterisks, last time I checked it was like 93.6% of all
developers touch git on a daily basis. And git's awesome.
But it doesn't make your code more or less secure in and of itself.
It's the stupid content tracker. It does what it does exceptionally well.
It does give you a way to add some security.
.gitignore: please use .gitignore files. Tell Git
to ignore certain types of files or certain paths,
and then we can start storing our access keys in
places like secrets.json or AWS file
directories, places that we're never going to check into
our source control. And if we
combine that with things like HashiCorp Vault or Azure Key Vault,
I'm just throwing out two names, not trying to plug either.
But then we don't need to hard code secrets anymore and we can prevent them
from getting into our source control. In a perfect world,
that's the end of the talk and we go about our day. But we
haven't solved this problem of hard coding secrets, and we saw the data that
says this is a growing problem, not a shrinking one.
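To make the keep-it-out-of-the-repo idea concrete, here is a minimal Python sketch of reading a secret from somewhere that never gets committed. The function name `load_secret`, the environment-variable-first lookup order, and the JSON layout are illustrative assumptions rather than anything shown in the talk; the `secrets.json` file is assumed to be listed in `.gitignore`.

```python
import json
import os
from pathlib import Path


def load_secret(name, secrets_path="secrets.json"):
    """Return a secret from the environment, else from an ignored JSON file.

    Assumes secrets_path is listed in .gitignore so it is never committed.
    """
    value = os.environ.get(name)
    if value:
        return value

    path = Path(secrets_path)
    if path.is_file():
        data = json.loads(path.read_text(encoding="utf-8"))
        if name in data:
            return data[name]

    raise KeyError(f"secret {name!r} not found in environment or {secrets_path}")
```

Application code would then call something like `load_secret("PAYMENTS_API_KEY")` (a hypothetical name) instead of embedding the value in source. A vault client could replace the file lookup without changing callers.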
So I personally don't believe that the issue is that we've tested
a secret. We have to test secrets. We just have to sometimes. The
problem is that we forget to take them back out and then we push those
secrets somewhere and then we have a bad time.
In theory you can remove a secret from a pushed
commit, but it's not easy, and it
can be downright painful in a number of ways. One, just physically figuring
out where all that secret went, to what all branches,
but also you got to talk to your team now. You're going to need to
rotate keys somewhere. Someone's going to have to stop what they're doing and
go deal with this now. And no one's going to look good in that process.
Again, painful. What we need is a robot.
A robot that reliably, every time we try to commit a secret,
just stops us. And Git gives us a way to build
that robot. And again, developers love git.
So here we go. Git hooks
is an automation platform built into Git that I think is
wildly underused out there. We can use it for so many awesome things,
but at the heart of it git hooks is this.
You can build your own contraptions that when git does a thing, it will fire
off one of your scripts. That's pretty much it.
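As a rough illustration of how little is involved, here is a minimal Python sketch that installs a hook. It relies on Git's default behavior of running an executable file in `.git/hooks` whose name matches the hook (a repository that sets `core.hooksPath` would look elsewhere). The helper name `install_hook` and the example script are hypothetical.

```python
import stat
from pathlib import Path

# A trivial example hook body; Git runs it after each commit is recorded.
EXAMPLE_HOOK = "#!/bin/sh\necho 'commit recorded'\n"


def install_hook(repo_dir, hook_name, script_text):
    """Write script_text to .git/hooks/<hook_name> and mark it executable."""
    hooks_dir = Path(repo_dir) / ".git" / "hooks"
    hooks_dir.mkdir(parents=True, exist_ok=True)

    hook_path = hooks_dir / hook_name  # no file extension: the name is the hook
    hook_path.write_text(script_text, encoding="utf-8")

    # Git skips hook files that are not executable.
    mode = hook_path.stat().st_mode
    hook_path.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return hook_path
```

Calling `install_hook(".", "post-commit", EXAMPLE_HOOK)` from the root of a repository would make every later `git commit` print the message.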
There are 17 hooks available. Go over to githooks.com,
bookmark that. It's a great resource for building all sorts of cool
things with Git hooks. But the ones we're really concerned about
are the ones that stop us from making the commit in the first place. Those are the
pre-commit hooks. And additionally, from the
server perspective, from our git remote perspective,
whoever owns that can stop those commits from even getting there in the first place.
Because we can start using pre-receive hooks to say, well,
if a secret is hard coded in here, don't even let it on board,
let's just stop it where it is.
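The server-side version can be sketched in Python as well. Git feeds a pre-receive hook one line per updated ref on standard input, in the form `<old> <new> <ref>`, and rejects the push if the hook exits non-zero. Everything else here is an assumption for illustration: the single `AKIA...` pattern, the helper names, and the choice to scan `git log -p` output.

```python
import re
import subprocess
import sys

# Hypothetical single pattern, for illustration only.
SECRET_PATTERN = re.compile(r"\bAKIA[A-Z0-9]{16}\b")


def parse_updates(stdin_text):
    """Parse pre-receive input: one '<old> <new> <ref>' triple per line."""
    updates = []
    for line in stdin_text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            updates.append((parts[0], parts[1], parts[2]))
    return updates


def is_zero(object_name):
    """An all-zero object name marks a ref being created or deleted."""
    return set(object_name) == {"0"}


def revision_args(old, new):
    """Arguments selecting the commits an update adds; None for a deletion."""
    if is_zero(new):
        return None
    if is_zero(old):
        return [new, "--not", "--all"]  # new ref: commits not reachable yet
    return [f"{old}..{new}"]


def patch_adds_secret(patch_text):
    """True if any added line in the patch text matches the pattern."""
    return any(
        line.startswith("+")
        and not line.startswith("+++")
        and SECRET_PATTERN.search(line)
        for line in patch_text.splitlines()
    )


def main(stdin_text):
    for old, new, ref in parse_updates(stdin_text):
        args = revision_args(old, new)
        if args is None:
            continue
        result = subprocess.run(
            ["git", "log", "-p", "--format=%H", *args],
            capture_output=True, text=True, check=False,
        )
        if patch_adds_secret(result.stdout):
            print(f"rejected {ref}: possible hard-coded secret", file=sys.stderr)
            return 1
    return 0

# When installed as hooks/pre-receive, end the file with:
#   raise SystemExit(main(sys.stdin.read()))
```

Because it scans the patches of every incoming commit, this sketch also rejects a push where a secret was added in one commit and removed in a later one, which is usually what you want since the secret is still in history.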
And git comes with examples for all this. Unfortunately,
they're really kind of hard to parse if you're not familiar with git
very deeply, like rev-parse, and you don't
have the exact use case that Linus Torvalds or gitster do to
manage a large project. Fortunately, though,
scripts are just scripts. You can make them do anything. You can write them in
whatever language you prefer, whatever scripting language you can
access in your environments. This is actually something I do in my
personal projects. I make my git logs
tell me jokes. So when I do git commits, they spit out
a dad joke at me. Props to Ed Thomson for building
this into git-dad and making the code open source.
So an ideal solution would look like this. If we're building that robot,
every time I go to make a commit, every time I
go to push that commit around, we should have
something check to see if a credential is in there, if a secret is in
there. If it is, throw an error and don't let me make that
commit. You can build this yourself.
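A home-built check along those lines might look roughly like the following Python sketch of a pre-commit hook. It reads the staged changes with `git diff --cached`, looks only at added lines, and returns a non-zero status so Git aborts the commit. The single "password" pattern and the function names are assumptions chosen for illustration, and a real check would need many more patterns.

```python
import re
import subprocess
import sys

# Illustrative pattern only: an assignment to something named "password".
PASSWORD_PATTERN = re.compile(r"password\s*[=:]\s*\S+", re.IGNORECASE)


def staged_diff():
    """Return the staged changes as diff text (empty if git is unavailable)."""
    try:
        result = subprocess.run(
            ["git", "diff", "--cached", "--unified=0"],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        return ""
    return result.stdout


def find_secrets(diff_text):
    """Return the file name for each added line that matches the pattern."""
    findings = []
    current_file = "?"
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            current_file = line[4:]
            if current_file.startswith("b/"):
                current_file = current_file[2:]
        elif line.startswith("+") and PASSWORD_PATTERN.search(line):
            findings.append(current_file)
    return findings


def main():
    findings = find_secrets(staged_diff())
    for path in findings:
        # Report the file, not the matched text, to avoid echoing the secret.
        print(f"possible hard-coded secret in {path}", file=sys.stderr)
    return 1 if findings else 0

# When installed as .git/hooks/pre-commit, end the file with:
#   raise SystemExit(main())
```

Scanning only added lines keeps the hook from blocking a commit whose purpose is to remove an old secret.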
Nothing to stop you if you've got enough time and patience. Git grep is
a great way to go about it. Grep is awesome. You got to know Regex,
but you can make it look for any kind of pattern here. This is what
an AWS access key pattern
looks like. It's 20 characters long and it contains all caps and nums.
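For illustration, that description translates into a regular expression like the one in this Python sketch. The 20 characters of capitals and digits come from the talk; anchoring on the `AKIA` prefix used by long-term AWS access key IDs is an added assumption to cut down on false positives, and the helper names are hypothetical. The second half builds a `git grep` command that searches staged content with the same expression.

```python
import re
import subprocess

# 20 characters of capitals and digits; the "AKIA" prefix is an assumption
# added here so that arbitrary 20-character uppercase tokens do not match.
AWS_KEY_REGEX = r"AKIA[A-Z0-9]{16}"
AWS_KEY_PATTERN = re.compile(r"\b" + AWS_KEY_REGEX + r"\b")


def find_aws_keys(text):
    """Return every substring of text that looks like an AWS access key ID."""
    return AWS_KEY_PATTERN.findall(text)


def git_grep_command(regex=AWS_KEY_REGEX):
    """Build a git grep invocation that searches staged (indexed) content."""
    # --cached: search the index, -n: line numbers, -I: skip binary files,
    # -E: extended regular expressions
    return ["git", "grep", "--cached", "-n", "-I", "-E", regex]


def staged_lines_with_keys():
    """Run git grep; it exits 0 when matches exist and 1 when there are none."""
    result = subprocess.run(
        git_grep_command(), capture_output=True, text=True, check=False
    )
    return result.stdout.splitlines() if result.returncode == 0 else []
```

As a quick check, `find_aws_keys("aws_key = AKIAIOSFODNN7EXAMPLE")` picks out the placeholder key ID that AWS uses in its own documentation, while lowercase or shorter strings are ignored.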
You can just make "password equals" the pattern you're looking
for. And yeah, sure enough, it will catch those things. The problem though is
if you build it, then you have to build it and
maintain it and deal with false positives and
know what's allowed and what's not allowed and start dealing
with all of the fine tuning
of it. And then to spread this to your team, to evangelize your team,
good luck. But hey, you're not the only
person facing this. A lot of other people have already tried building solutions
for this and I'm going to talk about a few of them today. Open source
to the rescue. Because open source is awesome and everybody should be using open source
tools wherever they can. I firmly believe so.
There's a lot of them. If you just Google solutions for this, for stop
committing hard coded secrets, prevent hard coded secrets, open source,
you're going to find a lot of solutions. I'm going to talk about the big
three that I think are the big three from my point of view
in the world.
Some tools have built-in ways to do this through
their proprietary offerings, but we're not going to talk about those
today. The big three, I think, are these. AWS Labs
has git-secrets. It lets you triple check, before you make the commit,
for anything that is an AWS-looking secret.
You can extend it and people have extended it, but it does require
a good amount of knowledge of regex and specifically the patterns
you want to look for. So if you're using Google Cloud or Azure DevOps,
you're going to want to know what those patterns look like pretty intimately to be
able to adapt it to your use case. And again, people have extended
it, so go out and definitely take a look at it if you're just
getting started with this, but that's where it stops. It triple checks those
before commit and then you're done. TruffleHog is a name you've probably run
into. It is an open source framework, mostly known for their GitHub
Action integration, but they do have a pre-commit hook integration.
It does require you to use the pre-commit framework, which is
awesome. It's an awesome framework, it's open source, it's cool, but it is
required. And also, some people
report that it has high false positives. Your mileage may vary.
I'm not here to give any judgment about it, just reporting what I have found
from my research. I've never actually used TruffleHog in production myself.
GitGuardian makes ggshield, which can be used
at the pre-commit level, the pre-push level,
and the pre-receive hook level to make
sure you're not passing around those commits if they do get
hard coded in. Now, it does require a GitGuardian account,
which is a platform you can use for free
for personal and open source use. There is an API limit to that,
a thousand calls a month. So if you're doing a lot of commits, this isn't
free, but for most people it's going to probably be free, especially if
you're working on open source projects. And I like it,
partially because it checks for 350 known patterns and
you can extend it. But again, extending it means regex fun.
So what does this look like in action? I'm not going to go through all
of them. They all pretty much look the same at the end. After installing
and configuring, all of them are straightforward to do.
I must say, the pre-commit hook setup is the only one that threw anything a
little bit tricky at me. Other than that they're all pretty straightforward
to get going, but they all do the same thing.
You hard code a secret somewhere, in this case a YAML file,
and I go to commit it. And in this case GitGuardian is the one
saying, hey, we found a secret here, and just stops.
It fails. And that's the
mission. That's what hooks is doing for me
every time. I personally don't worry about hard coding secrets anymore,
because if I accidentally do, I'm not even going to commit it in the first
place. So in conclusion, don't hard code your
secrets. If you do end up hardcoding
secrets, don't commit those secrets. And the best way
to do that is just use some automation and some off the shelf tools.
I love open source, and you should too.
So I've been Dwayne, I'm a developer advocate,
have been since 2016. Hit me up on Twitter for any of
the questions you have because, well, this is pre recorded and yeah,
you can reach me out there. And thanks again for coming to my talk,
Stop Committing Your Secrets: Git Hooks to the Rescue, here at Conf42
DevSecOps 2022.