Transcript
Hi there. I'm John Peck. I've been a software developer for about 25
years and now work in enterprise advocacy at GitHub.
And that means that I get to spend a decent amount of each week speaking
to some of the largest companies in the world about
all their software development problems. And security,
as you can imagine, is one of their biggest concerns.
But before I get into some of their troubles,
you might find yourself asking,
what does GitHub know about security?
Well, as the largest developer platform on the planet,
we house millions of public projects,
but also an immense number of private ones as
well. In fact, 90% of the Fortune 100 trust
us to manage and secure their code for them.
So as we've expanded GitHub to include all parts
of the development pipeline, from project planning to developer
collaboration, to cloud IDEs and AI
pair programming, we've also woven application security
right into that framework. This helps everyone from individual
open source developers to the largest enterprises,
know that their projects are secure from the day they're first dreamed up all the way to when they go out into production.
So why is it so important to secure our code from day one? First, let's take a
look at the state of application security today, and where
the industry has been headed. If we look back over the last four
to six years of application development, even today we see
a disturbing trend. The number of vulnerabilities continues to
increase linearly with the number of lines of code in
any given project. In other words,
every line of code written today still has the same risk
of introducing a vulnerability as it did in 2016.
More code, more problems. And by
the way, flaws in applications are consistently the
number one attack vector for breaches. So yes,
we need to keep doing all the other security stuff, network protection,
identity verification, all of that. But the biggest
risk is the code itself. And that's
tough, because in most projects, 80% of the
code comes from outside of your own development team, outside of
the company. Open source is great because it means we
are not wasting time constantly reinventing the wheel. Most projects wouldn't be possible
without open source, but at the same time, you can't control
how these open source dependencies were created.
You can't define their security standards or directly manage their
policies, so you're going to have to figure out how to secure them
at the point of ingestion, when they're incorporated into
your own projects. And what about those developers that are
on your team now? They're probably awesome, but chances
are they also haven't had sufficient security training, if any. Nearly half of developers haven't, so they're going to need some help ensuring that each line they write isn't
introducing new vulnerabilities. And we want to do that in a
way which does not slow them down. Which brings me
to my third point here. As companies introduce security tools,
they often do so in a way which slows down application development
and doesn't really do that much to improve security.
The vast majority of companies just bolt on additional tools
which introduce additional noise and friction into the
development process. And this creates a war between the security team
and the developers. Consider this common scenario. As a
developer, I've just completed all my feature points and I've sent
my code off to production. So I'm at the end of my
two week sprint. I'm ready to refocus on the next set of awesome features I'm
going to be adding to the product. But then security comes along.
They push back with a huge code review and a ton of potential vulnerabilities
I need to patch up. And my next sprint is just trashed.
I won't be building those new features. Instead I'm going to be
wasting all of that time unraveling code that I wrote two weeks ago,
patching it up, reintegrating it all along with all of my teammates' changes, and then disappointing my
project manager. Not so good.
So this happens all too often,
and it's the reason so many teams push back on security
policies. And it's why so many companies end up releasing vulnerable code. Because in fact, when they hit that decision point, either release a product with potential vulnerabilities or stop the presses and patch it up, half of companies choose, more or less, to go ahead and release.
There's customers waiting on those new features and it's just not worth disappointing
them. That's a logical business decision,
but it's risky and it's a potentially very expensive one,
because the longer you wait to fix a vulnerability, the more painful it
is. If you can identify and patch a vulnerability as
it's being coded, that's quite cheap. It's basically the developer's time to
get the notification, change a few lines of code, but it costs
ten times as much if that vulnerability gets as far as QA, and 100 times as much if it goes out into production. Because now
you're rolling back old versions of the product, you're notifying customers, you're integrating
months old code with new patches. And if somebody
discovers and exploits that vulnerability that made it to
production, well, that's the sort of thing that brings down
entire companies, it bankrupts them or destroys customer trust
for decades to come. So how
can we go about shifting left? How can we avoid all this pain, cost, and delay, and ensure vulnerabilities never progress
beyond that development stage?
Put that way, the answer seems pretty obvious. We need to put the power to identify and fix security flaws right into the hands of the developer. We call this developer-first
security. And to do it properly, you need a couple of key things.
First, you need to be able to see, to observe a large number of projects, to deeply understand not just their security state, but also how developers interact with those projects: where it's effective for them to receive and act on security-related information, what kind of nudges they need, what level of detail they
need. Second, you need to already be
the key tool that most developers work with day in and
day out as they build their projects, so that you fit naturally
into their daily workflow, instead of being yet another bolt-on tool that they have to figure out how to work with, or more often,
how to ignore. So we at GitHub saw not just an opportunity, but really a responsibility to
help developers find out about and fix security flaws
right as they were being created. When developers start building a new set of features, they create a new branch of the codebase, a variation that isn't part of the final product yet, but it soon will be. It's our responsibility to make sure that before that branch ever gets merged back into the main code, it's vulnerability-free. We do that in
three key ways. First, we scan
every single change to the list of dependencies
they're bringing in, pretty much regardless of what language
they're working in. So whether a developer adds a risky Node module to the manifest or uses an insecure version of PHP in their Dockerfile, we immediately notice, and we prompt them to fix it. This is impressively effective. We found that dependency-based vulnerabilities are fixed four and a half times faster than average when you use this approach.
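By the way, if you want to see what that looks like in practice, here's a minimal sketch of a .github/dependabot.yml file, assuming the current Dependabot v2 configuration format; the ecosystems and schedules here are just illustrative choices, not a recommendation:

```yaml
# .github/dependabot.yml -- a minimal sketch, assuming the current
# Dependabot v2 configuration format. Ecosystems and schedules are
# illustrative only.
version: 2
updates:
  # Watch the npm manifest at the repository root and open update
  # pull requests when a dependency is outdated or vulnerable.
  - package-ecosystem: "npm"
    directory: "/"
    schedule:
      interval: "daily"
  # Do the same for base images and versions in the Dockerfile.
  - package-ecosystem: "docker"
    directory: "/"
    schedule:
      interval: "weekly"
```

With a file like this checked in, those fix-it prompts show up as ordinary pull requests in the repository, which is exactly what makes them easy to act on.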
Now, dependency scanning isn't magic, so what makes that improvement so strong? Three key things. One, the dependency update notifications come from a tool developers already know and trust, and are used to responding to: their DevOps platform itself, in this case
GitHub. Two, the notifications
are immediately actionable. They're not just telling the
developer what's wrong, they're actually surfacing as a pull request,
which means the developer only has to click a single button
to fix the vulnerability. And then
three, even then, we know some developers may be hesitant to click that button to merge the fix, because
they're wondering how much cleanup they're going to have to do. Does this update
change function signatures or change the behavior of the package
that we're upgrading? Am I going to have to spend a few hours updating
my code to accommodate that change?
So to address that, we don't just give them the minimum
possible change in order to get them secure. We do that, of course,
but we also provide a compatibility score which
lets them know how likely this is to just work
with no further changes. This is something that GitHub is really uniquely positioned
to do because we've got over 200 million projects
out there on GitHub.com, and many of them have given us permission to see, as they take security updates, how many of their unit tests keep on passing. We can use that information to calculate a compatibility score and then put it right into the update notification when a security patch is required on your own project. That gives developers the confidence, when that compatibility score is high, to just go ahead and merge that security update right
away. That greatly increases compliance and helps keep
your projects secure.
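GitHub doesn't spell out the exact formula in public, so treat this as my own rough sketch of the idea rather than the real computation, but conceptually the score behaves something like:

$$\text{compatibility} \approx \frac{\text{projects whose test suites still passed after taking this update}}{\text{projects that took this update and ran their tests}}$$

The more projects that have already absorbed a given security update with their CI staying green, the more confident you can be that it will just work in yours.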
All right, now let's move on to another type of check.
GitHub scans every single push for security
tokens. Now, I know, I know, when I say security tokens, you're going to say,
well, my company already has key managers and secret stores,
so no piece of raw code should ever have a security token embedded directly in it, right? But we're
all human, we've all done it. At some point,
using the secret manager was just too slow or too annoying,
and we directly pasted a security token right into the code we're working on, thinking, okay, I'm going to remove it before I commit. But then we forget. This is a common mistake that almost all developers make, and GitHub
doesn't ever want those tokens to get compiled into an end
product where they could be leaked out into the world.
So it prevents this by blocking these tokens from
ever getting off of the individual developer's box.
If a token is dropped into code and then the developer attempts to
push that code up to GitHub, we immediately block it.
And then we let the developer know that they've either got to remove
that token from their code or they need to get it added
to an allow list before we will allow that code to be submitted.
And of course we also scan all the historic code as
well.
So in secret scanning, one of the biggest problems is that the false positive rate can get really high if you misidentify strings which aren't secrets, for example, blocking a developer from putting a GUID or a hash seed or some other random character string into their code. When this happens and we block those, even though they're not really secrets, developers just can't do their work. So admins end up turning off secret scanning entirely, and it's pointless, right? So what we chose at GitHub was not to just look for entropy, randomness, in those character strings. Instead, we actually worked directly with dozens of different secret providers and built extremely precise pattern matchers for each of their individual types of tokens. What does that do? It brings that false positive rate
way down, and it brings the effectiveness way up
because our scanning is trustworthy. Oh, and by the way, if you
want to scan for your own custom secret patterns, you can test and add
these right inside of the tool as well.
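A custom pattern is essentially just a precise regular expression for your organization's own token format. As a sketch (the YAML layout below is illustrative shorthand, not an actual GitHub config file; custom patterns are defined and tested in the repository or organization security settings, and the Acme token format is hypothetical):

```yaml
# Illustrative shorthand only -- custom secret scanning patterns are
# entered and tested in the settings UI, not checked in as a file.
# "Acme" and its token format are hypothetical.
name: "Acme internal API key"
# A precise pattern for tokens shaped like acme_<32 hex characters>,
# rather than a generic "looks random" heuristic.
pattern: 'acme_[0-9a-f]{32}'
# A sample string used to verify the pattern matches before enabling it.
test_string: 'acme_9f2b4c6d8e0a1b3c5d7e9f0a2b4c6d8e'
```

The same principle applies as with the built-in patterns: the more precisely the regex describes the real token format, the lower the false positive rate.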
All right, now, there's one last type of check
that we think is critical. But before I go into detail on that,
I want to tell you a little story. You see,
in 2011, NASA began one of its early Mars
exploration programs, Curiosity. This was a two
and a half billion dollar project and promised gigantic
advances for science. But after the mission had already
launched, they were still running manual code reviews at NASA, and they discovered that their developers had made a critical error which could prevent the rover's parachutes from deploying, literally crashing the program. Now, fortunately, they found this bug while the rocket was still in flight. And, I didn't know this was even possible until I heard this story, they were actually able to send a patch over the air, well, over space, to the rocket and have it update the code on that rover before it ever reached Mars. This saved the mission, saved a two and a half billion dollar project. And the security tools they used to do that, the ones that are now part of GitHub, automatically found 30 different variations of that same flaw that the manual review had missed, and patched those with the same over-the-air update.
What, you might ask, is this magic tool? It's something called CodeQL. It's a language which lets GitHub look not just at the actual text, the actual structure of the code, but at the meaning of the code. It examines the new code your developers just wrote, as well as all the components they've added in, and compiles it all into something that could be executed but isn't.
And it lets us say things like, does there exist
anywhere in this code base a circular object reference
or a place where text comes in through an API
or a form or some other entry point, which then goes through some
execution path and eventually hits a database without being sanitized? Which, of course, would allow an attacker to penetrate the database.
Right? There's over 2000 different code
queries like this which ship with GitHub
code scanning, and they cover the whole OWASP Top Ten and
beyond. When you
put your code in, GitHub code scanning
compiles that code and runs these tests at every pull
request, tracing through the actual execution paths
of your program and analyzing them for bad patterns, then alerting the developer and, if needed, blocking that pull request.
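For reference, hooking this into every pull request is typically just a short workflow file. Here's a minimal sketch, assuming the GitHub Actions based code scanning setup; the language matrix and action versions are illustrative:

```yaml
# .github/workflows/codeql.yml -- a minimal sketch of running CodeQL
# code scanning on every pull request. Language and versions illustrative.
name: "CodeQL"

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  analyze:
    runs-on: ubuntu-latest
    permissions:
      contents: read          # needed to check out the code
      security-events: write  # needed to upload results to code scanning
    steps:
      - uses: actions/checkout@v4
      # Initialize CodeQL and choose which languages to analyze.
      - uses: github/codeql-action/init@v3
        with:
          languages: javascript
      # Build the CodeQL database and run the query suites against it.
      - uses: github/codeql-action/analyze@v3
```

Any findings surface as checks on the pull request itself, which is what lets GitHub alert the developer or block the merge right there.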
So with all of these in place, your secret scanning, your dependency scanning, your CodeQL code scanning, developers know that the code they bring in, as well as the code they've just written, is secure long before it approaches production. They don't have to fear facing some mile-long vulnerability list two weeks
after they've finished coding. By the time they've finished developing
the particular feature they're working on, the code is already secured
and ready to go.
That does wonders for your development pipeline. It means that
you spend less time stopping production,
more time building features. Ultimately, it means that you ship features
thousands of times faster out to your customers, all the
while remaining secure. So what I encourage you to
think about is how right now you are going about
implementing a developer-first approach to your security,
and how you can guarantee that your
code is vulnerability-free before it ever reaches that main
branch of code, so that your developers don't have to be at
odds with the security team.
And of course, if you want to learn more about GitHub's
general approach to security, just head on over to github.com/security. There you can read and go into depth about everything I've
mentioned here, plus a whole ton of other features, things like enterprise
level security overview dashboards, or immutable audit logs,
or our security research lab. Thank you so much for
spending this time with me today, and I hope you enjoy the rest of the
conference.