Transcript
Hi there. I'm John Peck. I've been a software developer for about 25
years and now work in enterprise advocacy at GitHub.
And that means that I get to spend a decent amount of each week speaking
to some of the largest companies in the world about
all their software development problems. And security,
as you can imagine, is one of their biggest concerns.
But before I get into some of their troubles,
you might find yourself asking,
what does GitHub know about security?
Well, as the largest developer platform on the planet,
we house millions of public projects,
but also an immense number of private ones as
well. In fact, 90% of the Fortune 100 trust
us to manage and secure their code for them.
So as we've expanded GitHub to include all parts
of the development pipeline, from project planning to developer
collaboration, to cloud IDEs and AI
pair programming, we've also woven application security
right into that framework. This helps everyone from individual
open source developers to the largest enterprises,
know that their projects are secure from the day they're first dreamed up all the way to when they go out into production.
So why is it so important to secure our code from day one? First, let's take a
look at the state of application security today, and where
the industry has been headed. If we look back over the last four
to six years of application development, even today we see
a disturbing trend. The number of vulnerabilities continues to
increase linearly with the number of lines of code in
any given project. In other words,
every line of code written today still has the same risk
of introducing a vulnerability as it did in 2016.
More code, more problems. And by
the way, flaws in applications are consistently the
number one attack vector for breaches. So yes,
we need to keep doing all the other security stuff, network protection,
identity verification, all of that. But the biggest
risk is the code itself. And that's
tough, because in most projects, 80% of the
code comes from outside of your own development team, outside of
the company. Open source is great because it means we
are not wasting time constantly reinventing the wheel. Most projects wouldn't be possible
without open source, but at the same time, you can't control
how these open source dependencies were created.
You can't define their security standards or directly manage their
policies, so you're going to have to figure out how to secure them
at the point of ingestion, when they're incorporated into
your own projects. And what about those developers that are
on your team now? They're probably awesome, but chances
are they also haven't had sufficient security training, if any. Nearly half of developers haven't, so they're going to need some help ensuring that each line they write isn't
introducing new vulnerabilities. And we want to do that in a
way which does not slow them down. Which brings me
to my third point here. As companies introduce security tools,
they often do so in a way which slows down application development
and doesn't really do that much to improve security.
The vast majority of companies just bolt on additional tools
which introduce additional noise and friction into the
development process. And this creates a war between the security team
and the developers. Consider this common scenario. As a
developer, I've just completed all my feature points and I've sent
my code off to production. So I'm at the end of my
two week sprint. I'm ready to refocus on the next set of awesome features I'm
going to be adding to the product. But then security comes along.
They push back with a huge code review and a ton of potential vulnerabilities
I need to patch up. And my next sprint is just trashed.
I won't be building those new features. Instead I'm going to be
wasting all of that time unraveling code that I wrote two weeks ago,
patching it up, reintegrating it all along with all of my teammates' changes, and then disappointing my
project manager. Not so good.
So this happens all too often,
and it's the reason so many teams push back on security
policies. And it's why so many companies end up releasing vulnerable code. Because in fact, when they hit that decision point, either release a product with potential vulnerabilities or stop the presses and patch it up, half of companies choose, more or less, to go ahead and release.
There's customers waiting on those new features and it's just not worth disappointing
them. That's a logical business decision,
but it's risky and it's a potentially very expensive one,
because the longer you wait to fix a vulnerability, the more painful it
is. If you can identify and patch a vulnerability as
it's being coded, that's quite cheap. It's basically the developer's time to
get the notification, change a few lines of code, but it costs
ten times as much if that vulnerability gets as far as QA, and 100 times as much if it goes out into production. Because now
you're rolling back old versions of the product, you're notifying customers, you're integrating
months old code with new patches. And if somebody
discovers and exploits that vulnerability that made it to
production, well, that's the sort of thing that brings down
entire companies, it bankrupts them or destroys customer trust
for decades to come. So how
can we go about shifting left? How can we avoid all this pain, cost, and delay, and ensure vulnerabilities never progress
beyond that development stage?
Put that way, the answer seems pretty obvious. We need to put the power to identify and fix security flaws right into the hands of the developer. We call this developer-first
security. And to do it properly, you need a couple of key things.
First, you need to be able to see, to observe a large number of projects, to deeply understand not just their security state, but also how developers interact with those projects: where it's effective for them to receive and act on security-related information, what kind of nudges they need, what level of detail they
need. Second, you need to already be
the key tool that most developers work with day in and
day out as they build their projects, so that you fit naturally
into their daily workflow, instead of being yet another bolt-on tool that they have to figure out how to work with, or more often,
how to ignore. So we at GitHub saw not just an opportunity, but really a responsibility to
help developers find out about and fix security flaws
right as they were being created. When developers start building a new set of features, they create a new branch of the codebase, a variation that isn't part of the final product yet, but it soon will be. It's our responsibility to make sure that before that branch ever gets merged back into the main code, it's vulnerability-free. We do that in
three key ways. First, we scan
every single change to the list of dependencies
they're bringing in, pretty much regardless of what language
they're working in. So whether a developer adds a risky Node module to the manifest or uses an insecure version of PHP in their Dockerfile, we immediately notice, and we prompt them to fix it. This is impressively effective. We found that dependency-based vulnerabilities are fixed four and a half times faster than average when you use this approach.
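By the way, if you want to see what that looks like in practice, here's a minimal sketch of a .github/dependabot.yml file, assuming the current Dependabot v2 configuration format; the ecosystems and schedules here are just illustrative choices, not a recommendation:

```yaml
# .github/dependabot.yml -- a minimal sketch, assuming the current
# Dependabot v2 configuration format. Ecosystems and schedules are
# illustrative only.
version: 2
updates:
  # Watch the npm manifest at the repository root and open update
  # pull requests when a dependency is outdated or vulnerable.
  - package-ecosystem: "npm"
    directory: "/"
    schedule:
      interval: "daily"
  # Do the same for base images and versions in the Dockerfile.
  - package-ecosystem: "docker"
    directory: "/"
    schedule:
      interval: "weekly"
```

With a file like this checked in, those fix-it prompts show up as ordinary pull requests in the repository, which is exactly what makes them easy to act on.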
Now, dependency scanning isn't magic, so what makes that improvement so strong? Three key things. One, the dependency update notifications come from a tool developers already know and trust, and are used to responding to: their DevOps platform itself, in this case
GitHub. Two, the notifications
are immediately actionable. They're not just telling the
developer what's wrong, they're actually surfacing as a pull request,
which means the developer only has to click a single button
to fix the vulnerability. And then
three, even then, we know some developers may be hesitant to click that button to merge the fix, because
they're wondering how much cleanup they're going to have to do. Does this update
change function signatures or change the behavior of the package
that we're upgrading? Am I going to have to spend a few hours updating
my code to accommodate that change?
So to address that, we don't just give them the minimum
possible change in order to get them secure. We do that, of course,
but we also provide a compatibility score which
lets them know how likely this is to just work
with no further changes. This is something that GitHub is really uniquely positioned
to do because we've got over 200 million projects
out there on GitHub.com, and many of them have given us permission to see, as they take security updates, how many of their unit tests keep on passing. We can use that information to calculate a compatibility score and then put it right into the update notification when a security patch is required on your own project. That gives developers the confidence, when that compatibility score is high, to just go ahead and merge that security update right
away. That greatly increases compliance and helps keep
your projects secure.
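GitHub doesn't spell out the exact formula in public, so treat this as my own rough sketch of the idea rather than the real computation, but conceptually the score behaves something like:

$$\text{compatibility} \approx \frac{\text{projects whose test suites still passed after taking this update}}{\text{projects that took this update and ran their tests}}$$

The more projects that have already absorbed a given security update with their CI staying green, the more confident you can be that it will just work in yours.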
All right, now let's move on to another type of check.
GitHub scans every single push for security
tokens. Now, I know, I know, when I say security tokens, you're going to say,
well, my company already has key managers and secret stores,
so no piece of raw code should ever have a security token embedded directly in it, right? But we're
all human, we've all done it. At some point,
using the secret manager was just too slow or too annoying,
and we directly pasted a security token right into the code we're working on, thinking, okay, I'm going to remove it before I commit. But then we forget. This is a common mistake that almost all developers make, and GitHub
doesn't ever want those tokens to get compiled into an end
product where they could be leaked out into the world.
So it prevents this by blocking these tokens from
ever getting off of the individual developer's box.
If a token is dropped into code and then the developer attempts to
push that code up to GitHub, we immediately block it.
And then we let the developer know that they've either got to remove
that token from their code or they need to get it added
to an allow list before we will allow that code to be submitted.
And of course we also scan all the historic code as
well.
So in secret scanning, one of the biggest problems is that the false positive rate can get really high if you misidentify strings which aren't secrets, for example, blocking a developer from putting a GUID or a hash seed or some other random character string into their code. When this happens and we block those, even though they're not really secrets, developers just can't do their work. So admins end up turning off secret scanning entirely, and it's pointless, right? So what we chose at GitHub was not to just look for entropy, randomness, in those character strings. Instead, we actually worked directly with dozens of different secret providers and built extremely precise pattern matchers for each of their individual types of tokens. What does that do? It brings that false positive rate
way down, and it brings the effectiveness way up
because our scanning is trustworthy. Oh, and by the way, if you
want to scan for your own custom secret patterns, you can test and add
these right inside of the tool as well.
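A custom pattern is essentially just a precise regular expression for your organization's own token format. As a sketch (the YAML layout below is illustrative shorthand, not an actual GitHub config file; custom patterns are defined and tested in the repository or organization security settings, and the Acme token format is hypothetical):

```yaml
# Illustrative shorthand only -- custom secret scanning patterns are
# entered and tested in the settings UI, not checked in as a file.
# "Acme" and its token format are hypothetical.
name: "Acme internal API key"
# A precise pattern for tokens shaped like acme_<32 hex characters>,
# rather than a generic "looks random" heuristic.
pattern: 'acme_[0-9a-f]{32}'
# A sample string used to verify the pattern matches before enabling it.
test_string: 'acme_9f2b4c6d8e0a1b3c5d7e9f0a2b4c6d8e'
```

The same principle applies as with the built-in patterns: the more precisely the regex describes the real token format, the lower the false positive rate.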
All right, now, there's one last type of check
that we think is critical. But before I go into detail on that,
I want to tell you a little story. You see,
in 2011, NASA began one of its early Mars
exploration programs, Curiosity. This was a two
and a half billion dollar project and promised gigantic
advances for science. But after the mission had already
launched, they were still running manual code reviews at NASA, and they discovered that their developers had made a critical error which could prevent the rover's parachutes from deploying, literally crashing the program. Now, fortunately, they found this bug while the rocket was still in flight. And, I didn't know this was even possible until I heard this story, they were actually able to send a patch over the air, well, over space, to the rocket and have it update the code on that rover before it ever reached Mars. This saved the mission, saved a two and a half billion dollar project. And the security tools they used to do that, the ones that are now part of GitHub, automatically found 30 different variations of that same flaw that the manual review had missed, and patched those with the same over-the-air update.
What, you might ask, is this magic tool? It's something called CodeQL. It's a language which lets GitHub look not just at the actual text, the actual structure of the code, but at the meaning of the code. It examines the new code your developers just wrote, as well as all the components they've added in, and compiles it all into something that could be executed but isn't.
And it lets us say things like, does there exist
anywhere in this code base a circular object reference
or a place where text comes in through an API
or a form or some other entry point, which then goes through some
execution path and eventually hits a database without being sanitized? Which, of course, would allow an attacker to penetrate the database.
Right? There's over 2000 different code
queries like this which ship with GitHub
code scanning, and they cover the whole OWASP Top Ten and
beyond. When you
put your code in, GitHub code scanning
compiles that code and runs these tests at every pull
request, tracing through the actual execution paths
of your program and analyzing them for bad patterns, then alerting the developer and, if needed, blocking that pull request.
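For reference, hooking this into every pull request is typically just a short workflow file. Here's a minimal sketch, assuming the GitHub Actions based code scanning setup; the language matrix and action versions are illustrative:

```yaml
# .github/workflows/codeql.yml -- a minimal sketch of running CodeQL
# code scanning on every pull request. Language and versions illustrative.
name: "CodeQL"

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  analyze:
    runs-on: ubuntu-latest
    permissions:
      contents: read          # needed to check out the code
      security-events: write  # needed to upload results to code scanning
    steps:
      - uses: actions/checkout@v4
      # Initialize CodeQL and choose which languages to analyze.
      - uses: github/codeql-action/init@v3
        with:
          languages: javascript
      # Build the CodeQL database and run the query suites against it.
      - uses: github/codeql-action/analyze@v3
```

Any findings surface as checks on the pull request itself, which is what lets GitHub alert the developer or block the merge right there.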
So with all of these in place, your secret scanning, your dependency scanning, your CodeQL code scanning, developers know that the code they bring in, as well as the code they've just written, is secure long before it approaches production. They don't have to fear facing some mile-long vulnerability list two weeks
after they've finished coding. By the time they've finished developing
the particular feature they're working on, the code is already secured
and ready to go.
That does wonders for your development pipeline. It means that
you spend less time stopping production,
more time building features. Ultimately, it means that you ship features
thousands of times faster out to your customers, all the
while remaining secure. So what I encourage you to
think about is how right now you are going about
implementing a developer-first approach to your security,
and how you can guarantee that your
code is vulnerability-free before it ever reaches that main
branch of code, so that your developers don't have to be at
odds with the security team.
And of course, if you want to learn more about GitHub's
general approach to security, just head on over to github.com/security. There you can read and go into depth about everything I've
mentioned here, plus a whole ton of other features, things like enterprise
level security overview dashboards, or immutable audit logs,
or our security research lab. Thank you so much for
spending this time with me today, and I hope you enjoy the rest of the
conference.