Conf42 Cloud Native 2021 - Online

Continuous Security - Building Security into your Pipelines

Abstract

In the world of continuous delivery and cloud native, the boundaries between what is our application and what constitutes infrastructure are becoming increasingly blurred. Our workloads, the containers they ship in, and our platform configuration are now often developed and deployed by the same teams, and development velocity is the key metric for success. This presents us with a challenge that the previous model of security as a final, external gatekeeper step cannot keep up with.

To ensure our apps and platforms are secure, we need to integrate security at all stages of our pipelines and ensure that our developers and engineering teams have tools and data that enable them to make decisions about security on an ongoing basis. In this session I will talk through the problem space, look at the kinds of security issues we need to consider, and look at where the integration points are to build in security as part of our CI/CD process.

Summary

  • The boundaries of what's an application and what constitutes infrastructure are becoming increasingly blurred. Over 70% of all security vulnerabilities are found in indirect dependencies. How do we make sure our applications and our infrastructure are secure when our working practices are evolving into super fast delivery pipelines?
  • Writable file systems inside containers are also a risk point. If that container is then exploited, it allows an attacker to download new software. We need to embed security into our development pipeline. This is really all about the human aspect, making sure that our processes are correct.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hi folks, this is me. I'm Matt Jarvis. I'm a senior developer advocate at Snyk, and Snyk is a cloud native application security company. So the story of the last decade or so in software development has really seen the boundaries of what's an application, what is operations, and what constitutes infrastructure becoming increasingly blurred. The old view of the world was that the application was the only piece under the responsibility of the development team, and all these other elements of the stack sat under operations, with security usually a step right at the end of the deployment process. But the world we live in today is one where infrastructure and workloads are almost completely tied together. Everything's declared as code, everything's a software development practice, and there's really no difference between our workload and the computing infrastructure that goes along with it. And by infrastructure, I don't just mean the underlying compute technology, but also the configuration and the operational policies that control those capabilities. And as a community of practitioners, we've discussed in lots of detail this blurring and eventual consuming of the boundary between development and operations. But in lots of cases, we haven't really considered how that impacts how we model and how we practice security. In lots of organizations, security is still considered to be somewhat of an external practice that exists only when our applications are deployed and operational. But this is pretty much unworkable in the era of continuous integration and continuous delivery. As we've seen, development-driven teams now have responsibilities for most of that deployment stack, and so this gives those teams a much greater responsibility for ensuring that these things are secure.
By the time our code and our infrastructure are deployed to production, it's really far too late to deal with the implications of security issues, and we can't slow down that velocity to introduce security gates in the way that things used to work, because velocity and time to market are clearly the differentiators for businesses to succeed. So that presents us with a set of unique challenges around security. How do we make sure our applications and our infrastructure are secure when our working practices are evolving into these super fast delivery pipelines? Security still matters. And as we've seen repeatedly over the last few years, security breaches can have a really big impact on businesses, both from a financial perspective in terms of bottom line and potential fines, but also on how much our customers trust us. And trust is really one of the key metrics for successful businesses in the cloud era. So let's start by taking a look at the different classes of things we probably need to be looking at to ensure that we've got security considered within our workflows. Firstly, the applications that we're creating, our workloads. Modern applications are usually composed of a relatively small core of homegrown code, along with a huge amount of third party, usually open source, modules. And this is great news for application development, because the availability of modules and library code means we get to develop applications faster. We have to write less code, and we don't need to reinvent the wheel all the time by solving the same problems over and over again. And anyone who develops in Java, in Python, in Node, in Go, is going to recognize this pattern. And in all of these ecosystems, the number of vulnerabilities is growing. And this isn't necessarily because code's getting more insecure. It can be just because there's more code being written, more libraries, more modules being written. Maybe we're getting better at working out what's vulnerable.
But in the end, this all means more opportunities for these vulnerabilities to be exploited. And when we import something into our code base, it can have a very large dependency tree, both in terms of direct dependencies, the dependencies of the thing that we're importing, and also indirect dependencies, so dependencies of dependencies. And so we potentially bring in a huge amount of other code that we might not even be aware of. And typically over 70% of all security vulnerabilities are found in these indirect dependencies. So these are the ones that we have much less control over, and we might not be aware of them at all. So as an example of that, here's an exploit from the Node community. It was introduced into npm in 2018, and this is supposedly a library to parse HTTP headers, but it's actually a remote code execution exploit. It's about 40 lines of code that allow remote JavaScript to be executed on the server by using specially crafted commands in the HTTP request. And this was hidden behind a tree of other dependencies, and the direct dependency eventually ended up being used in mailparser, which has a huge number of downloads every month. So it's pretty easy to see how in large developer communities these kinds of indirect dependencies can be used to hide exploitable code. So vulnerabilities in those third party dependencies are super important because they make up such a large part of our code bases these days. But as I said earlier, the lines between our application and the container it runs in are becoming really blurred. The container is the delivery mechanism for the application. They're typically developed at the same time, usually even by the same team. So for all intents and purposes, we can consider them the same thing. The application never exists without that container image.
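
To make the distinction between direct and indirect dependencies concrete, here is a minimal Python sketch that walks a dependency graph and separates the two. The package names and the graph itself are invented for illustration; a real tool would read this information from a lock file or a package manager rather than from a hard-coded dictionary.

```python
from collections import deque

# Hypothetical dependency graph: package name -> packages it depends on directly.
DEPENDENCY_GRAPH = {
    "my-app": ["web-framework", "mail-lib"],
    "web-framework": ["http-utils", "template-engine"],
    "mail-lib": ["header-parser"],
    "http-utils": ["header-parser"],
    "template-engine": [],
    "header-parser": [],
}


def split_dependencies(graph, root):
    """Return (direct, indirect) dependency sets for root via breadth-first search."""
    direct = set(graph.get(root, []))
    seen = set(direct)
    queue = deque(direct)
    while queue:
        package = queue.popleft()
        for dependency in graph.get(package, []):
            # Skip packages already visited, and the root itself in case of cycles.
            if dependency not in seen and dependency != root:
                seen.add(dependency)
                queue.append(dependency)
    # Anything reachable that is not a direct dependency was pulled in transitively.
    return direct, seen - direct


direct, indirect = split_dependencies(DEPENDENCY_GRAPH, "my-app")
print("direct:", sorted(direct))
print("indirect:", sorted(indirect))
```

Even in this tiny graph the application declares two dependencies but ends up shipping five packages, which is why scanning only what is declared directly is not enough.
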
And like the availability of library code, the huge growth in public container registries has been great for the ability to run prepackaged software super easily and for us to consume that in our own infrastructure. But they are also a big source of vulnerabilities. And when we look at the container landscape, although best practices are emerging around things like building minimal containers, there are still a huge number of people using containers directly from the upstream repositories. And lots of these can have very large numbers of vulnerabilities in them. And we're presented with lots of possibilities for attack vectors. So it's important that our developers working with container images understand the scope for introducing vulnerabilities here. And there's also a long tail of people taking the path of least resistance by basing application containers on full operating systems, for example. And when we look at operating systems in general, the number of vulnerabilities in base operating systems is really massively growing. And that's partly because operating systems by design ship with a lot of software in them. And if we look at operating systems like that, we can see that they kind of break the paradigms of containers, in the sense that what we want to be doing is producing an absolutely minimal package for our application. But there are still a lot of people using these kinds of bigger images for workload deployments. And we can also see that a lot of people don't think about emerging vulnerabilities once their workloads are in production. So for an image that didn't have vulnerabilities in it when it was first deployed, there may be new vulnerabilities that have been discovered since the image was built. And if you're not looking at containers you already have in production, then you're never going to find out if they're now vulnerable. And fixing these things isn't usually very hard.
Over 40% of Docker image vulnerabilities can usually be fixed by upgrading the base image, and around 20% of them can be fixed just by rebuilding them. A lot of containers will have upgrade steps in the Dockerfile, and they'll get run during the build process. And as we've moved wholesale into cloud and now into Kubernetes, configuration is almost entirely in code, and it's part of our development workloads. And by configuration we can include all of our Kubernetes YAML, our Helm charts, our automation, our Terraform, and all of the policies and configuration that go alongside that. And this is a massively growing field, as we can see from the amount of this kind of code that's on GitHub now. We're really only just starting to view that as something that we need to consider from a security perspective. Systems like Kubernetes are increasingly complex. And as we've moved the responsibilities for delivering that kind of code into our development teams, there's clearly space for misunderstandings about how things work. And this can be compounded with things like service meshes, which increase that complexity even further. And with this much code out there in public repositories, we can again see the potential risks of the path of least resistance, where we might be using existing code as templates when we might not fully understand how that thing works. And these are all very important in terms of the security of our environments. This quote from the Open Web Application Security Project is a little bit old now, but it still proves the point that a huge number of security breaches are coming from misconfigurations in infrastructure. And most really large exploits over recent years have been this combination of an application-level vulnerability combined with infrastructure configuration, which has then allowed the attacker to widen the blast radius and extend the exploit. And as I'm sure most of us have seen, there are many, many real world examples of this.
Things like cloud credential leakages through source code repositories, or Kubernetes clusters infested with crypto miners. And when we look at this space in terms of Kubernetes, it's important to understand that Kubernetes really doesn't give you any guardrails. It's insecure by default. And this is on purpose. It's meant to be highly configurable, and users are expected to make these decisions by themselves. By default, there are no resource limits set. That means a pod can consume as much resource as the kubelet will let it. And this has the potential for denial of service attacks, affecting a much bigger scope than just a single application. And Kubernetes will also quite happily let containers run as root. And with a huge number of containers in public registries still being configured to run as root, this opens up really big security implications. A compromised pod running as root has the potential to escape the container. And so we really need to be limiting the potential for these kinds of attacks. And very few applications actually need root access in order to run. Writable file systems inside containers are also a risk point. If that container is then exploited, it allows an attacker to download new software, make changes to configuration and generally be able to extend the access that they've got already. And containers also have access to the full range of capabilities configured by default in the container runtime. And capabilities, for folks who aren't familiar with them, are kernel-level permissions. Many of these granular kinds of permissions won't be required by your application, and having them turned on just creates additional vectors for attackers to use should that container be compromised. So where do we start with all of this in terms of modern Git-driven workflows? Well, the emerging answer is that we have to shift our security practices far to the left and embed security into our development pipeline.
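
The Kubernetes defaults described above (no resource limits, containers allowed to run as root, writable root file systems, and the default set of capabilities) can be checked mechanically. The following Python sketch audits a Pod manifest that has already been loaded into a dictionary. The field names (`resources.limits`, `securityContext.runAsNonRoot`, `readOnlyRootFilesystem`, `capabilities.drop`) are standard Pod spec fields, but the function, the sample manifests and the wording of the findings are assumptions made for illustration, and this is far less thorough than a dedicated infrastructure-as-code scanner.

```python
def audit_pod(pod):
    """Return a list of findings for a Pod manifest given as a plain dict."""
    findings = []
    spec = pod.get("spec", {})
    pod_context = spec.get("securityContext") or {}
    for container in spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        context = container.get("securityContext") or {}

        # Without limits, a container can consume as much as the node allows.
        limits = (container.get("resources") or {}).get("limits") or {}
        if not limits:
            findings.append(f"{name}: no resource limits set")

        # A container-level setting takes precedence over the pod-level one.
        non_root = context.get("runAsNonRoot", pod_context.get("runAsNonRoot", False))
        if not non_root:
            findings.append(f"{name}: may run as root (runAsNonRoot not set)")

        if not context.get("readOnlyRootFilesystem", False):
            findings.append(f"{name}: root file system is writable")

        dropped = (context.get("capabilities") or {}).get("drop") or []
        if "ALL" not in dropped:
            findings.append(f"{name}: default capabilities not dropped")
    return findings


# Hypothetical manifests: one relying on defaults, one explicitly hardened.
INSECURE_POD = {"spec": {"containers": [{"name": "web", "image": "example/web:1.0"}]}}
HARDENED_POD = {
    "spec": {
        "containers": [
            {
                "name": "web",
                "image": "example/web:1.0",
                "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}},
                "securityContext": {
                    "runAsNonRoot": True,
                    "readOnlyRootFilesystem": True,
                    "capabilities": {"drop": ["ALL"]},
                },
            }
        ]
    }
}

for finding in audit_pod(INSECURE_POD):
    print(finding)
```

The manifest that relies on defaults produces one finding per issue mentioned in the talk, while the hardened one produces none.
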
So we share that burden of security responsibility across our development teams. And this is really where this concept of DevSecOps comes into play: we need to integrate security considerations into our workflows in exactly the same way that we merged development and operations over the last few years. And where do we start in practice? Well, the obvious first place is at the developer. We need developers to have insights immediately into potential security issues, tightly integrated into their workflow, so friction free. And that means tooling that's available from local command lines, and integrations with IDEs. So we need to reduce the overhead for developers to use these kinds of tools right at the point they're working, before code even gets anywhere near our repositories. And the tooling we use has to provide developers with the right information to be able to make security decisions. Not just lists of CVEs, but tools that give us insights into how severe something is, how exploitable it is, and remediation advice: how do I fix it? Because what we really care about is how we go from the state we're in now to a better state. And as we saw earlier, we want to be able to look at all of those classes of things that we're interested in. So third party dependencies in our code, what's going on in our container images, and all of that infrastructure code that we're putting in at this point. And you can do all of this with Snyk for free. So our second touch point is clearly Git itself. Our Git repository is now the single source of truth for everything. So that has to be secure. Git itself has been pretty secure over the years, and in most cases folks are using hosted Git services like GitHub or GitLab for this, which have also been pretty good at security. But there are definitely some process-related things to consider.
By its nature, Git can open you up to certain things, and we need to make sure that our users are aware of where those potential problem points are. So we will need to be doing things like enforcing two-factor authentication, making sure our users have strong key security practices and that they're keeping Git updated locally. And exposing private data is always a risk here, particularly in commit histories or when we're working with repositories and moving them around. Configuration data really shouldn't be in Git unencrypted, even in local repositories, for exactly that reason. So we need to help our users to be able to use things like .gitignore for stuff like that. And it goes without saying that we need strong review processes. This is really all about the human aspect, making sure that our processes are correct and that folks understand what they need to do. And where we can, we want to be automating as much as possible, reducing that friction for being able to do these things. And we can do that through things like pre-commit hooks. And once we're confident that Git is secure, we can start to leverage automation on every pull request. We want to be looking for the same issues that we were catching at the local development stage, but this time these things are obviously going to be automated. And the key difference here is that because these checks are automated, we're also monitoring for things that might have changed since a particular piece of code was committed. Perhaps an upstream dependency has changed. New vulnerabilities are discovered all the time, and code that didn't show vulnerabilities when it was first committed might now have some problems. So these monitoring scans over time will allow us to pick things up right in the code repository, where it's relatively low cost to fix them. And our container registries also kind of fall into this category. Nothing's fixed in stone, so an image that looked fine when it was built might now be vulnerable.
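
As an illustration of the kind of check a pre-commit hook could automate to keep private data out of commit histories, here is a small Python sketch that scans text for a few secret-like patterns. The rule names and regular expressions are simplified assumptions, not the rule set of any particular tool, and the sample input is made up (the access key shown is the placeholder value used in AWS documentation). A real hook would pass the staged changes to a function like this and refuse the commit if anything is found.

```python
import re

# Illustrative rules only; real secret scanners ship far larger, tuned rule sets.
RULES = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private-key-block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "hardcoded-password": re.compile(r"(?i)\bpassword\s*[:=]\s*['\"][^'\"]+['\"]"),
}


def scan_text(text):
    """Return (line_number, rule_name) pairs for every line that matches a rule."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in RULES.items():
            if pattern.search(line):
                findings.append((line_number, rule_name))
    return findings


# Made-up configuration file content standing in for a staged change.
SAMPLE = """\
region = "eu-west-1"
aws_access_key_id = "AKIAIOSFODNN7EXAMPLE"
password = "hunter2"
timeout = 30
"""

for line_number, rule_name in scan_text(SAMPLE):
    print(f"line {line_number}: possible secret ({rule_name})")
```

Pattern matching like this produces both false positives and false negatives, so it complements rather than replaces review processes and keeping unencrypted configuration out of the repository in the first place.
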
If your registry's got built-in scanning, take advantage of that, or use tools that integrate with your registry. And we need to be scanning on an ongoing basis. Even if we haven't changed our images, that base image that we used to base our image on might have new vulnerabilities. And lots of people aren't rebuilding images unless things actually change. Another key integration point is our CI/CD systems. And again, we can automate scanning directly into our build pipelines. Here we're looking at things that may not have changed, but because we're rebuilding things, we can catch any changes that might be happening upstream, things that have changed since that code was first scanned when it entered the source code repository. And then the final place we want to be looking is our production environments. Containers in production, particularly if they don't change very often, can end up in vulnerable states. So we need to be looking both at running containers and, as a double check, at new containers being spawned. In this space, we can also take advantage of admission control, perhaps things like Open Policy Agent, to ensure that our security policies are being reflected in the code that's being deployed. Perhaps we want to double check that our images have been scanned before they hit production. And we can actually stop things here from deploying into our clusters if they don't comply with those policies. And in the production space, we can also look at emerging practices around runtime, perhaps looking at anomalous behavior. And there are lots of emerging tools in that space which are going to be checking for unusual patterns that might be happening inside your cluster, which might indicate that a particular container has been compromised. So the takeaways from all of this are that we need to shift our security left. We need to empower our developers to make decisions about security based on modern tools and modern process.
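
The admission control idea mentioned above, checking that images have been scanned before they reach production, can be sketched in a few lines of Python. This is a plain-Python stand-in for the decision logic only: it is not Open Policy Agent's policy language or API, and the function name, the image references and the notion of a set of scanned images are all assumptions made for illustration.

```python
def admit(pod, scanned_images):
    """Decide whether a Pod may be deployed; returns (allowed, reasons)."""
    reasons = []
    for container in pod.get("spec", {}).get("containers", []):
        image = container.get("image", "")
        # Reject any container whose exact image reference has no scan on record.
        if image not in scanned_images:
            reasons.append(f"image not scanned: {image}")
    return (not reasons, reasons)


# Hypothetical record of image references that passed a vulnerability scan.
SCANNED_IMAGES = {"registry.example.com/shop/api:1.4.2"}

GOOD_POD = {"spec": {"containers": [{"image": "registry.example.com/shop/api:1.4.2"}]}}
BAD_POD = {"spec": {"containers": [{"image": "registry.example.com/shop/api:latest"}]}}

print(admit(GOOD_POD, SCANNED_IMAGES))
print(admit(BAD_POD, SCANNED_IMAGES))
```

In a real cluster this decision would be made by an admission controller that receives each deployment request, so a non-compliant workload is stopped before it ever runs.
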
In this kind of new world, security teams aren't gatekeepers with control over deployment anymore. We need to consider the role of security professionals to be advisors and toolsmiths, as opposed to gatekeepers, empowering our development teams to deliver feature velocity, new features and new code to production, and therefore delivering business value. And visibility and remediation of security issues need to be baked in to each stage of our development pipeline. So we're leveraging automated tooling to scan third party code, our container images and our infrastructure code. So thank you for listening. If you're interested in trying out any of these features in Snyk, you can sign up for free at snyk.io.

Matt Jarvis

Senior Developer Advocate @ Snyk
