Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hi folks. I'm Matt Jarvis, and I'm a senior developer
advocate at Snyk. Snyk is a cloud native
application security company. So the story of the
last decade or so in software development has really seen
the boundaries of what's an application, what is operations,
and what constitutes infrastructure becoming increasingly
blurred. The old view of the world was that the application was
the only piece under the responsibility of the development team, and all these
other elements of the stack sat under operations, with security
usually a step right at the end of the deployment process. But the
world we live in today is one where infrastructure and workloads are almost
completely tied together. Everything's declared as code,
everything's a software development practice, and there's really no difference
between our workload and the computing infrastructure that
goes along with it. And by infrastructure, I don't just mean the underlying
compute technology, but also the configuration and the operational
policies that control those capabilities. And as a community
of practitioners, we've discussed in lots of detail this blurring
and eventual consuming of the boundary between development and operations.
But in lots of cases, we haven't really considered how that impacts on
how we model and how we practice security. In lots of organizations,
security is still considered to be somewhat of an external practice
that exists only when our applications are deployed and operational.
But this is pretty much unworkable in the era of continuous integration,
continuous delivery. As we've seen, development driven
teams now have responsibilities for most of that
deployment stack, and so this gives those teams a much greater responsibility
for ensuring that these things are secure. By the time our code
and our infrastructure is deployed to production, it's really far too late
to deal with the implications of security issues, and we can't slow down that
velocity to introduce security gates in the way that things
used to work, because velocity and time to
market is clearly the differentiator for businesses to succeed.
So that presents us with a set of unique challenges
around security. How do we make sure our applications and our infrastructure are secure
when our working practices are evolving into these super fast delivery
pipelines? Security still matters. And as we've seen repeatedly
over the last few years, security breaches can have a really big impact
on businesses, both from a financial perspective in terms of bottom
line and potential fines, but also in terms of how much our
customers trust us. And trust is really one of the key metrics for
successful businesses in the cloud era. So let's start by
taking a look at the different classes of things we probably
need to be looking at to ensure that we've got security considered
within our workflows. Firstly, the applications that we're
creating, our workloads. Modern applications are usually composed of
a relatively small core of homegrown code, along with
a huge amount of third party, usually open source modules.
And this is great news for application development, because the availability
of modules and library code means we get to develop applications
faster. We have to write less code, and we don't need to
reinvent the wheel all the time by solving the same problems over and over
again. And anyone who develops in Java, in Python,
in Node, in Go, is going to recognize this pattern. And in all
of these ecosystems, the number of vulnerabilities is growing.
And this isn't necessarily because code's getting more insecure,
it can be just because there's more code being written, more libraries,
more modules being written. Maybe we're getting better at working
out what's vulnerable. But in the end, this all means more
opportunities for these vulnerabilities to be exploited.
And when we import something into our code base, it can have a
very large dependency tree, both in terms of direct dependencies
that are the dependencies of the thing that we're importing,
but also indirect dependencies. So dependencies of
dependencies. And so we potentially bring in a huge
amount of other code that we might not even be aware of. And typically over
70% of all security vulnerabilities are found in these indirect
dependencies. So these are the ones that we have much less control over,
and we might not be aware of them at all.
So as an example of that, here's an exploit from the node community.
It was introduced into NPM in 2018,
and this is supposedly a library to parse HTTP
headers, but it's actually a remote code execution exploit.
It's about 40 lines of code that allows remote JavaScript
to be executed on the server using specially crafted commands
in the HTTP request. And this was hidden behind a tree of other dependencies,
and the direct dependency eventually ended
up being used in mailparser, which has a huge number of downloads every month.
So it's pretty easy to see how in large developer communities
these kinds of indirect dependencies can be used
to hide exploitable code. So vulnerabilities
in those third party dependencies are super important because they make up
such a large part of our code bases these days. But as I said earlier,
the lines between our application and the container it runs
in are becoming really blurred. The container is the delivery mechanism
for the application. They're typically developed at the same time, usually even
by the same team. So for all intents and purposes, we can consider them the
same thing. The application never exists without that container image.
And like the availability of library code, the huge growth in public container
registries has been great for the ability to run prepackaged
software super easily and for us to consume that in our own infrastructure.
But they are also a big source of vulnerabilities. And when we look at the
container landscape, although best practices are emerging
around things like building minimal containers, there are still a
huge number of people using containers directly from the upstream repositories.
And lots of these can have very large numbers of vulnerabilities in
them. And we're presented with lots of possibilities for attack vectors.
So it's important that our developers working with container images
understand the scope for introducing vulnerabilities
here. And there's also a long tail of people taking the path of least resistance,
by giving applications containers based on full
operating systems, for example. And when we look at operating systems in general,
the number of vulnerabilities in base operating systems is
growing really massively. And that's partly because operating systems by design
ship with a lot of software in them. And if we look at operating
systems like that, we can see that they kind of break
the paradigms of containers in the sense that what we want to be doing is
producing an absolutely minimal package for our application. But there's
still a lot of people using these kind of bigger images for
workload deployments. And we can also see that a lot of people don't think
about emerging vulnerabilities once their workloads are in production.
So for an image that didn't have any vulnerabilities when it was first deployed,
there may be new vulnerabilities that have been discovered since the
image was built. And if you're not looking at containers you already have
in production, then you're never going to find out if they're now vulnerable.
And fixing these things isn't usually very hard.
Over 40% of Docker image vulnerabilities can usually be fixed by upgrading
the base image, and around 20% of them can be fixed just by
rebuilding them. A lot of containers will have upgrade steps
in the Dockerfile, and they'll get run during the build process.
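As a rough sketch of what that looks like, assuming a Node application (the image tags, file names and versions here are all illustrative, not from the talk), the fix is often just bumping the base image and rebuilding, since any upgrade steps in the Dockerfile re-run during the build:

```dockerfile
# Illustrative Dockerfile: moving from a large, older base image to a newer,
# minimal one removes a big slice of inherited vulnerabilities.
# FROM node:10           <- old, full OS base image with many known CVEs
FROM node:18-slim

# Upgrade steps like this re-run on every rebuild, picking up patched
# packages even when our own code hasn't changed.
RUN apt-get update && apt-get upgrade -y && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
USER node                # the official Node images ship a non-root user
CMD ["node", "server.js"]
```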
And as we've moved wholesale into cloud and now into Kubernetes,
configuration is almost entirely in code, and it's part of our
development workflows. And by configuration we can include
all of our Kubernetes YAML, our Helm charts, our automation,
our Terraform, and all of the policies and configuration
that goes alongside that. And this is a massively growing field,
as we can see from the amount of this kind of code that is in
GitHub now. We're really only just starting to view that as something that we
need to consider from a security perspective. Systems like Kubernetes are
increasingly complex. And as we've moved the responsibilities for
delivering that kind of code into our development teams,
there's clearly space for misunderstandings about how things
work. And this can be compounded with things like service meshes, which increase
that complexity even further. And with this much code out
there in public repositories, we can again see the potential risks
of path of least resistance, where we might be using existing code as
templates, when we might not fully understand how that
thing works. And these are all very important in terms
of the security of our environments. This quote from the
Open Web Application Security Project (OWASP) is a little bit old now, but it still makes
the point that a huge number of security breaches are coming from misconfigurations
in infrastructure. And most really large exploits over
recent years have been this combination of application level vulnerability
combined with infrastructure configuration, which has then
allowed the attacker to widen the blast radius and
extend the exploit. And as I'm sure most of us
have seen, there's many, many real world examples of this.
Things like cloud credential leakages through
source code repositories, or of Kubernetes clusters infested
with crypto miners. And when we look at this space in terms of Kubernetes,
it's important to understand that Kubernetes really doesn't give you any guardrails.
It's insecure by default. And this is on purpose.
It's meant to be highly configurable and users are expected to
make these decisions by themselves. By default,
there are no resource limits set. That means a pod can consume as much
resource as the Kubelet will let it. And this has the potential for denial
of service attacks, affecting a much bigger scope than just
a single application. And Kubernetes will also quite happily let containers
run as root. And with a huge number of containers in public registries
still being configured to run as root, this opens up really big
security implications. A compromised pod running as root has the
potential to escape the container, and so we really need to be limiting the potential
for these kinds of attacks, and very few applications actually need root access
in order to run. Writable file systems inside containers are
also a risk point. If that container is then exploited, it allows
an attacker to download new software, make changes to
configuration and generally be able to extend
the access that they've got already. And containers also have access
to the full range of capabilities configured by default in
the container runtime. Capabilities, for folks who aren't familiar with them, are
kernel level permissions. Many of these granular kinds of
permissions won't be required by your application, and having them
turned on just creates additional vectors for attackers to
use should that container be compromised.
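Most of those defaults can be tightened per pod. As an illustrative sketch (the pod name and image are hypothetical), a Kubernetes pod spec can set resource limits, refuse to run as root, make the root filesystem read-only, and drop all capabilities:

```yaml
# Illustrative pod spec tightening the insecure defaults discussed above.
apiVersion: v1
kind: Pod
metadata:
  name: hardened-app                      # hypothetical name
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0 # hypothetical image
      resources:
        requests:
          cpu: 100m
          memory: 128Mi
        limits:                  # cap what the pod can consume from the node
          cpu: 500m
          memory: 256Mi
      securityContext:
        runAsNonRoot: true             # reject images configured to run as root
        readOnlyRootFilesystem: true   # attackers can't write new tools or config
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]                # then add back only what's actually needed
```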
So where do we start with all of this in terms of modern git driven
workflows? Well, the emerging answer is that we have to shift our security
practices far to the left and embed security into our development
pipeline. So we share that burden of security responsibility across
our development teams. And this is really where this concept of devsecops
comes into play, that we need to integrate security considerations into our
workflows in exactly the same way that we merged development and operations
over the last few years. And where do we start in practice?
Well, the obvious first place is at the developer. We need developers to
have immediate insight into potential security issues, tightly
integrated into their workflow, so it's friction free.
And that means tooling that's available from local command lines,
integrations with IDEs. So we need to reduce the overhead for
developers to use these kind of tools right at the point they're working before code
even gets anywhere near our repositories. And the tooling we
use has to provide developers with the right information to be able to make
security decisions. Not just lists of CVEs,
but tools that give us insights into how severe something is, how exploitable
it is, and remediation advice: how do I fix
it? Because that's what we really care about, how do we go
from the state we're in now to a better state. And as we
saw earlier, we want to be able to look at all of
those classes of things that we're interested in.
So third party dependencies in our code, what's going
on in our container images, and all of that infrastructure code
that we're putting in at this point. And you can do all of this
with Snyk for free. So our second touch point
is clearly git itself. Our git repository is now the
single source of truth for everything. So that has to be secure.
Git itself has been pretty secure over the years, and in most
cases folks are using hosted git services like GitHub
and GitLab for this, which have also been pretty good at security.
But there are definitely some process related things to consider.
By its nature, git can open you up to certain things,
and we need to make sure that our users are aware of where those potential
problem points are. So we will need to be doing things like enforcing
two-factor authentication, making sure our users have strong key
security practices and that they're keeping git updated locally.
And exposing private data is always a risk here,
particularly in commit histories or when we're working with repositories,
moving them around. Configuration data really
shouldn't be in git unencrypted, even in local repositories,
for exactly that reason. So we need to help our
users to be able to use things like .gitignore for stuff
like that. And it goes without saying that we need strong review processes.
This is really all about the human aspect, making sure that our processes are
correct and that folks understand what they need to do
and, where we can, we want to be automating as much as
possible, reducing that friction for being able to do
these things. And we can do that through things like pre-commit hooks.
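As a minimal sketch of that kind of automation (the patterns here are illustrative and nowhere near exhaustive; a dedicated secret scanner is the better long-term answer), a pre-commit hook can refuse commits whose staged changes look like credentials:

```shell
#!/bin/sh
# Minimal illustrative pre-commit hook: save as .git/hooks/pre-commit and
# make it executable. It aborts the commit if the staged changes match a
# couple of common credential patterns.
pattern='AKIA[0-9A-Z]{16}|BEGIN .* PRIVATE KEY'
if git diff --cached | grep -qE "$pattern"; then
    echo "pre-commit: possible secret in staged changes, aborting commit" >&2
    exit 1
fi
```

Hooks like this only catch the obvious cases, but they move the check to the cheapest possible point: before the secret ever enters the commit history.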
And once we're confident that git is secure, we can start to leverage
automation on every pull request. We want to be
looking for the same issues that we were catching at the local development stage,
but this time these things are obviously going to be automated. And the
key difference here is that because these checks are automated,
we're also monitoring for things that might have changed since a particular piece
of code was committed. Perhaps an upstream
dependency has changed. New vulnerabilities are discovered
all the time, and code that didn't show vulnerabilities when
it was first committed might now have some problems.
So these monitoring scans over time will allow us to
pick things up right in the code repository, where it's relatively low
cost to fix it. And our container registries also kind of
fall into this category. Nothing's set in stone, so an image
that looked fine when it was built might now be vulnerable. If your registry's
got built-in scanning, take advantage of that, or use tools that
integrate with your registry. And we need to be scanning on an
ongoing basis. Even if we haven't changed our images,
that base image that we used to base our image on
might have new vulnerabilities. And lots of people aren't rebuilding images unless
things actually change. Another key integration point is our
CI/CD systems. And again, we can automate scanning
directly into our build pipelines. We're
looking for things that may not have changed, but because
we're rebuilding things, we can catch any changes that might be
happening upstream, things that have changed since that code was first
scanned when it entered the source code repository. And then the final place
we want to be looking is our production environments.
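That kind of rebuild-and-rescan loop in CI can be sketched like this, here as a hypothetical GitHub Actions workflow for a Node project (the workflow name and the audit threshold are illustrative choices, not from the talk):

```yaml
# Illustrative CI workflow: every push re-resolves and re-scans dependencies,
# so advisories published since the code was first committed still get caught.
name: security-scan
on: [push, pull_request]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      - run: npm ci
      - run: npm audit --audit-level=high  # fail the build on high-severity advisories
```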
Containers in production, particularly if they don't change very often, can end
up in vulnerable states. So we need to be looking both at running containers
and as a double check at new containers being spawned in this space.
We can also take advantage of admission control, perhaps using things like
Open Policy Agent, to ensure that our
security policies are being reflected in the code that's being deployed.
Perhaps we want to double check that our images have been
scanned before they hit production. And we
can actually stop things here from deploying into
our clusters if they don't comply with those policies.
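As one concrete illustration of admission control (using Kubernetes' built-in Pod Security Admission controller here rather than Open Policy Agent itself; the namespace name is hypothetical), a single namespace label is enough to reject non-compliant pods before they ever run:

```yaml
# Illustrative namespace using the built-in Pod Security Admission controller:
# pods that don't meet the "restricted" profile (root users, extra
# capabilities, and so on) are rejected at admission time.
apiVersion: v1
kind: Namespace
metadata:
  name: production          # hypothetical namespace
  labels:
    pod-security.kubernetes.io/enforce: restricted
```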
And in the production space, we can also look at emerging
practices around runtime, perhaps looking at anomalous behavior.
And there's lots of emerging tools in that space
which are going to be checking for unusual patterns that might
be happening inside your cluster, which might indicate that
a particular container has been compromised. So the takeaway from
all of this is that we need to shift our security left.
We need to empower our developers to make decisions about security
based on modern tools and modern processes. In this
kind of new world, security teams aren't gatekeepers
with control over deployment anymore. We need
to consider the role of security professionals to be advisors and
toolsmiths, as opposed to gatekeepers, empowering our
development teams to deliver feature velocity, new features
and new code to production, and therefore delivering business
value. And visibility and remediation of security issues
need to be baked into each stage of our development pipeline.
So we're leveraging automated tooling to scan third
party code, our container images and our infrastructure
code. So thank you for listening. If you're interested
in trying out any of these features in Snyk, you can sign up
for free at snyk.io.