Transcript
Hi everyone, I'm Iska, a software engineer at Armo, and I'm here to talk about security posture and the management of security issues in open source projects, specifically graduated CNCF projects. I'm going to compare them based on some research we did at Armo using our own statistics.

So, Armo is a Kubernetes security company. We're the creators of Kubescape, which I'm one of the maintainers of. Kubescape is a security
posture platform which shows you misconfigurations inside your cluster and guides you on how to improve your security.
It's both a CLI tool and an operator, which you can use in different
ways. It also scans your images running inside your Kubernetes
cluster and can tell you what kind of vulnerabilities your containers have.
It also works in what we call shift-left: when you're preparing your deployments in your GitHub repository, or in VS Code using our Kubescape extension, you can scan your YAMLs, Helm charts, and container registries to find security posture issues before they're deployed. Now, a thing I want to note here is
that one of the most important goals we had in this project is not to just dump data on you and have you work out how to solve these issues on your own, but to prioritize them and show you how to fix them. So Armo provides a platform behind Kubescape to store the results of these scans, help you analyze them even further, and give you a bigger perspective.
So now that we're collecting all this data, we need to analyze it. What type of data do we have? In the Armo platform we're receiving data from Kubescape, which is, on the one hand, configuration issues of Kubernetes objects, coming from git repositories and from objects in the Kubernetes cluster itself, and on the other hand, vulnerabilities from image scans, again from the actual cluster and also from container registries. Today we're going to focus on data that we get from objects before they're deployed: from git repositories and, mainly, container registries.
The data we have today: we've scanned container images from nearly 180 registries, nearly 44,000 container images in total.
We've seen nearly 2,000 git repositories, inside of which are nearly 165,000 YAML files and Helm charts.
So it's not a small data set, and it can show us some really interesting things, starting with the container image scans.
Like I mentioned, we're going to compare two samples, one against the other. The first
is the general population, if you will, and the second is the
sample of CNCF graduated projects. We're going to look at the most common vulnerabilities in both samples, the distribution of severities, and we're going to dive into how relevant these vulnerabilities really are. So, looking
at the most prevalent image repositories in the general sample, we see Argo CD, which is itself a CNCF graduated project, appearing in nearly 20,000 images. Then we have Redis, which is also an open source project but not under the CNCF umbrella. Then we have Armo again. We have Prometheus a few times, also a graduated project. We have a cool, interesting sidecar project, again open source but not under the CNCF. Then MongoDB, Prometheus again, and the Datadog agent.
Then we have the top ten among the graduated sample. Obviously the top repositories are again Argo CD and Prometheus, which makes sense; we saw them leading in the general sample. After that we can see some core images of Kubernetes itself.

So we're going to start by comparing
the distribution of severities. This chart is ordered by the distribution of severities inside the general sample. You can see that most of the vulnerabilities in the general sample were medium, and the second most common is what we call negligible. That might be a bit confusing: most image scanners mark vulnerabilities with a CVSS score of zero to four as low, but Kubescape uses Grype, which differentiates between low and negligible severities, where negligible is zero to two and low is two to four. For graduated projects, the most common vulnerability severity is negligible. Both samples have more or less the same number of critical vulnerabilities, and in general we can see a slight difference in the distribution in favor of the CNCF projects. But honestly, it didn't tell us much yet; it didn't give us a lot to work with.
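To make those severity buckets concrete, here's a minimal Go sketch of the CVSS-to-severity mapping as just described. These cutoffs are the talk's simplification; real scanners like Grype also use distro feed data rather than CVSS scores alone.

```go
package main

import "fmt"

// severity maps a CVSS score to a bucket using the cutoffs described
// above: a Grype-style "negligible" (0-2) split out of the usual "low".
// This is an illustration, not Grype's exact logic.
func severity(cvss float64) string {
	switch {
	case cvss < 2:
		return "negligible"
	case cvss < 4:
		return "low"
	case cvss < 7:
		return "medium"
	case cvss < 9:
		return "high"
	default:
		return "critical"
	}
}

func main() {
	for _, score := range []float64{1.5, 3.2, 5.0, 7.8, 9.8} {
		fmt.Printf("CVSS %.1f -> %s\n", score, severity(score))
	}
}
```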
Let's look at the top ten vulnerabilities in the general sample. You can see that the first vulnerability is far above all the others in terms of numbers. It's a relatively new high severity vulnerability in BusyBox, which we saw in nearly 37,000 of the images we scanned. Then we have libgcrypt, which was in nearly 15,000 images and is high severity as well. After that we have a few more that are also in around 15,000 images, and from number eight down the counts drop off. For the sake of this talk, we're only looking at high and critical issues.
Now, to try and make sense of this data, we went over each of these vulnerabilities manually to understand whether they are really exploitable and how relevant they are. So let's start with the first one, the BusyBox CVE. BusyBox contains all sorts of tools that we use in our containers. One of them is netstat, which can be used to read DNS records, except most people don't really use it.
This vulnerability is pretty severe: if you're using netstat to read a DNS record, an attacker with a maliciously crafted DNS record can take over your terminal. Alternatively, the attacker can also change the colors of your terminal. Cute. But the risk of it being exploited is very, very low, so this vulnerability is not very relevant.

Next we
have libgcrypt, which in specific versions mishandles a cryptographic algorithm called ElGamal, enabling an attacker to extract the private key via a side channel. That's actually a very serious problem. The point is that the reason this vulnerability is so high on our list is that libgcrypt is part of the GPG tooling used by our package managers. But when we install packages, their signatures are verified with public keys, not private keys. It's very bad practice to keep private keys in containers, and most containers don't; they'll just have public keys. So again, this is a vulnerability that's not really exploitable in about 99.99% of Kubernetes clusters, and yet it's still up here in our top ten list.
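To illustrate why verification doesn't expose the private key, here's a small, runnable Go sketch. It uses ed25519 as a stand-in rather than GPG's actual ElGamal/RSA machinery, so it's an analogy, not what apt or rpm really run.

```go
package main

import (
	"crypto/ed25519"
	"fmt"
)

func main() {
	// The publisher generates a key pair and signs the package once,
	// keeping the private key on their own infrastructure.
	pub, priv, err := ed25519.GenerateKey(nil) // nil means crypto/rand is used
	if err != nil {
		panic(err)
	}
	pkg := []byte("example-package-1.2.3 contents")
	sig := ed25519.Sign(priv, pkg)

	// The consumer (e.g. a container installing the package) verifies
	// with the public key alone; no private key ever needs to be present.
	fmt.Println("signature valid:", ed25519.Verify(pub, pkg, sig))
}
```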
Next up is SQLite, with a heap-based overflow.
So SQLite had a critical vulnerability which enabled an attacker to take over the process running SQLite using a very specifically crafted SQL query. Again, this is a very serious issue, but if you think about it, even before taking over the container, the attacker would need an SQL injection vector beforehand. So again, this is something where it's really hard to see how it would ever be exploited.
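Since that CVE requires a specially crafted query to reach the vulnerable code, application code that never lets user input become SQL text largely closes off the path. Here's a minimal Go sketch with a parameterized query, assuming the common mattn/go-sqlite3 driver purely for illustration, not as what any particular image actually ships.

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/mattn/go-sqlite3" // assumed driver choice, for illustration
)

func main() {
	db, err := sql.Open("sqlite3", ":memory:")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if _, err := db.Exec(`CREATE TABLE users (id INTEGER, name TEXT)`); err != nil {
		log.Fatal(err)
	}

	// Untrusted input stays a bound parameter and never becomes SQL text,
	// so an attacker can't smuggle in a specially crafted query.
	userInput := "alice'; DROP TABLE users; --"
	var id int
	err = db.QueryRow("SELECT id FROM users WHERE name = ?", userInput).Scan(&id)
	fmt.Println("injection attempt matched nothing:", err == sql.ErrNoRows)
}
```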
So if we look again at this top ten list and filter out the less relevant ones, we end up with these two as the likely exploitable ones. Again, this is not based on very hard facts, but on a manual check and a review of the CVEs; these are the two that we thought are likely exploitable. It goes to show that the original data we got from the image scanner was not completely relevant.
So let's look now at the top ten vulnerabilities in the graduated project sample. We have the protobuf library leading by far, with a high severity vulnerability that dates back to 2015 and has been around for a long time. We have some Golang issues and some Prometheus issues, which again makes sense because Prometheus was one of the most common images we scanned, and so was Argo CD. And at the end of the list you can see the same libgcrypt and SQLite issues we had before.
Let's look now at this protobuf vulnerability. It was very surprising; it's really strange that in the graduated CNCF projects you have a high severity vulnerability which is so old and has been around for so long. But what we realized is that there's actually a bug open on this issue: the image scanner misinterprets a Golang package as a C package here. In other words, this is a false positive. Grype has actually released a new version since, so it might even be fixed by now.

We're looking at the filtered list now, as we did before. These are the five vulnerabilities out of the original ten that we marked as actually exploitable in some circumstances.
If we look at this distribution of the average number of vulnerabilities per severity in both samples, it looks like a random image from a graduated project will most likely have far fewer vulnerabilities than an image from the general sample. So this graph is very flattering, and it looks great, and it's easy to think, okay, cool, we're doing an awesome job here in the CNCF, and I want to believe that. But we still had to ask: is the situation really that good?
Because we learned something during this process: like we mentioned, a vulnerability that an image scanner returns is not always an exploitable security issue. So we're trying to fix that. I'd like to show you a work-in-progress project called Sneeffer, which will soon be integrated into Kubescape and will look a bit different. The idea is to use eBPF to understand which software packages are loaded into the memory of the container. Using this information, we can filter the SBOM of the container image down to the software packages that are actually inside our running container. Then if we feed this back to the vulnerability scanner, we'll get a better, more relevant result.
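Conceptually, the relevancy step is just an intersection between the image's SBOM and the set of packages eBPF saw loaded. Here's a toy Go sketch of that filtering idea; the names are hypothetical, and the real Kubescape/Sneeffer data model is richer than plain strings.

```go
package main

import "fmt"

// relevantPackages keeps only the SBOM entries that were actually
// observed loaded into the container's memory. A hypothetical helper
// sketching the idea, not the real implementation.
func relevantPackages(sbom []string, loaded map[string]bool) []string {
	var out []string
	for _, pkg := range sbom {
		if loaded[pkg] {
			out = append(out, pkg)
		}
	}
	return out
}

func main() {
	sbom := []string{"openssl", "libgcrypt", "busybox", "zlib"}
	loaded := map[string]bool{"openssl": true, "zlib": true} // from eBPF observation
	// Feeding only these packages back to the scanner yields far fewer,
	// and far more relevant, findings.
	fmt.Println(relevantPackages(sbom, loaded)) // [openssl zlib]
}
```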
Just returning here for a minute, you can see that in this nginx container, the image scanner returns nearly 400 vulnerabilities, but only four of them are actually in memory, which is amazing. It makes a huge difference.
So if we look again at the top ten vulnerabilities and recalculate the numbers with this relevancy taken into account, what we found is that the graduated projects are still much better in the general sense, but the difference is not that big. If we focus just on the critical and high vulnerabilities, there's no big difference in the numbers, which again made us think that the graduated projects are doing better, but maybe not by as much as we thought before.
Now, why is this happening? One of the reasons we came up with for the fact that graduated projects, and CNCF projects in general, have far fewer vulnerabilities is that we're using newer technologies. Most of our projects were really created for this ecosystem. They're using Go, and they're usually built on base images which are essentially empty, unlike the base images of various Linux distributions, which bring in their own vulnerabilities even before you've added your own files to them. Also, if we look at the way we're managing things in the CNCF and these projects, the transparency has a big effect on solving issues really fast. On the other hand, most of the projects, like we said, are written in Go, whose binaries are loaded into memory as a whole, so it's harder to detect what's unused.
So let's also talk briefly about configuration scanning of Kubernetes objects. Again, what we're going to do is compare the two samples, repositories of graduated projects versus repositories of the general population, and look at what kinds of issues we're seeing in both. Kubescape uses controls, which are basically tests of different properties of Kubernetes objects. For example, checking whether you're using an immutable file system in your Kubernetes workloads is one of the tests, and as we can see, it's the control that the general population failed the most. A lot of workloads don't use immutable file systems, which kind of makes sense, because immutable file systems are not easy to configure. And although it's a good thing to have from a security perspective, it's probably not the most important thing in terms of priority. So that did make sense to us.
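For reference, here's roughly what passing that control looks like at the object level, sketched in Go with the upstream Kubernetes API types. The check at the end is a simplified stand-in for Kubescape's actual control logic.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

func main() {
	readOnly := true
	c := corev1.Container{
		Name:  "app",
		Image: "nginx:1.25",
		SecurityContext: &corev1.SecurityContext{
			// The field the "immutable file system" control looks at.
			ReadOnlyRootFilesystem: &readOnly,
		},
	}

	// Simplified version of the control: pass only if the field is
	// explicitly set to true.
	pass := c.SecurityContext != nil &&
		c.SecurityContext.ReadOnlyRootFilesystem != nil &&
		*c.SecurityContext.ReadOnlyRootFilesystem
	fmt.Println("immutable filesystem control passes:", pass)
}
```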
What was a bit more surprising is that number two is resource limits: a lot of workloads don't have resource limits configured. In our sample we also have a lot of workloads running as root containers, and therefore failing the non-root containers control, and around the same number are allowing privilege escalation. After that you can see that a lot are not configuring memory limits or resource limits, then common label usage, which is not a security issue per se, and a lot of CIS compliance issues down at the end of the list.

Moving on to the graduated
CNCF projects: number one, which is quite surprising, is again that resource limits are not configured; that's the most common issue here. After that we have readiness probes missing, and again immutable file systems not configured. Then memory limits are missing. As for the next two, non-root containers are configured properly more often here in the graduated projects, and so are containers that don't allow privilege escalation, but not by a lot more.
In general, again, we see that this distribution is a bit different, but we still have a lot of issues.
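Since missing resource limits top both lists, here's what configuring them looks like with the same Kubernetes API types in Go; again a minimal sketch, with values chosen arbitrarily for illustration.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	c := corev1.Container{
		Name:  "app",
		Image: "nginx:1.25",
		Resources: corev1.ResourceRequirements{
			// Limits cap what the container may consume; their absence
			// is the control failure seen most often in both samples.
			Limits: corev1.ResourceList{
				corev1.ResourceCPU:    resource.MustParse("500m"),
				corev1.ResourceMemory: resource.MustParse("256Mi"),
			},
		},
	}

	fmt.Println("has CPU limit:", !c.Resources.Limits.Cpu().IsZero())
	fmt.Println("has memory limit:", !c.Resources.Limits.Memory().IsZero())
}
```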
We also tried to compare: if we were to take a random workload from each of these samples, what percent of our controls would fail? As it turns out, 35% of our controls would fail if we chose a random workload from the graduated projects, and 38% would fail if we chose one from the general sample. So it's clear that these numbers are quite close, and we don't see such a huge difference.
So, to wrap up: in terms of misconfigurations, CNCF graduated projects are only slightly better. In terms of vulnerabilities, again, it's hard to say that graduated CNCF projects are meaningfully less vulnerable, especially taking into account that, as we saw, not all vulnerabilities reported by an image scanner are exploitable; but it does seem that they are slightly less vulnerable. In general, there's a lot of work to be done to make vulnerability scanners more effective and more relevant.
Thanks very much for listening and enjoy the rest of the conference.