Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello and welcome to the Kubernetes
Security Workshop with M9sweeper. I'm your host, Jacob Beasley.
I'll be teaching you a lot about Kubernetes and Kubernetes security.
I've been using Kubernetes since 2016, so that's a long time, kind of from the
beginning. I've got experience building and supporting software in just about every
tech stack you can think of, from Java to .NET to Rails to Python
to PHP. I'm a Certified Kubernetes Security Specialist and
Administrator, and I lead a team of site reliability engineers
that has deployed hundreds of applications to Kubernetes. That same team
also supports the open source project M9sweeper,
which you're going to be learning a lot about. It's essentially a Kubernetes security
platform that takes all of the tools the Cloud Native Computing Foundation recommends,
all the things the Linux Foundation recommends, and gives you a user interface
to manage all of those tools. All right, so before
we get too deep into this, I want to talk about the four C's of
cloud native container security.
When we think about security in Kubernetes, we think of it as
a layered security model, starting with the cloud. The cloud is really the physical infrastructure
and the way in which you manage the physical infrastructure. So how do you go
from your bare metal to your virtual machines, which Kubernetes
ultimately runs inside of? And that's really important because you have to think about things
like network security and physical security. That's largely out of
scope of this presentation because it isn't really anything specific to
Kubernetes. The next layer is the Kubernetes cluster itself. So we'll
be looking a lot at how you secure that cluster. Then we'll look
at the container. The container in Kubernetes is the unit of
work. How do you make sure that the container isn't running with too many privileges
and that you don't have applications that can escape or do bad things in that
container? And then finally we have the code, and the code is the code
that actually runs your application and
we have different ways of validating that code doesn't have any obvious
vulnerabilities. We're going to be doing a
number of labs in the process of this presentation. So you're going
to see demonstrations of kube-bench, kube-hunter, OPA
Gatekeeper, Kubesec, Trivy, and Project Falco.
By the time we finish this presentation, these aren't going to be foreign concepts
for you. They're going to be very real things that you'll be able to use.
Now if you want to follow along with the lab, you can click on the
View Lab Guide link in the PowerPoint. You can also
go to killercoda.com/m9sweeper. Killercoda is a great
resource for you to go spin up a Kubernetes cluster rapidly and try out
different things in Kubernetes, and we have an M9sweeper
lab that you can click on here that will walk you through setting up
an M9sweeper cluster inside of Killercoda and using every one of these
tools that we'll be talking about today. It's a great way to get introduced to M9sweeper
and try it out. All right, let's get started talking about
the cloud. So what is Kubernetes exactly?
Kubernetes is a container orchestration engine. What does
that mean? With Kubernetes you can have many nodes, which are typically where
your applications run, and you can describe
to Kubernetes what you want to deploy and then it will make
it so. You typically do that by talking to its API using
various command line tools and saying, here's what I want to deploy.
And then it will plan out, using various
controllers like schedulers that are part of this API (or connected
to the API, technically), where those things should go, and then it will
make it so. On each of your nodes you have the kubelet and kube-proxy.
The kubelet talks to the API and says, what should I
deploy? And then it deploys it. And kube-proxy
does all the networking for you. This is important because
when you talk about Kubernetes security, you'll see I've circled the API in red.
The majority of our efforts revolve around making sure that
access to that API is limited and that users can't deploy
bad things with that API. Here are some general
best practices that aren't related to the configuration of
the cluster itself, but they're things that are really important,
right? So one of them is don't expose your API on the Internet. So typically
people will put it behind a VPN or at minimum have some kind
of IP address whitelist that just significantly reduces your footprint.
If Kubernetes ever had a major zero day vulnerability,
if you were behind a VPN, it's much harder for someone to find your
cluster, connect to it and then exploit it.
So far there haven't really been those, but if it ever did happen,
you want to be behind a VPN. Number two, don't make everybody
an administrator. So by default, if you use a
lot of the different cloud managed Kubernetes environments,
it's very easy to say grant someone administrative access to that
environment, but you really want to limit that to only people who
really need to have administrative access. It's very helpful that
most of the clouds have some kind of managed Kubernetes service with Active
Directory or some other kind of identity provider integrated, where you
can say this particular Active Directory group is granted access to this particular
namespace or this particular role in Kubernetes. So keep
that in mind. And the third one, I'll go back a little bit,
I kind of brushed over this, but you'll see the API connects to this
thing called etcd. etcd is a form of database,
and it's where all the cluster data is stored. And that's
important because you need to control access to
that etcd data store the same way you would anything else.
It's a file on disk, and if somebody had
access to connect to that VM and modify the
contents of that directory, modifying etcd directly,
then theoretically they could cause a lot of problems. A good
example of that would be if they were to delete things,
they might be able to break your cluster, or they might be able to create
new things in Kubernetes, effectively bypassing the API.
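Beyond filesystem and network access controls, one common mitigation worth knowing about is encrypting Secrets at rest in etcd. As a sketch, the API server accepts an EncryptionConfiguration along these lines (the key value is a placeholder you would generate yourself):

```yaml
# Passed to kube-apiserver via --encryption-provider-config;
# encrypts Secrets before they are written to etcd.
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: <base64-encoded 32-byte key>   # placeholder
      - identity: {}   # fallback so existing plaintext data stays readable
```

On managed services you often don't control the API server flags, but the clouds generally offer an equivalent, such as customer-managed keys for etcd encryption.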
So that's something important that I want to point out. All right,
moving on to the cluster. So the first thing I want to talk
about is role based access control. Role based access control
simply means preventing people from having too much
access to the cluster. I think we've all had that case of
access denied errors when using Kubernetes, but you really
should set up roles in Kubernetes. I want to talk about a
few vocabulary terms. First of all, cluster roles and
roles. The cluster roles are
for all namespaces, the whole cluster, whereas roles are for
specific namespaces. Namespaces are how we segregate different
teams within your company or different workloads. Once you have a role,
you then bind it to something. So you might say bind it
to a group, to a user, et cetera. So there's cluster role bindings
and role bindings. Users are not actually a thing that
exists in Kubernetes directly. Rather,
Kubernetes has integrations
for external user stores, identity providers.
So you typically bind to a group, you can even bind to a user,
but you still have to have an identity provider hooked
up to Kubernetes to allow that person to authenticate.
Finally, service accounts allow applications to communicate with the Kubernetes API.
So we have the ability to create service accounts in Kubernetes and then
you can generate certificates for those and somebody can use that certificate
to connect to Kubernetes. Here's an example role.
So here we have kind: Role. We have some information about what namespace
it is in. Remember, roles are namespace scoped,
whereas cluster roles are global. And you can see what APIs
they can connect to. So if you want to know
which APIs are available, look it up in the docs; it's not too bad.
But here in this example you can see they can get, watch, and list
pods with the pod-reader role in the
default namespace. That's an example role, but before you
can use it you have to bind some user to it. So here's an example.
A user or group, right? So in the role binding,
the subject is Jane, a user named Jane;
it could also be a group. And then the roleRef is
a reference to the role that it's going to be related to. So the pod-reader
role that we created is bound to Jane.
So now Jane is a pod reader. If you're wondering how to
bind to things like active directory groups,
check the documentation for your cloud providers. They generally have great examples and you pretty
much just copy paste from the examples and set up your roles as needed.
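The pod-reader example we just walked through looks roughly like this in YAML (this mirrors the standard example from the Kubernetes docs; names and namespace are illustrative):

```yaml
# A namespaced Role granting read-only access to pods.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader
rules:
  - apiGroups: [""]        # "" means the core API group, where pods live
    resources: ["pods"]
    verbs: ["get", "watch", "list"]
---
# Bind the role to the user Jane in that same namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
  - kind: User             # could also be Group or ServiceAccount
    name: jane
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

Binding to a group instead is just a matter of changing the subject's kind to Group.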
Next we're going to look at a couple of tools, kube-bench and kube-hunter.
And I do want to back up and say, well why can't I just give
you a checklist of things that you should do and you should just follow
it? Why do I need tools to help me secure kubernetes?
The answer is that Kubernetes is pretty sophisticated, and
for every other major technology, whether it's Linux or Windows,
there is an established set of benchmarks that you should follow. And most
companies that are successful securing Windows or Linux
or Kubernetes will simply use the right tools to make sure
they're following best practices. So we're going to be looking at kube-bench,
which runs the Center for Internet Security's (CIS)
benchmark suite.
CIS has its own benchmark tooling, but it's a paid product,
whereas kube-bench is open source and made by Aqua Security, a great
company. So kube-bench is what we recommend;
they're members of the Linux Foundation,
it's a CNCF project, it's very well respected.
So kube-bench will connect to
your cluster and run a battery of tests and give you feedback
on how secure your cluster is. kube-hunter will do a pen test.
It can even try to exploit things, although by default you run
it in a passive mode. So let's demo these. Now you can
run all these tools from the command line, but the way that we like to
do it is we have M9sweeper installed in our cluster, and
when M9sweeper is running I can see my list of clusters and
I can click on kube-bench in the left navigation
and I can see every time that I've run it.
If you want to set up kube-bench in your cluster, you can click this Run
Audit button and it'll help you figure out how
to install it. We've created helm charts, which are open
source, and you can say things like run it every day, run it every week,
or you can just run it one time and be done, so it's all automated.
Here, take this, put it in a pipeline, hit go and you've got
yourself a benchmark running. We typically recommend
running a benchmark every day to make sure that configuration changes
don't drift. Or if you do a cluster upgrade, you don't suddenly
have things configured in an insecure way. Here you
can see we ran a worker node test, and
you can see it ran and it checked a number of different things. So is
the worker node configured correctly? Is the kubelet configured correctly?
And then I can click down and see details about the test
and what happened. I'm just going to find one where I got a
warning. Here it is, 3.2.11: ensure that the
RotateKubeletServerCertificate argument is set to true.
It's not set to true, and it gives me advice on how
to remediate it. Now, when I'm using a managed service, I can't always change all
of these. I could dig into it. In this case I'm using Azure Kubernetes
Service, so I might not be able to fix everything. But you can see this
is pretty good. So that's kube-bench.
Next I want to look at kube-hunter. This is a very similar tool, but it
does a penetration test, meaning that it actually boots up an application
inside of Kubernetes and asks, if I'm an app running in Kubernetes,
do I have access to do bad things? So you can see it
ran and it has some advice for us.
It says that by default Kubernetes is injecting a
service account token into the container that it's running in, and that's not
necessarily good. A service account is how your pod can communicate
with the Kubernetes API, and a lot of people recommend turning that off by
default so that it only gets injected if a pod actually needs to
connect to Kubernetes. That is a smart move.
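As a sketch, opting out of token injection can be done on the service account or on the pod itself (the namespace and pod names are illustrative):

```yaml
# Disable automatic token mounting for every pod using this service account.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: default
  namespace: my-namespace          # illustrative
automountServiceAccountToken: false
---
# Or opt out per pod, for pods that never need to talk to the API.
apiVersion: v1
kind: Pod
metadata:
  name: no-api-access              # illustrative
  namespace: my-namespace
spec:
  automountServiceAccountToken: false
  containers:
    - name: app
      image: nginx
```

Pods that genuinely need the API can then opt back in explicitly.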
Also, it says here that there's an API, and this is a big one,
that allows the application to
learn a lot of information about its environment. In this case, it can find out
what version of Kubernetes is being run. And that's
concerning, because if there was, again, a vulnerability in Kubernetes,
I could leverage that API to figure out which, say,
Metasploit exploit I would run. But if I
didn't have that exposed, it would be much more difficult. So it kind of gives
you some advice. You can click on this kube-hunter link
to get more details about what I need to do to fix that.
So kube-bench and kube-hunter: great open source tools from
a very reputable company. Let me pop back to our
PowerPoint. All right, next,
let's talk about securing the container. Before we dive into that,
let's explain the difference between a virtual machine and a
container. This is really important. So virtualization allows an
operating system to virtualize another operating system inside
of it. It uses this thing called a hypervisor to do CPU scheduling
between the different operating systems and to keep track of
things like storage and memory. The hypervisor keeps
the VMs from stepping on each other's toes and accessing each other's data.
And that's great. But a VM has the overhead of
a whole other operating system and typically takes a while to boot up. It's kind
of heavy. Enter containers. Containers allow different apps, different containers, to
share the same operating system kernel. So instead of having to
have a hypervisor mediating everything,
you can have a container runtime. The Linux kernel supports
several features that enable containers to work. From the perspective
of your application, your app thinks that it's running as if
it's its own virtual machine, for the most part. But we use a few features
of the Linux kernel to trick the app, or to isolate the apps, so
that they are mostly separate. cgroups limit the CPU
and memory of each container. So when you create a container, you can share
cpu and memory, but make sure that one container doesn't use all of the cpu
time. chroot means that you can change
the root directory. So when a container boots up, the runtime unpacks your
container image and all of its software into a
folder in the host operating system, and then it switches
into that folder and runs your entry point in the appropriate
cgroup. So now it's got its own root folder,
and it's running with limited CPU and memory. And then finally, for things like
users, processes, networking, and volume mounts, we use this
thing called namespacing, so that your app in
its container cannot see other applications' processes,
can't see other volume mounts, can't see other applications' users.
Now, I will say some of the things, like namespacing for users,
are not implemented in many of the container runtimes.
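In Kubernetes terms, you mostly meet cgroups through resource requests and limits on your containers. A minimal sketch (names and values are illustrative):

```yaml
# Kubernetes exposes cgroup limits through resource requests and limits.
apiVersion: v1
kind: Pod
metadata:
  name: limited-app            # illustrative
spec:
  containers:
    - name: app
      image: nginx
      resources:
        requests:              # what the scheduler reserves for this pod
          cpu: 250m
          memory: 128Mi
        limits:                # the cgroup ceiling enforced by the kernel
          cpu: 500m
          memory: 256Mi
```

Setting limits is what keeps one container from eating all of a node's CPU time, exactly as described above.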
So we'll talk later when we get to Kubesec about this. But there
are some key best practices that you need to do to prevent
applications from potentially breaking out of their container.
It's not foolproof, but it doesn't have to be that hard. We've got great
tools and there are many great open source tools that will help you do
this really well. So we have degrees of isolation.
On the one extreme we have like racking a physical server
for each client application. And then on the other hand, we have
running applications with no isolation.
Virtualization is considered less segregated
than hardware separation. There have been vulnerabilities in hypervisors
where people could break out of one vm and see the other. It's very rare.
Containerization is not quite as much isolation as virtualization.
Containers have quite a bit of isolation. But if you're not careful,
or if there's a Linux security vulnerability,
maybe you could break out. And then finally, if you have no isolation where apps
are just running as the same or different users in Linux they have
even less isolation. One application by default in Linux
can use all the CPU and memory unless you use cgroups like containers
do. So containers are better isolation than if you had none.
Kata Containers I want to mention: if we want to make
containerization a bit more secure, if we're afraid of an application breaking
out of its container, some people will use Kata Containers or gVisor.
Kata Containers actually spins every container up in a very
lightweight virtual machine. So it'll spin up a VM
with just enough on it to be able to
boot up a container inside of it.
So that's Kata Containers. And then gVisor will filter
and validate that applications aren't doing things they're not
allowed to do. So in case there's a vulnerability in the Linux
kernel, gVisor will prevent an application from escalating privileges
or accessing things it shouldn't be allowed to access. Let's talk
about the parts of a container image. So I mentioned that we do things like
cgroups, chroot, and namespacing; I should say
the container runtime engine, whether that's Docker or
containerd, takes care of this for you.
To run a container, you
create a container image. The image is actually a layered file system.
So technically what that means is it's a series of tar files that get unpacked
one after another. And if you've ever seen a Dockerfile, which we'll
look at in a moment here, actually, I'll pull one
up. A Dockerfile has a series of
steps, and in each step it figures out what files changed
and then it tars them up.
So then when you're downloading a Docker image, you'll see it saying
downloading, downloading, downloading a whole bunch of layers. Each of those layers was
a step in the original Dockerfile. Inside of a
container image, we have a command, which is the command
that is used by default to boot up whatever application you're containerizing.
You have a working directory, just like when you are running a command
shell: the present working directory is
the default directory where that command runs
when the container boots up.
You can have a list of default environment variables. So in
a Rails application, you might default the Rails environment to production,
or you might default a path to include certain executable files.
And finally, what group and user does it typically run as?
The thing about it here is that almost every part of this can be overridden
at runtime. And you'll see when we get to security contexts,
there are ways of doing that. I did want to show you a Dockerfile,
so let's look at a
good example here. In this Dockerfile,
we start out saying we're FROM
Ubuntu 22.04, so it's going to download that base image and unpack it, effectively.
Then it's going to copy in a file from my local directory
into my container. Then it's going to run the make command,
and then when the app boots up, it's actually going to run python
/app/app.py. So here you're configuring your container, and there are
many other commands to configure a container, but these are the basics.
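A minimal Dockerfile along the lines described here might look like this (file names and the user ID are illustrative):

```dockerfile
# Base image layer, downloaded and unpacked first.
FROM ubuntu:22.04

# Copy local files into the image; recorded as its own layer.
COPY . /app

# Build step, also recorded as a layer.
RUN make /app

# Run as a non-root user by default (illustrative UID).
USER 1001

# Default command when the container boots up.
CMD ["python3", "/app/app.py"]
```

Each instruction becomes one of the tar layers you see being downloaded.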
Copy in some files, say what user you're going to run as; that's the
basics. All right, I want to talk about
container breakout. What happens if we don't secure a container?
So the worst case scenario is you don't secure a container
and then the application can break out. The easiest way to do it: if
a container is running as root, you actually have the capability
of mounting volumes into your container. So running as root,
you could potentially mount a volume that's actually the host root
volume. Even though you're switched into your
container folder, you could mount a volume from outside
of your container, effectively letting you see everything else running in the operating
system. So that's container breakout. Now,
preventing container breakout. All right, so now we get to the fun part.
So whenever you deploy an app to
Kubernetes, we need to set a security context. A pod
is the unit of work in Kubernetes, so most of our examples are just going to
be with pods, although oftentimes when you deploy, you'll use a Deployment,
a StatefulSet, or a DaemonSet: some kind of controller that will deploy multiple pods,
whether that's N instances of pods or one for every
node. But in this example here, we've got an individual pod,
and you'll see the pod is set up to run as a particular user and
group. You can configure some things like run as user
and group, both on the security context layer globally for the pod or
on the individual container. And then fsGroup
says, basically, if you have any volume mounts, who's going
to be the group owner of those volume mounts? Typically, if you want to be able
to read and write from those, you'll set the owner to the same as the
group. This example is a bit different, but just bear with me.
And then for the container here you can see allowPrivilegeEscalation:
false, runAsNonRoot: true, privileged: false. It's
kind of best practice to set these three. Some people
also will do a read-only root filesystem to prevent one container from
using up the whole disk in Linux, but you
probably don't need that. So you can see runAsUser, the group, and the file system
group security context. Very straightforward.
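A sketch of a pod putting those settings together (names and ID values are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo       # illustrative
spec:
  securityContext:                  # pod-level settings
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000                   # group ownership applied to volume mounts
  containers:
    - name: app
      image: nginx
      securityContext:              # container-level settings
        runAsNonRoot: true
        allowPrivilegeEscalation: false
        privileged: false
        readOnlyRootFilesystem: true   # optional extra hardening
```

Container-level settings override the pod-level ones where they overlap.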
If you wanted to enforce, or at least make sure people are following,
best practices, there's a great tool, the open source
Kubesec. We have it built in here. You can choose any pod that's running, and I'll find one here
that I know is probably going to be in trouble:
Project Falco, which we'll explain later. And here
it looked at my Project Falco pod and it said, is this pod following best
practices? And my answer is no. But that's actually
because Project Falco legitimately needs elevated privileges. But you
can see, okay: service account name is green, good. Limited CPU,
good. Limited memory, good. Requested CPU and memory,
good. Okay, we're good. But then come down here to critical.
Uh oh, we're running as privileged, basically running as root.
We have access to the Docker socket; we can connect directly to
Docker. That's not good. And then it's got a number of other pieces of advice here,
saying let's use AppArmor and seccomp to limit
which Linux capabilities we have and which Linux
calls we can make. It's saying runAsNonRoot should be true. So a
lot of good advice here. We got a negative 39 on
critical, seven for passed. It adds it up and says,
well, we get negative 32 points; we stink.
That's actually okay though, because project Falco legitimately needs these privileges.
But I recommend using this for most of your applications and
getting teams to look at it. And even better, we'll talk about later
tools like gatekeeper that allow us to enforce policies.
So you could just prevent application teams from ever deploying anything
that looks like this. But we'll come back to that in a minute.
All right, preventing container breakout. One: do
not allow applications to run as root or escalate privileges.
Good. Two: be very careful with host volumes.
Kubernetes allows you to mount
in volumes, and one of the volume types is called a hostPath. But if you
could mount a hostPath, why not just
mount / as the hostPath? Effectively, you've
just become root on the node: you now have access to the root file system
and you've broken out of your container. Three: use tools like OPA
Gatekeeper, pod security policies, and pod security standards, which we'll cover more later,
to prevent someone from deploying pods with host volumes or
with elevated privileges. So basically, don't do these
things, and then use some kind of policy standards to just prevent someone from
doing these things. And then finally, limit service account privileges.
So the service account is potentially an
account that's injected into your pod in order to
allow your pod to talk to Kubernetes. Maybe your app wants to connect directly to
Kubernetes for some reason. Maybe it legitimately needs to spin up other
containers. For example,
Apache Airflow will want to kick off jobs in Kubernetes. Great.
But definitely look at those service account privileges and
only give people access to the things they need, keeping in mind that if
that service account can create pods
and you have no policies,
that service account can effectively create a pod with elevated
privileges and break out of the container. So unless you're doing some kind of policy
management, be very careful with your use of service accounts.
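To make the breakout risk concrete, here's a sketch of the kind of pod spec a policy engine should reject. A service account that can create pods, with no policies in place, can create exactly this (names are illustrative, do not deploy it):

```yaml
# A pod that mounts the node's root filesystem, giving the container
# full visibility into, and write access over, the host.
apiVersion: v1
kind: Pod
metadata:
  name: breakout-example         # illustrative, for discussion only
spec:
  containers:
    - name: shell
      image: ubuntu:22.04
      command: ["sleep", "infinity"]
      securityContext:
        privileged: true         # combined with the mount below, game over
      volumeMounts:
        - name: hostroot
          mountPath: /host       # host root appears under /host
  volumes:
    - name: hostroot
      hostPath:
        path: /                  # the node's entire root filesystem
```

This is precisely what pod security standards and OPA Gatekeeper policies exist to block.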
Limiting Linux kernel calls is another fun one,
so let's back up a little bit. A lot of us have used
Linux. Not a lot of us understand exactly what a kernel is
and how the Linux kernel works. So whenever your application wants to
do something other than access cpu and memory,
maybe it wants to open a file, maybe it wants to connect to something on
the Internet. It has to make a system call to the Linux kernel.
An example of a system call might be something like getting the current time or
setting the current time. But some system calls are kind of
dangerous. So maybe you want to do a system call to change your
user account. That might be dangerous. You might want to make a system
call to change the time on the computer. That might be dangerous because it
might mess up other people or other applications running on that computer.
So we want to limit our apps from doing certain
things. You can, within the security context, specify a
list of which things are dropped or added.
Also, some people will use seccomp and AppArmor to create pre-made profiles.
AppArmor can even follow your app for a while, build a profile
based upon what it's using, and then you can apply that profile.
I'm not going to demonstrate using seccomp and AppArmor, but I will
demonstrate this: inside of the security context, you can explicitly add
or remove capabilities. So here I'm adding SYS_TIME, which would
allow me to change the system time, as an example.
I'm not going to give you the big list of all the different options.
There's a lot out there. Usually the defaults
are good enough in many cases, but it is worth considering this.
If an app ever needs to add capabilities, be careful, dig into what those capabilities
are. NET_ADMIN is a great example of one that could
be a bit dangerous, right? It could allow the app to look
at other things that are running on the network, or to reconfigure networking and
potentially break things. Sometimes you have legitimate
use cases for this, so just think critically about it.
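A sketch of adding and dropping capabilities in the security context, using the SYS_TIME example from above (the pod and image names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: capabilities-demo        # illustrative
spec:
  containers:
    - name: app
      image: nginx
      securityContext:
        capabilities:
          drop: ["ALL"]          # start from nothing
          add: ["SYS_TIME"]      # add back only what the app truly needs
        seccompProfile:
          type: RuntimeDefault   # filter syscalls with the runtime's default profile
```

Dropping ALL and adding back individual capabilities is the usual pattern, rather than starting from the defaults and removing a few.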
All right, Kubesec. We demonstrated Kubesec earlier;
Kubesec will analyze the manifest of a pod and give you advice.
We've built a UI around Kubesec, which I demonstrated briefly
earlier. I can go to Kubesec here.
I can choose a pod, or upload one, and
I can even pick a whole namespace.
Let's do kube-system.
I'm going to click all of them; I can even pick a whole bunch of
them and it'll run for all of them at once. It'll take a minute.
Hopefully it works; you never know with live demos. There it goes. And now
it's given me a breakdown of every single pod in this namespace and how they're
doing. Now, it is kube-system, so I
can see the Azure disk driver. Some of this you'd expect, right? The Azure
disk CSI driver in kube-system is of course going to
need elevated privileges. kube-proxy,
same thing: it's got to have network administrator access to set up
the proxies. So it makes sense, but it is interesting to look
at. So it's a great tool, and we've made
it even better with an easy-to-use UI, so that it's easy for your
team to use, even if they're fairly new to Kubernetes.
All right, next, pod security admissions.
So for those of us who've been around a while,
Kubernetes used to have this thing called pod security policies, which gave you a lot
of granular control over what things could do. But Kubernetes recognized,
or the creators of Kubernetes recognized that that was really too complicated
for most people. It's enough simply to have a few general
presets, and if that's not enough, they can go use OPA Gatekeeper
or something to create their own policies. So with pod security admissions,
we have three pre-made standards for Kubernetes
security: privileged, baseline, and restricted.
What you can do is enable a standard in a namespace, and then if any
app tries to deploy that doesn't meet one of these pre-made standards, it will
block it. Anything running in, say, kube-system is
probably going to need privileged access, like we talked about earlier. So that's what privileged
is for: unrestricted, deploy anything you want. Definitely lock down
those namespaces that allow privileged deployments.
Baseline is pretty good. It prevents basically
the things that would allow container breakout. You need to dig through the
list, because I think you could probably still do host paths, which are potentially
dangerous, but it would prevent things like running as root or privilege escalation
at a very minimum. Start with baseline. And then finally there's restricted,
which requires you to set quite a bit: you have to
configure a bunch of stuff in the security context on every pod.
It's kind of a pain. We've actually created an open source project,
which I'll show you in a minute, that makes deploying all of your apps with
restricted pod security standards fairly easy.
When you configure a namespace, you have to add this label:
pod-security.kubernetes.io/enforce: restricted.
This allows you to enforce a particular level of the pod security admission
standards, so you could do privileged, baseline, or restricted. If you
do nothing, it's effectively privileged. But if you configure baseline or restricted,
it's going to start locking down that namespace, which is really nice.
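A sketch of a namespace with the enforce label applied (the namespace name is illustrative):

```yaml
# Enforce the restricted Pod Security Standard for everything in this namespace.
apiVersion: v1
kind: Namespace
metadata:
  name: my-team                                  # illustrative
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted  # optionally also warn on violations
```

There are also audit and warn modes, which log or warn about violations without blocking the deployment; they're useful for rolling a standard out gradually.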
I want to show you: we've created a project called
K8 EZ. It's a helm chart for deploying apps to Kubernetes.
It allows you to deploy an app just by specifying the name, or you can
create a full values file to configure everything you could think of. The big thing
is that by default the security context is fairly locked
down. By default it's running as non-root and it's dropping all
Linux capabilities: privileged: false, runAsNonRoot: true,
allowPrivilegeEscalation: false. It's doing all those things by default
that you want to do. So definitely try it out. We use it for a
lot of our client implementations and it's been very successful.
We've also had clients where they have a whole bunch of apps in kubernetes and
they're able to use a single helm chart to deploy all of them. So then
if they're deploying tens or hundreds of apps to Kubernetes, or if we're
onboarding them, they can just create a values file for each app.
They don't have to create a custom helm chart for every app. And they can
trust that by default things are pretty locked down.
You couple this with the pod security
standards and you do a good job locking down kubernetes.
All right, let's keep moving. Network policies.
So Kubernetes network policies allow you to limit
what pods can connect to what pods. This is really nice.
In a lot of architectures you'll have a front end and a back
end, and the front end and back end might both run on
your servers, right? So maybe you have something that receives requests
from the Internet and then it connects to APIs. And I see a lot of
times where some of those back end APIs that have no
external ingress also have no authentication. I wish they did.
There's a lot of applications where people are not doing any authentication or authorization
on internal backend APIs. So with
network policies in Kubernetes we can lock down what applications are even
allowed to connect to them. So that makes it much harder for a
hacker to exploit one of these APIs, because let's just say hypothetically,
you have hundreds of APIs, but maybe half of them are
internal only and maybe don't have a lot of authentication or authorization on
them. You can use network policies to just prevent any app from connecting
to those APIs. So you can literally limit it to things in
the same namespace or to specifically named
pods. A network policy looks a bit like this:
you would say what pods it applies to. So you'll use a pod selector
to match pods with certain labels or certain namespaces,
and then you can say whether it's an ingress or egress policy.
So this one here, for example, would deny all ingress: it says
it's an ingress policy, but it doesn't specify what can connect to
it, so the default is to deny everything. You can
do the opposite, where you say ingress and you
have an empty rule in the array, matching everything, which allows all ingress.
Same thing for Egress. I'll let you dig through the docs on all
the different options, but the big thing is that you can say this namespace
or these pods can connect to these pods. Very helpful.
Not all container network interfaces support it.
So Istio does, but some of them don't.
So you do have to think critically when you set up your cluster: do I
want this or not? And if I do, which network interface
do I want to use? That's a decision you make when you set up your cluster.
A lot of people are using cloud-managed clusters, and by default those come with
something like this. Definitely something to check into, though. There are
shortcomings, though. One of the big ones, and I
don't even know if I've got it on the PowerPoint here, is
that it doesn't allow you to control
access to external hostnames. So we see a lot of people using
Istio service entries to control access to particular hostnames.
Most of the rules are namespace-wide, but you can do pod
selectors, like label selectors. Not every conceivable rule can be
expressed. So again, external hostnames are a
big limitation. Say you wanted to allow an app to connect
to an external database, but that database doesn't have a fixed IP,
it's got a hostname. Kubernetes by default doesn't support that.
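The Istio service entry approach mentioned a moment ago can be sketched like this (the name and hostname are hypothetical):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: external-db          # hypothetical name
spec:
  hosts:
    - db.example.com         # hypothetical external database hostname
  location: MESH_EXTERNAL    # the service lives outside the mesh
  resolution: DNS
  ports:
    - number: 5432
      name: tcp-postgres
      protocol: TCP
```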
But again, people use Istio with egress gateways
to do that, or some people are using Cilium for that now. So there are
other methods. And yeah, network policies are fairly coarse-grained
and may not be appropriate for all workloads. Still, it's very powerful
and it can be extended with other third party applications as
well. All right, OPA and Gatekeeper. So OPA,
the Open Policy Agent, is really a policy engine built around Rego,
a language for describing policies. And Gatekeeper is a plugin
for Kubernetes. It creates custom resource definitions;
basically it extends Kubernetes to allow you to describe policies
inside of Kubernetes, and then it will assess and enforce those policies.
So you write these scripts, and usually you don't write the scripts yourself: you use
one of the open source templates and you just configure it. And then these scripts
validate, whenever something tries to deploy, that it meets those standards.
And it does more than deployments; you can do it for namespaces,
anything. I'll give you a couple examples.
So a good example would be maybe you want to make sure that for
cost accounting reasons, you want to make sure that every application
has a cost center attached to it. So if I open up
Gatekeeper in the UI,
the first thing I do is create a constraint template. So by default there
are numerous different constraint templates that are included
in gatekeeper. It's an open source project and they have a number of premade
constraint templates. Think of constraint
templates as different kinds of checks you might want to perform. So you might
want to say, I want to require
containers to have a limit on cpu and memory. So I can say okay,
container limit. And you'll see here we have
different constraint templates available, and they have this Rego
code in them. But writing Rego code is really complicated.
I can let you go dig into that on your own time,
but we don't see a lot of people actually doing that. Usually what they do
is they use one of the official ones, like container
limits or pod security checks,
like running as non-root.
Let me find one. I like the label one. Okay, required labels.
All right, so we're going to do required labels. We're going to actually require
that people tag every pod with a cost center
so that we can do our fin offs that we're all being told we have
to do. Right? So I've created the required labels constraint
template. Then I have to come in here and click add more and
I have to pick, I have to configure each
constraint. So the constraint template is the rego code,
the constraint is parameters for the regal code.
So I'm going to say I want to enforce it on all pods. I'm going
to call it the cost center constraint. I'll just call it
cost-center-required.
Okay: Kubernetes required labels. Description:
require a cost center to deploy pods. Excluded
namespaces: we're going to exclude
gatekeeper-system and kube-system and
msweeper-system. Okay, cert-manager too. So I'm going to
exclude my built-in stuff. Then I'm going
to set the key and allowed regex.
So we're going to require cost-center,
and the regex is going to be star, meaning anything. It's got to
be there, but it can be anything; I'm not going to be picky about the
format of it. Save changes,
and now I've just created it. Let me just take a look real quick.
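Under the hood, what the UI just created corresponds roughly to a K8sRequiredLabels constraint from the Gatekeeper policy library; a sketch (names mirror the demo, the exact namespaces and fields may differ):

```yaml
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels          # from the Gatekeeper policy library
metadata:
  name: cost-center-required
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    excludedNamespaces:          # skip the built-in system namespaces
      - gatekeeper-system
      - kube-system
  parameters:
    labels:
      - key: cost-center
        allowedRegex: ".*"       # the label must exist, any value is fine
```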
Good. There's different modes: audit and
enforce. Audit means we simply report on violations,
whereas enforce actually blocks them. It can take a minute to
actually run and give me back my violations. I've actually got another one that's
already been set up called container limits. And you can
see here, once it's had time to compile and run the rego code,
I can click on this violations here and it'll list off for me every pod
that's currently violating. So very useful.
Here we go. It's starting to list all the pods that
are breaking the rules. So my ingress controller doesn't have a cost center,
so I don't know who to charge for it.
So pretty useful. And you can see here, we did it all through a UI,
so it was super easy. We also support creating exceptions,
so you could give a team an exception, but only for a specific period of
time. Pretty powerful stuff. All right, so that's
gatekeeper. Next I want to talk about
code scanning. So whenever you create a container
image, a container image contains both your operating
system and all of the operating system utilities you require. Maybe you need
Ghostscript to create PDFs, for example, right? And you need Java to run
your Java code. And then it's got your code. So maybe it's your Java
jar files, your ruby code, your php code, your node code,
whatever. So this packaged up container image is
actually something we can scan, kind of like a VM image, and Trivy
and Snyk are the most common two that I see. We integrate with Trivy;
Snyk is coming. Let me show you a little bit
about that. So you can run it yourself locally from the command
line or in a CI/CD pipeline, and you'll get an output like this,
which is very useful. You can even block someone from deploying
something that has certain levels of vulnerabilities or even things that are
not fixable. But a lot of times we find people have to create
exceptions. So we created an interface here where it
will scan everything that's currently running in your cluster.
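For the command-line route mentioned a moment ago, a typical invocation looks something like this (the image name is hypothetical; these are standard Trivy flags):

```shell
# Scan a container image for vulnerabilities; --exit-code 1 makes the
# command fail when HIGH or CRITICAL findings exist, which lets a
# CI/CD stage block the deployment.
trivy image --severity HIGH,CRITICAL --exit-code 1 registry.example.com/myapp:1.2.3

# Optionally ignore findings that have no fix available yet.
trivy image --ignore-unfixed registry.example.com/myapp:1.2.3
```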
You can browse around say by namespace,
you can see what's running there and what container image it's running,
and then you can expand and see a scan of that container image
and whether or not it meets your standards. You can even
block things from booting up that don't meet your standards. And you can create exceptions
for teams that need those. So here you can see I
did Open Policy Agent Gatekeeper. Apparently I'm not running the latest,
and I can see here it's running an older version that
has a vulnerability, an out-of-bounds memory access. That's pretty
bad. And it's fixable in version 00:40
so I should probably upgrade. I can click
request exception here. So if I was getting blocked for that reason, I could click
a request exception and request the security team give me an exception.
We have an entire exception flow here where you can request an exception and
emails the admins. They can review and give you a thumbs up or thumbs down.
They can give you an exception for a specific period of time,
that sort of thing. And then if I scroll down, I can
click details and see details about that
particular CVE, Common Vulnerabilities
and Exposures entry. So here I can see details. If I
open this up in a new tab, I can see more
details about that CVE. So CVE 2002 228946,
rated high. You'll notice there's different scoring methodologies, but it's bad;
should probably upgrade. And there's a lot more in here
too. That's a little bit about CVE scanning.
I want to show you the exception management interface. It's pretty neat,
so you can see all the exceptions and then for any exception I
can configure it. So kind of like what you would expect. Super useful.
All right, next I want to talk about Project Falco.
I want to back up a little bit. So if you remember we talked earlier
about Linux kernel calls. So every time your app wants to do something other
than use CPU and memory, it has to make a kernel call to perform that
action. And so what if something was
doing something that it's allowed to do, but that thing seems suspicious?
Or what if it tries to do something that it's not allowed to do,
but trying to do it is itself suspicious?
So if an application tries to change its user
account, that's suspicious. If an application tries to
mount a volume when it's not supposed to,
that's suspicious, right. Project Falco can
monitor and alert you whenever that's happening.
Well, I should say alert: it doesn't send alerts itself, it'll just do an API call or
write a log. It can also monitor the Kubernetes API audit logs. It can
really do anything. It's kind of like OPA, the Open Policy Agent: a generic rules
engine. But we've integrated with it, and a lot of people use it for monitoring
Kubernetes API logs or just Linux
kernel calls with an eBPF probe, so it can monitor
for suspicious behavior.
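A Falco rule is itself just YAML; here's a sketch along the lines of Falco's stock "Write below etc" rule (the condition uses standard Falco fields, though the exact stock rule differs):

```yaml
- rule: Write below etc
  desc: An attempt to open a file under /etc for writing
  condition: >
    evt.type in (open, openat, openat2)
    and evt.is_open_write=true
    and fd.name startswith /etc
  output: >
    File below /etc opened for writing
    (user=%user.name file=%fd.name container=%container.name)
  priority: ERROR
```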
We make Falco very easy for people to set up and use.
Let me go back here.
There we go, Falco. So in my cluster I can
set up all my filters and I can see.
Okay, so here I can see all the recent events. Now this is a test
cluster where we've intentionally configured it so that we get lots of events.
So here I can click on this one and see what happened. Okay, kube-prometheus-stack.
It got a priority of error. Okay, that's high-ish.
And the message, the full message if I expand it:
okay, an attempt to open a file for writing.
All right, so it shouldn't do that.
And then I can see here all of the different other times that it occurred.
And if I click more, it actually will expand and it takes a
minute. But it's going to give me a graph of the historical incident rate.
So it's happening regularly. So it's probably part of some kind of regularly scheduled
process. We calculate
a signature here by combining several pieces of metadata and
then SHA-hashing it, so that I can find all of the other
cases where this same kind of thing occurred with Project Falco.
So you can search and see all the other incidents. And then if I go
down to raw data here, I can actually see the
full details in JSON or YAML or in a table
that Project Falco logged out. So very nifty.
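That signature idea can be sketched in a few lines of Python; the specific fields chosen here are an assumption for illustration, not Msweeper's actual scheme:

```python
import hashlib
import json

def event_signature(event: dict) -> str:
    """Group recurrences of the same kind of Falco event by hashing
    a canonical form of its stable metadata fields.
    (Field choice is illustrative, not Msweeper's actual scheme.)"""
    fields = event.get("output_fields", {})
    key_fields = {
        "rule": event.get("rule"),
        "priority": event.get("priority"),
        "namespace": fields.get("k8s.ns.name"),
        # Strip the random pod-name suffix so replicas group together.
        "pod_prefix": fields.get("k8s.pod.name", "").rsplit("-", 1)[0],
    }
    canonical = json.dumps(key_fields, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()
```

Two events from different replicas of the same workload then hash to the same signature, which is what makes "show me every other occurrence of this" a single lookup.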
So we've built this. You can go into your Falco
settings globally and you can create rules.
We've found that by default, Project Falco is kind of chatty. So we've got
a rules engine here where you can go in and say ignore certain things in
certain environments. From a realistic standpoint, you're probably going to have to do some
tuning. We also do anomaly detection where it
will automatically alert you whenever it finds something new. So if
I go to settings in the corner here, I can say notify about anomalies,
notify no more than, say, once a week.
And I only want to be alerted on alert, emergency,
and critical, nothing below that. And I want
it sent to myself. Right. So pretty
powerful. This allows you to configure
alerts so that you're notified whenever something
suspicious happens. And because we're doing that signature where
we combine different metadata, we're able to alert whenever something new
has occurred. So it's really powerful.
All right, summary. So we talked
about the four C's of cloud native security: cloud, cluster,
container, and code. We talked about different tools you
can use, such as VPNs and firewalls to limit access to the Kubernetes API,
kube-bench, kube-hunter, role-based access control, OPA Gatekeeper,
Kubesec, Trivy, and Project Falco. And we wrapped it all up in a really neat
Msweeper demo. If you have questions,
if you go to our website, you can use the contact form to reach out to me, and you can click
on docs at the top and we have great documentation on how to get started,
so I definitely recommend starting there. We have an easy install
guide on the left here. This getting started guide will actually get you up and
running fairly quickly. It can be as easy as
a one-liner to try it out.
As I mentioned earlier, we also have Killercoda,
so that's also another great way to try it out.
You can spin up a cluster and install all the tools in 20 to 30
minutes, and it'll go away when it's done. So super easy.
Also, if you have any issues, if you go to github.com/msweeper/msweeper,
our GitHub repository
is where you'll see all the activity happening as well as
who has contributed. And you can always file an issue
to give us feedback about feature requests.
Or if there are gaps that you're finding or bugs that you're finding,
definitely post them there. We'd love to hear from you. All right,
thanks so much for the time. I hope you enjoyed it. I hope you learned
a lot about Kubernetes security and I hope to talk to you
on GitHub.