Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello, folks.
A very warm welcome from my side to Conf42 Incident Management 2024.
I am Manpreet Singh Sachdeva and I work for Walmart Global Tech in the United States.
Today, I will be talking about Kubernetes cluster security and will explain
how Kubernetes has become really important in managing containerized
applications, and why it is so important for us to secure our Kubernetes
workloads and applications and to protect ourselves against cyber attacks.
So first, let me introduce myself.
I'm Manpreet Singh Sachdeva and I carry very diverse
experience in DevSecOps, MLOps,
incident management, as well as automation testing.
I am certified as a Kubernetes administrator, application developer,
and security specialist as well.
I'm certified in AWS as well:
I hold both the Solutions Architect and the DevOps Professional certifications.
I'm very passionate about cyber security.
I consider myself a cyber warrior fighting against the cyber criminals
and protecting applications throughout the internet world.
Cyber security is one of the hottest topics nowadays.
We saw an example a couple of years back with the Log4j
vulnerability and the damage it caused throughout the industry.
All our customers were at stake.
There was big reputational damage to so many companies, as well as
financial damage. Attackers have many ways of attacking us.
The most popular way is phishing. Another is abusing
cloud resources to attack the cloud infrastructure and find a way
in, so that our worker nodes and our applications are impacted.
We'll talk about everything today.
First, we'll talk about how we can secure our infrastructure,
how we can secure our workloads, and how we can secure our cloud as well.
I'm really excited to share my talk, and I hope all of us learn
together and get something out of it.
So guys, let's get started.
So, Kubernetes cluster security.
What is it, how did Kubernetes become important, and what
advantages does Kubernetes offer us?
A few years back, all of us were hosted on virtual machines and bare metal servers.
And then we had a revolution where everything
started getting containerized.
So we had Docker containers, and
we started making our applications platform agnostic.
We can use Java, we can use Node.js,
we can use Python, and just containerize our
application and run it anywhere.
Docker had its own way of orchestration using Docker Swarm, but
it had limitations: we couldn't operate at scale, and there
were limitations in the resources as well.
And then came Kubernetes and offered us out-of-this-world orchestration.
We started operating at scale.
All the applications started adopting the Kubernetes way of functioning.
Kubernetes was made for applications to work at scale, and
all the different application teams started containerizing their workloads.
And today we will learn how we can leverage real-world best practices to
safeguard our cloud native applications.
So I'm just presenting the table of contents for what we will
go through today. We will talk about Kubernetes security in general and our
multi-layered approach to Kubernetes security: how we can secure the control
plane, secure the nodes we work on, secure our applications, and how we can secure
the network part of Kubernetes.
Then we will touch on some of the real-world threats and incidents
which happened in the past, and how we can leverage some best practices
for our workloads going forward.
I'll also share some resources
at the end of this session. I'm just excited,
so let's get started.
So, introducing you to Kubernetes security. As I told
you, Kubernetes revolutionized the way organizations deploy, manage,
and scale applications. The best example is that you
can scale in and scale out based on the auto scaling we can configure
with the help of the cloud services.
Kubernetes itself has its own auto scaling, called horizontal pod
autoscaling, where you can scale in and scale out and operate according to
the customer traffic you get. So we are given great power
to operate at scale. But with great power comes great responsibility:
securing our clusters is really essential, as they become a
prime target for cyber attacks.
A single security flaw can lead to major data breaches, service outages, and, as
I told you, a lot of reputational damage.
In the past, we have seen big outages caused by security
vulnerabilities in our infrastructure,
maybe due to negligence or unawareness on the part of the
teams who actually operate at scale.
So here we will talk about how we can secure not only our workload
applications, but also the control plane and the network policies:
how we can make sure that we have network policies that don't
even allow access to our cluster from certain IP addresses.
We'll also talk about how to manage our secrets in such a way
that we don't expose them
and never keep them in plain text.
We always encode them and use secrets management systems like Vault,
AWS Secrets Manager, and Azure Key Vault.
That way we can work in conjunction with the cloud providers
and secure our infrastructure.
For modern cloud infrastructure, security
is non-negotiable.
Robust defenses at all layers ensure that applications remain safe from intrusions.
When we talk about security, we talk about the four C's.
One is cluster-level security: we secure
the platform cluster itself.
When I talk about the cluster, I mean the infrastructure.
Then the container: the workloads which are running
in containers. As we know, everything is containerized.
So we secure our containers so that nobody can exec into a container,
and nobody can just come, use the IP address, and do anything on a container.
We also have to secure the cloud. Whatever cloud provider we use, GCP,
Azure, or AWS, all have really good
functionality which we can leverage to secure our cloud. I will give
some examples at the end of how we can
secure our cloud infrastructure as well.
And last we come to the code.
Even the code has vulnerabilities if we don't follow
the standard coding guidelines
which we should be following when
we are working in an agile environment.
So we'll talk about that as well.
Here we have a multi-layered approach to security.
By securing each layer, we ensure that there is comprehensive protection
across the entire Kubernetes environment.
When we say comprehensive, we mean end-to-end protection
for a Kubernetes environment.
Kubernetes cluster security involves addressing risk at multiple layers.
The four key layers start with the control plane,
which is nothing but the brain of the cluster.
All of the master nodes are in the control plane, and we have
to make sure that the way we manage our cluster is secure.
We have to apply all our security policies at the control plane,
because the admin always has control plane access, and
we have to make sure that we follow the security guidelines and adhere
to whatever security policies are given by our company.
We all have our company policies, so we have to adhere to them.
Then we talk about the nodes: the worker nodes
that run the workloads, which may be
virtual machines or physical machines.
Then we talk about the actual workloads, the containerized apps
which are deployed in the cluster.
We'll talk about how we can secure our workloads.
And then we talk about the network part, which is the communication pathways that
connect the services, the applications and the infrastructure together.
So we'll talk about the network security as well.
So let's first talk about control plane security.
In the control plane, we have API server authentication and authorization.
It's really important to secure this API server authentication.
For that, we can implement OAuth or OpenID Connect,
which is a very strong way of authenticating, and we can also use
role-based access control policies.
Role-based access control policies are very popular in the Kubernetes world.
Suppose I'm a user, I'm part of a group, and the group has an owner.
As a group owner or as a group member,
I will only have a certain level of access
to some of the functions on Kubernetes. Suppose
I'm part of a developer community and I'm using a staging environment.
I may be able to list, create, delete, or do anything on a
lower environment. But suppose there is a production environment.
There, my role-based access as a developer
should not let me delete any resource or
edit any resource.
So role-based access assigns a certain kind of ownership,
or certain rules, to the group we belong to.
Suppose there is a developer group, there is a DevOps group,
there is an SRE group, and then there are the incident managers.
All will have a different kind of role-based access,
and the role-based access will define the kind of operations they
can do on a particular workload or on a particular control plane node.
So this is what we spoke about role-based access.
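The staging versus production split described here can be sketched in a few lines of Python. This is only an illustration, not the speaker's setup: the namespace names, the role name `developer`, and the helper functions `make_role` and `is_allowed` are hypothetical, and the check is a heavy simplification of what the API server's RBAC authorizer does. The manifest fields (`rbac.authorization.k8s.io/v1`, `Role`, `rules`, `verbs`) are the standard Kubernetes ones.

```python
import json


def make_role(name, namespace, resources, verbs):
    """Build a namespaced RBAC Role manifest as a plain dict (hypothetical helper)."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        # "" is the core API group (pods); "apps" holds deployments.
        "rules": [{"apiGroups": ["", "apps"], "resources": resources, "verbs": verbs}],
    }


def is_allowed(role, verb, resource):
    """Simplified check: does any rule in the role grant `verb` on `resource`?"""
    for rule in role["rules"]:
        verbs, resources = rule["verbs"], rule["resources"]
        if (verb in verbs or "*" in verbs) and (resource in resources or "*" in resources):
            return True
    return False


# Assumed example: developers get full access in staging, read-only in production.
staging_dev = make_role("developer", "staging", ["pods", "deployments"],
                        ["get", "list", "watch", "create", "update", "delete"])
prod_dev = make_role("developer", "production", ["pods", "deployments"],
                     ["get", "list", "watch"])

print(json.dumps(prod_dev, indent=2))
print("delete in staging:", is_allowed(staging_dev, "delete", "deployments"))
print("delete in production:", is_allowed(prod_dev, "delete", "deployments"))
```

In a real cluster a Role like this would be bound to the developer group with a RoleBinding in each namespace, so the same group gets different permissions per environment.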
Then we come to etcd encryption.
So what is etcd?
We have heard this name a lot in Kubernetes. It's
nothing but a key-value database.
It works somewhat the way Redis works.
It should always be encrypted at rest to prevent unauthorized access
to our critical information.
It's very important to have encryption on etcd, because if we don't,
all our resources, including the kube-system resources,
are vulnerable: they will all be exposed if we don't have
encryption at the etcd level.
Most companies follow etcd encryption, but even if we do
a demo project, or just want to do a POC, it is still very important to
encrypt your etcd, because a small gap in managing etcd properly
risks exposing your resources.
That is very risky.
It's always recommended to encrypt etcd at rest.
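As a rough sketch of what encryption at rest involves, the snippet below builds a Kubernetes `EncryptionConfiguration` for Secrets with a freshly generated 32-byte key. The key name `key1` and the helper `make_encryption_config` are assumptions for illustration. The structure follows the `apiserver.config.k8s.io/v1` format that the API server reads through its `--encryption-provider-config` flag, and a real deployment would write the result to a protected file rather than keep it in a variable.

```python
import base64
import secrets


def make_encryption_config(key_name="key1"):
    """Build an EncryptionConfiguration dict with a random AES key (sketch only)."""
    # 32 random bytes, base64-encoded, as the aescbc provider expects.
    key = base64.b64encode(secrets.token_bytes(32)).decode("ascii")
    return {
        "apiVersion": "apiserver.config.k8s.io/v1",
        "kind": "EncryptionConfiguration",
        "resources": [{
            "resources": ["secrets"],
            "providers": [
                # The first provider is used for writes, so new Secrets are encrypted.
                {"aescbc": {"keys": [{"name": key_name, "secret": key}]}},
                # identity (no encryption) stays last so existing plain data is still readable.
                {"identity": {}},
            ],
        }],
    }


config = make_encryption_config()

# Print only the provider order, never the key material itself.
provider_order = [next(iter(p)) for p in config["resources"][0]["providers"]]
print("providers in order:", provider_order)
```

The provider order matters: listing `identity` first would leave new Secrets unencrypted even though a key is configured.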
Then we talk about network policies.
Network policies are applied on the Kubernetes services.
Kubernetes services are the way the pods or other resources communicate with
each other, based on the ports we expose.
Network policies control the traffic between the pods.
Suppose my app is in one namespace and another app is in a different namespace.
We have two apps.
We don't want them to communicate.
So we will create network policies that
restrict traffic between the apps.
And suppose there is a database app which uses a logging app.
For them we will always want communication.
So for them we will have a network policy which allows access to
and from both the apps using the ports on the service objects.
So network policies always control the traffic between the pods.
They minimize the risk of lateral movement within the cluster.
There are specific tools which help us leverage network policies,
like Calico or Cilium; they can actually define and
enforce these kinds of policies.
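The two cases described here, blocking traffic by default and allowing one specific pair of apps, can be sketched as `NetworkPolicy` manifests built in Python. The namespaces (`logging`, `data`), the `app` labels, the port 8080, and the helper names are all hypothetical. Note that in the `networking.k8s.io/v1` API a policy selects pods by label rather than attaching to a Service object, and it only takes effect when the cluster's network plugin (for example Calico or Cilium) enforces it.

```python
def default_deny_ingress(namespace):
    """Deny all incoming traffic to every pod in the namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-ingress", "namespace": namespace},
        # An empty podSelector matches all pods; no ingress rules means nothing is allowed.
        "spec": {"podSelector": {}, "policyTypes": ["Ingress"]},
    }


def allow_from(namespace, target_app, source_namespace, source_app, port):
    """Allow pods labelled app=source_app in source_namespace to reach app=target_app."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"allow-{source_app}-to-{target_app}", "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": {"app": target_app}},
            "policyTypes": ["Ingress"],
            "ingress": [{
                "from": [{
                    # Both selectors in one entry: the pod must match in that namespace.
                    "namespaceSelector": {
                        "matchLabels": {"kubernetes.io/metadata.name": source_namespace}
                    },
                    "podSelector": {"matchLabels": {"app": source_app}},
                }],
                "ports": [{"protocol": "TCP", "port": port}],
            }],
        },
    }


# Assumed example: lock down the logging namespace, then let the database app in.
deny_all = default_deny_ingress("logging")
db_to_logging = allow_from("logging", "logging", "data", "database", 8080)
print(db_to_logging["metadata"]["name"])
```

Starting from a default-deny policy and adding narrow allow rules is what limits lateral movement: a compromised pod can only reach what has been explicitly opened.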
So we spoke about control plane security.
Now we will talk about how we can actually secure our nodes.
If somebody has worker-node-level access,
they can do anything: they can even remove our workloads, they can even
expose the secrets. If attackers have access to the node, it really
becomes difficult for us to control what they can do. On the node
there is an operating system, so we have to harden the operating system.
So how can we do that?
Start by minimizing the attack surface at the OS level. Use lightweight,
hardened distributions, which are much less vulnerable.
For example, there is Ubuntu Minimal.
Then follow the CIS benchmarks for system hardening.
CIS benchmarks are nothing but security standards which we have to follow,
so we make sure that we are following the guidelines set by the CIS benchmarks.
Then we talk about container runtime security.
On a node we have the containers, so it's always better that we limit the
container's access to the host system by enabling AppArmor or SELinux.
I will talk about AppArmor, and I'll share some resources on how we can secure
our runtime through AppArmor.
With it, we are able to make sure that
if a profile is not listed in AppArmor, or if it is blacklisted, it
cannot access the container at runtime.
That's a really important feature: you can restrict certain
profiles through blacklisting and you can allow certain profiles
with AppArmor through whitelisting.
Both things can be done.
So you can restrict the container runtime
to a specific profile. By profile, I mean that there may be a user, a
group of users, or a particular profile which we have created on the node, which
has access to some of the containers.
That can be achieved through AppArmor.
Then we talk about security at the kubelet level.
How can we secure our kubelet,
the agent running on each node?
It runs like a daemon on each node.
We can secure the kubelet by enforcing TLS for communication.
We can use certificates,
and the certificates are bound to keys.
So always use TLS certificates on the kubelet and restrict
kubelet API access.
That way, only clients that have the certificate key
can do operations at the kubelet level.
So it is always very good practice to use TLS certificates.
We can always enforce TLS for communication and
restrict the kubelet API.
By this we will have no unauthorized access at the node level.
Only the admins, or other authorized users, who have the keys for those certificates
can run commands on the kubelet.
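A minimal sketch of these kubelet settings, written as a `KubeletConfiguration` dict plus a small checker, might look like this. The file paths and the `kubelet_findings` helper are assumptions for illustration. The field names (`authentication.anonymous`, `authentication.x509.clientCAFile`, `authorization.mode`, `tlsCertFile`, `tlsPrivateKeyFile`) are from the `kubelet.config.k8s.io/v1beta1` configuration format.

```python
kubelet_config = {
    "apiVersion": "kubelet.config.k8s.io/v1beta1",
    "kind": "KubeletConfiguration",
    "authentication": {
        "anonymous": {"enabled": False},            # reject unauthenticated requests
        "webhook": {"enabled": True},               # validate bearer tokens via the API server
        "x509": {"clientCAFile": "/etc/kubernetes/pki/ca.crt"},  # assumed path
    },
    "authorization": {"mode": "Webhook"},           # ask the API server whether a caller may act
    "tlsCertFile": "/var/lib/kubelet/pki/kubelet.crt",           # assumed path
    "tlsPrivateKeyFile": "/var/lib/kubelet/pki/kubelet.key",     # assumed path
}


def kubelet_findings(cfg):
    """Return a list of weaknesses in a kubelet config dict (simplified, hypothetical check)."""
    findings = []
    auth = cfg.get("authentication", {})
    if auth.get("anonymous", {}).get("enabled") is not False:
        findings.append("anonymous authentication is not explicitly disabled")
    if not auth.get("x509", {}).get("clientCAFile"):
        findings.append("no client CA file configured for certificate authentication")
    if cfg.get("authorization", {}).get("mode") != "Webhook":
        findings.append("authorization mode is not Webhook")
    return findings


print(kubelet_findings(kubelet_config))
```

The checker deliberately treats a missing setting as a finding, so that the secure values have to be stated explicitly rather than assumed from defaults.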
So this was all about node security.
Now let's talk about application security, the workload security: how to
actually go inside a container and have policies which can secure the container.
The most important thing is to not give a container an elevated
level of access: the user should not have root access,
and we should not expose the kind of access where one can
run all admin commands.
That is really important when we talk
about the pod security standards.
Initially, there was something called pod security policies, which
is now deprecated and has been replaced by pod security admission.
What it does is enforce the pod security standards, such as the baseline
and restricted levels, so that privileged access is restricted.
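To make this concrete, here is a small sketch with two parts: a Namespace manifest carrying the Pod Security Admission labels, and a simplified check of a container's `securityContext`. The namespace name and both helper functions are hypothetical, and the check covers only three of the conditions in the restricted standard (it also ignores settings made at pod level). The label keys `pod-security.kubernetes.io/enforce` and `pod-security.kubernetes.io/warn` are the ones Pod Security Admission reads.

```python
def restricted_namespace(name):
    """Namespace manifest that asks Pod Security Admission to enforce 'restricted'."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            "labels": {
                "pod-security.kubernetes.io/enforce": "restricted",
                "pod-security.kubernetes.io/warn": "restricted",
            },
        },
    }


def container_violations(container):
    """Simplified check of a container spec for elevated access (not the full standard)."""
    sc = container.get("securityContext", {})
    problems = []
    if sc.get("privileged", False):
        problems.append("privileged container")
    if sc.get("runAsNonRoot") is not True:
        problems.append("runAsNonRoot is not set to true")
    if sc.get("allowPrivilegeEscalation") is not False:
        problems.append("allowPrivilegeEscalation is not set to false")
    return problems


good = {"name": "app",
        "securityContext": {"runAsNonRoot": True, "allowPrivilegeEscalation": False}}
bad = {"name": "app", "securityContext": {"privileged": True}}

print(container_violations(good))
print(container_violations(bad))
```

With the `enforce` label in place, the API server rejects pods that violate the chosen level at admission time, so a check like this is mainly useful earlier, for example in a CI pipeline.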
Then we talk about runtime security:
continuously monitor the containerized workloads for malicious activities.
For runtime security, we have a tool called Falco, which will
detect any kind of anomalies based on the rules and the
behavior patterns which we have set.
I will also share the resources for the Falco tool
at the end of the session. It's a really cool tool with which we can
apply runtime security to the workloads. Then, a very important
thing is secrets management:
how we manage our secrets.
As I told you, some people will just expose their secrets
in config maps in plain text.
That is not a good practice, because these config maps are available for any
user to list and even copy our secrets.
All sensitive information, like API keys or passwords, first
of all, should never be hardcoded.
Hardcoding a password or an API key in itself is damaging
and can cause a really big issue.
Instead we should use secrets management solutions like AWS Secrets
Manager, where we can store the secrets
and rotate them after maybe 30
days or 90 days, as per our secrets policy.
Most companies use HashiCorp Vault
to manage sensitive
secrets information.
By using HashiCorp Vault or even Azure Key Vault, we never
expose our secrets, and they're always encoded in base64 format.
So these are all best practices which we can use to secure our workload.
Secrets management is definitely a very secure way of
managing your secrets or any kind of passphrases we use in our code.
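Two of the points above can be illustrated with a short sketch. First, base64 is an encoding rather than encryption: anyone who can read the encoded value can decode it, which is why access control, encryption at rest, and an external secrets manager still matter. Second, instead of hardcoding a key, code can read it from the environment, where a secrets manager or the platform injects it. The variable name `DEMO_API_KEY`, the sample password, and the helper `get_api_key` are made up for this example.

```python
import base64
import os

# base64 is reversible by anyone: it hides nothing on its own.
encoded = base64.b64encode(b"s3cr3t-password").decode("ascii")
decoded = base64.b64decode(encoded).decode("ascii")
print(encoded, "->", decoded)


def get_api_key(env_var="API_KEY"):
    """Read a key injected at runtime; fail loudly rather than fall back to a hardcoded value."""
    value = os.environ.get(env_var)
    if not value:
        raise RuntimeError(f"{env_var} is not set; refusing to use a hardcoded key")
    return value


# Simulate the platform injecting the secret (assumed variable name).
os.environ["DEMO_API_KEY"] = "value-from-secrets-manager"
print(len(get_api_key("DEMO_API_KEY")))
```

Failing when the variable is missing is a deliberate choice: a silent fallback to a default key is exactly the kind of hardcoded secret the talk warns against.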
So moving forward,
we talk about network security.
How can we strengthen our cluster security?
We were able to secure the cluster
and we were able to secure our workload.
Now we have to look at how our workloads and all of the applications
hosted on Kubernetes interact with the network, and how we can
make sure that attacks on the network side are prevented.
So we will talk about service meshes.
Implementing a service mesh
in a large-scale application is almost a must, because then we will have,
say, an Istio-based service mesh, which is in itself very secure because it uses
mutual TLS encryption between all the microservices
which interact with that service mesh.
Using an Istio service mesh
or a Linkerd service mesh will ensure that you are secure at the network
level: if traffic is coming in, it will first come to the service mesh and
then it will get routed to the workload.
So it's always a very good practice to have a service
mesh in front of your application.
Then we can implement security
through ingress and egress control.
Some of you folks might be aware of the NGINX ingress controller.
What we can do is secure the external-facing services, whatever
ingress domain names we use. Suppose we use a domain name called abc.com
or something called xyz.com.
These are nothing but ingress hostnames.
The way to secure them is with
server-client encryption using certificates.
We can always use TLS certificates and we can always use ingress controllers.
Ingress controllers themselves have encryption
implemented at the network level. We can also define clear rules
for both ingress and egress traffic with Kubernetes network policies.
How we leverage this is that we use our Kubernetes network policies,
which, as we discussed, are applied on the Kubernetes service,
together with the ingress objects which
are defined in our Kubernetes cluster
with the help of the ingress controller.
So all the ingresses will be defined on the ingress controller, and all the
ingresses can be protected with the help of TLS certificates,
and those network policies can be leveraged with the ingress hostnames.
That is the way we
secure our incoming traffic, and even the outgoing traffic, through ingress
and through ingress controllers.
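As a sketch of protecting an ingress hostname with a TLS certificate, the function below builds an `Ingress` manifest whose `tls` section points at a Secret holding the certificate and key. The hostname `abc.com` comes from the talk's example; the Secret name, Service name, port, and the `nginx` ingress class are assumptions, and `tls_ingress` is a hypothetical helper. The manifest fields follow the `networking.k8s.io/v1` Ingress API.

```python
def tls_ingress(name, namespace, host, tls_secret, service, port):
    """Build an Ingress that terminates TLS for `host` and routes to `service`."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "ingressClassName": "nginx",  # assumes an NGINX ingress controller is installed
            # The referenced Secret (type kubernetes.io/tls) holds the certificate and key.
            "tls": [{"hosts": [host], "secretName": tls_secret}],
            "rules": [{
                "host": host,
                "http": {"paths": [{
                    "path": "/",
                    "pathType": "Prefix",
                    "backend": {"service": {"name": service, "port": {"number": port}}},
                }]},
            }],
        },
    }


ingress = tls_ingress("web", "production", "abc.com", "abc-com-tls", "web-svc", 80)
print(ingress["spec"]["tls"])
```

The same hostname appears in both `tls.hosts` and `rules.host`; if they differ, the controller may serve that host with a default certificate instead of the intended one.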
Then, we might have heard the term distributed denial of service attacks.
DDoS protection is also something we need for Kubernetes.
Kubernetes clusters exposed to the public internet are always
at risk of DDoS attacks.
We have cloud services which can prevent these DDoS attacks,
in all the cloud providers such as AWS, Azure and GCP.
They all offer us very scalable solutions, where we can get
protected from all these kinds of attacks by using their services.
Now we'll talk about real-world threats and incidents, as we are all
managing or part of major incidents.
In the past couple of years, we have seen major incidents in organizations
coming through security vulnerabilities.
In particular, there was an example where the Kubernetes
infrastructure was compromised,
and there was big reputational damage as well as financial damage.
So we'll just take that example here.
In 2021, a Kubernetes security incident happened.
An attacker gained unauthorized access to the cluster-level API server
due to weak authentication configurations.
As we were just discussing, if we leave these cluster APIs exposed
because of weak authentication,
it can be very damaging.
In this incident, the attacker was able to
gain access and destroy some of the running deployments.
The compromised API allowed the attacker to extract sensitive data,
the secrets as well as the passphrases,
and to tamper with the running workloads, causing a service outage
and service disruption at that moment in time.
This is just one example, but there were
several incidents which happened due to the same kind of problem:
unauthorized access to the Kubernetes cluster API server.
How can we actually prevent it from happening?
Having strong API authentication and role-based access control enforcement
is a very good way of preventing it.
And as we already spoke about etcd data encryption, it's very
important to encrypt our data in etcd at rest, all the time.
Another thing we can do is a monthly security audit,
or maybe automate the security audits,
to catch all kinds of misconfigurations: for example,
if somebody has removed a rule, or if no role-based
access is present on a cluster.
Those can all be red flags. All companies have an InfoSec team,
so they should regularly be doing security audits to help prevent
these kinds of incidents from happening.
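An automated audit of the kind mentioned here could start as small as the following sketch, which scans RBAC role definitions for a few red flags: roles with no rules, wildcard verbs or resources, and read access to Secrets. The sample roles and the `audit_roles` helper are made up for illustration. In practice the roles would be exported from the cluster, and a fuller audit would follow the CIS benchmark checks.

```python
READ_VERBS = {"get", "list", "watch"}


def audit_roles(roles):
    """Return human-readable findings for overly broad RBAC rules (simplified)."""
    findings = []
    for role in roles:
        name = role["metadata"]["name"]
        rules = role.get("rules", [])
        if not rules:
            findings.append(f"{name}: role has no rules")
        for rule in rules:
            verbs = set(rule.get("verbs", []))
            resources = set(rule.get("resources", []))
            if "*" in verbs:
                findings.append(f"{name}: wildcard verbs")
            if "*" in resources:
                findings.append(f"{name}: wildcard resources")
            if "secrets" in resources and verbs & READ_VERBS:
                findings.append(f"{name}: can read secrets")
    return findings


# Hypothetical roles, as they might look after exporting them from a cluster.
sample_roles = [
    {"metadata": {"name": "viewer"},
     "rules": [{"resources": ["pods"], "verbs": ["get", "list"]}]},
    {"metadata": {"name": "too-broad"},
     "rules": [{"resources": ["*"], "verbs": ["*"]}]},
    {"metadata": {"name": "secret-reader"},
     "rules": [{"resources": ["secrets"], "verbs": ["get"]}]},
]

for finding in audit_roles(sample_roles):
    print(finding)
```

Running a script like this on a schedule, or in a pipeline, turns the monthly manual audit into something that flags a risky change soon after it is made.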
We spoke a lot about all of these levels of security
on all the Kubernetes layers.
So let's also discuss some of the best practices which we
can take away from this session and follow in our daily lives
to secure our Kubernetes clusters.
I've listed some of the industry best practices for securing Kubernetes.
First is regular security audits, as we discussed on the previous
slide: continuously auditing the security of the
Kubernetes clusters using the guidelines provided to us by the CIS. They have set
up a set of benchmarks, which always help us to identify any kind of misconfigurations
or any gaps in our infrastructure.
So doing a regular security audit is a must for
applications to operate at scale.
Another way we can help our SRE teams and even the DevOps teams is to
set up continuous monitoring,
with the help of real-time monitoring. We discussed
a tool called Falco.
There's another tool, an open source tool called Sysdig, with which
we can always watch for suspicious behavior across the cluster.
Falco has rules.
If a rule for a particular workload is triggered, we will get alarms: we
will get notified on our emails and on all kinds of notification channels,
such as PagerDuty or xMatters, where we get notified when there is a breach
in security, or if there is a rule which should not have been there,
or some workload has triggered something, such as
some users having unauthorized access.
So if we have continuous monitoring in a DevOps culture and an agile
environment, that can definitely help the whole ecosystem to catch
incidents and issues in early stages and take necessary actions.
Another best practice for Kubernetes is a golden rule of thumb: always
give least privilege to everything.
The least privilege principle means: implement role-based access control
and always follow the least privilege principle. Any developer or any
user who doesn't need admin access should not have
the kind of access where they can go inside the container, run commands,
create directories, or do anything which is not required.
So always, as a rule of thumb, give least privilege to a user who is not required
to do anything important on the cluster.
And then, with automation, we can have automated
patching that regularly updates and patches the Kubernetes components.
Kubernetes also releases regular patches. If we follow
the Kubernetes documentation, they release security patches, I think
about once a month, and it's very important to keep up with the security levels.
So there should be regular patching with the help of CI/CD tools like Jenkins,
which can run automation pipelines
and close all the known vulnerabilities in the previous releases and make
sure that we are always up to date as per the Kubernetes documentation.
That's really important to help take out the old vulnerabilities
in the code or in the cluster.
So these are the resources which I am sharing with my viewers.
This one is how we can encrypt the data on a cluster.
Then how we can use another tool called Trivy.
It's a very handy tool with which we can secure our workloads.
I was talking about Sysdig,
so this is the documentation of Sysdig.
I was talking about AppArmor: how we can create a profile, create some rules
on AppArmor, so that only the users which are
associated with that profile can access some of the workloads, and
other users will not be able to access them
if we define the profiles in AppArmor. I'm also sharing
something called seccomp.
With this you can secure your network policies, or the
network part of Kubernetes.
Then, talking more about AppArmor, there are some tutorials
specifically for security.
Then we talk about the Falco rules,
and these are the docs for the Falco rules
you can
go through. There's a cheat sheet for Kubernetes security, very handy for
someone who implements security in day-to-day life on the clusters. And then
there is a security checklist, which is also very handy for someone who hosts
an application. Also, when you want your application to communicate
with other applications or outside vendors, it is very important
to check the security checklist.
By that you will know all the parameters which make
sure that your workload as well as your node is always secure.
So these are really good, handy resources and all of them are open source,
so you don't need to get a license.
Everything is available
open source.
The beauty of Kubernetes is that most of the stuff is cloud native
and open source, and it's very easy to implement. Most of the
stuff you'll find on GitHub, where people have written security policies which
the whole world can leverage.
At the end, I would say security comes more as a responsibility.
We all should step up, be open minded, and take responsibility
in fighting as cyber warriors. Why?
Because if we consider ourselves cyber warriors, we will find
that ownership in ourselves to protect all our applications
using these security best practices.
So concluding this talk, I would say, as Kubernetes continues to
drive cloud native innovation, security cannot be an afterthought.
There should be no compromise on security.
Every layer of the Kubernetes ecosystem must be fortified to protect against
increasingly sophisticated cyber threats.
By adopting best practices such as network policies,
runtime security, runtime monitoring, and Kubernetes secrets management,
organizations can ensure their clusters are actually secure and resilient.
Kubernetes offers us a niche specialist certification.
The Kubernetes security specialist
certification really offers a deep understanding of the security measures,
empowering professionals to take charge of cloud native security.
But for this, you have to be CKA certified first,
and then you can sit for the CKS exam. If you want to grow in
this field and understand the in-depth working of the security protocols on
the Kubernetes layer, I think this is a must for somebody who is motivated
by cyber security and wants to work in this field.
So that's the
recommendation I can give: this certification will really
help you understand the ins and outs of cluster security.
So that is it from my end.
A massive thank you to the Conf42 Incident Management team, who gave me
this opportunity to present my thoughts.
And a massive thank you to all my viewers who were with me, listening.
I hope you all gained something and take away really good
security measures to secure your infrastructure and your cluster.
If you have any questions, you can reach out to me; the details are
on the screen. I'm really thankful again for being given this
opportunity to share my thoughts.
I hope you liked it.
And if you have any comments, please do share.
Thank you.