Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello everybody and welcome to the Conf42 Kubernetes track. If you're wondering whether this is yet another Kubernetes talk, then the answer is: maybe. Stick around and let's find out. Today's presentation is entitled "I don't know Kubernetes, and at this point I'm too afraid to ask."
The title is rather extensive, and today we're going to touch upon the Kubernetes API, Kubernetes controllers, and the operator pattern. Today's agenda is a rather simple and straightforward one.
We're gonna talk a little bit about Kubernetes history,
then I'm gonna share some personal blunders and aha
moments. Then we're gonna have a small demo concerning operators.
And last but not least, we're going to talk about Kubernetes' importance in today's landscape. A couple of words about me.
My name is Alex, I'm a site reliability engineer at Systematic, and in my spare time I like to contribute on various platforms. You can find me under the handle Dejanu Alex. Feel free to reach out if you have any questions concerning this presentation.
Kubernetes has its roots in two internal systems at Google: Borg, a cluster manager, and Omega. These are some snippets from the white papers, which are publicly available. I highly recommend reading them a little bit, just to understand Kubernetes. Now, the first prototype of Kubernetes was written in Java; it was actually a Borg cell running on a local machine, and it was later rewritten in Go for obvious reasons.
In 2014 we have the first public commit, followed one year later by the CNCF, the Cloud Native Computing Foundation, to which Kubernetes was donated as the seed technology. Here we can see Google's announcement: we are pleased to contribute Kubernetes, an open source cluster scheduler. The formulation is rather interesting, right? It's not just a plain orchestrator. Now, for those who don't know, the CNCF, the Cloud Native Computing Foundation, is an umbrella for open source projects.
This is the CNCF landscape, and in here each small square is actually an open source technology that addresses a specific need. Here we have the main pillars, like application definition and image build, and databases. Here, under scheduling and orchestration, we can see Kubernetes. And if we zoom out a little bit, if we scroll, we can see that the landscape is rather daunting: we have almost 200 technologies in the CNCF. Now, why is it important that Kubernetes was the seed technology? Moving further,
after almost ten years, the job market looks like this: desired characteristics, Kubernetes; technical stack, Kubernetes; nice to have, Kubernetes. And last but not least, what your day will look like: Kubernetes. These are snippets from actual job descriptions, and, fortunately or unfortunately, Kubernetes is a prevalent technology. So the ecosystem has more or less evolved around Kubernetes, right, it being the seed technology in the CNCF. Our entry
point in today's discussion is Kubernetes lingo, right? So let me share a funny story with you. In the beginning, I was involved in various meetings, in one of which someone said: we need a solution that supports automatic resource bin packing for our workloads. And I was like, okay, so we're not talking about Kubernetes, because Kubernetes is a container orchestration system, right? But actually, this is the technical definition. So, coming back to the Borg and Omega white papers, the scheduling part, the automatic resource bin packing, is a very important feature of Kubernetes. And of course we have various terms,
right? Like naked pods, which are pods that don't run under a controller. We have things like workload: you might say my pod has been scheduled, or you can say the workload has been scheduled. We have various container design patterns, like init containers, sidecar containers, and the ambassador pattern.
And also for those interested in Kubernetes
administration, we have things like static pods.
Now, almost every presentation concerning Kubernetes starts with this high-level view of the Kubernetes architecture, in which we have the control plane, with the API server, the scheduler, the controller manager and etcd, and the data plane with the workers. Right? Now, one might say, okay, so this is the special thing about Kubernetes. But if we were to take a look at another technology, we can see that it also has a control plane with managers, it also has a Raft-based state store like etcd, and, last but not least, it has a data plane with workers. So for sure we can see some similarities, right? Then this begs the question: what's the special thing about Kubernetes?
Even though we work with YAMLs on a daily basis, behind the scenes Kubernetes has a nice HTTP-based API server. So behind the scenes there are actually JSONs, and some protocol buffers for internal calls. And on top of this API server, which is the core of the Kubernetes architecture, Kubernetes implements the controller pattern. What do we mean by controller pattern? It's a simple watch loop that watches the desired state, meaning the one that we have in our YAMLs, and the current state, which is the actual state of the objects we have running in our cluster, and tries to do the reconciliation. The famous reconciliation loop is actually the controller pattern, which tries to adjust the current state in order to match the desired state.
Now, the controller mainly does three things: it observes the desired state, it analyzes, and, last but not least, it corrects the drift of our current state. So the controller is, as I said, basically a generic watch loop, so to say.
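To make the loop concrete, here is a minimal sketch, assuming a cluster with a hypothetical deployment named web, of how the desired and current state show up side by side and how the deployment controller reconciles them:

```sh
# Desired state: what we declared in the manifest
kubectl get deployment web -o jsonpath='{.spec.replicas}'
# Current state: what the controller has actually reconciled so far
kubectl get deployment web -o jsonpath='{.status.readyReplicas}'
# Change the desired state and let the controller correct the drift
kubectl scale deployment web --replicas=5
```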
Operators are the next step. Operators are a design pattern for extending the Kubernetes API and creating software to run on Kubernetes. This is a rather steep definition. Now, basically, an operator is just a custom controller, right? A custom watch loop.
And most of the time, operators need some custom resources. Here on the right side we have some custom resource definitions. So if you do kubectl get crds, you're going to see the custom resource definitions that you have in your cluster, things like Harbor, Jaeger or Keycloak configuration resources, right?
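As a quick illustration, assuming a cluster that already has some operators installed (the resource name below is just a placeholder):

```sh
# List the CustomResourceDefinitions registered in the cluster
kubectl get crds
# Dump one of them to see the API it adds (replace with a real name from the list)
kubectl get crd <crd-name> -o yaml
```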
Now we're going to have a demo concerning operators. The entire premise of our setup is the following: as we said, the operator is a custom controller, so we're going to deploy an operator in our cluster. The operator is configured by a custom resource, so we're going to have a custom resource definition, and a custom resource is actually a way of extending the Kubernetes API. And the entire purpose of this operator is to aggregate logs from various pods. So let's see it in action, starting with some small prerequisites.
Now, in Kubernetes we have the documentation at our fingertips. So if we do a kubectl explain pod, for example, we can see exactly how the controller works, so to say. We have some nested fields: the spec, which is the desired state, and the status, the most recently observed status of the pod. If we drill down a little bit and take a look at the status, we can see all the known phases: Pending, Running, Succeeded, Failed and Unknown. So these are our pod's spec and status, the fields that the controller uses.
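For reference, a minimal sketch of those commands as you would type them against any cluster:

```sh
# Built-in API documentation, served by the cluster itself
kubectl explain pod
# Drill down into the desired state and the observed state
kubectl explain pod.spec
kubectl explain pod.status
# The known phases are documented under status.phase
kubectl explain pod.status.phase
```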
Now, you might have heard that Kubernetes is a compute abstraction. And why is that? For example, on my local machine, if I would like a sorted view of my processes' resource consumption, I can run top and select CPU as the sort key, and here we have a sorted view of the resource consumption of our processes. But guess what? We can do the same thing in Kubernetes: we do a kubectl top pods, let's take a look at the containers, let's sort by CPU, and let's look in all namespaces. So here we have a sorted view for our pods. And guess what? We can do the same thing for our nodes, with kubectl top nodes.
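These are the actual commands behind that view, assuming metrics-server is installed in the cluster:

```sh
# A sorted, cluster-wide view of pod resource usage
kubectl top pods --all-namespaces --containers --sort-by=cpu
# The same idea at node level
kubectl top nodes
```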
Here I have the simple Kubernetes cluster provided by my setup, the Docker Desktop solution. The interesting part is that if we increase the verbosity a little bit, we can see that behind the scenes an API is being called, more exactly the nodes metrics API, and we have a response body: the JSON that we talked about in the beginning. Now, since we have the endpoint, we can do the same request ourselves: we can do a raw request to this endpoint and pipe it to jq, and guess what, we're going to have a response. So if we do kubectl top pods, sorry, top nodes, we're going to see that we have the same response. Of course, previously we had it in nanocores and here we have millicores; nonetheless, we can talk with the API. So we can see that behind the scenes it's a nice API.
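A rough sketch of the two steps, assuming metrics-server is serving the metrics.k8s.io group:

```sh
# Raise the verbosity to see which endpoint kubectl calls behind the scenes
kubectl top nodes -v=8
# Call that Metrics API endpoint directly and pretty-print the JSON response
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes | jq .
```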
Now, if we take a look at the API resources, let me zoom out a little bit, we can see that we have our, let's say, vanilla objects like pods, secrets, services, deployments. Here we have the API version, the API groups, and here the names of our kinds. The kind, the so-to-say object in Kubernetes, is, more exactly, an API resource endpoint. Now, if we grep for metrics, we can see that this is actually the API endpoint that we used when we did kubectl top pods or top nodes. So if we take a look at this API group, we can see that we have two objects behind the metrics API group: this is the version, and here we have the kinds, meaning the objects. Now we have an idea that behind the scenes there's an API; we basically talk with the API to get various information.
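A minimal sketch of that exploration from the command line:

```sh
# Every resource the API server knows about, with its group, version and kind
kubectl api-resources
# Narrow it down to the metrics group behind kubectl top: NodeMetrics and PodMetrics
kubectl api-resources --api-group=metrics.k8s.io
```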
Now, the interesting part is that we can extend this API. For example, if we would like to extend the API with a, so to say, kcd API endpoint, we just verified that we don't have it. We need something, and that something is a custom resource definition. So if we take a look at our custom resource definition, there are a couple of important things in here: the API group (we said that we want the kcd API group); behind this API group we have a single object, of kind LogDrain; and, last but not least, this object has only one spec field, the target field, which is of type string.
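For orientation, here is a rough sketch of what such a CRD could look like. The group, plural and version strings below are assumptions; the talk only mentions a kcd group, a LogDrain kind with short name ld, and a single string field spec.target:

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: logdrains.kcd.example.com        # hypothetical group
spec:
  group: kcd.example.com                 # hypothetical group
  scope: Namespaced
  names:
    kind: LogDrain
    plural: logdrains
    singular: logdrain
    shortNames: ["ld"]
  versions:
    - name: v1alpha1                     # version name is an assumption
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                target:
                  type: string           # label value the operator should watch for
```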
So now, if we apply this custom resource definition, we can see that we have our new API group, with version v1alpha, and behind this API group we have the API resource of kind LogDrain. And the nice thing is that we can explain it. So if we do kubectl explain ld, using the short name for LogDrain, we can see that it is a resource that allows the configuration of a demo operator. So it needs an operator, right? But before creating the operator, let's take a look at the spec. Looking at the spec, we can see that it has only one field, as we already know: the target field. The spec is described as the specification for managing log aggregation for the demo operator, and the target is used to set the desired label value for the operator to pick up.
So now this begs the question: what's an operator? We said it's a custom controller, a custom watch loop, but actually it's just a simple application, a simple binary. We can take a look at the small script we're going to package. Actually, let's build it: we need to build our image, using the same tag as in the manifest. And basically, we've packaged our application as a Docker image.
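In shell terms, the packaging step is roughly this; the image name and tag are illustrative, not the ones used in the talk:

```sh
# Build the operator binary into a container image
docker build -t demo-operator:0.1.0 .
# The deployment manifest must reference the same image tag before we apply it
```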
Now we need to deploy it. But keep in mind, our application needs to talk with the Kubernetes API, therefore it needs some RBAC permissions. So we need to create a service account, a cluster role, and a cluster role binding for our application.
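As a rough sketch of what that RBAC could look like (names and rules are assumptions, not the talk's actual manifests): a ServiceAccount, a ClusterRole that can read pods, pod logs and LogDrain objects, and a binding between them:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: demo-operator
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: demo-operator
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["kcd.example.com"]       # hypothetical CRD group from the sketch above
    resources: ["logdrains"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: demo-operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: demo-operator
subjects:
  - kind: ServiceAccount
    name: demo-operator
    namespace: default
```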
Of course, in order to deploy our application, we're going to need to apply the deployment. So we have a deployment, and if we take a look at the pods running in the default namespace, we can see our operator, which is our application, running in a pod. So if we do a kubectl logs and follow the demo operator, we can see that our operator looks for LogDrain objects.
Now let's create our LogDrain object. Let me split the screen here. Of course, keep in mind that on the left side we have the operator, our custom watch loop, which is running and looking for LogDrain objects. So we created our custom resource definition, but we don't have any LogDrain objects yet: if we do a kubectl get logdrain, there's nothing.
Now let's create an object. Let's create our LogDrain object: it's of kind LogDrain, let's call it demo-ld, and let's set the target to kcd.
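A sketch of that manifest, with the same caveat that the exact group and version strings are assumptions:

```yaml
apiVersion: kcd.example.com/v1alpha1     # hypothetical group/version
kind: LogDrain
metadata:
  name: demo-ld
spec:
  target: kcd                            # label value the operator should pick up
```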
And now, as soon as we create our LogDrain object, so we kubectl apply it, we can see that our operator pod has already seen the LogDrain object: a LogDrain object named demo-ld with spec target kcd. So that's nice. Now we have a LogDrain object which tells our operator to pick up the logs for all pods that have the kcd label.
But we don't have any pods labeled with the kcd label, therefore we need to create one. I'm going to use a simple nginx image and create a naked pod.
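For reference, a minimal sketch of such a naked pod; the label key below is an assumption, since the talk only says the pods carry the kcd label that the operator watches for:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-1
  labels:
    target: kcd                          # hypothetical label key, value from the talk
spec:
  containers:
    - name: nginx
      image: nginx:stable
```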
As soon as my nginx pod, named pod one, is healthy and ready, we can see that our operator picked up its logs: it found pod one in the default namespace, and here we can see the logs of the nginx-based pod. We can create another naked pod. Remember, the scope of the operator was to drain the logs for all pods with the target label kcd. Here it's a simple Flask-based application, and the story remains the same: as soon as the pod is healthy, we can see that the logs have been picked up for pod two as well. And here we can see both pods right now.
Coming back to our presentation, what have we seen? We've seen that a resource in Kubernetes is actually an endpoint in the Kubernetes API. Operators are custom controllers which allow us to have opinionated resources on top of Kubernetes, meaning that we have a way of extending the Kubernetes API. Now, this enables us to build various things on top of Kubernetes, and here we've seen the entire adoption of platform engineering. It's rather easy to extend the Kubernetes API, but this doesn't mean that you really need to do it. You can end up with something that works but lacks a proper developer experience or a proper use case. So keep in mind: even if extending the Kubernetes API is rather easy using custom resources and custom controllers, think twice before doing it.
In many ways, everything is an abstraction; we're working with abstractions on a daily basis. So when you create a custom resource and an operator, you create an abstraction that can be used in Kubernetes. As we've seen, Kubernetes is a platform for building platforms: it's easy to extend its API, and it has an almost plug-and-play approach with regards to the API. And last but not least, a very important thing: Kubernetes is only as good as the infrastructure it runs on top of. So keep in mind: many small nodes versus a few larger nodes is an ongoing debate, and Kubernetes will not do any magic if the infrastructure is under-provisioned.
That was it. Thank you very much.
Don't forget, if you have any questions with regards to this presentation,
this is my handle. Have a nice one.
Goodbye. Thank you.