Transcript
Hi, and welcome to our presentation and demo about building internal developer platforms leveraging the power of Kubernetes operators. My name is Dan McKean and I'm a product manager at MongoDB, and I'm joined by George Hanzaris, who's an engineering director, also at MongoDB. We're responsible for enabling our customers and users to run MongoDB on or through Kubernetes. Throughout this presentation, we're going to talk about the advent of internal developer platforms, how Kubernetes has become the standard tool for the underlying infrastructure for building internal developer platforms, and how Kubernetes can be extended. And finally, we're going to see how, by extending Kubernetes, we can build platform capabilities, with a short demo of how we built a database as a service.
We're going to start by looking at internal developer platforms and platform engineering. The term platform has been around for a while. Its most basic definition is a foundation that developers can use to build software applications. It provides a set of tools and services that make it easier to develop, deploy and manage applications. But the term internal developer platform has sprung up in recent years to take things further. According to the CNCF, this should include reducing the cognitive load on developers and product teams by providing them with a set of tools and services that they can use to build, deploy and manage applications. These are often described as golden paths. An IDP aims to provide a consistent experience for developers, which can help to improve productivity and reduce errors. And lastly, platforms can and should be designed with the user in mind, which makes them more user-friendly and efficient.
Platform engineering, unsurprisingly, is the effort that goes into designing, building, iterating on and maintaining such a platform. So a platform offers and composes capabilities and services from many supporting infrastructure or capability providers. Platforms bridge the gap between underlying capability providers and platform users such as application developers, and in the process they implement and enforce desired practices that we call golden paths. Platform capabilities may comprise several features of an IDP, meaning aspects or attributes of a platform that users can use, such as observability tooling, managed databases, secrets management and more. As platform interfaces, we describe the ways that platform capabilities are exposed to users of an IDP. These can include a CLI, a GitOps workflow using a tool like Argo CD or Flux, or developer portals like Backstage.
So now we're going to look at the rise of Kubernetes in
platform engineering. While there's not that much data to categorically prove that the majority of IDPs are built on Kubernetes, it does seem to be the consensus in the industry that Kubernetes is winning as the platform of platforms.
Reports from reputable industry analysts like Gartner,
Forrester or IDC often discuss the role of container orchestration
and specifically Kubernetes in the IDP space.
Many IDP vendors highlight their use or support of
Kubernetes on their websites, and many publish case studies
that mention Kubernetes usage within the IDP
implementation. But why is Kubernetes making such
a powerful impact for IDPs?
Firstly, containerization. Kubernetes unsurprisingly
excels at managing containerized applications.
IDPs, by their nature, often involve integrating various
services and tools, and Kubernetes makes this easier by packaging
these components as containers, making them portable and easier
to deploy and manage across different environments.
IDPs can experience fluctuating workloads depending
on their integration needs. Kubernetes facilitates automatic
scaling of resources up and down based on demand,
which ensures efficient resource usage and optimal performance.
IDPs often orchestrate complex workflows involving multiple services. Kubernetes excels there, automating deployment, scaling and networking of these services within the IDP. Kubernetes also offers a platform-agnostic approach, allowing IDPs to run on various infrastructure platforms, whether cloud, on-premises or hybrid, without needing to rewrite the code for each environment, which gives flexibility. And then there's declarative management: Kubernetes inherently uses a declarative approach where you specify the desired state of your application and Kubernetes manages achieving and maintaining that state. This simplifies IDP management and reduces configuration errors. In essence,
Kubernetes provides a robust container orchestration platform that
simplifies IDP development, deployment and management,
offering scalability, flexibility and efficient resource
usage. But Kubernetes hasn't always
been ready to be the platform of platforms. There have been a few key developments. Originally, Kubernetes focused just on container orchestration, but huge improvements in extensibility have meant it can provide a platform for far more, like databases, security systems and other cloud native components. As well as extensibility, the CNCF has fostered an amazing, rich ecosystem of tools and services that can be deployed and managed on Kubernetes, and standardization and abstraction have played a big part.
They've meant that Kubernetes has become far more interoperable and
the abstractions have provided simplified management. The aim
is to make the underlying complexity invisible, further solidifying
its role as a foundational platform. And lastly,
there are various extensible interfaces, which make integration with external tools much easier. The Container Storage Interface provides a well-defined API for attaching, mounting, and managing persistent or ephemeral storage for containers. The Container Runtime Interface acts as a bridge between Kubernetes and the underlying container runtime engine, like Docker or containerd, in each node of your cluster. In simpler terms, it allows Kubernetes to talk to different runtime environments without needing to be rebuilt for each one. And the Container Network Interface defines how network plugins manage the creation and configuration of network interfaces.
This allows you to use various networking solutions with your
clusters, providing flexibility in how containers communicate.
However, nothing is perfect, and there are often discussions about how Kubernetes
is potentially becoming too complex.
While it can handle various deployments, it might not always be the
most efficient choice, and the future may see a balance
between leveraging Kubernetes' strengths as a platform and using simpler
tools for specific needs. But abstraction might yet
provide the solution to simplify Kubernetes.
Before we go further, to understand how some of those key
developments have happened in a more practical sense, it's critical
to recap some of the fundamentals, so we can dive into the control plane, which acts as the brain of the operation. It manages the worker nodes and the applications running on them. There are multiple components in the control plane. The API server acts as a central point
of contact. It receives requests from the users and applications to manage
resources in the cluster. The scheduler
decides where to run pods on the worker nodes. It considers factors like resource availability and pod requirements to ensure efficient allocation. The controller manager is a collection of controllers that together ensure that the state of the cluster matches the desired state, and we'll come back to that in a minute. And finally, etcd, which is a highly available key-value store that acts as the cluster's single source of truth. It stores all the configuration data about the cluster and its resources. These components work together to automate and manage the lifecycle of containerized applications within the Kubernetes cluster. It's worth digging a little deeper into the controller
manager and the control loop. They're intertwined
concepts in Kubernetes, and they're also critical as we start to talk
about operators. As mentioned, the controller manager
is a collection of controllers that together ensure that the state of
the cluster matches the desired state. It's made
up of several controllers.
Examples include the replication controller, which ensures a desired number of replicas are running for a deployment; the endpoints controller, which maintains a list of pods that provide a particular service; or the namespace controller, which manages the creation and deletion of namespaces in a cluster. Now, each controller follows a control loop pattern. First, getting the desired state: the controller retrieves the desired state for a specific resource, like the number of replicas in a deployment, from the Kubernetes API server. Then it gets the current state: again, the controller queries the API server, for example to determine the actual number of pods running. And then the controller compares the desired and current state. If there's a discrepancy, it issues commands to the API server to take corrective actions, like scaling pods up or down.
Now, the control loop is the core principle behind each controller within the controller manager. It's a continuous, iterative process that ensures the cluster state remains aligned with the desired state defined in the Kubernetes configuration. Control loops enable self-healing: if a pod crashes, the control loop automatically detects it and launches a new one. Scalability: when pod replicas need to be adjusted based on demand, the control loop ensures the automatic scaling up or scaling down. And consistency: the control loop continuously works on rectifying any deviation from the desired state, keeping the cluster in a stable and predictable condition.
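To make that concrete, here's a minimal sketch of the declarative pattern a control loop works from (the names and image are purely illustrative): you declare the desired number of replicas, and the relevant controllers continuously reconcile the cluster toward it.

```yaml
# Minimal Deployment manifest: "replicas: 3" is the desired state.
# The deployment and ReplicaSet controllers continuously compare this
# against the pods actually running and correct any drift, for example
# by launching a replacement if a pod crashes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web          # hypothetical name, for illustration only
spec:
  replicas: 3        # desired state
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25
```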
So in essence, the controller manager
acts as the conductor, orchestrating the control loops of
individual controllers. These loops constantly
monitor and react to changes, ensuring the cluster
maintains the desired state. So now we're
going to take a look at how Kubernetes can be extended.
The inbuilt controllers of Kubernetes are critical, but one of the most
significant developments made to Kubernetes in recent years is
the ability to extend Kubernetes with custom controllers.
What if you need to manage something beyond the basic functions?
That's where these custom controllers come in. Custom controllers are
user-defined programs that extend the capabilities of Kubernetes.
They work in conjunction with custom resource definitions to manage
custom resources specific to your needs.
Here's a breakdown of how they work. In a custom resource definition, you define, using YAML, the schema for your custom resource. This essentially creates a new type of object that Kubernetes can recognize and manage. Custom controllers, typically written in Go, then interact with the Kubernetes API and follow the same control loop pattern as built-in controllers: they get the desired state, they get the current state, and then they reconcile, comparing the desired and current state, and if there are any differences, they take action to achieve the desired state.
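As a sketch of that first step, a minimal custom resource definition might look like this; the Widget type and its fields are hypothetical examples, not something from the talk:

```yaml
# A minimal CustomResourceDefinition: it teaches the API server a new
# object type that a custom controller can then watch and reconcile.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: widgets.example.com       # must be <plural>.<group>
spec:
  group: example.com
  names:
    kind: Widget
    plural: widgets
    singular: widget
  scope: Namespaced
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                size:
                  type: integer   # the desired state a controller reconciles
```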
There are several key benefits of custom controllers. They extend the
native Kubernetes functionality by enabling you to manage resources specific
to your application or infrastructure needs. They allow you to manage these custom resources in a declarative way: you just define the desired state of your custom resource and the controller handles achieving and maintaining it. And automation: custom controllers automate tasks related to managing your custom resources, for example, rolling updates to your pods. So what exactly
are Kubernetes operators? How do they fit in
with custom controllers and how are they different?
So operators package up custom controllers and
a few more components to make it easier to deploy and manage
applications. So where a custom controller executes a single control loop to manage a specific Kubernetes custom resource, operators not only package multiple controllers, but also additional assets needed to deploy and manage an application. These can include CRD definitions, specifying the custom resource types the operator manages; controller code, the custom controller programs that enforce the desired state for the custom resources; deployment manifests, which are YAML files defining how the application components like pods and services are deployed in Kubernetes; service manifests, defining the Kubernetes services needed by the application; Helm charts, packaging configurations for the application in a standardized format; and documentation and monitoring tools, information and utilities for understanding and managing the application in Kubernetes. So by bundling these elements together, operators offer a self-contained unit for simplifying management within Kubernetes of an entire application throughout its lifecycle. So why are operators
so useful? Firstly,
simplified application management, by providing a single tool for deploying and managing applications throughout their whole lifecycle. They offer a declarative approach: instead of writing complex deployment scripts, you simply define the desired state of the application using the operator's manifest files. Reduced errors: by automating manual operational tasks, operators minimize the risk of human error during deployment or configuration. Standardized packaging: operators promote a consistent way to package applications, and this standardization makes them more portable and reusable across different environments. Operators can also be designed with domain-specific knowledge for particular types of applications, like databases, messaging systems and so on, and this expertise ensures the operator understands the application's intricacies and configures it optimally within the Kubernetes environment.
Finally, a rich ecosystem of operators exists
for various applications and functionality, so you can find pre-built operators for popular databases like MongoDB,
monitoring tools and other popular components,
which saves you time and effort in managing them individually.
But not all operators are alike, so Kubernetes
operators can be broadly categorized into two types based on
the resources they manage.
Internal operators focus on managing resources that
are entirely within the Kubernetes cluster itself.
These resources are the core building blocks of applications deployed
within Kubernetes. They use both inbuilt and
custom resource types, including things
like deployments, stateful sets, persistent volumes
and so on. On the other hand, external operators
extend Kubernetes' reach by managing resources that
reside outside of the Kubernetes cluster.
This might be
a self hosted service running outside of Kubernetes,
something running in one of the hyperscalers, or any
other external service like MongoDB's Atlas developer data platform.
Despite their differences, both internal and external operators
offer the advantages associated with operators in general,
so simplified application management, declarative configuration,
reduced errors and standardized packaging.
And by leveraging the appropriate operator type, you can
effectively manage both internal and external resource
dependencies, leading to a more
robust and streamlined application deployment and management
experience within Kubernetes. So now we're going to move on to look at a demo. In the demo, we're going to focus on just one element of an IDP, using Kubernetes and operators to show how you can build a database as a service into your IDP. We're going to use an external database as a service: it's going to be Atlas. Atlas is a software-as-a-service offering, and its value as a developer data platform comes from different complementary tools like search, analytics and support for mobile apps, as well as support for serverless, time series data, geospatial data, and multi-cloud global distribution and resilience. We're also going to use the Atlas Kubernetes
operator. It enables you to use the same tooling and processes to
manage Atlas as you use for your services in Kubernetes.
Atlas doesn't run in the Kubernetes cluster, but the operator allows
you to use declarative configuration files that can be applied into Kubernetes
where the operator picks them up and, using control loops, makes changes to Atlas via the Atlas Admin API.
We're also going to build a GitOps interface for our users, using Argo CD as the mechanism to bridge the gap between the IaC files in the repositories and Kubernetes. This gives us a highly automated and standardized GitOps workflow, and in doing so cuts down on the expertise and permissions that individual teams need, as no more direct interactions with Kubernetes are required. This provides a self-service mechanism. Let's see now
an operator in action and how managing external resources
really works. An operator can generally be installed in a few different ways: through kubectl, Helm charts or automation scripts. In this case, we'll use an automated way. We assume we've already installed the Atlas CLI, and this makes installation and configuration super simple. So by running the Atlas CLI's kubernetes operator install command, we firstly install the Kubernetes operator, but all the necessary setup happens in the background, such as creating new API keys for the operator to manage external resources.
We can see that the operator is now running in our cluster,
and if we go into our Atlas UI we see
that the relevant API keys have been created for the operator to use.
And finally, in the background, the API keys have been safely stored as Kubernetes secrets, so they can be used as the operator makes the API calls to reconcile.
So, going to our text editor, we are going to create an Atlas project. This is defined as a YAML file that we apply in Kubernetes, and we see that the new custom resource has been created in Kubernetes, and under the hood the operator has read the resource and, through the Atlas Admin API, has created this new project in our Atlas account.
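The project file looks roughly like this minimal sketch (the names and the wide-open access list are illustrative, not the exact file from the demo):

```yaml
# AtlasProject custom resource: applying it with kubectl creates the
# resource in Kubernetes, and the operator then creates the matching
# project in Atlas via the Atlas Admin API.
apiVersion: atlas.mongodb.com/v1
kind: AtlasProject
metadata:
  name: my-project
spec:
  name: Test Atlas Operator Project
  projectIpAccessList:
    - cidrBlock: "0.0.0.0/0"   # wide open for the demo; tighten in real use
      comment: "Allow access from anywhere (demo only)"
```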
So we are now ready to deploy our first cluster. Back again in our text editor, a simple YAML file to create an AtlasDeployment: we describe the cloud provider, the instance type, and the region we want this cluster to be deployed in.
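And a sketch of that deployment file, following the shape the operator's documentation describes (exact field names vary across operator versions, so treat this as illustrative):

```yaml
# AtlasDeployment custom resource: the operator reads it and creates the
# corresponding cluster in Atlas through the Atlas Admin API.
apiVersion: atlas.mongodb.com/v1
kind: AtlasDeployment
metadata:
  name: my-atlas-deployment
spec:
  projectRef:
    name: my-project              # the AtlasProject created above
  deploymentSpec:
    name: test-cluster
    replicationSpecs:
      - regionConfigs:
          - providerName: AWS     # cloud provider
            regionName: US_EAST_1 # region
            electableSpecs:
              instanceSize: M10   # instance type
              nodeCount: 3
```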
Again, a simple kubectl apply, and we see that the custom resource again is created in our Kubernetes cluster, and the operator has created the cluster through the Atlas Admin API in Atlas. So let's move on
to build the platform interface as well, integrating and exposing this database-as-a-service functionality through a GitOps workflow using Argo CD. Now, installing Argo CD is pretty simple. Firstly, we'll be creating a namespace where all of the Argo CD components are going to be running, and then the installation is automated through an installation script. Just a single command, and a couple of seconds later we can see all of the Argo CD pods running in our argocd namespace. Next, we'll be creating an Argo CD
Application. We are going to define the repositories that Argo CD will be watching to find new Kubernetes configuration files, and the destination Kubernetes cluster and namespace that we're going to be using to deploy resources.
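Here's a sketch of such an Application manifest; the repository URL, path and namespaces are hypothetical placeholders:

```yaml
# Argo CD Application: Argo CD watches the Git repository and applies any
# manifests it finds there to the destination cluster and namespace.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: dbaas
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/atlas-iac.git  # IaC repository
    targetRevision: main
    path: atlas                                        # folder with the manifests
  destination:
    server: https://kubernetes.default.svc             # this cluster
    namespace: mongodb
  syncPolicy:
    automated: {}   # sync automatically on every Git push
```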
Going into the Argo UI, we can see this DBaaS application created, and we can monitor the sync status and so on.
So to deploy an Atlas project through this interface, we use the exact same file as before, same configuration. The main difference is that instead of kubectl apply, we just do a git push to our infrastructure-as-code repository, and then, looking into the Argo UI, we see that our Argo application has picked up the change and has created the AtlasProject custom resource in our Kubernetes cluster. And here it's the same as before: the Atlas Kubernetes operator has picked up this custom resource and has deployed the new project in our Atlas account. Now, deploying the cluster again, we're going to
use the same AtlasDeployment file. We're going to be using it in the GitOps project that we just created. A git push for the cluster creation, and all the same magic happens under the hood. Argo CD has picked up this change, and we can see this in the UI. It deploys this custom resource in Kubernetes, and then the operator creates the new database cluster in our Atlas project. Now for the last part:
as we've streamlined the deployment and management of new databases, we want to see how we can leverage the operators to make it easy for applications to connect to those databases in a Kubernetes-native way. So we'll be using a new custom resource called AtlasDatabaseUser in order to manage database access. And since each Atlas database user is just another YAML file in our repository, the operator again makes that super easy to manage through our GitOps workflow.
So we've already created a simple Kubernetes secret which contains the user password, and we have a custom resource for our database user where we refer to that password as we create the user. Similarly to before, through our GitOps workflow, Argo CD picks up the changes in our Git repository, sees the new resource, and deploys the AtlasDatabaseUser in Kubernetes.
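Here's a sketch of the two pieces involved, the password secret and the user resource; the names, role and label follow the operator's documented conventions but are illustrative, and may vary by operator version:

```yaml
# Kubernetes Secret holding the user's password, referenced by name below.
apiVersion: v1
kind: Secret
metadata:
  name: my-user-password
  labels:
    atlas.mongodb.com/type: credentials  # label the operator expects on credentials
stringData:
  password: change-me-please
---
# AtlasDatabaseUser custom resource: the operator creates the user in Atlas.
apiVersion: atlas.mongodb.com/v1
kind: AtlasDatabaseUser
metadata:
  name: my-database-user
spec:
  projectRef:
    name: my-project          # the AtlasProject from earlier
  username: app-user
  passwordSecretRef:
    name: my-user-password    # the Secret above
  databaseName: admin         # authentication database
  roles:
    - roleName: readWriteAnyDatabase
      databaseName: admin
```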
Now, an additional step here is that the operator creates a Kubernetes secret which contains all the connection information we're going to need to connect to our database. If we explore the secret, we can see the connection string that our applications need. So for the last part, we'll be creating
a simple Kubernetes deployment. We'll be using the mongo image to run a Mongo shell to connect to our Atlas database, and we read and use the connection string as an environment variable for our Mongo shell.
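A sketch of what that client deployment can look like; the operator-generated secret's name and key here are assumptions based on the operator's naming conventions, so check the actual secret in your cluster:

```yaml
# Client Deployment: runs a Mongo shell container and injects the Atlas
# connection string from the operator-generated Secret as an env variable.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongosh-client
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mongosh-client
  template:
    metadata:
      labels:
        app: mongosh-client
    spec:
      containers:
        - name: mongosh
          image: mongo   # the official image ships with the mongosh shell
          command: ["/bin/sh", "-c"]
          # Ping the database once to prove connectivity, then stay alive
          # so we can inspect the logs.
          args:
            - 'mongosh "$CONNECTION_STRING" --eval "db.runCommand({ ping: 1 })" && sleep infinity'
          env:
            - name: CONNECTION_STRING
              valueFrom:
                secretKeyRef:
                  # assumed name: <project>-<cluster>-<username>, lowercased
                  name: test-atlas-operator-project-test-cluster-app-user
                  key: connectionStringStandardSrv   # assumed key name
```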
We deploy this application with kubectl, and we can see that our Mongo shell pod is running.
And if we inspect the logs of our pod, we can see that this pod has connected to our Atlas database using our connection string. Now, as this was the end of the demo, I'm pretty sure you were expecting fireworks towards the end. The good news is that when it comes to managing infrastructure in critical production environments, not having fireworks is a good thing. And if you also want to experience this transformation, this is a good time to prep your phone. In the next slide, Dan will show you a QR code where we can help enable you to build an IDP through operators. Through the link or the QR code, you can find a landing page that gives you all the information you need to replicate our demo, or even to take the next step and build your own IDP database as a service using MongoDB Atlas. Thank you very much for joining us.