Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hi, I am Muralidhar Basani,
staff software engineer at Aiven. In this talk I will go through
a few challenges companies currently face when using Apache
Kafka. We will also explore a few tools
and get an introduction to Klaw, which is an open source project. I will
also give a demo of it and see how it actually overcomes
those problems. We have about 30 minutes for this talk,
so let's begin. Let's start with some background on Kafka
itself. As we all know, Kafka is widely adopted
in companies of all sizes due to its unique
features such as scalability, retention, and reliability,
setting it apart from traditional messaging platforms.
Kafka is similar to other messaging solutions, but
its exceptional features definitely make it stand out.
However, adopting Kafka comes with its share of challenges.
Now, many companies include Kafka in their technology stack as
it enables them to grow and scale effectively.
Events are published to Kafka and stored in topics,
which can grow significantly as the applications
around them also grow. However, managing these
topics and their authorizations can be a big struggle
for many organizations. Large
companies might have hundreds or even thousands of
topics, and improper governance and
authorization management can become a big headache.
To address these challenges, organizations often turn
to open source or commercial products, and some even
develop their own automation frameworks using tools like Jenkins,
Git, Confluence, or even Excel. In our
next slide, we will see how bringing structure to an
entity makes it more productive.
On the left is an image of a library where it is almost
impossible to find the book you are looking for,
and on the right-hand side we see books that are properly
organized and categorized. Most importantly,
it is also tracked who is renting those books,
and for how long. We can compare
these books to Kafka topics. In our next slide, let's see
how topics are represented in Kafka.
Here we see a typical Kafka environment and the provisioning
of topics to different producer and consumer teams.
Some of the problems here involve manual activities:
a single topic creation, providing producer and consumer access,
and promoting to higher environments can take
at least ten email exchanges. Executing
these steps, meaning the actual command execution, can take
two to three hours, excluding human
response time. Do these users
have the authority to request access to these topics?
As for whoever is creating those topics, there is no record
of who executed the manual commands for
topic and ACL creation. Basically,
centralized governance is challenging when
actions are initiated through emails or maintained in
spreadsheets, and a manual release process can
lead to errors and system outages, especially when
we think about schemas and their evolution.
Deploying different versions of the schemas
becomes critical. Did you know that Kafka
consumer clients can run into deserialization issues when
a new schema version is released on a topic, and
they are not aware of it unless the right compatibility is set
on the topic schema? Various questions emerge in
the context of a Kafka implementation, posing challenges for
developers. Who has the permission to produce and
consume on a topic? We know it's an application from a team,
but what if they are not authorized to get access?
Who owns a topic, a Kafka connector or,
most importantly, the schemas? How can
topics be promoted seamlessly from lower to higher
environments with tested configurations? Is security
properly enforced with access controls? How
can the Kafka configuration be backed up to manage the disaster
recovery process? These challenges are typically
not evident during the initial stages of the project.
However, as the application or the project expands
and the number of topics grows, managing them
really becomes cumbersome. There are a few
tools available to address certain problems in this space
and below I have mentioned a few.
Kafka Manager from Yahoo is quite a good one,
with a nice user-friendly interface. If you are
looking for metadata to be stored in Git with
CI/CD pipelines, then JulieOps has
good automation there. Provectus has a UI
for Apache Kafka. It is a simple tool that makes
your data observable and also helps you find and
troubleshoot issues faster. These tools
are quite good, but not exactly what we are looking
for. For example, a user can get a topic or ACL
created directly on the cluster without any approvals,
and there wouldn't be any ownership of the topics, so it's very hard to find
the right contact person. That's where we see the need
for Klaw. Klaw is a web application toolkit
designed to automate the process of creating and managing
Kafka topics, ACLs, schemas and connectors.
It focuses on four main principles: governance,
self-service, security and automation, which we
shall look at in detail in the next slide.
Governance involves defining roles, responsibilities,
ownership, auditing activities, and naming conventions,
to name a few. Klaw acts as the
single source of truth, preventing manipulations on the clusters.
If manual changes are made anyway, Klaw can identify them,
notify administrators, and synchronize across the systems.
Self-service empowers teams to become
independent of the infrastructure teams and
manage the Kafka configurations themselves. They can create,
promote, edit and claim topics, provide producer
and consumer access, and request schemas and connectors.
Security is crucial for preventing
unauthorized manual changes on the cluster.
Klaw supports various protocols like Kerberos and
SSL for secure connections,
and users can log in using their Active Directory credentials
or existing SSO mechanisms.
And the last one, automation, is a core feature of Klaw,
making it self-sufficient. It enables easy provisioning
of configurations and allows smooth promotion
of topics across different environments.
Metadata in Klaw can be synchronized back and
forth with the cluster, with users receiving notifications
for every request or configuration update.
Now that we have seen the fundamentals, let's see the architecture
of Klaw. Klaw's new UI is developed with React
as the front-end technology, while it
also has an Angular-based UI. Klaw has two
Java-based Spring applications. Klaw has defined workflows for
applying configurations to Apache Kafka clusters, and also
other types of clusters. Instead of directly creating
configurations on the cluster, Klaw follows the four-eyes
principle. This approach entails raising
a request and obtaining approval before
actually implementing any changes. The benefit of workflows
is that they provide an additional layer of security by mitigating
the risks associated with manual entries.
Basically, it ensures a thorough review and verification of the request
by another person, ensuring the sanity of the application,
and we can easily track the configuration change history.
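
To make the request-then-approve workflow concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the class names, method names, and status values are hypothetical and are not Klaw's actual code or API, and a plain list stands in for the Kafka cluster.

```python
from dataclasses import dataclass, field


@dataclass
class TopicRequest:
    """A pending topic request (hypothetical model, not Klaw's own)."""
    topic: str
    requested_by: str
    status: str = "CREATED"
    history: list = field(default_factory=list)


class FourEyesWorkflow:
    def __init__(self):
        # Stands in for the real Kafka cluster in this sketch.
        self.provisioned = []

    def raise_request(self, topic, requester):
        request = TopicRequest(topic=topic, requested_by=requester)
        request.history.append(("requested", requester))
        return request

    def approve(self, request, reviewer):
        # Four-eyes rule: the reviewer must be a different person.
        if reviewer == request.requested_by:
            raise PermissionError("requester cannot approve their own request")
        if request.status != "CREATED":
            raise ValueError(f"request is already {request.status}")
        request.status = "APPROVED"
        request.history.append(("approved", reviewer))
        # Only after approval is the change applied to the (simulated) cluster.
        self.provisioned.append(request.topic)


workflow = FourEyesWorkflow()
req = workflow.raise_request("test-topic-for-demo-conf", "william")
workflow.approve(req, "jennifer")
print(req.status, workflow.provisioned, req.history)
```

The `history` list is what gives the change tracking mentioned above: every request and approval is recorded together with the person who performed it.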
All this ownership metadata and the actual topic,
ACL and schema configurations are stored in the metastore.
By default, Klaw uses H2 as the metastore.
This means that there are no additional dependencies to get started
with your project. If you prefer to use another
RDBMS such as MySQL, we of course recommend using
it. The second Spring application connects
to the clusters for cluster-related operations.
Another nice aspect is that you can connect
to any number of Kafka clusters; there is no limit.
As Klaw runs on the four-eyes principle,
it is recommended to use Klaw with at least two users,
to request and review topics, et cetera.
In the next five demos we will have two users,
William and Jennifer, where one requests and the
other approves. I will be demonstrating the provisioning of
topics, ACLs and schemas, and also the disaster recovery process.
This use case is to provision a Kafka topic.
Here, William requests a topic, which is validated and
stored in the database, and Jennifer reviews and approves the request.
On approval, the topic is provisioned on the cluster. That's it.
The same principle is applied for the other elements too.
I have two browser sessions open, one with the user William
and the other one with Jennifer. So this
is William, the other
one is Jennifer. So let's request a topic and then approve
with the other user.
Go to dev, and let's say test
topic for demo conf.
We can mention a topic description, topic
for demo, and we can also add some text for
the approver to know what this topic is about and why
we need it:
needed for demo.
We can see our topic request, topic for demo conf.
If we go to approve requests, then we should
be able to see it as the last one. So here is the
topic request. Let's view it and
approve the request. And if we go to topics and
search for conf, we should see the
topic. Now that we have seen topic creation,
let's get consumer access on it. Usually there
are multiple applications producing to and consuming from Kafka topics.
These applications can be owned by different teams,
and the consumer teams have to request read access
on the topics. If this process is not
automated, it can take ages to get approvals and it is
very hard to track what is actually happening in the background.
In Klaw, any team can view a topic and
request access to it. It is the topic owner team
that decides to approve or decline the request.
Note that Klaw masks the ACL information for
other teams. Only the owner team can view ACL details
like IP addresses, SSL, or the principals.
This is an added security measure in Klaw.
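
The masking idea can be sketched in a few lines of Python. This is an illustration only: the `Acl` fields, the `view_acl` function, the team names, and the `****` placeholder are assumptions made for the example, not Klaw's data model.

```python
from dataclasses import dataclass

MASK = "****"  # placeholder shown to non-owner teams (assumption)


@dataclass(frozen=True)
class Acl:
    topic: str
    access_type: str  # "PRODUCER" or "CONSUMER"
    principal: str
    ip_address: str


def view_acl(acl, topic_owner_team, viewer_team):
    """Return the ACL as a dict, hiding sensitive fields from other teams."""
    is_owner = viewer_team == topic_owner_team
    return {
        "topic": acl.topic,
        "access_type": acl.access_type,
        "principal": acl.principal if is_owner else MASK,
        "ip_address": acl.ip_address if is_owner else MASK,
    }


acl = Acl("test-topic", "CONSUMER", "CN=orders-app", "10.0.0.12")
owner_view = view_acl(acl, topic_owner_team="DevRel", viewer_team="DevRel")
other_view = view_acl(acl, topic_owner_team="DevRel", viewer_team="Payments")
print(owner_view)
print(other_view)
```

Every team still sees that a subscription exists on the topic, but only the owning team sees who holds it and from where.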
In this use case, William requests a consumer ACL on a topic that is owned by
a different team, and a member of that team
approves the request. On approval, the ACL is directly
created on the topic, which means the relevant application can
start consuming from the topic right away. Now let's get
into the demo. For the purpose of this demo,
I have moved Jennifer to a different team.
We can see that here: Jennifer now belongs to the DevRel team.
So let's get into the topic. If you go to subscriptions,
currently there are no subscriptions on it. I will request a consumer ACL
on it. Go to consumer and then
consumer ACL access. Now
that the request is created, if we come to approve requests
from the topic owner team and look in the
ACLs, we should
see this request. We can view it, and
this is the ACL access.
Now let's approve this request.
Now that the request is approved, let's see the
topic.
We can see there is one consumer access on it.
So we have now created the topic and also provided consumer
access. Schemas are created for events
or messages. They provide a good structure to the event.
In Kafka projects, it is very much needed to
define these schemas in the initial stage of the project;
otherwise it would be hard to get things in the right direction
as the project grows. Klaw relies on the REST API
of the schema registry server for all of its schema operations.
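
To show why schema evolution can break consumers, here is a deliberately simplified compatibility check in Python. It is a sketch under assumptions, not a schema registry's real algorithm: schemas are plain dictionaries with a flat list of fields, types must match exactly, and features such as type promotion, unions, and aliases are ignored.

```python
def can_read(reader_schema, writer_schema):
    """True if a consumer using reader_schema can decode records that were
    written with writer_schema (simplified record-level check)."""
    writer_fields = {f["name"]: f["type"] for f in writer_schema["fields"]}
    for f in reader_schema["fields"]:
        if f["name"] in writer_fields:
            # Same field on both sides: this sketch requires identical types.
            if writer_fields[f["name"]] != f["type"]:
                return False
        elif "default" not in f:
            # The reader expects a field the writer never sent, no fallback.
            return False
    return True


v1 = {"fields": [{"name": "id", "type": "string"},
                 {"name": "amount", "type": "double"}]}
# New version that drops a field existing consumers still expect.
v2_drops_amount = {"fields": [{"name": "id", "type": "string"}]}
# New version that adds an optional field with a default.
v2_adds_currency = {"fields": v1["fields"] + [
    {"name": "currency", "type": "string", "default": "EUR"}]}

print(can_read(v1, v2_drops_amount))   # existing consumer, new data
print(can_read(v1, v2_adds_currency))  # existing consumer, new data
print(can_read(v2_adds_currency, v1))  # upgraded consumer, old data
```

The first call models the situation where an existing consumer hits a deserialization problem after a new schema version is released; a compatibility setting on the subject is what lets the registry reject such a version up front.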
Klaw uses the concept of a team for schema ownership
and management: the topic owner team requests a
schema for a specific topic, and the team that
owns the topic is responsible for making the final decision
on any schema-related request, such as approving or declining
it. Klaw enforces the topic name
strategy to ensure that only one schema
is applied per topic; it uses the topic name
to identify the schema subject used for
lookups. Klaw supports Aiven's Karapace and also
Confluent Schema Registry. In the demo
here, William requests a schema on a topic that is owned by
his team. The request is validated and stored, and the
other user, who belongs to the same team, is going to
approve it. On approval, the schema is directly created on the subject.
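
As a small illustration of the topic name strategy, the Python sketch below derives the subject from the topic name, assuming the common `<topic>-value` and `<topic>-key` convention. The `subject_for` and `register` helpers and the in-memory `registry` dictionary are hypothetical stand-ins for a real schema registry.

```python
registry = {}  # subject -> list of schema versions (in-memory stand-in)


def subject_for(topic, is_key=False):
    """Derive the schema subject from the topic name."""
    return f"{topic}-{'key' if is_key else 'value'}"


def register(topic, schema):
    """Store a new schema version under the topic's value subject."""
    subject = subject_for(topic)
    versions = registry.setdefault(subject, [])
    versions.append(schema)
    return subject, len(versions)


print(register("test-topic", '{"type": "string"}'))
print(register("test-topic", '{"type": "bytes"}'))
print(sorted(registry))
```

Because the subject is derived only from the topic name, a second registration for the same topic becomes a new version of the same subject rather than a second, unrelated schema.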
Now let's get into the demo. To request a schema on the topic,
go to the topic and the schema tab. There are no schemas here.
Request a new schema, select dev,
and upload a new schema.
Okay, so we have now selected a schema.
Now that we have submitted a request, let's go to the other user
and the schemas tab. You can see
the request is already waiting. Let's approve it, and
we can now see that the schema exists. Promotion is
a key feature of Klaw that improves governance,
administration, and control of topics. With topic
promotion, a topic can be initially created in the lowest environment
and then promoted to the next environments as needed. Once a
topic is created in the base environment, you can promote it.
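
The promotion order can be pictured with a short Python sketch. The environment names and their ordering are assumptions for illustration (the demo only uses dev and test), and `next_promotion_env` is a hypothetical helper, not part of Klaw.

```python
ENV_ORDER = ["DEV", "TEST", "PROD"]  # lowest to highest (assumed ordering)


def next_promotion_env(topic_envs, order=ENV_ORDER):
    """Return the next environment a topic can be promoted to,
    or None when it already exists in every environment."""
    if order[0] not in topic_envs:
        raise ValueError("topic must first be created in the base environment")
    for env in order:
        if env not in topic_envs:
            return env
    return None


print(next_promotion_env({"DEV"}))
print(next_promotion_env({"DEV", "TEST"}))
print(next_promotion_env({"DEV", "TEST", "PROD"}))
```

Walking the environments in a fixed order is what keeps a topic from skipping straight to a higher environment before it exists in the lower ones.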
This will create a promotion request that your teammates can
review and approve or decline. The topic overview will
show the environments where the topic is configured, including the environment
to which you want to promote the topic. In the demo here,
William requests the topic promotion. The request is validated and
stored, and the other user, who belongs to the same team,
is going to approve the request. On approval, the topic is promoted
to the next environment. Now, you might be wondering whether
we really need two people for these activities.
I would say yes. All these topics
and ACLs are nothing but infrastructure; we are creating
infrastructure as code, and it has to be reviewed by peers.
When I meet people, they keep asking me whether
ownership is mandatory on topics or schemas.
Yes, again. Imagine, without defined ownership,
whom would you contact for any issue on the topic, or about
permissions, usability, documentation, and many
more? It is similar to the books in the library, which are either
rented by people or lying in the library.
Basically, this ownership gives responsibility to
them. Let's now get to the demo. We will now
promote the topic from dev to test. If you see here,
this topic is only available in dev. So let's go inside and
click on promote. We want to promote with the same configuration
that is provided. Topic
promotion: submit a promotion request. And
for the purpose of this demo I have again moved Jennifer back to
the same team. We can see that now.
If we come back to approve requests, go to topics
and search for conf,
it is waiting for approval. View it and then approve.
If we come back to topics,
you can see here that the topic now exists in both dev and test.
Disaster recovery is a common concern in
most software projects, and so it is with Kafka.
Klaw helps immensely in the recovery process.
Klaw supports synchronization of the
configuration between Klaw and Apache
Kafka and other clusters. Note that it is only
the configuration and not the actual data that
lies on the topics. Klaw allows
for seamless synchronization of topics and ACLs
from the clusters into your new setup.
If your Klaw instance is already up and running,
restored from a backup, or unaffected by a
cluster outage, you can use the synchronize
option to reinstate or update topics and ACLs
across the clusters, which ensures
data consistency and uninterrupted operations.
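
Conceptually, synchronization starts from a comparison of what exists on the cluster with what the metastore knows about. The Python sketch below shows that comparison with plain sets; the function name and the example topic names are hypothetical, and only topic names (configuration), never topic data, are involved.

```python
def diff_topics(cluster_topics, klaw_topics):
    """Compare topic names on the cluster with those recorded in the metastore."""
    cluster, klaw = set(cluster_topics), set(klaw_topics)
    return {
        # Exist on the cluster but are unknown to Klaw: sync from the cluster.
        "missing_in_klaw": sorted(cluster - klaw),
        # Recorded in Klaw but absent on the cluster: candidates to reinstate.
        "missing_on_cluster": sorted(klaw - cluster),
    }


result = diff_topics(
    cluster_topics=["orders", "payments", "audit"],
    klaw_topics=["orders", "invoices"],
)
print(result)
```

The first list corresponds to bringing an existing cluster under governance, and the second to restoring configuration onto a cluster after an outage.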
In the last demo here, a superadmin, who has
the sync topics permission, logs in,
fetches the topics from the cluster, and syncs them to Klaw.
Let's get into the demo now. Synchronization
of topics from the cluster can be done in two ways,
with the individual option or with the bulk option.
With the individual option you have to select every topic one by one,
and with the bulk option you can select all of them in
one go. Here we see about 162
topics that are out of sync, which means they don't exist
in Klaw. We select them, for example,
and assign them to a particular team.
Now all the topics are synchronized with
Klaw. Note that we are not synchronizing any
data that exists on the topics, only the topic
configuration. Now all these topics exist in Klaw.
So does Klaw fit into your project? The answer is mostly
yes. Klaw is an open source solution under the Apache
license, so you can download and deploy it in any of your environments for
free. It basically consists of two Java processes and is also
available as Docker images, making it production-ready and
deployable in high availability mode. Additionally, Klaw offers
a rich React-based UI, which can be accessed once the npm
assets are built. We are almost at the end of the talk here.
We have a few useful links. If you have any technical
inquiries regarding the project, please feel free to raise a GitHub issue in
the GitHub repository. The project can be downloaded from GitHub or
Docker, or you can also access it from the available releases.
Thank you for watching, and I hope you enjoyed it.