Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hi everybody, my name is Robert Hodges and
I'd like to welcome you to my talk on fast, cheap, do-it-yourself
monitoring with open source analytics and visualization.
I'm presenting today at Conf42 DevSecOps 2023.
I'd like to thank the organizers for inviting me to talk and for
doing all the work to make this conference possible. Thanks a
bunch. It's a pleasure to be here. Let's do a few intros.
So, my name again, Robert Hodges. I've been working on databases
for over 30 years. Actually, this year it's 40.
And I've been heavily involved with open source, Kubernetes, security,
and other issues related to operational topics
around managing data, particularly in cloud and cloud
native environments. My day job: I run a company called
Altinity. We are a service provider
for ClickHouse, a very popular data warehouse. We'll be talking about
it for a good chunk of this talk.
We run a cloud, so we have hundreds of clusters that we run on
behalf of people. We also help a very large number of people run it themselves.
Among other things, we're the authors of the Kubernetes operator for ClickHouse.
So if you run ClickHouse using a cloud native approach,
running it on Kubernetes, there's a good chance you're using our software already.
And just a little bit about my colleagues who've helped put together
the information behind this talk. We have about 45 people in
the company, spread out over 16 countries. We are,
by and large, database geeks, with centuries of combined experience with databases
and applications, particularly analytic databases.
So let's jump right in. Monitoring. Why do we do it?
Well, it could be because we like looking at nice screens,
but really it's to answer questions. So when something
happens in your system, for example, users start to see performance problems.
You want to know why. And as you dig deeper, like when
do the performance problems start? How many users are affected?
Which of the services is at fault? These are questions that
require data, and moreover, they require a history of the systems
in order to answer. Now, in the
old days, we used to take a slightly different approach,
which leads to a question, what's the best way to answer those
questions I showed on the previous slide? Here's the old way.
Go into the system, lay hands on it, run vmstat, and kind of
watch the numbers until they become blurry. So would you like to
do it this way, or would you like to do
it visually? So chances are if you're in this
business, you already have monitoring like this. This is actually
Grafana, which we'll be talking about. But this type of visual
display is much easier to understand, interpret and
use. So, visual displays: well, there are a lot of systems
that will do this that come right off the shelf.
Now, there have been proprietary solutions developed in this space for years.
And in fact, if anything, over the last few years we've seen a blossoming
of systems to do observability in general and
system monitoring in particular. But perhaps they're
not for you. One simple reason is they can be very costly.
But another one is that you may have specialized needs
for monitoring that you need to cover. Perhaps your business is monitoring,
so you don't want to use somebody else's system; you're developing your own.
You may want to own the stack, you may want to control the data.
There's a bunch of reasons why you might
want to do it yourself. So let's look
at how to do that. The basic system that we're going to build
for monitoring consists of three parts. We have the source
data, so we need something that can collect that data and
ingest it. We need a place for it to live. And that's
what we call an analytic database. This is a database that's designed to
hold large amounts of data and answer questions on it
very quickly. And then finally we need a mechanism to
display it so that you end up with some nice graphical visualization
like what I showed you a couple of slides ago.
So let's look into how we would go about building that type
of system. So the first thing we want to do is pick
an open source analytic database. Open source
databases tend to be problem specific,
and as we're looking at them, there are several that you might consider.
So you might consider OpenSearch, which is the open source version of Elasticsearch.
That's great for full text search on unstructured
data, and it can be used for log analytics.
You could also use Presto, a very powerful query
engine that can do federated queries across many data sources
and data lakes. But for this
type of system, particularly observability, one of the best
choices on the market is ClickHouse,
which allows you to do real time analytics. By that I mean
being able to run queries and get answers back almost instantly.
And it can do this performantly
and easily on very, very large quantities of data.
It's in fact used for an enormous number of use cases,
ranging from web analytics to network flow logs,
observability of course, financial markets,
SIEM, and so on and so forth. It's super popular for this and
a great choice. So here's
a short list of reasons why ClickHouse has
turned out to be such a good choice for so many people.
So it is a SQL database. In fact, in many ways it has
the simplicity and the accessibility of a system like MySQL.
So it understands SQL, it runs practically anywhere.
You can literally run it on a phone. There was actually a demo of that
a few years back, all the way up to running
it in huge clusters containing hundreds of servers in the
cloud. It's also open source, in this particular case
Apache 2.0, which is super flexible and gives you the
ability to run it for any purpose. In addition,
it has very powerful analytic capabilities. It shares
a number of features that are standard for analytic databases.
They include storing data in columns; we'll show an example on
the following page of why
that's such an important feature. It can also parallelize execution very
well, so that the data is organized so it can be read quickly, and then
it can read from many locations in parallel, and it scales
to many petabytes. So these are all
good reasons why ClickHouse has become a core engine
for real time analytics across the use cases that I mentioned.
Let's look at some of the details that are relevant for observability
and monitoring. So as I mentioned,
ClickHouse is optimized for very fast response on large data
sets. If you go in and look at its on-disk storage,
you'll quickly see that each
column is stored separately, basically as an array.
When you look at the on-disk representation,
each column has a couple of files that
implement it. Within that column you have very
highly compressed data. Putting things into an array
like this makes it easier to compress; moreover, the data
is sorted, and then compression is applied to it.
Particularly in observability cases, we can often get compression levels
of 90 or even 95% on data. The second thing
is we have replication between nodes, so we can maintain
multiple nodes and then query across them.
And the third thing is we have parallelized query,
which can run across all nodes. In fact,
you can also divide data up into shards and
run parallel query across them. This allows you to apply
the power of multiple machines if you need a fast answer. And then
within single machines, ClickHouse is extremely efficient.
It uses what's called vectorized query, where we treat these columns
basically as array values, because that's how they're stored,
and can take advantage of things like SIMD instructions,
that is, single instruction, multiple data. We also get great performance
because this kind of data aligns well with the cache structure in
modern CPUs. So for all these reasons and
more, ClickHouse tends to be extremely fast.
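To get an intuition for why sorted, columnar data compresses so well, here's a small sketch of my own (using Python's zlib, not ClickHouse's actual codecs): a simulated metrics column whose values hover in a narrow band, compressed as-is versus sorted.

```python
import random
import zlib

# Simulate a monitoring column: 100,000 CPU-idle readings that hover
# in a narrow band, as real metrics usually do.
random.seed(42)
values = [90 + random.randint(0, 5) for _ in range(100_000)]

# In a columnar layout these values sit contiguously on disk, so the
# compressor sees the repetition directly. Each value fits in one byte.
column = bytes(values)
sorted_column = bytes(sorted(values))  # sorting groups equal values into runs

ratio_unsorted = len(zlib.compress(column)) / len(column)
ratio_sorted = len(zlib.compress(sorted_column)) / len(sorted_column)

print(f"unsorted: compressed to {ratio_unsorted:.1%} of original")
print(f"sorted:   compressed to {ratio_sorted:.1%} of original")
```

The sorted column compresses dramatically better, which is exactly why the sort order you declare on a ClickHouse table matters so much for monitoring data.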
Another thing that ClickHouse does that makes it very nice for monitoring data is
it has a huge number of what we call input formats. These are
things like CSV, which is one of the most widely used
formats in all of IT, but also CSV with names, where
you have the name of each column in the first row.
We can read JSON, and we can read what's called JSONEachRow, where
each record is a separate JSON document. Protobuf,
Parquet, tab separated, you name it; there are dozens of
these. And what this means is that there's a
pretty good chance that when data is emitted from your monitoring system,
ClickHouse just knows what it is and can read it and stick it in a
table. Finally, once you get it in the table,
ClickHouse has extremely good support for time ordered data.
And that's important because monitoring data
is fundamentally time series. It is a series of measurements
on things, for example hosts, that have particular properties
and then particular measurements associated
with a point in time on that host. So there
are three date types: regular Date,
which is pretty useless for high granularity data;
DateTime, which is your typical Unix timestamp;
and DateTime64, which gives you precision down to the nanosecond.
BI tools tend to like DateTime. And then there's a whole
raft of functions that allow you to process the data: for
example, to normalize a date to the nearest hour
or the start of the year, and so on, as well as a
bunch of conversion functions to turn it into a month
and so forth. So these are all great reasons
for using ClickHouse, and they make it particularly effective for this kind of
application. Speaking of Grafana,
ClickHouse pairs really well with Grafana when you're building
observability applications. In fact, there's a
pretty good chance that many of you who are listening to this talk already
use Grafana for this purpose. Why is Grafana good?
Well, first of all, it's built around display of time series data.
It's very simple to install. It has piles of data sources.
So we will be using a data source that can read ClickHouse data, but it
can also read Prometheus, it can read MySQL,
you name it. If there's a database,
Grafana can connect to it and use it. Moreover, for displaying
the data, it has a pile of plugins. This example on the right just shows
a few of them: time series,
heat maps, tabular displays.
And they're very easy to set up and apply to the data.
One of the things that makes it particularly strong for
monitoring is that it has very good zoom in and zoom out.
So the ability to look at different timescales, to look at different series
at a particular timescale: these are all things that you'd like to
have when you're trying to drill in,
understand the data, sift through what you're seeing, and then zero
in on a problem.
Taken together, this makes it great for monitoring dashboards. And then the
final thing which makes it a good match for ClickHouse is that it's open source,
in this case AGPL 3.0. So how
do we then go about building an
actual monitoring application?
What we're going to do is start with the vmstat
commands that I showed you a few slides ago, and we're actually going to
turn that into data in a table in ClickHouse
and then display it in Grafana. So let's dig in and show how
to do that. It's really not that hard. So the first thing is
we need to generate the vmstat data.
So here is a simple Python
script, about 14 lines, that
is going to run vmstat at one-second
intervals and then basically split the
results up and stick them in a JSON document.
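The script itself is on the slide rather than in this transcript; below is a minimal sketch of my own of what such a collector might look like. The column names and the default host name are illustrative assumptions, since real vmstat output varies by platform; check your own vmstat header row.

```python
import json
import subprocess
from datetime import datetime, timezone

# Column names for a typical Linux procps vmstat; adjust to match the
# header row your vmstat actually prints (this list is an assumption).
FIELDS = ["r", "b", "swpd", "free", "buff", "cache", "si", "so",
          "bi", "bo", "in", "cs", "us", "sy", "id", "wa", "st"]

def parse_vmstat(line, host="myhost"):
    """Turn one vmstat data line into a dict of key/value pairs."""
    values = [int(v) for v in line.split()]
    doc = {"timestamp": datetime.now(timezone.utc).isoformat(), "host": host}
    doc.update(dict(zip(FIELDS, values)))
    return doc

def collect(interval=1):
    """Run vmstat forever, emitting one JSON document per sample."""
    proc = subprocess.Popen(["vmstat", str(interval)],
                            stdout=subprocess.PIPE, text=True)
    for line in proc.stdout:
        if line.lstrip()[:1].isdigit():   # skip the two header lines
            print(json.dumps(parse_vmstat(line)), flush=True)
```

Calling collect() streams one JSON document per second to stdout, ready to be piped to a consumer.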
If you look around, there are plenty of tools that will do this automatically.
I just wrote it myself because it's really simple to do.
Data collectors are just not that hard to write.
So you can read the code, and if you look at
it carefully, you can prove to yourself that it's essentially emitting JSON:
it's basically constructing a dictionary and then dumping it out
as JSON key value properties. To understand the data, it's a
little bit easier just to go look at it. So here's the output that you
get. In the key value pairs, you get a timestamp. That's really
important because that's your time ordering. And then you get a bunch of properties,
including the host and things like
that, and then actual measurements, like for example,
the idle time here, which is 98%. So this is
the data that we're going to be loading into ClickHouse.
So the next thing we need to do is design
a ClickHouse table to hold the data. ClickHouse,
unlike a database like Prometheus, for example, does require
data to be in a tabular format. But ClickHouse is
very, very tolerant of what it considers to be a table.
In this particular case, we're taking a pretty conventional approach.
So we're going to take things like the timestamp,
the day, the host, and we're going to consider those to be
dimensions. So these are the properties of the measurement,
and then what we have is the measurements themselves. So these
are just all the data that we get out of the vm
stat command. So the amount of free memory, amount of
buffer cache, the different amounts of percentage
of time, sort of ways that the CPU is using its
time, so on and so forth. One thing to notice, if you've used SQL
databases before, is down at the bottom. We have this
engine equals merge tree. So for MySQL
users, this will be familiar. This is a particular way of organizing
a table merge tree. In Clickhouse is the workhorse table for
large data or for big data, and it has
partitioning built into it. So you have to give clickhouse
a bit of a clue how you want the data broken up. In this case,
we're doing it by day. This would be appropriate if we're building a system,
for example, it holds a year of data, and then we also give ordering.
This is something in analytical databases that's critical.
You need to give a sort order to the data, and you need to do this
correctly. Here, we're sorting by the host followed
by the timestamp. This will order the data in such a way
that, among other things, the values between successive rows
will be very similar and will compress incredibly well.
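The slide's exact DDL isn't captured in this transcript, so here is a hedged sketch of what such a table might look like. The table name, the subset of columns, and the localhost URL are my illustrative assumptions; the real slide lists every vmstat measurement.

```python
import urllib.request

# Illustrative schema: dimensions first (timestamp, day, host), then a
# few of the vmstat measurements. MergeTree is the workhorse engine.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS vmstat (
    ts   DateTime,                -- measurement timestamp
    day  Date DEFAULT toDate(ts), -- partition key: break data up by day
    host String,                  -- dimension: which machine
    free UInt64,                  -- free memory
    buff UInt64,                  -- buffer cache
    us   UInt8,                   -- user CPU %
    sy   UInt8,                   -- system CPU %
    id   UInt8,                   -- idle CPU %
    wa   UInt8                    -- I/O wait %
)
ENGINE = MergeTree
PARTITION BY day
ORDER BY (host, ts)               -- sort order: successive rows compress well
"""

def run_statement(sql, url="http://localhost:8123/"):
    """POST a SQL statement to ClickHouse's HTTP interface."""
    req = urllib.request.Request(url, data=sql.encode("utf-8"))
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```

Running run_statement(CREATE_TABLE) against a local ClickHouse would create the table.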
So next step: we've got the table, we've got the
data. Let's get the data loaded into that table in ClickHouse.
The actual SQL INSERT command to do this,
if we've got the data, is really simple. This uses
the JSONEachRow format: every measurement results in a JSON document.
The top command is how we do that in SQL.
The actual command to get this done is a little bit different.
For example, we can go ahead and use curl to POST this data.
So this is an
actual command that loads a file containing this data,
and this is it. So literally two lines.
Well, you've got to construct the INSERT command with URL
encoding, but that's it. So it's very simple to get it loaded. You can of course
write a Python script; that's in fact what I did, because it's a little bit
easier to control than running it inside a shell.
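For reference, here's a hedged sketch of that loader in Python rather than curl, using only the standard library. The table name vmstat and the localhost:8123 endpoint are illustrative assumptions, not the talk's exact code.

```python
import json
import sys
import urllib.parse
import urllib.request

# URL-encode the INSERT statement into the query string, just as the
# curl version does. Table name and endpoint are illustrative.
QUERY = "INSERT INTO vmstat FORMAT JSONEachRow"
URL = "http://localhost:8123/?query=" + urllib.parse.quote(QUERY)

def load(lines, url=URL):
    """POST a batch of JSONEachRow documents to ClickHouse over HTTP."""
    req = urllib.request.Request(url, data="".join(lines).encode("utf-8"))
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")

def consume(stream=sys.stdin, batch_size=100):
    """Read JSON documents from the collector and load them in batches."""
    batch = []
    for line in stream:
        json.loads(line)          # fail fast on malformed input
        batch.append(line)
        if len(batch) >= batch_size:
            load(batch)
            batch = []
    if batch:
        load(batch)               # flush the final partial batch
```

You'd pipe the collector's stdout into a small driver that calls consume(), which batches rows rather than inserting them one at a time.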
And then the final thing is you're going to want to construct
a Grafana dashboard. In this particular
case, and we'll see this in action
in just a few minutes, I've gone
ahead and constructed a simple display that shows me
my CPU (that's the top display), then a
more detailed CPU usage graph that
actually breaks down the components
of the CPU usage, and then memory usage. The bottom two are done
by host, so there's a little selector on top.
When you're using ClickHouse, there are a couple of plugins that
you can use for Grafana. I prefer to use the
one that we maintain, which is the Altinity plugin for ClickHouse.
It's been around for years; it's had about 12 million downloads at
this point. It's incredibly popular, used across thousands and thousands of
dashboards. So that's the one that's used to construct this
display. And then finally,
once you have this all set up, not only do you have the
display, so you can go and look at this
information directly and
play around with it as we'll do in a couple of minutes,
but you also have the full power of SQL and you can ask any question
you want. You can do this interactively
off the command line and turn it into further displays. For example, this is
a query that shows all
the hosts that had greater than 25% load
for at least a minute in the last 24 hours, and it also sums the
number of minutes like that. So this is a way of seeing which hosts
are running hot. So that's the system.
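The query itself isn't reproduced in the transcript; a hedged sketch of how such a "hot hosts" query might be written follows. I'm assuming a table named vmstat with a ts timestamp, a host dimension, and an id (CPU idle) column, and approximating "load above 25%" as average idle below 75% within a minute.

```python
# Illustrative "hot hosts" query: for each host, count the minutes in the
# last 24 hours where average CPU idle dropped below 75% (load above 25%).
# Table and column names (vmstat, ts, host, id) are assumptions.
HOT_HOSTS_QUERY = """
SELECT host, count() AS busy_minutes
FROM (
    SELECT host, toStartOfMinute(ts) AS minute, avg(id) AS avg_idle
    FROM vmstat
    WHERE ts >= now() - INTERVAL 24 HOUR
    GROUP BY host, minute
)
WHERE avg_idle < 75
GROUP BY host
ORDER BY busy_minutes DESC
"""
```

Note the use of toStartOfMinute, one of the time-normalization functions mentioned earlier; you could run this through clickhouse-client or the HTTP interface.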
It's really not very complicated. This is in total
about 100 lines of code, which was sufficient
to get all of this. So let's bounce out of
the slides and go have a look at the system at
ground level. So here we go.
This is the system that we saw a few minutes ago.
And you can see the way that this is set up.
It's monitoring a couple of hosts that I run in my home
office. They're called logos two and (I had
a temporary glitch in the audio there) logos three. So there are two hosts,
logos two and logos three, and I can select particular hosts.
You can see how this selector allows me to change them.
I can look at all of them, in which case I don't get
host-specific CPU and memory usage.
Let's go pick a specific one. We'll go pick logos two.
And I can also change the timescale. This is super easy
to do. So here we go. We can switch this to the last 30 minutes.
We can go see what's going on here. Let's go to last hour and
see if there's anything interesting. You can see there's been some activity on the
system, on this logos two.
And in fact you can distinguish the different
traces. So right here, without really doing anything special, you have a
lot of insight into the load levels on these systems and you
can basically drill in to get
much closer views. This is something I love about Grafana that I can actually
come in and I can just select a very small section
and then the display automatically zeroes in
on the part that I want to look at. Let's go ahead and get this
back to doing the last five minutes and
let's put some load on the system. Let's test this thing out.
So for that we have a couple of handy
commands. A great command to bash on the
CPU is sysbench. This
is something you can install with sudo apt install sysbench.
What I'm going to run is a CPU test. This is just going to beat
up on the CPUs, and we're going to let this run for a minute or
two, and we will basically
be able to see
this beating up on the system. Actually,
I made a slight mistake there. Let me run it
on the logos two host because that is
actually a more capable system. So we're going to go ahead and run the
CPU test. While that's running and the data is collecting, let me just
show you that all the code we're using here is actually available
in a project called clickhouse-sql-examples. Let's go to
the open source monitoring directory. So for example, the dashboard
that I'm showing you is there, as well as the little Python routines
we have here; these load
the data into ClickHouse. And then the
Python script which I showed you that actually generated
the data is all there. So if you want to go ahead and do this,
and as I say, it's about 100 lines of code total to
get this whole thing to run. And actually running it is very simple.
The collectors are as simple as
the following: just run the
collector, pipe it into the consumer, and up it goes to ClickHouse.
Great. So that test has been running in the background on logos two.
So let's see what we have. And actually,
we can see
the test going on and we can
see the effect on the CPU. We ran
it first on logos three, right there,
and then ran it here on logos
two. So we can see the CPU.
Now, what's a little bit more interesting is to bash on this a bit and
actually do some work to show the effect on the memory.
That's a little bit more fun. Let's run another program.
Let's kill the CPU test, and we're going to run a program called
stress. So here it is.
This is a program that can
beat up on your system, but it can also use a lot of memory.
This is basically spawning four threads, and they're each going to eat about four
gigabytes. And you have to love any performance
test program that calls its workers hogs. So off
they go. And we'll actually see these coming up in
the display. Let's go ahead and change the time window. Okay,
so we can see these actually starting to use resources.
There's the memory coming up. This is not actually putting enough
load on the system. Let's beat it up a little bit more. Let's go ahead
and add eight threads.
So go ahead and put that in and give
it a minute or two. And what we'll see
now is this will put very heavy load on the system.
So we'll start to see in this memory usage, we will start to see
this climbing very rapidly. Colors here are probably not the best,
but here we can see that it's actually putting heavy cpu
load on this system. We can also see that up here. It's basically pegged
at 100%. So this machine is just getting hammered.
In fact, what's happening right here is kind of interesting.
Let me get that back
to five minutes; it's zooming in too quickly. We've actually got gaps in collection.
And what that indicates is the machine is so loaded that the collector is
not even generating data. So that's the demo.
This is something that I put together. I'm having all kinds of fun with
this at the cost of about 100 lines of code. Of course,
if you want to productize it, you're going to end up also storing
the system configuration and managing Grafana and ClickHouse.
But the point is, you can build this system yourself
and basically monitor anything you want and collect practically
any kind of data you want. Okay,
so let's go back to the slides and dig
in further to some final notes.
So if you're going to build a monitoring system, you can use
Python, as I did, but you don't have to.
One of the things that's great about ClickHouse is that it's a very popular project,
certainly among the most popular analytic databases
across all of GitHub. It has a huge number of libraries
and software packages that work with it, everything from Kafka to
Airflow. And then for display, as we saw: Grafana,
Superset, Cube.js, plus a bunch of different client libraries.
We do a lot of work with Golang, but if you
like Java, if you like Python, the drivers are all there.
And then if you want to run it on Kubernetes, there's the Altinity operator
for ClickHouse, which I mentioned at the start of this talk. This allows you
to run ClickHouse very efficiently on
Kubernetes, which is turning out to be a really great
place to run data. But of course you can run it on anything you
want; ClickHouse runs great anywhere. You can run it on VMs, of course,
and use Ansible to manage it, and so on and so
forth. So where can you find out more?
Well, there are the official docs for both the ClickHouse project as well
as Grafana; these are shown here.
In the course of our work with ClickHouse,
as well as other products like Grafana, we write blog articles,
and we do a huge number of talks
that we post on YouTube concerning topics related
to running ClickHouse, as well as integration with other tools.
We have a knowledge base that you can use to learn more about how to
solve specific problems, particularly if you're operating at scale. And then there's just a
pile of other open source associated with ClickHouse. There's a
very large community around this.
We get thousands of contributions per year,
ranging from simple comments on issues
all the way up to things like PRs. Last year, about 392
unique people on GitHub
submitted PRs that were actually merged into ClickHouse.
So that is my talk. Thank you very much
and go out there and have fun. I'd like to thank the Conf42
folks once again for setting this conference up.
It's great to present here, and if you want to contact me, I'll be
hanging out on Discord as part of the conference.
But you can also get to me at Altinity.
You can just go to the website, do contact us. We have a slack channel
that you can join and you can just join that channel
and dm me, or you can find me on LinkedIn. And once again,
Altinity: we do Altinity.Cloud, which is a cloud for
ClickHouse, we do stable builds of ClickHouse, and the Altinity Kubernetes
operator. Those are just a few of the many things that we do to help
people operate Clickhouse at scale and build applications
like the one I just showed you. So thanks again. Have a great
day.