Transcript
This transcript was autogenerated. To make changes, submit a PR.
Ability. No surprise there. But this is from a very
basic standpoint. So if you have no
clue what observability is, if you're coming from a non-production-codebase
background, then this is something you might enjoy
and have a lot to learn from.
So my entire agenda is in the form of questions we'll be
answering today. So we'll start with a quick hello and get to know me.
Then we'll see what observability is for folks outside engineering. When I say outside
engineering, I mean anyone who has zero
experience with a production codebase or has not worked with
DevOps. Then there's observability for engineers, basically people who have a little bit
of an idea of DevOps, who have worked a little bit on a production
codebase but haven't really worked on testing and incident
response. Then we have a quick overview
of how we make a system observable. Then we
talk about a streaming database, also one of
my favorite databases that I've come across recently. And then we
get into the block diagram of the project that I created to understand observability
better. All right, folks, let's start with the answers.
So, hi folks, I'm Ritzvi. I am a maker.
I love learning new things and then talking about them.
I'm currently freelancing from a DevRel standpoint
and a firmware engineer standpoint.
I'm a coach at Major League Hacking and I love
that job because I get to do all the things I love, from mentoring students
to teaching to coming up with quirky project ideas
all at once. So, as you all might notice,
the theme for this presentation is quite
unusual. But then it is summer and I decided
that since we are talking about observation and correlation,
I might as well come up with a nice fruity, summery,
recipe-based presentation theme to kind of
go with our topics.
So let's get started with what is observability?
Right, so before we go on here, if you are actually very new
to this world, to this word,
I want you all to take a second, pause the video right here
for 15 seconds, and just think of the first thing that comes to your head
when you think of observability. This is something called the intuition
exercise that I do for most of the topics that I especially
take virtual sessions on because it's hard for me to gauge whether or
not you're understanding where you're coming from. So if you do this yourself,
it's much easier for you to kind of untangle threads later when a
lot more terms are thrown at you, right? So let's
take just 10 seconds right here. And if
you all are watching this live and it's going on, type in the first
word that comes to your head when you hear the word observability.
All right, I'm going to move on. So I hope you typed in
the first word that came to your head. Trust me, this will really help you
relate and correlate with everything that I talk about today
much more easily and in a more holistic way.
And then, it is about observation. So what better
way to get started? All right, so talking about observability
for everyone, well, let's start with one thing that would
not be very intuitive when it comes to observability,
right? We would never consider observation a measurable quantity.
But that's the first thing that we know about observability, that it is a
super measurable quantity. And if it doesn't make sense,
then remember we're trying to make a system so measurable that
it becomes easy to observe. This is the best I can put
it. So it makes more sense in terms of measuring
observability. It is more than just a status on
those startup websites. You go down and there's a green all
okay, all systems working, status okay buttons.
Observability is way more than that. There is a system that's keeping
in check and reporting a status, but that's not what observability is about,
right? It is a very layered process. We'll learn about how
there are lots of points
that go into making a piece of software or
an application observable. A great example
for imagining observability is fast
fashion. Now, the comparison might seem absurd,
but I promise it will make sense when I explain it. So I
don't know how many of you all know how fast fashion works,
but the basic working of fast fashion is that they're
keeping a tab on a lot of things. For example, the trends,
the current fashion shows, the current hypes,
not just these, but also social media trends
to finally push out their new products. Then they also
keep a tab on their customers, what clothes are
they viewing, regardless of the trend. And then there's a lot of data
correlation that's happening. That's a great example for observability.
They're not just keeping a tab on a particular cycle.
They're not just keeping a tab on a status
per se when it comes to a product. They are constantly,
constantly observing customer
versus trend versus social media versus influencers,
versus a number of factors. For example, the lead time to
get a piece of clothing stitched, et cetera. The colors available.
Whether a celebrity picked up on some trend and is following
it. There are so many layers that fast fashion
is following, and then it's pushing out clothes with
almost three weeks of lead time.
And while that's not very environmentally friendly, it's a
great example of how observability really works. If you read more
about this, you'll realize how this analogy makes
perfect sense. And with those beautiful raspberries in mind,
let's move on to the origin story of observability.
So why has observability become such a trendy word in
engineering? Why is everyone talking about observability?
Why are there so many products overnight doing so
well, helping you all achieve observability in your
system? So let's go back to the origin
of the word observability. So the term observability was coined by
engineer Rudolf E. Kálmán in 1960,
right? It has grown to mean many different things in different communities,
but it has its own meaning and definition
in modern software systems, right? In the 1960 paper,
he introduced a characterization he called observability
to describe mathematical control systems. So, in control
theory, we usually have a measure of how well the
internal states of a system
can be inferred from knowledge of its
external outputs. And this measure
is called observability.
So when you study observability and
controllability, they're actually mathematical duals.
So these come along with sensors, linear algebra
equations, and tons of formal methods.
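For the linear algebra side of that definition, here is a minimal sketch of the standard Kalman rank test from control theory: a linear system with state matrix A and output matrix C is observable when the stacked matrix [C; CA; ...; CA^(n-1)] has rank n. The helper names (`mat_mul`, `rank`, `is_observable`) and the two example systems are made up for illustration and are not from the talk.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def rank(rows):
    """Rank via Gaussian elimination, using exact fractions to avoid rounding."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def is_observable(A, C):
    """Kalman rank test: stack C, CA, ..., CA^(n-1) and check for full rank n."""
    n = len(A)
    stacked, current = [], C
    for _ in range(n):
        stacked.extend(current)
        current = mat_mul(current, A)
    return rank(stacked) == n


# Hypothetical system: we only measure the first state, but the second state
# feeds into the first, so both can be inferred from the output.
coupled = is_observable([[0, 1], [-2, -3]], [[1, 0]])

# Hypothetical system: the two states never interact and we only measure
# the first one, so the second state is invisible from the outside.
decoupled = is_observable([[1, 0], [0, 2]], [[1, 0]])
```

In this sketch `coupled` comes out observable and `decoupled` does not, which is the control-theory version of "can I infer the internal state from the outputs alone?"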
So the traditional definition of observability is
the realm of mechanical engineers and people who have a
heart for physical systems with a specific end state in
mind. If you're
interested in knowing more, there are some really nice blogs
available on comparing observability in mechanical systems versus
modern software systems. But yeah,
that's the origin
of observability as a word. But let's look
into adapting the concept of observability to software,
right? So when we're adapting this definition to
software, we must also layer additional considerations
that are specific to the software engineering domain.
For example, for software to have observability,
the following needs to hold for you to even consider
having observability. So you need to be able to understand
the inner workings of your application. You need to
understand any state your application may have gotten itself into,
even when you know that it's reaching states that
you can't predict. You need to understand the system
state solely by observing and interrogating it with
external tools. And you should be able to understand the
internal state without shipping any new custom code to handle it because,
again, basically reiterating the above point,
you know it's reaching places you can't predict.
Right. All right, so basically,
for a system to be observable,
it's the same thing over and over again that we reiterate: that it
reaches states you can't predict
is what we're trying to establish for a system to be observable.
Now why do we want a system to be observable? Why not just
observe it anyhow? Well, for starters,
you don't always need observability if
it is, for example, a monolithic application
or an application that is not meant to support
a cloud of users. We'll get into that in a
little more detail in a bit. All right,
so let's talk about why we need observability,
and let's talk about a couple of
other questions that come into picture when we say we want to make
a system observable. All right, so let's
start by asking the question: why weren't our existing
methods enough, right? Why did people have to go out of
their way and come up with a completely new,
very layered process to
monitor, basically? Right? Well, first, it is
not equivalent to monitoring. Observability is way
more than just monitoring. It's about the software
reaching unpredictable places, but you being able to be
ready with an incident response before it reaches that unpredictable place
by constantly keeping
a tab on different factors that can even remotely
affect your system.
And also making sure that when you are keeping
a tab on these things, you're constantly also using
different tools to correlate all the data that you're collecting,
and making sure that you know what is happening, when it's happening, and because of what
it is happening. Right. And I know it sounds like, okay,
so we are basically monitoring, but it is, I promise
you, way more than that. Let's go back
to the definition of observability,
but this time in a slightly more technical manner.
All right, so these are the terms that I'll be using a lot, right,
logs, errors, traces, metrics, cardinality and dimensions.
Now, these are very commonly used terms in the world
of observability, so it's nice if you all get acquainted with them
individually. All right, let's start
looking into sentences that we can form using these words.
So let's start with the three pillars of observability. All right,
your first three words on the left, that is logs, errors and traces,
are your three pillars of observability. Why? How? We'll look
into that in a bit. Okay: "a single number
with tags optionally appended for grouping and searching those numbers."
This is basically what we call a metric, right? So monitoring
depends a lot on metrics. A metric is a single number
with tags optionally appended for grouping and searching those numbers. So basically
what we're saying is that a metric already assumes
that we only want correlation in this particular way and
then gets numbers. Right. Next,
let's go on to the next sentence: "Every application has an inherent amount
of irreducible complexity. The only question is who
will have to deal with it: the user, the application developer,
or the platform developer?" Well, this is the question of why we need observability,
right? So who is dealing with the problem of this
complexity that is scaling with every new application that comes
into the market? That is where
observability steps in. So when you have an application with
irreducible complexity, you bring in observability,
which has these three main pillars, which are logs, errors and traces.
And then we use our two other terms, which are cardinality and dimensions.
So "comparing high dimensionality and high cardinality data
becomes a critical component of being able to discover otherwise
hidden issues buried in complex system architectures."
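To make the earlier definition of a metric concrete ("a single number with tags optionally appended for grouping and searching"), here is a small sketch. The `Metric` class, the `total_by_tag` helper and the sample values are all hypothetical and only meant to show the shape of the data, not any particular monitoring tool.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Metric:
    """A single number with optional tags, as described in the talk."""
    name: str
    value: float
    tags: dict = field(default_factory=dict)


def total_by_tag(metrics, tag_key):
    """Group metric values by one tag and sum them."""
    totals = defaultdict(float)
    for metric in metrics:
        totals[metric.tags.get(tag_key, "untagged")] += metric.value
    return dict(totals)


# Made-up sample data: request counts tagged by region and status.
metrics = [
    Metric("http_requests", 3, {"region": "eu", "status": "200"}),
    Metric("http_requests", 5, {"region": "us", "status": "200"}),
    Metric("http_requests", 1, {"region": "eu", "status": "500"}),
]

by_region = total_by_tag(metrics, "region")
by_status = total_by_tag(metrics, "status")
```

Notice that the only questions you can ask are the ones the tags were chosen for up front, which is the point about a metric already assuming how you want to correlate.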
All right, so let's talk a little more about what we mean by
cardinality and what we mean by dimensions.
Well, cardinality here refers specifically to data
which is very rare. So what
do we mean by very rare? It means it is very low-occurring data.
So let's say you have a
log with 100 events happening, and there's one
piece of data that shows up once in every thousand
events, right? That is your cardinal data, which basically means
it's a highly rare
data type or rarely occurring data, right? Then there's what's called
dimensions. Dimensions are basically
key-value pairs, basically a data set
or data pair, which you define
based on the permutations you create with your cardinal data.
So how does this help? Let's look at the sentence again,
right? Comparing high dimensionality and high cardinality
data becomes a critical component. So high cardinality automatically
reduces your entire available data to a
very small amount of data, because you know that
if it's high cardinality data, that means it's already occurring very rarely,
right? Then come dimensions. So dimension is
basically how many permutations you can make with your available data.
And using a high cardinality key gives you fewer
dimensions, but it'll give you very unique dimensions.
And creating these permutations and
comparing them becomes a critical component in being
able to discover otherwise hidden issues. So you're
basically taking a lot of data, measuring it,
correlating it in a way that only rare events
are being correlated with each other. And then you're trying
to see how and when the system will go
into an unpredictable state, predicting it beforehand, and then
being able to keep a tab on it. Now,
this might seem obvious after listening to the explanation,
but it took me quite a few blog posts and videos to
understand this basic concept. Right,
the word seems fairly easy. Like, okay, we're trying to observe something
that might not happen or
might not happen because we have predicted it.
But this basic concept helps you apply
it in your own applications and make your system observable
without making it highly complicated,
even though I understand that you do not want to apply it to
applications that aren't highly complicated. But this entire exercise is to
understand the world of observability better.
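Here is a small sketch of cardinality and dimensions on a handful of made-up events. One assumption to flag: in common observability usage, cardinality is usually counted as the number of distinct values a field holds, so a high-cardinality field such as a user ID has many values that each show up rarely, which lines up with the "rare data" idea above. The event fields and helper names are hypothetical.

```python
from collections import Counter

# Made-up events; every key in an event is one dimension.
events = [
    {"user_id": "u1", "endpoint": "/home", "status": 200},
    {"user_id": "u2", "endpoint": "/home", "status": 200},
    {"user_id": "u3", "endpoint": "/cart", "status": 200},
    {"user_id": "u4", "endpoint": "/cart", "status": 500},
    {"user_id": "u1", "endpoint": "/home", "status": 200},
]


def dimensions(events):
    """All keys that appear in any event."""
    return sorted({key for event in events for key in event})


def cardinality(events, key):
    """Number of distinct values seen for one key."""
    return len({event[key] for event in events if key in event})


def rare_values(events, key, max_count=1):
    """Values of a key that occur at most max_count times."""
    counts = Counter(event[key] for event in events if key in event)
    return sorted(value for value, count in counts.items() if count <= max_count)


# Correlate: take the rare status and look at the other dimensions around it.
rare_statuses = rare_values(events, "status")
suspects = [event for event in events if event["status"] in rare_statuses]
```

With this data, `user_id` has the highest cardinality, the rare status is the single 500, and the one suspect event tells you which user and endpoint were involved. That is the correlation step in miniature.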
Okay, folks, now that we understand basically what observability is,
let's move on to some simple facts,
and then we will look into some
tools for observability. All right, this is another great analogy
that I love giving: students monitor, but teachers
observe. So remember in school your
teacher used to allocate someone as a class monitor,
and they would never be the teacher, right? They're just monitoring for a certain
period, and they have their biases, et cetera. But the teacher,
even when they weren't actively monitoring or
trying to keep the class in control,
they were constantly observing. So that's something similar
to the difference between monitoring and observing.
All right, so let's look into some facts. So the more highly
dimensional your data is, the more likely you will be able to find
hidden or elusive patterns in application behavior. In modern systems,
where the permutations of failures that can occur are practically
limitless, capturing only a few basic dimensions in your telemetry
data is insufficient, obviously, right? Because the whole
point of us bringing in observability was that we weren't able to predict
the states that the system was reaching in
time for us to have incident response
in a way that is helpful to the application.
Right? So how do you go about it? What is the
next plan of action for you
to make sure that your application is constantly observed
in the right way? Well, we want to gather incredibly rich
detail about everything that's happening, right,
especially at the intersection of users, code and your
systems. Right? High dimensionality data provides greater context about
how those intersections unfold. Remember, because our high
dimensional data is also something that is coming from cardinality.
Right. And when you're doing that, you're basically making these
deep but fairly unique assumptions.
And not just assumptions, but predictions that
you are ready to handle way before they can happen.
So how do you make a system observable? And then, obviously, having made
it observable, actually observe it?
Okay, so for making a system observable, there are four steps.
First is instrumentation, which is basically collecting
all the important data that you have, right? Then there is
data correlation, again, basically creating
dimensions and using cardinal data
to create some very unique data correlations.
And then there's obviously incident response, which is basically
being ready for an unpredictable state which has been predicted
by your observability system. And then the new
thing is AIOps, which is basically using a little bit of
AI to make your observable system way smarter.
We will talk a little more about instrumentation.
So why do we want instrumentation? Well,
you can't always have the application send certain
data, especially because you know that you have a lot of
resource consumption while the application is running. So you use
specific tools to specially instrument
all the data, collect it, and then send it to your favorite place
for data correlation. Then you do the data correlation
there. Your incident response happens based on
what you're trying to predict and what you're not trying to predict, based on your application
type, if it's front end, if it's back end,
et cetera, et cetera. And then there is AIOps,
right, which I haven't looked a lot into, but there are some
pretty cool applications out there helping you do that.
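The first three steps above can be sketched in a few lines. This is a toy, in-process version under stated assumptions: the `instrument` decorator, the `EVENTS` list, the 0.5 threshold and the `divide` function are all invented for illustration, and a real setup would ship events to an external tool instead of a Python list. AIOps is left out.

```python
import functools
from collections import Counter

EVENTS = []  # stand-in for wherever instrumented data gets sent


def instrument(func):
    """Step 1, instrumentation: record one event per call, success or failure."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        event = {"name": func.__name__, "ok": True}
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            event["ok"] = False
            event["error"] = type(exc).__name__
            raise
        finally:
            EVENTS.append(event)
    return wrapper


def correlate(events):
    """Step 2, data correlation: error rate per instrumented function."""
    totals, failures = Counter(), Counter()
    for event in events:
        totals[event["name"]] += 1
        if not event["ok"]:
            failures[event["name"]] += 1
    return {name: failures[name] / totals[name] for name in totals}


def respond(error_rates, threshold=0.5):
    """Step 3, incident response: names whose error rate crosses the threshold."""
    return sorted(name for name, rate in error_rates.items() if rate >= threshold)


@instrument
def divide(a, b):
    return a / b


divide(4, 2)
try:
    divide(1, 0)
except ZeroDivisionError:
    pass  # the failure is still recorded by the decorator

error_rates = correlate(EVENTS)
alerts = respond(error_rates)
```

Here one of the two calls fails, so the error rate for `divide` is 0.5 and it lands in `alerts`. The decorator keeps the collection logic out of the application code itself, which is the reason given above for using separate instrumentation tools.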
All right, so we spoke about there
being specific tools to help you all send your data
to the place of your choice to correlate data.
Well, my favorite place has been a streaming database to
correlate data and to create views that help me do incident
response rather efficiently.
So the best way to correlate a
lot of data and to be able to
take out your correlated data in a separate location
altogether is a streaming database. Right. Especially right now
when I am trying to understand and basically get a hold of the fundamentals,
it makes more sense to use a streaming database, which
basically works as follows,
right. I love open source tech, and I recently came across
this really cool streaming database which lets
me run it locally and on the cloud and lets
me create something called materialized views. We'll look into that in
just a second. But this is basically how a
streaming database works. So RisingWave is the name of the streaming
database that I have been playing with like crazy,
and I love being able to create materialized views with
it. It basically has event streams,
it ingests them, then it creates something
called a materialized view, and then it either triggers
actions or emits an event stream again, based on your
application. So before we dive
into the block diagram of what I created,
let's look a little bit more into how RisingWave
works. So these are the
RisingWave docs, and as you can
tell, I absolutely love their docs as well.
They're very efficiently
organized in terms of what I need, and I especially
love the tiny GIFs that they've created all over the place.
But basically this is the key part of what we are trying
to apply.
So RisingWave can run both locally
and on their cloud, which they've recently released in a beta trial
format. I usually prefer the local version.
So how does RisingWave support real time analytics?
Well, RisingWave supports real time analytics by ingesting and transforming
data in real time with a variety of data sources, and then
it maintains fresh results installed.
So what it's basically doing is it's creating
a sort of a filter, right? So when you're sending it tons of
data, you're also creating a filter. You're saying,
okay, that's okay, I have a lot
of data, but now in this new table, I want you to keep this
correlated data for me, right? And because
it's a streaming database, it's easier to kind of play with
its relational part of the
entire query. There are some really
cool examples here as well, which they have done.
Let's have a look at this. The clickstream analysis is something
I'd compare to front
end observability. So this is a great example
to deploy if you all ever want to get started with
streaming databases, or with databases that help
you create a nice filtered
view of your data. Again, the streaming database
has a very wide array of uses.
It is not specifically for
observability, but it does do that task very well.
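The "filter" idea behind a materialized view can be sketched in plain Python. To be clear, this is not RisingWave's API: in RisingWave you would define the view in SQL with `CREATE MATERIALIZED VIEW`, and the database keeps it fresh for you. The `MaterializedView` class, the event fields and the clickstream data below are hypothetical and only show the concept of a result that is updated as each event arrives instead of being recomputed from scratch.

```python
class MaterializedView:
    """Toy incrementally maintained view: a filter plus a count per group."""

    def __init__(self, where, group_by):
        self.where = where        # which events the view cares about
        self.group_by = group_by  # how matching events are grouped
        self.rows = {}            # the always-fresh result

    def ingest(self, event):
        """Update the stored result for one incoming event."""
        if not self.where(event):
            return
        key = self.group_by(event)
        self.rows[key] = self.rows.get(key, 0) + 1


# Hypothetical view: number of clicks per page, ignoring every other event type.
clicks_per_page = MaterializedView(
    where=lambda event: event["type"] == "click",
    group_by=lambda event: event["page"],
)

# Made-up clickstream.
stream = [
    {"type": "click", "page": "/home"},
    {"type": "scroll", "page": "/home"},
    {"type": "click", "page": "/cart"},
    {"type": "click", "page": "/home"},
]
for event in stream:
    clicks_per_page.ingest(event)
```

After the four events, `clicks_per_page.rows` holds two clicks for `/home` and one for `/cart`, and reading it costs nothing extra because the work was done at ingest time.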
Now you can see that the documentation also has a
lot of comparisons with its
inspirations, rather, what it builds
upon. Again, it's completely open source and you all can also go ahead
and contribute to the documentation, and they
have an active Slack as well, if you all want help from the community
in real time. They have a very detailed
list of sources you all can connect it with and
a lot of things that you all can connect it with as well. You all
can also go ahead and create your own visualizations depending
on what you want to visualize it with, which is
also what makes it my favorite tool for basically
any observability test that I've been conducting in the past few
weeks. So what did I do to
make my streaming database an
observable tool or an observability tool?
This is the block diagram that I followed. Now to actually demo the entire thing
would be another 40-minute session altogether. So I'll just take you
through the block diagram and then maybe I'll just write a blog post in a
bit to show how I did it and show you
how it works. Right? First I have a simple algorithm generating
art, which is basically a tiny spiral. I used
a React front end. And I'll also show you one of my favorite tutorials,
which got me started on observing the front end, right?
Then you're instrumenting some or any data,
right? You want to instrument. Remember, you're not trying to
make your system observable, you're just trying to learn how observability
works, right? So you want to instrument some form of data, any data
works, right? And then you want to send and analyze the data.
So again, this is where the streaming database comes in. So you have,
first, the algorithm over here. You have an
instrument that is basically sending data to your
database. The database that I used was again RisingWave,
the streaming database. It is creating materialized views.
So like we said, a materialized view is basically the filter, which is basically
what observability does, right? It creates correlation
and then gives you states that cannot be predicted.
And from here, ideally you want to send it to
an observability stack again, which is basically something that helps you
do incident response. But again, this is us just trying to understand observability.
So for incident response, what we have done is basically we've just added
another parameter in the generative art
and fed the materialized, or the new correlated, data back into the art.
And just for
the sake of understanding how correlation helps create a lot of
difference, you generate the new parameters separately as well.
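The feedback loop in that block diagram can be sketched roughly like this. The project in the talk used a React front end and RisingWave; this Python version only mirrors the flow (clicks in, one correlated number out, that number fed back into the spiral), and every name, field and constant in it is an assumption made for illustration.

```python
import math


def spiral_points(n_points, growth):
    """Generate (x, y) points of a simple spiral; growth controls how fast it widens."""
    points = []
    for i in range(n_points):
        t = i * 0.1
        points.append((growth * t * math.cos(t), growth * t * math.sin(t)))
    return points


def growth_from_clicks(click_count, base=1.0, step=0.5):
    """Turn the correlated value (here just a click count) into an art parameter."""
    return base + step * click_count


# Made-up instrumented data: clicks on the screen.
clicks = [{"x": 10, "y": 20}, {"x": 15, "y": 40}, {"x": 300, "y": 8}]

# Stand-in for the materialized view: a single derived number.
click_count = len(clicks)

# Feed the correlated data back into the generative art.
growth = growth_from_clicks(click_count)
observed_spiral = spiral_points(5, growth)

# Generate the unmodified spiral separately to compare the two.
baseline_spiral = spiral_points(5, growth_from_clicks(0))
```

Comparing `observed_spiral` with `baseline_spiral` plays the role of the separately generated parameters mentioned above: it makes the effect of the correlated data visible.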
Now over here, for the "instrumenting
some or any data" part, you can keep making it as complicated as
you like, right? So for example, I started off
with just observing a click on the screen,
and then I started feeding the entire parameter into the database
and so on and so forth. You can just keep adding onto it and
creating new layers of data and sending them
to the database and then making it
more complicated to understand how it actually
makes more sense. It starts making more sense when
you start making it more complicated. All right folks,
that's it from me. I hope you all had a lot of fun
learning about observability.
And yeah, if you all want to reach out to me, you all
can use any of these contacts
to do so. I'm again learning a lot about observability
on the go, so if you all want to come learn with me, hit me
up, and if you all have any feedback regarding the talk as well,
do let me know. A huge thanks and a shout out to Conf42
for accepting this talk. This is a very, very newbie-oriented
talk. I had a lot of fun learning about it and presenting
it in such a way, and I hope you all had
fun as well. If you all did the intuition exercise, now is a good time
to kind of correlate and see how much of what you intuitively
understood made sense. And once you do, it's
going to be very difficult for you all to not remember it
in a way that you just correlate everything that comes your way with observability.
All right folks, I shall see you hopefully at another
conference soon.