Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello everyone. Thank you for joining me today
to observe this talk. I'm delighted to be here at
Conf42 Observability 2023.
Well, serverless is a godsend,
but in the cloud world failures are always
around the corner. Unfortunately, we cannot prevent
all failures, but we can try to detect them as soon
as possible and respond quickly. This is where
observability comes in and observability is your
friend from day one. In this talk
I will give an overview of how to set up dashboards
and alarms in AWS CloudWatch for
serverless applications. I will also introduce our
open source observability plugin, SLIC Watch,
which automates away the pain of setting
up dashboards and alarms in CloudWatch
and enables better levels of observability.
I'm Duran from fourTheorem.
I have over one year and under two
years of working experience in the cloud.
I actually have one year and eight months experience.
Maybe I should have said that exact number, but I
couldn't because as we all may agree,
we women like talking, right? But unfortunately
we cannot see this happening in technical events.
I'm here and I really want to encourage other females
to join me. Well,
I'm enthusiastic about serverless observability
and chaos engineering. I recently
maintained our open source observability plugin,
SLIC Watch, which I will be talking about.
Please feel free to get in touch with me and ask any questions
that you have. See my Twitter, LinkedIn and
GitHub accounts. Here is a little bit
about fourTheorem. fourTheorem is an AWS
consulting partner doing a lot of work on AWS.
If you want to learn more about fourTheorem, please visit the website,
but I wanted to highlight some of the work we do
for our clients. These are cloud migrations,
training and cloud enablement,
building high performance serverless applications and cutting
cloud costs. Our architects write books
and we maintain a bunch of open source projects.
These are Middy, the Lambda middleware
framework; SLIC Starter, a complete
starter project for serverless applications on AWS;
and SLIC Watch, the observability plugin
which I will be talking about. We are also running
a weekly podcast on AWS topics.
I highly recommend you check the website, which
is awsbites.com.
Let's start with the definition of observability.
Observability is the ability to
measure the internal state of a system
by examining its outputs.
There are three major pillars of observability.
These are logs, metrics and traces. A
log is an immutable, timestamped
record of discrete events that happened over time.
Metrics are the analytics around
all the data you are gathering, so you can see:
do I have a problem? How full is my disk drive?
How much CPU am I using? A trace
represents the end-to-end journey of a request through
a distributed system, so you can see where the problem is.
In AWS, the main native tool for
observability is CloudWatch. And what does CloudWatch
give you? Logs with Insights, metrics,
dashboards, alarms, canaries and distributed
tracing with X-Ray. Let's talk a
little bit more about CloudWatch and what CloudWatch gives you
out of the box. The first thing we need to understand is that
it's a toolkit that allows you to build observability
solutions yourself.
Metrics are automatically generated for all services,
which is great. There are lots of dashboards,
but oriented by service, not by application, and
zero alarms out of the box, which actually makes
sense because the alarming you really want to do is
based on understanding your application context,
like at what level you want to alert. CloudWatch
is already a super powerful tool,
but setting up dashboards and alarms in CloudWatch is
a bit of a pain and requires some work. You
need to research and understand the available metrics.
You need to decide your thresholds, at what level you want to alert.
You need to write infrastructure as code for
your application dashboards as well as for
your service metric alarms. And every time
you add a new service to your application, or even
a new Lambda function, you need to update your dashboards
and alarms, and you need to make sure that you do
that for each stack in your application.
To be honest, that's a lot of work, but it
is really worth it to have the observability.
I highly recommend you treat your observability
as you treat your unit tests: maintain it,
care for it, feed it, and it will definitely pay you back.
Thankfully, there are great places where we can research and
understand available metrics. The AWS
Well-Architected Framework is very well written,
and the Operational Excellence pillar is the one that covers
observability. If you are doing serverless
work, definitely look at the Serverless Lens,
which applies these six pillars. It provides
very good guidance on metrics,
but you still need to pick your thresholds.
Okay, but which metrics are really essential?
And here is where the golden signals come on
stage. These are the four key metrics.
The first metric you want to look for is called latency.
It is the amount of time it takes to
service a request. The next metric is traffic.
It is the volume of requests that a
system is currently handling. Then errors:
the rate of requests that fail. And
finally, saturation is the percentage
of available resources being
consumed.
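As a rough illustration of the four definitions above, the following Python sketch computes each signal from a small batch of request records. The `Request` type, the `golden_signals` function, the choice of a 95th percentile for latency, and the `in_use`/`capacity` inputs are all assumptions made for this example; they are not taken from the talk or from any AWS API.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    duration_ms: float  # time taken to service the request
    failed: bool        # True if the request ended in an error


def golden_signals(requests, window_seconds, in_use, capacity):
    """Summarise latency, traffic, errors and saturation for one time window."""
    durations = sorted(r.duration_ms for r in requests)
    # Nearest-rank 95th percentile (an arbitrary choice for this sketch).
    rank = max(1, math.ceil(0.95 * len(durations)))
    p95 = durations[rank - 1] if durations else 0.0
    failures = sum(1 for r in requests if r.failed)
    return {
        "latency_p95_ms": p95,
        "traffic_rps": len(requests) / window_seconds,
        "error_rate": failures / len(requests) if requests else 0.0,
        "saturation": in_use / capacity,
    }


sample = [
    Request(100, False),
    Request(200, False),
    Request(300, True),
    Request(400, False),
]
print(golden_signals(sample, window_seconds=2, in_use=5, capacity=10))
```

In CloudWatch the equivalent numbers come from service metrics rather than from hand-rolled code; the sketch only shows what each signal measures.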
So this is what infrastructure as code looks like.
And this is an alarm for the Lambda Throttles
metric. So there is some work
you need to do if you are going to write those for
every single lambda function in your application.
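The slide itself is not part of the transcript, so here is a minimal sketch of what such an alarm might look like, written as a Python function that builds an `AWS::CloudWatch::Alarm` resource as a dictionary. The function name `throttles_alarm`, the logical ID `HelloFunction`, and the threshold, period and missing-data settings are illustrative assumptions, not the values shown in the talk.

```python
import json


def throttles_alarm(function_logical_id, threshold=0):
    """Build a CloudFormation alarm resource for one Lambda function's Throttles metric."""
    return {
        "Type": "AWS::CloudWatch::Alarm",
        "Properties": {
            "AlarmDescription": (
                f"Throttles for {function_logical_id} exceed {threshold}"
            ),
            "Namespace": "AWS/Lambda",
            "MetricName": "Throttles",
            # Ref on a Lambda function resource resolves to the function name.
            "Dimensions": [
                {"Name": "FunctionName", "Value": {"Ref": function_logical_id}}
            ],
            "Statistic": "Sum",
            "Period": 60,            # seconds; assumed value
            "EvaluationPeriods": 1,  # assumed value
            "Threshold": threshold,
            "ComparisonOperator": "GreaterThanThreshold",
            "TreatMissingData": "notBreaching",
        },
    }


# "HelloFunction" is a hypothetical logical ID used only for this example.
print(json.dumps({"HelloFunctionThrottlesAlarm": throttles_alarm("HelloFunction")}, indent=2))
```

Even in this compact form, one alarm is around twenty lines, and it has to be repeated per function and per metric.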
This is certainly a beautiful journey, but a long one.
What if you don't have the time or resources to
focus on that right now? Okay,
can we automate this?
Please keep in mind that question, but let's first understand how
the system works. When we deploy applications to
AWS, we typically use infrastructure
as code, and most tooling for AWS
infrastructure as code is built around CloudFormation
stacks. This can be the Serverless Framework,
AWS SAM or CDK.
Every application is composed of one
or more stacks, and each stack is a set of
resources like Lambda functions, SQS queues,
DynamoDB tables, and so on.
So before we deploy that,
can we see which services should have dashboards
and alarms and magically generate them
without going through the whole bunch of work every
time? And the answer is yes
we can. At fourTheorem we are really
obsessed with observability, and that obsession led us
to build SLIC Watch. The idea of
SLIC Watch is that you just put the plugin into your system and
it will automatically look into your stack and
enrich it with CloudFormation
dashboards and alarms for the resources you are
using. So, as simple as possible,
with the minimum amount of work to get observability up
and going. So how does SLIC Watch
work? If we look at the Serverless Framework
and AWS SAM, both of those are YAML
configurations. When we run a deploy with
these frameworks, what happens behind the scenes?
It takes your YAML configuration and
builds a big JSON, then sends it to CloudFormation,
which then deploys
and updates your resources in AWS.
So what SLIC Watch does is hook
into framework lifecycle events on
a deploy, interpret the
CloudFormation JSON for
each service and create dashboards
and alarms for our application.
So, augmented JSON: you go
from big JSON to much bigger JSON.
This is what then gets deployed into AWS,
and this is how you get observability
baked in from day one.
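The "big JSON to much bigger JSON" step can be sketched in a few lines of Python. This is not the plugin's actual implementation; it is a simplified illustration that assumes the template is already loaded as a dictionary, handles only Lambda functions, adds a single Errors alarm per function, and leaves dashboards out. The function name and the alarm settings are made up for the example.

```python
import copy


def add_lambda_error_alarms(template):
    """Return a copy of a CloudFormation template with one Errors alarm per Lambda function."""
    augmented = copy.deepcopy(template)  # leave the caller's template untouched
    resources = augmented.setdefault("Resources", {})
    # Iterate over a snapshot because new resources are added inside the loop.
    for logical_id, resource in list(resources.items()):
        if resource.get("Type") != "AWS::Lambda::Function":
            continue
        resources[f"{logical_id}ErrorsAlarm"] = {
            "Type": "AWS::CloudWatch::Alarm",
            "Properties": {
                "Namespace": "AWS/Lambda",
                "MetricName": "Errors",
                "Dimensions": [
                    {"Name": "FunctionName", "Value": {"Ref": logical_id}}
                ],
                "Statistic": "Sum",
                "Period": 60,            # assumed default
                "EvaluationPeriods": 1,  # assumed default
                "Threshold": 0,          # assumed default
                "ComparisonOperator": "GreaterThanThreshold",
            },
        }
    return augmented


# Hypothetical stack with one function and one queue.
stack = {
    "Resources": {
        "HelloFunction": {"Type": "AWS::Lambda::Function", "Properties": {}},
        "JobsQueue": {"Type": "AWS::SQS::Queue"},
    }
}
print(sorted(add_lambda_error_alarms(stack)["Resources"]))
```

The same pattern extends to other resource types and metrics: look up the resource type, then emit the alarms and dashboard widgets that make sense for it.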
And of course SLIC Watch is fully configurable.
You can disable what you don't like. You
can even configure the thresholds if the defaults
don't suit you. Here is the QR code
for the website and the link for the repository.
So please visit the repository and read details
on the readme page. If you have any questions,
please talk to us. We will be more than happy to chat about it.
Before SLIC Watch, if you go to the AWS
CloudWatch console, you see no alarms, just a
blank screen. But if you add this plugin to
your project, you will get a wave of
relevant alarms created for the resources
from our stack, and you can even drill
down to see alarm details. And this
is the complete dashboard with all
the relevant metrics from our stack. So everything
is on one page. You can observe the whole system
just by scrolling up and down rather than jumping
between console windows. And so we
understand how the whole distributed system is connected
and what are the signs when something goes wrong.
The other key feature is being able
to have alerts delivered to a channel just
by adding an SNS topic to the configuration.
SLIC Watch will then by default deliver all of
your alarms to that SNS topic, and you can integrate
this SNS topic with monitoring tools like
AWS Chatbot, Opsgenie,
PagerDuty, whatever suits you.
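At the CloudFormation level, routing alarms to a topic comes down to the `AlarmActions` property of each alarm. The Python sketch below shows that idea on a template held as a dictionary. It is an illustration under assumptions, not the plugin's code: the function name `route_alarms_to_topic` and the topic ARN are hypothetical.

```python
def route_alarms_to_topic(template, topic_arn):
    """Add an SNS topic ARN to AlarmActions on every CloudWatch alarm in the template."""
    for resource in template.get("Resources", {}).values():
        if resource.get("Type") != "AWS::CloudWatch::Alarm":
            continue
        actions = resource.setdefault("Properties", {}).setdefault("AlarmActions", [])
        if topic_arn not in actions:  # avoid duplicates on repeated runs
            actions.append(topic_arn)
    return template


# Hypothetical topic ARN and stack, used only for this example.
TOPIC_ARN = "arn:aws:sns:eu-west-1:123456789012:alarms"
stack = {
    "Resources": {
        "HelloFunctionErrorsAlarm": {"Type": "AWS::CloudWatch::Alarm", "Properties": {}},
        "HelloFunction": {"Type": "AWS::Lambda::Function", "Properties": {}},
    }
}
route_alarms_to_topic(stack, TOPIC_ARN)
print(stack["Resources"]["HelloFunctionErrorsAlarm"]["Properties"]["AlarmActions"])
```

Whatever subscribes to the topic, such as a chat integration or a paging tool, then receives every alarm state change without per-alarm wiring.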
So, just to wrap up then. When you are building
event-driven systems, you need to be able
to understand what's going on and you really want
to spot when services are failing before users do.
So this is really what observability
is all about, and observability is your friend from
day one. I hope this is the one takeaway you are
going to get from this talk. CloudWatch is
already a super powerful tool, but it's a toolkit.
You need to put effort into it.
Automation takes away the pain of setting up dashboards
and alarms in CloudWatch, and tools like
SLIC Watch give you that automation.
So we would love to hear your feedback.
Let's make it better together.
Thank you.