Observability and SLIC Watch

Abstract

We can’t prevent all types of failures in the cloud world, but we can spot them as soon as possible and react quickly by applying observability. Setting up dashboards and alarms in CloudWatch is a bit of a pain; ‘SLIC Watch’ automates away the pain and enables great levels of observability.

Summary

Diren will give an overview of how to set up dashboards and alarms in AWS CloudWatch for serverless applications. She will also introduce our open source observability plugin, SLIC Watch, which enables better levels of observability.
CloudWatch is already a super powerful tool, but it's a toolkit. Automation takes away the pain of setting up dashboards and alarms in CloudWatch. We would love to hear your feedback. Let's make it better together.

Transcript

This transcript was autogenerated. To make changes, submit a PR.

Hello everyone. Thank you for joining me today to observe this talk. I'm delighted to be here at Conf42 Observability 2023. Well, Star Wars is a godsend, but in the cloud world failures are always around the corner. Unfortunately, we cannot prevent all failures, but we can try to detect them as soon as possible and react quickly. This is where observability comes in, and observability is your friend from day one. In this talk I will give an overview of how to set up dashboards and alarms in AWS CloudWatch for serverless applications. I will also introduce our open source observability plugin, SLIC Watch, which automates away the pain of setting up dashboards and alarms in CloudWatch and enables better levels of observability.

I'm Diren from fourTheorem. I have over one year and under two years of working experience in cloud. I actually have one year and eight months of experience. Maybe I should have said that exact number, but I couldn't because, as we all may agree, we women like talking, right? But unfortunately we cannot see this happening at technical events. I'm here and I really want to encourage other females to join me. I'm enthusiastic about serverless, observability and chaos engineering. I currently maintain our open source observability plugin, SLIC Watch, which I will be talking about. Please feel free to get in touch with me and ask any questions that you have. See my Twitter, LinkedIn and GitHub accounts.

Here is a little bit about fourTheorem. fourTheorem is an AWS consulting partner and does a lot of work on AWS. If you want to learn more about fourTheorem, please visit the website, but I wanted to highlight some of the work we provide for our clients: cloud migrations, training and cloud enablement, building high-performance serverless applications, and cutting cloud costs. Our architects write books and we maintain a bunch of open source projects.
These are Middy, the Lambda middleware framework; SLIC Starter, a complete starter project for serverless applications on AWS; and SLIC Watch, the observability plugin which I will be talking about. We are also running a weekly podcast on AWS topics. I highly recommend you check the website, which is awsbites.com.

Let's start with the definition of observability. Observability is the ability to measure the internal state of a system by examining its outputs. There are three major pillars of observability: logs, metrics and traces. A log is an immutable, timestamped record of discrete events that happened over time. Metrics are all the analytics and all the data you are gathering, so you can see: do I have a problem? How full is my disk drive? How much CPU am I using? A trace represents the end-to-end journey of a request through a distributed system, so you can see where the problem is.

In AWS native, the main tool for observability is CloudWatch. What does CloudWatch give you? Logs with Insights, metrics, dashboards, alarms, canaries, and distributed tracing with X-Ray. Let's talk a little bit more about what CloudWatch gives you out of the box. The first thing we need to understand is that it's a toolkit that allows you to build observability solutions yourself. Metrics are automatically generated for all services, which is great. There are lots of dashboards, but they are oriented by service, not by application. And there are zero alarms out of the box, which actually makes sense, because the alarming you really want to do is based on understanding your application context, like at what level you want to alert.

CloudWatch is already a super powerful tool, but setting up dashboards and alarms in CloudWatch is a bit of a pain and requires some work. You need to research and understand the available metrics. You need to decide your thresholds, at what level you want to alert.
You need to write infrastructure as code for your application dashboards as well as for your service metric alarms. And every time you add a new service to your application, or even a new Lambda function, you need to update your dashboards and alarms, and you need to make sure that you do that for each stack in your application. To be honest, that's a lot of work, but it is really worth it to have the observability. I highly recommend you treat your observability as you treat your unit tests: maintain it, nurture it, feed it. It will definitely pay you back.

Thankfully, there are great places where we can research and understand the available metrics. The AWS Well-Architected Framework is very well written, and the Operational Excellence pillar is the one that covers observability. If you are doing serverless work, definitely look at the Serverless Lens, which applies these six pillars. It provides very good guidance on metrics, but you still need to pick your thresholds.

Okay, but which metrics are really essential? This is where the golden signals come on stage. These are the four key metrics. The first metric you want to look for is latency: the amount of time it takes to service a request. The next metric is traffic: the volume of requests that a system is currently handling. Then errors: the rate of requests that fail. And finally, saturation is the percentage of available resources being consumed.

So this is what infrastructure as code looks like. This is an alarm for the Lambda Throttles metric. There is some work you need to do if you are going to write those for every single Lambda function in your application. This is certainly a beautiful journey, but a long one. What if you don't have the time or resources to focus on that right now? Okay, can we automate this? Please keep that question in mind, but let's first understand how the system works when we deploy applications to AWS.
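
The slide showing the Lambda throttles alarm is not reproduced in this transcript. As a rough sketch of what such a hand-written alarm might look like, the Python below builds a CloudFormation `AWS::CloudWatch::Alarm` resource as a plain dictionary. The helper name `throttles_alarm`, the logical ID `HelloFunction`, and the default threshold and period are assumptions chosen for illustration, not values from the talk.

```python
import json


def throttles_alarm(function_logical_id: str, threshold: int = 0, period: int = 60) -> dict:
    """Build a CloudFormation alarm resource for one Lambda function.

    function_logical_id is the logical ID of an AWS::Lambda::Function in the
    same template (hypothetical here). The threshold and period defaults are
    illustrative assumptions, not recommendations from the talk.
    """
    return {
        "Type": "AWS::CloudWatch::Alarm",
        "Properties": {
            "AlarmDescription": f"Throttles for {function_logical_id} exceed {threshold}",
            "Namespace": "AWS/Lambda",
            "MetricName": "Throttles",
            # Ref on a Lambda function resource resolves to the function name.
            "Dimensions": [
                {"Name": "FunctionName", "Value": {"Ref": function_logical_id}}
            ],
            "Statistic": "Sum",
            "Period": period,
            "EvaluationPeriods": 1,
            "Threshold": threshold,
            "ComparisonOperator": "GreaterThanThreshold",
            "TreatMissingData": "notBreaching",
        },
    }


if __name__ == "__main__":
    # One such resource is needed per function, per stack.
    print(json.dumps({"HelloFunctionThrottlesAlarm": throttles_alarm("HelloFunction")}, indent=2))
```

Even in this compact form, the definition has to be repeated and kept up to date for every function, which is the maintenance cost the talk describes.
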
We typically use infrastructure as code, and most tooling for AWS infrastructure as code is built around CloudFormation stacks. This can be the Serverless Framework, AWS SAM or CDK. Every application is composed of one or more stacks, and each stack is a set of resources like a Lambda function, an SQS queue, a DynamoDB table, and so on. So before we deploy that, can we see which services should have dashboards and alarms and magically generate them without going through the whole bunch of work every time? And the answer is yes, we can.

At fourTheorem we are really obsessed with observability, and that obsession led us to build SLIC Watch. The idea of SLIC Watch is that you just put the plugin into your system; it will automatically look into your stack and enrich it with CloudFormation dashboards and alarms for the resources you are using. So it is as simple as possible, with the minimum amount of work to get observability up and going.

So how does SLIC Watch work? If we look at the Serverless Framework and AWS SAM, both of those are YAML configurations. When we run a deploy with these frameworks, what happens behind the scenes? It takes your YAML configuration and builds a big JSON, then sends it to CloudFormation, which then deploys and updates your resources in AWS. What SLIC Watch does is hook into the framework lifecycle events on a deploy, interpret the CloudFormation JSON for each service, and create dashboards and alarms for your application. So with the augmented JSON, you go from a big JSON to a much bigger JSON. This is what then gets deployed into AWS, and this is how you get observability baked in from day one.

And of course SLIC Watch is fully configurable. You can disable what you don't like. You can even configure the thresholds if the defaults don't suit you. Here is the QR code for the website and the link for the repository. Please visit the repository and read the details on the README page. If you have any questions, please talk to us.
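
To make the "big JSON to much bigger JSON" idea concrete, here is a minimal sketch of template augmentation. It is not SLIC Watch's actual implementation: the function names, the alarm naming scheme, and the choice of the Lambda `Errors` metric are assumptions for illustration. It walks the `Resources` section of a CloudFormation template, finds Lambda functions, and adds one alarm resource per function.

```python
import copy


def errors_alarm(function_logical_id: str, threshold: int = 0) -> dict:
    """Alarm resource on the Lambda Errors metric (illustrative defaults)."""
    return {
        "Type": "AWS::CloudWatch::Alarm",
        "Properties": {
            "Namespace": "AWS/Lambda",
            "MetricName": "Errors",
            "Dimensions": [
                {"Name": "FunctionName", "Value": {"Ref": function_logical_id}}
            ],
            "Statistic": "Sum",
            "Period": 60,
            "EvaluationPeriods": 1,
            "Threshold": threshold,
            "ComparisonOperator": "GreaterThanThreshold",
        },
    }


def add_lambda_alarms(template: dict, threshold: int = 0) -> dict:
    """Return a copy of the template with one alarm added per Lambda function."""
    augmented = copy.deepcopy(template)  # leave the caller's template untouched
    resources = augmented.setdefault("Resources", {})
    # Iterate over a snapshot because new entries are added inside the loop.
    for logical_id, resource in list(resources.items()):
        if resource.get("Type") == "AWS::Lambda::Function":
            resources[f"{logical_id}ErrorsAlarm"] = errors_alarm(logical_id, threshold)
    return augmented


# Hypothetical two-resource stack: only the function gets an alarm.
template = {
    "Resources": {
        "HelloFunction": {"Type": "AWS::Lambda::Function", "Properties": {}},
        "OrdersTable": {"Type": "AWS::DynamoDB::Table", "Properties": {}},
    }
}

if __name__ == "__main__":
    print(sorted(add_lambda_alarms(template)["Resources"]))
```

A real plugin would cover more resource types and also generate dashboards; the sketch only shows the shape of the transformation that happens before the template reaches CloudFormation.
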
We will be more than happy to chat about it.

Before SLIC Watch, if you go to the AWS CloudWatch console, you see no alarms, just a blank screen. But if you add this plugin to your project, you will get a wave of relevant alarms created for the resources in your stack, and you can even drill down to see the alarm details. And this is the complete dashboard with all the relevant metrics from our stack. Everything is on one page. You can observe the whole system just by scrolling up and down rather than jumping between console windows. And so we understand how the whole distributed system is connected and what the signs are when something goes wrong.

The other key feature is being able to have alerts delivered to a channel simply by adding an SNS topic to the configuration. SLIC Watch will then, by default, deliver all of your alarms to that SNS topic, and you can integrate this SNS topic with monitoring tools like AWS Chatbot, Opsgenie, PagerDuty, whatever suits you.

So, just to wrap up: when you are building event-driven systems, you need to be able to understand what's going on, and you really want to spot when services are failing before users do. This is really what observability is all about, and observability is your friend from day one. I hope this is the one takeaway you are going to get from this talk. CloudWatch is already a super powerful tool, but it's a toolkit. You need to put effort into it. Automation takes away the pain of setting up dashboards and alarms in CloudWatch, and tools like SLIC Watch give you that automation. We would love to hear your feedback. Let's make it better together. Thank you.
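
In plain CloudFormation terms, delivering alarms to an SNS topic means listing the topic ARN under each alarm's `AlarmActions` property. The sketch below shows that step on a template dictionary. It is an illustration under assumptions, not SLIC Watch's code: the function name `route_alarms_to_topic` and the example ARN are hypothetical.

```python
import copy


def route_alarms_to_topic(template: dict, topic_arn: str) -> dict:
    """Return a copy of the template where every alarm notifies topic_arn."""
    routed = copy.deepcopy(template)
    for resource in routed.get("Resources", {}).values():
        if resource.get("Type") != "AWS::CloudWatch::Alarm":
            continue
        actions = resource.setdefault("Properties", {}).setdefault("AlarmActions", [])
        if topic_arn not in actions:  # avoid duplicate entries on repeated runs
            actions.append(topic_arn)
    return routed


if __name__ == "__main__":
    # Hypothetical ARN and resources, for illustration only.
    example = {
        "Resources": {
            "HelloFunctionErrorsAlarm": {"Type": "AWS::CloudWatch::Alarm", "Properties": {}},
            "HelloFunction": {"Type": "AWS::Lambda::Function", "Properties": {}},
        }
    }
    arn = "arn:aws:sns:eu-west-1:123456789012:alarms"
    print(route_alarms_to_topic(example, arn)["Resources"]["HelloFunctionErrorsAlarm"])
```

Tools such as AWS Chatbot, Opsgenie or PagerDuty can then subscribe to that single topic, so the alert routing lives in one place.
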

Conf42 Observability 2023 - Online

June 08 2023

Diren Akkoc

Junior Cloud Software Developer @ fourTheorem
