Observability: one of the strongest muscles for SRE

Abstract

The SRE literature discusses what matters most in an SRE implementation, and it usually points to the team or the tools as the greatest muscle. But it is easy to forget how to quickly look at the big picture of your IT components.

Summary

Jonathan Jill: Today we're talking about observability, one of the strongest muscles for SRE: what observability is and how we can use it to speed up our teams.
The agenda covers SRE, then DORA and how we can use it, then observability, the golden triangle, why observability matters for us, the CNCF landscape, and a short demo.
SRE stands for site reliability engineering. The objective of SRE is reliability. While DevOps focuses on getting code to production, SRE ensures that code running in production works properly.
The DORA metrics come from DevOps Research and Assessment. This is a very important body of work for DevOps, and it also applies to SRE. The research defines four metrics: deployment frequency, lead time for changes, mean time to recover, and change failure rate. Identifying what happens inside your company so that you can measure them is the biggest challenge.
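As a rough illustration of how the four DORA metrics could be computed once the deployment data has been collected, here is a minimal Python sketch. The `Deployment` record, its field names, and the sample values are assumptions made for this example and do not come from the talk; recovery time is measured here from deployment to restoration, which is only one possible convention.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean, median
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def hours(delta):
    """Convert a timedelta to hours as a float."""
    return delta.total_seconds() / 3600


def dora_metrics(deployments, period_days):
    """Compute the four DORA metrics over a period of `period_days` days."""
    lead_times = [hours(d.deployed_at - d.committed_at) for d in deployments]
    failures = [d for d in deployments if d.failed]
    recoveries = [hours(d.restored_at - d.deployed_at)
                  for d in failures if d.restored_at]
    return {
        "deployments_per_day": len(deployments) / period_days,
        "median_lead_time_hours": median(lead_times) if lead_times else None,
        "change_failure_rate": len(failures) / len(deployments) if deployments else None,
        "mean_time_to_recover_hours": mean(recoveries) if recoveries else None,
    }


# Made-up sample data covering two days.
sample = [
    Deployment(datetime(2024, 1, 1, 8), datetime(2024, 1, 1, 10)),
    Deployment(datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 15),
               failed=True, restored_at=datetime(2024, 1, 1, 16)),
    Deployment(datetime(2024, 1, 2, 8), datetime(2024, 1, 2, 12)),
    Deployment(datetime(2024, 1, 2, 10), datetime(2024, 1, 2, 18),
               failed=True, restored_at=datetime(2024, 1, 2, 21)),
]

metrics = dora_metrics(sample, period_days=2)
```

The hard part described in the talk is not this arithmetic but gathering trustworthy timestamps from repositories, ticket systems, and incident records in the first place.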
Observability is a system property: the degree to which the system can generate actionable insights. Observable systems are observed through specialized tools, so-called observability tools. How do we sell to managers that SRE is the right path for us if we already have DevOps?
APM is the capability to observe, to see, and to obtain data, but not to generate impact for the organization. Traces capture timing and context information for each interaction, allowing developers and operators to identify bottlenecks and performance issues. How do you enable traces?
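To make the idea of a trace more concrete, here is a minimal Python sketch of manual instrumentation using only the standard library. It is not OpenTelemetry and not the speaker's code: the `span` helper, the in-memory `finished_spans` list, and the `handle_login` function are hypothetical names chosen for illustration, and the sketch is single-threaded only.

```python
import time
import uuid
from contextlib import contextmanager

finished_spans = []  # hypothetical in-memory "exporter" for finished spans
_open_spans = []     # stack of currently open spans, innermost last


@contextmanager
def span(name, **attributes):
    """Record timing and context for one unit of work."""
    parent = _open_spans[-1] if _open_spans else None
    record = {
        # Child spans share the trace id of their parent.
        "trace_id": parent["trace_id"] if parent else uuid.uuid4().hex,
        "span_id": uuid.uuid4().hex[:16],
        "parent_id": parent["span_id"] if parent else None,
        "name": name,
        "attributes": attributes,
    }
    _open_spans.append(record)
    start = time.perf_counter()
    try:
        yield record
    finally:
        record["duration_ms"] = (time.perf_counter() - start) * 1000
        _open_spans.pop()
        finished_spans.append(record)


def handle_login(username):
    """Hypothetical login handler with one span per step."""
    with span("POST /login", user=username):
        with span("db.read_user"):
            time.sleep(0.01)   # stand-in for reading the users table
        with span("session.store"):
            time.sleep(0.005)  # stand-in for storing the user's access
    return True
```

After calling `handle_login("alice")`, `finished_spans` holds three records that share one trace id, with the two inner spans pointing at the login span as their parent. A real setup would more likely use an instrumentation library and export the spans to a tracing backend.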
So why observability? That is the main part of this session. How can you see whether the archive, the database, the load balancer, the firewall, or the cloud provider is at fault? When you enable observability, you observe and you also alert.
The CNCF landscape contains a huge number of observability and monitoring tools, including the tools that every cloud provider offers today. You can choose depending on your expertise, or you may need to look at the landscape in more depth to define the stack for your company.
For the demo we use Docker, so you need to install Docker first. Don't run this in production: it is not meant for production environments, only for running locally as a POC. We deploy cAdvisor first, Prometheus second, and then Grafana.
The first step is to install cAdvisor. cAdvisor acts as a proxy that obtains metrics from the current Docker installation and exports them. Next we run Prometheus, publish port 9090, and mount the Prometheus configuration into the container. We then deploy Grafana to see what Prometheus is collecting.
The next step is to import the dashboard that we want to look at today. How can you import that dashboard? Open the three-line menu, go to Dashboards, and in the New section choose Import. Then click Load. On the page that loads, select your data source, which is Prometheus.
With Grafana and Prometheus you can also quickly set up alerting. You can route every rule through contact points. There are many ways to deliver that information, such as Alertmanager or integration with a ticket manager.
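As an illustration of what a custom webhook receiver might do with an incoming alert, here is a small Python sketch that reduces a payload to the most valuable fields. It assumes an Alertmanager-style JSON body with an `alerts` list whose entries carry `status`, `labels`, and `annotations`; the exact payload your Grafana version sends should be checked against its documentation, and the `summarize_alerts` function and the sample payload are hypothetical.

```python
def summarize_alerts(payload):
    """Keep only firing alerts and a short message for each one."""
    firing = [a for a in payload.get("alerts", []) if a.get("status") == "firing"]
    messages = []
    for alert in firing:
        name = alert.get("labels", {}).get("alertname", "unknown")
        summary = alert.get("annotations", {}).get("summary", "no summary")
        messages.append(f"{name}: {summary}")
    return {"firing": len(firing), "messages": messages}


# Made-up payload in the assumed shape.
example_payload = {
    "status": "firing",
    "alerts": [
        {
            "status": "firing",
            "labels": {"alertname": "HighContainerMemory", "name": "grafana"},
            "annotations": {"summary": "Container memory above threshold"},
        },
        {
            "status": "resolved",
            "labels": {"alertname": "HighContainerCPU", "name": "prometheus"},
            "annotations": {"summary": "Container CPU above threshold"},
        },
    ],
}

compact = summarize_alerts(example_payload)
```

The compact result could then be forwarded to a ticket manager or chat tool that the alerting stack does not support directly.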
Observability is a hidden muscle for SRE success. We need to enable observability in our companies. This includes investing in the right tools, prioritizing data quality, and embracing a collaborative, data-driven approach.
That is all for this session. Please feel free to leave a comment or contact me as jtan 24 on GitHub, YouTube, or Twitter. I hope you enjoyed this session. See you.

Transcript

This transcript was autogenerated. To make changes, submit a PR.

Welcome to this talk. Today we're talking about observability, one of the strongest muscles for SRE. So hello everybody, my name is Jonathan Jill. Let me start by talking about myself: who is Jonathan, Jonathan Jill, or Jtan 24? I'm very enthusiastic about Linux and about the community. I also like to share a lot of the concepts that I learn every day on my YouTube channel, in my repository, and on my Twitter. So please follow me if you want to join my current followers and keep up with everything I read. So welcome to this awesome talk about observability and how we can use it to speed up our teams. Before anything else, I want to share this awesome phrase with you: "Life is really simple, but we insist on making it complicated," by Confucius. This is really true, because each of the challenges we face every day can be harder on some days than on others. That is why I like this phrase from Confucius. So what is the agenda for today? Today we're talking about SRE, and then we continue with DORA: what is DORA and how can we use it? Then we continue with observability, the golden triangle, why observability matters for us, the CNCF landscape, and a little demo. With this agenda we define exactly which topics we want to cover in this talk, based on my own perspective and on what I do in my current job. So the first topic is SRE. SRE stands for site reliability engineering. You may have heard this definition in other talks, but the objective of SRE is reliability. With reliability we need to think every day about what happens with my application, with my infrastructure, with my support, because every day brings new challenges for our application. How do we move faster with the application? How can we support it faster? How do we generate reliability for the application?
I take this definition from glossary.cncf.io, which is backed by a very large community that maintains these definitions. So what is it? Site reliability engineering is a discipline that combines operations and software engineering. The latter is applied to infrastructure and operations problems specifically, meaning that instead of building product features, site reliability engineers build systems to run applications. There are similarities with DevOps, but while DevOps focuses on getting code to production, SRE ensures that code running in production works properly. So you have here a mix of DevOps and SRE, but this is the biggest focus for us: how do we support the code that DevOps delivers? How do we take the result of DevOps, this code, and ensure its reliability in production, in staging, or even in dev? It depends on what kind of tests I need to run for every application and how I can test each part of my software, right? So that is the objective for us: how to make the components inside our company more reliable, how to make our teams more reliable, and how to make all of this very useful. You can read in more depth about the problem this concept addresses: ensuring applications run reliably requires multiple capabilities, from performance monitoring and alerting to debugging and troubleshooting. That is the biggest part here: performance monitoring, alerting, debugging, and troubleshooting, right? Because when we have DevOps and we start the DevOps journey, we enable the capability to push code more easily to every environment: dev, staging, and prod. But what happens once I have deployed that application into the environment I requested? Then we need to enable other capabilities. These capabilities also answer a question.
How can I see what happens with my infrastructure, with my application, and with the communication between the application and the other components it needs to connect to? That is the principal objective for us. How does it help? The SRE approach minimizes the cost, time, and effort of the software development process by continuously improving the underlying system. The system continuously measures and monitors infrastructure and application components. When something goes wrong, the system points site reliability engineers to when, where, and how to fix it. This approach helps create highly scalable and reliable software systems by automating operational tasks. If you check these three parts of how SRE helps the company, every part includes a little monitoring, observability, or alerting, which is part of observability. So that is our focus for SRE. I invite you to read more about site reliability engineering, especially at glossary.cncf.io. It is very special because there is a large community behind it. Now a short quote about the importance of SRE implementation: SRE is a critical function for ensuring the reliability, availability, and performance of IT systems and applications. Teams and tools are important to SRE, but equally important is the ability to look at the big picture. The ability to look, right? So in every part we talk about, the point is: hey, I need to look at what happens with my software, my infrastructure, my communication, my networking. That objective is huge, because in every part of SRE you need some concepts from observability. How can you see that? How can you get that information from the application, the infrastructure, the database, the message queue, or something like that? You need to identify exactly how your application works and how your application is connected to other tasks or other systems.
So that is the objective, and this is why this part of SRE is so important. With this in mind, we have the DORA metrics. The DORA metrics come from DevOps Research and Assessment. This is a very important body of work for DevOps, and it also applies to SRE, because this research defines four metrics: deployment frequency, lead time for changes, mean time to recover, and change failure rate. From the names you can probably form a general idea of each item. But the most important question here is: hey, if I run this research for my company, what is the objective? When you make improvements and look at the DORA metrics, you are looking at your maturity: the maturity you have inside the company, your capability to deliver a new feature, the time you take to resolve a new bug in production, or, if production is down, the time you take to recover, and how you can measure that. That is the objective, because you have the four components here. But when you start with these metrics and ask how you can measure them inside your company, that is the hard part, because you need to identify your repositories, the process for requirements, and the repository for features. You need to look at all the scopes behind the IT process, at how a new feature reaches the clients, the customers, and your internal customers as well. That is the biggest part you need to cover: how you can look at the repository, at Jira or Confluence, depending on which requirements management tool you use internally, and ask: hey, how can I collect those metrics, how can I store them, and how do I build something on top of them to see what happened with one commit or one branch or something like that?
That is the biggest challenge behind it, because every day you need to say: hey, I have a process to generate that, maybe I define a standard for commits, I don't know. You need to identify what happens inside your company and whether you need to move beyond DevOps and take the next step with SRE, because with SRE you say: hey, I need more information than I currently have, and I need to start covering the repository, the management of my tasks, and so on. That is very beautiful, because along the way, as you build your report, you find a lot of conceptual definitions that people hold but that are not really true. So when you start to read, to identify, and to define, that process is a very nice path. On the other hand, when you identify and start to collect those metrics, you also look at the maturity, right? The maturity tells you whether you are a low performer, a medium performer, a high performer, or a leader for SRE with best practices for the current IT ecosystem, deploying every day, or a couple of times during the day, or even every hour. That is the objective of this research: to identify exactly where you are, how you could move to the next step, and what your new challenge is for every part of this research. Okay, let's start talking about observability. Observability is a system property, right? This is exactly the definition: it is a system property that defines the degree to which the system can generate actionable insights. It allows users to understand a system's state from its external outputs and take corrective action. I think that is the best part here, because we currently have other tools to generate and collect information from my application, but that is observability. Let us check it during the session. Computer systems are measured by observing low-level signals such as CPU time, memory, and disk space.
Higher-level and business signals include API response times, errors, transactions per second, et cetera. These observable systems are observed, or monitored. "Monitored" is the key word in that last sentence, because currently we monitor: maybe I enable CloudWatch in AWS, or I collect the logs, or I look at the CPU information or the throughput of my network. Or maybe you just monitor your application. That is what we're talking about in this session. These systems are observed through specialized tools, so-called observability tools. A list of these tools can be viewed in the Cloud Native Landscape's observability section. We will see that landscape in a couple of slides. Observable systems yield meaningful, actionable data to their operators, allowing them to achieve favorable outcomes: faster incident response, increased developer productivity, and less toil and downtime. That is one part of reliability for SRE, and one of the pieces that agility sells to us. What happens if I adopt SRE in my company? Hey, you reduce the toil. That is one part of the scope. But the other part is: how can we observe? How can we collect those metrics? How can we build that dashboard? How can we generate ROI for our company? How can we sell to the managers that SRE is the right path for us if we already have DevOps? The next step could be SRE, but you need to identify exactly, with DORA, the maturity that you currently have. Okay. Consequently, how observable a system is will significantly impact its operating and development costs, right? You can deliver this to other teams, and the other teams get the most value from it, because you have made the system observable. You can also forget about the current, I don't know, chats saying: hey, I need to look at the information about my accounts in production, how can I get that information?
If you observe, you can enable a couple of dashboards and deliver them. Then the teams can see what happens with the application without access to the underlying infrastructure. That is a nice part, because you enable the capability, and if you sell that to security, they open their eyes and say: hey, that is the way, because teams can observe the application without needing access to the infrastructure behind it, or to the code, or anything like that. They just observe what happens inside. It's pretty good when you sell this to your company and to your manager: you can take this key feature for your team and reduce the operational tasks behind every need the team has. That is a big one. The next topic is related to APM, application performance monitoring. Observability is different from application monitoring. Up to this point we talked about "monitored", so is it the same? Not exactly. This is where the path forks between what observability is and what APM is. APM is the capability to observe, to see, and to obtain data, but not to generate impact for the organization, because you just see what happens inside your components and you don't do anything more with it. When you move to the next step with observability, you start to generate alerts, and you may take an alert and trigger an automated task to resolve every alert, or a couple of alerts that you have identified. That is the biggest difference here between observability and APM. Other differences exist too, but this is the one that splits them, right? We've talked about APM and observability; now let's talk about the golden triangle. The golden triangle is about the information behind my application or my infrastructure, right?
When we talk about infrastructure, we talk about the components behind my application: databases, queues, networking, firewalls, or other networking tools or elements that you need to deploy to support your application. So first we need to obtain the logs and also store those logs. Logs are unstructured data that provide a record of events and actions within a system, right? Logs are typically text-based and are used to capture information about system behavior, errors, and other relevant events. Logs are very useful for troubleshooting issues and identifying patterns and trends over time. How can we obtain these logs, and how can we store them? Next, the metrics. Metrics are structured data that provide a quantitative measure of system performance or behavior. Metrics are typically numerical values that can be aggregated over time and are used to monitor key indicators of system health and performance. Metrics are very useful for detecting anomalies, setting performance targets, and making data-driven decisions. That is it for metrics. Next, traces. Traces are a record of the interactions between components or services within a distributed system. Traces capture timing and context information for each interaction, allowing developers and operators to identify bottlenecks and performance issues. Traces are useful for understanding how requests flow through a system. That is very important, because when you have traces you can identify exactly which points a request goes through. For example, you access a login page, the request goes down to the back end, and the back end takes that request and performs some operations. What kind of operations could there be?
It could be storing the user's access, reading the database to obtain information from the users table, or checking whether the user logged in successfully and obtaining information about the user. How can you identify that? You enable traces. And how can you enable traces? That is the biggest challenge here, because you need to instrument your application. That is a key part of observability. How can you enable this capability for your application? How can you enable this bigger scope for the application? That is a very nice topic to discuss with your dev team and also with architecture, because you may need to modify your company's framework or create a new framework for that, and also use an architectural design pattern for that part. That is a very nice part, because you define together with other teams how the system will be observed, how you expose those metrics, and how you collect and store them. That is a very awesome and very nice path for you. So why observability? That is the main part of this session, because we're talking about observability. Hey, observability seems pretty good, but do I really need observability? Do I really need to take advantage of it? What happens if I don't have observability in my company? Does anything change? Not really. But you have a lot of new capabilities to enable for your teams, and for yourself, when you enable observability. For example, this is a simple application that you might have in your company. Here you have the customer, and the customer comes to your application through your cloud provider. In the cloud provider you may have set up a firewall to identify whether the request is legitimate or not, and then there are a couple of load balancers to send the request to the correct application.
The application takes that request and runs the logic behind it, and it may query the database or send some requests to your storage. But what happens as this grows? You move to the next step: you start to split your application, you split your business capabilities, and you expand your business. What happens here? You have the same application, or maybe this part is now just a small part of the application, and here you have a challenge. What happens with this Speedy when the SREs start to identify: hey, the firewall is misconfigured, or, I don't know, maybe the database is down or a replica is down? What happens if my application is down? How do I identify that? Or it could be the load balancer, or my region could be down. How can you identify that? Or my archive could be down too. How do you deal with that? How do you identify it? You need to enable observability, because otherwise how can you see that? How can you take action on the application, the infrastructure, or the components that support the application? Normally, when a request fails, you take a look and say: hey, the request reached the app just fine, so the problem is the app. Then you pass the request to the developer team. The developer team says: hey, I ran it locally and it worked for me, which is normal. Or you replay the same request in the development environment or the QA environment, and in that environment it works well. But how can you see whether the archive, the database, the load balancer, the firewall, or the cloud provider is at fault? How do you identify that? That is exactly the question: whether the application works well, whether the components behind the application work well, or whether my infrastructure works well. Hey, how can I take a look at that? I need an observed system for my company, and that is very important, right?
Behind that there could be other components, like DNS, or you implement your security compliance, or you ship new fixes. I don't know, every day you need to cover new features for different users, not for the customer, just for different internal users: security, compliance, legal from your company, or someone else. Then how can you identify exactly at what point the application went wrong? You can observe the system and identify: hey, yesterday we made some small changes for new legal support that the company requested. When we deployed that, the firewall configuration may have been wrong, or the security team may have shipped a new feature too, because they also needed to update the firewall or the security group rules behind my load balancer. Then the application starts to fail. So how can you see that? Again: why observability? Another part of observability is alerting. Observing on its own is just: hey, I can see. But when you enable observability, you observe and you alert. When you start alerting, you move further along the observability path, right? That is the objective of observability. Another part is: hey, we need to look at dashboards, we need to automate some remediation for case X, or something like that. You could also move to observability as code. There are new approaches for our companies: you could instrument the application yourself, or you could have a Kubernetes cluster with operators inside it that collect information from the applications and do the instrumentation for you. That is one way of doing it. So bye bye, little Speedy, and let's start talking about the CNCF landscape, which contains a huge number of observability and monitoring tools. Here we also have every tool that every cloud provider offers today.
So we have Amazon CloudWatch, AppDynamics, Application High Availability Service, Applications Manager, AppNeta, AppOptics, AppSignal, Aternity, Azure Monitor, Beats. You have here a lot of tools from the community, and you can see how the biggest companies use them for their services. For logging you can use Graylog, Humio, Loggie, Loggly, Fluentd, Grafana Loki, Elastic, or Logstash. You can choose depending on your expertise, or you may need to look at this landscape in more depth to define the stack for your company, right? How about tracing? How can we enable tracing? Hey, there is a huge community behind these components, and that is OpenTelemetry. Its community is the second biggest after Kubernetes: Kubernetes is first, and OpenTelemetry is second. I recommend you take a look at this community. They are very active, shipping new features all the time, and they support a lot of languages and components. Okay, another part of observability is chaos. Yes, chaos is part of observability, so we need to take a look at that as well, and it's pretty awesome. Here you can look at the CNCF landscape to see what is available. There is also information about continuous optimization, which sits behind observability. Ask yourself: hey, how do I configure that? How can I collect that log? How can I collect that trace? You enable other capabilities when you start an observability journey. You also enable the capability for optimization, because you can see: hey, I have an instance that is overloaded by the application, is that normal or not? Or you identify that the pods you deployed in the Kubernetes cluster are overloaded too, so you create an HPA definition for them, or you identify the process they run and move it to another node.
So that is another capability that you enable when you start the observability journey. But then you need to look at your company and at exactly how you move in that direction. First you start by defining what observability is and how the company will adopt observability, right? So we move on to the demo. This is the repository. If we click here, we go straight to the repository. Here you can see the information about how to run this demo. For this part we use Docker, so you need to install Docker first. Here is the disclaimer: don't run this in production, because it is not for production environments. It is just for running locally, for your POC, to identify how you can enable observability for your Docker environment, right? So we deploy cAdvisor first, Prometheus second, and Grafana third, right? With that in mind, I put here some references that I used to build this and to keep the README file short for you. The license here is the Apache license, for you, your time, and your team. You can drop this into your code and I don't mind. That is part of what happens when you work with the community, right? So I cloned the repository here. You can also download the repository; let me check here. When you run git remote, you see the current information about the repository that I cloned. We also have the files here, and I have one extra folder, because in the README file there is an optional step that starts workloads in your Docker environment. I added that to make it easier for you. I also added an entry to the .gitignore so this folder isn't committed, because it is only used after you download the repository and shouldn't generate garbage in the repository. With that in mind, we start executing the instructions in the repository. The first is to install cAdvisor.
So let me check here: docker ps. We have cAdvisor, Grafana, and Prometheus from before, so let me remove them: cAdvisor, Prometheus, and Grafana, right? That also covers the optional step. docker ps again: here we go, we have a clean environment. So the first part is to install Docker, sorry, install cAdvisor. What is cAdvisor? cAdvisor acts as a proxy that obtains metrics from the current Docker installation and exports those metrics for you, right? Going back to the presentation, we have the localhost address here, so we can copy and paste it into a new tab. Here we go. We now have the metrics that cAdvisor is collecting from what is currently deployed in the Docker environment, right? This is the GUI for cAdvisor, and these are the current metrics. There are a lot of metrics here, because cAdvisor takes these metrics from Docker and exposes them so you can use them however you prefer, right? For that we enable Prometheus. So we copy the Prometheus YAML file. But what does the Prometheus YAML file contain? Let me see here. This is the Prometheus YAML file. For this part you need to check what your internal IP address is for the deployment. In my case I've done that, and we can start deploying this part. So we copy the Prometheus conf to tmp, and then we can start running the Docker container. Let me paste the command here. We run Prometheus, publish port 9090, and mount the Prometheus configuration into the container at /etc/prometheus/prometheus.yml. The image we use is the one from Prometheus. Here we go. Then docker ps: we currently have two services in Docker. What does cAdvisor do? It reads the metrics, similar to what docker stats shows. docker stats: those are the current metrics behind it. When we bring up the optional step we see more metrics here. Okay.
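The metrics page that cAdvisor exposes is plain text in the Prometheus exposition format. As a rough sketch of what Prometheus does when it scrapes that page, here is a small Python parser. It is a simplification: it ignores HELP and TYPE comments and does not handle every escaping rule of the format. The `parse_metrics` function and the sample values are made up for illustration; `container_memory_usage_bytes` is used as an example metric name.

```python
import re

# metric_name{label="value",...} value [timestamp]
LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)(?:\s+\d+)?$'
)
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE.match(line)
        if not match:
            continue  # skip anything this simplified parser cannot read
        labels = dict(LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


# Made-up sample in the shape of a metrics page.
sample_text = """
# HELP container_memory_usage_bytes Current memory usage in bytes.
# TYPE container_memory_usage_bytes gauge
container_memory_usage_bytes{name="grafana"} 52428800
container_memory_usage_bytes{name="prometheus"} 104857600 1700000000000
up 1
"""

parsed = parse_metrics(sample_text)
memory_by_container = {
    labels["name"]: value
    for name, labels, value in parsed
    if name == "container_memory_usage_bytes"
}
```

In practice you would not parse this by hand: Prometheus scrapes and stores these samples, and Grafana queries them, which is what the rest of the demo sets up.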
The next point is: hey, we need to see what kind of metrics we have, and that is covered by Prometheus, right? Here we see the status of the components and the configuration. I have here the scrape configs: the job for Prometheus's own metrics and the job for the cAdvisor metrics at my internal IP address, right? In the menu here you see Graph, then Alerts, then Status: Runtime, Configuration, and Targets. We have two endpoints: one for cAdvisor, this is the endpoint, and the endpoint for Prometheus, which is called metrics and is Prometheus itself. Here it takes the Docker name, localhost. You see another metric here. If you follow the instructions, you just need to copy and paste, which is the easiest way, right? That is it for targets. When you look at the targets you can check them in the same way. Now we need to deploy Grafana to see what is happening in Prometheus. Because hey, that is very awesome, but the metrics are text-based and, sincerely, I don't have time to read that, which is quite normal. We can copy this and move to the demo session here. Here we go. docker ps: we now have Grafana on port 3000. Go back to the presentation, copy the address, and open Grafana. Here you need to log in with your Grafana account, which is admin, admin. Those are the default credentials: admin, admin. Here we go. We now have access to Grafana. That is pretty awesome, because it is a very nice ecosystem for us. And here you see Grafana. Let me move this here, because the presentation doesn't work well. Okay. That is the first part. When we go to the repository, we see: hey, we need to apply a couple of configurations inside Grafana. This part may render badly here, but you can open the raw README file, which in raw mode is more readable for you, right here.
So, going down, the first thing we enable is this: go to Administration, Data sources, add a new data source, select Prometheus, in the URL put http://<your-ip>:9090, navigate to the bottom, Save & test, and done. Right, let me drop this here and move on. So how do we get there? Through the three lines at the top. Okay, the three lines at the top, then Administration and Data sources. Right. Here you can add a data source and select Prometheus, which shows all the configuration. Then you need to put in the URL: http://, then my IP, which is where Grafana is going to take the information from Prometheus. You can copy and paste it here and just change the port. Right. The port is the one we saw earlier, 9090. Okay, I navigate to the bottom and click Save & test. That is green, which is awesome for us because green means everything is going well.

Okay. Return to Home, because the next step is to enable the new dashboard that we want to look at today. So let me move here very quickly. We also follow the reference here, since we are deploying these monitors for our system. For that you open this web page, where there is a little button for the dashboard: you copy the ID to the clipboard and come back here. How do you import that dashboard? Go to the three lines, go to Dashboards, and in the New section (let me move this pretty quickly) choose Import and paste the ID that you copied on the previous page. Then click Load. On the page that loads, you select your data source, which is Prometheus. You can change the name or create a new folder for it, then click Import, and voila. This is the information for the current metrics inside Docker. That is pretty awesome because you can see exactly what is happening with your Docker.
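The data source step done through the Grafana menus above can also be automated. This is a minimal sketch, assuming Grafana is on port 3000 with the default admin/admin credentials from the demo, and using Grafana's HTTP API endpoint `POST /api/datasources`. The helper names and the example IP are hypothetical.

```python
import base64
import json
from urllib.request import Request, urlopen


def build_datasource_payload(prometheus_url: str, name: str = "Prometheus") -> dict:
    """Build the JSON body that registers Prometheus as a Grafana data source."""
    return {
        "name": name,
        "type": "prometheus",
        "url": prometheus_url,   # e.g. http://<your-ip>:9090, as in the demo
        "access": "proxy",       # Grafana's backend performs the queries
        "isDefault": True,
    }


def create_datasource(grafana_url: str, payload: dict,
                      user: str = "admin", password: str = "admin") -> dict:
    """POST the payload to Grafana using basic auth (demo credentials assumed)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = Request(
        f"{grafana_url}/api/datasources",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
    with urlopen(request, timeout=5) as resp:
        return json.load(resp)
```

For example, `create_datasource("http://localhost:3000", build_datasource_payload("http://192.0.2.10:9090"))` would mirror the manual Save & test step, with `192.0.2.10` standing in for your own IP. Outside a demo, the default password should be changed before exposing Grafana.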
But here is another Docker monitoring dashboard for Grafana too, so you can copy that and do it the same way: three lines, Dashboards, save the changes, New, Import, paste the ID of the other dashboard, select Prometheus and import. Voila. This is another dashboard where you can see the same information with another view. That is pretty awesome because we have enabled the capability to observe our system, right? That is the first part.

But with Grafana and Prometheus you can also move here very quickly and generate alerting. You can put your alerts here, with the rules that make the most sense for you and your company, based on the behaviors you identify in your Docker, your application, your Kubernetes cluster or your infrastructure. You can enable here all the alerting that you need. You can also route every rule to the contact points. There are many ways to deliver that information: Alertmanager, integration with a ticket manager, Discord, email, Kafka, PagerDuty, an alert in Slack for your team, Telegram, or something like that. You can connect Grafana with the whole scope of your current stack and your IT infrastructure. The webhook is pretty awesome, because with it you can send alerts to a third-party provider that Grafana doesn't support directly: you send the alert information and what happened with that alert, and it can carry a custom payload, so you send just the related, most valuable information and the alert becomes more useful for you, right?

So we can move on very quickly. That is the demo. Please tell me in the comments what you think of this little demo of observability. Observability is a hidden muscle for SRE success. We need to enable observability in our companies.
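To make the webhook contact point from the demo more concrete, here is a minimal sketch of a receiver that accepts the alert notification and keeps only the most useful fields. It assumes the Alertmanager-style JSON body that Grafana's webhook contact point sends by default (an `alerts` list whose items carry `status`, `labels` and `annotations`). The port, class and function names are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_alerts(payload: dict) -> list[str]:
    """Reduce a webhook payload to one short line per alert."""
    lines = []
    for alert in payload.get("alerts", []):
        name = alert.get("labels", {}).get("alertname", "unnamed")
        status = alert.get("status", "unknown")
        summary = alert.get("annotations", {}).get("summary", "")
        line = f"[{status}] {name}"
        if summary:
            line += f": {summary}"
        lines.append(line)
    return lines


class AlertHandler(BaseHTTPRequestHandler):
    """Accepts POSTed alert notifications and prints the summarized lines."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            payload = None
        if not isinstance(payload, dict):
            self.send_response(400)  # reject bodies that are not a JSON object
            self.end_headers()
            return
        for line in summarize_alerts(payload):
            print(line)
        self.send_response(204)  # acknowledge with no response body
        self.end_headers()


def serve(port: int = 5001) -> None:
    """Run the receiver; point the Grafana webhook URL at this host and port."""
    HTTPServer(("0.0.0.0", port), AlertHandler).serve_forever()
```

Running `serve()` and setting the webhook contact point URL to `http://<your-ip>:5001` would print one line per alert. From there the same function could forward the trimmed message to whichever provider Grafana does not integrate with directly.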
How does observability make our teams grow? How can we enable other things for the company? We also covered the view of SRE and DORA, observability, the golden triangle, alerting and remediation, and a little demo with some practice for observability. Another point is that effective observability requires a focus on business value. That is its objective, along with alignment with organization goals and a culture of continuous improvement. How do you fit it in, and how do you make the objective of observability the same as the company's? It is not just "hey, I have a way to build an observability stack" when the company doesn't need observability. I don't know. You need to work this out with your company. Another point is the best practices, which include investing in the right tools, prioritizing data quality and embracing a collaborative, data-driven approach. That is another thing you can enable for observability.

And thank you. That is all for this session. Please feel free to leave a comment or contact me as jtan 24 on GitHub, YouTube or Twitter, or you can find me as Johnny Pong on LinkedIn. Please feel free to contact me. I have some time to respond to you, so thank you for your time. I hope you have enjoyed this session. Thank you and see you.

Conf42 Site Reliability Engineering 2023 - Online

May 04 2023

Jhonnatan Gil Chaves

DevOps Engineer @ Globant
