We'll use an example application to describe how to define SLIs and SLOs, including an overview of the architecture, a how-to for developing SLOs, and suggestions for implementing them. We'll also cover how to identify critical user journeys (CUJs) and offer recommendations for implementing metrics to use as SLIs and SLO targets.
Writing postmortems after incidents and outages is an essential part of Google's SRE culture. They are blameless, widely shared internally, and allow us as an organization to maximize the insights from failures. We touch on how postmortems are written and used at Google, as well as how they can help in making decisions and driving improved reliability. We also show how you can get started with...