Improve Your Automation to Reduce Toil

Abstract

In the course of your day as an SRE, your knowledge and expertise are in high demand. You can’t do every task every person in your org needs from you without the help of comprehensive automation.

Automation can be tricky. Some systems aren’t built with automation in mind, but assume that a human being will be there to keep an eye on things and fix errors on the fly, and we can’t be everywhere when there’s too much to do.

Plus, you want to provide access to automation for the right folks and keep a record of when the tools were used.

In this talk, we’ll cover some things to keep in mind when you’re building out your automation library, characteristics of good automation, and give you a look at PagerDuty Rundeck, a platform that will help you share your expertise with other folks in your organization.

Build automation that works for you and gives you your time back!

Summary

Mandi Walls is a DevOps advocate at PagerDuty. You can reach her on Twitter as lnxchk or by email at mwalls@pagerduty.com.
Automation is a key component in the management of complex real-time systems. To get work done and to do it right, teams rely on automation for their common tasks. Automation can take different forms: a library of scripts and tools, a set of YAML files ingested by another tool, or configuration on a build server. The overall goal is reducing toil.
Toil is the repetitive, tactical work that increases linearly as the size of the environment increases. Leaving it neglected will impact user experience sooner or later. Complexity, speed, curbing mistakes, and reducing toil are common drivers for automating manual processes.
One myth is that you can automate yourself out of a job; the XKCD joke is that you end up spending most of your time refactoring your automation. In practice, smoothing out the work required to get basic tasks done leaves more time for the fun stuff. Instead of running manual processes, you'll be performing higher-value tasks on a daily basis.
Fully automated environments can hinder skill development for junior engineers. Maintaining the automation tools for a service is part of maintaining that service itself. Learning how to maintain and test the automation and runbooks is a key way to help new team members learn about the services.
There's a wide variety of tools and platforms available to help you automate workflows across all kinds of ecosystems. The talk borrows a list of requirements for good automation from Lee Atchison's book Architecting for Scale: we want our automation to be testable, flexible, and repeatable.
Idempotency is the property of certain operations whereby they can be applied multiple times without changing the result beyond the initial application. For tasks that you automate, there is no human there to read the output, so you'll want to add a check to your automation to determine whether the change needs to be made in the first place.
You want to automate the tasks you do most often or the ones that take the most time. How comfortable your team is with any of these things being automated definitely varies. What tasks still exist in your environment that could be automated?
Rundeck is an automation platform for production teams. It can be combined with PagerDuty for auto-remediation of issues before they become incidents. It allows teams to securely perform tasks in production and then delegate them to other teams.

Transcript

This transcript was autogenerated. To make changes, submit a PR.

Are you an SRE, a developer, or a quality engineer who wants to tackle the challenge of improving reliability in your DevOps? You can enable your DevOps for reliability with ChaosNative. Create your free account at ChaosNative Litmus Cloud.

Hi, welcome to Improve Your Automation to Reduce Toil. My name is Mandi Walls. I am a DevOps advocate at PagerDuty. If you'd like to get in touch with me at any point, I am lnxchk on Twitter, or you can email me at mwalls@pagerduty.com, and you can reach out to PagerDuty at any time at pagerduty, our Twitch handle. So let's talk about automation, right? We're going to start with a bit of the basics. Hopefully you've done a bit of automation before, but maybe you haven't really thought about what you're actually doing in an abstract way. We've heard the word and have kind of a concept of what we're after and the goals that we have for automation. So we're going to talk about exactly what that means and how to put it together in a way that is going to be successful for not just yourself but for your team and your larger organization. So when we're thinking about automation for IT practices, we're going to take some manual processes that someone on our team had to run piece by piece, probably. We're going to take that stuff and we're going to ask some kind of machine to do it for us. We're going to look for tasks that don't need any other thinking, things that machines will do very well. They don't require a lot of nuance, they don't require a lot of creativity, they don't require a lot of lateral thinking in the moment. Those are the things we're going to encode so the machine understands how to manage them. And then we're going to let our humans go do something more interesting and hopefully more valuable to our organization.
Automation is a key component in the management of the complex real-time systems that a lot of us are immersed in, right? The amount of information that has to be processed and taken into account when you're making a decision or making a change is immense. Microservices, cloud platforms, third-party and internally developed software: all these things have their own idiosyncrasies and behaviors, and there's just so much stuff that one person really can't hope to know all of the things about all the things that we work on. So to get work done and to do it right, teams rely on automation for their common tasks. Automation is going to help them avoid mistakes. It's going to increase their reliability, and it's going to increase the repeatability of those tasks. And overall, we're really after reducing toil. We'll talk about what toil actually is in a moment. We're going to use automation to build and test and deploy our software. We are going to use it to create and maintain our infrastructure. Anytime we perform a task more than once, or we have a task that more than one person should be able to perform in exactly the same way, we really should be considering automation so that we can ensure that those tasks are completed as expected. Our team members aren't automatons. They're not robots. So when we want to make sure processes are always performed the exact same way, we produce our automation and we create little robots, right? So automation can take different forms. It might be a library of scripts and tools. It might be a set of YAML files that are ingested by another tool. It might be encoded into the configurations of even just your build server. These are things that collect knowledge from your colleagues and store it in a place where everyone can use all of the amassed knowledge of your entire team. You are collecting up expertise, encapsulating it, and sharing it. So let's talk about what toil means, right?
Toil for IT teams is a four-letter word, literally and figuratively. It's things that you have to get done, but nobody really wants to do them. They're not fun, they're not interesting. Teams want to do interesting things. They want to do tasks that impact the bottom line, that create value, that require some kind of expertise but also creativity, right? And toil is kind of the opposite of that. It's the repetitive tactical work that increases linearly as the size of the environment increases. So think about things like deploying software to all of the systems, adding user or system accounts to all the systems, scaling environments, or testing backups, if you ever actually do that. Even in fully modern systems, the task of regularly building your short-lived environments just becomes a new version of toil, just rerunning those things over and over. It's not something you want to be doing by hand. We know this work has to be done. It doesn't particularly add value in the moment to your customer or your end user. But leaving it neglected will definitely impact user experience sooner or later. It might eventually allow a security breach. It might cause degraded performance. It's kind of like hygiene, right? Think about brushing your teeth. I know we've all been stuck at home for a year and a half, but if you skip a day brushing your teeth, okay, most people won't notice. But if you quit completely or you put it off for a long time, people are going to notice and they're probably going to be concerned for you. And that's sort of the same thing with these toil-based tasks. They have to be completed. But fortunately, unlike brushing your teeth, we don't have to do these tasks with humans. We really should be automating them so that our humans don't have to do them.
So looking at some of the things that we might have as drivers of automation: complexity, speed, curbing mistakes, and reducing toil might be some of our goals for the things we are targeting for automation. Manual processes in general are prone to mistakes. There's plenty of opportunity for typos to cause havoc when you're trying to work in large environments, anything from typing commands in the wrong terminal window, to missing options off a very long command string, to skipping a step when you're copying and pasting out of a wiki or some documentation into the terminal or whatever it is. Modern IT systems might be comprised of hundreds or even thousands of individual components. So you've got your cloud infrastructure, you've got containers, you've got hosts, you've got networks, you have services for monitoring and collecting metrics, you have alerting systems like PagerDuty, you have log collection, you have authentication and authorization. Maybe you're A/B testing or beta testing new features. You have storage and runtimes and all of this stuff. The number of possible combinations is nearly infinite. So interacting with all of these, you're just multiplying how many possible ways things can go wrong. And any of them can change at any time. Third-party services and resources get changed by the vendor on their own schedule. They don't care what your schedule is, and they require updates and changes. And then you've got your internal development, and that requires changes too. It's hard to keep up with where all the things are and what they're supposed to be doing. Even for teams that are super conscientious about documentation, and I know you all are, my list of service instances might be outdated before I finish it, especially if my environment uses sophisticated auto scaling.
Any work that I need to perform on the instances needs the most recent data, and probably some kind of query to an API instead of a hard-coded list of instances. Hopefully we are all beyond the point where we've got one big hosts file for everything. You might not be; it happens. But it's easy to make mistakes when we're using these manual processes, especially things with lots of steps or complex commands. So one thing automation really does for manual processes is that it allows your team to permanently record all the good options and all the preferences and put them in a place where you don't have a chance to forget them or leave them off. So you're getting to a place where all of those good things that you learned are recorded. And then finally, we automate to avoid toil: all that repetitive work that we just talked about, things like patching systems and restarting services or whatever it is. A team still has to know how to do all these things in order to automate them, but performing the same tasks over and over isn't the best use of our team members' time. And then, thinking about what we actually do use our time for, there are a couple of myths about automation. One of them is: is it possible to automate yourself out of a job? It's a fun myth, and I love this XKCD cartoon because it links back to what we were talking about with saving time. The cartoon is, I spend a lot of time on this task, so I should write a program automating it. And you find that in your future, you spend most of your time refactoring your automation. Hopefully you're not doing that. So when you're thinking about what your job is going to be after you've automated all of your tasks, you're really automating yourself into a new job. Your team has automated the toil out of the everyday tasks.
You're hopefully left with the work that requires more creativity, things that require more long-term planning and that provide more strategic value to the organization than deploying thousands of patches over the course of a month. Things like planning improvements, building more robust disaster recovery plans, building and shipping more new features for your users. Smoothing out the work required to get basic tasks done leaves more time for doing all the fun stuff. So at some point, your day-to-day job will look significantly different from what it once was. Instead of running manual processes, you'll be maintaining and updating those things periodically and performing higher-value tasks on a daily basis. So the cartoon is a little bit tongue in cheek. You definitely don't want to be spending all of your time maintaining your automation, but there's going to be work involved, and that hopefully will be fun and of a higher value than some of the toil that you're doing. Another myth is that you don't have to know anything if it's all automated. This is an interesting one, because there is research in systems engineering and automation engineering that indicates having fully automated environments can hinder skill development for junior engineers or new team members. To create really useful automation, someone on the team has to have had, at some point, really robust expertise in the systems and processes being automated. Your team won't be successful if no one knows what's really going on, even with automation. So you have the automation, and maybe the person who wrote it has left. Automation is part of the lifecycle of your application and your systems, and it will need to change as those services change and are updated. It will probably need to be updated when the operating system or dependencies are updated. Maintaining the automation tools for a service is part of maintaining that service itself.
So learning how to maintain and test the automation and runbooks and other tools is a key way to help your new team members learn about the services. Does the start/stop script need to be updated? Are logs now going to a different location? Are updates now being downloaded from a different artifact repository? These are key pieces of system information that your team will use to maintain all the tools and things that you help run. And they are components that will also help new members gain more expertise in the systems that they're working on. So keep this in mind. Making sure you're doing skill development and knowledge sharing with your junior team members, the new folks on your team, can be super important. So let's talk a little bit about what actually makes good automation. We've seen it, but how do we know what it is? I've borrowed a set of requirements from Lee Atchison's book Architecting for Scale because I think he has really encapsulated the key points into a nice bulleted list. There's a wide variety of tools and platforms available to help you automate workflows across all kinds of options and ecosystems. It's hard to know which tools might work best in your environment, but as a baseline, there's a list of requirements here that might be helpful. So looking at some of these: we want our automation to be testable. We want to be able to test that the automation is correct. It's going to be out there in our ecosystem, doing things on our behalf, so we want to trust it. Maybe we're going to apply some TDD methods to the automation code; that will help us as well. So we're relying on our test suite when we're making changes and updates to not just the application code but also the automation around it. And then we want it to be flexible. We want to get a lot of value out of it over time, right?
So we're not relying on hard-coded system lists or other data when we can add a query or an API call. Variables for version numbers and service names are going to help us with upgrades and make sure that we get a long lifecycle out of this automation that we've invested in. And then we want to put our automation into our version control systems and practice code reviews. That will help us maintain this piece of automation over time. It's far better than having a directory full of script.sh_back files for folks to wade through when they have questions. You use your version control, your code reviews, and your test suites, and that's going to help you manage assumptions and catch issues before they become production problems. They also help your new team members, like we were just talking about, become familiar with the services and the automation and all the rest of the pieces in the ecosystem. So keep your automation for related systems the same. Make it applicable to all the other things. This can easily get out of hand if your application teams aren't required to use the same runtimes or other tools. I totally understand that. But anywhere you can reuse components, you should do it. Create official libraries and best-case methods for dealing with your most common components. Make these solutions the easiest way to get work done. Maybe it's a fast track; maybe you don't need a change ticket or other permissions if you're using the blessed version. And then finally, we want repeatability and auditability. We want to know that every time the tool runs, it's going to produce a predictable outcome. Some tools are far better at this than others and provide easier tracking for who made a change or who ran a command. But overall, across the entire marketplace of automation tools, this has been getting a lot better over the past several years.
So hopefully the tools that you're using, and the things that you're looking at or considering, are also providing that kind of feature. So then, keeping all those requirements in mind, let's talk about the big vocabulary word in automation, and that's idempotency. I've heard it pronounced a couple of ways; I'll go with "item-potency". It just flows off the tongue a little better. Hopefully you've heard this term before when you're thinking about automation and automation products. But if not, don't worry, let's do a quick review. Idempotency has a fancy mathematical definition: it is the property of certain operations whereby they can be applied multiple times without changing the result beyond the initial application. It sounds confusing and potentially ominous, but it's super helpful for thinking about what happens when you run a piece of automation more than once. How are you going to handle any messages that it might generate? What happens when you want to add a user to your systems but that account is already present? What happens if you want to rotate logs but they're already rotated? Does your log rotation create empty files, like certain versions of Mandrake Linux did in 2001, which I still have nightmares about? If you are installing a software package and it's already installed, what happens? What if you are concatenating a new configuration line to the bottom of a file? Does it create a new file? Does it delete it? Does it just keep pushing more and more copies of that line into the file? For tasks that you want to automate, you're not going to have a human there. By definition, it's automation. You're not going to have a human there to read the output and say, oh, this has already been done, I don't have to do it again. You'll want to add some check to your automation to determine if the change needs to be made in the first place.
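As a rough illustration of the "check before you change" idea described above, here is a minimal Python sketch for the configuration-line case. The function name `ensure_line` and the example file and setting are hypothetical, chosen only for illustration; they do not come from the talk or from any particular tool.

```python
from pathlib import Path


def ensure_line(path, line):
    """Append `line` to the file at `path` only if it is not already present.

    Returns True if the file was changed and False if it was already in the
    desired state, so running it a second time is a no-op rather than an error.
    """
    target = Path(path)
    text = target.read_text() if target.exists() else ""

    # The check: if the line is already there, the job is done.
    if line in text.splitlines():
        return False

    # Avoid gluing the new line onto an unterminated last line.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    with target.open("a") as handle:
        handle.write(f"{separator}{line}\n")
    return True


if __name__ == "__main__":
    # Hypothetical file name and setting, for illustration only.
    changed = ensure_line("example.conf", "max_connections = 100")
    print("changed" if changed else "already in desired state")
```

Running the script twice should report a change the first time and "already in desired state" the second time, without adding a duplicate line.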
If the thing you're trying to accomplish has already been completed, job done, you don't have to do it. So this is where automation starts to get pretty complex. You don't want your scripts to bomb out, and you don't want them to return an error if the work that they want to have done is already done. You're already in a successful state, so you should report that. You also want to make sure that the state the system is in matches what you want. Some system tools already have some of this built in, so they won't try to redo work that is already done, and they also won't drop an error. Other tools you definitely want to verify, because they might not redo the work, but they'll return an error code, which might mean that your automation fails. And that's not fun either. So if you're building your own tools, you can keep this in mind. You'll want to build in some of this idempotency yourself. I want to create this file: if it's already here, what kind of process am I going to take? If it's here but the contents are wrong, what's my next process? Those kinds of things, and how you build that stuff in. Over time, as you're building up your automation library, you're going to have more skill around checking things on your systems, depending on what kind of systems you're running on, and you'll get some best practices around checking the state of things and how those things are going to work. So another fun thing to think about when we're automating is what stuff do we bother to automate? You could think, oh, I want to automate everything. I just want it to run all by itself without me. But that's not realistic. You want to automate the tasks you do most often or the ones that take the most time. That's going to save you the most effort over the longer term, as well as reducing your overall toil. And another XKCD cartoon, because there's always an XKCD cartoon for these.
This is, again, a little bit irreverent: a plot of how long you can work on making a routine task more efficient before you're spending more time than you save. And it plots it across five years. So how much time you shave off the task is along the y axis, and across the x axis is how often you do the task. If you're doing something all the time, automating that even just a little bit is going to save you a lot of time. But if you have a task that you only have to run in February or whatever, maybe the time it takes to automate that isn't worth it if you aren't going to save that much time. So that's something to think about. When you are thinking about your tasks, we're going to keep in mind our task requirements, we're going to think about our idempotency, and we're going to think about the right tasks to automate at the right time. So let's look at some tasks that we could automate, thinking about these in a slightly different way. This particular graph is looking at specific kinds of tasks, rather than the abstract view of the last one. We're looking at tasks that we might need for incident response, because I work at PagerDuty and we do a lot of incident response. The x axis here is impact. It tells me, if a change needs to be made and I'm going to automate it, is it a change that makes no impact on the running systems, or is it something that has a high impact? And then across the y axis, we have things that are simple, maybe a single step, and then more sophisticated things that might be multistep or multi-node or complex workflows, or that need a little bit of orchestration.
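The two charts described above can be combined into a small planning sketch. This is only one possible way to do it, under stated assumptions: the `Task` fields, the integer scoring of impact and complexity, and the sort order are all hypothetical choices made for illustration, and the five-year horizon follows the cartoon as described in the talk.

```python
from dataclasses import dataclass

DAYS_PER_YEAR = 365


@dataclass(frozen=True)
class Task:
    name: str
    runs_per_day: float
    seconds_saved_per_run: float
    impact: int      # assumed scale: 0 = no impact on running systems, higher = more
    complexity: int  # assumed scale: 0 = single step, higher = multistep or multi-node


def break_even_seconds(task, horizon_years=5):
    """Total time saved over the horizon: the most you could spend automating
    the task before you spend more time than you save."""
    return (task.seconds_saved_per_run * task.runs_per_day
            * DAYS_PER_YEAR * horizon_years)


def first_targets(tasks):
    """Order tasks so low-impact, simple ones come first (the bottom left of
    the chart), breaking ties in favor of the larger time savings."""
    return sorted(
        tasks,
        key=lambda t: (t.impact + t.complexity, -break_even_seconds(t)),
    )


if __name__ == "__main__":
    # Hypothetical tasks and numbers, for illustration only.
    backlog = [
        Task("redeploy_software", 1, 600, 2, 2),
        Task("gather_logs", 5, 60, 0, 0),
        Task("restart_service", 2, 120, 1, 0),
    ]
    for task in first_targets(backlog):
        hours = break_even_seconds(task) / 3600
        print(f"{task.name}: worth up to {hours:.0f} hours of automation work")
```

Summing impact and complexity is a deliberately crude ranking; a team could just as well plot the tasks and pick targets by eye, as the talk suggests.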
So if we're struggling with what things should be automated, you can make a list of tasks and plot them out for your team, so that you can tackle the things that are no-impact and simple, the things on the bottom left, to give yourself some confidence in building automation. Then the tasks with higher impact are things like restarting services, maybe a single service or a group of services, or performing a failover or whatever. And then the highest-impact tasks can change key pieces of your infrastructure: changing your firewalls, rolling back or redeploying software, and those kinds of things. So think about the kinds of tasks that you do most often or that take the longest time, where you can save the most time, plus where your level of comfort is with the kind of complexity and impact the tasks themselves will have for you. How comfortable your team is with any of these things being automated definitely varies. There are definitely teams that are like, we don't want to automate anything. We're super afraid, we're not skilled in this, we think it's going to go crazy, like the brooms in Fantasia or whatever. And you might have a lot of things that actually are pretty complicated. You might have a service that compiles all of its libraries into memory, and you can't really do a cold restart fast. So that might be something that goes to the bottom of the list, and we'll think about automating it after we gain a bit of confidence in our overall automation skills and build up our library. The way your humans interact with all this automation is a bit of a maturity scale as well. How we look at automation evolves over time. As our team gets more comfortable with specific types of automation and gets better at creating it, the human interaction really should decrease, and the automation runs more on its own.
You're building confidence. You're building trust in the processes that you have. So you start out with what we call automation opportunities, which is really a fancy way of saying things haven't been automated yet. Everything's really still manual. What tasks exist in your environment that could be automated but haven't been? Make your plot and figure out your good targets for that. Then we look at human-initiated automation. These are our common scripts and other tools that our team members can run on demand when they want something done or need to complete a task in an unscheduled manner. Your basic scripts and pieces in your library directory. Super helpful. Automation with oversight is automation that starts running on its own in response to some environmental trigger. This might be simple things like your cron jobs that rotate your logs, or more interesting things like restarting a service when it stops responding to queries. Depending on your environment, you might have auto scaling in this category: something that you don't quite trust yet, so you do keep an eye on it, but it kicks itself off automatically. It might still require some humans making sure it runs okay. You might have a little alert that pops up and says, hey, script A is running, please check me out, or whatever. So while it might start on its own, it might also let folks know that it's doing a thing. And if folks aren't comfortable with it yet, they can check it out. But eventually you get to automation with fallback, where the automation runs and only requires humans to look at it if something goes wrong. If it finds an error or an unexplored edge case, it has its own escalation functionality to let humans know that it wasn't able to finish its task or fix what it was supposed to fix. But if all goes well, the automation doesn't necessarily need to report. You'll see a significant reduction in overall toil at this point.
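The "automation with fallback" stage described above might be sketched like this in Python. It is a minimal illustration rather than how any particular platform implements it: the names `run_with_fallback`, `task`, and `escalate` are hypothetical, and a real escalation hook would page or open an incident instead of printing.

```python
from typing import Callable


def run_with_fallback(task: Callable[[], None],
                      escalate: Callable[[str], None]) -> bool:
    """Run `task` quietly; involve a human only if it fails.

    On success nothing is reported. On failure the error is handed to
    `escalate`, which stands in for whatever alerting the team uses.
    """
    try:
        task()
    except Exception as exc:  # any failure is treated as an edge case to escalate
        escalate(f"{type(exc).__name__}: {exc}")
        return False
    return True


if __name__ == "__main__":
    def restart_service() -> None:
        # Hypothetical task that hits an edge case it cannot handle.
        raise RuntimeError("service did not restart")

    # Here the escalation hook just prints; this is an assumption for the demo.
    run_with_fallback(restart_service, escalate=print)
```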
So when you have built up all of this trust in your library of automation and components and scripts, eventually you might get to the monitor-and-evaluate phase, and you might get to this phase with certain tasks and not others. Things get done, edge cases are managed, and instead of tasks creating tickets or alerts, they might just create metrics. Rather than a report saying, hey, Brian in April cleared n requests for X task this week, you might have a metric instead that says X task was completed n times this week by the automation. You're still managing it as part of your environment, but your automation is taking care of all the work and you don't have to do that. Not all of your automation tasks, not all of your systems, will get to this point. They won't all go through all the phases. You might already have some stuff that you totally trust. But some of your complex tasks might only ever get to automation with fallback due to the nature of your complex systems. That's totally fine. The important part is to be thinking about where you're headed, the things that you need to do and the things you need to accomplish with your automation, so that you're constantly evolving and improving and making sure that the automation that you're producing is actually helping your teams get better and reduce their overall toil. So, just to wrap up, let's talk about a tool called Rundeck. I won't do a full demo, but we'd love to show you all the wonderful things that Rundeck does if you'd like to see them. Rundeck is an automation tool. It's an automation platform. It's a software solution specifically built for the kind of automation that production teams need, teams that are working on services that are customer-facing and user-facing, and the tasks that need to get done there.
You can combine Rundeck with PagerDuty for auto-remediation of issues before they become incidents and for accessible tooling for responders during incidents. We love that because we're driving down our mean time to resolve and those kinds of things. The ability to automate some of those tasks, whether it's something simple like gathering up the logs or doing a restart, can create a lot of improvements when we're dealing with incident response. But Rundeck itself really provides more of a generalized platform for your team to securely perform all kinds of tasks in production and then delegate things to other teams. So what you really have is a way to encapsulate expertise. You have the folks that are the subject matter experts, and they can write little bits of automation and different steps: the best practice for restarting this thing, or here's how we do our patches and updates on this platform, or here's how you rotate the logs for this particular runtime, or whatever it is. You take that piece of knowledge and you stick it into your Rundeck server, and then you make it available to anyone who might need it, and you can hook it into your authorization and authentication servers, and off it goes. So they can build up these complex workflows and allow anybody else to manage things. If you're coming at it from the perspective of, say, an SRE team, you create all the tasks and tools and little bits of things that people ask you to do all the time. There are tickets coming in, and they're like, can you rebuild this dev environment for me? And blah, blah, blah. What you can do then is take all this stuff that you've learned and put it into Rundeck and say, hey, yeah, here, Brian, go run this task. You now have permissions to run this task in the dev environment, and that's going to redeploy the thing that you needed, and you don't have to ask us for it anymore. So your automation is less about running on its own all the time.
It's going to be human-initiated automation, but the humans that are initiating it are folks that don't necessarily need to have all that same expertise that, say, your SRE team does. So it can help you deal with that kind of everyday toil and requests and things like that that come in. The way that ends up working is you have your users who maybe need a thing done right now because they're blocked on something, and they can just go to the Rundeck server, request the task, and off it goes, because someone has already prepared the automation, tested it, and then provided it for them in a secure way. They don't ever have to touch any of the nodes that might be out there living in the real world that you don't want them to touch. They can stay away from them.
And then one of the good things about these kinds of platforms is that you get reports back, right? One of the hard things about writing your own automation and building those platforms yourself is producing the kinds of detailed information for sort of the unskilled users, or not necessarily unskilled, but not necessarily knowledgeable in the tasks that you know about, right? They have other primary tasks that they do, and you're giving them a bit of abstraction for things where they know what they want the outcome to be, but they don't know all the details. With some nice green text on a screen, they can tell, hey, the thing did okay, and if they hit some red text and it didn't go okay, then they can reach out to you. But you're really giving them a way to act like you do when you interact with the system. They're doing the same things that you would do without having to distract you from the work that you're doing on a regular basis.
And then one of the other really key pieces for automation: when a lot of organizations have hesitancy around automating tasks, it's because they're like, you have to tell us exactly who did what, when they did it, and what happened.
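To make the "who did what, when, and what happened" requirement concrete, here is a minimal Python sketch of delegated, human-initiated automation that checks permissions and records every request. This is not Rundeck's API; `TaskRunner`, `AuditEntry`, and the `redeploy` task are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class AuditEntry:
    """One record of who requested what, when, and how it ended."""
    user: str
    task: str
    environment: str
    started_at: datetime
    outcome: str  # "succeeded", "failed", or "denied"


@dataclass
class TaskRunner:
    """Hypothetical runner: not a real Rundeck interface."""
    # (task name, environment) -> users allowed to run that task there
    permissions: Dict[Tuple[str, str], Set[str]]
    # task name -> callable prepared and tested by the experts
    tasks: Dict[str, Callable[[str], None]]
    audit_log: List[AuditEntry] = field(default_factory=list)

    def run(self, user: str, task: str, environment: str) -> AuditEntry:
        started = datetime.now(timezone.utc)
        allowed = user in self.permissions.get((task, environment), set())
        if not allowed:
            outcome = "denied"
        else:
            try:
                self.tasks[task](environment)
                outcome = "succeeded"
            except Exception:
                # Any error in the task (or an unknown task) is recorded.
                outcome = "failed"
        entry = AuditEntry(user, task, environment, started, outcome)
        # Every request is logged, including denied ones.
        self.audit_log.append(entry)
        return entry


def redeploy(environment: str) -> None:
    """Placeholder for the real redeploy steps (assumed, not shown)."""
    return None
```

In this sketch, granting Brian access to the dev redeploy would be a single entry such as `{("redeploy", "dev"): {"brian"}}`, and the `audit_log` list is what you would hand to whoever needs to see which tasks were run.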
And when you're building your own automation, you can cobble together some things and maybe have some log files, and you send status over to other components and things like that. But when you're providing it to the folks that might be your auditors or doing compliance reports, people at that level of abstraction who aren't necessarily going to dig through your crontabs or whatever but want to see what kinds of things were run, then you can provide them with an audit log. Looking at automation platforms and automation tools that provide you with that is another good way to help your team build confidence around the automation that you're providing and the things that you're writing, and over time, trust all of that stuff in a much more sophisticated way, so that they're more likely to want to automate more stuff in the future. So for the next set of tools and tasks that you have, they're more comfortable with producing automation for them.
So there's lots of things to keep in mind as you're building your own automation and as you're looking at tools to help you with your automation: thinking about your idempotency, thinking about your flexibility and testability and all those great things, and then finally thinking about how your automation impacts how your team interacts with other teams. How is your automation going to impact the perception of your team, maybe from other places? Like, oh, those folks will produce lots of tasks for us and give us automation so that we can do it ourselves and we don't have to wait for them, right? So you're seen as the folks who are super responsive, because you're asking people to do the work themselves via the automation.
So lots of things to think about. If your team isn't super familiar with automation and hasn't really gone on that jaunt yet, there are some things for them to maybe take a look at and think about. We have some resources. There's lots of stuff on the Rundeck website about the approaches to automation and things like that.
We have an entire, we call them ops guides. It's kind of a white paper. It's at autoremediation.pagerduty.com, for sort of a written format of the things discussed in this talk, and to give you some ideas on how to plot your journey into automation if you've just been kind of doing it catch as catch can, and to start thinking about it as more of a direct part of the job that you want to do as an SRE or as platform engineering or whatever kinds of tasks you might be doing. So hopefully this was helpful. We're happy to answer any questions; we'll be on the Discord. And like I said earlier, if you'd like to reach out to me at any time, I'm at lnxchk. I hope you enjoy the rest of the conference, and thanks for listening.

Conf42 Site Reliability Engineering 2021 - Online

September 30 2021

Mandi Walls

DevOps Advocate @ PagerDuty
