How to deal with technical debt: lessons learned from 300+ engineering teams

Abstract

Technical debt is one of the primary causes of productivity loss for Engineering teams. I have spoken to 300 top engineering teams, such as teams at Airbnb, Intercom, and Snyk, to learn how they manage technical debt. Not only do these teams ship quality software fast but also keep their engineers happy.

In my talk, I will share my learnings: tactics, processes, and tools to use when dealing with small, medium, and large pieces of tech debt. You can apply this practical approach regardless of your company’s stage, size, business priorities, and culture.

Talk outline:

What is tech debt?
Why is tech debt a thing?
What we learned from Martin Fowler’s Technical Debt Quadrant.
Tech debt myths to debunk.
Why should companies bother managing technical debt properly?
How to create your tech debt management strategy.
The one cultural characteristic for a healthy codebase.
How to create & think about your tech debt budget
How to deal with ‘small’ debt.
How to deal with ‘medium-sized’ debt.
How to deal with ‘large’ debt.
High-level takeaways

Summary

Stepsize is a SaaS product to help engineering teams manage technical debt. Last year alone, I interviewed over 300 top software development people about technical debt. In my talk on how to manage technical debt, I'll share with you all the lessons I learned from these people.
Tech debt is code that you've decided is a liability. It's all that extra unnecessary work that you need to do to get your software out the door. This comic by MonkeyUser does a great job at illustrating what tech debt is.
Software exists in a world of uncertainty. The code that we write to solve a problem is based on our current understanding of that problem. The best teams in the business do know how to handle such high uncertainty. They continuously refactor code that has accumulated too much cruft.
Technical debt is not an inherently evil and bad thing, just like financial debt. It's a tool that we can use to gain leverage and test ideas faster. If we take it on without being prudent, deliberate and managing it carefully, it will screw us over.
There's this myth that tech debt is bad and no one should ever take any on. Wrong. Too much tech debt means you'll get too many bugs, loads of performance issues, and too much downtime. Companies who have a strategy for technical debt will ship 50% faster.
You should allocate a fixed proportion of your sprint capacity to pay back technical debt. We like to think about these tech debt budgets like SRE teams think about their site reliability goals. You want to hover around the maximum amount of tech debt you'll tolerate.
Tech debt can hinder your capacity to deliver value to the customer in many ways. You need to have conversations about tech debt in your usual sprint ceremonies. We built an editor-first issue tracker that engineers can use to keep track of tech debt right in the code where it lives.
Tech debt is inevitable due to entropy in the code base. The best way to manage technical debt is to create an engineering culture of ownership. Use Stepsize to track tech debt directly from the editor. You can start using our product for free on stepsize.com.

Transcript

This transcript was autogenerated. To make changes, submit a PR.

Hello everyone, and thanks for having me. My name is Alex. I'm the co-founder and CEO of Stepsize, where we built a SaaS product to help engineering teams manage technical debt. Last year alone, I interviewed over 300 top software development people about technical debt, and I've been working on products to help engineering teams ship better software faster for over five years now and raised millions of pounds to finance that. In my talk on how to manage technical debt, I'll share with you all the lessons I learned from these people so that you can apply them too. First of all, what is tech debt, and why does it matter? I assume all of you know the metaphor based on financial debt. There are many definitions for tech debt, most of which work very well, and this is my attempt at simplifying them. I say that tech debt is code that you've decided is a liability. It's all that extra unnecessary work that you need to do to get your software out the door. And I love this comic by MonkeyUser, which does a great job at illustrating what tech debt is. The team is digging that tunnel so fast, but forgot to get rid of all the rubble they dug out. A bug's causing water to leak into their tunnel. They're stuck in there because of technical debt. They won't be able to do anything about the bug, and they might even die in there. That's technical debt and what it can do to a business. But let's talk about why tech debt is even a thing. Why can we never seem to avoid it no matter what we try? Well, it's because software exists in a world of uncertainty. This is a quote by Martin Fowler, who in my opinion wrote the best blog posts about technical debt. I encourage you to go read them. Essentially, tech debt exists because the code that we write to solve a problem is based on our current understanding of that problem. Now, it sounds obvious, but let's unpack this a little bit.
Even if the perfect engineers found the perfect solution to a problem and coded it perfectly, their understanding of the problem will evolve, and quickly. But left unattended, their code will not evolve. And this means that our code will soon no longer be appropriate. It happens all the time and much faster than you might think, especially in a high growth environment. Now, something else to consider is that it's often the case that it can take a year of programming on a project before you understand what the best design approach should have been. That's another quote by Martin Fowler from his piece on the technical debt quadrant. However, I've learnt that the best teams in the business do know how to handle such high uncertainty. Despite all of this, they use the right tools and processes to continuously refactor code that has accumulated too much cruft, and they won't accept messy code as technical debt. I want to unpack this last bit a little bit too, so that we're all on the same page. You may have seen this quadrant before, and when I say that the best engineers in the business won't accept messy code as technical debt, I mean that they aim for this corner of the quadrant. Technical debt is not an inherently evil and bad thing, just like financial debt. It's a tool that we can use to gain leverage and test ideas faster. Just like financial debt, if we take it on without being prudent, deliberate and managing it carefully, it will screw us over and we will go technically bankrupt. We should never get sloppy and accept any kind of incompetence as acceptable technical debt. We should be aware of the current best practices and use them. We should write clean, readable code, and we should know that code left unattended will turn into technical debt. The best engineering teams in the business have ways to handle the uncertainty inherent to building software and end up in the top right quadrant. Now, let's debunk a few myths to summarize the key ideas here.
There's this myth that tech debt is bad and no one should ever take any on. Wrong. If you're sending people to Mars, sure. But if you're building software where the cost of failure isn't high, you can use tech debt as a way to gain extra leverage, just like financial debt. And just as we discussed, taking on tech debt prudently and deliberately is fine. And if you have zero tech debt, you should truly ask yourselves if you're going fast enough. Also, it's not realistic to have zero tech debt because of entropy in your code base. That's how Ron Pragides, who's a VP of Engineering at Carta, told me that he looks at technical debt. He looks at it as entropy in the code base. It never ends and it's a constant struggle. So adjust your expectations accordingly. The next myth is about how tech debt is only engineering's problem. Nope. We'll talk about how tech debt impacts the whole company and talk some more about how company culture, particularly how product managers and engineers work together, impacts a company's approach to technical debt. And finally, a lot of people wrongly believe that managing tech debt properly will slow them down. That's super duper wrong. If you're managing tech debt successfully, you'll see your number of bugs go down and velocity go up, and I'll show you how it's done. At this stage, you might be asking yourself, but why even bother managing tech debt? Now this is a long quote, but it basically says that companies who have a strategy for technical debt will ship 50% faster. And who here doesn't want to ship 50% faster? That's the stuff of dreams for people building software. And if you think Gartner don't know what they're talking about, consider this data point from Stripe, who's arguably the best software company out there. They found that engineers spend 42% of their time fixing the past rather than building the future.
And we at Stepsize decided to carry out our own research and asked several hundred engineers in our community a simple question: how much faster do you estimate your company would ship if you had tech debt under control? The answers will surprise you. Two thirds of respondents said that they'd ship twice as fast if their companies had tech debt under control. 15% estimated they'd ship three times as fast. Now, how the hell can they say that? Let's unpack this a little bit. Technical debt slows the entire engineering team down within days or weeks and has repercussions across the entire business. Check this out: in software companies, too much tech debt means you'll get too many bugs, loads of performance issues, and too much downtime. They'll create more work for QA, more work for the SRE team, and result in broken SLAs. All that stuff tallies up to more customer complaints, which means more work for support, customer success, and account management. And it all adds up to unhappy customers. I've heard some version of this many times: we'd be shipping twice as fast today if we'd handled tech debt carefully in the past. And I'm sure you've all seen this happen, too. A feature that you thought would be simple and maybe take a sprint ends up taking a month or more. Now imagine this at a global scale. So now hopefully you see why it's imperative to manage tech debt carefully, and let's talk about how to do just that. First, it's crucial to understand that technical debt isn't just technical, it's about people. It's deeply influenced by your company's culture. And I want to stress that point. So that slide is deliberately blank. Enjoy it for a few seconds. But to give you an example, if your engineers are never recognized for paying down tech debt and it doesn't advance their career, do you think they're likely to volunteer to address tech debt?
Or if engineers get reprimanded at the slightest hiccup in the software by people who don't understand that tech debt can be used for extra leverage, do you think they'll take on any tech debt? Clearly no, they won't. And as we all know, company culture is a huge topic. But I've gone deep into it and, in my opinion, I found the one cultural characteristic that you should focus on if you want a healthy code base, and that's ownership. And I'm not just talking about a hazy concept here. Microsoft did some great research that we can use to quantify this and back up our argument. Turns out that if you analyze your git commit activity to see which percentage of modifications to each file in your code base were made by the main author of the code, you'll see that the files with the most bugs are the files where the main contributor made less than 60% of the edits. In other words, code ownership is a leading indicator of codebase health, and you can use it to predict where things will break and reverse the trend before they do. Now, I want to add a bit of nuance here so that we can draw the right conclusions. This research does not suggest that each file in your code base should be owned by one and only one person, and that they're the only person who can work on it. That would put your bus factor in the risky zone. Ownership is a spectrum. It starts with orphan code all the way on the left, which doesn't have a clear contributor, and therefore no one is implicitly responsible for its maintenance. This is a bad spot to be in. It goes all the way to absolute ownership, where only one person can modify the code in question. And for each file in your code base, you want to be in the collaborative ownership zone, where the main contributor made more than 50% of all edits, but not all of them. And generally speaking, the code is owned by a specific team, not an individual.
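The ownership measure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the edit log is hypothetical sample data (in practice it would be extracted from your git history), the function names are made up for this example, and the 60% cut-off is simply the figure quoted in the talk, exposed as a parameter.

```python
from collections import Counter, defaultdict


def ownership_by_file(edits):
    """Map each file to (main contributor, their share of all edits).

    `edits` is an iterable of (path, author) pairs, one per edit.
    """
    per_file = defaultdict(Counter)
    for path, author in edits:
        per_file[path][author] += 1

    result = {}
    for path, counts in per_file.items():
        main_author, main_edits = counts.most_common(1)[0]
        result[path] = (main_author, main_edits / sum(counts.values()))
    return result


def low_ownership_files(edits, threshold=0.6):
    """List files whose main contributor made less than `threshold` of the edits."""
    return sorted(
        path
        for path, (_, share) in ownership_by_file(edits).items()
        if share < threshold
    )


# Hypothetical edit log: one (file, author) pair per commit touching the file.
EDITS = [
    ("billing.js", "ana"), ("billing.js", "ana"),
    ("billing.js", "ana"), ("billing.js", "ben"),
    ("search.js", "ana"), ("search.js", "ben"),
    ("search.js", "cho"), ("search.js", "dee"),
]

print(low_ownership_files(EDITS))  # files to look at first
```

In this sample, `billing.js` has a main contributor with three of four edits, while `search.js` has four contributors with one edit each, so only `search.js` falls below the threshold.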
We won't get into it today because we don't have enough time, but think hard about how you can foster a culture of collaborative ownership in your development team. It's the best way to maintain a healthy code base. So you now have a rough idea of the cultural drivers behind tech debt. Now let's talk about technical debt budgets. You might have heard about this before. It's the idea that you should allocate a fixed proportion of your sprint capacity to pay back technical debt, say 10% to 20% of your time. But how the hell do you come up with that number? Well, it should change every sprint, and it turns out that it's not important to explicitly pick out the right number, but I'll tell you how you can think about this. At Stepsize, we like to think about these tech debt budgets like SRE teams think about their site reliability goals. So site reliability is responsible for keeping software products up and running. But interestingly, companies like Google don't aim for 100% uptime. And that's because 99.99% uptime is enough for their products to appear supremely reliable to real world humans. And that last 0.01% is exponentially more difficult to reach and simply isn't worth fighting for. So consequently, if this allows Google 52 minutes of downtime per year, they'll want to get as close to that number as possible. Anything less than 52 minutes of downtime is a missed opportunity for taking extra risks and delivering more ambitious features to their customers faster. So think of your tech debt budget like your site reliability budget. Provided it's prudent technical debt you're taking on deliberately and you remain below the maximum amount of tech debt you can tolerate before affecting your customers and business, you should feel free to take more risks even if you increase the amount of tech debt, because that's how you'll beat your competitors. Now this pseudograph summarizes the idea.
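As a quick check of the downtime figure quoted above, the arithmetic behind a reliability budget is short enough to write out. The function names below are hypothetical, and the sketch ignores leap years; it only shows how a 99.99% target translates into roughly 52 minutes per year, and how the unused part of a budget can be read as room to take more risk.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_target):
    """Downtime per year that an availability target (e.g. 0.9999) still permits."""
    return (1 - availability_target) * MINUTES_PER_YEAR


def remaining_budget(allowed, used):
    """Positive: room to take more risk. Negative: time to pay some back."""
    return allowed - used


budget = allowed_downtime_minutes(0.9999)
print(round(budget, 2))  # 52.56
```

The same two-line shape applies to a tech debt budget if you pick a unit for it (engineering days, open debt issues, or whatever your team tracks); the unit is your choice and is not prescribed by the talk.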
You want to hover around the maximum amount of tech debt you'll tolerate. And your tech debt budget can be in the red, meaning you need to pay some back, or it can be in the green, meaning you can afford to take on some more. A simple way to define your tech debt budget is to find the intersection of the things you know you'll work on, using your product roadmap, and the parts of your code base that have tech debt. You pay back the stuff that's inside that intersection, not what's outside. You scope out the work and you'll have your tech debt budget for your sprint, quarter, or year if you plan that far into the future. And the key idea here is that you don't need to address all your tech debt right now in one go. You need to address the debt that's in the way of your key goals for the quarter or whatever period you've selected. Now let's get practical and talk about how you can incorporate tech debt management into your day to day agile development process. And the first question that you should ask of any tech debt is: is it small, medium or large? I'll define each of them. Small debt is the tech debt that can be addressed right then and there, when the engineer spots it in the code and it's understood that it is part of the scope of the ticket they're working on. It could be refactoring a function, or a couple of them, or renaming some variables. Whatever it is, it can be done right now. The best way to think about it is to follow the Boy Scout rule: you always leave the code better than you found it. Small jobs like these don't require any kind of planning, and each engineer should feel empowered to fix this kind of debt without anyone's approval. You see this ownership coming up again as a key factor. Next we have the medium pieces of debt. These are the pieces of debt that can be addressed within a sprint. Usually they should go through the same sprint planning process as any feature work and be considered just as rigorously.
That's where most engineering and product development teams fail. I spoke about this with James Rosen, who's an engineering manager at Everlane, who told me to consider how much time PMs spend curating the set of features to work on, and then to compare this to the amount of time and effort engineers dedicate to making the business case for tech debt. And then he asked me, is it that surprising that close to zero engineering time gets allocated to tech debt? Now you, as good software development people, should do that by asking the right questions and making room for conversations about technical debt in your agile ceremonies and when planning. Businesses rightly prioritize work that delivers value to customers. And you might think that at first glance, getting rid of tech debt won't do that. But as we discussed earlier, tech debt does hinder your capacity to deliver value to the customer in many ways. So your goal is to identify key tech debt in the way of key goals: the debt that costs the most engineering hours and productivity losses, or that causes the most bugs or other issues that impact your customers. To do that, you need to document your tech debt and to be clear about the impact it's having on the business. Only then will you be able to make a proper business case for any given piece of debt and to prioritize things properly. And this is where tooling has failed us so far. As Jake, who's a lead engineer at Unqork, told me, Jira is a great place to manage projects, but a terrible place to track and monitor tech debt. Just take a look at your tech debt epic or backlog, where the 7000 tickets that your engineering team diligently logged went to die until everyone gave up on the idea. And code quality tools are helpful at surfacing one facet of tech debt, but they won't catch most other types of technical debt. Now, fortunately, this is the problem that we're solving at Stepsize.
We built an editor-first issue tracker that engineers can use to keep track of tech debt right in the code where it lives. So you'll want to use these codebase issues that you've tracked to identify the subset of tech debt that's in the way of your sprint. During sprint planning, for each feature, you'll ask: is there an opportunity to address that debt as part of this feature work? Or which tech debt could we address to make delivery of this feature smoother and faster? And then you actually scope out the work. You import your Stepsize issues into Jira, and you add the relevant issues to your sprint. You need to have conversations about tech debt in your usual sprint ceremonies if you want to have a chance at creating the right culture and managing it properly. Finally, large debt is the tech debt that cannot be addressed right then and there, or even in one sprint. The best companies I've interviewed have quarterly technical planning sessions in which all engineering and product leadership participate. Engineering managers are tasked with bringing up large pieces of tech debt that their team leads have reported to them and with making the business case for the ones they judge to be the most important. To make the business case, you need to explicitly state which business priorities will be put at risk by the tech debt in question. So these could be key items on your feature roadmap, security, KPIs like velocity, and much more: anything that matters to your business. If the debt is likely to get worse if left unattended, you need to explain why, for example because many engineers will ship lots of code in the parts of the code base where the debt is. And you need to add a guesstimate of how much this debt will cost the business, for example in time if the main risk is productivity, or bugs if it's quality, or employee retention if people just loathe their lives when they're working on this part of the code base, et cetera, et cetera.
And once leadership has approved each large piece of tech debt, they can be scheduled onto the roadmap, just like any feature work would. Now, let's quickly go over the key takeaways from this session. If you leave with anything, I'd like you to remember that tech debt is inevitable. It's not that anyone's doing a bad job. It's due to entropy in the code base, which might as well be a law of the universe. But importantly, you can use tech debt to gain extra leverage and beat your competitors. However, if you manage tech debt carefully, sorry, if you don't manage it carefully, it will come back to bite you. And the best way to manage technical debt is to create an engineering culture of ownership, include tech debt in your agile processes, and to use Stepsize. So I'll take a few seconds to show you what Stepsize is about. Companies like Snyk use our editor-first issue tracker to maintain a healthy code base while shipping at pace. So with Stepsize they can track tech debt directly from the editor, and the issues they create are always linked to the code that they relate to. They then use our web app to get an overview of the tech debt in the code base, prioritize improvements based on what matters to the business at the moment, and once they've done that, they can use our Jira integration to add them to their sprint and finally fix the debt that matters. And I'll leave you with these wise words, paraphrasing Arlo Belshee from Dig Deep Roots, who said that 10x engineers may not exist, but 100x codebases certainly do. That's what you're after by managing tech debt carefully. You can start using our product for free on stepsize.com, and I encourage you to follow us on Twitter at Alexandre Omeyer and at Stepsize HQ, where we share the latest and best content regarding tech debt, software maintenance and, we like to think, the best memes. Thanks for listening.
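The budgeting rule from the talk (only pay back the debt that sits in code your roadmap will touch) amounts to a filter over tracked debt items. The sketch below is a hypothetical illustration, not how Stepsize works: the `DebtItem` fields, the sample paths, and the day estimates are all invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DebtItem:
    title: str
    path: str             # where the debt lives in the code base
    estimate_days: float  # rough cost of paying it back


def debt_in_the_way(debt_items, roadmap_paths):
    """Keep only debt located in code the roadmap is expected to touch."""
    return [
        item for item in debt_items
        if any(item.path.startswith(prefix) for prefix in roadmap_paths)
    ]


def debt_budget_days(debt_items, roadmap_paths):
    """Total estimated effort for the debt that overlaps the roadmap."""
    return sum(item.estimate_days for item in debt_in_the_way(debt_items, roadmap_paths))


# Hypothetical tracked debt and the areas this quarter's roadmap will change.
DEBT = [
    DebtItem("Untyped checkout reducer", "src/checkout/reducer.js", 2.0),
    DebtItem("Legacy date helpers", "src/utils/dates.js", 1.5),
    DebtItem("Duplicated cart validation", "src/checkout/validate.js", 3.0),
]
ROADMAP_PATHS = ["src/checkout/"]

print([item.title for item in debt_in_the_way(DEBT, ROADMAP_PATHS)])
print(debt_budget_days(DEBT, ROADMAP_PATHS))
```

With this sample data, the two checkout items are in scope and the date helpers are left for later, which mirrors the point that you don't need to address all of your tech debt in one go.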

Conf42 JavaScript 2021 - Online

October 28 2021

Alexandre Omeyer

CEO @ Stepsize
