Conf42 DevSecOps 2022 - Online

Is Technical Debt the right metaphor for Continuous Update?

Abstract

The environmental pressure on software, mainly from security, has changed dramatically in a few years. Sticking to the Technical Debt category will crush IT, and the business. So let's introduce a new term, Technical Inflation, and change how we plan, budget, manage changes, and implement automation.

Summary

  • Giulio Vian: Today we must take into account the entire application environment. Critical vulnerabilities are no longer a rare event. Vulnerabilities are found in all languages and platforms. Traditional monthly patching is not enough anymore. We need to update the entire software chain.
  • Ward Cunningham introduced debt as an explanation for a technical problem. The technical debt metaphor matches three core elements: the will, the principal or capital, and the interest. I'm suggesting other terms borrowed from economics to explain the continuous update phenomenon.
  • In practice, continuous update is not continuous delivery nor continuous deployment. It is the necessity of frequently updating a system independently of source code changes. This applies anywhere immutable infrastructure is implemented, not just containers. The sheer cost of rebuilding can be huge.
  • Is technical debt the right metaphor for continuous update? Perfectly working software weakens over time and needs continuous maintenance. The speed of decay is noticeable, much more similar to inflation. Thanks once more for staying to the end; I hope you enjoyed the content.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Good evening. My name is Giulio Vian and today I will talk about the continuous stream of patches to update our systems, how it impacts the technological value streams, and its relation to what is known as technical debt.

Today we live through a change in information technology that impacts security practices, development, deployment, and operations; in other words, the entire DevSecOps pipeline. Critical vulnerabilities are no longer a rare event. Every day a new vulnerability is discovered in some component of our systems, so much so that people are waiting for the next major event to happen this month, December 2022. Vulnerabilities are not limited to the operating system, or to code written in C or in assembly: they are found in all languages and platforms, and the traditional monthly patching is not enough anymore.

What happens is not the developers' fault. Security is a hard problem and tooling is constantly catching up with new problems. If it is not the developers' fault, clearly this cannot be classified as technical debt, can it? I hope to convince you that we have entered a new era where we must take into account the entire application environment. In the past, with a slow pace of change, we were able to reckon this work of continuous update as technical debt. But today we must bring it into the light and put a different label on it. Labels such as "jungle", "savages", or "barbarian" may elicit some smiles, but they won't help a conversation with engineering, management, and the business.

I have organized the content in three parts. In the first section, I'll focus on the frequency of updates for our systems and applications. In the second part, I propose some definitions for continuous updating that may help in discussions with management and in planning. Finally, in the third section, I will hint at the engineering behind continuous updating, what current tools offer, and their limits.

Let's start our journey exploring the current landscape of updates and patches, with a simple but not simplistic description of any execution environment. Consider three layers: the operating system, the application stack, and the libraries. The operating system is your Linux distribution, the Docker base image, the Windows installation, whatever. The application stack, or runtime, is your Node.js interpreter, the Java runtime engine, the .NET runtime, the DLLs or shared objects linked to your C executable, and so on and so forth; clearly this could also be part of a Docker image. And finally the libraries, which may be statically or dynamically linked to your executable. Some languages, like Go, embed the runtime in the final executable, so the runtime version is tied to the compiler and linker version.

Each layer has its own independent sources of updates and patches. Depending on your architecture and deployment mechanism, each layer can be updated independently from the others, or they are bundled together in a single deployment unit. I spent some time researching these sources of patches and how frequently each layer receives updates, analyzing public repositories and sources. I found this data for operating systems: on average, we get patches for the operating systems every three or four weeks. Moving to the next layer, application runtimes, we can observe more spread between different sources of data and patches. We must not be deceived by major or minor releases: the frequency of patches is substantially higher.
It ranges from bi-weekly patches for the most commonly used client runtime, which is the Google Chrome browser, down to the Java SDK, which usually has three months between patches. Someone may wonder why I included MongoDB in this list. From a practical perspective, any modern application relies on many different components: databases, caches, queues, messaging systems, et cetera. Their updates affect the entire application's security and behavior, so I included Mongo as a representative of these other components, just to see how often it gets updated.

The next and third layer is all the libraries included in and linked to the application. A modern trend is the increased dependency on third-party libraries. I can remember when people worked in C++: they used a couple of libraries, maybe four, maybe five. Nowadays it's a lot more common to use open source libraries instead of buying libraries from a vendor, and as we can see from this chart from Sonatype, the trend keeps increasing year after year. Every major open source repository is registering a huge increase year over year, in this case from 2020 to 2021, in the absolute number of downloads: 50%, 70%, even 90%. This is related to the language platform, and on top of that there are some interesting considerations. Different runtimes have a different average number of libraries. For example, JavaScript uses a lot more separate libraries compared to a Java or .NET application; those use a lot fewer, on the order of tens, compared to a JavaScript application that uses hundreds of libraries.

The net result is a constant shift of the stack, all three layers, and the need to protect the entire software stack and the chain that produces it. So we need to update the software a lot more frequently.
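To give a feel for the combined effect, here is a small back-of-the-envelope calculation in Python. The patch cadences and library count are illustrative assumptions in line with the figures discussed above, not measured data; the point is only that, once you sum the layers, some part of the stack needs an update almost every day.

    # Rough, illustrative estimate of how often *some* layer of a single
    # application needs a patch. All cadences here are assumptions.

    os_patches_per_year = 52 / 3.5        # OS patched roughly every 3-4 weeks
    runtime_patches_per_year = 52 / 6     # runtime patched every ~6 weeks on average
    libraries = 150                       # e.g. a mid-sized JavaScript application
    library_patches_per_year = 2          # each library patched ~twice a year

    total_per_year = (os_patches_per_year
                      + runtime_patches_per_year
                      + libraries * library_patches_per_year)

    print(f"~{total_per_year:.0f} patch events per year")
    print(f"~{total_per_year / 365:.1f} patch events per day")
    # With these assumptions: roughly 324 events per year, almost one per day.

Whatever the exact numbers, the order of magnitude is what matters: the trigger for a rebuild is now measured in days, not quarters.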
Let's move to the second section and consider what technical debt is, and whether the phenomenon we just described can be defined as technical debt. We all know that technical debt is a metaphor: we use the word debt to better communicate with a non-technical audience. The listener may be an expert with a master's in economics, or have a simpler, common-sense idea of it. I put on this slide three definitions of debt. There are common elements across these definitions, and three of them are key to understanding what technical debt is compared to normal debt: the will, the principal or capital, and the interest. A person does not enter debt without agreeing to it, accepting it, or searching for it; it is always an act of will to borrow money from someone else. They borrow an exact amount of money, the principal, for a certain duration of time, more or less rigid. Part of the contract is the interest to pay as time passes: it could be zero interest, but usually it's more than zero. And debt can be renegotiated, for example by delaying some payments.

Moving on to technical debt, Ward Cunningham introduced debt as an explanation for a technical problem, and the technical debt metaphor matches the three core elements I described before. The engineering team knows that a design solution is suboptimal and which other solution would be preferable; this is the will element. You don't get technical debt by chance. The team has an estimate for the future solution and for the temporary solution, and they know that the temporary solution is substantially cheaper to implement; otherwise, choosing to take the shortcut would be irrational. This estimate to implement the solution is the capital element. Finally, what matches the interest? It is the delay in paying back the debt: the longer it takes you to go back and implement the desired solution instead of the temporary one, the harder it becomes to fix, maintain, and evolve the software. Like interest, the amount to pay back grows and grows over time.

Continuous updating is something different from refactoring. It is a simple adaptation to the environment: no new domain knowledge or user feedback requires re-architecting or refactoring. So which analogy should we use to explain it? I do not see the three elements that characterize debt, thus I'm suggesting other terms borrowed, pun intended, from economics.

My first proposal is the term depreciation. It captures the fact that an investment in software does not hold a constant value, but decreases over time. This analogy has a limit: depreciation is an accounting, fiscal technique, completely disconnected from external events, so it does not capture the external push, the vulnerabilities that force us to update software. Another term might be inflation. This metaphor has an advantage over depreciation because it catches the external element: inflation is something that happens outside your control. The rate at which the software's value changes is not constant. There can be periods of high inflation, which could resemble a technology leap or new attack techniques eroding the value of my solution, and there are calm periods with very low inflation, when no big external event forces us to update our software very frequently. A final way to describe the need to update constantly is as an increase in operational costs: the cost to keep the lights on, the cost to rebuild, track, store, and deploy.

In my opinion, all three of these ideas, operational costs, technical inflation, technical devaluation, are more effective at explaining the continuous update phenomenon than the concept of technical debt.
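To make the distinction concrete, here is a small illustrative calculation with made-up numbers. Straight-line depreciation erodes the value of a system by a fixed amount every year regardless of what happens outside; a technical-inflation view instead applies a rate that spikes in years with a major external event (a Log4Shell-class vulnerability, a technology leap) and stays low in calm years.

    # Illustrative only: compare straight-line depreciation with an
    # "inflation-style" erosion whose rate depends on external events.

    initial_value = 100.0

    # Fixed accounting schedule: lose 10 points per year, no matter what.
    depreciation_per_year = 10.0

    # Hypothetical yearly "technical inflation" rates: calm years versus
    # years with a major vulnerability or technology shift (years 3 and 5).
    inflation_rates = [0.02, 0.03, 0.25, 0.04, 0.30]

    dep_value = initial_value
    inf_value = initial_value
    for year, rate in enumerate(inflation_rates, start=1):
        dep_value -= depreciation_per_year
        inf_value *= (1 - rate)
        print(f"year {year}: depreciation -> {dep_value:5.1f}, inflation -> {inf_value:5.1f}")

    # Depreciation falls on a fixed schedule; the inflation curve drops sharply
    # exactly in the years when the environment turns hostile.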
With that, we reach the third part, where we analyze what continuous update means in practice. Continuous update is not continuous delivery nor continuous deployment. It is the necessity of frequently updating a system, independently of source code changes. We must update frequently to prevent attacks: if you average the changes we have seen at the operating system level, in the runtime, and in the libraries, we probably have to update every day. Independently of source code changes means that even when there are no functional changes, we still have to rebuild and redeploy to production all the patches in the many other layers which are not our application. "Source code changes" here excludes the portion of build and deploy scripts which identifies the versions of dependencies, so those changes are not considered source code in this definition.

In practice, we can be lucky and have a monolithic application running on a virtual machine. In this case, IT operations is responsible for updating the operating system, and maybe even the runtime, because it is preinstalled on the machine. So developers and the pipeline are used only when there is a change in the libraries: less frequent rebuilds and redeploys, because we have independence between layers. But if we consider a microservice architecture, each service is packaged in a Docker image, and the Docker image contains all the layers: application, libraries, runtime, and operating system. So we have to redeploy the whole image no matter which layer we need to patch. This is an interesting downside of containers, and it applies anywhere immutable infrastructure is implemented, not just containers.

The simplest way to implement continuous update is to rebuild and redeploy all your applications every single day, no matter what. This is easy if your portfolio is just a few applications and pipelines; it clearly does not scale very well for an enterprise portfolio, where the sheer cost of rebuilding can be huge. To optimize, we need to pick up our software bill of materials and our configuration items and bring this information together, so we can easily locate which components require a rebuild and redeploy. As soon as we have the information on what needs to be patched, be it a piece of the operating system, a library, or whatever, the next logical step to automate the process is an automatic update of all the references to the patched component. These could be build scripts like a pom.xml, a .csproj, or a package.json for JavaScript; they could be Dockerfiles, so the references to base images; they could be Ansible, Puppet, or Chef scripts that patch operating systems; or infrastructure as code in the form of Terraform, CloudFormation, or ARM templates. Finally, we need to kick off the pipeline: after we patch these scripts, the pipeline will rebuild the component or the image and redeploy it. Note that in this scenario the normal approval process may fall short. You can have a lot of software to rebuild and redeploy, think of Log4j, and you cannot approve 100 different pipelines one by one. You need something different, some expedited approval process that encompasses all the impacted components.
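As a sketch of what that automation could look like, the fragment below assumes one CycloneDX SBOM file per service in an sboms/ directory and a version pinned in each service's requirements.txt; the file layout, the trigger_pipeline stub, and the package names are all hypothetical, and a real setup would more likely lean on tools such as Dependabot or Renovate plus your CI system's API.

    # Hypothetical sketch: find which services contain a vulnerable component
    # (using per-service CycloneDX SBOMs), bump the pinned version in their
    # manifest, and kick off their pipelines. Paths and helpers are made up.
    import json
    import pathlib
    import re

    VULNERABLE = {"name": "superlib", "bad_version": "1.4.1", "fixed_version": "1.4.2"}

    def affected_services(sbom_dir: str = "sboms") -> list[pathlib.Path]:
        """Return the SBOM files whose component list includes the bad version."""
        hits = []
        for sbom_file in pathlib.Path(sbom_dir).glob("*.json"):
            sbom = json.loads(sbom_file.read_text())
            for component in sbom.get("components", []):
                if (component.get("name") == VULNERABLE["name"]
                        and component.get("version") == VULNERABLE["bad_version"]):
                    hits.append(sbom_file)
                    break
        return hits

    def bump_reference(manifest: pathlib.Path) -> None:
        """Rewrite the pinned version in a requirements-style manifest."""
        pattern = rf"{VULNERABLE['name']}==\S+"
        replacement = f"{VULNERABLE['name']}=={VULNERABLE['fixed_version']}"
        manifest.write_text(re.sub(pattern, replacement, manifest.read_text()))

    def trigger_pipeline(service: str) -> None:
        # Placeholder: in reality this would call your CI system's API.
        print(f"would trigger rebuild and redeploy of {service}")

    for sbom_file in affected_services():
        service = sbom_file.stem                 # e.g. sboms/billing.json -> "billing"
        bump_reference(pathlib.Path("services") / service / "requirements.txt")
        trigger_pipeline(service)

Tools like Dependabot or Renovate already cover much of the reference-bumping part; the harder pieces are the expedited approval and the blast-radius problems discussed next.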
Consider what is required to automate the process in full; this is the flip side of the coin. We need a strong test harness around the application to guarantee that a patch will not break any core functionality. If you have a big portfolio of components based on the same technology, for example Java, a single library upgrade can start a build storm, with hundreds of builds queuing at the same time and running for hours or more. This will kill not just your build infrastructure but also your production infrastructure, because you are deploying everything simultaneously: a huge spike in disk, network, and CPU usage, not counting the downtime that users might suffer. Another side effect might be increased storage usage for all the intermediate artifacts, especially if you are using the immutable infrastructure pattern, because you are producing new images every day for a lot of applications. The current image technology is not optimized for frequent, small changes to images, but this will probably improve over time. Finally, the increased adoption of safe languages like Rust may mean that in a few years, as we adopt these languages, we won't see so many security patches, hopefully.

Back to the original question that gives the title to this talk: is technical debt the right metaphor for continuous update? I think the answer is a clear no, at least in my opinion. Technical debt is a great metaphor for some kinds of problems we face while delivering software: we borrow time when we judge it better to take a shortcut than to go the long way. Nowadays we face a new phenomenon: perfectly working software weakens over time and needs continuous maintenance. The speed of decay is noticeable, much more similar to inflation, where our money is worth less and less as time goes by.

My name is Giulio Vian. I'm Italian, and I work in Ireland as a principal engineer for Unum, a Fortune 500 insurance company. I post my thoughts on DevOps and some random topics on my blog, and I repost on LinkedIn and some other media. To prepare this talk I researched a number of sources, which are listed in the next three slides. I won't apologize for the font, because it's not meant to be read now: you can easily download the deck and follow the links to any article or paper from the conference site or from my own site. I want to thank again the sponsors that made this event possible, and I want to thank the organizers. I hope you have lots of questions so we can discuss the ideas I have thrown at you today. Thanks once more for staying to the end; I hope you enjoyed the content. See you soon.

Giulio Vian

Principal DevOps Engineer @ Unum



