Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hi everyone, and welcome to this session on compelling
code reuse in the enterprise. This may seem like a funny
introductory slide to get started with, this person programming in
the background who is clearly doing it anonymously,
or in the dark, but in reality, sometimes this can be us,
right? Especially from a code reuse perspective, you might have experienced
in your organization the reality that it simply wasn't valued to
think about code reuse, because you work in a distributed microservice world and maybe that just
didn't fit. Maybe you saw the value, and the only way you got
it done was to be that person who added the extra value at
nighttime. And I've definitely been there myself, where I've
pursued this with a passion and seen how code reuse can
be very compelling, be very valuable. But honestly, it can be
very difficult to know when to reuse and how to reuse effectively
without causing a bottleneck or a problem. So that's what this talk is really all
about, focusing on how do we identify reuse properly, and then looking
at a key example through service templates and service chassis that can provide a
really helpful way to reuse code in an effective way, especially in a
distributed microservice world. There's a quote I love from Eric Raymond,
which says: good programmers know what to write, great ones know what to
rewrite and reuse. Because understanding when to rewrite something,
and, for our discussion today, when to reuse something, is pretty difficult.
Just as we get started, a little bit of information about myself. My name is
Travis and I work as a principal software engineer for SPS Commerce.
You probably haven't heard of SPS Commerce. We're a business-to-business organization that
focuses on exchanging invoices and purchase orders between retailers
and suppliers. We're the world's largest
retail network and have hundreds of thousands of retailers and suppliers
in our network. And my focus specifically there is on developer experience.
And you might initially ask yourself, what exactly do you mean by developer experience?
It's a pretty loaded or overloaded term these days,
and specifically when I talk about developer experience, I try
to define it and put a boundary around it. And it's a relatively new term.
It's developed over the past five or six years. You might be thinking
it's kind of like developer advocacy or developer relations,
and it definitely is tangentially related to that, but it's something that's slightly
different than that. And this is a great definition that I've seen from the
appslab.com, which is developer experience is the activity
of sourcing, improving and optimizing how developers
get their work done. And so when we think about how developers get their
work done, in a lot of cases we're thinking about the Persona of the developer
and how they're moving information to production using the development principles of
your organization. So that's where we tie together the experience along with the development
principles which are very unique to your organization, really form
this Venn diagram, this concept of developer experience.
The reality is that engineers in different organizations,
depending on how old your organization is, have really fought to
understand what tools they should use. Is there standardized tooling that they should
use for CI/CD, for observability? In some cases, yes. In other
cases, no. There's just a plethora of different tools they could use. And so your
engineers and your leads are fending for themselves and picking and choosing the right tools
within this jungle of tooling that may exist. And in
many cases these tools have come online out of necessity
and need, but they haven't really been thought through on the complete experience on end
to end moving something to production and how they integrate with each other.
And that's why we think of this quote as fairly helpful for understanding the
problem that developer experience solves, which is developers work in rainforests,
not planned gardens. The idea that you never had an opportunity
from a greenfield perspective to plan exactly what your
delivery process would be like and the tools involved in it. But rather
it grew out of many different requirements and needs over
the years. And so developer experience is really starting to take a look at these
particular areas and starting to hone in and focus on some of them.
At SPS commerce, we think of developer experience in terms of capabilities.
Capabilities help us put a bit of a boundary around exactly
what we're discussing and exactly what is involved in developer experience. So these
are identified horizontal fast tracks to be curated for maximum
productivity within the organization. And so if we draw the organization
like this, in terms of development, operations, cost, security, we begin
to draw these horizontal fast tracks. In developer experience,
one might be a very common one which is building and deploying a
new feature to production. How do I do that? Am I using feature flagging?
What am I using for CI/CD? How does all that integrate together? How do
I observe the metrics at the end of it? It might be building and deploying a
new application from scratch and having a developer portal to help you through that experience.
It might be API design which can be a large part of developer experience in
how we consume and how we produce APIs.
Or as we move towards our discussion today, it might be
code reuse: how do you think about code reuse, how do you think about InnerSource
within the organization? How do you know when it's appropriate and effective
to think about reuse? And of course, developer experience and code
reuse go hand in hand so closely as we think about it.
And code reuse is defined as the act of recycling or
repurposing code parts to improve existing or to
create new software. We write it once, we use it multiple times, right?
Pretty straightforward. And as we think about the stack
and the technology that we've all built our professions and our careers on,
it slowly grows here: your low-level machine,
container runtimes, the OSI model for networking,
the application runtime that you're using. The web application framework
you're using is a library, a piece of shared code, much like you might develop,
but yours would be more focused internally on your domain.
And so when we think about developer experience, we're actually thinking about the entire usage
of all of these stacks, everything from your local IDE, what you use and
how you use it, all the way up to how you're building software
on top of that, and what are the specific pieces of reusable code
within your own custom domain. So today we're focused
here on this particular horizontal, which is your custom domain,
your logic, your information that is specific to how you do things
at your organization, your best practices, your tech principles.
And the title of this particular talk,
Compelling Code Reuse, intentionally uses the
word compelling, and it has two varying definitions
associated with it that are both very relevant for what we're discussing. And the
first is to force or push towards a course
of action. And my goal today is not to force or push you, nor would
it be for you to go back and force your organization to do code reuse,
but rather thinking more appropriately on the second definition, which is having a
powerful and irresistible effect, requiring acute admiration,
attention, or respect. In other words, if we're doing code
reuse correctly, which I hope I'll have convinced you of by the end of this presentation,
that there are appropriate ways and compelling ways to produce and use
reusable code, then it is an irresistible thing. That mentality
of whether we should do it or not, especially within microservices, begins to shift
and change. So let's go back to basics
a little bit and think about theoretical code reuse. And here we're just simply talking
about the general day-to-day practice of how you incrementally share code and
move it across your particular project and the software that you write.
The first is a simple module. A module could be a class,
it could be a project, it could be any level of granularity that you want.
And of course inside that module we have a function, function a, which
we reuse inside module b. And that's pretty straightforward. We can already reference that and
make use of it. But if you're working through an n-tier architecture for whatever
application you have, that has layers for different reasons, architectural purity
and that sort of thing, then module c may want to reference function a.
And it wouldn't be appropriate for it to reference it directly, because it
should have no knowledge of module a, which has the function in it.
So you do what you would normally do: you abstract, you pull out,
you create different reference components for it, and you create this
module d and put function a in it.
And of course that creates a standard project that you have and you're happy,
you're reusing that. And that was pretty easy to do. And there's not a lot
of concerns on what it would take to refactor that as you launch a
new project. And that project might exist in the same repository
or same proximity or location. Of course, if you want
to use function a there, you would have to abstract and pull out function a
into a new project and then have both these projects reference it. It wouldn't be
appropriate for project b to have full reference or understanding
of project a. And as I said, these are inside a
similar boundary within a repository, and that's pretty straightforward to
do. That isn't a big concern. That allows me to refactor and move quickly.
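To sketch what that abstraction looks like in practice, here's a minimal example in Python, where the module and function names are hypothetical stand-ins for "module d" and "function a" from the slides:

```python
# shared_utils.py plays the role of "module d": function a lives here,
# so neither consuming project needs any knowledge of the other.

def function_a(raw: str) -> str:
    """A stand-in for 'function a': normalize an identifier."""
    return raw.strip().upper()

# In project a you would write:  from shared_utils import function_a
# In project b you would write:  from shared_utils import function_a
# Both projects reference the shared module, never each other.

print(function_a("  ord-123 "))  # ORD-123
```

Inside one repository this refactor really is that cheap, which is exactly the talk's point: the cost only appears once the shared module has to cross a repository boundary.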
But a repository, like other natural
barriers, is where we begin to find code reuse difficult. When that's
in another repository now it becomes a really interesting question,
a much different question, which is, how do I effectively reuse
module a, which has function a in it, knowing that it's
not accessible to me? How should it be distributed? How should we
look at it? So that's why when we think about code reuse
and we think about many things, they get much harder as they become a distributed
problem. And so today we're really talking about the pattern of distributed
code reuse, and it's always harder at scale, right?
We always see that it's always harder to work with
microservices as they grow in terms of the volume of them than it is
having a monolith. But of course there's a lot of advantages to that, and in
the same way, there are a lot of advantages here. We just need to understand
them and the characteristics of them correctly. So if
we were to take an example to try and understand the complexities we might encounter
now from a distributed code reuse pattern, let's look at a couple
different microservices. We might have service a,
service b and service c, and in these microservices we'll
want to reuse our function a. And of course, the easiest way
to do that is to distribute it in a package. It might be a Maven package
deployed to Maven Central, or a package you've pushed and deployed to
your internal JFrog instance. And as you begin to roll that out,
there's a creator of function a that puts it together, the team that owns it,
and there are different teams consuming it, and they may or may not have knowledge of or
understand who that team even is. But we begin to see a lot of drawbacks
here that people start to encounter, and that's why this is maybe discouraged
in a lot of places. The first is large dependency
chains,
especially in Java applications, right? That's no surprise. We can
have a lot of dependencies here. And as those dependencies begin to
grow inside function a, service a now has no
choice but to bring those dependencies along for the ride. The initial
dependencies that they brought in with function a may have grown over
time as they upgrade versions. And so we see what we call the dynamic
equilibrium, or this shift over time of technical debt that can be
built up as a result of that. And in some cases, the dependency chain can be
pretty large and can be pretty dangerous if you're not familiar
with who created the package. We also see enhancement friction
become a problem where one of the services decides it needs a very small update
to function a, to introduce more flexibility or a new feature or capability of
it. And of course the team that owns function a may
or may not be aligned to that change. They may also decide that they
don't have the time to look at your pull request or look at that change.
And so it becomes very difficult now for a small change to happen, which isn't
something that we value in microservices, where we can make changes and deploy very quickly
as part of, in some cases, your standard two-pizza team.
Of course you have administration and cost and ownership involved here.
Who actually owns this as it grows, who's going to continue to maintain it?
It's all fun and happy in the first six months, but after that
it's maintenance work. We have to continue to update it and patch it.
And the team who originally pushed it out, are they willing to take that
on? Do we even know, or have SLAs or expectations for them? And that leads
us into roadmap politics and opinions. And whether
the team that owns function a really thinks it should even proceed in that particular
direction or not, that a particular service wants to take it,
it can deviate and begin to change. Do we understand the roadmap for it?
And of course we think about exponential flaws and vulnerabilities. Well, we have the advantage
here of having a single reusable piece of code. If we're not maintaining
it, we're not supporting it and staying up to date with it, well, now we're
accumulating vulnerabilities and flaws much faster than if
the team had to be aware of what that logic was in that code
themselves. And of course, over time, this can be compounded
by multiple active releases, meaning that the team has quickly pushed out
version two, but there are lots of services on version one. If there are fundamental
differences or changes between those major versions, can a team
even easily upgrade to version two? And so you end up maintaining
one, two, three major active versions that all
have to be handled, and that just compounds the time that needs to be
spent. And of course that gets even worse with competition.
So as a service decides that there's too much friction here, I can't
get what I need, the dependency chains are too large, I can do this better,
sure, they go ahead and they try to do it better. They create another version
of the repository, they manage it, and quickly the same thing happens
for consumers of that repository. But now the problem is worse, because there are
two function a's, a copy of function a.
And of course this maybe is an accurate picture in your organization.
But in a lot of organizations, especially with some of the values of microservices,
a polyglot ecosystem is a very real scenario
that you might encounter. So as we see that here, service d might be a
.NET 6 service, and it might be consuming, or
intend to consume, the same function a. But of course it can't.
And so rather than distribute it via a reusable package,
we often might say, well, this is where we should have just created another microservice
that can consume that, and it can be agnostic of the language. We don't have
to care. And it can begin to move away from some of these problems.
But the reality is, and what we're going to discover today is some of the
best things, some of the most appropriate things that we want to reuse through distributed
packages cannot be made into a microservice.
They are things that are used for cross functional concerns and for
technical bootstrapping. And so we can accomplish, or I
should say we can overcome a lot of these scenarios, but we have to apply
very specific intention. We have to know exactly what
we're building in order to do it effectively. And this requires intentional,
necessary effort. We don't get distributed reusable code for free.
And so we then understand a bit better
this myth, and the reality, of why we
should not share code within microservices. In fact, diving into some of
the most popular books over the last few years, one in particular that's always
been close to me has been Building Evolutionary Architectures, which says: microservices
eschew code reuse, adopting the philosophy of prefer duplication
to coupling. Reuse implies coupling, and microservices architectures
are extremely decoupled. These are opposite characteristics I
don't want to couple together. So we're going to have to think about coupling pretty
heavily when we talk about distributed code.
Of course, as we read further in the same book, we find that code reuse
can be an asset, but also a potential liability. So we understand the liability
portion of it, but also recognizing that if harnessed correctly,
it can be a big asset, making sure the coupling points introduced in your code
don't conflict with the goals in your architecture. So we're going to have
to expand. We're going to have to understand that a bit more as we dig
in. But the question remains still then, do I duplicate
the code or do I reuse the code? Understanding that you're going to have to
make this decision, especially if you haven't made it yet in your distributed service
system. And of course, if we think about copying code
versus reusing code in terms of the law of diminishing
returns, we see some interesting characteristics
developed too. If you're not familiar with the law of diminishing returns,
it's a principle stating that profits or benefits gained from
something will represent a proportionally smaller gain as more
money or energy is invested in it. So what we mean by that is you
might have a line that looks like this, comparing cost and resources for copied code.
I copied the code initially, and that was pretty easy
to do. In fact, it was a lot easier because I didn't even have to
write the code in the first place. So using this with one or two
additional resources, that is, copying it to the different places
that you'll put the code, it actually gets cheaper. But eventually, as you
have to maintain that piece of code and you've copied it across four,
five, six services. Now, when a change comes, or when it
has a package or dependency that it needs, and you're keeping that up to date,
this begins to start to cost you more than if you had distributed in the
first place. And so we see that change then in the
cost start to rise pretty substantially as we increase the number of resources.
So we have to consider the fact that depending on how many times I need
to reuse this code, it can be a lot cheaper or it can be a
lot more expensive. Compare that inversely then to reusable
code: we see the opposite characteristics up front. The first time I
want to reuse something, but I want to make it reusable code and distribute it,
that actually costs me more than the second time I do it.
But the reality is that it's probably even much more than shown
in this curve. That initial inclination to use distributed
code can be pretty costly if you're not sure of your projected outcome and where you're
heading with this. So let's keep these characteristics and
diminishing returns in mind as we consider code reuse. And for today's
discussion we're really talking about then how do we mature our code reuse practices?
And we don't have time to dive into all of the characteristics and
dimensions today, but we're going to first look at coupling.
Coupling is the best gauge that we have to understand how others
might make use of our particular piece of distributed code, how coupled would they
be to it. And the second part of that then is assuming we've made
the decision to reuse code, let's look at a scenario that can
offer us, in some cases appropriately, highly coupled
scenarios that are actually effective in providing value to the organization through templating.
And specifically we'll look at service templates and service chassis there.
So, diving first into coupling, let's understand
exactly what we mean by coupling. It's fair to always just jump back
into a definition for that out of the book we've been working with, which is
building evolutionary architectures. It defines coupling as how the
pieces of the architecture connect and rely on one another.
And that's helpful. But I think that actually for once, the Wikipedia
definition is even better here, which can help us break this
down. Coupling is the degree of interdependence between software modules.
So number one, a measure of how closely connected the two routines or modules are,
and number two, the strength of the relationship between the modules. So we're
starting to blend and move into the area of domain-driven design, and how close
the different domain aspects are to this. So let's dive in.
We'll talk about reuse, talk about duplication, and then we'll have a little
bit of a dive into reference
and reference code. But first, to get started, when we think
about coupling and code reuse, let's look at an example. And this is a real-world
example that I've worked through, that I've lived and felt the advantage
and the pain of: it was building an S3 multipart upload.
And this S3 multipart upload seemed like a distributed piece
of code that we wanted to write to make use of in a couple of services.
It was going to be a .NET-based project and it was going to be
used in at least two services, and likely several more after that.
And so we had a clearly identified need and we knew what we needed to
build. And it was a bit more low-level code than our business
logic, or the business logic of our services, would care about. Meaning that it's S3
multipart upload: it's chunking, it's streaming, it's using
buffers, it's resetting buffers and streams, that type of code,
which is fine to say, I'd like to make that reusable. That makes sense.
It is a little error prone. If you don't reset those buffers or those streams
at the right part as you're chunking and calculate the number of bytes properly,
then it can be problematic. And because of that, it's also difficult to test because
we don't actually have anything other than maybe LocalStack or a
mocked version of it to use locally. And so performing and writing integration
tests and ensuring that it actually functions against the real thing is not something that
we want to distribute across all our services if we don't have to.
And this seemed like a really well identified piece of code we wanted
to distribute, but it actually broke down and caused
a lot of the pain that we talked about earlier. Why? It
was inflexible and specific. We didn't correctly understand
the contract or the interfaces that we needed, with only having the two services to
develop it with up front. The module began to grow. It turned
into a different set of roadmaps and opinions: from just "here's a multipart
upload component" it turned into a cloud package which had much more in it, which grew
the dependency chain, part of that dynamic equilibrium problem.
And of course then it really became this proprietary library. And this
proprietary library then required us to use it the way it was meant to be used,
which is, we needed a download service. And so the download service wrapped
the S3 multipart upload, and you needed to use the download service and
the download config object in order to pass that into the cloud package, which
then used and instantiated the S3 multipart upload. That was
a mouthful. And that's one of the problems here: it required
some deep understanding of this that I really didn't need to have just to
use an S3 SDK. And as a
result, then, we also saw the dependency chain became a problem. We were
on one particular dependency, very, very coupled. In this
case, it was AWS SDK version 2 versus version 3 in .NET,
which is substantially different. And importing and using two different
versions of the same SDK in the same app domain in .NET is a fairly difficult
thing to do. And at the end of the day you ask yourself,
well, there's some friction there, but maybe what we identified
as the benefits was more valuable than what we lost. And the reality is
it wasn't. And that's something that's hard to gauge without experience.
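The chunking and byte accounting described here, which is exactly the error-prone part, can be isolated into a small pure planning function. This is a sketch in Python for brevity; the `plan_parts` name and shape are hypothetical, and a real implementation would then feed each planned part to the AWS SDK's multipart upload calls:

```python
# Minimal sketch of multipart-upload byte accounting (illustrative only,
# not the proprietary library discussed in the talk).

MIN_PART_SIZE = 5 * 1024 * 1024  # S3 requires at least 5 MiB per part, except the last

def plan_parts(total_size: int, part_size: int = MIN_PART_SIZE) -> list[tuple[int, int]]:
    """Return (offset, length) pairs that exactly cover total_size bytes."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part size below the S3 minimum")
    parts = []
    offset = 0
    while offset < total_size:
        length = min(part_size, total_size - offset)  # last part may be short
        parts.append((offset, length))
        offset += length
    return parts

# A 12 MiB object with 5 MiB parts should plan 5 MiB + 5 MiB + 2 MiB.
print(plan_parts(12 * 1024 * 1024))
```

Keeping the arithmetic pure like this is one way to get the testability benefit the talk wants without coupling consumers to a wrapper service, a config object, and a specific SDK version.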
But at the end of the day, when you look at it and you think
you saved 100 lines of code, that's a good indicator,
a good gauge to say, wow, maybe this wasn't the way that we should have
gone. So with that example in mind,
what are some reasons when we should want to
reuse? Well, emerging need was one. We had an identified emerging
need in that example, and that was appropriate. But understanding the maturity
of your organization is important because emerging need can be very different for someone
in a new organization versus an older organization.
Your roadmap and future plans may show different emerging need or problems
from others, and so identify what
that need might be and continue to look for other characteristics here that
we'll see as well. Emerging need is not enough.
High duplication, though, is interesting. And we identified that we had two places of
duplication, but maybe that wasn't enough for us to really identify the contracts
that would be in place to make it flexible enough. And so in this
case, I often think about the rule of three, at least. So you need to
have at least three places in the wild where this exists, not places
that you're planning for, but places that do in fact exist, because requirements and code evolve and
change. And so it's not theoretical. I need to actually have three
places this exists in production to evaluate and say, I see
it's exactly the same in three places, or it's slightly different, and I can build
a contract to build in that flexibility.
You have to think about high complexity as well. If something is really complex
and there is additional overhead to building it and duplicating it,
that might be a characteristic to say I should actually move towards building
that sooner. And of course
it might be high risk. And when I think about high risk, I often think
about authorization-type code. We're often taught and
told not to rebuild authorization mechanisms where we don't need to, but to use the
stuff that exists in your organization rather than building it
each and every time, because that can be high risk. And authorization could be a
good indication of something that is reusable. But you might
also look for stuff that has a high change frequency.
So we're not looking for stuff that changes heavily because it's different every time
you use it. You're looking for something that you have to change often,
but that's used in many spots exactly the same way. That would
allow you to change it in one spot, test it in one way, and then distribute
it out to a lot of those places. Of course, at the
end of the day, going back to our initial characteristics in looking at this,
we have to ensure that we have low coupling on the architectural dimensions
that you care about. And those dimensions are going to be specific to your application.
And so if it's important for you to move fast and not be tied to a
particular dependency, then we need to ensure this doesn't carry that dependency.
It might just mean generally using a lot fewer dependencies in your
particular package. You don't necessarily need that left-trim package in
order to do a left trim, right? You can write something else
for that internally, or copy it
inside your particular code base and distribute it.
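To make that left-trim point concrete: Python actually has this built in as `str.lstrip`, so the sketch below is purely a stand-in for any trivial utility you might otherwise pull in as a third-party dependency (a nod to the famous left-pad incident in the npm ecosystem):

```python
def left_trim(text: str, chars: str = " \t\r\n") -> str:
    """Trivial utility: drop leading characters found in `chars`.
    Small enough to own internally instead of adding a dependency
    and its whole transitive chain."""
    i = 0
    while i < len(text) and text[i] in chars:
        i += 1
    return text[i:]

print(left_trim("   hello"))  # hello
```

A helper this size costs almost nothing to copy and test internally, which is the point: the coupling cost of a dependency should be weighed against how little code it actually saves.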
One of my favorites too, that is a great place to identify reusable code
is for best practices and principles. Anytime that you can
codify your best practices and your technical principles and move them into your
code base and then distribute that, that's a huge win that spreads across the organization.
And so you're seeing materialize then a series of characteristics that lead
us to a clear set of reusable functionality, which is technical and cross-functional
concerns in a distributed world. What do we mean by that? We're talking
support code, we're talking authentication, authorization, standard configs,
platform-level features and SDKs that need to be integrated
with. Logging and monitoring can have a great
representation here. We'll talk about that in a minute. HTTP client SDKs
and wrappers are also very important. Those are things that don't necessarily change,
but can provide a lot of value. Error handling and validation:
you don't need to do that differently in different app domains in a lot of
cases. And of course, serialization: your serialization
can be standardized in more interesting ways across the organization.
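For the error-handling concern in particular, a standardized payload can be as small as this sketch of an RFC 7807 style problem-details model (a standard that comes up again later in this talk). The member names come from the RFC itself; the class and the example URIs are hypothetical:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class ProblemDetails:
    # Standard RFC 7807 members for an application/problem+json body.
    type: str = "about:blank"
    title: str = ""
    status: int = 500
    detail: str = ""
    instance: str = ""

    def to_json(self) -> str:
        return json.dumps(asdict(self))

# Hypothetical example, loosely modeled on the RFC's own sample.
problem = ProblemDetails(
    type="https://example.com/problems/out-of-credit",
    title="You do not have enough credit.",
    status=403,
    detail="Your current balance is 30, but that costs 50.",
    instance="/account/12345/msgs/abc",
)
print(problem.to_json())
```

Distributing one small model like this means every service emits the same error shape, which is exactly the kind of cross-functional concern that reuses well.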
Coming from SPS Commerce, then, we've had the opportunity to
distribute a lot of these types of things in reusable code,
and we use a monolithic, or I should say a monorepo-style, approach
to that, where we build out a lot of these reusable modules in a shared
repository in GitHub. And one, for example, is logging. We have
an opinionated logging structure where we look for consistency of operations,
dashboarding, and review, meaning that we push out this structured JSON log format
that is the same across all of the particular applications we
install it into. And that means that the log format is the same.
Our operators, whether it be our engineers themselves who are monitoring production
or other teams entirely that want to look at it, they have an immediate understanding
of the log format that's there. Dashboards can be made in
a reusable way as well, because they can use the existing log format for it,
and it's very quick and easy to review as needed.
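As a sketch of what one opinionated, shared log format might look like, here's a small JSON formatter for Python's standard logging module. The field names are illustrative assumptions, not SPS Commerce's actual schema:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit every record as one structured JSON object, so operators
    and dashboards can rely on the same fields in every service."""
    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)

# Wiring it up in a (hypothetical) service:
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log = logging.getLogger("orders-service")
log.addHandler(handler)
log.setLevel(logging.INFO)
log.info("order received")
```

Shipping the formatter as a shared package, rather than copying it, is what keeps every service's output identical when the format inevitably evolves.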
Errors can be another great example where, if you're using an API design
first approach, do you have standardized error formats for
your APIs that you're pushing out? There's no reason every service,
every microservice you have, or deployable unit, should be building that
on its own, or worse yet, just using completely its
own strategy or unique schema. We should be able to standardize. And if
you're in the API design world, you're familiar with RFC 7807, which proposes
a standardized way to do that. Now you need to represent that model
and distribute it. Identity: identity is our
package that allows us to handle authorization and authentication in a standardized
way, tested in a single spot, and distributed in a highly
effective way. And serialization, as I mentioned earlier
in the last slide, serialization is not something that necessarily has to change
between service to service. In fact, you can find a lot of additional interoperability
and capability by just having that taken care of. And I'm not just talking about
are you choosing camel casing or snake casing. I'm talking about more interesting things around
how you handle enum serialization, or how you handle
nulls, whether you ignore them or add them, what you do when you want to
ignore something, all sorts of
detailed serialization questions that are often overlooked,
and of course, secrets. At SPS Commerce, we use the AWS secret
manager, and we have a very custom and proprietary way that we use it
and organize it in a multi-account, cross-account world,
sorry, I should say multiregion, cross-account world. And so
what we're seeing here is that there's still some level of coupling to these
aspects as you pull these into your service. But this is
where we move in and we talk about the concepts and the characteristics around appropriate
coupling. Because you see, you might think, or you might believe
that the term coupling is always bad, I always want low coupling, when in
reality there's also appropriate coupling. And if you're asking what appropriate coupling
is, let's go back for a definition, which is: dimensions of the architecture
that should be coupled to provide maximum benefit with minimal overhead
and cost, meaning that there is a benefit that coupling can provide.
When we balance that, we also compare it against this additional quote, which is the idea that the more reusable code is, the less usable it is. Meaning that if we go to make our code too reusable,
too flexible, it then doesn't provide an opinion that might
be specific to your organization, and therefore it's less interesting, it's less
usable. So we're looking for that balance of the right level of
low coupling, but still having reusable code that provides an opinion
in there. And that's a really difficult balance to find. Like everything in architecture, finding that balance means staying in the middle of the road, not ending up in the ditch at either extreme. So let's look at two examples that might help
clarify exactly what we mean by this balance that we're looking for.
First would be consistent logging format that we just talked about. The balance
there, of course, is the advantages that I mentioned around dashboard
usage, and we can have operators then that look at
this in the same way. But if you've created, for example, a reusable dashboard,
and that dashboard is a single instance where you can select a service and it switches between all those services very easily across your different teams and can read them all, that's great. What happens if that reusable dashboard is actually a template that gets copied every time it rolls out? Now, if I were to change my log format, that doesn't necessarily help me in the same way, because the new log format rolls out to the different services and will break all the reusable dashboards that are out there and cost me time to update and redeploy them. So thinking through
how the different coupling characteristics might interact are essential
here. Another one of my favorite examples
is with feature flags. If you're not familiar with feature flags, think of them as a decision point you can add into your code to decide if you're going to execute a piece of code or not, such as whether you're releasing a feature.
And you might ask another service to say, hey, is this feature on or is
it off? And when we think about it, there are three key areas that we
can use appropriate coupling to help us with feature flags.
First would be flag keys. So when I ask that provider,
hey, is this flag on or off? I have to provide it a key.
And that key is a string text that has to match what's in the feature
flagging decision provider. If it doesn't match, it's not going to work. And so
if I can create a package that distributes those particular keys across
different services that might need to consume the same feature flag toggle,
that's a big advantage. That also helps me and is an advantage when I want to clean it up. If I want to clean up one of these flags across a whole bunch of different deployable units, I need to ensure that I go across them all and remove it. So using a distributed package, building the keys into an enumeration as an example, and then removing that enum member allows me to redistribute the package and intentionally break the build as people upgrade. They'll upgrade, be able to go to their code, and know that they should remove that flag because it's no longer active and shouldn't be considered.
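To make that flag-key idea concrete, here is a minimal sketch in Python (the talk is language-agnostic; the `FlagKey` enum and provider stub are hypothetical names for illustration, not actual SPS code):

```python
from enum import Enum

class FlagKey(Enum):
    """Flag keys distributed in a shared package, so every service asks
    the provider with exactly the same key string. Deleting a member on
    cleanup deliberately breaks consumers' builds when they upgrade."""
    NEW_CHECKOUT_FLOW = "new-checkout-flow"
    BULK_EXPORT = "bulk-export"

def is_enabled(key: FlagKey) -> bool:
    # Stand-in for the real decision provider; a real client would send
    # key.value over the wire and get the on/off decision back.
    enabled_flags = {"new-checkout-flow"}
    return key.value in enabled_flags

print(is_enabled(FlagKey.NEW_CHECKOUT_FLOW))  # True
print(is_enabled(FlagKey.BULK_EXPORT))        # False
```

Because consumers reference `FlagKey.NEW_CHECKOUT_FLOW` rather than a raw string, a typo becomes a build or import error instead of a silently misbehaving flag.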
Similarly, user context is important. When I work across a distributed
environment, I need to provide the same context and say my user is
name Travis. Now if my other service
asks and says name is Travis John,
then that's not going to work. That's not going to be appropriate. Those are two
different names that it's provided. Maybe one's using name, one's using first name, last name.
The context isn't the same. So using a distributed library to help with the abstraction of the feature flag context can be an enormous help in appropriate coupling as well. Another way
to think about appropriate coupling and the difference between duplicating
and reuse, I find, is the traditional animation process. And in
my household I have three young children, and the Lion King
is a pretty big,
I guess, show that we watch often. And it's interesting when
you look at the traditional animation process, I have a lot of respect for the animators, because it is a ton of work to make some of these minute changes and detail shifts. When you think about it, we can break this into two core
parts. The first is the foreground, the second is the background.
And here the background really only has two components. And the foreground,
though, in animating the different animals as they move and shift, is pretty unique, right? Every time I have to redraw it, do I want to have to redraw the background that hasn't changed every single time? And of course,
the answer is no. And when we think about that, then we think about that
in terms of duplication. When you're going to redraw those animals
in this way, you don't want to redraw the background every time. You're going to
have a ton of little changes and small changes to make. And this is what
the most important part of the scene is, is these animals, not the background.
And of course, the background then is the reusable portion that we want to have
there and available. And in the traditional animation process, they were done separately. You can take a look behind the scenes at the Lion King and see how that looks and works. But essentially we have some transparent
sheets that are overlaid over top of each other, a series of overlays that build
this together. So the animator isn't responsible for drawing the whole
scene and redrawing it every time he wants to slightly shift or move
the arm of the lion, for example. In the same way we think
of that with code reuse in a distributed microservice world,
when we stand up a new microservice, there's a ton of cross functional concerns that
are just the background. They're there to make the service work, they're there
to set the context. But the real business logic, the thing that
makes this particular service unique,
is often not something you want to duplicate. It's domain
driven, and it should be existing in this particular domain. It belongs in this microservice.
So with that in mind, though, sometimes there are reasons just to copy it.
And so I want to run through some of these quickly with you. Sometimes understanding if you have an incorrect abstraction is important, right?
And so when we look at this, the idea is that you should prefer duplication
over wrong abstraction. If you don't have enough information to understand the
abstraction that you want to build, the interface or the contracts that you're building,
then don't do it yet. Like that simple S3 example we looked at earlier, we didn't have enough understanding
of what the abstraction should be.
Low overhead-to-savings ratio. So we think about what the actual reuse savings are going to be. And of course, a lot of this goes back
to that law of diminishing returns we saw earlier. And in a lot of cases, there's no simple formula to say this is
the amount of time you're going to save. And a lot of these things are
intangible to some degree. But as you begin to think about this
and look at this more, you'll develop a real sense for it, a real
gut understanding of it, but in many cases,
understanding how often you plan to reuse it and some of those other characteristics, in terms of a high-risk emerging need, or whether it's already built in three spots where I can see how it's been used, are really helpful guidelines for how to approach that.
And of course you want to think about feasibility here. And when we
think about feasibility, we're thinking about the idea that maybe
this isn't something that your team should be building. Maybe you've bitten off more than you can chew, right? If you're a particular delivery
team and you're working on a small aspect, and you're putting together a large application
framework, reusable package, maybe that's something that your platform engineering
team should be building, maybe it isn't even you. And so it's
something to consider that if it's not feasible,
if, for example, there are too many dependencies being added and you can't legitimately do
it without a high degree of coupling, then don't try and
do it. If that's the case, it can be better to copy a little code than pull in a big library for one function. Dependency hygiene trumps code reuse.
And of course last but maybe of most
interest is diversified opinions. Sometimes there are just too many opinions on how to build something. And if
that's the case, maybe you should consider actually not building it yet, until you've actually had a chance to land on an opinion within
your organization that would actually make it reusable. Otherwise, you might find
that no one wants to use your nice reusable code that you've built because they
have a different approach to the performance of it or to how serialization
should work. So with
that in mind, I always like to bring up this idea
around shared utility libraries with coupling. This is often how I feel when
I see shared utility libraries. And let me explain a little bit about
what I mean by that. And so if we were to use a library kind
of associated term here, and in the library you had a book and you
wanted to pull that book off the shelf, and it might be called MyOrg Module Utilities. And there are two core pieces of content in that particular book: how to cook Kraft Dinner in the microwave, and how to build custom furniture. So different from each other.
Much like many utility libraries that are just made up of different random
things that people put in there that do different string parsing utilities
or enum parsing all the way to HTTP
clients for APIs, the reality is that they don't actually belong together.
And because you've coupled them together in the same distributable project or single library that's there now, they can't be used effectively
to get rid of some of the problems and the high degree of coupling that
we've seen thus far. And so the reality is that cooking Kraft Dinner in the microwave and building custom furniture have much different dependencies involved in them. Or if they were actually pieces of code, they would have much different packages they consume. But now you've bound them together and you're pushing them out,
forcing consumers to think that well, number one, the dependency tree is going
to be awful for that. Number two, do you really have any authority over either
one of those? Why would you put them in the same package? Maybe they're
not even accurate or correct. And so it begs a lot of
questions. Keep those libraries small, keep them task focused and specific.
Don't build a utility library. But in a lot of cases a utility library
has been built simply because someone just needed a place to
start copying these kinds of one-off functions and put them somewhere. And that's not a bad thing to have. It's just probably more appropriate not to distribute it, and instead keep it in a GitHub repo or somewhere like Stack Overflow. We think of those as snippets, right? So you might
use different methods for keeping track of those and having
them available. Maybe at some point it makes sense to see how they grow,
and maybe there is a package that is nicely coupled together that would make sense
to build. But over time you'll likely find that there's a good chunk
of stuff in there that should never be released together in a single library. So with that in mind, let's assume that you've made the decision to go ahead and start building some of these cross-functional concerns into a library; you can go ahead and build it out, and I'm sure it would be effective.
But what I want to touch on here is the idea of templating. Because templating
can really help us think about how to position our reusable code across your ecosystem.
And so the idea around templating takes us one abstraction
layer above that with the intention of introducing a
grouping of packages to form a standard, a more opinionated way of
implementing something that can be cross cutting, some of these cross cutting technical areas
that we want to apply to. So we're going to talk about project seeds.
We're going to build on that with service templates and then build on that with
service chassis. So diving into project
seeds. What I mean by a project seed is a very high-level reference point for starting a new application that typically provides a standardized folder structure along with SDLC workflow via template files.
And so it's fairly simple and straightforward. Your seed is made up of things like metadata and folder structure.
Do you use source or src? Do you have a test folder? Do you have
a standardized GitHub Actions YAML file you put in there that gives you the defaults of your template or workflow? Really any of these types of files or folders: it might be GitHub-specific files like a Dependabot file or CODEOWNERS file, or even a README.md of how to get started. And when you go ahead and
create that, you're creating a copy of it. And so in most cases when using
a project seed, it's a one time copy to start the project. Here's your skeleton,
go ahead and start ripping it apart and changing names and moving stuff around.
And so the value here, there's a little bit of value, it helps you notionally
get started on a new project. In some cases people are already copying previous
projects they work on in order to start. And the cost to maintain is pretty
low because it's typically language agnostic. And you see that as
a repository template in GitHub where you can mark any repository as
a template and then it simply gets copied and pasted
into the new repo as you selected, and you start. So there are mechanisms out there and tools in GitHub that just have that built in. It's low value, but low cost as well.
But more interesting is moving to a service template. And a
service template is defined as this, which is an opinionated reference
for specific application and language types that reduces boilerplate setup and provides consistency on cross-cutting concerns.
The important part here with the template is we're actually moving to language specific scenarios
where we want to provide distributed and reusable code.
And so here we have security classes, we have external considerations, loggers, tracing: all the types of code that we talked about earlier that we think is appropriate to reuse have been copied and are
available here. Now, typically a service template can be created with tokenized parameters. Tokenized parameters say: hey, I want to change the namespace this is in, or prefix the class names with some specific name for this service, and it does that transformation and dumps the result in your repository for you. This again is a
point in time snapshot on creation. It doesn't update or change after that.
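The tokenized-parameter substitution can be sketched in a few lines of Python; the token names (`ServiceNamespace`, `ServicePrefix`) and values are invented for illustration:

```python
from string import Template

# A template file carries placeholder tokens that get substituted exactly
# once, when the new service is created from the service template.
template_file = Template(
    "namespace ${ServiceNamespace};\n"
    "public class ${ServicePrefix}Startup { }\n"
)

rendered = template_file.substitute(
    ServiceNamespace="MyOrg.Orders",  # hypothetical values a creator supplies
    ServicePrefix="Orders",
)
print(rendered)
```

After this one-time transformation, the output lives in the new repository and has no further link back to the template, which is exactly why updates don't flow through.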
And so here we see the value is a little bit larger because I can
get started pretty fast with some pretty great opinions. But the cost grows over time, especially as we think about the law of diminishing returns: the fact that I'm copying these pieces of code is fine when I've got three of them, but when I've got ten or twenty of these and I need to make an update to the security class, oops, that's a big change to make.
Nevertheless, we see the ability to do service templates in tools like Backstage.io, which has a great template library and marketplace that you can go into; you can create custom ones, choose them, and tokenize the parameters to get started, and that works fairly well.
The reality though is we're still copying that code and we're still seeing some of
the coupling points are really going to get our way over time. So we
really need to think about the concepts around service chassis. A service chassis, again, is not something that you would use instead of a service template; you might actually combine project seeds, service templates, and service chassis altogether to form a great experience.
But at a high level, a service chassis changes this around
by instead of including all the code inside of our
service template, we're actually just including configuration for those packages that
are being pulled in. And so now they are references: we've copied by reference, or I should say passed by reference. And so when we create the new service, that's great. It is a point-in-time snapshot of the service template. But because I can change and modify those pieces of code, I can very easily now reference them, and reference newer versions if I want, as you push them out.
And so here we see the value is pretty high because not
only can I get that point in time snapshot and get flying on a new
service very fast, I can also begin to augment and change the opinions
and the best practices and have individuals update
with them over time. Now it gets a little bit better than that.
We can actually take this one abstraction further. In this case, I have a series
of static references. I have exactly five references. They've been added.
That's all I can ever change at a global level within the organization,
unless I go back and add another reference inside the service template. But then only
new projects get that. So we abstract it one bit further with
the service chassis and we use this particular chassis. We might
have a chassis named specifically for building rest APIs in the organization,
and it provides a configuration of all the packages above, maybe some other
configurations relative to REST APIs, and that way our service template becomes much smaller. Anytime we can eliminate components in the service template but provide the same functionality, that's a benefit. So here the service template is very small; it's just a bootstrap configuration that says: use this one package that we have as a reference to the abstracted
reference. And of course when we then go ahead and create the service off of it, it references a single package, but that references all
these other packages. And of course the benefit there is I can add new stuff,
I can change high level configuration, I have a lot more control
on the integration of that within the service. And so the
value here can be much higher and the cost is much lower, not assuming the
cost of creating the other packages. And so this is a pretty large advantage, where we can start to combine the service chassis concept with the service template and the seed to build a pretty nice
experience for reusable code. Of course,
there's another problem that we often experience in the organization,
and I call that the service mesh gap. And the reason I call it that
is because we think about platform engineering. We think about building out these platforms
that a lot of our organizations are building now, and a lot
of that is built on Kubernetes cluster, other container orchestrators
perhaps, but at the end of the day, using service meshes and building that
platform, we see a lot of the functionality provided for us.
So areas we might have only been able to address in code before, like distributed tracing. If you're trying to do that without a service mesh, you have to put it in your code. But now, as we start to move some of these things and we use service meshes and proxies, we can begin to automatically build in mutual TLS, tracing, egress, logging, metrics, errors, and auth as just a default part of that, even if your container is just hello world and has nothing else as a part of it. But the reality is that to do the logging effectively, it often still has to come in some type of agreed-upon format. And distributed tracing works in the service mesh without any changes.
But we can add a lot more context to it if we want to build
its maturity to the next level. And so we need to meet a contract in these cases for metrics, for logging, for tracing, and that can be a great place for the service chassis to fit as well. The reason is that now we can make platform-level updates, but also roll out distributed pieces of code that automatically roll out to these different services that have already implemented and used that particular service chassis.
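As a rough illustration of meeting such a contract, here is a Python sketch of a chassis-provided formatter emitting an agreed-upon structured JSON log format; the field names are hypothetical, not an actual SPS or mesh contract:

```python
import io
import json
import logging

class JsonFormatter(logging.Formatter):
    """One shared answer to the log-format question: every service using
    the chassis emits the same structured JSON fields, so mesh-level
    tooling and shared dashboards can parse all services the same way."""
    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

# Capture output in a string buffer just for this demonstration.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("orders-service")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("order created")
print(stream.getvalue().strip())
```

Because the formatter ships in the chassis package, changing the contract is one package release rather than a change in every repository.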
Now the next question on top of this I know is what you're
thinking, which is that my teams never update their packages. That's great, you're saying, they can update to a newer version, but they're not even necessarily thinking about that. And if we want to get this to a point
of highest effectiveness, we need to have some
level of communication with the teams that's letting them know when new
components or new versions of the package are available. And that needs to
be something that happens fast and quick in order to keep your
velocity flowing in a microservice world. So there are lots of different tools to
do that. If we think about the problem and we have a library,
in this case a nuget package could be a maven package in your app.
The only reasons your team is going to want to update are the initial install, a detected vulnerability, perhaps a major upgrade, or a new feature that they're looking for specifically in that package. Beyond that, teams aren't
going to just go look at it and upgrade on their own. You may have some really great engineers who do that, but typically they don't.
And so we can use tools to help us solve this problem. Tools like Dependabot. If you're using GitHub, Dependabot is a really easy component,
especially for open source, that you can enable that automatically submits pull
requests to your repository with any package updates.
And to answer your question, yes, it does support private feeds. So if you want to build your internal service template and deploy it to an internal package repository like JFrog, or even a public one, you can configure that and include it. It is highly configurable for other purposes too. So if you're saying, I don't want to bump all my version numbers all the time (though that might be a good idea), you can configure it to do that only for certain packages. And this is incredibly helpful. And we encourage all our teams
to do this so that you're staying up to date with, if nothing else, at least the distributed packages that are there. That could be information you include in the documentation of your library: that this is the expectation for consumers. It does work across all major language ecosystems
that I've looked at and used, and it does interact nicely through
pull requests. There are other tools out there. If you're largely in the .NET world, NuKeeper used to be a good option. It's not a hosted service though; it's one you'd have to build out and host yourself. But Renovate is a great option. Dependabot is nice and it works, but Renovate provides a lot of additional options, especially in terms of grouped updates, ensuring that certain packages get updated at the same time in the same pull request, which can be problematic in Dependabot. So take a look at this type of tooling for dependency
management and dependency consumption and see if you can make use of it. It'll change
the way you think about distributed code in terms of your velocity as well.
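As a concrete illustration, a minimal `.github/dependabot.yml` for watching an internal chassis package might look like the following; the registry name, feed URL, and package prefix are hypothetical placeholders for your own internal feed:

```yaml
version: 2
registries:
  internal-nuget:
    type: nuget-feed
    url: https://nuget.example.com/v3/index.json  # placeholder feed URL
    token: ${{secrets.INTERNAL_FEED_TOKEN}}
updates:
  - package-ecosystem: "nuget"
    directory: "/"
    schedule:
      interval: "weekly"
    registries:
      - internal-nuget
    # Optionally scope pull requests to just your distributed packages.
    allow:
      - dependency-name: "MyOrg.*"
```

With a configuration like this, teams get a pull request whenever a new version of the shared chassis or its sibling packages is published.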
So with that in mind, moving on then, here's an example of a service
template that we have at SPS, and this incorporates a lot of these
capabilities. This particular example is demonstrating
how we moved and used the service template to move between error formats, to move to structured JSON output, to move our secrets from AWS Parameter Store to Secrets Manager, how we handle and move from standardized resilient HTTP clients for identity authentication to more distributed auth handlers, which is pretty cool. Tracing: we moved from AWS X-Ray to OpenTelemetry, along with many modifications of our serialization routines. And all this was built inside a standard API chassis that had an opinionated set of best practices within our organization that it created and set up by default.
And then at the same time it can include a bunch of other capability: additional security, middleware, Swagger, or Sentry. Build that all into the application so that with a single install and a simple package reference, we can begin to take advantage of the best practices in your organization without having done anything, without having the overhead of that code even in your repository.
And so for us, this is a particular example in .NET. We also have other growing service templates in Java and also in Go and in Python. But in this example, I can simply create a new scaffolded web application using dotnet new webapi, which is the default template from Microsoft, not something we created. I can then do an install of our chassis package and update the runtime host, so updating your Program.cs essentially, or the equivalent entry point if you're in Java. And here you can come in and specify that you want to use the SPS host. This is our service chassis.
Add it in and then specify that you want to use the middleware as
well as the dependency injection. And after that, everything is configurable and ejectable, meaning that you have an escape hatch at every point if you decide you don't want to use a certain feature of that package, without ejecting from the whole service template altogether. Of course, there are teams that don't want to necessarily
couple themselves to this particular large service template, and they can
use some of the other distributed packages independently and individually if needed.
So that's been a huge advantage in what they're producing.
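The chassis pattern itself is language-agnostic. Here is a heavily simplified Python sketch of the idea (names like `OrgChassis` are invented for illustration, and a real chassis wires up real middleware rather than strings): defaults are applied by a single package reference, and every opinion has an individual escape hatch.

```python
from dataclasses import dataclass, field

@dataclass
class OrgChassis:
    """Bundles the organization's default opinions behind one package
    reference, with a per-feature escape hatch for each opinion."""
    use_auth: bool = True
    use_tracing: bool = True
    extra_middleware: list = field(default_factory=list)

    def build(self) -> list:
        # Each opinion is on by default but individually ejectable.
        stack = ["structured-json-logging"]  # the always-on baseline
        if self.use_auth:
            stack.append("org-auth-middleware")
        if self.use_tracing:
            stack.append("opentelemetry-tracing")
        stack.extend(self.extra_middleware)
        return stack

# Default chassis: all organizational opinions applied.
print(OrgChassis().build())

# Escape hatch: a team ejects tracing without leaving the chassis,
# and layers on its own middleware.
print(OrgChassis(use_tracing=False,
                 extra_middleware=["team-rate-limiter"]).build())
```

The key property is that a new chassis release changes the defaults for every consumer on upgrade, while teams that have ejected a feature keep their choice.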
Well, we're almost at our time for today, but it's important that we talk
about what incremental gains you can achieve within your organization.
And I want to make sure that this was understood before leaving here, that in
some cases, achieving a full service template might be a tall order, something that you can't get to with what you
have. And so instead of developing grand designs for an internal code framework,
it's often best to start small, develop iteratively, and progressively build on small
successes. So if you can't build a full service chassis today, start with some of the smaller concepts: build one package, include that package reference in your service template, and start to roll it out.
Think about whether you need a full service chassis; we didn't talk about that today, we didn't have a chance to. But if you're in a polyglot ecosystem,
it can be a lot of work to maintain those. And your dynamic equilibrium is all about whether you have enough people supporting it relative to the needs of the organization. So a lot of this comes back to the resources you have
at hand and the capabilities that you have. But no matter what you decide,
there is a path forward for effective code reuse for
distributed microservices, and I think that you should investigate it further,
and I hope that's compelling enough for you.
Thanks for taking a look today. I'll leave you with this quote from Douglas Crockford, which I love: code reuse is the Holy Grail of software engineering. And whether by Holy Grail he intended to mean that it is all about the journey, that you may never find the Holy Grail; or the potential that it may or may not exist; or whether it is in fact just the ultimate treasure that we are looking for. I'll let you decide on which definition is
appropriate, but keep thinking about appropriate coupling for your decisions,
and dive in and take a look at some of those service templates and chassis
that you can build internally in your organization. Appreciate your
time today. Take care. You can always find me online as well, on my website or on Twitter.