Conf42 DevOps 2023 - Online

Is It Time To Put Your Pet Kubernetes Down?

Abstract

Chris explores how Kubernetes is the new pet in town and the consequences that presents: how to know you’ve got a pet, and what to do about it.

Summary

  • Chris Nesbitt-Smith explains how Kubernetes clusters have become the new pets. Like the physical servers, your workload failures can be detected and replaced seamlessly. But with more and more features being out of tree, many essentials are optional add-ons.
  • The open source community is awful at packaging things up for consumption, introducing needless abstractions. The kids doing Kubernetes seem to have not learned from the past. Everything, literally everything that exists around us depends upon open source.
  • Make your cloud vendors do more than just provide compute. Keep it stupid simple, or keep it simple, stupid. And embrace the shared responsibility model on offer. Is it time to put your Kubernetes cluster down? Yes, it is.
  • Chris Nesbitt-Smith is awful at self-promotion, especially on social media. talks.cns.me contains this and other talks, and they're all open source. Questions are very welcome on this or anything else.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello. Well, thanks for joining me here today and allowing me to stand up and use my clicker and everything. It's a privilege, and thank you very much for joining. So to kick things off, my name is Chris Nesbitt-Smith. I'm based in London and currently work with some well-known brands like Learnk8s, ControlPlane, esynergy and various bits of UK government. I'm also a tinkerer of open source stuff. I've been using or abusing Kubernetes in production since it was 0.4, so believe me when I say it's been a journey. I've definitely got the scars and war wounds to show for it. We'll have time to deal with questions in the chat, so please do drop a line in and let me know where you're joining from and any questions that you may have. If I don't get to them in here, then please do feel free to find me on LinkedIn and have a conversation there.
So, the history of the pets versus cattle terminology is muddy, but most link it to a presentation by Bill Baker from Microsoft, made in around 2006, about scaling SQL Server. Way back then, in the before times, we called ourselves sysadmins and treated our servers like pets. For example, Bob the mail server. So if Bob goes down, it's all hands to the pumps, the CEO can't get his email and it's near on the end of the world. We say some incantations, make some sacrifices at an altar and resuscitate Bob, bringing him back from the dead. Crisis averted. Cue the applause and accolades for our valiant sysadmins who stayed up late into the night.
In the new world, however, servers are numbered, or maybe just given a UUID. So they are like cattle in a herd. For example, web 1 to web 100. So when one server goes down, it's taken out the back, shot, and replaced on the line. So why am I telling you this rather morbid story? Well, Kubernetes deals with all of this, right? And saves us from the tyranny. And you're right, it does. All of your computers are called nodes, and they're abstracted and given arbitrary names.
Auto scaling groups and such will automatically detect the sick in your flock, take them out and bring a replacement in, all while seamlessly rescheduling the workload that was on that failed machine. And Kubernetes takes that a step further. Your workloads also have unique names, so like the physical servers, your workload failures can be detected and replaced seamlessly.
So where's the pet, you might ask. Well, what's the first thing we do with a brand new Kubernetes cluster? I'll give you a hint. It's not deploying your application, or actually anything that the business itself cares about. Something like this look vaguely familiar? Yeah, we had to do a load of things just to make this cluster able to start running our workloads. And it's worth noting that the trend towards more and more features being out of tree, which is to say that they are optional add-ons and don't ship as part of core Kubernetes (examples being things like FlexVolumes, policy and basically all the Kubernetes SIG projects that many find essential), is only exacerbating the issue.
Well, that might work when you've got, say, a single cluster, but what about when you've got dev, integration, staging and QA that your app all needs to run on? Or worse, when you need separation between your teams or products. So maybe you've automated all of that with, say, some Bash, Ansible, Terraform, whatever you like. Well, cool, good on you. However, you'll find it won't be long before there's an updated version, perhaps patching a vulnerability that you care about. And you may be stuck trying to test every single app and permutation across your estate. So this is what we're calling day two operations. We used to call it BAU, or business as usual, and it's where we find reality catching up with our idealistic good intentions. So you'll quickly find that clusters are running various versions.
So given the rate of change in the community and industry, it's unrealistic to run latest everywhere confidently, at least without breaking production and disrupting your operational teams. And with permutations of seemingly common tools and choices (some teams might use Kong, others NGINX, another Apache, another Envoy, all for plenty good reasons, I'm sure), you'll find yourself with seemingly infinite possibilities emerging across the estate. Sad times, right? Congratulations, you're now the proud owner of a pet shop. Or if you've managed to automate the creation of them, you can call it maybe a pet factory, but it's a headache.
So how does this hurt you, you might ask? Well, maybe you like pets. Well, assuming, of course, you're in cloud, your world could roughly be summarized into a few tiers.
So, apps. These are the things that your boardroom know about and can probably name. So think your public website, shopping cart system, customer service apps, online chat interfaces, email systems, and so on. These are all implicitly providing some value in and of themselves to your end customers.
Infrastructure. With cloud, this is hopefully all a commodity. Thankfully, the days where anyone in your business should be caring about the challenges of physically racking up hardware, not overloading the weight in the cabinet, and taking pride in how well they've routed all the cables have hopefully all passed, and you're now just consuming infrastructure. Hopefully you've codified this. But even if you're into ClickOps, making sure it's running is not really your problem. No one in your business is concerned with hardware failures, patching routers every time there's a critical vulnerability, testing the UPS and generators regularly, upgrading the HVAC when you add more kit, and so on. Yawn-o-rama, as my 16 year old would say, and then curse me for repeating.
But your interactions with any of this are basically a few clicks or lines of code, and some infra is then available to you with an SLA attached to it from your cloud vendor. If only the story ended there, though.
Sandwiched between those is a gray layer of all the operational enablers. It's where your DevOps or SRE team live. So think log aggregation, certificate issuers, security policies, monitoring, service mesh and others. These are all the things you do for all sorts of reasons, ranging from risk mitigation to emotion and technically unqualified opinion, or just without the foresight of what was round the corner in, say, six months. Let's just make the leap and assume for a minute that you are more technically competent than your Goliath multi-billion dollar cloud vendor. You've completely negated many of the benefits of going to cloud in the first place by ripping up that shared responsibility model. All of this, whilst technically fascinating for people like me to stand and stroke my beard at, is delivering absolutely zero business value. Unless, of course, your business is building or training on those products. And who'd want to get into that business, eh?
And that's not all. What about recruitment? So you might think that you want a DevOps, right? Oh no, wait, a DevOps with Kubernetes experience. So maybe a CKA or CKS. Oh yeah, well, it's on AWS and we use Linkerd, and in some places Istio, not the current version or even the same version everywhere. A mix of pod security policy, Kyverno, OPA policy, some Terraform, Helm, Jenkins, GitHub Actions going on, all in a monorepo, apart from all the stuff that isn't. Well, we're well outside the remit of commodity skills and back to hunting unicorns. Sure, you'll find some victims. Sorry, I mean candidates that you'll hire. Well, now you've got one hell of an onboarding issue before they can do anything useful and help your business move forwards faster than it did without them.
And if you've hired smart people, they'll come with experience and their own opinions of what worked for them before. So your landscapes get bigger and bigger and more complex and diverse.
I did some googling, so this is what the CNCF landscape looked like way back in 2017. Choices, right? Choices and logos as far as the eye can see. Have you seen it recently, though? I mean, this has got a bit out of hand. I'd say someone might want to have a word, but I suspect that would probably just make things worse by adding yet more things on the board.
And don't get me started on operators. I mean, nice idea, but they end up betraying any of the ideas of immutability with crazy levels of abstraction. And have you seen the craziness of mutating admission controllers too? I mean, if you're really mad, you could nest these things with operators that create CRDs for other operators that are all mutated. I mean, heaven forbid someone bumps the version of anything. And no doubt it's all held together with sticky tape, chewing gum, glue, pipe cleaners, thoughts and prayers, and Helm, a string-based templating engine where any community module has to eventually expose every parameter in every object file, abstracted by a glorified string replace.
So now I've got to have in my head all of the complexities of a Linux or Windows host, how the container runtime works, the software-defined networking, storage, the hypervisor, before the container, the scheduler, the controllers, the auth policy, the mutating policy in the cluster, before I worry about how someone in the nested Helm chart mess of hell has mapped the replica count of one of the deployments to a string called DB replica count, and how that has changed, on a new version of a dependency that wasn't following semver, to database replica count.
So instead of having my expected three, I've now only got one, when I could have just written a YAML patch for the replica count in the deployment object of the database resource, using stable API versioning with schema validation, all for free. The kids doing Kubernetes seem to have not learned from the past.
Don't get me wrong, I love the open source community with all my heart, and it's so important, and it's simply not possible to do anything without it. Sorry, not sorry. Yes. As a sidebar, every talk, pretty much, this decade has got to reference Log4j. This is my slide. Deal with it. It's not relevant. It will come out, hopefully soon. Everything, literally everything that exists around us depends upon it. And the community is brilliant at building some truly remarkable, very high quality things. But we must accept that the open source community is awful at packaging things up in this way for consumption, introducing needless abstractions. But enough of that. I'm definitely going to hell. You can send me all your hate in the mail.
Okay, happy place, Chris. Happy place. So where was I? Okay, yes. Through all of this, I can't possibly think of a faster way to go from enthusiastic engineers playing with the new, exciting, shiny tech to deeply unhappy ones trying to fix something at 4:00 a.m. And before they can do anything meaningful, they've got an orienteering exercise to switch mental context to whatever the intended permutation of things it is that they're meant to be looking at. Meanwhile, your business value delivering apps are all offline, or perhaps worse, breached.
Okay, so rewind a minute. We didn't want any of these things. How did we get here, and what can we do about it? Honestly? Bin it. Bin it all, kill it with fire, and then learn to love vanilla. Vanilla is great, and delicious, too. Does anyone remember KISS? No, not the band. Keep it stupid simple, or keep it simple, stupid. And embrace the shared responsibility model on offer.
Make your cloud vendors do more than just provide compute. Turns out, as it happens, they're actually not that bad at doing it. I'm not daft. I know it's not sexy and exciting, and you might even find recruitment a bit harder if you're used to hunting magpies who follow the shiny and don't like boring stuff that just works.
So, to answer the question posed by the title of my talk, is it time to put your Kubernetes cluster down? Yes, it is. And in the immortal words of S Club 7, if you can bring it all back, immutably from code, all without anyone noticing (I'm referring to the original version of those lyrics, of course), then maybe, just maybe, it can earn the right to stay, to die another day.
So I've been Chris Nesbitt-Smith. Thank you again for joining me today and enduring my self-loathing. Like, subscribe, whatever the kids do, on LinkedIn, GitHub, and you can rest assured there'll be no spam, or much content at all, since I'm awful at self-promotion, especially on social media. cns.me just points at my LinkedIn. talks.cns.me contains this and other talks, and they're all open source. Questions are very welcome on this or anything else. I'll be in the chat and on LinkedIn. If I'm not responding there, I need to go and have a sit down. Thank you very much.
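The contrast the talk draws between a renamed chart parameter and a direct patch of the deployment's replica count can be sketched in a few lines. This is only an illustration, not anything shown in the talk: the `database` Deployment, its labels and the `merge_patch` helper are hypothetical, and the helper implements plain JSON Merge Patch (RFC 7386) on Python dicts rather than calling Kubernetes or Helm.

```python
import copy


def merge_patch(target, patch):
    """Return a patched copy of target, following JSON Merge Patch (RFC 7386)."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target wholesale.
        return copy.deepcopy(patch)
    result = copy.deepcopy(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # null in a merge patch means "delete this key".
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


# Hypothetical Deployment manifest, as it might look after the chart
# silently fell back to a single replica.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "database"},
    "spec": {
        "replicas": 1,
        "selector": {"matchLabels": {"app": "database"}},
    },
}

# The patch names the real field in the stable apps/v1 schema, so there is
# no chart-specific parameter name that can change between versions.
patch = {"spec": {"replicas": 3}}

patched = merge_patch(deployment, patch)
print(patched["spec"]["replicas"])
```

The patch touches only `spec.replicas` and leaves every other field as it was, which is the property the talk contrasts with string templating.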
...

Chris Nesbitt-Smith

Consultant @ UK Government



