Transcript
This transcript was autogenerated. To make changes, submit a PR.
Let's go ahead and dive into what
cost and resource optimization actually is. We're going to be diving into the
theoretical piece of it first, and then we're going to get a little bit hands
on and see a couple of tools that can help us with this. All right,
so, first things first. Why should engineers care about saving
money? Don't we have unlimited resources? Well,
no. Well, yes and no. Let's just dive into that.
First things first: why should engineers care about money? Now,
it's actually funny and ironic to me when FinOps and the whole
cost optimization thing started coming out, because I can
remember going back ten plus years as
a sysadmin, even in the help desk space when I was setting up desktops
and laptops and stuff like that, we always had to care about the
finances. In fact, we typically worked pretty closely
with the finance teams to say, okay, we have this budget.
We can't buy this. No, we have to change this spec, et cetera. So it's
actually ironic, because a lot of IT already works with finance teams.
And funny enough, there are a few,
I wouldn't say a lot, but there are some organizations where
the director of IT, for example, will report to
the CFO. I've seen that a couple of times. So they definitely work in
conjunction pretty well. And I would say it
has always been this thing where IT didn't make money, it always spent
money. I think that mindset is definitely shifting in
today's world, especially since it's obviously technology driven.
Of course it is. I guess I'm dating myself here. I have some gray hair
on the sides, if everybody has already noticed that. So, yes,
we should care about money. We should care about saving costs.
But I want to be very clear here. Even though
we want to care about saving money, we should not do that
if performance is degrading. Now, what do I mean by
that? For example, let's say you have three worker nodes running and
you want to save money. So you're like, oh, I'm going to scale down to
one. No, don't do that, because it's going to mess up performance on the
cluster. So you want to make sure that you're saving money, but at the same
time, you don't want to be saving money if it's messing
up your environment. Now, the next thing is,
don't we have unlimited cloud resources? Well, on prem, we know that we don't
have unlimited resources. If you want an extra server,
you got to go get one. You got to talk to your reseller. It's got
to get shipped, it's got to be configured, it's got to be put in the
data center, operating system, yada yada, blah blah blah. But in the cloud, no,
we still don't have unlimited resources. There are limits in regions,
there are caps. I forget exactly what region
it was, but I would say about a month ago or so,
I believe it was the Azure storage account service, it ran out
of storage. So in one of the regions you
couldn't create a new storage account or
add stuff to a storage account. So yeah, no, we don't have unlimited resources.
So when it comes to resource optimization, really what we care about
here is to ensure that what's running is needed,
is necessary. For example, if you have 20 worker
nodes and you've never had to use more than six,
well you probably don't need an extra ten. You should keep
an extra couple around just for scalability purposes, just in case
you have a spike and it goes up to seven or eight. But from a
resource optimization perspective, you want to ensure that what you're using
actually makes sense in your environment. Because if it doesn't, whether it's from
an application perspective, whether it's from a cluster perspective, whether it's from a
network, from a storage perspective, if you're just spending money and
overallocating resources or underallocating resources
for no reason, you're going to have a problem there.
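As a rough illustration of the 20-node example above, the right-sizing arithmetic can be sketched in Python. The function names and the two-node headroom default are assumptions made for this sketch; they are not taken from any particular tool.

```python
def recommended_nodes(peak_nodes_used: int, headroom_nodes: int = 2) -> int:
    """Return a node count that covers the observed peak plus a small buffer."""
    if peak_nodes_used < 0 or headroom_nodes < 0:
        raise ValueError("node counts must be non-negative")
    return peak_nodes_used + headroom_nodes


def surplus_nodes(provisioned: int, peak_nodes_used: int, headroom_nodes: int = 2) -> int:
    """Return how many provisioned nodes exceed the recommendation (never negative)."""
    return max(0, provisioned - recommended_nodes(peak_nodes_used, headroom_nodes))


if __name__ == "__main__":
    # Numbers from the talk: 20 nodes provisioned, never more than 6 in use.
    print("recommended:", recommended_nodes(6))
    print("surplus:", surplus_nodes(20, 6))
```

The headroom is kept as a separate parameter so the buffer for spikes stays an explicit decision rather than a side effect of overprovisioning.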
Now, speaking of underallocating resources,
that's where scalability can come into play for both overutilizing
and underutilizing. So from a resource optimization perspective,
with scalability, I feel like we kind of always go to the, like we
need to scale up. We need to scale up. Yeah, we need more nodes.
Yeah, we need the ability to scale up, et cetera. But there's also the thing
of scaling down, and you want to be able to scale down as well.
That's arguably just as important as scaling up because guess what?
Maybe you're in peak season, maybe you're an e-commerce site,
Cyber Monday, got to scale up. Maybe you need an extra two,
three worker nodes. But guess what, six months out of
the year, eight months out of the year, you don't need those two extra worker
nodes. So because of that, you want to scale those things back down.
Otherwise you're spending money for no reason. So cost and resource
optimization both kind of come into play with each other. Now, speaking of
cost optimization, don't spend if you don't have to.
That's arguably the biggest thing that I'll say, don't spend unless
you absolutely have to. There's no reason for it,
you're going to lose budget, people are going to be angry,
all that fun stuff, nobody wants to deal with it. So when it comes to
cost optimization, ensure that what you're spending makes sense.
Ensure that your resources are optimized. Because guess what? If your resources
are optimized, cost optimization is pretty much just doing its
thing in the background anyways. So you're good there.
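To put a rough number on what idle capacity costs, here is a minimal Python sketch of the seasonal example above. The hourly price, the 730-hour month, and the function name are all hypothetical values chosen for illustration; real cloud pricing varies by provider, region, and instance type.

```python
HOURS_PER_MONTH = 730  # assumption: rough average number of hours in a month


def idle_node_cost(extra_nodes: int, idle_months: int, hourly_price: float) -> float:
    """Estimate the spend on nodes that stay up during months they are not needed."""
    if extra_nodes < 0 or idle_months < 0 or hourly_price < 0:
        raise ValueError("inputs must be non-negative")
    return extra_nodes * idle_months * HOURS_PER_MONTH * hourly_price


if __name__ == "__main__":
    # From the talk: two extra worker nodes that sit idle eight months of the year.
    # The 0.10 per node-hour price is a made-up figure for the sketch.
    wasted = idle_node_cost(extra_nodes=2, idle_months=8, hourly_price=0.10)
    print(f"Estimated idle spend: ${wasted:,.2f}")
```

Scaling those nodes back down after peak season is what turns this estimate into money saved.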
All right, so now there are various tools in this space.
There's CAST AI for cost optimization, resource optimization, there's StormForge,
there's Sosivio, there are even cloud-specific tools in AWS
and GCP and Azure for all of this cost and resource optimization.
Now we can't go into every single tool here, but I want to pick
out two for you, Sosivio and StormForge. And we're going
to see what both look like because one is more of
a managed service, like a SaaS in a sense, and then with the other one,
you're actually managing it yourself. So let's kind of see how both of
those work here and we'll dive into our Kubernetes cluster. So the first that we'll
take a look at is Sosivio. So what you want to ensure is you want
to have at least two nodes running, right?
So the first thing that you're going to want to do is you're going to
want to go to the download page and then what's going to happen is you're
going to get an installation based on your operating system.
So there are installations for Mac, Linux boxes,
Windows, et cetera. Right. So I'm on a Mac.
So I've actually already brought the installation down. But what I can do is
I can untar it and then I can actually run the installer. So if
I cd into cost and resource optimization,
I see that I actually have that installer right here.
So I'm literally just going to go ahead and run it.
As we can see, we get some terminal output, we have the ability to
choose where we're running. So in this case I'm on an AKS cluster,
but if you're not, totally fine, of course.
Next I'm going to choose my cluster name.
All right, we'll use the default. Now in production
you're going to want to set your domain suffix, but in this case this is
a demo environment so I don't care, I'm just going to do example.
We're not going to hit it from that domain anyways. Of course. And what we're
going to see here is it's going to do the full installation,
it's going to connect to the environment and then we're going to have the ability
to see it via the UI. So let's go ahead and just give this a
few minutes here and we can see that that was installed here. So what we're
going to do is we're going to use the kubectl port-forward command.
All right? And then we're going to go and we're going to hit this URL.
All right? And then if I just go back to VS Code here really quick,
this is the password that we're going to use to log in for the first
time. So admin and password.
All right. And then if I zoom in a little bit, we can now see
that Sosivio is installed. Now, again, I want to just
point this out here. This is a tool that you're
managing yourself. It's not SaaS, it's not managed for you. It's really
awesome and I love it and it's great and it has a lot of capabilities
as we can see here. But you do have to manage it yourself.
So definitely do just keep that in mind when you're getting this thing
up and running. All right. Now the next tool is StormForge,
and this is going to be a tool that's more
SaaS based. So you're just going to log into a portal. So I'm going
to go ahead and type in my environment name.
All right? And as we can see here,
we're going to go ahead and we're going to copy those
Helm values. What I'm going to do is I'm going to go to VS Code
here, I'm going to create a values.yaml
file. All right, I'm going to paste it in,
I'm going to click continue and then I'm going to go ahead
and I'm going to install
via Helm. Now I am going to make
this change here because I just called the values file values.yaml.
All right, let's go ahead and run that and then we'll
wait for our Helm chart to install all
the way. It was deployed, but I'm sure there are
some resources still coming up.
Let's go ahead and check that. Oh,
sorry, stormforge-system.
All right. And as we can see, pods are still initializing and all that fun
stuff. So it'll probably take a few minutes and then we'll be able to
see everything in the portal, but we can just do a verify install
here really quick. All right, we can see that that was installed successfully.
Maybe it took like 15 seconds or so, 10 to 15 seconds.
So we'll click finish here, and then as we can see,
here is our portal. So we have everything from what's currently
being used, what we can optimize, our
cluster information, the efficiency around our clusters
and around our namespaces. Again, total current request,
total optimized request. There's nothing going on here because this is just a demo
cluster, right? But we can see all of our information here
based on cluster, based on namespace, which is really
cool, and then based on workloads that are running. We also
have this Optimize Pro capability, which
is more of a paid piece here. StormForge is a paid
tool in general, and then we can
click on that performance button and we can create some new performance
testing, which is pretty cool. It's like benchmarks if we
want to. All right? And those are two tools that we
can use in cost and resource optimization to ensure that our
environments are running as expected. Again, we have one tool,
StormForge: SaaS, managed for you, costs
money. And then we have Sosivio: we can use it out of
the box, but we do have to manage it ourselves. And with that,
thank you so much for joining me today, really do appreciate it and I hope
that you enjoyed the session.