Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hey everyone, hi, I'm Pritam. I'm currently working as a backend developer at Uber. So today in my talk, I'm going to talk about dynamic resource allocation and how it is the most cost-efficient tool in cloud computing.
To go over the topics, we'll start with a brief introduction. We'll talk about autoscaling, how it works in the real world, what the cost savings are, and how we configure autoscaling systems. We'll also talk about future trends around serverless computing, the direction cloud computing is heading, and how dynamic resource allocation is going to play a pivotal role in it. And then we'll briefly touch upon the research opportunities that are open on this particular topic.
So to start with, what is dynamic resource allocation, and why is it so important in cloud computing? Computing is always resource intensive: it is heavy on various resources, whether that's data storage, memory for computation, and so on. Dynamic resource allocation is a technique that helps cloud providers manage these resources efficiently. What we mean by that is: depending on the computational needs of a particular service or application, resources are dynamically allocated to it, and when there is downtime, or the application isn't seeing heavy traffic, those resources are reclaimed from it.
To give you a better example, take an e-commerce application. Business hours are when it needs most of its resources, and when there is a fire sale or some special event happening, that's when the application requires maximum resources. During non-business hours, when the store is closed, you don't need much computational power at all. So based on this schedule, the cloud can allocate more resources during business hours, even more during certain events like the fire sale we mentioned, and reduce the resources, putting them to use elsewhere, when there is not much activity going on. This concept of reallocating resources on an as-needed basis has a twofold advantage: it helps the business serve its consumers more efficiently, and it helps the business reduce its costs. That's why this is one of the most important game-changing features in cloud computing.
So how does autoscaling work in today's world? Almost all cloud providers, like AWS, Google Cloud, and Azure, offer a wide range of autoscaling capabilities. There are multiple techniques here, and most of them can be broadly grouped into two categories: predictive scaling and reactive scaling. Predictive scaling, as the name itself suggests, forecasts what the resource usage will be at a given time. Like the e-commerce example we mentioned before: you can anticipate that you need the maximum resources during a sale event, a moderate number of resources during business hours, and the least during your downtime. Based on these predictions, we can pre-allocate the resources.
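To make that concrete, here is a minimal Python sketch of a schedule-driven predictive policy. The hour ranges and instance counts are invented for illustration, not taken from any real deployment:

```python
def predicted_capacity(hour, sale_event=False):
    """Return a target instance count for a given hour of day.

    All numbers here are illustrative assumptions: a sale event gets
    peak capacity, business hours get moderate capacity, and
    off-hours get a small baseline.
    """
    if sale_event:
        return 50          # maximum resources during a sale event
    if 9 <= hour < 18:     # business hours
        return 20
    return 4               # overnight / non-business baseline

# Pre-allocate according to the forecast for each hour of the day.
schedule = {hour: predicted_capacity(hour) for hour in range(24)}
```

The point is that the whole allocation plan is computed ahead of time from what we anticipate, rather than from live metrics.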
And reactive scaling, as the name suggests, is reactive: going with the same e-commerce example, when for whatever reason there is a sudden rush of business, instead of our systems going down or becoming unavailable, more resources are dynamically allocated to keep the business afloat and not ruin the user experience.
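A reactive policy, by contrast, watches a live metric and reacts to it. Here is a minimal sketch of such a rule; the utilization thresholds (80% to scale out, 30% to scale in) are made-up illustrative values:

```python
def reactive_decision(cpu_utilization, current_instances):
    """Decide a new instance count from current CPU utilization.

    Thresholds are illustrative: scale out above 80% utilization,
    scale in below 30%, otherwise hold steady.
    """
    if cpu_utilization > 0.80:
        return current_instances + 1   # add capacity before users notice
    if cpu_utilization < 0.30 and current_instances > 1:
        return current_instances - 1   # release idle capacity
    return current_instances           # within the comfortable band
```

Real systems layer more on top of this (step sizes, evaluation periods), but the core decision has this shape.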
The pioneers here are Netflix and Airbnb, and they were very effective. Netflix in particular was, and still is, heavily using dynamic resource allocation on AWS. Netflix predominantly uses predictive scaling approaches: based on the ton of data they have on their users and viewing patterns, they predict the usage of their services and schedule the resource allocation for those services with their cloud provider accordingly.

So, moving on: is it really useful? This is according to a survey AWS produced on their autoscaling. They took the customers they have and quantified the value autoscaling was able to produce, and they could clearly see that both average daily costs and monthly costs were reduced by 50% for their customers. That is a big testimony from one of the leading providers that autoscaling techniques are genuinely beneficial, both for cloud providers and for cloud customers.
Then, coming to how we configure these autoscaling systems: one of the key things for autoscaling is to have great observability and monitoring around it, because if you go with predictive scaling, which is more cost-efficient than reactive techniques, you need robust observability and alerting to identify when something is going wrong or the predictions are not meeting actual needs. Another major thing, and there are multiple case studies here from the early stages of autoscaling, is that we always need to be careful with thresholds: we should always set upper-bound and lower-bound thresholds when opting into autoscaling. The reason is that the upper-bound threshold stops the cloud provider from provisioning an abundance of machines, and the problem with over-provisioning is that the cost climbs quickly as more and more resources are added to the service; depending on the business, that can lead to a very big bill at the end. So the system should be configured with an upper limit so that the number of resources allocated is never more than a certain cap.
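The bounded-threshold idea can be expressed as a simple clamp around whatever target the autoscaler computes. The bounds below are placeholders you would set from your budget and your baseline load:

```python
MIN_INSTANCES = 2    # lower bound: never scale below the service's baseline
MAX_INSTANCES = 40   # upper bound: caps the bill no matter how big the spike

def clamp_target(desired):
    """Keep the autoscaler's desired instance count within budget bounds."""
    return max(MIN_INSTANCES, min(MAX_INSTANCES, desired))
```

Whatever scaling logic runs upstream, its output passes through this clamp before any resources are actually requested.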
And similarly, we need to maintain certain cooldown periods after heavy scaling activity. You don't want to immediately downscale to the lower volume right after the peak. Cooldown mechanisms have to be followed so that we don't lose any of the transactions that were happening during that time.
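A cooldown can be sketched as a timestamp guard: after any scaling action, further scale-ins are suppressed for a fixed window. The 300-second window is an arbitrary illustrative value:

```python
import time

class CooldownGuard:
    """Suppresses scale-in actions during a cooldown window after any scaling action."""

    def __init__(self, cooldown=300):
        self.cooldown = cooldown          # seconds to wait after a scaling action
        self.last_action_at = float("-inf")

    def may_scale_in(self, now=None):
        """True only if the cooldown window has fully elapsed."""
        now = time.monotonic() if now is None else now
        return now - self.last_action_at >= self.cooldown

    def record_action(self, now=None):
        """Call whenever a scaling action is taken, to restart the window."""
        self.last_action_at = time.monotonic() if now is None else now
```

In production you would typically also apply separate cooldowns to scale-out, but the asymmetry matters most on the way down, where premature downscaling can drop in-flight work.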
So configuring this autoscaling system is a careful balance of performance and cost. Like I mentioned, we cannot set a very high upper limit, because that would negatively impact the cost that needs to be paid to the cloud provider. That is the careful balance to strike during configuration: optimize for performance, but also budget the cost we are willing to pay the cloud for our services. And this is where machine learning integration is already playing a pivotal role, and in my opinion is also the future. Predictive analytics, when powered by machine learning models, can become very, very powerful: using the large amount of data accumulated over the years, machine learning models can come up with very accurate predictions of resource usage, and employing those predictions can produce a perfect, or near-perfect, schedule for autoscaling and resource allocation, optimizing the cost without compromising the performance of the services.
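As a stand-in for a real ML model, even a simple moving-average forecast over historical usage shows the shape of the idea: learn from past demand, predict the next interval, and size capacity to the prediction with some headroom. Everything here (window size, per-instance capacity, headroom factor) is an illustrative assumption:

```python
import math

def forecast_next(usage_history, window=3):
    """Forecast next-interval demand as the mean of the last `window` samples."""
    recent = usage_history[-window:]
    return sum(recent) / len(recent)

def capacity_for(prediction, per_instance_capacity=100.0, headroom=1.2):
    """Turn a demand forecast into an instance count, with 20% headroom."""
    return max(1, math.ceil(prediction * headroom / per_instance_capacity))

# e.g. requests/sec observed over the last few intervals
history = [220, 260, 300, 340, 380]
prediction = forecast_next(history)   # mean of the last 3 samples
instances = capacity_for(prediction)  # instances to pre-allocate
```

A trained model would replace `forecast_next` with something far richer (seasonality, viewing patterns, event calendars), but the pipeline around it, from forecast to capacity schedule, stays the same.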
Apart from this, autoscaling is going to become more and more important with where the cloud is heading. Now we hear this term serverless computing, which is a new paradigm in the cloud computing world. What do we mean by serverless computing? Your entire application, along with all the services and frameworks needed to run it, is not allocated to a single server. Docker is one of the great examples here: you create containers that package everything required to run and power your service, and those containers can be hosted on any of the servers. At a very, very high level, that's what we can think of as serverless computing. And as you can imagine, since we are not dedicating the services to a single server, autoscaling becomes even more important in this scenario: these services can be put on various servers as per the need, or replicated across various servers as per the need, thus providing greater service. So with the future going toward serverless computing, autoscaling is only going to grow in importance and is going to be a key feature of it.
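In a containerized world, the unit of scaling becomes the replica count rather than a physical server. A sketch of how a scheduler-style component might compute replicas from load, again with invented capacity numbers:

```python
import math

def desired_replicas(requests_per_sec, per_replica_rps=50,
                     min_replicas=1, max_replicas=20):
    """Compute a container replica count from traffic.

    `per_replica_rps` (how much load one replica handles) and the
    replica bounds are illustrative placeholders.
    """
    needed = math.ceil(requests_per_sec / per_replica_rps)
    return max(min_replicas, min(max_replicas, needed))
```

Because replicas can land on any available server, the orchestrator is free to spread or consolidate them, which is exactly why autoscaling and serverless-style deployment reinforce each other.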
And talking about the future trends: like I mentioned, reinforcement learning techniques hold great promise for autoscaling policy. Autoscaling is all about how we determine the scaling policy depending on our business needs and what the service is going to offer. Reinforcement learning techniques and machine learning models are only going to help us improve this decision-making process: as I mentioned before, they can churn through tons of data, evaluate it correctly, and produce schedules or suggestions on the configurations and so on.
And again, integrating cloud services with IoT and edge computing can enable real-time processing and resource management for time-sensitive applications. Another new paradigm happening in the cloud computing world is the concept of hybrid cloud, where an application is composed of various services, and each of those services could be hosted on a different cloud: you could have one of your microservices on AWS, another service on Google Cloud, and maybe your database on Azure, and all of these work together seamlessly to serve your application. In the hybrid cloud world, autoscaling strategies are also going to be very interesting.
And that leads us to the research opportunities. The biggest research opportunity, in my opinion, in this area of dynamic resource allocation and autoscaling is around these multi-cloud or hybrid cloud initiatives. In a hybrid cloud setup, when autoscaling happens on one cloud, you need similar autoscaling to happen on the other corresponding clouds as well; otherwise you are open to rate limiting and many other issues, which again cause degraded performance. That is a very active area right now, there is a lot of collaborative work between various cloud providers on it, and it would be a very interesting research opportunity. Also, one of the major challenges in cloud autoscaling is always the complexity: adding more resources doesn't automatically work for most services, because the added resources need to be consumed accordingly, and there is a lot of technical complexity around that area which needs to be addressed. Each application also needs to evaluate what type of autoscaling fits it and how autoscaling would impact its services. Data privacy and regulatory requirements are also future directions where the cloud companies, and the companies that use the cloud, are going to invest their research focus. And as always with any research in computer science, great collaboration is required between industry and academia to drive further innovation in this particular field of dynamic resource allocation.
So to conclude the talk, I want to again emphasize the importance of autoscaling and how it plays a pivotal role in the cloud computing landscape. It is going to be almost a fundamental feature, or aspect, of cloud computing going forward. Hybrid cloud is going to be the focus of a lot of the research and investment that is coming, and the use of machine learning models to come up with predictive scaling policies is, in the very near future, how we will be using AI and ML in dynamic resource allocation and autoscaling. This unlocks unprecedented levels of efficiency and scalability. So, like I mentioned again, this is going to be the most important cost-efficiency feature in the cloud computing landscape. Thank you everyone for attending the talk, and I hope you have a great day.