Conf42 Observability 2024 - Online

Dynamic Resource Allocation and Cost Efficiency in Cloud Computing: Empirical Insights and Future Directions

Abstract

Explore how auto-scaling transforms cloud computing efficiency and performance! Dive into real-world success stories from AWS, Netflix, and Airbnb, uncovering strategies that cut costs by 50% and enhance system responsiveness. Discover future trends like AI-driven scaling.

Summary

  • Today in my talk, I'm going to talk about dynamic resource allocation and how it is one of the most cost-efficient tools for cloud computing. We'll also talk about future trends, particularly serverless computing.
  • Dynamic resource allocation is a technique that helps cloud providers manage resources efficiently. Most of these techniques can be broadly categorized into two categories: predictive scaling and reactive scaling. How do we configure these auto scaling systems?
  • Serverless computing is a new paradigm in the cloud computing world. Services can be placed on various servers as needed, or replicated across servers. With the future moving towards serverless computing, auto scaling is only going to grow in importance.
  • The biggest research opportunity, in my opinion, in this area of dynamic resource allocation and auto scaling is around multi-cloud and hybrid cloud initiatives. Great collaboration is required between industry and academia to drive further innovation.
  • Auto scaling is going to be a fundamental feature of cloud computing going forward, unlocking unprecedented levels of efficiency and scalability. Thank you everyone for attending the talk and I hope you have a great day.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hey everyone, hi, I'm Preetham. I'm currently working as a back-end developer at Uber. Today in my talk, I'm going to talk about dynamic resource allocation and how it is one of the most cost-efficient tools in cloud computing. To go over the topics, we'll start with a brief introduction, then look at auto scaling, how it works in the real world, what the cost savings are, and how we configure auto scaling systems. We'll also talk about future trends, in particular serverless computing, what direction cloud computing is heading in, and how dynamic resource allocation is going to play a pivotal role in it. Then we'll briefly touch upon the research opportunities that are available on this topic.

So, to start with, what is dynamic resource allocation and why is it so important in cloud computing? Computing is always resource intensive: it is heavy on various resources, whether that is data storage, memory, or compute. Dynamic resource allocation is a technique that helps cloud providers manage these resources efficiently. What we mean by that is, depending on the computational needs of a particular service or application, resources are dynamically allocated to it, and when there is downtime or the application is not seeing heavy traffic, those resources are reclaimed from it. To give you a better example, take an e-commerce application. Business hours are when it needs most of its resources, and when a fire sale or a similar event is happening, that is when the application needs maximum resources. During non-business hours, when the store is closed, you don't need much computational capacity at all. Based on this schedule, the cloud can allocate more resources during business hours, even more during events like a fire sale, and reduce the allocation and use those resources elsewhere when there is not much activity going on. Reallocating resources on an as-needed basis has a twofold advantage: it helps the business serve its consumers more efficiently, and it helps the business reduce its costs. That's why this is one of the most important, game-changing features in cloud computing.

So how does auto scaling work in today's world? Almost all cloud providers, like AWS, Google Cloud, and Azure, offer a wide range of auto scaling capabilities. There are multiple techniques here, and most of them can be broadly categorized into two categories: predictive scaling and reactive scaling. Predictive scaling, as the name suggests, forecasts what the resource usage will be at a given time. Like the e-commerce example we mentioned before, you can anticipate that you need maximum resources during a sale event, a moderate number of resources during normal business hours, and the least during your downtime. That is an example of predictive scaling, and based on these predictions we can pre-allocate resources.
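To make that schedule-based idea concrete, here is a minimal sketch of predictive, calendar-driven scaling using boto3's Auto Scaling scheduled actions. The group name, capacities, and cron expressions are hypothetical placeholders for the e-commerce example, not values from the talk.

```python
# Sketch: schedule-based (predictive) scaling for a hypothetical e-commerce service.
# Assumes an existing Auto Scaling group named "ecommerce-web"; all numbers are illustrative.
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")

# Scale up before business hours start (Recurrence is a cron expression in UTC).
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="ecommerce-web",
    ScheduledActionName="business-hours-scale-up",
    Recurrence="0 13 * * 1-5",
    MinSize=4,
    MaxSize=20,
    DesiredCapacity=8,
)

# Scale down overnight, freeing resources for other workloads.
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="ecommerce-web",
    ScheduledActionName="overnight-scale-down",
    Recurrence="0 2 * * *",
    MinSize=1,
    MaxSize=4,
    DesiredCapacity=2,
)
```

A sale event would simply be another scheduled action with a higher desired capacity for that window.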
And reactive scaling is, like the name mentions, reactive. Going with the same e-commerce example, when there is a sudden rush of business for whatever reason, instead of our systems going down or becoming unavailable, more resources are dynamically allocated to keep the business afloat and not ruin the user experience. The pioneers here are Netflix and Airbnb, who have used these techniques very effectively. Netflix in particular was, and still is, heavily using dynamic resource allocation on their AWS cloud. Netflix predominantly uses predictive scaling approaches: based on their tons of data about users and viewing patterns, they predict the usage of their services and accordingly schedule the resource allocation for those services with their cloud providers.

So, moving on: is it really useful? This is according to a survey produced by AWS on their auto scaling. They looked across their customers and quantified the value auto scaling produced, and they could clearly see that both average daily costs and monthly costs were reduced by 50% for their customers. That is a big testimony from one of the leading providers that auto scaling techniques are actually beneficial, both for the cloud providers and for the cloud customers.

Then, coming to how we configure these auto scaling systems: one of the key things for auto scaling is that we need great observability and monitoring around it, because if, for example, you go with predictive scaling, which is more cost efficient than reactive scaling techniques, you need robust observability and alerting to identify when something is going wrong or the schedule is not meeting the needs. Another major thing, and there are multiple case studies from the early days of auto scaling here, is that we always need to be aware of the thresholds we set. We should always put upper-bound and lower-bound thresholds in place when opting for auto scaling. The reason is that the upper-bound threshold stops the cloud provider from provisioning an overabundance of instances, and the problem with provisioning too many instances is that costs grow very quickly as more and more resources are added to the service. Depending on the business, that could lead to a very big bill at the end, so the system should be configured with an upper limit so that the number of resources allocated never exceeds a certain level. Similarly, we also need to maintain cooldown periods after heavy scaling activity: you don't want to immediately scale back down to the lower volume right after a peak. Cooldown mechanisms have to be followed so that we don't lose any of the transactions that were in flight during that time. So configuring an auto scaling system is a careful balance of performance and cost. Like I mentioned, we cannot set a very high upper limit, because that would negatively impact the cost that needs to be paid to the cloud provider. That is the careful balance we should consider during configuration: optimize for performance, but also budget the cost we are willing to pay for our services in the cloud.
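As a rough illustration of the reactive side, with the upper and lower bounds and the cooldown period just described, here is a small self-contained sketch of a scaling decision loop. The metric, thresholds, and capacities are hypothetical; a real system would drive this from its monitoring and alerting pipeline rather than a standalone function.

```python
# Sketch of a reactive scaling decision with hard bounds and a cooldown.
# All thresholds and capacities are illustrative, not recommendations.
import time

MIN_INSTANCES = 2          # lower-bound threshold
MAX_INSTANCES = 20         # upper-bound threshold: caps the bill
SCALE_OUT_CPU = 75.0       # scale out above this average CPU %
SCALE_IN_CPU = 25.0        # scale in below this average CPU %
COOLDOWN_SECONDS = 300     # wait after any change before scaling again

def decide_capacity(current: int, avg_cpu: float, last_change: float, now: float) -> int:
    """Return the desired instance count given the current load."""
    if now - last_change < COOLDOWN_SECONDS:
        return current                           # still cooling down, do nothing
    if avg_cpu > SCALE_OUT_CPU:
        return min(current + 2, MAX_INSTANCES)   # react to a traffic spike
    if avg_cpu < SCALE_IN_CPU:
        return max(current - 1, MIN_INSTANCES)   # release unused capacity slowly
    return current

# Example: a spike at 90% CPU with 4 instances and no recent scaling change.
print(decide_capacity(current=4, avg_cpu=90.0, last_change=0.0, now=time.time()))  # -> 6
```

The cooldown and the asymmetric step sizes (scale out fast, scale in slowly) are one common way to avoid flapping right after a peak.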
And this is where machine learning integration is already playing a pivotal role, and in my opinion it is also the future. Predictive analytics, when powered by machine learning models, can become very powerful: using the large amounts of data accumulated over the years, machine learning models can come up with very accurate predictions of resource usage, and employing them would really help in coming up with a perfect, or near-perfect, schedule for auto scaling and resource allocation, optimizing cost without compromising the performance of the services.

Apart from this, auto scaling is going to become more and more important with where the cloud is moving. We now hear this term serverless computing, which is a new paradigm in the cloud computing world. What do we mean by serverless computing? Your entire application, along with all the services and frameworks needed to run it, is no longer tied to a single server. Docker is a great example here: you create containers which hold everything required to run and power your service, and those containers can be hosted on any of the servers. At a very high level, that is what we can think of as serverless computing. And as you can imagine, since we are not dedicating services to a single server, auto scaling becomes even more important in this scenario: services can be placed on various servers as needed, or replicated across various servers as needed, providing better service. So with the future moving towards serverless computing, auto scaling is only going to grow in importance and is going to be a key feature for it.

Talking about future trends, like I mentioned, reinforcement learning techniques hold great promise for auto scaling policy. Auto scaling is all about how we determine the scaling policy depending on our business needs and what the service offers, and reinforcement learning techniques and machine learning models are only going to help us improve that decision-making process. Like I mentioned before, they can churn through tons of data, evaluate it correctly, and produce schedules or suggestions for configurations. Again, integrating cloud services with IoT and edge computing can enable real-time processing and resource management for time-sensitive applications. Another new paradigm happening in the cloud computing world is hybrid cloud, where an application is made up of various services and each of those services could be hosted on a different cloud. You could have one of your microservices on AWS, another service on Google Cloud, and maybe your database on Azure, and all of these work together seamlessly to serve your application. In the hybrid cloud world, auto scaling strategies are also going to be very interesting.
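As a toy illustration of the ML-assisted predictive idea mentioned above, here is a sketch that learns per-hour traffic averages from historical observations and turns that naive forecast into a capacity schedule. A real system would use proper forecasting models and far richer data; the sample numbers and the requests-per-instance figure below are made up for illustration.

```python
# Toy sketch: derive a predictive scaling schedule from historical traffic.
# Each sample is (hour_of_day, requests_per_second); all values are illustrative.
from collections import defaultdict
import math

REQUESTS_PER_INSTANCE = 100.0   # assumed capacity of one instance (hypothetical)

def build_schedule(samples: list[tuple[int, float]]) -> dict[int, int]:
    """Map each hour of the day to the instance count implied by the mean forecast."""
    by_hour: dict[int, list[float]] = defaultdict(list)
    for hour, rps in samples:
        by_hour[hour].append(rps)
    schedule = {}
    for hour, values in by_hour.items():
        forecast = sum(values) / len(values)                 # naive mean forecast
        schedule[hour] = max(1, math.ceil(forecast / REQUESTS_PER_INSTANCE))
    return schedule

history = [(9, 450.0), (9, 520.0), (14, 900.0), (14, 880.0), (2, 40.0)]
print(build_schedule(history))   # -> {9: 5, 14: 9, 2: 1}
```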
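And for the hybrid cloud picture described above, here is a very rough sketch of what mirroring a single scaling decision across several providers might look like, so one cloud does not become the bottleneck for the others. The Provider class is a hypothetical stand-in, not any real SDK client.

```python
# Very rough sketch of fanning out one scaling decision to every cloud hosting a service.
# Provider is a hypothetical placeholder; real code would wrap each vendor's scaling API.
from dataclasses import dataclass

@dataclass
class Provider:
    name: str

    def set_capacity(self, service: str, instances: int) -> None:
        # In a real system this would call the provider's own scaling API.
        print(f"[{self.name}] {service} -> {instances} instances")

def scale_everywhere(providers: list[Provider], service: str, instances: int) -> None:
    """Apply the same scaling decision on every cloud that hosts the service."""
    for provider in providers:
        provider.set_capacity(service, instances)

scale_everywhere([Provider("aws"), Provider("gcp"), Provider("azure")], "checkout-api", 12)
```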
And that leads us to the research opportunities. The biggest research opportunity, in my opinion, in this area of dynamic resource allocation and auto scaling is around multi-cloud and hybrid cloud initiatives. In a hybrid cloud setup, when auto scaling happens on one cloud, you need similar auto scaling to happen on the other corresponding clouds as well; otherwise you are exposed to rate limiting and many other issues that again cause degraded performance. That is a very active area right now, there is a lot of collaborative work between the various cloud providers on this, and it is a very interesting research opportunity. Also, one of the major challenges with cloud auto scaling is always complexity: adding more resources doesn't automatically work for most services, the added resources need to be consumed accordingly, and there is a lot of technical complexity around that which needs to be addressed. Each application also needs to evaluate what type of auto scaling suits it and how auto scaling would affect its services. Data privacy and regulatory concerns are also future directions that cloud companies, and the companies using the cloud, are investing their research focus in. And, as always with any research in computer science, great collaboration is required between industry and academia to drive further innovation in this particular field of dynamic resource allocation.

So, to conclude the talk, I want to again emphasize the importance of auto scaling and how it plays a pivotal role in the cloud computing landscape. This is going to be a fundamental feature of cloud computing going forward. Hybrid cloud is going to be the focus of a lot of the research and investment that is coming, and the use of machine learning models to come up with predictive scaling policies is, in the very near future, how we will be using AI and ML in dynamic resource allocation and auto scaling. This unlocks unprecedented levels of efficiency and scalability. So, like I mentioned, this is going to be the most important cost-efficiency feature in the cloud computing landscape. Thank you everyone for attending the talk, and I hope you have a great day.
...

Preetham Vemasani

Senior Software Engineer @ Uber

Preetham Vemasani's LinkedIn account


