Conf42 DevOps 2023 - Online

Autonomous Multi-Cloud Serverless Deployment and Optimized Management

Abstract

This presentation describes how to implement Multi-Cloud native strategies using an advanced open source framework that allows for Cloud-agnostic Multi-Cloud deployment and optimized management of serverless applications. The framework is based on flexible monitoring, context-aware maximization of the application owner’s utility of the deployed serverless components, and autonomic reconfiguration based on the application’s current execution context.

Summary

  • An open source framework that enables full deployment of your app using different cloud providers. The platform itself is called Melodic, yet today I will also use the word Morphemic. Morphemic gives some additional features to Melodic that I will describe later.
  • The app is called Genome. It is a real case scenario that was developed in collaboration with the University of Białystok. There are multiple large data sets of both genomic and genetic type. The first deployment step is fetching offers from cloud providers. After the constraint problem is generated and solved, the deploying step starts all the virtual machines.
  • Grafana enables us to see the details of the app that we have deployed. Here in Genome we have six metrics that I will later discuss in the different views in Grafana. Based on those metrics, Melodic increases the number of workers to make sure that the computations are done on time.
  • Morphemic is an extension of Melodic. One of the additions is called the forecasting model. Using the forecasting model, we can predict the values of metrics and start a reconfiguration before it is needed. With such a tool, you can improve your application.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hello everyone, and welcome to this presentation about how to go multicloud with your application. Today I will show you an open source framework that enables full deployment of your app using different cloud providers. What is important is that it is not only about deployment. I will also show you how the platform itself can enhance your application, so once it is already deployed, how we can optimize it and improve its current configuration. The platform itself is called Melodic, yet today I will also use the word Morphemic, which is a natural extension of Melodic. Both projects are open source. Morphemic gives some additional features to Melodic that I will describe later. I don't want to go too much into the details of how to use the platform, but rather to show you the main key features and how our application can benefit from using Melodic.
Here you can see the main dashboard, so we could go straight to starting the deployment. Yet before that I want to show you the credentials that we need to set. As I mentioned, the important part of multicloud is that we can use different cloud providers, and here you have a list of different credentials. I have one of the most popular providers, AWS, and also OpenStack, which is the University of Oslo cloud provider. I need to specify all the details here simply to be able to communicate with these clouds. So for AWS, which we will use today, we need to put in the credentials of our account, and the same goes for OpenStack. This is pretty much all for the initial setup.
We then go to the deployment part, where we can select and start our application. Here I will choose the XMI file that I will describe in a few seconds. Give me a few seconds, I need to find it and upload it. And here is the important part. As you can see, I am uploading a file in XMI format. It is a converted file that has been described in the CAMEL model.
The CAMEL model is based on Terraform. It is a totally independent language that is fully cloud agnostic, and it is really important that we can configure our application in this language, because at this point we are not thinking about a specific cloud provider. In this file there are many settings that can be set, many important parts that are later used for improving the application. These are, for example, the prerequisites or the hardware requirements of the machines that will be set up, but also the metric system that will be used. It can sound a little difficult, yet we have a GUI modeler called Camel Designer that lets you use drag and drop and other much easier, more visual ways to do it.
Here you can see I need to put in additional credentials. These are requested by the application itself, as it will access an S3 database. So this is not needed by Melodic. It is just additional input for the application itself, along with some other variables. Okay, I don't want to go too much into the details of this file and how it was created with CAMEL, yet it lets you define many different aspects of the application that are important for improving it later.
The next step is just choosing the credentials that I defined earlier. I can move forward, and then the last part is just pushing the big green button. As you can see, it is quite simple. Starting the deployment is extremely easy. The main part is all the definitions inside CAMEL, yet as I mentioned, this is much easier to do with Camel Designer.
Here are the main steps of the deployment that are run by Melodic. I will go into the details of each of these boxes in a moment, as I believe it is important to understand what Melodic is actually doing to find the best solution and to optimize the application. But before that I want to show you a small diagram of the application we are actually using. As you have seen before, it is called Genome.
Here is a small diagram of the app. I want to point out that Genome is not just a test application. It is a real case scenario that was developed in collaboration with the University of Białystok. The idea is that there are multiple large data sets of both genomic and genetic type. The algorithm is based on feature selection on binary variables, and it uses an exhaustive search of the entire space of tuples. So it is extremely time consuming and extremely resource consuming. Yet as there are multiple simulations, multiple computations to be performed, it is quite easy to parallelize and optimize, which makes it a perfect example for this platform.
Here you can see the diagram of the initial setup that I mentioned before. We have two components: a master component and multiple workers. The master is based on Spark master, a tool used for distributing computations on large data sets, for big data computations. So we have a Spark master that schedules jobs for the workers, and Melodic handles this setup. It is up to Melodic to determine how many workers are needed, and it also communicates with both the master and the workers. Let me minimize this. That will become visible during the explanation of each of the steps that Melodic performs.
Okay, as you can see, we have five steps. It will still take a few minutes until the deployment is finished, so we have a few minutes now to discuss them. The first one is fetching offers. This is the first moment when Melodic communicates with the cloud providers. At this point it gets the information about what hardware, what machines, and what software, for example which operating system, each cloud provider offers. So this is the first step: we communicate with the cloud.
There is an important step, which is called generating the constraint problem. Inside CAMEL you can put many constraints. You don't need to put hard-coded specific values. You can, for example, put limits on them, like a minimum or maximum RAM value for a component, and many others. You can also put in other variables, your custom ones. From these, the constraint problem is created that will later be solved. So this is, let's say, a preparation step.
The main part is done in reasoning, where a solver finds a solution to the constraint problem created earlier. So based on your requirements, the constraints you put on your application, Melodic finds the best solution here. It finds the specific values of those variables: the specific values of RAM, hardware, or other variables defined by you that meet your needs.
I mentioned that it finds the best solution, and I think it is important to say what "the best" means. We of course want to optimize cost and performance, but it is up to the user, up to you, how you define what is called the utility function. The utility function simply tells us how good a solution is, or pretty much what we want to optimize. And this function can be really sophisticated. It is not just a performance-to-price ratio. It can be really complicated. Melodic also comes with an additional tool that lets you create such custom functions graphically. It totally depends on the application what you really want to optimize.
At this step we are using a function that is based on both price and performance. We want the best solution: to pay the least possible, but still get all the computations done by the workers. At this moment we don't have as much data as we would wish. It is hard for us to say what the performance of the current solution is, as it is not yet deployed.
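To illustrate the reasoning step described above, here is a minimal Python sketch of a constraint problem solved against a utility function. It is not Melodic's solver or API. The price table, the RAM and worker limits, the linear performance model, and the names `estimated_minutes`, `utility`, and `solve` are all assumptions made up for this example. Only the 576 simulations and the 60-minute limit come from the talk.

```python
from itertools import product

# Hypothetical offers fetched from a provider: RAM size in GB -> price per worker-hour.
PRICE_PER_HOUR = {4: 0.05, 8: 0.09, 16: 0.20}

# Hypothetical constraints of the kind a CAMEL model could express.
MIN_RAM_GB, MAX_RAM_GB = 4, 16
MIN_WORKERS, MAX_WORKERS = 1, 10


def estimated_minutes(workers: int, ram_gb: int, simulations: int) -> float:
    """Assumed performance model: a 4 GB worker finishes one simulation per minute,
    and throughput scales linearly with RAM and with the number of workers."""
    return simulations / (workers * (ram_gb / 4))


def utility(workers: int, ram_gb: int, simulations: int, deadline_min: float) -> float:
    """Toy utility: 0 when the deadline is missed, otherwise higher for cheaper setups."""
    if estimated_minutes(workers, ram_gb, simulations) > deadline_min:
        return 0.0
    cost = workers * PRICE_PER_HOUR[ram_gb] * (deadline_min / 60)
    return 1.0 / (1.0 + cost)


def solve(simulations: int, deadline_min: float) -> tuple[int, int]:
    """Search every (workers, ram_gb) pair allowed by the constraints
    and return the pair with the highest utility."""
    candidates = [
        (workers, ram_gb)
        for workers, ram_gb in product(range(MIN_WORKERS, MAX_WORKERS + 1), PRICE_PER_HOUR)
        if MIN_RAM_GB <= ram_gb <= MAX_RAM_GB
    ]
    return max(candidates, key=lambda c: utility(c[0], c[1], simulations, deadline_min))
```

With these made-up numbers, `solve(576, 60)` picks five 8 GB workers, the cheapest combination that still meets the deadline under the assumed performance model. A real utility function would be defined by the application owner and fed with measured metrics rather than a fixed formula.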
So at this moment, a quite simple one is chosen. We start with one master node and one worker. That is most likely the initial setup. And as we can see, the cardinality of the worker equals one and the cardinality of the master equals one as well.
Okay, the next step is deploying. This is the part where we communicate with the cloud provider to set up all the nodes, to start all the virtual machines. It is also important that this is not just telling, for example, AWS to start the machines, as Melodic also communicates with those machines. There is a need, for example, to install some software. So if your application uses Java, Java will be installed there. And also, which is important, a connection is established to gather the metrics. Inside CAMEL, as I mentioned before, you can define the metric system. So you pretty much put in the names of the metrics that you want and how they are gathered. You can use some default metrics, simply expressing that you want, for example, a price metric, a RAM usage metric, or a CPU usage metric. But they can also be totally custom ones, like the workers simply sending some data about themselves. Here in Genome we have six metrics that I will discuss later in the different views in Grafana.
And as you can see, the deployment is finished. So I think this is the moment when I can navigate to Grafana. Grafana is a monitoring application that lets us see the details of the app that we have deployed, in this case Genome. Let me log in once more. Okay, I will navigate to the Genome dashboard. Here are the details of our app, the Genome application. Just a reminder that we have multiple computations to be performed by the workers, and they are scheduled by the master node. The number of simulations at the beginning is, as far as I remember, almost 600.
So we start with 576 computations and, which is important, we also put a time constraint inside CAMEL. We want all those computations to be performed within 60 minutes. As you can see, here is the time left indicator. So we want all of them to be performed on time. Yet, as I mentioned before, we also want to optimize the cost. So we are not just setting up many machines to finish every simulation pretty much instantly. We want to have it done within this time limit with regard to the cost of the entire setup. Okay, so here you can see the number of instances. We start with two, which is one master and one worker, and the roughly 600 simulations that have already started to be performed by the one worker.
Now we can smoothly go to the part about reconfiguration, which is extremely important. So far you have seen how to deploy an application in multicloud, which I think is already quite a lot. Yet what is being done now is really important. We have multiple metrics here indicating the simulations, the performance, and the current setup. Based on those metrics, Melodic can decide whether there is a need to reconfigure. In this case, as I mentioned before, that would mean increasing or decreasing the number of workers. And this decision is made totally automatically. As you may remember, from the moment that I pushed the big green button, there has been no need to interfere at all. Melodic makes sure that the performance is fine, that the cost is fine, that everything is, at this specific moment, in the best possible configuration.
And as you can see, this is actually what will be done here, as this red light indicates whether the current constraints are met. In this case, we have the estimated time and, as you can see, the estimated time is higher than the time left.
So at this moment, the current solution would not finish all the computations on time, and Melodic will reconfigure. Based on those metrics, Melodic will increase the number of workers to make sure that the computations are done on time. I think I can navigate again to the Melodic view. Oh, it is already done. So the first reconfiguration has been done, and I will just explain what happened here. Now it is the second one, as it is possible to have multiple reconfigurations.
In this view we can see what is currently happening. As you can see, we have the reconfiguration process, which is three boxes. At the end we will have a solution that better meets our needs. What is important to say here is that those boxes are pretty much the same as the ones you have seen during the initial deployment. So reasoning is what I told you before: it is finding the values of the variables, finding the best solution based on the utility function. The difference here is that we have much more data than we used to. During the initial deployment, we didn't know much about the metric values that are also needed to find the best solution. Here we have all that data. So based on those metrics, based on all the data that we have, Melodic is improving, setting up a better configuration. And deploying is again the communication with the cloud provider, so for example setting up new machines, deleting old ones, or any other kind of communication with the cloud provider.
I will go back to Grafana. I assume we will see an increase in the number of workers in a few minutes. It takes a little time. Here we can see the solution that has been found. The cardinality of the master is one, as that is also the limit put inside CAMEL, and the worker cardinality has increased to two. So Melodic made the decision to start up a second machine, a second worker. Okay.
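The reconfiguration decision described here, comparing the estimated time against the time left and changing the worker count, can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not how Melodic computes it: the function names, the measured `minutes_per_simulation` value, and the `max_workers` limit are hypothetical, and the work is assumed to split evenly across workers.

```python
import math


def estimated_minutes(remaining: int, workers: int, minutes_per_simulation: float) -> float:
    """Estimated time to finish, assuming simulations split evenly across workers."""
    return remaining * minutes_per_simulation / workers


def workers_needed(remaining: int, time_left_min: float,
                   minutes_per_simulation: float, max_workers: int = 10) -> int:
    """Smallest worker count whose estimated time fits into the time left,
    clamped to the range 1..max_workers (an assumed cardinality limit)."""
    if time_left_min <= 0:
        return max_workers
    needed = math.ceil(remaining * minutes_per_simulation / time_left_min)
    return max(1, min(needed, max_workers))
```

For example, with 570 simulations remaining, 55 minutes left, and a measured 0.18 minutes per simulation, one worker would need about 103 minutes, so `workers_needed(570, 55, 0.18)` returns 2. The same function also suggests scaling down when the estimate falls well below the time left.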
And here you can see the diagram indicating the cardinality of the workers. As you can see, the estimated time has also decreased. I assume that in a few seconds we will see that the current solution is just on the spot, but I guess it also needs a few seconds to process everything. Ah, so the estimated time has decreased, and now it is lower than the time left. So at this moment it is a totally sufficient solution. Perfect, you can see the green light, as this is currently the best solution and all the constraints are met.
Keep in mind that we don't have enough time to wait until all the computations are performed. We still have around 500 of them. Yet also keep in mind that it is very possible that we will have many reconfigurations. For example, if the difference between the estimated time and the time left becomes too large, it indicates that it is possible to optimize the cost, so it is very possible that one of the workers would be stopped. Or if some of the computations take longer to perform, which is totally possible, there might be an increase in the number of workers. And as I mentioned before, it is done totally automatically. You can see I haven't interfered with the application at all from the moment I started it.
Okay, so this is the main part of Melodic. I also want to go a little bit into the extension of Melodic, which is Morphemic. There are quite a lot of additions, but one of them is called the forecasting model. As you can see here, there are a few diagrams indicating the values of the metrics. You can imagine that we have a time limit, or simulations to be performed, that span weeks or even months, and the platform would handle it.
Yet with the forecasting model, we can predict the values of those metrics. At this moment, when I have just started, there is still not sufficient data for the forecasting model to forecast anything. So I will go to another deployment, which is also Genome, but with much more data and a somewhat bigger time constraint. I will switch to this one so that you can have a look at the forecasting model, which is pretty much all those colorful lines. As you can see, it has been quite a bit bigger example. We started with almost 800 simulations, and I had to start it at the beginning of the day, as it takes around seven to eight hours to finish all of them. Those few hours are needed to gather enough data to get predictions, and the forecasting model learns from that data to perform predictions.
What is used here for forecasting is quite a lot of different machine learning models. We have some that have won competitions in time series forecasting. So for example, we have ARIMA, TFT, N-BEATS, and a few others like CNN or Prophet. Yet, which is important, you can see that a few of them are not very accurate. The forecasting model is quite complex. It is not just one model being taken into account. There is also a model that merges and ensembles them to make sure that we always have the best predictions.
But why do we actually want to predict those metric values? The point is that, based on the predictions, we can look into the future and start a reconfiguration in advance. In this case, maybe it is not that visible what improvement we can get.
But imagine, for example, a totally different application where we have multiple nodes handling requests from users, and based on those metrics we can predict when there will be an increase in users. An easy example would be that we have more requests after 5:00 p.m. After a few days, or even sooner, Morphemic would learn that such an increase happens and would start the reconfiguration even before the increase is seen. So we would not be adapting the platform to the current situation. We would adapt the application just before it is needed. Based on the predictions of, for example, an increase in users, your application would be ready. You would have enough nodes to handle all the requests that appear after 5:00 p.m.
Okay, so this is an example of the forecasting model, which is an addition from Morphemic. There are many more, and a few of them are under development. For example, there is the ability for the platform to also improve the architecture of the current solution, for example to change a node from CPU to GPU usage, to make sure that we always have the best configuration and the best architecture. To learn more, you can go to the websites of Morphemic and Melodic, which is already in use.
And that is all for my presentation today. Thank you all. I hope you enjoyed it and could see how, with such an accessible tool, which is open source so everyone can contribute to it or use it, you can improve your application. Thank you very much.
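The 5:00 p.m. example from the talk can be sketched as a tiny proactive scaler in Python. This is a deliberately naive stand-in for Morphemic's forecasting models (it just averages the same hour over previous days), and the data layout, the per-node capacity, and the function names are hypothetical choices made for this illustration.

```python
import math
from statistics import mean


def forecast_for_hour(history: list[dict[int, int]], hour: int) -> float:
    """Naive seasonal forecast: average the requests seen at this hour on previous days.
    `history` holds one dict per day, mapping hour of day -> request count."""
    return mean(day[hour] for day in history if hour in day)


def nodes_needed(requests_per_hour: float, capacity_per_node: int) -> int:
    """Nodes required to serve the load, with at least one node always running."""
    return max(1, math.ceil(requests_per_hour / capacity_per_node))


def plan_next_hour(history: list[dict[int, int]], current_hour: int,
                   capacity_per_node: int) -> int:
    """Decide now how many nodes the predicted load of the next hour will need."""
    next_hour = (current_hour + 1) % 24
    return nodes_needed(forecast_for_hour(history, next_hour), capacity_per_node)
```

Given three days of history with about 1,000 requests at 4 p.m. and about 4,000 at 5 p.m., and an assumed capacity of 1,000 requests per node, calling `plan_next_hour(history, 16, 1000)` at 4 p.m. returns 4, so the extra nodes can be started before the increase arrives rather than after it is observed.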

Maciej Riedl

Software Developer @ 7bulls.com


