The Future of the Cloud is WebAssembly

Abstract

This talk explains the hype behind WebAssembly and why it may be the future of the cloud. The talk includes:

What Wasm is, how it works, and why it's suited for serverless
How one can get started with Wasm using open source frameworks
Live demo

Summary

Sohan Maheshwar is a dev advocate at Fermyon. He talks about why the future of the cloud is WebAssembly. He says containers are too expensive and over-consume resources. He also says serverless has a cold start problem.
The next wave of cloud compute will be powered by WebAssembly, specifically serverless WebAssembly. WebAssembly was developed in the mid-2010s by a group of companies working in the browser and front-end space. Another cool thing about WebAssembly is that it's security sandboxed by default.
Spin is the open source tool for building WebAssembly serverless apps. It supports 15-plus languages. With three commands, we can build and test a Spin app locally, and with a fourth command we can deploy it to the cloud.
Everyone said 2023 was going to be the year of WebAssembly. You can pack dense workloads, with multiple functions running on the same piece of hardware. The key to success for cloud computing is multitenancy.
SpinKube is a completely open source project with contributions from companies like Microsoft, Liquid Reply, SUSE and Fermyon. It gives you hyper-efficient serverless on Kubernetes, completely powered by WebAssembly. With the enterprise offering built on it, Fermyon Platform for Kubernetes, you can run 5000 serverless apps on one Kubernetes node.

Transcript

This transcript was autogenerated. To make changes, submit a PR.

Hi everyone, my name is Sohan Maheshwar. I'm a dev advocate at Fermyon and I'm here to talk to you about why the future of the cloud is WebAssembly. Now, if you haven't heard of WebAssembly before, no problem. That's what I'm here for. So let's dive straight in.
I've worked in cloud computing for a while. My previous role was at Amazon Web Services, or AWS, so I've really seen the evolution of cloud computing. I'm old enough to remember the pre-cloud days, when there was a server room in every office which had air conditioning. It was locked, heavy security, no one was allowed to enter. It was a different time. And then finally, with the release of things like EC2, we actually saw virtualization and widespread adoption of virtual machines. But we as developers still had to take care of things like the kernel, the drivers, the operating system, the utilities, and of course the business logic that made up your app that ran on this virtual machine. Containers took that a step forward in removing some of that effort. You still had to configure things like Docker or Kubernetes for orchestration while still looking at things like your OS and utilities and building your app. Of course, the latest and greatest in computing is serverless, where you can solely focus on your business logic and the platform really takes care of the rest.
So I'd assume most of us right now are at the containers or the serverless stage. And with the current state of containers and serverless, there are a couple of problems that really plague the ecosystem. One is that containers are just too expensive. They over-consume resources. We do know that containers can get complicated, but teams have built out large platform engineering and DevOps teams to be able to run these containers.
But the thing is, when we're provisioning for a container or for a workload, we actually look at what the peak of that consumption is, even though our average is much lower. So as a result, we end up over-provisioning containers all the time. And these containers sit idle. They're running, they're consuming electricity, so they're not sustainable, they're also costly, and they leave systems idling. And this report from Andreessen Horowitz actually estimates about 100 billion of market value being lost purely because of this, which is pretty insane.
Secondly, app configs are complex. I think we know this. If you're a small company, you don't necessarily need a huge Kubernetes orchestration all the time, but this gets more and more complex the bigger your app becomes. And honestly, modern apps do comprise a large number of frameworks, language dependencies and libraries, all of which need to be shipped alongside your containers and your code in the cloud. So this just increases the complexity, and the portability and simplicity of your workload or your code base are often sacrificed so that you can support a variety of architectures and platforms.
And specifically with serverless, one big problem that we see and hear from customers is that serverless does have a cold start problem. So roughly how serverless works is an event occurs. This event basically pings some compute resource that's running in the cloud, say AWS Lambda, Azure Functions, whatever. So there is a finite amount of time for that function running in that compute in the cloud to start up, execute the program and send a response back. And that is called the cold start time. So solutions like Lambda or Azure Functions can take about two to three seconds, and on average we see about 250 to 500 milliseconds of cold start time. Now for applications where this is vital, this could be a problem.
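To make the cold start cost concrete, here is a minimal sketch in Python. It is a simplified model, not a benchmark: the function name `expected_latency_ms`, the 20 ms handler time, and the 10% share of requests that hit a cold instance are assumptions chosen for illustration. Only the 250 to 500 millisecond cold start range comes from the talk.

```python
def expected_latency_ms(handler_ms: float, cold_start_ms: float, cold_fraction: float) -> float:
    """Average latency when a fraction of requests must first wait for a cold start."""
    if not 0.0 <= cold_fraction <= 1.0:
        raise ValueError("cold_fraction must be between 0 and 1")
    # Every request pays the handler time; only cold requests pay the startup time.
    return handler_ms + cold_fraction * cold_start_ms


if __name__ == "__main__":
    # Assumed: a 20 ms handler, and 1 in 10 requests landing on a cold instance.
    for cold_start in (250.0, 500.0):  # range quoted in the talk
        avg = expected_latency_ms(20.0, cold_start, 0.1)
        print(f"cold start {cold_start:.0f} ms -> average latency {avg:.0f} ms")
```

Even with a small share of cold requests, the startup time dominates the average in this model, which is why shrinking the cold start itself matters more than speeding up the handler.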
So there are ways to work around this, of course, by keeping the instance warm, for instance. But this cost is borne by you as a dev.
So keeping all of this in mind, I'm going to make a bold statement, which is that the next wave of cloud compute will be powered by WebAssembly, specifically serverless WebAssembly. And honestly, if you don't believe me, I'm just a random person on the Internet. Take a look at this tweet from, I think it was the mid-2010s, or at least 2018 or 2019, by Solomon Hykes. He's the founder of Docker, and he actually mentions that if WASM, which is WebAssembly, plus WASI, which I'll explain, existed in 2008, we wouldn't have needed to create Docker. That's how important it is. So if Solomon Hykes has an insight into tech, the next wave of cloud computing will be powered by WebAssembly. Let's talk a bit about that.
So what actually is WebAssembly? Now the boring answer is it's just another bytecode format. So for a program to run on a piece of hardware, it has to be written in an intermediate code or a low-level code that a computer can understand. And essentially that's what a bytecode format is. So WebAssembly is just another bytecode format. The interesting thing for you to know is that this technology is a general tech; it's not owned by any company. It was developed sometime in the mid-2010s by a bunch of companies working in the browser and front-end space, so I think Mozilla and a few others. And the idea was for it to be able to run any program in a browser, hence the name WebAssembly. And it was designed from the ground up as a portable compilation target. This means that you could write code in any language, say Python, Java, JavaScript, ideally any language that could compile to WebAssembly, and that could run in any browser. So that was the idea. You'll hear me say Wasm a lot, and Wasm is just short for WebAssembly. So how this would work is in a way similar to how the Java virtual machine worked back in the day.
So you wrote a program in Java, and this compiled to something called Java bytecode, the thing that I mentioned earlier. And this Java bytecode could execute in any Java virtual machine. And these Java virtual machines could run on Arm processors and x86 processors. They could run on Windows and Linux as well. The cool thing about WebAssembly is you can write any program in any language, again ideally, and compile this to a Wasm module, which is in the format of the WebAssembly bytecode that I spoke about. Now this Wasm bytecode can execute in any runtime that supports Wasm. The industry standard, or the most popular one right now, is a runtime called Wasmtime. So any Wasm bytecode can run on a WebAssembly runtime. And this runtime is designed to run across architectures, so Arm, x86, whatever, and operating systems, so Windows, Mac, Linux, et cetera. But you can also run it on Kubernetes, on a Raspberry Pi and so on. Literally any place that has support for a WebAssembly runtime.
A couple of other things to know about WebAssembly: like I said, it originated in the browser, but it is now available outside as well. And really the idea was for you to be able to compile once and run that code on any number of targets. So once you've compiled a program, say in Python, to a Wasm format or a WebAssembly module, this WebAssembly module should be able to run anywhere. And that was the idea.
Another cool thing about WebAssembly is it's security sandboxed by default. This is very different from the ways we have been coding in the past, where you had to specify which resources to deny; it wasn't sandboxed by default. WebAssembly, though, is the other way around. It's completely sandboxed by default. So for a WebAssembly module to access anything, you have to explicitly give it permissions. This is how you would compile and run a WebAssembly module.
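The idea of Wasm as "just another bytecode format" can be made a little more concrete. Every WebAssembly binary module starts with the same eight-byte header: the four magic bytes `\0asm` followed by a four-byte little-endian version number, currently 1. The sketch below checks that header in Python; the helper name `read_wasm_version` is made up for this example, and the code only inspects the header, it does not run the module.

```python
import struct

WASM_MAGIC = b"\x00asm"  # first four bytes of every WebAssembly binary module


def read_wasm_version(data: bytes) -> int:
    """Return the binary format version, or raise ValueError if this is not a Wasm module."""
    if len(data) < 8 or data[:4] != WASM_MAGIC:
        raise ValueError("not a WebAssembly binary module")
    # The version is a 4-byte little-endian unsigned integer right after the magic bytes.
    (version,) = struct.unpack("<I", data[4:8])
    return version


if __name__ == "__main__":
    # The smallest valid module: just the 8-byte header and no sections.
    empty_module = WASM_MAGIC + struct.pack("<I", 1)
    print(read_wasm_version(empty_module))
```

The same function works on the bytes of any compiled `.wasm` file, whichever source language it came from, which is the portability point in miniature.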
You'd write code in any language, you would compile that to WebAssembly, and WebAssembly would then run in a, let's call it virtual machine, but any Wasm runtime. Now, all of these things, the portability, the security sandbox, the fact that you could write once, compile once, and run it anywhere, made it very good for the browser. People were like, hey, hold on, this could actually make it ideal for the server side. But for anything to run on a server, you need things like access to files and file systems, you need access to a system clock, you need access to the network as well. So what happened was sometime in 2018 or 2019, something called WASI was introduced, and WASI stands for the WebAssembly System Interface, a new kind of system interface. In short, it allowed you to run WebAssembly outside of the browser. And the cool thing is it gave you access to all your operating-system-like features, including files and file systems, clocks, random numbers and so on. The good thing is it wasn't tied to any browser or front end or web API or JavaScript. You can literally run it on the server side, and it extends the security sandboxing to include things like input and output. So you still have the default security sandbox.
If you take a look at the top 20 languages in RedMonk's ranking, which by the way, I think they released a new one a couple of days ago, so I should update this slide, you can see WebAssembly is supported by most of the languages. Of course, things like CSS don't really apply here, but JavaScript, Python, Java, PHP, .NET, C++, TypeScript, Ruby, but also Zig, C, Rust, all of them have good or very good levels of support when you write code in these languages and compile it to WebAssembly.
So now, I know you're a technical person and you're seeing this, so let's get into some code. How do you write your first WebAssembly app on the server side?
And then we'll talk about why this will be the future of the cloud. I'm going to show you this through an open source project called Spin. Spin is the open source tool for building WebAssembly serverless apps. With the three commands that you see on the screen here, I'm actually going to build and test a Spin app locally, and with the fourth command we will also deploy it to the cloud. Spin, just to reiterate, is completely open source and we have a commitment for it to be open source. It supports 15-plus languages. Right now we have about 4.6 thousand stars on GitHub. We also have a Discord server, so join in there. And at least I personally think that the developer experience is really good. So let's just jump into the CLI and try it out.
So I've already installed the CLI, so I'm just going to say spin new. On the left you can see HTTP and Redis, which is basically the trigger to run your serverless function. Remember, serverless is all event driven, so it has to be triggered by something. Right now Spin supports HTTP and Redis, but there are also community-created triggers for MQTT and SQS. On the right you'll see languages such as C, Go, Grain, PHP, Python, Rust, Swift, et cetera. These are the different languages that are supported. Let's just choose Rust. I will call it Conf42 Rust. Then a description, and this is the HTTP path, where you can specify when your serverless function is triggered, so say checkout or resize or whatever. I can leave it blank, which means this is the default, and it will be triggered when this path is hit.
So let's go into the folder and I will open it in my favorite code editor, which is VS Code. There are just two things you need to know about a serverless WebAssembly app using Spin. The first one is something called the application manifest, which is this. Think of it as a manifest file. It's written in TOML format. All you need to know is this is the trigger which we just specified.
So by default this particular component will be triggered. You can actually specify multiple routes and have different components for each. So say for example you're writing a calculator. Maybe the plus route will trigger a plus component, the subtraction route will trigger a subtraction component, and given that this is WebAssembly, you can write each of these in a different programming language. So you can write addition in Python and subtraction in JavaScript and this would still work. And as you can see, this is the Wasm file that it eventually compiles to, which I will show you in a bit.
Looking at the source code, it's fairly straightforward. You don't need to worry too much about Rust itself. All you need to know is there's a request that comes in. So when an event triggers this particular function, this is the request that comes in, and you send a response back. We can just modify that into saying hello Conf42. And yeah, this is the response that's being sent back.
So I promised that with three commands, we'd get an app up and running. I've run one, which is spin new. I will run the second one, which is spin build. So this is command number two. It's Rust, and this is a one-time sort of compilation. All the crates are compiled in Rust. And then we will use our third command, which is spin up. There we go. What it does is it creates a local instance for you to test out your app, thereby giving you a pretty good developer experience, because you can test out your app locally. There we go. And yeah, I just did a curl to the thing that was running here and you can see hello Conf42. So three commands, and we got a serverless WebAssembly app from scratch up and running to test locally. I'm just going to close this.
I did say with the fourth command I could deploy this to the cloud. So Fermyon does have a Fermyon Cloud with both free and paid tiers.
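The route-to-component idea from the calculator example can be sketched in plain Python. This is not the Spin SDK and not the real manifest format: the `ROUTES` table, the `dispatch` function, and the `/add` and `/subtract` paths are hypothetical names, used only to illustrate how an HTTP trigger maps a path to exactly one component that turns a request into a response.

```python
from typing import Callable, Dict, Tuple

# A component takes two operands from the request and returns the response body.
Component = Callable[[float, float], str]


def add_component(a: float, b: float) -> str:
    return f"{a + b}"


def subtract_component(a: float, b: float) -> str:
    return f"{a - b}"


# Hypothetical routing table: one route per component, as in the calculator example.
ROUTES: Dict[str, Component] = {
    "/add": add_component,
    "/subtract": subtract_component,
}


def dispatch(path: str, a: float, b: float) -> Tuple[int, str]:
    """Return (status, body) for a request, or a 404 when no route matches."""
    component = ROUTES.get(path)
    if component is None:
        return 404, "not found"
    return 200, component(a, b)


if __name__ == "__main__":
    print(dispatch("/add", 2, 3))
    print(dispatch("/subtract", 2, 3))
    print(dispatch("/multiply", 2, 3))
```

Both components are Python here only to keep the sketch short. In the setup described in the talk, each one would be its own Wasm module and could be compiled from a different language.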
I've already logged in here on my CLI, and with just one command, spin deploy, you can see that this particular app will be deployed to the cloud, and we can actually test that out as well. So, yeah, that's it. I can view the application here and I can manage it here too. I'm just going to do the same curl here, and we got the same result. Feel free to open this app in your browser and you will see literally the same result again. So it's super easy to go from nothing to creating a serverless WebAssembly app that's running in the cloud. I'm just going to close this and open Spin.
Like I said, it's completely open source. And with this SDK, you get access to a large language model, which is the Llama 2 model, so you can do serverless AI. You don't need a large language model running in the cloud and paying for lots of resources. You also get a key-value store, a NoSQL database, custom domains, and a bunch of other cool things. So do check it out.
So I showed you the experience of building a WebAssembly app for the server side. There are four things that really make it ideal for doing this. One is binary size. A simple Rust hello world is only two MB, and an ahead-of-time compiled Rust hello world is about 300 kB. The app that I showed you now, which is a simple HTTP API written in Rust, is about 2.3 MB with just-in-time compilation, and if you compile it ahead of time, you can bring that down to about 1.1 MB. And I can also show it to you. Let me do cd. Let me do an ls first. Yeah, cd target. And yeah, you can see that this is the Wasm file, and you can see it's about two MB. So it's pretty small. And this is the bytecode, basically. We can't understand most of this stuff. So that is the Wasm bytecode that you're looking at.
The startup times are comparable to near native. So in the benchmark that you see there, it's about 2.3x slower than native.
I think that's where there is a bit of a trade-off in terms of binary size versus startup time, but it is still comparable, and it is near-native performance for something that's not written in Rust. Then there is the portability we spoke about, where you can build once and run this anywhere. So that Wasm file that you saw should theoretically run on any Wasmtime, sorry, on any WebAssembly runtime out there. And lastly, there is the security sandbox that I spoke about. It's completely a capability-based security model. In fact, if you look at the spin.toml, you can see something called allowed outbound hosts. So if this module had to make an HTTP call, for instance, you'd have to explicitly allow it to make an HTTP call to a particular URL; only then will it work. Similarly, if you want this module to access a file, you have to give it access to that file. So it is security sandboxed by default.
So the big question is, how is this going to change cloud computing? And my answer to that is gradually, and then suddenly. Everyone said 2023 was going to be the year of WebAssembly, and it didn't really take off in the way that people expected it to. But now in 2024, we are seeing so much about WebAssembly. Part of my job is to speak at conferences, and I do that maybe three or four times a month. And there's just so much of an increase in the number of talks and the number of questions and queries about running WebAssembly on the server side and in the cloud.
The key to understanding the success of the cloud is to understand this concept of multitenancy, which essentially is how multiple applications can run in a shared environment. Now, I'm not going to go into the science of this, but the analogy is like an apartment building, right?
So instead of one small family, or a few people staying in a really large building, you break that down into multiple houses on the same plot of land, which many tenants can inhabit. And that's the general idea of cloud computing. The idea, again, is that any of these tenants that are hosted on your piece of hardware shouldn't interfere with the others, intentionally or unintentionally. And that's the key to success for cloud computing. Now, companies have driven more and more towards bringing the cost closer to value, which means increasing the number of tenants on the same piece of hardware. Because the value of this piece of hardware is based on your long-term average traffic, and the cost of running the system is based on short-term peak traffic. So the more value you can extract out of the system, the more bang for your buck you get from that hardware.
If you look again at the waves of cloud computing, when we first started off with just virtual machines, we could run very few apps on the same hardware. With containers, you slowly increased the number of apps you could run on, say, a Kubernetes cluster. In this final form of serverless WebAssembly, you can really pack dense workloads, with multiple functions running on the same piece of hardware. The analogy I love to draw when I'm talking about this is to think of how molecules are structured in solids, liquids and gases. So on your left, what you see is like a gas, where you have molecules that are kind of loose, and then in a liquid they are maybe a little closer to each other, but they're really densely packed in a solid, which gives it its shape and texture and form. And that's how serverless WebAssembly will look. You can really have a high density of functions in a workload.
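The cost-versus-value argument above can be worked through with a small sketch. It assumes hardware is sized for peak traffic while the useful work is the long-term average. The traffic and capacity numbers are made up for illustration, and `utilization` and `tenants_that_fit` are hypothetical helper names.

```python
def utilization(average_load: float, peak_load: float) -> float:
    """Fraction of provisioned capacity doing useful work when sized for peak."""
    if peak_load <= 0:
        raise ValueError("peak_load must be positive")
    return average_load / peak_load


def tenants_that_fit(capacity: float, per_tenant_reservation: float) -> int:
    """How many tenants fit if each one reserves a fixed slice of the capacity."""
    if per_tenant_reservation <= 0:
        raise ValueError("per_tenant_reservation must be positive")
    return int(capacity // per_tenant_reservation)


if __name__ == "__main__":
    # Assumed tenant: averages 50 requests/s but peaks at 500 requests/s.
    print(f"utilization when sized for peak: {utilization(50, 500):.0%}")

    # Assumed node that can serve 5000 requests/s in total.
    print("tenants if each reserves its peak:", tenants_that_fit(5000, 500))
    print("tenants if each reserves its average:", tenants_that_fit(5000, 50))
```

Reserving only the average is the fully overcommitted extreme and would only be safe if tenants never peaked at the same time. The point of the sketch is the gap between the two numbers, which is the room that denser packing tries to recover.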
In fact, WebAssembly is ideally suited as a serverless unit. The people who created the Firecracker VM, which is the base for AWS Lambda, wrote a paper, and I highly suggest you read that paper, which I've linked at the bottom here, about the characteristics of an ideal serverless unit. They define six characteristics. The first is isolation, which I mentioned: you could run multiple functions on the same piece of hardware. Two is overhead and density, where you can run thousands of functions on a machine with minimal waste. Three is performance. You should have consistent and near-native performance at all times. Four is the ability to switch quickly: essentially not have cold start times, but open a serverless unit, run something, shut it down, and then switch to something else. Five is soft allocation, where if there is a spike in one of the tenants, you should be able to overcommit resources, CPU, memory and so on. And lastly, it's compatibility. I think we as devs want to use our favorite libraries, our favorite frameworks, hosts, et cetera. So it has to be compatible with a bunch of things.
We compared a microVM such as Firecracker to WebAssembly on these six parameters. In terms of isolation, both are sandboxed. A microVM is sandboxed via Firecracker and KVM, and WebAssembly is sandboxed via its own security sandbox model. There are two places where at least I personally think WebAssembly really shines compared to a microVM. The first one is overhead and density. To run thousands per node on a microVM, you needed 48 cores, 382 GB of RAM and 3360 GB of disk. So that's your hardware spec. But you could do the same thing with eight cores, 32 GB of RAM and 100 GB of disk if you use WebAssembly, because it is so lightweight and performant. Performance in both is near native, so nothing to compare there. Fast switching, I think, is the second thing where WebAssembly really shines.
We did mention microVMs do have cold start times from 125 to even 500 milliseconds, whereas with WebAssembly that's down to about a millisecond. And you can scale up to thousands of functions in that node and then scale back down to zero in under a millisecond, which is very impressive. In terms of soft allocation, I think things like Lambda and even Azure Functions have been tried and tested; you can run these in production at enterprise grade with oversubscription ratios as high as 10x. WebAssembly is new, so it's untested. But I have a feeling by the end of this year we'll really get to see how soft allocation would work when it comes to WebAssembly. In terms of compatibility, microVMs are Linux and KVM only. Most software is compatible unless it has very specific hardware requirements. WebAssembly, like we said, was designed to be compiled once and run anywhere. So it supports a bunch of OSs, platforms, architectures and so on.
Now, just a few days ago we launched something called SpinKube, and I'm super excited to talk about this because it ties into so many of the things that we just spoke about, which is things like density, performance and binary size. I'm sure many of us are either familiar with or work on Kubernetes. So SpinKube is a completely open source project with contributions from companies like Microsoft, Liquid Reply, SUSE and Fermyon, and it essentially gives you hyper-efficient serverless on Kubernetes, completely powered by WebAssembly. It's again fully open source, and it streamlines the development and deployment process of WebAssembly workloads on Kubernetes. You should check out spinkube.dev for more info, and the slash spinkube. But essentially when you build a WebAssembly app that's deployed in SpinKube, these artifacts are significantly smaller in size compared to a typical container image. So again, think of the costs, think of your carbon footprint, think of performance when this actually happens.
And these artifacts can be fetched over the network and started much faster than a typical container image. Which also means that substantially fewer resources are required during times when your container is actually idling, because these WebAssembly functions can scale back down to zero in no time.
Just to give you a quick overview of how it works, here's a slightly complicated architecture diagram. Now if you look at the bottom here, this is the core of the project, which is the containerd shim for Spin. This uses something called runwasi. And this essentially, sorry, I need to look at my computer there, enables containerd to run Spin WebAssembly apps in a Kubernetes cluster. And it provides all the capabilities needed to pull an application from a registry, to start the application and so on. Next is something called the runtime class manager. This deploys pre-configured images that can run WebAssembly workloads, and it works with the containerd shim for Spin. The runtime class manager was contributed by Liquid Reply and SUSE. And you can do things like annotate your nodes, and install and configure containerd with this shim. Now something that Fermyon, the company I work for, contributed towards this project is this Spin operator here. The Spin operator essentially is used to schedule and manage your Spin apps as custom resources. So what it does is it watches a custom resource definition of a Spin app for any changes, and it creates a Spin app using a specified executor. So you can specify that using the operator. And lastly, Spin itself has the ability for anyone to write plugins for it. So there's a plugin called kube which essentially scaffolds your Spin app and creates a deployment YAML, which can then be used by your CRD to deploy into Kubernetes. So SpinKube is all of this combined, where you have the containerd shim, the runtime class manager, the Spin operator and a Spin plugin.
Now if you think this is exciting, we have taken the concept of SpinKube and amped it up, and we've released something for enterprises called Fermyon Platform for Kubernetes. Now with this, and I'm not joking, you can actually get a 50x increase in workload density. That's right. You can actually run 5000 serverless apps on one Kubernetes node. Typically that limit used to be around, I think, 256, if I'm not mistaken. That was the maximum you could run. But because of WebAssembly you can actually run 5000 on one node. I'm showing you a demo in the next slide. I think that's pretty awesome. And you get massive reductions in your serverless cold start delays as well, again because of how WebAssembly is built. Because of this you're saving a bunch of costs, because you're increasing your capacity and your efficiency of resources. So what you or your platform team spend on infra is going to be so much lower. And again, this is highly portable. So there's no vendor lock-in to one public cloud. You can run this in different places on different architectures and OSs; there's absolutely no lock-in there.
Here's the quick demo of Platform for Kubernetes. It's a prerecorded video because, well, you have to ping 5000 apps. So this is an Azure Kubernetes cluster running 5000 apps. You can see the number here; we've just done a count, 5000, and you can see how instantly you get a response. Hello number twelve. So I just changed the number. You instantly get a response. So it's that quick. So the cold start time is sub one millisecond within this Azure Kubernetes cluster. And just to give you an idea of how this works, you can write your WebAssembly apps using Spin, which is open source. You can self-host in your Kubernetes using SpinKube, which is open source. If you want an enterprise-grade one, there's this Platform for Kubernetes; get in touch with us about that. Or you can also host your Spin apps on Fermyon Cloud. There are paid and free tiers there as well.
All right, well, I hope you learned something new today. As a next step, check out Spin and build your first Spin app. Check out SpinKube as well if you're into the Kubernetes space. There are a bunch of tutorials on our YouTube, too, so feel free to jump in there. We have a Discord, so join us there. Or hit me up on LinkedIn if you have any questions, or if you have any feedback about the talk. I'd love to hear what you're building in the WebAssembly space. So yeah, get in touch and enjoy the rest of the conference. Thank you.

Conf42 Cloud Native 2024 - Online

March 21 2024

Sohan Maheshwar

Developer Relations Lead @ Fermyon
