Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hi everyone, my name is Sohan Maheshwar. I'm a dev advocate at Fermyon
and I'm here to talk to you about why the future of the cloud is
WebAssembly. Now, if you haven't heard of WebAssembly before, no problem.
That's what I'm here for. So let's dive straight in.
I've worked in cloud computing for a while. My previous role was at
Amazon Web Services, or AWS. So I've really seen the evolution of cloud computing.
I'm old enough to remember the pre-cloud days,
when there was an air-conditioned server room in every office.
It was locked, heavy security, no one was allowed to enter.
It was a different time. And then finally, with the release
of things like EC2, we actually saw virtualization and
widespread adoption of virtual machines. But
we as developers still had to
take care of things like the kernel, the drivers, the operating
system, the utilities, and of course the business logic
that made up your app that ran on this virtual machine.
Containers took that a step forward by removing some of
the effort required. You still had to configure things like
Docker or Kubernetes for orchestration while still looking at
things like your OS and utilities and building your app.
Of course, the latest and greatest in computing is
serverless, where you can solely focus on your business logic
and the platform really takes care of the rest. So
I'd assume most of us right now are at the containers or the
serverless stage. And with the current
state of containers and serverless, there are a couple of problems that really plague
the ecosystem. One is
that containers are just too expensive. They
overconsume resources. We do know that containers can get complicated,
and companies have built out large platform engineering and DevOps teams to
be able to run these containers. But the thing is, when we're provisioning for
a container or for a workload, we actually look at what
the peak of that consumption is, even though our average is
much lower. So as a result, we end up overprovisioning containers all
the time. And these containers sit idle. They're running, they're consuming
electricity, so they're not sustainable, they're also costly,
and they leave systems idling. Right? And this report from Andreessen
Horowitz actually estimates about 100 billion of market value
being lost purely because of this, which is pretty
insane. Secondly, app configs are complex.
I think we know this. If you're a small company, you don't necessarily need
a huge Kubernetes orchestration all the time, but
this gets more and more complex the bigger your
app becomes. And honestly, modern apps do comprise a large number of
frameworks, language dependencies, and libraries, all of which need
to be shipped alongside your containers and your code in the
cloud. So this just increases the complexity, and the
portability of your workload or your code base
and its simplicity are often sacrificed so
that you can actually support a variety of architectures
and platforms. And specifically with serverless,
one big problem that we see and we hear from customers is that
serverless does have a cold start problem.
So roughly how serverless works is an
event occurs. This event basically pings some compute
resource that's running in the cloud, say AWS Lambda,
Azure Functions, whatever. So there is a
finite amount of time for that
function running in that compute in the cloud to start up,
execute the program and send a response back. And that is
called a cold start time. Solutions like Lambda
or Azure Functions can take about two to three seconds,
and on average we see about 250 to 500 milliseconds of
cold start time. Now for applications where this is vital,
this could be a problem. There are ways to
work around this, of course, by keeping the instance warm,
for instance. But this cost is borne by you as a dev.
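
To make the cold start cost concrete, here is a minimal Rust sketch. It is not tied to any real platform: the `Instance` type and its fields are hypothetical names chosen for illustration. The first invocation pays the start-up penalty on top of the execution time, and later invocations on the same warm instance do not.

```rust
use std::time::Duration;

/// Hypothetical model of one serverless function instance (illustration only).
struct Instance {
    warm: bool,
    cold_start: Duration,
    exec_time: Duration,
}

impl Instance {
    /// A new instance has not been started yet, so it begins cold.
    fn new(cold_start: Duration, exec_time: Duration) -> Self {
        Instance { warm: false, cold_start, exec_time }
    }

    /// Returns the latency the caller observes for one invocation.
    fn invoke(&mut self) -> Duration {
        if self.warm {
            // Already running: only the program's own execution time.
            self.exec_time
        } else {
            // First call: start-up cost plus execution time.
            self.warm = true;
            self.cold_start + self.exec_time
        }
    }
}
```

Keeping an instance warm removes the penalty from the request path, but the instance then runs (and is billed) while idle, which is the trade-off described above.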
So keeping all of this in mind, I'm going to make a bold
statement, which is the next wave of cloud compute will be powered
by WebAssembly, specifically serverless WebAssembly.
And honestly, if you don't believe me, I'm just a random person on the Internet.
Take a look at this tweet from, I think it was mid 2010s,
or at least 2018 or 19, by Solomon
Hykes. Now he's the founder of Docker and
he actually mentions: if Wasm, which is WebAssembly, plus WASI,
which I'll explain, existed in 2008, we wouldn't have needed to create Docker.
That's how important it is. So if Solomon
Hykes has an insight into tech, the next wave of
cloud computing will be powered by WebAssembly.
Let's talk a bit about that. Right, so what actually is WebAssembly?
Now the boring answer is it's just another bytecode format.
So for a program to run on a piece of hardware,
it has to be written in an intermediate code or a low-level code
that a computer can understand. And essentially that's what a
bytecode format is. So WebAssembly is just another bytecode format.
The interesting thing for you to know is this technology
is a general tech, it's not owned by any company. It was developed
sometime in the mid 2010s by a bunch of companies
working in the browser and front-end space. So I think Mozilla
and a few others. And the idea was for
it to be able to run any program in a browser,
and hence the name WebAssembly. And it was
designed from the ground up as a portable compilation target.
And this means that you could write code in any language,
say Python, Java, JavaScript, ideally any language, that could compile
to WebAssembly and that could
run in any browser. So that was the idea.
You'll hear me say Wasm a lot. And Wasm is just short for WebAssembly.
So how this would work essentially is in a way similar
to how the Java virtual machine worked from back in
the day. So you wrote a program in Java, this compiled to
something called Java bytecode, the thing that I mentioned earlier. And this
Java bytecode could execute in any Java virtual machine.
And these Java virtual machines could run on Arm processors and x86
processors. They could run on Windows and Linux as
well. The cool thing about WebAssembly is you can
write any program in any language, again ideally,
and compile this to a Wasm module which is
in the format of the WebAssembly bytecode that I spoke
about. Now this Wasm bytecode can execute in any runtime
that supports Wasm. The industry standard or the most
popular one right now is a runtime called Wasmtime.
So any Wasm bytecode can run on a WebAssembly runtime.
And this runtime is designed to run across architectures,
so Arm, x86, and whatever operating
system, so Windows, Mac, Linux, et cetera. But you can also run it
on Kubernetes, on Raspberry Pi and so on.
Literally any place that has support for a WebAssembly runtime.
A couple of other things to know about WebAssembly: like I
said, it originated in the browser, but now is available outside
as well. And really the idea was for you to be able to compile
it once and run that code on any number of targets. So once
you've compiled from a program, say in Python, to a Wasm
format or a WebAssembly module, this WebAssembly module should
be able to run anywhere. And that was the idea.
Another cool thing about WebAssembly is it's security sandboxed
by default. This is very different from the
ways we have been coding in the past, where you had to specify
which resources to deny; it wasn't sandboxed by default.
WebAssembly, though, is the other way around. It's completely sandboxed by default.
So for a WebAssembly module to access anything, you have to
explicitly give it permissions. This is
how you would compile and run a WebAssembly module. You'd write code in
any language, you would compile that to WebAssembly, and WebAssembly would
then run in a, let's call it virtual machine, but any Wasm
runtime. Now, all of these things, right, the portability,
the security sandbox, the fact that you could write
once, compile once, and run it anywhere, all of these
things made it very good for the browser. People were
like, hey, hold on, this could actually make it ideal for server
side, but for anything to run on a server, you need things
like access to files and file systems, you need access to a system
clock, you need access to the network as well.
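
As a small illustration of the server-side needs just listed, here is a Rust sketch that touches the file system and the system clock using only the standard library. The function name `write_timestamp` is hypothetical. Inside a browser sandbox there is nothing to back calls like these, which is why a module running on a server needs some system interface from its host.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::{SystemTime, UNIX_EPOCH};

/// Writes a line containing the current Unix time to `path`, then reads it back.
/// Both the clock read and the file access depend on what the host provides.
fn write_timestamp(path: &Path) -> io::Result<String> {
    // System clock access.
    let secs = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map_err(|e| io::Error::new(io::ErrorKind::Other, e))?
        .as_secs();
    // File system access: write, then read the same file.
    fs::write(path, format!("started at {}\n", secs))?;
    fs::read_to_string(path)
}
```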
So what happened was sometime in 2018 or
2019, something called WASI was introduced, and WASI stands
for a new kind of system interface, the WebAssembly System
Interface. In short, it allowed you to run WebAssembly
outside of the browser. And the cool thing is it gave you access
to all your operating-system-like features,
including files and file systems, clocks, random numbers and so
on. The good thing is it wasn't tied to any
browser or front end or web API or JavaScript.
You can literally run it on the server side, and it extends this security
sandboxing to include things like input and output. So you
still have the default security sandbox. If
you take a look at the top 20 languages in RedMonk's
ranking, which by the way, I think they released a new one a couple of
days ago, so I should update this slide, the top 20 rankings that is,
as you can see, WebAssembly is supported by most of
the languages. Of course, things like CSS don't really apply here,
but JavaScript, Python, Java, PHP, .NET,
C++, TypeScript, Ruby, but also Zig, C, and
Rust, all of them have good or very
good levels of support when you write code in these languages and compile
it to WebAssembly. So now,
I know you're a technical person and you're seeing this, so let's get into some
code, right? How do you write your first WebAssembly app on the server
side? And then we'll talk about why this will be the future of
the cloud. I'm going to show you this through an
open source project called Spin. Spin is the open source
tool for building WebAssembly serverless apps. With the
three commands that you see on the screen here, I'm actually going to build
and test a Spin app locally, and with the fourth command we will also
deploy it to the cloud. Spin, just to
reiterate, is completely open source and we have a commitment for it to be open
source. It supports 15 plus languages. Right now we have
about 4.6 thousand stars on GitHub. We also
have a Discord server, so join in there, and
at least I personally think that the developer experience is really good.
So let's just jump into the CLI and try
it out. So I've already installed the CLI, so I'm just going to say spin
new, and you can see HTTP
and Redis on the left, which is basically the
trigger to run your serverless function. Remember,
serverless is all event driven, right? So it has to be triggered
by something. And right now Spin supports HTTP and
Redis, but there are also community-created triggers for MQTT
and SQS. On the right you'll see languages
such as C, Go, Grain, PHP, Python, Rust, Swift,
et cetera. These are the different languages
that are supported. Let's just choose Rust. I will call it conf42-rust.
A description, and this is
the HTTP path. You can specify when
your serverless function is triggered. So say checkout or
resize or whatever. I can leave it blank, which means this is the
default and it will be triggered when this path is hit.
So let's go into the folder
and I will open it in my favorite code editor, which is VS
Code. Just two things you need to know about a serverless
WebAssembly app using Spin. The first one is something called
the application manifest, which is this,
right? So think of it as a manifest file. It's written
in TOML format. All you need to know is this is the trigger
which we just specified. So by default this
particular component will be triggered. You can actually specify multiple
routes and have different components for each. So say
for example you're writing a calculator. So maybe the
plus route will trigger a plus component. The subtraction
route will trigger a subtraction component, and given that this
is WebAssembly, you can write each of these in a different programming
language. So you can write addition in Python and subtraction in JavaScript
and this would still work.
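
The route-to-component idea from the calculator example can be sketched in plain Rust. This is only an illustration of the dispatch pattern, not how Spin itself is implemented: the paths `/add` and `/subtract`, the `Component` type and the `dispatch` function are all hypothetical, and in a real app each component would be its own Wasm module, possibly written in a different language.

```rust
/// Hypothetical component signature: takes two operands, returns a result.
type Component = fn(f64, f64) -> f64;

/// Stand-in for the "plus" component.
fn add(a: f64, b: f64) -> f64 {
    a + b
}

/// Stand-in for the "subtraction" component.
fn subtract(a: f64, b: f64) -> f64 {
    a - b
}

/// Picks the component registered for `path` and runs it.
/// Paths with no registered component yield `None`.
fn dispatch(path: &str, a: f64, b: f64) -> Option<f64> {
    let component: Component = match path {
        "/add" => add,
        "/subtract" => subtract,
        _ => return None,
    };
    Some(component(a, b))
}
```

Because the caller only knows the route and the component's interface, each component can be swapped out or rewritten independently.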
And as you can see, this is the Wasm file that it eventually compiles to,
which I will show you in a bit. Looking at the source code,
it's fairly straightforward. You don't need to worry too much about Rust itself.
All you need to know is there's a request that comes in, right? So when
an event triggers this particular function, this is the request that comes in
and you send a response back. We can just modify
that into saying, hello Conf42.
And yeah, this is the response that's being sent back.
So I promised that with three commands,
we'll get an app up and running. I've said one, which is spin new.
I will say the second one, which is spin build.
So this is command number two. It's Rust, and this
is a one-time sort of compilation.
All the crates are compiled in Rust. And then we will use
our third command, which is spin up. Right,
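
The shape of that handler, a request in and a response out, can be sketched without any SDK. The `Request` and `Response` types below are stand-ins defined only for this illustration; they are not the Spin SDK's types, and the generated project on screen will look different in its details.

```rust
/// Stand-in for an incoming HTTP request (not the Spin SDK type).
struct Request {
    path: String,
}

/// Stand-in for an outgoing HTTP response (not the Spin SDK type).
struct Response {
    status: u16,
    content_type: String,
    body: String,
}

/// Runs once per triggering event: takes the request, returns the response.
/// This sketch ignores the request and always answers with the same greeting.
fn handle_request(_req: &Request) -> Response {
    Response {
        status: 200,
        content_type: "text/plain".to_string(),
        body: "Hello, Conf42".to_string(),
    }
}
```

Changing the greeting means editing the `body` string, which mirrors the one-line change made in the demo.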
which basically,
there we go. What it does is it creates a local instance for you to
test out your app, thereby giving you a pretty good developer
experience, because you can test out your app locally.
There we go. And yeah, I just did a curl to that thing that was
running here and you can see hello Conf42. Right? So three commands.
Got a serverless WebAssembly app from scratch up
and running to test locally. I'm just going to close
this. I did say with the fourth command I could deploy this to
the cloud. So Fermyon does have a Fermyon Cloud with both free
and paid tiers. I've already logged in here on
my CLI, but with just one command, spin deploy, you can
see that this particular app will be deployed to the
cloud and we can actually test that out as well.
So, yeah, that's it. I can view the
application here and I can manage it too here. I'm just
going to do the same curl here and we got
the same result. You can feel free to open this app in your browser and
you will see literally the same result again.
So super easy to go from nothing to creating
a serverless WebAssembly app that's running in the cloud. Just going to close
this and open Spin.
Like I said, it's completely open source. And with this SDK, you get access to
a large language model, which is the Llama 2 model. So you can do serverless
AI. So you don't need a large language model running in the cloud and
paying for lots of resources. You also get a key value store,
a NoSQL database, custom domains, a bunch of other cool things.
So do check it out. So I showed you
the experience of building a WebAssembly app for
the server side. The four things that really make it suited or
make it ideal for doing this are these, right. So one is
binary size. A simple Rust hello world is only 2
MB, and an ahead-of-time compiled Rust hello world
is about 300 KB. The app that I showed you now,
which is a simple HTTP API written in Rust, is about 2.3
MB with just-in-time compilation, and if you compile
it ahead of time, you can bring that down to about 1.1 MB. And I can
also show it to you. Let me do cd.
I think it's... let me do an ls first.
Yeah, cd target.
Right. And yeah, you can see that this is the
Wasm file, and you can see it's about 2 MB,
right. So it's pretty small. I can actually
do... and this is
the bytecode basically, right? Yeah.
So we can't understand most of this stuff. So that is
the Wasm bytecode that you're looking at.
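
Most of that binary is unreadable, but its first eight bytes are fixed by the WebAssembly binary format: four magic bytes spelling "\0asm", followed by a little-endian version field. The Rust sketch below checks for that header. The function name `wasm_version` is hypothetical, and the statement that a core module reports version 1 reflects the current core specification.

```rust
/// The four magic bytes that open every WebAssembly binary: "\0asm".
const WASM_MAGIC: [u8; 4] = [0x00, 0x61, 0x73, 0x6D];

/// Returns the binary format version if `bytes` starts with a Wasm header,
/// or `None` if the input is too short or the magic bytes do not match.
fn wasm_version(bytes: &[u8]) -> Option<u32> {
    if bytes.len() < 8 || bytes[..4] != WASM_MAGIC {
        return None;
    }
    // The version is a little-endian u32 right after the magic bytes.
    Some(u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]))
}
```

Reading the compiled file with `std::fs::read` and passing the bytes to this function is a quick way to confirm that a build artifact really is a Wasm binary.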
The startup times are comparable to near native.
So in the benchmark that you see there, it's about 2.3x slower
than native. I think that's where there is a bit of a trade-off
in terms of binary size versus startup time, but it is
still comparable, and it is near native performance for something that's
not written in Rust. The portability we spoke
about, where you can build once and run this anywhere. Right. So that Wasm
file that you saw should theoretically run on any
Wasmtime, sorry, on any WebAssembly runtime out there.
And lastly, there is the security sandbox that I spoke
about. It's completely a capability-based security model. In fact,
if you look at the spin.toml, you can actually see
something called allowed outbound hosts. So if this
model, sorry, if this module had to make an HTTP call,
for instance, you have to explicitly allow it to make an
HTTP call to a particular URL. Only then
will it work. Similarly, if you want this particular module to access
a file, you have to give the module access to that file.
So it is security sandboxed by default.
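
The deny-by-default idea can be sketched in a few lines of Rust. This is an illustration of a capability check, not Spin's or any runtime's actual implementation: the `Capabilities` type, its methods and the example host names are hypothetical.

```rust
use std::collections::HashSet;

/// Hypothetical capability set for one module: the outbound hosts it may call.
struct Capabilities {
    allowed_outbound_hosts: HashSet<String>,
}

impl Capabilities {
    /// A fresh module starts with no capabilities at all.
    fn deny_all() -> Self {
        Capabilities { allowed_outbound_hosts: HashSet::new() }
    }

    /// Explicitly grants access to one host.
    fn allow_host(mut self, host: &str) -> Self {
        self.allowed_outbound_hosts.insert(host.to_string());
        self
    }

    /// Only hosts that were explicitly granted pass the check.
    fn may_call(&self, host: &str) -> bool {
        self.allowed_outbound_hosts.contains(host)
    }
}
```

The key design point is the starting state: nothing is reachable until it is granted, so forgetting a grant fails closed rather than open.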
So the big question is, how is this going to change cloud
computing? And my answer to that is gradually
and then suddenly. 2023,
everyone said, was going to be the year of WebAssembly. And it didn't really take
off in the way that people expected it to. But now
in 2024, we are seeing so much about WebAssembly.
Part of my job is to speak at conferences, and I do that maybe three,
four times a month. And there's just so much of an increase in the
number of talks and the number of questions and queries about this
thing of running WebAssembly on the server side and in the
cloud. The key to understanding the success of the
cloud is to understand this concept of multitenancy,
which essentially is how multiple applications can run in
a shared environment. Now, I'm not going to go into the science
of this, but the analogy is like an apartment
building, right? So instead of one small family, or like few
people staying in a really large building, you break that down
into multiple houses in the same plot of land,
which many tenants can inhabit. And that's the
general idea of cloud computing. The idea,
again, is that any of these tenants that are hosted in
your piece of hardware shouldn't interfere with the other,
intentionally or unintentionally. And that's the key to success for cloud
computing. Now, people have driven, or companies have
driven more and more towards bringing the cost closer to
value, which means increasing the number of tenants in the same
piece of hardware. Because the value of this piece of hardware
is based on your long term average traffic, and the
cost of running the system is based on short term peak
traffic. So the more value you can extract
out of the system, that means you have got more bang for your buck for
that hardware itself. If you look at again,
the waves of cloud computing, when we first started off with
just virtual machines, we could run very few apps on the same
hardware. But I think with containers, you slowly increase the
number of apps you could run on, say, a Kubernetes cluster.
In this final form of serverless WebAssembly,
you can really pack and have dense workloads
with multiple functions running in the same piece of hardware.
The analogy I love to draw when I'm talking about this
is to think of how molecules are
structured in gases, liquids and solids.
So on your left, what you see is like a gas,
where you have molecules that are kind of loose, and then in
a liquid they are maybe a little closer to each other, but they're
really densely packed in a solid, which gives it its shape and
texture and form. And that's how serverless
WebAssembly will look. You can really have a high density
of functions in a workload. In fact,
as a serverless unit, WebAssembly is so ideally suited,
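
The cost-versus-value argument can be put into numbers with a small Rust sketch. The load figures and the function names `utilization` and `combine` are made up for illustration. Capacity is bought for the peak, value comes from the average, so average divided by peak is the share of the hardware actually doing work.

```rust
/// Average load divided by peak load: the share of provisioned capacity in use.
/// Returns 0.0 for an empty or all-zero series.
fn utilization(load: &[f64]) -> f64 {
    let peak = load.iter().cloned().fold(0.0_f64, f64::max);
    if peak == 0.0 {
        return 0.0;
    }
    let average = load.iter().sum::<f64>() / load.len() as f64;
    average / peak
}

/// Sums the load of two tenants sharing one machine, time slot by time slot.
fn combine(a: &[f64], b: &[f64]) -> Vec<f64> {
    a.iter().zip(b).map(|(x, y)| x + y).collect()
}
```

With a hypothetical tenant whose load over four time slots is `[10, 1, 1, 1]`, utilization is 3.25 / 10 = 0.325. Add a second tenant with load `[1, 1, 10, 1]` on the same machine and the combined series is `[11, 2, 11, 2]`, giving 6.5 / 11, roughly 0.59. Packing tenants whose peaks do not coincide is what brings cost closer to value.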
because the people who created the Firecracker VM,
which is the base for AWS Lambda,
essentially wrote a paper, and I highly suggest you read that paper that I've linked
at the bottom here, about the characteristics of an ideal
serverless unit. And they define six characteristics. One is
isolation, which I mentioned, so you can run multiple functions
on the same piece of hardware. Two is overhead and density,
where you can run thousands of functions on a machine with
minimal waste. Three is performance. You should
have consistent and near native performance at all times.
Four is the ability to switch quickly,
right? Essentially not have cold start times, but open
a serverless unit, run something, shut it down,
and then switch to something else. Five is
soft allocation, where if there is a spike in one
of the tenants, you should be able to overcommit resources,
CPU, memory and so on. And lastly, it's compatibility.
I think we as devs want to use our favorite libraries, our favorite
frameworks, hosts, et cetera. So it has to be compatible with a
bunch of things. We sort of compared
a microVM such as Firecracker to WebAssembly on
these six parameters. So in terms of isolation both are sandboxed.
So a microVM is sandboxed via the Firecracker KVM,
and WebAssembly is sandboxed via its own security sandbox model.
There are two places where at least I personally think WebAssembly really shines
compared to a microVM. The first one is overhead and density,
right? So to run thousands per node on a microVM you needed
48 cores, 382 GB RAM and 3360
GB disk. So that's your hardware spec. But you could do the same thing
with eight cores, 32 GB RAM and 100 GB disk
if you use WebAssembly, because it is so lightweight and
performant. Performance in
both is near native, so nothing to compare there.
Fast switching, I think, is the second thing where WebAssembly really
shines. We did mention microVMs do have cold start times
from 125 to even 500 milliseconds, whereas with
WebAssembly that's down to about a millisecond. Right? And you can scale
up to like thousands of functions in that
node and then scale back down to zero in under a
millisecond, which is very impressive. In terms of soft allocation,
I think things like Lambda and even Azure Functions have been tried and tested,
so you can run these in production at enterprise grade
with oversubscription ratios as high as 10x.
WebAssembly is new, so it's untested. But I have a feeling by the end
of this year we'll really get to see how soft
allocation would work when it comes to WebAssembly.
In terms of compatibility, microVMs are Linux and KVM
only. Most software is compatible unless it has very specific hardware
requirements. WebAssembly, like we said, was designed to be
compiled once, run anywhere. So it supports a bunch of
OSes, platforms, architectures and so on.
Now, just a few days ago we launched something called
SpinKube, and I'm super excited to talk about this because it ties into so
many of the things that we just spoke about, which is things like density,
performance and binary size. I'm sure many of
us are either familiar with or work on Kubernetes.
So SpinKube is a completely open source project with contributions
from companies like Microsoft, Liquid Reply, SUSE and Fermyon,
and essentially gives you hyper-efficient serverless on Kubernetes,
completely powered by WebAssembly.
It's again fully open source and it just streamlines the development and
deployment process of WebAssembly workloads on
Kubernetes. You should check out spinkube.dev for more
info, and the slash spinkube. But essentially
when you build a WebAssembly app that's deployed in SpinKube,
these artifacts are significantly smaller in size compared to
a typical container image. So again, think of the
costs, think of your carbon footprint, think of performance
when this actually happens. And these artifacts
can be fetched over the network and started much faster than
a typical container image. Which also means that substantially fewer
resources are required during times when your container is actually
idling because these WebAssembly functions
can scale back down to zero in no time. Just to
give you a quick overview of how it works, here's a slightly complicated
architecture diagram. Now if you look at the
bottom here, this is the core of the project, which is
the containerd-shim-spin. Right? So this
uses something called runwasi. And this essentially,
sorry, I need to look at my computer there, enables containerd to sort of
run Spin WebAssembly apps in a Kubernetes cluster.
And it provides all the capabilities needed to pull
an application from a registry, to start the application
and so on. Next is something called the
runtime class manager. This deploys preconfigured images
that can run WebAssembly workloads. And this works with the
containerd-shim-spin. The runtime class manager
was contributed by Liquid Reply and SUSE. And you
can do things like annotate your nodes, install and configure
containerd with this shim. Now something that Fermyon,
the company I work for, contributed towards this project is this Spin
operator here. And the Spin operator essentially is used to schedule
and manage your Spin apps as custom resources.
So what it does is it watches the custom resource
definition of a Spin app for any changes and
it speaks to and creates
a Spin app using a specified executor. So you can specify that using
the operator. And lastly, Spin itself has an
ability for anyone to write plugins for it. So there's a plugin
called kube which essentially scaffolds your Spin
app and creates a deployment YAML, which can
then be used by your CRD to sort of deploy into Kubernetes.
So SpinKube is all of this combined, where you have the
containerd shim, you have the runtime class manager, a Spin operator and
a Spin plugin. Now if you think this is
exciting, we have taken the concept of SpinKube and we have amped
it up and we've released something for enterprises called Fermyon
Platform for Kubernetes. Now with this, and I'm
not joking, you can actually get a 50x increase in workload
density. That's right. You can actually run 5000
serverless apps in one Kubernetes node. Typically that
limit used to be around, I think, 256 if I'm not mistaken.
That was the maximum you could. But because of WebAssembly you
can actually run 5000 in one node. I'm showing you a demo
in the next slide. I think that's pretty awesome. And you
get massive reductions in your serverless cold start delays as well,
again because of how WebAssembly is built.
Because of this you're saving a bunch of costs because you're increasing your capacity
and your efficiency of resources.
So the infra spend that you or your platform team has
is going to be so much lower. And again, this is highly portable.
So there's no vendor lock-in to one public cloud. You can run this
in different places, on different architectures and OSes.
There's absolutely no lock-in there. Here's the quick
demo of Platform for Kubernetes. It's a prerecorded video because, well,
you have to ping 5000 apps. So this is an Azure Kubernetes cluster
running 5000 apps, right? You can see the number
here, we've just done a count, 5000, and
you can see instantly how you get a response. Hello number twelve.
So I just changed the number. You instantly get a response. So it's that quick.
So the cold start time is sub one millisecond within
this Azure Kubernetes cluster. And just to give
you an idea of how this works, you can write your WebAssembly apps
using Spin, which is open source. You can self-host in your
Kubernetes using SpinKube, which is open source. If you want an enterprise-grade
one, that's Platform for Kubernetes, so get in touch with us about that.
Or you can also host your Spin apps on Fermyon Cloud.
There are paid and free tiers there as well.
All right, well, I hope you learned something new today.
As a next step, check out Spin and build your first Spin app.
Check out SpinKube as well if you're into the Kubernetes space.
There are a bunch of tutorials on our YouTube, too, so feel free to
jump in there. We have a Discord, so join us there. Or hit me up
on LinkedIn if you have any questions, or if you had any feedback about this
talk. I'd love to hear what you're building in the WebAssembly
space. So yeah, get in touch and enjoy the rest of the conference.
Thank you.