Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hi there. My name is Atila Fassina and now it's time for us
to start demystifying multi threading in JavaScript.
I'm a lead web developer at SAP. I'm based in
Berlin and my website is atila.io.
Feel free to mention me as well on Twitter as Atila Fassina.
I'm always keen to connect, and without further ado,
let me get started by differentiating
multithreading and parallelism.
They are not synonyms in the sense that we can have concurrent
multiple threads, and we can have,
of course, parallel multiple threads.
What we need to decide first is whether
we need a second or a third thread, and
so on, or whether we can work on a single thread.
And once we realize that, okay, we might need multiple threads,
then we need to define if those threads can be
concurrent or they need to be parallel.
And to exemplify what concurrent threads mean,
the best example I could come up with was chat threads.
You cannot reply to more than one thread at the same time,
the same way you cannot read more than one thread at the same time.
If you're really great at switching contexts,
you may be fast enough that whoever is speaking with you
won't notice that you are switching in and out of a thread.
But the threads themselves are still concurrent. They are
competing for your attention. You cannot read one
thread while you reply to another one. You cannot reply to multiple
threads at the same time. You can jump from study group to games,
for example, but the moment you are there, you don't know what's
happening in cooking, for example. And that's
what concurrent threads mean. In
this case our attention, or rather we, are the engine
that is executing the tasks from the threads, and we just
jump from one to the other. We don't completely finish a thread before moving
to another one, and that's what multiple threads would actually
mean. We do micro tasks in each one of
them, giving the impression
of them being executed at the same time.
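To make the chat analogy concrete, here is a minimal sketch of a single engine that takes one small step from each thread in turn. The names chatThread and runConcurrently are made up for this illustration, and generators are only a stand-in for real tasks.

```javascript
// Each "thread" is a generator that pauses after every small step.
function* chatThread(name, steps) {
  for (const step of steps) {
    yield `${name}: ${step}`;
  }
}

// A single engine: run one step of a thread, then move on to the next one.
function runConcurrently(threads) {
  const log = [];
  const queue = [...threads];
  while (queue.length > 0) {
    const thread = queue.shift();
    const { value, done } = thread.next();
    if (!done) {
      log.push(value);
      queue.push(thread); // not finished yet, so it goes to the back of the line
    }
  }
  return log;
}

const log = runConcurrently([
  chatThread("study-group", ["read", "reply"]),
  chatThread("games", ["read", "reply"]),
]);
console.log(log);
```

The steps of both threads end up interleaved in the log, even though only one step ever runs at a time.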
Whereas if we are performing as a one person
band, for example, we can have multiple things happening at the
same time. In this example here, the accordion
is being played at the same time as the keyboard, as the trumpet,
and whatever else this guy is playing. The resources
are actually shared, and one thread does not occupy
the same processing as another thread. They're happening simultaneously.
And that also stresses the fact that parallelism is
just a special kind of asynchronous execution.
But what does that actually mean for JavaScript?
This conceptual talk is really great to get the gist of what
multithreading actually means. But we need real code examples,
right? So let's have a look at our APIs.
We have a bunch of asynchronous APIs,
and these ones mentioned here
in this slide are all concurrent. For example, the time
period in a setTimeout cannot be relied upon, in
the sense that if there is a long task being executed, the callback will take
longer than the time you established as a function parameter,
while setInterval can be reliable, but at the
expense of whatever else is happening on the main thread. So it's just going
to come in blocking and executing the callback once
the timeout is done. So it can be
very destructive to the UI.
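A minimal sketch of that setTimeout drift, using only the standard timer functions. The blockFor helper and the 10 ms and 100 ms values are illustrative assumptions, not taken from the slides.

```javascript
const requestedDelay = 10; // what we ask setTimeout for, in milliseconds
const start = Date.now();
let actualDelay = null;

setTimeout(() => {
  actualDelay = Date.now() - start;
  console.log(`asked for ${requestedDelay}ms, waited ${actualDelay}ms`);
}, requestedDelay);

// A long synchronous task keeps the thread busy, so the callback above
// cannot run until this loop is over.
function blockFor(milliseconds) {
  const end = Date.now() + milliseconds;
  while (Date.now() < end) {
    // busy wait
  }
}
blockFor(100);
```

The callback only gets its turn once the blocking loop has finished, so the measured delay is at least as long as the long task, not the 10 ms that was requested.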
Promises, meanwhile, will happen in
a better way, in a more efficient
way than setTimeout actually does, but in
nature the behavior is similar. So they're going
to execute the micro tasks and jump between the
different threads, but they're still concurrent.
So what about parallel?
For parallel multiple threads, we have web
workers in JavaScript. They're not the only way of having multiple
threads in JavaScript, but they are one of
the main ways to do it.
And following from the example that we had with setTimeout and
setInterval, more specifically,
UX loves web workers in the sense that things
that are happening in a worker thread do not interfere
with the UI. The tasks are run, and even
as big as they come, they will not block
the interaction with the UI. So the
UI is still going to be as interactive and responsive
as if your page were idle.
And to start going deeper with web workers,
there are two kinds of them. There are the dedicated workers,
which are dedicated to a main script. They are executed
in a very specific window context, and there
are the shared workers that can be used by multiple
windows and main scripts, even iframes, for example.
And though that's the difference that names them, there's more. Once
we start working with worker threads, it's important
to remember that the global scope is different, it's a different object.
Whereas in the main thread we have the global object being the
window object, which we're very familiar with, web
workers have their own specific scope: DedicatedWorkerGlobalScope
for a dedicated worker and SharedWorkerGlobalScope
for a shared worker.
So that means that once we execute code in
a worker thread, we need to be aware of the scope we're
in if we need to interact
with the global scope, and we'll see that we
need to interact with it to send messages to the
main thread. So how
can we write isomorphic code, as in code that can run in whatever
place we put it? The
only thing we actually need is to account for the differences between
each of those objects by
checking what properties they have, and we can then
access all of them using the self keyword.
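One way to sketch that property check is below. The describeScope function and the specific properties it probes (document, onconnect, importScripts, postMessage) are a heuristic chosen for this illustration, not the check used in the talk.

```javascript
// Guess which kind of global scope we are in by probing for properties
// that only some of the scopes have.
function describeScope(scope) {
  if ("document" in scope) return "main thread (window)";
  if ("onconnect" in scope) return "shared worker";
  if (typeof scope.importScripts === "function" && typeof scope.postMessage === "function") {
    return "dedicated worker";
  }
  return "unknown";
}

// `self` refers to the global object in windows and workers alike.
// Fall back to globalThis for runtimes that do not define it.
const currentScope = typeof self !== "undefined" ? self : globalThis;
console.log(describeScope(currentScope));
```

Passing the scope in as a parameter keeps the function usable from any of the three contexts, since the caller can always hand it self.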
But there's another fundamental difference between dedicated workers
and shared workers when we talk about the web workers API, which is that
shared workers, to the best of my knowledge,
have been removed from WebKit, and there is no
plan to add them back. So it's not possible to use
shared workers in a WebKit browser context.
Besides those little differences between background threads and the main one,
there are important things to account for. And just to
be clear, from now on I'm talking about the web worker API as a whole.
With a
few syntax differences, in the sense that with shared workers
we need to specify the port we're connecting to and
other implementation details, those things
apply to both dedicated workers and shared workers.
So from now on I'm just going to mention them as web workers.
Web workers deal with data transfers, and
don't take this word lightly. This means that data
does not exist on both runtimes simultaneously. It doesn't exist
in the main thread and in the worker thread at the same time.
It is copied and not shared.
Once you send it to your worker thread, it's best to
wait until the data comes back before you mutate or consider it at all.
The whole idea of unidirectional data changes takes on a whole
new level of importance when it comes to multithreading.
Because we're not able to mutate the data when it's going back and forth
between the threads, they do not share the same instance, the same object
instance, for example.
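The copy semantics can be tried without a worker at all. This sketch uses the global structuredClone function (available in modern browsers and in Node 17 and later), which applies the same structured clone algorithm that postMessage relies on. The sample object is made up for the illustration.

```javascript
const original = { name: "report", scores: [1, 2, 3], createdAt: new Date(0) };

// What the receiving side gets is a copy, never the same instance.
const received = structuredClone(original);
received.scores.push(4);

console.log(original.scores.length); // 3, the original is untouched
console.log(received.scores.length); // 4
console.log(received === original); // false, two separate instances

// Values the algorithm cannot copy, such as functions, are rejected.
let cloneError = null;
try {
  structuredClone({ callback() {} });
} catch (error) {
  cloneError = error.name;
}
console.log(cloneError); // "DataCloneError"
```

Mutating the copy leaves the original alone, which is why a change made on one side is invisible to the other until it is sent back explicitly.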
And that's because workers interact with the main thread through posting
messages. Data is copied through the structured
clone algorithm, and objects sent and received do
not share the same instance. They're perfect copies of each other
instead. So to transfer objects
with this algorithm, it's important that they are serialized before
they are sent and deserialized upon arrival. And that's
how we use the postMessage method for
it. We pass it through postMessage and we listen
to the message event. And talking about
listening to events, we need to deal with errors on the worker
thread. So runtime errors on the background threads
don't block the main thread because they have a completely different
runtime. So you
do need to listen to the error events
so you can communicate it to your user and provide a good user experience.
Otherwise, if there is a runtime error on your background thread on
your worker thread, it's just going to fail silently if you're
not listening to that event.
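A minimal sketch of both sides of that conversation, including the error listener. The names hugeTask, handleTaskMessage and runInWorker are hypothetical, and runInWorker only works where the browser Worker constructor exists.

```javascript
// Hypothetical heavy task: adds up every integer below `size`.
function hugeTask(size) {
  let total = 0;
  for (let i = 0; i < size; i++) total += i;
  return total;
}

// Worker side: run the task for an incoming message and reply with the result.
// In a worker file this would be wired up as:
//   self.addEventListener("message", (event) =>
//     handleTaskMessage(event, (result) => self.postMessage(result)));
function handleTaskMessage(event, reply) {
  reply(hugeTask(event.data));
}

// Main-thread side (browser only): listen for results and for errors.
function runInWorker(workerUrl, size, onResult, onError) {
  const worker = new Worker(workerUrl);
  worker.addEventListener("message", (event) => onResult(event.data));
  // Without this listener, a runtime error in the worker never reaches the UI code.
  worker.addEventListener("error", (event) => onError(event.message));
  worker.postMessage(size);
  return worker;
}
```

Keeping the handler as a plain function that receives its reply callback makes the worker logic easy to exercise without spinning up a real thread.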
Also, we need to be careful with thread safety.
It's only possible to interact with document APIs that
don't risk exposure of the data from the thread.
As I mentioned before, all objects being sent need
to be serialized,
and that, together with the structured clone algorithm,
imposes important restrictions in regards to the data that's being transferred.
Plus, the fact that background threads have their own execution context
raises a very important issue that developers need to be
aware of. In the vast majority of cases
where we are going to use a web worker, it is important to set
the Content Security Policy on the request
header with the appropriate directives there,
because the worker will not inherit the policy from the main
page. If we don't set the CSP
headers on the request,
it's going to be completely vulnerable, as if we had nothing there.
It's not inheriting, and I'm saying 99% of the cases
because there is a tiny percentage of
use cases where the worker will actually inherit the
CSP headers from the main page. One of the examples
is if the origin of the request URL
is a universally unique identifier.
The other possible case is if the web worker is actually an
embedded worker as an inline script in
our main thread in our main page script.
And now let's review what we talked about. We have a different
runtime in the background thread. We only deal with serialized
objects and only thread safe DOM
components are available. So these sets of
precautions mean that we are unable to make DOM manipulations
from the worker thread. This is actually
kind of nice because the
worker thread is not meant for that. The UI thread is the main thread,
so UI interactions should be handled there exclusively.
For a detailed list of available APIs on
the worker, there's a link in the slides
listed in the description, and you can check that for
more detail.
Okay, we've done our due diligence. We talked about concepts, we talked about security,
we talked about all the things that make this engine turn.
Now it's time for some practical examples and for a demo.
So the stopwatch there is just for you
to see that there aren't any cuts on this example.
But note that once
the button is clicked, the UI is frozen. Like I can
click the boop as many times as I want to,
and nothing actually happens until my big
task is finished with the computation, and then everything
unblocks and all my clicks just come crashing
down.
But you're all thinking okay, that's totally fair. It's a synchronous
method, so that's it. So let's have a look at
how it behaves if we run it in a promise.
Now we're doing just the very same, but with multiple threads. But they're
concurrent. So in theory it would perform a task and jump to
the other thread. But this thing is so gigantic
that it just holds everything. It freezes the state for a while until
it finishes the task and then it comes back and I
get all my interactions at the same time.
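A small sketch of why the promise does not help here: the function passed to the Promise constructor runs synchronously on the same thread. The loop is a made-up stand-in for the big task in the demo.

```javascript
const order = [];

new Promise((resolve) => {
  order.push("heavy task start");
  let total = 0;
  for (let i = 0; i < 1e6; i++) total += i; // stand-in for the heavy computation
  order.push("heavy task end");
  resolve(total);
}).then(() => order.push("then callback"));

order.push("after promise creation");

// The heavy task has already finished by the time this line is reached.
// Only the `then` callback is deferred, and it runs on the same thread.
console.log(order);
```

Nothing after the constructor call gets a chance to run until the heavy loop is over, which matches the frozen UI in the demo.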
Kind of still annoying, right? So let's have a look
with a web worker, and there are two web workers there,
but don't pay attention to that. We're going to get to that a little bit
later. So now the UI is extremely
responsive while everything's running. You see the loading state there
and it's just seamless to the user. The user can just continue to
interact with our app while everything is running,
while everything's happening, the thread's computing
in the background and the user is none the wiser.
They're still doing their thing and the data comes back and we can resume
activity. This demo is available
if you just look at atila.io demo
42, and the code is on GitHub. The link
is over there as well.
So let's take a look at the code then. I already
said it's available on GitHub, but let's
walk through and just see how we can make things better,
maybe. So first, our worker.
At this moment we're just listening to the message event
here. And now a callback will run my huge task and send
the returning value down the wire. Check
the self keyword accessing
the global scope on lines one and four. And now let's see how
we are going to trigger this execution in the main thread.
So now, our React component. First of
all, we instantiate a worker, passing
a URL, created with the URL constructor, to the
Worker constructor. And this is something
for our compiler. In this case I'm using Parcel 2,
but that will be the same for Parcel 1, for webpack,
or any other compiler as far as I'm aware.
Later on we're going to send the
parameter for our worker with
a message to trigger the task.
In this case I'm just saying how big I want the task to be.
It's just a parameter that my method takes.
And finally we're going to have the
event listener for the message event, and this callback
will trigger whenever
the message comes back, and it's going to send the data
to trigger a state change in my React UI.
And those approximately ten lines of code are how you get
background threads to work in a React app without any additional dependencies,
just a platform and a compiler.
But look at line 13 again.
postMessage is a platform method. It will accept
any parameter you pass to it. And I grew too
much accustomed to TypeScript to feel safe with this kind of thing.
So let's see if we can put some extra spices in our code to get
a better developer experience.
There's this library called Comlink.
It's going to leverage the Proxy API, and
just a disclaimer, it's only IE11 plus. I hope you're
not stuck with IE11 anymore.
And it's going to create a remote procedure call.
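To give a feel for the idea of a Proxy that turns method calls into remote calls, here is a toy sketch. It is not Comlink's implementation or API: wrapEndpoint and createFakeEndpoint are invented for this illustration, and the "remote" side lives in the same thread instead of a worker.

```javascript
// Hypothetical in-memory endpoint. A real one would postMessage to a worker
// and wait for the answer. structuredClone mimics the arguments being copied.
function createFakeEndpoint(exposed) {
  return {
    call(method, args) {
      return Promise.resolve().then(() => exposed[method](...structuredClone(args)));
    },
  };
}

// Every property read on the proxy becomes a function that forwards the call
// to the endpoint and returns a promise for the result.
function wrapEndpoint(endpoint) {
  return new Proxy({}, {
    get(_target, method) {
      // Ignore symbols and `then`, so the proxy is not mistaken for a promise.
      if (typeof method !== "string" || method === "then") return undefined;
      return (...args) => endpoint.call(method, args);
    },
  });
}

const remote = wrapEndpoint(createFakeEndpoint({ add: (a, b) => a + b }));
remote.add(2, 3).then((sum) => console.log(sum)); // 5
```

The caller writes remote.add(2, 3) as if the function were local and only has to remember that the result now arrives as a promise.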
So long story short, we can send stuff to our background task without
feeling we left the main thread at all. Let's have a look at
how to implement a bare bones worker with Comlink. So
first we're going to create an object literal and put
our task there. It's now a promise though,
but internally nothing really changed
besides that. We are going to create a type for that object and
we are going to export it. It's going to come in
handy in a moment. Just hold on. And finally we import the
expose method from Comlink and we ask it to do its magic on
line twelve over there.
Now back to our React app. We are going to instantiate
our worker just like before.
Then we are going to use the wrap method from Comlink
on it. And it is a TypeScript generic. Remember that
type that we created in the other file? That's how we're
going to pass the type of our Comlink worker now.
We get a promise back, and that's the result from our worker
thread. So that's going to be strongly
typed in our IDE and for our compiler to
check if we're passing the proper parameters.
But keep in mind that the main difference is that our click
handler is now asynchronous, so we need to provide it with the async
keyword, and we need to await the return of our worker.
And that's about it: how we get a web worker to work
with either type safety or zero dependencies.
If you need to use Comlink with a legacy browser
like IE11, you might need a polyfill for the Proxy API,
but I hope you don't anymore.
And that's just the tip of the iceberg when it
comes to talking about web workers. There are
more features to them. For example, importScripts
is a global function to import scripts.
It's going to bring in third party scripts to be executed
within your worker thread. There's one caveat though.
Make sure that the resource is accessible.
Otherwise, be sure to handle the network error
that will happen, because that network error will actually
bubble up as a runtime error on your background thread, and it's
going to just stop everything. So either
your thread is ready to mitigate that during runtime,
or you are sure that this resource is going to always
be available. And a second thing, which is a
very powerful feature that I didn't mention explicitly in
this talk: there's nothing preventing a worker from spawning another worker.
Those are called subworkers in theory, but they behave just
the same as their parent workers.
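Going back to the importScripts caveat, one way to keep a failed download from stopping the worker is to wrap the call. The tryImportScripts helper, the stand-in scope object and the example.com URL are all hypothetical. In a real worker the first argument would be self.

```javascript
// Try to load scripts into a worker scope and report failure instead of
// letting the error stop everything.
function tryImportScripts(scope, ...urls) {
  try {
    scope.importScripts(...urls);
    return { ok: true };
  } catch (error) {
    return { ok: false, reason: String(error) };
  }
}

// Stand-in scope that simulates an unreachable resource.
const offlineScope = {
  importScripts() {
    throw new Error("NetworkError: failed to load script");
  },
};

const outcome = tryImportScripts(offlineScope, "https://example.com/library.js");
console.log(outcome.ok); // false
```

With the failure turned into a return value, the worker can decide whether to retry, fall back, or post an error message to the main thread.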
And I mentioned at the
beginning of the talk that web workers were not the only way of achieving parallel
multiple threads in JavaScript. So there are actually
other kinds of workers, like service workers,
the very same ones that you use to create progressive web apps,
to do offline caching, and to proxy
API calls. And there are also audio
worklets, which aren't really workers, but they are
very close to them, and they run in a parallel, separate
thread as well.
So if you're leaving this talk and
you're still unsure about what the takeaway is, I just want to
make it very clear that getting
out of the main thread is very doable and cheap,
and I hope this demo has shown you
that there's nothing otherworldly about using multiple threads
in a React app, for example, and that you
consider this as part of your toolbox whenever you
stumble on a situation where rendering performance
can be affected by some heavy computation and
you need some help to provide a snappy user
experience. There are few to no
bottlenecks associated with using a web worker, and I
think that's the main thing that I want to stress here. I'm not
saying that you should run heavy
computations on the client side. I'm saying that there is a specific use
case where you might need to, and then you could
use a web worker, and that's
about it. Thank you very much for having me. These slides are going to be
available. Check it out on Twitter or on my website
and see you next time.