Conf42 JavaScript 2021 - Online

Multi-Thread demystified

Abstract

JavaScript is not single-threaded anymore, and it has not been for a while. Yet we still leave all the processing in the same thread where we render things.

Let’s have a look at how we can do better. By leveraging multiple threads we can ensure the user experience is never jeopardized by the computing and processing we perform: however blocking a task may be, it stays contained within its own thread. Therefore, rendering performance will never drop below the ideal rate.

Summary

  • I'm a lead web developer at SAP. I'm based in Berlin and my website is atila.io. Feel free to mention me on Twitter as well.
  • Multithreading and parallelism are not synonyms. We can have concurrent multiple threads, and we can have, of course, parallel multiple threads. UX loves web workers, in the sense that things happening in a worker thread do not interfere with the UI.
  • Web workers deal with data transfers, and don't take this word lightly. Data is copied through the structured clone algorithm. Workers interact with the main thread through posting messages. Developers need to be careful with thread safety.
  • There's this library called Comlink. It leverages the Proxy API and, just a disclaimer, it's IE11+ only. Let's see if we can put some extra spices in our code to get a better developer experience. This demo is available on atila.io (demo 42), and the code is on GitHub.
  • There's nothing preventing a worker from spawning another worker. When importing scripts, make sure that the resource is accessible. Otherwise, be sure to handle the network error. There are few to no bottlenecks associated with using a web worker.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hi there. My name is Atila Fassina, and now it's time for us to start demystifying multithreading in JavaScript. I'm a lead web developer at SAP. I'm based in Berlin and my website is atila.io. Feel free to mention me on Twitter as well, as Atila Fassina. I'm always keen to connect. Without further ado, let me get started by differentiating multithreading and parallelism. They are not synonyms, in the sense that we can have concurrent multiple threads and we can have, of course, parallel multiple threads. What we need to decide first is whether we need a second or a third thread, and so on, or whether we can work on a single thread. Once we realize that we might need multiple threads, we need to define whether those threads can be concurrent or need to be parallel.
To exemplify what concurrent threads mean, the best example I could come up with was chat threads. You cannot reply to more than one thread at the same time, the same way you cannot read more than one thread at the same time. If you're really great at switching contexts, you may be fast enough that whoever is speaking with you won't notice that you are switching in and out of a thread. But the threads themselves are still concurrent. They are competing for your attention. You cannot read one thread while you reply to another one. You can jump from the study group to games, for example, but while you are there, you don't know what's happening in cooking, for example. And that's what concurrent threads mean. In this case our attention, or we ourselves, are the engine executing the tasks from the threads, and we just jump from one to the other. We don't completely finish a thread before moving to another one, and that's what multiple threads actually mean. We do micro tasks in each one of them, giving the impression of them being executed at the same time.
Whereas if we are performing as a one-person band, for example, we can have multiple things happening at the same time. In this example here, the accordion is being played at the same time as the keyboard, the trumpet, and whatever else this guy is playing. The resources are actually shared, and one thread does not occupy the same processing as another thread; they're happening simultaneously. And that also stresses the fact that parallelism is just a special kind of asynchrony.
But what does that actually mean for JavaScript? This conceptual talk is really great to get the gist of what multithreading actually means, but we need real code examples, right? So let's have a look at our APIs. We have a bunch of asynchronous APIs, and the ones mentioned here in this slide are all concurrent. For example, the time period in a setTimeout cannot be relied on, in the sense that if there is a long task being executed, the callback will take longer than the time you passed as a function parameter. setInterval can be reliable, but at the expense of whatever else is happening on the main thread: it's just going to come in, blocking and executing the callback once the timeout is done, so it can be very destructive to the UI. Promises will happen in a better, more efficient way than setTimeout does, but in nature the behavior is similar. They're going to execute the micro tasks and jump between the different threads, but they're still concurrent.
So what about parallel? For parallel multiple threads, we have web workers in JavaScript. They're not the only way of having multiple threads in JavaScript, but they are one of the main ways to do it. And following on from the example we had with setTimeout and setInterval: more specifically, UX loves web workers, in the sense that things happening in a worker thread do not interfere with the UI.
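The point above about setTimeout and promises can be seen in a small sketch. It is not taken from the talk's slides: `busyWait` is a hypothetical stand-in for a heavy task, and the 50 ms duration is arbitrary. It schedules a timer callback and a promise callback, then blocks the thread, and shows that neither callback gets to run until the blocking work is over.

```typescript
// Blocks the current thread for roughly `ms` milliseconds
// (a hypothetical stand-in for a heavy synchronous task).
function busyWait(ms: number): void {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // spin on purpose
  }
}

const log: string[] = [];

// Both callbacks are scheduled before the heavy task starts...
setTimeout(() => log.push("timeout callback"), 0);
Promise.resolve().then(() => log.push("promise callback"));

busyWait(50);
log.push("heavy task done");

// ...but neither has run yet: they wait until the thread is free again.
const orderSoFar = [...log];
console.log(orderSoFar); // [ 'heavy task done' ]
```

Even with a delay of 0, the timer callback fires only after the heavy task has released the thread, which is why the requested delay is a minimum and not a guarantee.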
The tasks are run and, as big as they come, they will not block interaction with the UI. So the UI is still going to be as interactive and responsive as if your page were idle. To go deeper with web workers: there are two kinds of them. There are dedicated workers, which are dedicated to a main script. They are executed in a very specific window context. And there are shared workers, which can be used by multiple windows and main scripts, even iframes, for example. And though that's the difference that names them, there's more.
Once we start working with worker threads, it's important to remember that the global scope is different; it's a different object. Whereas in the main thread the global object is the window object, which we're very familiar with, web workers have their own specific scope: DedicatedWorkerGlobalScope for a dedicated worker and SharedWorkerGlobalScope for a shared worker. So that means that once we execute code in a worker thread, we need to be aware of the scope we're in if we need to interact with the global scope, and we'll see that we need to interact with it to send messages to the main thread. So how can we write isomorphic code, as in code that can run wherever we put it? The only thing we actually need is to account for the differences between each of those objects by checking what properties they have, and we can then access all of them using the self keyword.
But there's another fundamental difference between dedicated workers and shared workers when we talk about the web workers API, which is that shared workers, to the best of my knowledge, have been removed from WebKit, with no plan to add them back. So it's not possible to use shared workers in a WebKit browser context. Besides those little differences between background threads and the main one, there are important things to account for.
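As a rough illustration of "checking what properties they have", here is a minimal sketch. The function name `detectScope` and the particular properties it checks (`document`, `onconnect`, `postMessage`) are assumptions chosen for this example, not the check used in the talk.

```typescript
type ScopeKind = "window" | "dedicated-worker" | "shared-worker" | "unknown";

// Guesses which global scope `scope` is by looking at the properties it has.
// The choice of properties is an assumption made for this sketch.
function detectScope(scope: object): ScopeKind {
  if ("document" in scope) return "window"; // main thread
  if ("onconnect" in scope) return "shared-worker"; // SharedWorkerGlobalScope
  if ("postMessage" in scope) return "dedicated-worker"; // DedicatedWorkerGlobalScope
  return "unknown";
}

// In a browser you would pass `self`, which exists in the window and in workers.
// `globalThis` is used here so the sketch also runs outside a browser.
console.log(detectScope(globalThis));
```

The order of the checks matters: the window object also has a `postMessage` method, so `document` has to be tested first.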
And just to be clear, from now on I'm talking about the web worker API in general. There are a few syntax differences, in the sense that with shared workers we need to specify the port we're connecting to, along with other implementation details, but these things apply to both dedicated workers and shared workers. So from now on I'm just going to refer to them as web workers.
Web workers deal with data transfers, and don't take this word lightly. This means that data does not exist in both runtimes simultaneously, so it doesn't exist in the main thread and in the worker thread at the same time. It is copied and not shared. Once you send data to your worker thread, it's best to wait until it comes back before you mutate it or consider it at all. The whole idea of unidirectional data changes takes on a whole new level of importance when it comes to multithreading, because we're not able to mutate the data while it's going back and forth between the threads. They do not share the same object instance, for example. And that's because workers interact with the main thread through posting messages. Data is copied through the structured clone algorithm, and objects sent and received do not share the same instance. They're perfect copies of each other instead. So to transfer objects with this algorithm, it's important that they are serialized before they're sent and deserialized upon arrival. And that's how we use the postMessage method for it: we pass the data through postMessage and we listen to the message event.
And talking about listening to events, we need to deal with errors on the worker thread. Runtime errors on the background threads don't block the main thread, because they have a completely different runtime. So you do need to listen to the error event so you can communicate it to your user and provide a good user experience. Otherwise, if there is a runtime error on your worker thread, it's just going to fail silently if you're not listening to that event.
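To make the "copied, not shared" point above concrete, here is a small sketch that uses the global `structuredClone` function, which applies the structured clone algorithm without needing a worker. It assumes a runtime that provides `structuredClone` (modern browsers, Node.js 17 or later); the object shape is made up for the example.

```typescript
// Assumption: the runtime provides a global `structuredClone`.
// It is read through `globalThis` so the sketch does not depend on a specific type library.
const clone = (globalThis as any).structuredClone as <T>(value: T) => T;

const original = { size: 42, tags: ["a", "b"] };
const received = clone(original); // roughly what the other thread would receive

received.tags.push("c"); // mutating the copy...
console.log(original.tags.length); // 2: ...leaves the original untouched
console.log(received === original); // false: two different instances

// Values that cannot be cloned, such as functions, are rejected.
let rejected = false;
try {
  clone({ callback: () => 1 });
} catch {
  rejected = true;
}
console.log(rejected); // true
```

The last part is one of the restrictions this imposes on messages: an object that carries a function cannot be sent as-is.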
Also, we need to be careful with thread safety. It's only possible to interact with document APIs that don't risk exposure of the data from the thread. As I mentioned before, all objects being sent need to be serialized, and that, together with the structured clone algorithm, imposes important restrictions on the data that's being transferred.
Plus, the fact that background threads have their own execution context raises a very important issue that developers need to be aware of. In the vast majority of cases where we are going to use a web worker, it is important to set the Content Security Policy on the request header with the appropriate directives, because the worker will not inherit the policy from the main page. If we don't set the CSP headers on the request, it's going to be completely vulnerable, as if we had nothing there. It's not inherited. And I'm saying 99% of the cases because there is a tiny percentage of use cases where the worker will actually inherit the CSP headers from the main page. One example is if the origin of the request URL is a universally unique identifier. The other possible case is if the web worker is actually an embedded worker, as an inline script in our main page script.
And now let's review what we talked about. We have a different runtime in the background thread. We only deal with serialized objects, and only thread-safe DOM components are available. So this set of precautions means that we are unable to make DOM manipulations from the worker thread. This is actually kind of nice, because the worker thread is not meant for that. The UI thread is the main thread, so UI interactions should be handled there exclusively. For a detailed list of the APIs available in a worker, there's a link in the slides listed in the description, and you can check that for more detail. Okay, we've done our due diligence.
We talked about concepts, we talked about security, we talked about all the things that make this engine turn. Now it's time for some practical examples and for a demo. So the stopwatch there is just for you to see that there aren't any cuts in this example. But note that once the button is clicked, the UI is frozen. I can click the boop as many times as I want to, and nothing actually happens until my big task has finished running the computation. Then everything unblocks and all my clicks just come crashing down.
But you're all thinking: okay, that's totally fair. It's a synchronous method, so that's it. So let's have a look at how it behaves if we run it in a promise. Now we're doing just the very same thing, but multithreaded. It's concurrent, though. So in theory it would perform a task and jump to the other thread. But this thing is so gigantic that it just holds everything. It freezes the state for a while until it finishes the task, and then it comes back and I get all my interactions at the same time. Still kind of annoying, right?
So let's have a look with a web worker. There are two web workers there, but don't pay attention to that. We're going to get to that a little bit later. So now the UI is extremely responsive while everything's running. You see the loading state there and it's just seamless to the user. The user can just continue to interact with our app while everything is running, while the thread is computing in the background, and the user is none the wiser. They're still doing their thing, and then the data comes back and we can resume activity. This demo is available on atila.io (demo 42), and the code is on GitHub. The link is over there as well.
So let's take a look at the code then. I already said it's available on GitHub, but let's walk through it and see how we can make things better, maybe. So first, our worker. At this moment we're just listening to the message event here.
And now a callback will run my huge task and send the return value down the wire. Note the self keyword being used for the global scope on lines one and four.
And now let's see how we are going to trigger this execution in the main thread. So now this is our React component. First of all, we instantiate a worker, passing a URL created with the URL constructor. This is something for our compiler. In this case I'm using Parcel 2, but it will be the same for Parcel 1, for webpack, or for any other compiler as far as I'm aware. Later on we're going to send the parameter for our worker with a message to trigger the task. In this case I'm just saying how big I want the task to be. It's just a parameter that my method takes. And finally we're going to have the event listener for the message event. This callback will be triggered whenever the message comes back, and it's going to pass the data on to trigger a state change in my React UI. And those approximately ten lines of code are how you get background threads to work in a React app without any additional dependencies, just the platform and a compiler.
But look at line 13 again. postMessage is a platform method. It will accept any parameter you pass to it. And I've grown too accustomed to TypeScript to feel safe with this kind of thing. So let's see if we can put some extra spices in our code to get a better developer experience. There's this library called Comlink. It's going to leverage the Proxy API and, just a disclaimer, it's IE11+ only. I hope you're not stuck with IE11 anymore. And it's going to create a remote procedure call. So, long story short, we can send stuff to our background task without feeling we left the main thread at all. Let's have a look at how to implement a bare-bones worker with Comlink. First we're going to create an object literal and put our task there. It's now a promise, though, but internally nothing really changed besides that.
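The dependency-free round trip described above (send a parameter with postMessage, listen for the message event, and listen for the error event) might look roughly like the sketch below. It is not the code from the talk's slides. `WorkerLike`, `runTask`, `createFakeWorker` and `sumUpTo` are hypothetical names, and the fake worker exists only so the sketch runs outside a browser; in a browser you would pass a real `Worker` to `runTask` instead.

```typescript
// The subset of the Worker interface this sketch relies on (hypothetical helper type).
interface WorkerLike {
  postMessage(message: unknown): void;
  addEventListener(type: string, listener: (event: any) => void): void;
}

// Sends `size` to the worker and resolves with whatever it sends back.
// Listeners are never removed here, so this is meant for a single round trip per worker.
function runTask(worker: WorkerLike, size: number): Promise<number> {
  return new Promise<number>((resolve, reject) => {
    worker.addEventListener("message", (event) => resolve(event.data));
    // Without this listener a failure in the worker would go unnoticed.
    worker.addEventListener("error", (event) => reject(new Error(event.message)));
    worker.postMessage(size);
  });
}

// In-memory stand-in for a worker thread. In a real worker file the equivalent would be:
//   self.addEventListener("message", (event) => self.postMessage(task(event.data)));
function createFakeWorker(task: (size: number) => number): WorkerLike {
  const listeners: Record<string, Array<(event: any) => void>> = {};
  const emit = (type: string, event: unknown) =>
    (listeners[type] ?? []).forEach((listener) => listener(event));
  return {
    addEventListener(type, listener) {
      (listeners[type] ??= []).push(listener);
    },
    postMessage(message) {
      // Reply asynchronously, as a real worker would.
      setTimeout(() => {
        try {
          emit("message", { data: task(message as number) });
        } catch (error) {
          emit("error", { message: String(error) });
        }
      }, 0);
    },
  };
}

// Hypothetical "huge task": the size parameter says how big the job is.
const sumUpTo = (size: number): number => {
  let total = 0;
  for (let i = 1; i <= size; i++) total += i;
  return total;
};

runTask(createFakeWorker(sumUpTo), 1000).then((result) => console.log(result)); // 500500
```

Wrapping the two listeners in a promise is a design choice of this sketch: it lets a click handler `await` the result, which is close to the shape the Comlink version ends up with.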
We are going to create a type for that object and we are going to export it. It's going to come in handy in a moment. Just hold on. And finally we import the expose method from Comlink and we ask it to do its magic on line twelve over there.
Now back to our React app. We are going to instantiate our worker just like before. Then we are going to use the wrap method from Comlink on it. It is a TypeScript generic. Remember that type we created in the other file? That's how we're now going to pass the type of our Comlink worker. We get a promise back, and that's the result from our worker thread. So it's going to be strongly typed in our IDE, and our compiler can check whether we're passing the proper parameters. But keep in mind that the main difference is that our click handler is now asynchronous, so we need to mark it with the async keyword, and we need to await the return from our worker. And that's about it: how we get a web worker to work with either type safety or zero dependencies. If you need to use Comlink with a legacy browser like IE11, you might need a polyfill for the Proxy API, but I hope you don't anymore.
And that's just the tip of the iceberg when it comes to talking about web workers; there are more features to them. For example, importScripts is a global function to import scripts. It's going to bring in third-party scripts to be executed within your worker thread. There's one caveat, though. Make sure that the resource is accessible. Otherwise, be sure to handle the network error that will happen, because that network error will actually bubble up as a runtime error on your background thread, and it's going to just stop everything. So either your thread is ready to mitigate that during runtime, or you are sure that this resource is always going to be available. And the second thing is a very powerful feature.
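To give a feel for the expose and wrap pairing described above, here is a hand-rolled sketch of a Proxy-based remote procedure call. It is emphatically not Comlink's implementation or API: the `expose` and `wrap` functions below only borrow the names mentioned in the talk, and `Endpoint`, `createChannel`, `Remote` and `heavyApi` are hypothetical. An in-memory channel stands in for the worker boundary, and error handling is left out.

```typescript
// One end of a message channel (hypothetical stand-in for a Worker or a worker scope).
interface Endpoint {
  postMessage(message: any): void;
  onmessage: ((message: any) => void) | null;
}

// Two connected endpoints that deliver messages asynchronously.
function createChannel(): [Endpoint, Endpoint] {
  const a: Endpoint = { onmessage: null, postMessage: (m) => setTimeout(() => b.onmessage?.(m), 0) };
  const b: Endpoint = { onmessage: null, postMessage: (m) => setTimeout(() => a.onmessage?.(m), 0) };
  return [a, b];
}

type Api = Record<string, (...args: any[]) => unknown>;

// Every method of T, but returning a promise: the caller is now asynchronous.
type Remote<T extends Api> = {
  [K in keyof T]: (...args: Parameters<T[K]>) => Promise<Awaited<ReturnType<T[K]>>>;
};

// "Worker" side: answer each request by calling the named method.
function expose(api: Api, endpoint: Endpoint): void {
  endpoint.onmessage = async ({ id, method, args }) => {
    const result = await api[method](...args);
    endpoint.postMessage({ id, result });
  };
}

// "Main thread" side: a Proxy turns property access into request messages.
function wrap<T extends Api>(endpoint: Endpoint): Remote<T> {
  let nextId = 0;
  const pending = new Map<number, (result: any) => void>();
  endpoint.onmessage = ({ id, result }) => {
    pending.get(id)?.(result);
    pending.delete(id);
  };
  const handler: ProxyHandler<object> = {
    get(_target, method) {
      // Symbols and `then` are not remote calls (the latter keeps the proxy from looking like a promise).
      if (typeof method !== "string" || method === "then") return undefined;
      return (...args: unknown[]) =>
        new Promise((resolve) => {
          const id = nextId++;
          pending.set(id, resolve);
          endpoint.postMessage({ id, method, args });
        });
    },
  };
  return new Proxy({}, handler) as Remote<T>;
}

// An object literal holding the task, plus the exported type the caller uses.
const heavyApi = {
  sumUpTo(size: number): number {
    let total = 0;
    for (let i = 1; i <= size; i++) total += i;
    return total;
  },
};
type HeavyApi = typeof heavyApi;

const [workerSide, mainSide] = createChannel();
expose(heavyApi, workerSide);
const remote = wrap<HeavyApi>(mainSide);

// Strongly typed: passing a string here would be a compile-time error.
remote.sumUpTo(1000).then((total) => console.log(total)); // 500500
```

The generic parameter on `wrap` is what carries the worker's type across the boundary, which mirrors the reason given in the talk for exporting the type from the worker file.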
I didn't mention it explicitly in this talk, but there's nothing preventing a worker from spawning another worker. Those are called subworkers in theory, but they behave just the same as their parent workers. And I mentioned at the beginning of the talk that web workers were not the only way of achieving parallel multiple threads in JavaScript. There are actually other kinds of workers, like service workers, the very same ones that you use to create progressive web apps, do offline caching, and proxy API calls. And there are also audio worklets, which aren't really workers, but they are very close, and they run in a parallel, separate thread as well.
So if you're leaving this talk still unsure about what the takeaway is, I just want to make it very clear that getting off the main thread is very doable and cheap. I hope this demo has shown you that there's nothing otherworldly about using multiple threads in a React app, for example, and that you consider this part of your toolbox whenever you stumble on a situation where rendering performance can be affected by some heavy computation and you need some help to provide a snappy user experience. There are few to no bottlenecks associated with using a web worker, and I think that's the main thing I want to stress here. I'm not saying that you should run heavy computations on the client side. I'm saying that there are specific use cases where you might need to, and then you could use a web worker. And that's about it. Thank you very much for having me. The slides are going to be available. Check them out on Twitter or on my website, and see you next time.
Atila Fassina

Lead Frontend Engineer @ SAP
