Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello everyone, my name is Dmitry Vinnik and I'm a developer advocate on the
Facebook open source team. Today we'll talk about concurrency in Java,
and I'll try to use some of the Facebook open source projects in the Android
space to showcase the lessons we will learn in this talk. So let's
get started. Thank you everyone for joining. As I mentioned, my name is Dmitry Vinnik.
You can find all my other content on my Twitter account, Dmitry Vinnik,
and on my website. So without further ado, let's get started.
What are our goals for today? The goals are fairly straightforward. We'll talk about
concurrency in its forms. We'll discuss misconceptions in concurrency
and also workflows: how and why they are used in Java. So the
big question to ask is why concurrency? Why do we even need
to talk about it? Motivations are fairly straightforward, I'm sure you've heard of them.
It's the fact that we all have multi core machines these days,
the abundance of microservices, which have been around for ages
now, and also the cloud computing services like AWS,
Azure and Google Cloud. So all those services obviously provide you with
as many resources as you'd like, and the concurrency just goes along with it so
you can utilize more. So what's the conclusion now, after me bringing up those
motivating factors? The conclusion is fairly straightforward.
Concurrency is a new reality. Concurrency is the reality.
It's not something you have to adopt anymore, it's already here.
So what's in it for us though? Yes, it's great that the world is ready
and that's how people operate. But why would you need to adopt concurrency?
And the benefits again, just three of them that I'd like to mention. The fact
that there is no idling of resources, the fact that you have multicore
machines but you don't use those resources is kind of wasteful. It improves
user experience. You don't have those freezing threads, the frozen UI while
you're waiting on something to load, where you can't even close the pop-up;
you don't want to have that. That's why we have different threads running in
parallel. And the fact that it really forces you to think about abstractions,
because as long as you have fairly safe abstractions in your code, you can
actually utilize multithreading fairly easily.
But unfortunately it's not just good parts when it comes to concurrency, otherwise there
wouldn't be a need for this talk whatsoever. There are complexities. The fact that thread
safety has to be considered, the race conditions, the liveness. When do you
remove a thread from the pool? When do you reuse it? Performance: it's not a
given that you add another thread to your machine or your service and get a
two-x or ten-x improvement just by adding a couple of threads. It doesn't
work like that. And you also really have to consider the other stages
of the software development lifecycle. How do you test concurrency?
That's a big issue of its own. But even though it's complex, it's actually
beneficial, as I mentioned with the motivations and benefits, and I would
even call it beautiful. Concurrency is beautiful.
It reminds me of the fact that concurrency is very much like sharks.
When we hear about sharks, we worry about them, about how dangerous they are,
but really it's the fact that we don't know them well enough. The same goes
for concurrency. As long as you're not aware of how to use it and how it can
improve your workflow, you try to avoid it. You know, concurrency and sharks
alike. When I read Brian Goetz's Java Concurrency in Practice, at first it
seemed scary to me, it's so complex. But as I was reading more and more,
I realized that concurrency is extremely useful, and the book really made an
impression on me. The same goes for this book by Jean-Marie Ghislain,
Shark: Fear and Beauty, the same kind of concept there. I want to use this
talk to remove that fear that makes you avoid concurrency, so that you can
begin to admire it like I did.
And this journey from fear to admiration is what
will create our agenda. We'll begin by talking about single threading in
Java. Then we'll discuss multithreaded concurrency in
Java, and ultimately we'll finish by discussing workflows.
But as I mentioned, we'll begin by talking about single threads.
So another important question to ask do I even care?
I've given you a couple of motivating factors, benefits, and also complexities.
Still, I might not have convinced you, right? You might be asking yourself or telling
yourself that my app is single threaded. I don't even need to make it more
performant. It might be too complex to think about concurrency.
And that's why, regardless of whether you'd like to adopt concurrency
itself, I always push people to consider implementing design for
concurrency. It's the fact that you stop programming by coincidence.
You don't just write software and assume it will work. You design
by contract. You know what to expect from your code. You try to
avoid temporal coupling, which is basically when, depending on the order of
operations, you actually modify the state of the object. Imagine that you use
an HTTP request object, and depending on whether you've made a POST or GET
call on it, you would expect the response to be inside of the object.
That's a good example of what temporal coupling is. So you need to really
work on making things as immutable and atomic as possible, meaning that
regardless of how many threads interact with an object, one thread can't
change it out from under another thread; each will use its own copy or
utilize other techniques, again designing for concurrency first.
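As a rough illustration of that idea, here is a minimal sketch; the class names are hypothetical, not from any Facebook project:

```java
// Temporal coupling: the result only makes sense if you remember to call
// get() before reading the response, and two threads can stomp on the field.
class StatefulClient {
    private String response; // mutable shared state

    void get(String url) { this.response = "...body of " + url; }

    String getResponse() { return response; } // null unless get() ran first
}

// Designing for concurrency instead: the call returns an immutable result,
// so no ordering of calls can leave the object in a half-baked state.
final class Response {
    private final String body;

    Response(String body) { this.body = body; }

    String body() { return body; }
}

class StatelessClient {
    Response get(String url) { return new Response("...body of " + url); }
}
```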
And that's where I'd like to talk about multithreading a bit more.
And this is regardless of whether you use Java or something else; it's just the
conceptual idea of multithreading. So multithreading, what forms
can it take? There are so many different ideas and forms,
and that's why I'd like to discuss concurrency form, parallel form, and asynchronous
forms. And again, I'll begin with the most commonly used
concurrent form. So concurrent form implies that it's working with multiple
tasks, but it doesn't physically require multiple cores,
but at the same time there are logically simultaneous tasks. What it means is
that if you had an old machine and you didn't have multiple cores, unlike my
Mac laptop, you would still have the perception that tasks completed in
parallel with one another. But in reality it's actually just switching
resources around. So is it too abstract? It might be too abstract with just
the diagram, so I'd like to show you an example.
And I'd like to think about developing
an app. And what do you need to develop an app? You need coffee and
you need the laptop, right? And so when it comes to that, imagine you have
a single thread, a person named John, and John just
needs to code and drink coffee. Code and drink coffee. And even
though it's just a single person, he ultimately completed two tasks.
So two simultaneous tasks: he drank his cup of coffee, but he also completed
the app. So it seemed simultaneous, logically simultaneous tasks, but there's
only one John, right? So he had to use his resources and switch them around.
So ultimately he finished two tasks. Parallel form
is a bit more complex. What it means is that you still have multiple tasks,
or multiple subtasks, as some people think of it.
In other words, imagine you have a complex mathematical
algorithm running, and so what you do, you split it in multiple subtasks
that ultimately do their own part of the equation. Then they come together
and do the summation, or do the
addition, do the subtraction, whatever is necessary. It's physically
simultaneous, so it does require multiple CPUs, which again we have enough of
these days for the most part. Let's try to apply it to developing an app.
And here we actually have Jenny. And Jenny, she's not just a single person;
you can think of it as if she's ambidextrous, she has the ability to use both
hands at the same time. So in this case you can think of Jenny as two people, two threads,
because using one hand she can keep drinking coffee,
finishing that one task, but with another she can just keep coding. So you ultimately
have this physically simultaneous process, completing two
subtasks and ultimately having an app. And it brings us to asynchronous
forms of multithreading. So when it comes to asynchronous, it's really an
idea of fire and forget. It's non-blocking tasks. Think of it this way:
you go to withdraw money from an ATM, and while your main operation is to withdraw
the money asynchronously, it might fire a logging
operation somewhere. So the people who manage ATM
know that the transaction happened of some sort, some metadata.
Right. And that separate task, which doesn't affect the main process of
withdrawing the money, has been triggered. And that's where asynchronous comes
into play. It's fire and forget. So let's think again about developing an app. We have
coffee, we have laptop, but the big question to ask is, where did
the coffee come from? Right? Who does that? Who makes it for you?
And this is where the coffee machine, or our great friend Henry,
comes into play. And Henry is so amazing, he just keeps
bringing you coffee while you work on your app. In this
case, John is a person who's not an ambidextrous developer.
He develops an app, but he's also drinking coffee at the same time. In parallel to
his work, Henry just keeps making coffee, bringing it over without
actually stopping John. So it's been quite abstract so far.
Let's actually look at concurrency in Java. So without further ado,
let's get started with concurrency in Java in particular.
First there were Runnables and Threads. That's probably where most of the
concern about, and avoidance of, concurrency happened for the majority of
beginner Java developers, or even senior Java developers. And so when it
comes to Runnables and Threads, they didn't take any input, they didn't
produce any output, and they gave you no exceptions. As an example, I have
here some pseudocode where you have a process of drinking coffee. It's a
simple run operation that you have to override and implement in one way or another.
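In Java, that pseudocode might look something like this minimal sketch; the class name is just illustrative:

```java
// A Runnable takes no input, returns no output, and cannot throw a checked
// exception from run().
public class DrinkCoffee implements Runnable {
    @Override
    public void run() {
        System.out.println("Drinking coffee on " + Thread.currentThread().getName());
    }

    public static void main(String[] args) {
        Thread thread = new Thread(new DrinkCoffee());
        thread.start(); // fire it off; we have little control over it afterwards
    }
}
```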
But really, when you trigger it, you have very little control over the Runnable
itself. I'll give you a quick demo with Fresco to showcase Runnables and Threads
and how they're used. Fresco is a great image management library on Android.
Imagine you're on a slow network and you can't
show the image completely, in high resolution, right away.
And so what Fresco would do is it will show a placeholder
and slowly improve the quality of the image, will handle caching and
other complex things for you. That's what fresco is. And to use
Fresco, it's fairly straightforward: in your onCreate functionality you
initialize Fresco, then you would add to your layout a simple Draweeview as an
example, and then what you would do is just show the image. Fairly
straightforward, but it will actually handle caching and, as I mentioned,
a placeholder image for you. So now let's take a look at a small example of
Runnables and Threads with Fresco. Imagine I had a simple adapter for Fresco
image handling. I could use something called runOnUiThread, which is common to
Activities in Android, and I would supply a Runnable to it and handle, in its
run method, the refresh of the UI whenever the data on the screen changes.
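A hedged sketch of that pattern; the adapter and activity names are hypothetical, not Fresco's actual sample code:

```java
import android.app.Activity;
import androidx.recyclerview.widget.RecyclerView;

public class ImageGalleryActivity extends Activity {

    private RecyclerView.Adapter<?> imageAdapter; // adapter backing the Draweeviews

    private void onDataChanged() {
        // runOnUiThread accepts a Runnable and executes it on the main thread,
        // which is the only thread allowed to touch Android views.
        runOnUiThread(new Runnable() {
            @Override
            public void run() {
                imageAdapter.notifyDataSetChanged();
            }
        });
    }
}
```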
That's what Runnables are, and they are still heavily used even today, while
concurrency, as I will show in this talk, has improved significantly in the
past couple of years. Fortunately, after Threads and Runnables, JDK 5 was
released, and JDK 5 introduced the concurrency API. The concurrency API had multiple
ways to improve your experience, developer experience when
it comes to concurrency. One of the big additions were thread locals, atomic
operations, thread-safe collections, and a lot of other things. So let's take
a look at ThreadLocal, and what the heck that is. ThreadLocal is a great way
to confine resources. It allows you to request or initialize a different
instance of a resource, like for instance a SimpleDateFormat, depending on the
thread that accesses the ThreadLocal. That being said, it still doesn't come
for free. There are some opportunities for memory leaks, and there's the fact
that you should still be avoiding global fields, which thread locals really
kind of allow you to create.
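A minimal sketch of that confinement with the SimpleDateFormat example:

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// Each thread that touches FORMATTER gets its own SimpleDateFormat,
// which is not thread-safe on its own.
public class ThreadLocalExample {
    private static final ThreadLocal<SimpleDateFormat> FORMATTER =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd"));

    public static void main(String[] args) {
        Runnable task = () ->
                System.out.println(Thread.currentThread().getName()
                        + " formatted: " + FORMATTER.get().format(new Date()));
        new Thread(task).start();
        new Thread(task).start();
        // In long-lived thread pools, remember FORMATTER.remove() to avoid leaks.
    }
}
```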
Atomic operations are also a great addition to the concurrency API. What they
help you with is compound operations, like the inline incrementation you might
have seen before, especially with for loops: i++ is actually a compound
operation. Under the hood it has to retrieve a value, add another value to it,
and store the result, so it's three different operations together, and if you
were to just plainly access the variable and do that kind of addition from
multiple threads, it could lead to a race condition. That's why atomic
operations are great for that. They will make sure that an increment, a
subtraction, or even a more complex AtomicReference update, like the example I
have here, is handled for you, so you keep your speed, and they give you a
compare-and-swap operation that's handled for you in the background.
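A minimal sketch of that compound-operation problem and the atomic fix:

```java
import java.util.concurrent.atomic.AtomicInteger;

// counter++ is really read-modify-write, so two threads can lose updates;
// AtomicInteger uses compare-and-swap under the hood to make it atomic.
public class AtomicCounterExample {
    private static final AtomicInteger counter = new AtomicInteger(0);

    public static void main(String[] args) throws InterruptedException {
        Runnable increment = () -> {
            for (int i = 0; i < 10_000; i++) {
                counter.incrementAndGet(); // atomic, no race condition
            }
        };
        Thread t1 = new Thread(increment);
        Thread t2 = new Thread(increment);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println("Count: " + counter.get()); // always 20000
    }
}
```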
And thread-safe collections: with collections in general, adding values and
retrieving values is complex across different threads. That's why with
thread-safe collections like ConcurrentHashMap you again have a great handle
on compound operations, like getting a value, changing it, and then putting it
back in the collection; it's not that straightforward otherwise. And here you
have configurable concurrency in action: you can control how big the
collection can be, the load factor, and the number of threads that you think
might be accessing this collection. And you don't have to create a
synchronized block, which really just blocks your threads and loses you so
much when it comes to concurrency. That's why
just plainly using this collection saves you so much time.
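A minimal sketch of a thread-safe compound operation on ConcurrentHashMap, with no synchronized block:

```java
import java.util.concurrent.ConcurrentHashMap;

// "Get the value, change it, put it back" collapses into one atomic call.
public class WordCountExample {
    private static final ConcurrentHashMap<String, Integer> counts =
            new ConcurrentHashMap<>(16, 0.75f, 4); // capacity, load factor, concurrency level

    public static void count(String word) {
        counts.merge(word, 1, Integer::sum); // atomic check-then-act
    }

    public static void main(String[] args) {
        count("hello");
        count("hello");
        System.out.println(counts); // {hello=2}
    }
}
```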
And I'll give you a quick demo with Spectrum, another Facebook open source
project, just to showcase how one of these collections can be used.
Spectrum is another open source library from Facebook that focuses
on Android and actually other platforms as well. But in this example I'll
use spectrum for Android. What spectrum does is it handles
transcoding of images for you. In other words, it will handle
in this scenario complex image uploads for you, making sure
that the resolution is actually kept as high as possible with as small a size
as possible. So to use Spectrum, you would simply add an initializer to the
onCreate function, and then you would have to specify what kind of plugins you
will use, basically which image formats; in this example, JPEG, PNG, et
cetera. And in our case, if you wanted to use it, you would just invoke
Spectrum to transcode an input file to an output, produce an output stream,
specify JPEG as the output, and so on. You will see the documentation for
Spectrum is quite extensive and,
I would say, great. But if I were to apply the concurrency API to Spectrum,
you would see that I would use the ConcurrentHashMap that I showed in my
slides. Imagine I wanted to handle duplicate images: I don't want to just keep
re-uploading photos to my app; instead, I want a quick lookup in my map. But
because I have a concurrent application, I want to make sure I don't have a
race condition, and the only thing I would have to do with my already
initialized transcoded-images map is look up the image by name or an id, and
if it's not there yet, I'll add it to the map; otherwise I'll just retrieve it
and use it for my purposes. So again, a quick look at what ConcurrentHashMap
can do for you.
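A hedged sketch of that lookup pattern; transcodeImage() is a hypothetical placeholder standing in for the actual Spectrum call:

```java
import java.util.concurrent.ConcurrentHashMap;

public class TranscodedImageCache {
    private final ConcurrentHashMap<String, byte[]> transcodedImages =
            new ConcurrentHashMap<>();

    public byte[] getOrTranscode(String imageId) {
        // computeIfAbsent is atomic: even if several threads ask for the same
        // image at once, it is only transcoded once, with no race condition.
        return transcodedImages.computeIfAbsent(imageId, id -> transcodeImage(id));
    }

    private byte[] transcodeImage(String imageId) {
        return new byte[0]; // stand-in for the real transcoding work
    }
}
```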
now jump into workflows, and not just workflows in Java. And that's where
there's a big question to ask: where do we start when we talk about workflows?
It's important to start with other languages, with other implementations of
the idea. Promises in JavaScript might have been
the first time I myself personally heard of such workflows. And we'll talk
about Future and Callable, the Executor framework, and that will bring us to
CompletableFuture. So, promises in JavaScript: you don't
have to think long about how to scare a web developer.
You just have to bring up the callback hell that people had to encounter.
When you call an operation, then you wait for it to complete, then you have
to handle on success, on failure, try to even catch exceptions.
It's been really complicated in the past. Fortunately with creation of
promises now, it's very much like chaining of operations.
You call a task, you call an operation, and then depending on how it
works, you either handle success or a failure, or ultimately
an exception. And as you can see in this pseudocode example on the screen,
you can see how much shorter and actually more maintainable this code
becomes. When it comes to Java, Future and Callable are very important. What a
Runnable is for a Thread, basically the workhorse, the thing that does the
work, a Callable is for a Future. Callable is really a big improvement, a step
forward from Runnables. It still doesn't take any input, but it produces an
output and has an exception that you have to handle. In this example, imagine
that you have a process where sometime in the future someone has to fix a bug.
That's what filing a bug is, right? You expect it to be fixed, ideally. So
imagine you have to override a call operation: it can throw an exception,
someone will fix the bug, and then depending on success or failure you will
handle it appropriately, but you also have to handle the exception. The
Future, again, is similar to a Thread, but using Callables. It's something
that's completed in the future; that's where the name comes from. The Executor
framework is what it relies on, and I'll talk in depth about what the Executor
framework is in the later slides. So, ExecutorService: just imagine it's a
thing that we're all aware of. You have an operation, a TODO, to complete. We
have plenty of TODOs in our code bases, something that someone will fix in the
future. So you have a 'complete the TODO' task written, and then you have a
Future that you get by submitting it to the ExecutorService: you say 'code to
be written' and you wait for that code to be written. That's what the Future
will produce, actual code, as you can see in this pseudocode.
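A minimal sketch of that "complete the TODO" pseudocode in Java:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// A Callable takes no input but produces a result and may throw, and an
// ExecutorService returns a Future that will eventually hold that result.
public class TodoExample {
    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();

        Callable<String> completeTodo = () -> "code to be written";

        Future<String> futureCode = executor.submit(completeTodo);
        String code = futureCode.get(); // blocks until the Callable finishes

        System.out.println("Future produced: " + code);
        executor.shutdown();
    }
}
```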
Future sounds great, though, but how do we use it? That's where the Executor
framework is essential. The Executor framework really helps with things like
thread management and implements this declarative model, meaning you don't
have to think about how something works, it just works. So you focus on the
task that you're trying to complete, rather than focusing on how that task is
done. That's the Executor framework. What the Executor framework does for you
is handle that thread management for you. It really relies on the thread pool;
that's what, behind the scenes, the Executor framework is all about. The
thread pool is really what does the thread management and thread
configuration. Here is an example of the actual constructor of
ThreadPoolExecutor in Java itself. It has so many arguments, and you wouldn't
want to initialize it on your own unless you're building a custom executor
framework or ExecutorService. That's why,
regardless of how many arguments there are, we have factories. And factories
are amazing. I can't talk about ExecutorService without talking about
factories. Executor factories: there are plenty of them. There's the
single-thread pool, great for just experiments. The cached thread pool is
something that you would use for small operations, like when you're trying to
crawl web pages and you have multiple threads that crawl different pages;
that's where the cached thread pool comes into play. With the fixed thread
pool, you know exactly how many threads you'd like to utilize; it's great for,
again, complex and resource-heavy mathematical calculations. The scheduled
thread pool: think of it like cron, or a constantly monitoring service that
you might like to trigger periodically. The work-stealing pool is actually how
parallel streams in Java work: they just move work around and utilize whatever
the pool already has. The great thing is that you don't really have to worry
about this too much on your own; just rely on the factories I've just mentioned.
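A minimal sketch of those factories, so you never have to call the ThreadPoolExecutor constructor yourself:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;

public class ExecutorFactories {
    public static void main(String[] args) {
        ExecutorService single = Executors.newSingleThreadExecutor();   // experiments
        ExecutorService cached = Executors.newCachedThreadPool();       // many short tasks, e.g. crawling
        ExecutorService fixed = Executors.newFixedThreadPool(4);        // known number of heavy tasks
        ScheduledExecutorService scheduled = Executors.newScheduledThreadPool(2); // cron-like jobs
        ExecutorService workStealing = Executors.newWorkStealingPool(); // ForkJoinPool under the hood

        single.shutdown();
        cached.shutdown();
        fixed.shutdown();
        scheduled.shutdown();
        workStealing.shutdown();
    }
}
```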
Executors and thread pools ultimately produce this thing that I've mentioned
before: the ExecutorService. The ExecutorService is what allows you to have
these asynchronous tasks, these Futures, but ultimately it's backed by thread
pools. In this example, as I mentioned before, you'd like to crawl a website.
You have a crawler service that relies on the factory for the cached thread
pool, you have a list of URLs you'd like to crawl, and you just submit those
operations to the executor. It launches them as soon as you invoke submit, and
you can just collect those Future pages and ultimately wait for them to complete.
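A minimal sketch of that crawler; fetchPage() is a hypothetical placeholder for the real HTTP call:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CrawlerService {
    private final ExecutorService executor = Executors.newCachedThreadPool();

    public List<String> crawl(List<String> urls) throws Exception {
        List<Future<String>> futurePages = new ArrayList<>();
        for (String url : urls) {
            futurePages.add(executor.submit(() -> fetchPage(url))); // starts as soon as submitted
        }
        List<String> pages = new ArrayList<>();
        for (Future<String> futurePage : futurePages) {
            pages.add(futurePage.get()); // wait for each page to complete
        }
        return pages;
    }

    private String fetchPage(String url) {
        return "<html>" + url + "</html>"; // stand-in for a real download
    }
}
```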
I'll give you a quick demo with Fresco as an example of how to use
ExecutorService. So let's give it a go. Another quick look at Fresco and how
it can be used with Futures and ExecutorServices. In this case, I have an
image pipeline, something that Fresco relies on quite heavily. It has a great
implementation of something called data sources and its own executors. The
only thing you have to know is that this is how it handles images. And so if I
were to subscribe to a certain data source, the only things I would really
need are the bitmap subscriber and an executor, in this case the
CallerThreadExecutor, which I would retrieve an instance of. That's how it's
actually used in Fresco: a good, production-ready example.
The important question to ask: are we done? I mean, you know, the title of
this talk is CompletableFuture, so you can guess we'll talk about something
other than Future, because it's not perfect. Future has pitfalls. Some of
them: blocking operations; there's no result chaining, so it's not a real
promise; no future combination, that is, combining multiple futures running in
parallel; and exception handling is fairly complex when dealing with Futures themselves.
So, blocking result retrieval: what does it mean? When you call get on the
Future itself, it blocks the process, right? And you actually have to handle
InterruptedException and things of that sort. There are some ways to handle
it, but ultimately you're losing a lot of the benefits of asynchrony. And you
also have to be careful and always use timeouts; you don't want to end up with
just a continuous block with no end whatsoever. That's why you call get with a
timeout of some sort.
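A minimal sketch of blocking retrieval with a timeout, and the checked exceptions you have to juggle:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BlockingGetExample {
    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<String> future = executor.submit(() -> "fixed bug");
        try {
            String result = future.get(2, TimeUnit.SECONDS); // blocks, but never forever
            System.out.println(result);
        } catch (TimeoutException e) {
            future.cancel(true); // give up on the slow task
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
        } catch (ExecutionException e) {
            e.getCause().printStackTrace(); // the task itself failed
        } finally {
            executor.shutdown();
        }
    }
}
```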
There is no future chaining: you can't wait for a Future to complete and just
keep writing stream-style code. You have to make a call for it, wait for the
task to complete, and only then handle it one way or another. For instance,
you have a kanban board, you have a developer who writes the code, you have to
wait for that code to be complete, and only then can you test it, which kind
of makes sense. But if you have to write it as two separate blocks, it becomes
kind of harder to maintain because you context switch. That's why future
chaining is so important. Exception handling is another thing I want to bring
up: it is complex. There are multiple exceptions to handle, interrupted
exceptions, execution exceptions, timeout exceptions. There are lots of things
when you're dealing with Futures, and you have to handle them completely
differently depending on what exception you're dealing with. So there are many
issues, but no worries, right? We have CompletableFuture to help us. That's
why this talk is called CompletableFuture, and that's where we'll discuss it at length.
So, CompletableFuture: it basically gets you everything that Future has, but
now it has an additional interface implementation, CompletionStage. It allows
you to transform, compose, chain, and combine, basically everything that
Future had issues with. CompletableFuture allows you to handle those pitfalls
I mentioned before with Future; we're handling them with CompletableFuture.
Transformation and chaining: it allows you to have workflows. It relies on
something called the ForkJoinPool, which I brought up really briefly earlier.
Basically, the way you write it, you can call supplyAsync, basically
triggering an operation like: the developer wants coffee, so a person has to
brew it, and while that is happening, the developer can just keep doing their
thing, so it doesn't have to stop the person from working. As soon as the
coffee is done, you can invoke thenApply, basically drinking the coffee when
it's ready. That's it. You can have more complex setups, like a separate pool
just to handle that operation, but that's for folks who have more intense
concurrency models in place.
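A minimal sketch of that coffee workflow with supplyAsync and thenApply:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// supplyAsync brews on another thread (the common ForkJoinPool by default),
// and thenApply chains the "drink it" step without blocking the developer.
public class CoffeeWorkflow {
    public static void main(String[] args) {
        CompletableFuture<String> coffee = CompletableFuture
                .supplyAsync(() -> "latte")                 // brew in the background
                .thenApply(cup -> "drinking " + cup);       // runs when brewing is done

        System.out.println("developer keeps coding...");    // not blocked

        // Optionally supply your own pool instead of the common ForkJoinPool:
        ExecutorService baristaPool = Executors.newFixedThreadPool(2);
        CompletableFuture<String> custom =
                CompletableFuture.supplyAsync(() -> "espresso", baristaPool);

        System.out.println(coffee.join());
        System.out.println(custom.join());
        baristaPool.shutdown();
    }
}
```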
Controllable futures: you can now control when the operation is done. So in
this case you can control when the future is completed. If it's not yet done,
you can say 'give me whatever' and specify a default value, or you can even
forcefully complete it from another thread if you'd like. In this case, I
complete the latte making, which is: give me an espresso, give me whatever
you've done so far if you're not yet done, and I'll take that and go with that.
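A minimal sketch of that control with getNow and complete:

```java
import java.util.concurrent.CompletableFuture;

// getNow hands back a default if the future isn't done yet, and complete
// finishes it forcefully from outside.
public class ControllableFutureExample {
    public static void main(String[] args) {
        CompletableFuture<String> latte = new CompletableFuture<>();

        // Not done yet? Take whatever you can get right now.
        System.out.println(latte.getNow("espresso")); // prints "espresso"

        // Another thread (or this one) can force the result in.
        latte.complete("latte");
        System.out.println(latte.getNow("espresso")); // prints "latte"
    }
}
```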
You can have multiple futures controlled, right? You have the allOf or anyOf
operations with CompletableFuture here. It's perfect for, let's say, trying to
run a WebDriver test against multiple browsers, Firefox, Chrome, and IE, and
the only thing you really have to invoke is allOf, and you just run those
asynchronous tasks in whatever way, shape, or form you'd like, and then just
wait for those to be done. You can see how many processes are still working;
you can control them in an even more fine-grained way if you'd like. Just the
amount of control it gives you is outstanding.
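A minimal sketch of allOf for that browser example; runBrowserTest() is a hypothetical stand-in for the real test run:

```java
import java.util.concurrent.CompletableFuture;

public class CrossBrowserTests {
    public static void main(String[] args) {
        CompletableFuture<Void> firefox = CompletableFuture.runAsync(() -> runBrowserTest("Firefox"));
        CompletableFuture<Void> chrome = CompletableFuture.runAsync(() -> runBrowserTest("Chrome"));
        CompletableFuture<Void> ie = CompletableFuture.runAsync(() -> runBrowserTest("IE"));

        CompletableFuture<Void> all = CompletableFuture.allOf(firefox, chrome, ie);
        all.join(); // block here only once, when everything is done
        System.out.println("All browser runs finished");
    }

    private static void runBrowserTest(String browser) {
        System.out.println("Running tests on " + browser);
    }
}
```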
You can also combine your tasks, right? You can collect your results; you can
run them either synchronously or asynchronously. Imagine you have two teams
asking you for APIs before you work on them. You don't want to just rush; you
want to make sure you have those requests properly filed, and only then will
you put them on your kanban board or whatever you use. And so you'll wait for
those teams to complete their operations, which is filing those requests, and
only then will you begin working on them. That's where the thenCombine
operation comes into play. It's just that powerful.
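A minimal sketch of thenCombine for that two-teams example:

```java
import java.util.concurrent.CompletableFuture;

// Wait for both teams to file their requests, then merge the two results
// into a single backlog entry.
public class CombineRequestsExample {
    public static void main(String[] args) {
        CompletableFuture<String> teamA =
                CompletableFuture.supplyAsync(() -> "request from team A");
        CompletableFuture<String> teamB =
                CompletableFuture.supplyAsync(() -> "request from team B");

        CompletableFuture<String> backlog =
                teamA.thenCombine(teamB, (a, b) -> a + " + " + b);

        System.out.println(backlog.join()); // only runs once both are filed
    }
}
```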
But there's also exception handling, which, as I mentioned, is very complex
with Futures. Here you have very much a try-catch-finally flow, but using
CompletableFuture. In this case, if an exception is thrown, you have an
exceptionally call to make, and if an exception happened, the flow will go
there and you handle it whatever way you want: you can fail completely, you
can propagate the exception, you can just return some value, or you can
otherwise handle it. And there's basically a finally: in that case, if an
exception happened, it goes down its own branch, and if not, you can proceed
in whatever shape or form you'd like. It's great, a paradigm where you don't
have to switch from your regular, non-concurrent coding when using CompletableFuture.
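A minimal sketch of that try-catch-finally style flow, using exceptionally as the catch and handle as the finally-like step:

```java
import java.util.concurrent.CompletableFuture;

public class ExceptionHandlingExample {
    public static void main(String[] args) {
        CompletableFuture<String> coffee = CompletableFuture
                .<String>supplyAsync(() -> { throw new IllegalStateException("machine broken"); })
                .exceptionally(ex -> "instant coffee")            // fallback value instead of failing
                .handle((result, ex) -> ex == null ? result : "no coffee at all"); // runs either way

        System.out.println(coffee.join()); // prints "instant coffee"
    }
}
```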
Just outstanding. And to give you a quick demo, I'll use Litho, another open
source project from Facebook, to showcase CompletableFuture really briefly.
So another example of a Facebook open source library that I'd like to showcase
really quickly is Litho. Litho is a great declarative framework for UI on
Android. What it does is help you easily and quickly create UIs for Android
applications. In this example, the only thing you would need to do to have
Litho working for you is initialize the SoLoader, and then you would have
components. It's heavily inspired by React; as you can guess, that's where
components come from. And in this case I just show you how a Text component
might show up in the UI on Android. We have a great tutorial on the Litho
website as well, fblitho.com. So to give you an example of how it's used with
CompletableFuture, how Litho can utilize CompletableFutures: imagine that I,
for some reason, whatever the reason might be, want to randomly change the
text on the component. So what I would do is use CompletableFuture.runAsync,
which basically triggers an operation whose result I don't really care about;
it produces void as its return. And what I would do is retrieve a Text
component, change its text, and basically rebuild the UI. But again, I would
be triggering it randomly on a separate thread. It might be just some funny
small app that you might be building, and that's where CompletableFuture comes
into play quite usefully.
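A hedged sketch of that idea: runAsync fires an operation whose result we don't care about (it returns void); updateTextComponent() is a hypothetical placeholder for rebuilding the Litho Text component, not Litho's actual API:

```java
import java.util.Random;
import java.util.concurrent.CompletableFuture;

public class RandomTextUpdater {
    private static final Random RANDOM = new Random();

    public static void scheduleRandomUpdate() {
        CompletableFuture.runAsync(() -> {
            String newText = "greeting #" + RANDOM.nextInt(100);
            updateTextComponent(newText); // rebuild the UI with the new text
        });
    }

    private static void updateTextComponent(String text) {
        System.out.println("Rebuilding Text component with: " + text);
    }
}
```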
So I always like to end my talks with a call to action.
Try to embrace concurrency. As I mentioned before, it's already here.
It's a reality that we all live in. Review your application. Even if
you're not ready for concurrency, think about designing for concurrency.
It will bring you benefits just by itself. And just continue learning.
Concurrency is one thing, reactivity is another very popular concept
that's been discussed for ages now. So continue learning.
When it comes to that, don't be afraid of a concept just because of how you've
used it before, like threads and Runnables; there's a lot of work being done
around that. So, my name is Dmitry Vinnik. Go to my Twitter, my blog,
LinkedIn, or just email me directly if you have any questions. Thank you so much.