Abstract
As a beginner myself (not so long ago) trying to learn this language, it always seemed like C++/Python like language. Golang is much more than just functions and execution, its probably the easiest to write code which exploits concurrency at its best and in an intuitive and readable manner.
This talk, I’ll discuss “channels” and “go-routines” with an example of MergeSort to demonstrate the power of this language for beginners.
Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hi, I am Jay Jayaganesh Kalyanasundaram, a software engineer in Google.
This lightning talk in the beginner's track will cover my
journey of venturing into and understanding concurrency in
Golang. When I first came across the syntaxes in Golan,
I thought it's almost same as C and has slight differences,
like the keyword where for variable and funk
for functions. I'm an avid user of oops and
found that structs with interfaces can help mimic
the thought of classes. Golang also supports
weight groups, which can be really useful for
lock retentions. Finally, for me, at the first glance,
Golang was the performance of c in
cpython. Like intuitive language, merge sort is
one of the good known algorithms to sort an array
with complexity o of n login. It works by
breaking the array into two halves and recursively sorting the halves
and finally merging them. For me, this is one of the
good problems to code in a new language to understand the
semantics of functions, variable declaration initializations,
array slice manipulations. So I went
ahead and coded the same in Golan. The sort function here
simply breaks the array into two halves and sorts
them recursively and merges them, and finally returns the
result. The merge function, on the other hand, uses a two pointer
to iterate through the two arrays to be merged, and updates
the final array with the smaller of those two elements pointed
by those two pointers. You can read more about it separately in the Wikipedia
link, which I shared in the previous slide. So after coding
the basic merge sort, I dwelled deeper into Golang to understand
the other salient features, and discovered the
go keyword, which is basically a magic to do multithreading
in a very easy manner. I wrote a lot of code in C plus plus
with multithreading by using the
library threads. Though it wasn't rocket science, it wasn't
as intuitive and easy as just using the go followed by
the function name. This go keyword launches a
go routine, which is the fundamental unit of running a multi
threaded program in Golang. I also came across
the channels which I've never heard before in any
other language. My first thought towards
the handling of threads was that this is a server
optimization problem, and it's not of much use to me,
although because the go keyword was
so tempting, I decided to give that a try.
With that, I basically wrote the same merge sort with slight
channels by using the go routines.
So here, as you can see, s one and s two are those
half two halves which are basically sorted
in those two concurrent threads defer in
a scope is basically the last command which will be executed in that scope.
So in this case, the wake group release is basically the last
command which will be done in the scope of the funk. For each
of those goroutines, we use weight group
so that we ensure to get both the slices fully sorted
before we start merging. As you can see, the waiting
for the two goroutines to complete is
the part which is slowing the execution down. So I was thinking,
how can I speed up this execution? So I was thinking we could
stream the data so that we don't need to wait for these total
of two halves to be fully sorted and can proceed with the merging as
and when we get the parts of the sorted data. So I
thought of going back to what I came across, and that
was channels. So I started reading a bit more about channels,
and my first glance at them was, these are more like water pipes.
So it basically transfers data from one end to the
other. Okay, so what are the salient features
of a channel? First, it has a start and it has
an end. So it has a sender and it has a receiver.
Second, it has a buffer to hold and this could be
zero. This buffer is basically like the length of the pipe.
So if the length is zero, imagine it's like a trying. So you just have
a ring and you have the source and the sink pretty much
with no gap between them. And if the
receiver is not receiving more like there is a gap towards
the end of the pipe, the sender will be blocked. The sender can't send
more water through the pipe. Similarly, if the sender is
not sending, the receiver will not know about it and will keep waiting
indefinitely. The sender, however, can close the pipe
once they have sent all the water or all the data, in this case for
the channel, and basically say the channel that they have sent all the
data, and the pipe or the channel, once has
transmitted all this data to the receiver, will intimate
the receiver that it has actually transmitted all the data, and it is going
to close the pipe now once for all. So, with this
analogy of water pipe, let's get back to writing the same code
with channels. We now use channels instead of a slice,
so that we get the stream of the data as and when they are ready,
instead of waiting for the two fully sorted slices. So here you
sre basically sorting those two halves again concurrently and
merging them concurrently. That is, as and when you get the data
from the individual channels. As a recap
from what we did so far, we started off with
sorting the two halves of the slice sequentially
and then doing the merge. Then we used the go
keyword and improved it by sorting those
two halves concurrency and then merging them at once,
and then finally we sorted and merged them
again concurrently with no blocking at all.
So we are benefited by directly transferring the data with
channels to improve the concurrency.
Also here I would highly recommend reading this article
in the Go block, which basically explains why
it's better to share memory by again communicating
rather than communicating by sharing memory. The latter
needs log contentions to ensure safe access by
different processes. Now a quick anecdote
of how I used it practically so we were in a system in our team.
We use a system in our team where we had a front
end which was surfacing a command line interface for the
user. It had multiple backends which it was communicating with,
and every backend interaction took a few
minutes. Sometimes it was also in the order of about 510 minutes.
So overall it was in the order of half an hour to 1
hour. The front ends and the back ends were run as a single binary.
So for the user it is just one single CLI command which they
execute, or one single binary which is finally executed.
The user wanted to know the progress of the CLI, because when
you're waiting for half an hour, it's kind of annoying to just see a blank
screen. You would want to see something more interactive. So finally we
had to solve the problem of how do we report the intermediate progress before the
function call to each of the backends actually returns?
So we basically built a channel from the
front end to each of these backends, and the channel was
populated with the progress as and when some significant
milestone was reached, and each of this was populated to the front
end, which basically aggregated that and showed to the user
at appropriate instances. The other benefit we had was there was
no extra overhead with logging them in separate
place and using some extra db and stuff, so we were able to
solve it in a very cheap manner. That's pretty much it for
the small lightning overview of how I learned to use
concurrency in Golang and the different salient features,
including the most powerful of all the channels.
Hope it was useful to the beginners. And thank you.