Transcript
Hello everyone, I am Sandip Bhat. Today I will be talking about developing a custom load balancer using Go and Envoy that is both scalable as well as fault tolerant. Before we proceed further, I would like to quickly introduce myself. I have eight-plus years of experience in the industry. Prior to my current role, I used to work at companies like Walmart, C Square, and Packer Enterprise. Currently I work as a staff software engineer at Harness. Harness is a company that operates in the DevOps space, and I am part of a team that is focused on cloud cost optimization. As a result of this, I have gotten exposure to multiple different cloud products like AWS and GCP. Beyond work, well, I love traveling across the world, as well as reading up on different tech and exploring new technologies.
How are we going to go about this talk? We will start by discussing some of the basic concepts of load balancing. We will then see what different cloud native options we have. Post that, we will discuss Envoy: what Envoy is, what some of the key features of Envoy are, and some of the components of Envoy that we will be utilizing in our custom load balancer. We will then see the different features that we would want to target with our custom load balancer, and we will discuss the architecture and the components of the custom load balancer. Then, towards the end of the talk, we will also see a working demo of the custom load balancer. So what do we mean by load balancing?
Load balancing is a key concept in distributed computing, where scalability, reliability, as well as fault tolerance are essential. At the core of it, load balancing is primarily about routing incoming traffic across multiple different application servers, ensuring that they are not overwhelmed with requests, and also ensuring optimal usage of the resources.
So as you can see in this image, you have multiple different users trying to access a particular service over the Internet. In this case, you can see a load balancer sitting in between the users and the application servers, routing traffic across multiple different applications. Some of the key features of load balancing would be high reliability and availability, and the flexibility to scale: in this case, the application servers can scale by adding more servers and thus achieve what you call horizontal scaling. The performance of the applications is thus improved when traffic is routed, or distributed, evenly across multiple different applications.
Some of the key, or commonly seen, load balancing algorithms are round robin based load balancing, weighted round robin based load balancing, as well as least connection based load balancing.
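To make the simplest of these concrete, here is a minimal, illustrative round-robin picker in Go; the backend addresses are made-up placeholders, not anything from the talk:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// roundRobin cycles through a fixed list of backends.
type roundRobin struct {
	backends []string
	next     atomic.Uint64
}

// pick returns the next backend in order, wrapping around the list.
func (r *roundRobin) pick() string {
	n := r.next.Add(1) - 1
	return r.backends[n%uint64(len(r.backends))]
}

func main() {
	// Placeholder backend addresses for illustration.
	lb := &roundRobin{backends: []string{"10.0.0.10:80", "10.0.0.11:80"}}
	for i := 0; i < 4; i++ {
		fmt.Println(lb.pick()) // alternates evenly between the two targets
	}
}
```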
This is a simple example of load balancing. As you can see on the left side of the screen, you have multiple clients trying to access a particular service over the Internet. The load balancer sees that both application servers are healthy and routes, or distributes, traffic across both of them evenly. And on the right, you can see that one of the application servers has gone down. The load balancer recognizes that and routes all the traffic to the other application server. Thus the clients do not see any difference; they do not notice anything.
So talking about the cloud native options, what are the cloud native options that we have? AWS has its own offering called ALB, Azure has Application Gateway, and GCP has its own offering. In this case, we are primarily discussing layer 7 load balancing; we are not going to be discussing layer 4 load balancing here. The core idea of load balancing involves three major components: you have incoming traffic, which is processed using rules that define how it has to be acted upon, and then you have logically grouped application servers, which are called target groups, which would basically be serving all of that incoming traffic.
Moving on, what do we mean by Envoy? Envoy is a CNCF graduated project. It came out of Lyft, and Envoy is primarily written in C++. Envoy is a reverse proxy that operates at layer 7. And it is pretty extensible, in the sense that it has a filter-based mechanism wherein you can chain multiple filters, much like middleware in an API server; by chaining them, you can customize Envoy to your own needs. And as you can see from the commits and stars, it is a pretty popular project.
So what are some of the key features of Envoy? Service discovery; load balancing; health checking, wherein you are able to take certain VMs out of rotation when they are unhealthy; security, where Envoy has a lot of features; observability, wherein you are able to track different metrics like the number of requests served and other aspects of observability; and rate limiting, where you are able to ensure that your backend servers are not overwhelmed by limiting the number of requests within any given threshold. And it is pretty extensible: as we spoke about before, you can use multiple different filters to customize Envoy.
So what are some of the key components of Envoy that we would be using in our custom load balancer? These are some of the key components: listeners, filters, clusters, secrets, and upstreams. Looking at this image here, you see port 443 and port 80. These are examples of listeners, wherein our load balancer would listen on port 80 as well as port 443 for incoming requests. The requests then move through a set of filter chains. These filter chains would include things like domain matching, as well as custom components; for example, there could be Lua plugins that you can use to track incoming requests, log them, or act on the packet. Once these rules are applied to the incoming request, it is routed to the appropriate cluster. A cluster is nothing but a group of upstreams. An upstream is, as an example, a single VM, a virtual machine that serves your application. A logical grouping of these upstreams, for a given domain, for example, or a given path in a domain, would be a cluster. And secrets would be used for managing and handling the certificates.

This is a sample, working configuration of Envoy, wherein in this case we will be routing traffic coming in on port 80. And we are not restricting ourselves to any domain here, which means that any traffic coming in on port 80, on the default route path, would be routed to the cluster named some_service. And the cluster some_service in this case is pointing to a particular IP address on port 80, with health checks defined, as seen here, with a timeout of 2 seconds and a check being performed at an interval of every 5 seconds. A failed health check would mean that the target is taken out of rotation.
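As an illustration of the configuration being described, a static Envoy config along these lines would express it; this is a sketch against the Envoy v3 API, and the upstream IP address is a placeholder:

```yaml
static_resources:
  listeners:
  - name: listener_http                      # the listener: port 80
    address:
      socket_address: { address: 0.0.0.0, port_value: 80 }
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          route_config:
            virtual_hosts:
            - name: all_domains
              domains: ["*"]                 # not restricted to any domain
              routes:
              - match: { prefix: "/" }       # default route path
                route: { cluster: some_service }
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  clusters:
  - name: some_service
    type: STATIC
    connect_timeout: 1s
    load_assignment:
      cluster_name: some_service
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: 10.0.0.10, port_value: 80 }  # placeholder upstream IP
    health_checks:                           # failed checks take the target out of rotation
    - timeout: 2s
      interval: 5s
      unhealthy_threshold: 3
      healthy_threshold: 2
      http_health_check: { path: "/" }
```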
So what are the requirements of our custom load balancer? These are some of them; you could possibly add more features that we could support with our custom load balancer, but to begin with, we look at some basic features that we want to support. The primary one would be to distribute traffic amongst multiple different backend targets. Our custom load balancer should not be limited to any given domain; we should be supporting multiple domains, with one load balancer that can route traffic to multiple different domains and paths. It should support health checking; that way we would be able to take out the application servers that are unhealthy. It should be cloud agnostic, in the sense that we should be able to run our custom load balancer, or easily port it, across multiple different clouds. And we are looking at scalability customization; we will see how we go about that. So this is the design of our custom load balancer.
We have a few different components, as we can see here. The idea is that we would have a virtual machine in which we will run our load balancer; this VM will behave as the custom load balancer. And by making it cloud agnostic, you can easily run this VM anywhere: you could run it on AWS, GCP, or even Azure. As you can see here, we have our custom load balancer running within a VM, but it will be interacting with some other components which are outside the VM, like in this case an API server as well as a database. So there will be a database which would basically store the configuration of our load balancer, like the domain, the incoming port, the outgoing port, the IP addresses of the application VMs, as well as details related to certificates, etcetera.
So we will have an API server that would fetch this data from the database and interact with our load balancer, which will basically be fetching this configuration, parsing it, and translating it into something that Envoy can understand. As you see here, we would have a VM with two services that can be run as Linux systemctl services, for example: we would have Envoy running inside the VM, and we would have a custom control plane, written in Go, that would be communicating with Envoy in the form of gRPC communication. As we discussed before, Envoy supports service discovery, and that is how we are going to do this. We will also be using cloud-init; we will come to that.
This is how a sample Envoy service looks: you would have the service definition, and the aspect of this service configuration that we are interested in would be the configuration that we will be passing to Envoy when it boots up. As you can see here, we would be passing a particular startup config that Envoy would boot with; that way it knows how to interact with our control plane. This is the configuration that we would want to bring up Envoy with, and these are some of the service discovery components of Envoy that we would want to use. In this case, the LDS config would be the Listener Discovery Service, and CDS would be the Cluster Discovery Service. So our Envoy would boot with these two discovery services enabled. And as you can see here, it would be using gRPC to communicate and discover the services. In this case, we are going to tell our Envoy that it should be communicating with something that we call the xds_cluster here for fetching the configuration, and it will be able to update dynamically, without any downtime. What we mean to say is, if we were ever to add a new domain or remove a domain from our load balancing configuration, we would not want to restart Envoy or have any downtime to update the configuration; that would happen dynamically. And in this case, our xds_cluster, our control plane, would be running on port 18000 on the same host, 127.0.0.1, as you can see here. It can run on any port; as an example, here I have set it to 18000.
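A bootstrap along these lines would express what is described here: LDS and CDS pointed over gRPC at the xds_cluster on 127.0.0.1:18000. This is a sketch against the Envoy v3 API; the node id is a made-up placeholder:

```yaml
node:
  id: lb-node-1            # placeholder node id
  cluster: custom-lb
dynamic_resources:
  lds_config:              # Listener Discovery Service
    resource_api_version: V3
    api_config_source:
      api_type: GRPC
      transport_api_version: V3
      grpc_services:
      - envoy_grpc: { cluster_name: xds_cluster }
  cds_config:              # Cluster Discovery Service
    resource_api_version: V3
    api_config_source:
      api_type: GRPC
      transport_api_version: V3
      grpc_services:
      - envoy_grpc: { cluster_name: xds_cluster }
static_resources:
  clusters:
  - name: xds_cluster      # our control plane, on the same host
    type: STATIC
    connect_timeout: 1s
    typed_extension_protocol_options:        # gRPC needs HTTP/2 upstream
      envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
        "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
        explicit_http_config:
          http2_protocol_options: {}
    load_assignment:
      cluster_name: xds_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: 127.0.0.1, port_value: 18000 }
```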
So our control plane can look something like this. It can be pretty minimal, in the sense that you have a goroutine that would be running the Envoy xDS server, which would basically communicate with Envoy through gRPC, sharing the configuration that Envoy has to update itself with. And then we also have a sync server, basically, that would periodically talk to the API server and fetch the latest configuration. In this case we would be polling our API server, but we could even use websockets to improve the performance.
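As a rough sketch of this shape, assuming the envoyproxy/go-control-plane library (exact APIs vary by version); runSyncServer is a hypothetical helper, sketched in the next snippet:

```go
package main

import (
	"context"
	"net"

	clusterservice "github.com/envoyproxy/go-control-plane/envoy/service/cluster/v3"
	listenerservice "github.com/envoyproxy/go-control-plane/envoy/service/listener/v3"
	"github.com/envoyproxy/go-control-plane/pkg/cache/v3"
	"github.com/envoyproxy/go-control-plane/pkg/server/v3"
	"google.golang.org/grpc"
)

func main() {
	ctx := context.Background()

	// The snapshot cache holds the listener/cluster config Envoy will discover.
	snapshotCache := cache.NewSnapshotCache(false, cache.IDHash{}, nil)
	xdsServer := server.NewServer(ctx, snapshotCache, nil)

	// Goroutine 1: the gRPC xDS server that Envoy's LDS/CDS point at (port 18000).
	go func() {
		grpcServer := grpc.NewServer()
		listenerservice.RegisterListenerDiscoveryServiceServer(grpcServer, xdsServer)
		clusterservice.RegisterClusterDiscoveryServiceServer(grpcServer, xdsServer)
		lis, err := net.Listen("tcp", "127.0.0.1:18000")
		if err != nil {
			panic(err)
		}
		if err := grpcServer.Serve(lis); err != nil {
			panic(err)
		}
	}()

	// Goroutine 2 (hypothetical helper): periodically pull config from the
	// API server and push new snapshots into the cache.
	runSyncServer(ctx, snapshotCache)
}
```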
And this is the gRPC communication with Envoy, basically the gRPC control plane wherein it will be updating Envoy. And this is our sync server. It is a basic sync server where we have a for loop that runs forever, and a select construct that basically uses a timer to periodically sync and get the configuration.
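Continuing the hedged sketch above (it assumes the same imports plus log and time), the sync server's for/select loop could look like this; fetchConfigFromAPIServer and buildSnapshot are hypothetical helpers, and the exact SetSnapshot signature depends on the go-control-plane version:

```go
// runSyncServer polls the API server on a ticker and pushes fresh snapshots
// into the cache, which the xDS server then streams to Envoy over gRPC.
func runSyncServer(ctx context.Context, snapshotCache cache.SnapshotCache) {
	ticker := time.NewTicker(30 * time.Second) // assumed polling interval
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			cfg, err := fetchConfigFromAPIServer() // hypothetical: GET latest domains/targets
			if err != nil {
				log.Printf("sync failed: %v", err)
				continue
			}
			// hypothetical: translate DB records into Envoy listener/cluster resources
			snapshot := buildSnapshot(cfg)
			// "lb-node-1" matches the node id in the Envoy bootstrap sketch above.
			if err := snapshotCache.SetSnapshot(ctx, "lb-node-1", snapshot); err != nil {
				log.Printf("snapshot update failed: %v", err)
			}
		}
	}
}
```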
Speaking of configuration, this is the basic entry in our database. Our DB would have a JSONB field, if you were to use Postgres, with some of these fields. As you can see, the domain that we want to support would be sandeepbhat.com, and these are the different VMs that are going to host this particular domain. The incoming port and outgoing port would both be port 80, as in the request would be coming in on port 80 and the VMs would be serving it on port 80 as well. And we have health checks defined as well. As you can see here, you can basically have multiple records like this for different domains, and our load balancer would serve the traffic for all of them.
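As an illustration, such a record could look roughly like this; the field names are assumptions, not the actual schema:

```json
{
  "domain": "sandeepbhat.com",
  "incoming_port": 80,
  "outgoing_port": 80,
  "targets": ["10.0.0.10", "10.0.0.11"],
  "health_check": { "path": "/", "timeout_seconds": 2, "interval_seconds": 5 }
}
```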
So in terms of packaging, cloud-init is a key component. It is an initialization system, or a package installation system; it supports writing custom scripts, and it is pretty cloud independent, as in you can have cloud-init in AWS, GCP, and Azure, and a particular cloud-init script would be supported as the same script across multiple different providers. What this helps us do is bring up our custom load balancer using cloud-init. You would have a cloud-init script which would run whenever the system boots up, and it would fetch Envoy and install Envoy as a service in the system, as well as download the binary of the control plane that we designed, and get them both up and running as Linux services. That way you are able to package them together.
A sample cloud-init script can look something like this. The key thing to notice here would be the scripts-user module. What we mean to say is, we would always want our cloud-init script to run whenever the system boots, and with that we can even support updating our system as and when we have a newer version of our control plane. That setting goes towards the end of the cloud-init script.
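A minimal sketch of such cloud-init user data, with placeholder URLs and unit names; the cloud_final_modules override at the end is the standard cloud-init way to make the user scripts run on every boot rather than only the first:

```yaml
#cloud-config
# Placeholder URLs and unit names; a real script would fetch pinned releases.
runcmd:
  - curl -L https://example.com/releases/envoy -o /usr/local/bin/envoy
  - curl -L https://example.com/releases/control-plane -o /usr/local/bin/control-plane
  - chmod +x /usr/local/bin/envoy /usr/local/bin/control-plane
  # Unit files for the two services are assumed to be written earlier
  # in the script (e.g. via write_files).
  - systemctl daemon-reload
  - systemctl enable --now envoy.service control-plane.service

# Makes the scripts-user stage (which executes the commands above) run on
# *every* boot, so a reboot can pull in a newer control-plane binary.
cloud_final_modules:
  - [scripts-user, always]
```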
With this, we now have a custom load balancer that basically addresses all the aspects that we see here. We are able to distribute traffic among multiple different backend targets by orchestrating Envoy, dynamically and without any downtime. We support multiple different domains and health checking. We are able to be cloud agnostic by using cloud-init. And in terms of scalability customization, as we saw in our design, we can run our custom load balancer on any virtual machine, which means that if you had to run it on, say, AWS, you could run the load balancer on a machine with a lower spec, like say a t2.small or t2.medium, or you could even run it on a bigger machine, like a t2.xlarge or larger, based on your traffic needs.
I would also like to do a bit of cost comparison between AWS ALB and our custom load balancer, as to why we would want to use our custom load balancer and how it is beneficial. AWS ALB has hourly pricing, and it also has costs based on different dimensions like the number of connections, bytes processed, and active connections, as well as the rules processed. If you add that up, it can come to something close to $16 per month, and I am not even taking into consideration the cost of traffic or data processed; I am only looking at the basic cost components here. If you look at our custom load balancer, it has hourly pricing per instance: the cost of running the VM is all you pay for. And if you were to run a smaller VM, say a t2.medium or even smaller, you would end up paying a lower amount. If you compare the cost of, say, running our load balancer on a t2.micro, it would be close to around $8 per month. And a t2.micro, we have seen, is pretty capable of handling traffic at a decently good scale. Now let's take a look at our demo.
As you can see here, I have three terminals. The one here is running Envoy, this one here is running the API server, and this one out here is running the control plane. Ideally, our load balancer, deployed in any virtual machine, would have these two components in it, the one at the top as well as the one at the bottom, while the API server would be running outside the system, and our control plane would be configured to talk to the API server. I have also modified the /etc/hosts file of my system to point the domain sandeepbhat.com to my own system, localhost, to aid with this demo. Going back to the demo, you see here I have two VMs running in AWS that are running nothing but a basic nginx server with a custom HTML file; it basically prints the IP address of the VM.
And if you see here, currently both of the VMs are running, and I am pointing to my domain sandeepbhat.com. So what happens is, when we hit sandeepbhat.com, as per our /etc/hosts file, it points to our own system, 127.0.0.1 in this case, on port 80, which is how our load balancer is configured. As you can see, in the previous slide the domain was different, but we have configured this one pretty similarly, wherein any traffic coming in on port 80 is routed to port 80 on the target. So when we hit sandeepbhat.com here, it is routed to port 80 on the same system, where we have our load balancer listening, and it routes the traffic to the target VMs, demo VMs one and two, and we see the IP address of the demo VMs being printed. So as you can see here, it is routing traffic evenly across both of them. Now, if we bring down one of the VMs, we should pretty soon see our load balancer routing all the traffic to the other VM. Right, the health check came into effect, and it is able to identify that the target VM, the other VM, is unhealthy, and we keep seeing all the requests go to the same VM. With that, I would like to conclude my demo. Hope you had as much fun as I had while doing this talk. Thanks for listening.