Transcript
This transcript was autogenerated. To make changes, submit a PR.
Firstly, let's get some introduction, right? What are
these data intensive distributed systems? So all
our end users of this in our day to day life example,
Netflix streaming, e commerce systems like
Amazon.com, TikTok,
Instagram shop photo upload systems like
Instagram, and Tinder proximity
system like Yelp, Google Maps, all of them
are data intensive distributed systems in the backend.
And what is an API?
The term itself answers this question. Right,
an interface for application programming. So there are two perspectives
to this. One is the system or the server perspective.
It is used for interaction between microservices
in a system, or for interaction across two
or more servers or software applications,
and the communication between systems
in simple terms. And the other perspective is
the client perspective, which is API
is seen as a way for a client to
communicate to the server with all the
inputs and requesting for some resources.
It is like an abstraction. API is like an abstraction to
the client, where the client uses it to fetch
the resources without having to
know the details of the underlying implementation. It's like
API is like a steering in a car
where the driver doesn't need to know how the
engine works. So the details of the API implementation need not be known.
So with this, let's get right into
the aspects of what is needed. First and
foremost, the design aspect we need
to consider is rest for HTTP
API calls and thrift for RPC API
calls. Now let's understand a bit about HTTP,
why and where. So now let
me introduce an architectural diagram for any
modern data intensive distributed system, and it's a
very most simplistic way, like this
diagram. Firstly, it shows a user
who is using any system, like a payment system or
Netflix or any other
example I gave earlier, any real world example.
So obviously the request goes through the load balancer and load balancer
redirects to the right backend system. There will be multiple systems,
and then again from the load balancer it goes out. If there
is an integration needed with an external partner, say for example in
payments, you might have to integrate with visa or
stripe or paypal, any of those things as an example.
Now as you can see, some of these calls are HTTP
and some of these calls are RPC. And why?
So firstly, HTTP is like a networking protocol
used by the client most of the times. Or the end
user to call or invoke a server.
Or HTTP can also be used by one software
system or one microservice to call another microservice.
It's possible it's used in both ways,
but the RPC is a protocol which is used only
within the back end system, like within the services.
As you can see here, microservice one calls microservice
two with an RPC, and gateway calls microservice one with
an RPC because they are within a particular back
end server. And why? Because RPC is
highly secure and RPC is remote.
Procedural call, obviously. And it is more about invoking
a function from one
service to another service. So when you're invoking a function,
you need to know the exact request and response
to the point, right? And HTTP
is more generic, or it is more on
the web exposed to the external world with
those put, get, post and all those port
types get put, post and delete.
So in summary, RPC and HTTP are both communication
protocols used in distributed systems,
but they have different design principles, communication patterns and
typical use cases. RPC is focused on direct
invocation of remote procedures or functions
like a process running, say microservice two runs a
process and microservice one needs to call that process. It is
done using RPC, while HTTP is a more general purpose
protocol for transferring data on the web or requesting
data on the web. And that's the reason why the external gateways
and one service to another service are
usually called through HTTP. So now
let's talk about the rest for
HTTP and thrift for now.
What is rest in the API world?
Rest transfer representational state transfer.
It is an architectural style for
designing networked applications,
particularly these web services like oh,
web services. Just that the distributed systems
which are published and used by the
outside world. Now, restful APIs added
to the principles of REST and API design to provide a standardized way for systems
to communicate over HTTP.
And now, as I mentioned, the restful
APIs use the HTTP methods,
which I just mentioned. And another important
feature of the rest and why it should be used
for HTTP is stateless communication.
So what is it? These restful APIs
are stateless, meaning that each request from
a client or an end user coming to the server
must contain all the information necessary for
the server to fulfill that request. The server
does not store any client state between
the requests, like request one to request two doesn't store any states.
The reason for that is it improves the scalability and
simplifies the communication. Imagine the server has to store
any information from the previous call and not keeping
them independent. It adds a lot of overhead and unnecessary
information needs to be saved and added, and you don't
want any strings attached between the client and the server
or client and the web service, right? So that's the reason
why the stateless communication is very important
in this HTTP, which is provided by the rest
architecture. And the other one is resource oriented
design, meaning say, when you are
requesting, say you are using Instagram, and what do you do when
you want to look at some comments or post?
You just go to the user and click on, say, comments, right? What happens
internally is it makes an API call through
HTTP and it makes that the uri
like a specific domain,
www.instagram.com users user id
post. And similarly, if you're looking at some comments of a post,
it goes through instagram.com posts post id and
comments. And so you see that the hierarchy of
say, you have users user id and post or post post id and comments,
that is called hierarchical structure, and that is also
provided by this architecture of ReSt. And it
helps you identify the resource in the most simplistic
and smooth form so that the
back end system can retrieve it and
uniquely return that particular data. And returning
or all of this data is represented
in JSON or XML or any other format in
rest architecture.
And the other important feature provided by the rest
architecture is in these API calls which are made
through HTTP by the end user or anyone
who is making HTTP calls. You can add authentication,
authorization, rate limiting. All these things
are not core logic. These are exterior things
which just needed to be added. Like say for
example, if you want to look at the number of likes
on a particular comment, or number of likes
or number of comments on a particular post, they require some computation,
although it is as simple as adding things. But you
don't want to include all that computation.
You want to keep all that computation separate from things
like external things, like additional things like authorization,
authentication, rate limiting, and all of these things. So all these things are
provided, especially when you're making calls
over the web or over the mobile or
whatever it is from external outside of
the actual server. So these additional features are
always provided by HTTP, which ensures that
the RPC calls can have only the logic being computed
by invoking the APS within the microservices.
Now, talking about the thrift,
why thrift for RPC? Thrift is what is thrift, first of all? So thrift refers
to Apache thrift. It's a software
framework and also protocol. It is developed by Facebook for
scalable cross language services development. So scalable
cross language services development, that's the key here. So I'll
tell about it. Thrift is primarily used for defining and
creating efficient and interoperable RPC services.
Now, how do both these things happen?
Interoperable RPC communication and cross
language services development. How do these things happen? There is one important
feature provided by thrift which is called code generation.
So it's like you can create a thrift file,
say you want one microservice to talk to another microservice through some API,
through some RPC call, and you can create
a thrift file containing the request response and the API
with these request response parameters. You can define it in thrift
and you can run commands depending on whatever you use,
like Golang or Java or whatever you use, you have the specific commands
generate code like maven or any kite
or gin frameworks. If you use for go, it generates
all the boilerplate code which is needed,
like request objects, response objects, and all
the API skeleton and
the things needed for concurrent request
handling. All these things are provided by thrift
auto generated code. So this auto
generated code can be generated for different languages.
Say your one microservice runs on Java, another microservice runs on
Python, and you can use thrift to generate
auto generate code for these APIs.
Say in microservice two, you auto generated the code as a
server code, and in microservice one, you auto
generated code using this thrift file as a client code.
You can generate both. And now the client can call the server
by whatever through doing that RPC call.
And it is cross platform or cross
language. It can work
because if you're manually doing this right, if you're
writing the implementation, everything in Java
and in Python and in go three services now have to interact.
There will be lot of incompatibility issues
and there are a lot of manual effort which needs to be done.
But with this thrift you don't have to worry about it. Cross language
support is in itself provided by this code generation
and all the features provided by the thrift, it is absolutely
amazing. And another thing is for
RPC, right, you want the communication
to be as fast as possible, and that is supported by
binary protocol, the data. So thrift uses
this binary protocol
for efficient communication between services. This binary protocol is
optimized for performance and reduces both network
overhead and serialization deserialization costs. It's really quick
and really efficient. So it
is a lot better compared to the text based protocols like JsON or XML.
So for between microservices communication or
between systems communication, it's always better
to use thrift, which provides all these advantages.
Oh, and another big advantage is scalability
thrift. Like I mentioned a bit a few minutes earlier about
the concurrent requests, the thrift software framework
provides by generating the code like the APIs can handle concurrent
requests, right? It has support for it.
And with thrift you
can do the development for asynchronous communication between
different microservices and how that can be done.
Or what is asynchronous communication I'll be covering in my next chapter.
And now moving on to chapter two, as we are
talking about asynchronous API calls.
So now the second aspect of the
design, the good API
design choice is
using callbacks with these
API calls. So what is asynchronous API calls with callback?
Let's take the same example, but zoom in a bit more into the
imagine. Let's take a use case. Right? So if you are
a user making a payment on an ecommerce platform,
do you want to wait until you know your
payment? Do you want to wait until all the processing in the back end
to be done, like microservice one
to two to three and then external channel,
your visa or Mastercard is processing. It doesn't
give a good user experience if you have to wait for such a long time.
So what you have to, and also
with the amount of requests which are computing in, it doesn't scale,
right? You can't just have all the calls synchronously
waiting with each other. So for
that, what we need is a synchronous communication.
And how that is achieved is say you as
a user make a pay request at
step number before step number one
here on the slide, and that pay request
is sent to a distributed queue like a
Kafka or RocketMQ.
And that's it. That's it for the microservice one. And microservice one
sends a response back to the user saying that
it's in processing. So the user knows that it is in processing.
And then microservice two has like a consumer which
reads from this message queue
and takes it and does whatever processing and
then again puts in a different distributed queue at step number
three. Now when you have
to make a call, external call again,
there will be a consumer which will be reading from that
kafka queue, from Kafka queue
two and makes a HTTP call. Now how
does the external partner, or all these microservices
know that the processing is complete? That is called callback.
Now when the external partner processes the
request or the API call which it received,
it is immediately going to send a response.
After it processes, it is going to use that callback URL
to send a response back to the microservice
two. And how this actually happens is we can
see in the code. Here is an example,
as you can see on my screen in
this, there is this function, you can see in the middle process async.
And that is the API, say that is the API which
is being called from microservice
three to external partner from steps three,
four and five, right? Say consumer picks up the distributed
queue message which is about calling process
async and it calls step number five at
step number five to the external channel. And once it calls
a post, you can see I just added a simple
sleep showing that it is doing some asynchronous
processing. It is doing some whatever, all the logic and algorithm
implementation or whatever. Once that is done in
the request, if you see at the bottom request data has something
called callback URL. That means the request
received at this endpoint already has a URL at
which the processing
of the API can send the response back.
Now that's what invoke callback does.
So inside the processor sync, if you see the invoke
callback actually gets the callback
URL, like requestdata get callback URL
and it invokes. Now when this invoke callback
is done,
it actually calls the microservice two here.
So that's what I have mentioned. In the third point,
after the processing is complete, the server invokes a callback URL at step
number six in the previous slide.
And so once it receives, when microservice two receives that
callback,
it knows that, okay, the previous request I sent,
I received the callback and this is complete.
Now, microservice two also needs to inform microservice one
about whatever the pay is complete or
not. Now I added another queue here which is distributed
message queue three. So imagine as
part of once you receive the callback in a microservice,
the microservice might have to do certain things like storing the state in a
database or make another RPC call to a different service
to update something. And what if during that
process of callback handling there
is a failure? There has to be a way to recover, right?
There has to be a way to, the system has to
be resilient enough or fault tolerant enough to recover from such states.
So that's why we have this q three. So now at
step number seven, when the callback is sent or
put in the queue, microservice one picks up that callback
from the queue and it tries to do processing
like saving into the database and all those things.
And if it fails, if that process
fails, that callback handler in microservice
one won't acknowledge that it
received the callback. It received the callback from the distributed
message queue three. So when it doesn't acknowledge,
the message still remains in the queue. So the
microservice one will again fetch or
read that callback message from the queue
to reprocess it again. So that way it retries
and retries until the message is
totally consumed, which means the entire callback handler is
done. So that's how this whole concept works.
And the key point to note here is the
design aspect of the request containing
the callback URL. So that's the whole point here.
So in the processing callback also two steps here. One is
you need to have that callback URL in the request. And the second
part is making a call to that callback URL
once all the processing is done by the callback handler.
In this case, the callback handler is in the external partner and
it sends a callback to the microservice too.
So that's the example I have demonstrated. Now, apart from
the callbacks, there is another way you can do
this. Asynchronous processing with APIs is
something called webhook. I think webhook is again a very generic term.
Callbacks or webhooks are almost similar,
but just that webhook is something,
callback is something where you send a URL in
the request, the caller sends a URL in the request to the collie
and the collie calls back, right? Webhook is like you
keep a process or a URL
open and then the collie
puts the data onto that URL, onto that placeholder. You can think of Webhook
as a placeholder in the caller, where the collie
puts the data. So now let's go into some details or differences
here. So the callback is
initiated by requests.
As I mentioned here, like one system, the caller
calls another system, the collie, and the caller
includes a callback function or URL as part of the request, indicating where the
callie should send the response or notification. Right? Now, webhook are
typically initiated by events or triggers that occur in
one system, especially the sender.
And when the event occurs, the sender makes a
HTTP request to a URL to the receiver.
So receiver is microservice
two in this example, and the sender is external
partner. And now the
receiver does not actively request for data. Instead it waits for
the incoming request from the sender. So you can think
of it as outbound communication. So as
a receiver of the webhook,
you are actually pulling the data,
you have something open, you are actually pulling the data into it. That's Webhook.
And in callback that's not the case. Callback is like
you have sent a request to the external
partner and external partner uses the URL to push the data.
So it's a push model. And callback is
obviously if it is push model, from the receiver
side, it is all about the data coming to it. So it's
inbound communication right? Now, I know these
terms are a bit computing to understand, but let me give an example.
It's simpler. So the example I just gave for callback, as I demonstrated,
is for a payment system,
the external partner can be visa and it got the URL, it processed and
sent back. Now let's take an example for webhook
for clarity. What are these events? So imagine
there is slack application, like a chat application,
which is you often see at work as
a developer, right? When there is a GitHub activity, pull request has
been created, pull request has been merged, comments have been made, you get this
notification onto the slack. How is that happening?
That's because say creating a pull request
is an event and that event is happening. And the slack
is the microservice two here, or you can think of
microservice two and one. All of this system is slack,
and external partner is GitHub. And whenever there is an event on the external
partner like GitHub, when that
event is triggered, it actually
sends web books are initiated by those events.
And then the sender makes a HTTP call,
which is the sender here is GitHub. It makes a HTTP call to the
defined URL onto the receiver, which is slack,
passing all the data onto it. And that's how
the slack gets to know that, okay, there is a pull
request update or whatever has happened. And as
you can see, webhooks are always asynchronous. You're not waiting
on anything, but the callbacks
can sometimes be synchronous. Like once you send the request to a third party,
it is possible that you can wait and you can get back the response.
So moving on to the next chapter.
Now is another aspect of APIs is rate limiting.
Now designing rate limiting for APIs. So what is
rate limiting and why? And the term
itself says you have to limit the rate.
So say if you are a user using,
doing some payments on the payment system, or you're
using again uploading lot of pictures or accessing lot
of comments. You can click on, say,
comments multiple times as a user, or you can
actually click on too many pay button or
payments on a payment system. And there has to
be a way to limit, or you're trying to do a lot of purchases,
right? So there has to be a way in the system to limit it.
And how can you do it at the API level and why should you
limit it? So it's like limiting the total number of requests coming from
a client in a specific time window. Say you're
making a request, API request in say ten requests in
say 30 minutes, 40 minutes, that's fine. But you need to know it's
a two dimensional thing. Ten requests in 1 second.
So those are the things that need to be defined as a system
capacity and things like that. Now why
is it needed? I wrote it here. Obviously you want
to prevent abuse and there might be a lot of other users who
are using the same endpoint. So you want to have fair access to
these resources and you don't want API to be overwhelmed
by these excessive requests. And this also promotes
the stability for the system and of the distributed systems.
And also the reliability is guaranteed
with you saying that, okay, you're not abusing the system. And you
can see in this diagram, when there are a lot of requests which are coming
onto the system, the rate
limiter sends immediately 429 saying that, okay,
you have exceeded certain limit on the number
of requests that can go. And this logic can either be
written at the gateway or it can be like at the HTTP
level, or it can be written in the back end system itself.
But I personally always designed where the systems
do the rate limiting at the gateway layer. And again, this is a vast topic.
It's a trade off. Again, we can discuss pros and cons in any different session.
Now let's go to the implementation of this. It's quite simple.
I took the same similar language example I'm
using. Like the Java spring boot application. You can see
a rate limit API here. The base URI
is API and base path
is API. And then for any specific resource,
you have resource as the
extended URI. And whenever someone calls
at this endpoint, you can
actually just annotate it. Say you see here, the annotation
here is limit five duration 60. That means one
client can't request or call this API at
this endpoint for more than
five times in 60 seconds. So if
the client is making a request more than five times in 60
seconds, that means he will be rate limited. He or she won't
be able to get a response saying 429 or saying that
yeah, you can't access more than this or rate limited whatever,
be the message. That is debatable. So it's just about annotating.
So ensure when you're designing APIs and we're implementing, you need to
have that rate limiting aspect in your mind as
a very important one in these large data intensive applications
when there might be lots and lots of requests you can't even imagine.
Now the last aspect of the API design is idempotency.
Now what is idempotency?
Now I'll take the payment example again. Payment system, say you
are an end user and you did a
payment, right? So what happens when you make a payment?
The request goes to the HTTP request goes to like a post
pay request goes to a payment system. And what if
you immediately attempted a pay the second time?
Immediately, instantly, because the button is still enabled due to some UI
issue or whatever, be the reason, right? Do you want your money to
be deducted twice or what?
No matter how many times you perform the
activity of, say, pay or any other
activity which are supposed to be idempotent, not all aps are
idempotent. So that's an important one. So pay is definitely idempotent.
The response should be same. Here in this example you see the first attempt
payment and second payment retry both have the same response.
Payment succeeded and there won't be any additional operation.
In the first operation the money gets deducted from your account in the payment
system and all the processing happens. And the second
retry. In the retry, all that doesn't happen, the money
doesn't get deducted. And how this can be prevented,
how can this be done? It's simple. It's as simple as
you can have a database. I mentioned
it in purple box here. Every request
should be associated with an item potency key, a unique key
associated with a specific client, and this item
potency key need not be added by the client.
And for simplicity, I just didn't put any boxes in between client and
payment system, but this can be handled by a separate service
within the back end system itself or at the gateway. Again,
that's debatable. It's a design choice.
So let's assume that is sorted out and for
every request, unique request which is coming from a specific client will
have an item put in c key. And it's like a UUId.
I gave that example. One, two, three, ABC. When the first payment
is made that gets entered into the database
and you can see when the second attempt, the payment retry
happens. The backend system checks.
Is there any entry in the database? It's like
a simple key value. It can be a map or any key value like
a redis cache or whatever it is, right? Again, that's a design choice.
The system will check whether there is an existing key
in the db or not. So it doesn't process the same
request again the moment it finds that key in the DB.
And once it checks and oh, there is already
this existing one and then boom, sends the same response
again without any additional processing. Let's take an example
again like spring boot application I was showing you earlier. And the
same example, you can see this
is another API API handler like user
controller. Say it's like update user.
I took an example, update user. Someone is making
a call to update some details of a user
and when in the request,
the user id, when it is obtained you can see
the implementation. You are
actually accessing the db. Here I use DB
as a user map, as the dB dal get dB
table data access layer gets the table and in
user map you check whether that
key is already contained or not. And this user id is the
idempotent key and whatever
the UUID I gave in the previous example. And if it is already contained
then you just send that response saying user information is already contained
or whatever, the payment is already succeeded and you don't have to do anything.
Only if it is not contained in the DB, then you go ahead with
whatever the database update or payment
calculation by deducting from the account and things like that.
So yes, that's it. So these
are the four, some of
the important design aspects that needs to be considered
while designing APIs
for data intensive applications. And there are many more
security aspects of the system and
pagination and filtering that can be done when you have
lot of data coming from the server. And how do
you do pagination and in the APIs and how do you implement that and
how do you incorporate the security and other aspects of
the system and error handling, because you have lots and
lots of requests coming. You obviously have lots and lots of errors also
coming. And how do you write the APIs in such a way that the response
also contains the proper error message and error handling is in
the right way in the API contract. So these are some of
the other things which I couldn't cover
in this session because I wanted to keep it short
and maybe I
can have some other opportunity to talk more
in details about this. But yeah, thanks a lot
for this opportunity. I really thank the Conf
42 Python team for providing me this opportunity to speak.
And any questions anyone have,
has, can reach out to me.
I'm Santosh. Nikhil Kumar,
available on LinkedIn. And, yeah, thanks a lot
for watching the video.