Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello everyone, welcome to my session about webhooks
security and scalability. My name is Marvin Collins, and
in this talk I'm going to talk about webhooks. I will start with
an introduction and explain what webhooks are.
Then we will move on to discuss the use cases,
security concerns, security approaches, and
webhook scalability. And finally I will share my parting
thoughts on dealing with webhooks.
So, from personal experience, I've done a lot of research developing webhook
applications, and I'm sure you've had experience dealing with webhooks,
whether it's integration, implementation, or just adding them into
a third-party system. I will assume that
you understand the concept of webhooks, and I
will just briefly explain it for those who don't know what a webhook is.
So, the term webhook was coined by Jeff Lindsay back in
2007. A webhook is just a URL,
a kind of reverse API, that is created
by the application developer, referred to as the client,
to receive information from the API provider, often
referred to as the server, without
polling the server. We'll discuss what polling
is shortly. Webhooks are
basically just another way of communicating between
applications, the same as a REST API, if you understand what
a REST API is. They use the JSON
format, and the request is done through an HTTP
POST request, which is an HTTP method, the same as with a REST API.
For example, when a PR
is opened on a GitHub repo, you often receive a notification
if you're the project lead who needs to review it,
or you can listen for the webhook.
So this is a good example of a webhook,
because just imagine going to a GitHub page
and refreshing it over and over. That is what is called
polling. Okay, now, polling
is the process where you repeatedly send
requests to the API, which is the server, to check for new
or updated data. This is done at regular
intervals from the client application to the
server to make sure that the client application stays in sync with the server.
But one thing you need to know is that polling is resource
intensive and inefficient, as you can
see on the right side of the screen.
That is the polling process, where a client sends a
request at intervals to the server.
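
A minimal Go sketch of the polling loop described here, to make the wasted requests visible. The names `pollForChange` and `newFakeAPI` are made up for illustration, and the fake API stands in for a real HTTP call to the provider.

```go
package main

import (
	"fmt"
	"time"
)

// pollForChange repeatedly calls fetch (a stand-in for an HTTP GET against
// the provider's API) until the returned value differs from last or
// maxAttempts is reached. It returns the latest value and the number of
// requests that were made.
func pollForChange(fetch func() string, last string, interval time.Duration, maxAttempts int) (string, int) {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if current := fetch(); current != last {
			return current, attempt
		}
		time.Sleep(interval) // wait before asking again
	}
	return last, maxAttempts
}

// newFakeAPI returns a hypothetical fetch function whose value only
// changes on the n-th call, so every earlier call is a wasted request.
func newFakeAPI(n int) func() string {
	calls := 0
	return func() string {
		calls++
		if calls >= n {
			return "v2"
		}
		return "v1"
	}
}

func main() {
	value, requests := pollForChange(newFakeAPI(5), "v1", 10*time.Millisecond, 20)
	fmt.Printf("saw %q after %d requests\n", value, requests)
}
```

With a webhook, the provider would make a single request at the moment the value changes, instead of the client asking several times and getting the same answer.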
But the story is different with webhooks.
So webhooks are kind of "don't call me, I will
call you when I have information", and you can see the difference on the screen.
On the left side we have webhooks, and on the right side we
have polling. Now, webhooks only send one request
to the application, but on the
polling side you can see there are multiple requests
happening, and some are failing.
But when are we supposed to use webhooks?
Very simple. Number one, we use
webhooks to eliminate the polling process, which I just showed you
a few seconds ago. Using webhooks
can help conserve resources for the client application.
With webhooks, there's no constant polling of the server.
The data is transferred based on events, and it's very simple
and low-risk, carrying only critical or necessary information.
Unlike webhooks, relying
on a traditional API for data requires users to constantly
check whether the data is there, without any trigger events.
Webhooks allow applications to transmit data
based on events, or when the data is available,
and they send it immediately.
Another use case of webhooks is automated
data transfer on events. Again, I mentioned events in
my previous explanation, but this means webhooks
send data automatically: as soon as there's
an event on a resource in the server, the data is sent in real
time. This makes it easy to automate data
transfer based on events. Then we
have integration. As previously
mentioned, we've built a lot of systems, and
these systems need to support each other,
communicate, and share data. Webhooks allow us to
implement that with ease.
Again, a client application can
rely on information from other systems to
create triggers and actions within the application.
In this case, webhooks can help us
create those triggers and actions in those applications.
The second part of this discussion is security concerns.
By default, webhooks do not come
with a security implementation, and this is a big challenge.
The webhook communication mechanism does
not have any native way to identify
the source and the destination. So this is a security concern when working
with webhooks. This means that a webhook
producer has no way whatsoever to verify that
it is sending its webhook data to the right destination,
and the webhook consumer, which is the client, cannot verify that
it is receiving the webhook from the expected source.
So, as I mentioned earlier, this vulnerability
can allow anyone to act as a webhook producer,
which is the server,
or as a consumer, which is the receiver.
Someone acting as the producer can send any kind
of compromising data to the receiving application,
the client application. We
need to make sure that these
systems are very secure, and that is why we now
explore the security concerns and come up with
security approaches to secure
webhooks. So I will just recap the security concerns
and explain them. Number one,
the webhook communication mechanism lacks a
native way to identify the source and the destination of
a webhook. That is the major red flag when dealing with webhooks.
So if your webhook cannot identify the
producer and the consumer, that is a security concern
that you need to look at. The second
one: this means that a webhook producer
cannot verify that it is sending a webhook to the correct
destination, and the webhook consumer cannot verify
that it's receiving its webhook from the expected source.
This vulnerability,
again, as I mentioned, can allow anyone to
act as a webhook producer
and potentially send a malicious webhook to a webhook
consumer, thereby compromising
the receiving application. So this is where you get hacked, based
on the data that your application is receiving.
So how can we make
sure that our webhooks are secure? Let's discuss
that in this section. Before
we dive deep into webhook security, note that
making a webhook secure is different from
normal web application security. This is
because a webhook is a URL which is publicly accessible
on the Internet,
as compared to API endpoints or
URLs, which are mostly secured, though some are public.
Therefore, whenever there's a request to the
webhook URL, it's very important to ensure that the
request truly comes from the expected source, as we discussed
earlier. Without such verification, an attacker
can fake a request and send it to that URL.
But at what point do we start
securing our webhook? There's security done at
setup, when you're setting up a webhook, and
the others are mostly done during runtime.
So the first one we're going to look at is
one-time verification. This is mostly
done at setup, where the provider
gives the client a one-time verification
token. Remember,
just to let you know, this one-time verification token can
be revoked. The provider will give instructions
to the client on the best way to manage this
token. The token
acts like a secret key, but on the client side it's not managed
by the provider, and the provider cannot tell how the client is managing
the token. So what the provider will do is send every
webhook request with the security token
that they issued to the client.
Once a verification token is set and registered,
the client will validate it. So it's the job of the client
to verify the token on every request. If it
matches, the request is accepted. Otherwise the client should
deny that request. The disadvantage
of this is that the security is very limited because, again,
as I mentioned, you don't know how well
the client implements it on their end. Okay, so this
is another widely used method of
webhook security.
Still, it exposes a webhook URL to a lot of security issues,
because the URL can be attacked with
DDoS, server-side
request forgery, and other
security attacks.
So it is not the most recommended, but it's being used
by companies like Zoom to manage their webhooks.
So the next
webhook security method is the verification token.
This simply means there's a secret token that is shared
between the client and the provider.
This secret, or verification token,
is sent on every request.
It's very simple. On each request, the provider sends
a webhook request containing a secret,
which is shared between the client and the
server, in the request headers.
Now, this secret can just
be something like a Base64-encoded
username and password, or just a normal
security key. Then the client will validate the
value on the request and check whether the value that is sent
is the same as the value that they have.
Okay, this is another widely used
webhook validation and authentication process,
but it's not very effective.
This security method leaves many things unaddressed,
and it does not secure your webhook
application as well as the most preferred way does.
The next one is HMAC,
which simply means hash-based
message authentication code.
HMAC is
one of the most popular, actually the most popular,
webhook security method
used during requests. It simply puts a hash signature
in the headers, with a timestamp
for validation. Examples of companies that
leverage this webhook security method are GitHub,
Shopify, Slack, you name them. So basically the
server, or the provider, will compute a
signature over the plain text, which I'm going to
show you, then send it to the client.
Now, since the client has the
shared secret,
they're also going to compute a signature and compare the
two. If they're the same, then that
request will be accepted as a valid request.
The client application, of course, after doing
that computation and accepting the
authenticity of the request,
is allowed to consume it. But now, how do
we use the timestamp? There's a time
window within which the message must be received and consumed,
and if that has elapsed, the message is considered
stale, so it's not consumed by the client.
So that's where the timestamp becomes valuable
in this method.
Now, if you compare HMAC and a shared
secret or verification token,
they're more or less the same, but there's more integrity
with HMAC compared to a shared secret.
HMAC also gives you
leeway to reject the request
if the message arrives
after a certain amount of
time.
Another security method is just whitelisting
IPs on
both sides, client and provider.
It's not usually that effective because there's IP
spoofing,
a process where the attacker
impersonates the host by
making their IP look the
same as the IP that you expect. So it's not
one of the best, and it's not recommended. Also, the implementation is
a little bit hectic, because when the
IP changes, you have to do the setup again.
We have mutual TLS, which is one of the best
when it comes to webhook security methods.
Whenever, let's say, you are
sending a request from one service to another,
there's what is called the Transport Layer Security (TLS) handshake
protocol. The server sends a certificate
to the client, and the client verifies that the certificate really comes from
the server it is talking to.
With mutual TLS, not only
does the client verify the server,
but the server also verifies the authenticity of
the client, so they both verify each other.
This method is very secure and is used
by big companies like DocuSign, but most
of the time it's very difficult to maintain.
One of the biggest challenges is that the certificate
can expire, be changed,
or be revoked,
and that means you have to set it all up again most of the time.
So that is the downside, and it's
usually not the best way to manage a very
high-demand webhook service.
So, from all the examples that we've looked through,
starting with the one-time verification process,
then the verification token or shared secret,
hash-based message authentication, IP
whitelisting, and mutual TLS, what is the best approach
to implement
webhook security? Now, again,
it's very debatable, and hear
me out. The reason it's debatable is that the
security of your webhook depends on the data
that you're supposed to share with
the client, or the data you're supposed to receive
as a client. If
it is just average data
that does not expose a
lot of things, then the best way is to use
HMAC, that is, a hash-based message authentication
code, with a timestamp.
Also, the payload that is being shared
should be "dataless": it's supposed to contain
only the minimal data, nothing meaningful on its own.
This means that whenever the client
receives that communication, they can do one
poll, that is, retrieve the resource that they need through
the API. So our webhook
will just notify; it will just act as an event to the
client and trigger an action. That action is
what the client uses to fetch the
resource from the server. So that is the best approach,
I think. And just to show you this in Golang
code, let's open VS Code real quick.
I have this ready here. You can see here we
get a signature. Let's look at this
function. This function is just getting the data, the
plain text that we are supposed to
send to the client. We get a secret,
which is shared between the server
and the client, and we
generate a new hash using the given hash type and
key. The hash type here,
the computing algorithm, is
SHA-1, and the key is the secret. Then we just
write the data and return the result
in the encoding format, which is
hex encode-to-string. So this is the signature
that we return. That signature
is going to be included in the
request header, as you can see. Now, when it
reaches the client side,
the client will use this key that
they share with us, and
they will take the request body, which
is just the data, this
body here. You can see we have that body here.
Yeah, we get this JSON body.
So they will use the secret
key and the body to create a signature and
match the signatures. If the signatures are the same, then
they will confirm that the data is valid.
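
The code shown on screen is not part of this transcript. The following is a reconstruction sketch based only on the description above (HMAC with SHA-1, hex-encoded, recomputed and compared on the client). The function names `sign` and `verify`, the secret, and the sample body are assumptions.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// sign returns the hex-encoded HMAC-SHA1 of body under secret. The
// provider would place this value in a request header.
func sign(body, secret []byte) string {
	mac := hmac.New(sha1.New, secret)
	mac.Write(body)
	return hex.EncodeToString(mac.Sum(nil))
}

// verify recomputes the signature on the client side and compares it with
// the one taken from the request header. hmac.Equal compares in constant
// time.
func verify(body, secret []byte, signature string) bool {
	expected := sign(body, secret)
	return hmac.Equal([]byte(expected), []byte(signature))
}

func main() {
	secret := []byte("shared-secret") // placeholder secret
	body := []byte(`{"event":"invoice.paid","id":"evt_123"}`)

	sig := sign(body, secret)
	fmt.Println(verify(body, secret, sig))
	fmt.Println(verify([]byte(`{"event":"tampered"}`), secret, sig))
}
```

SHA-1 is used because that is what the talk describes. Swapping `sha1.New` for `sha256.New` from `crypto/sha256` is a one-line change if a stronger hash is wanted.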
Now, winding up this part: the best approach is
just having hash-based authentication with a timestamp, the one that I showed you,
with less data, and when the event reaches the client
side, the client makes a request to the server
based on the action provided in the
webhook event.
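
To illustrate the "less data" idea, here is a hedged Go sketch of a thin event that the client uses only as a trigger to fetch the real resource. The `Event` fields, the `handle` function, and the URL are invented for the example, and `fetch` stands in for an authenticated API call.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Event is a hypothetical "thin" webhook payload: it says what happened and
// where to look, but carries none of the resource's data.
type Event struct {
	ID          string `json:"id"`
	Type        string `json:"type"`
	ResourceURL string `json:"resource_url"`
}

// handle parses the webhook body and uses fetch (a stand-in for an
// authenticated API call) to retrieve the full resource.
func handle(body []byte, fetch func(url string) (string, error)) (string, error) {
	var ev Event
	if err := json.Unmarshal(body, &ev); err != nil {
		return "", fmt.Errorf("decode event: %w", err)
	}
	if ev.ResourceURL == "" {
		return "", fmt.Errorf("event %q has no resource URL", ev.ID)
	}
	return fetch(ev.ResourceURL)
}

func main() {
	body := []byte(`{"id":"evt_1","type":"order.updated","resource_url":"https://api.example.com/orders/42"}`)
	fakeFetch := func(url string) (string, error) { return "fetched " + url, nil }

	result, err := handle(body, fakeFetch)
	fmt.Println(result, err)
}
```

Because the sensitive data only travels over the authenticated API call, a forged or intercepted webhook reveals very little on its own.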
So, webhook scalability.
Let's look at that real quick.
We've talked about all this implementation
and setup and everything, but now our application
is serving a lot of users and we need to scale
it. That means our webhooks
should be able to handle large volumes of data
with ease, making them highly scalable and available for
transfer between our application and other applications.
Webhooks do not require
continuous polling for data. They're much more efficient
and resource friendly, and they send
data in real time. But again,
at some point we need to make our
business more effective, more
scalable, based on demand.
Okay, so webhooks,
in conjunction with other solutions like your
infrastructure, et cetera, can be made
very scalable. So let's look at some of the ways
that we can make our webhooks scalable.
Number one: you need to optimize
your webhook payload. This is very simple. It is just to
ensure that your webhook payload, the
request data that is being sent, is
very minimal. It contains only the necessary
data, as minimal as possible. This will reduce the
amount of data that you are sending. Number two:
implement load balancing. Using a load
balancing technique, we can distribute the traffic
and the workload across multiple servers or multiple
services. This will prevent any
one service from being overwhelmed by a large volume of requests.
So that is very important. Use a message broker.
Don't just send the
content directly to your
webhook service.
We can use a message broker. Examples of message brokers
are NATS, which we've implemented, and others
are RabbitMQ and Kafka. Using
a message broker to handle requests and distribute processing
will help reduce latency and improve scalability
by
a huge percentage, because now the
data is sent to the message broker and the webhook
service just picks it up from the message broker
and distributes it. Implement caching. This is a very good
one. When you implement caching,
it reduces the
frequency of accessing data at its source. Common data that doesn't
change very often can be put in the cache, and
if it needs to be included in our webhook
request, the cache reduces the number of
transactions we need. Let's say we need to get this data
from the database or other services. When
you put this common data in a cache,
it will reduce the number of transactions,
or the frequency of accessing that data, by
a significant percentage. I know this
is not common when dealing with webhooks, but I recommend
it, and it will improve your scalability
when dealing with webhooks. And finally, monitoring.
If you don't monitor your webhooks, you will never
know the bottlenecks of your webhook infrastructure
and setup. Monitoring will identify webhook
scalability issues, and you can
use metrics and logs to trace response times,
errors, and key performance metrics within
your application. So those are the key things
that you need to do to scale your application.
But as I wind up this talk,
I have some items that I want
to reiterate, or just mention one more time. The number one rule
of thumb when dealing with webhook security is authenticate.
Authenticate, that is, verify the source and verify the
consumer, using the authentication
methods that we've mentioned
during this talk.
Number two is encrypt all
data. Provide less data, and
encrypt all data if necessary. That will make
it much easier for you to secure the data that
is passing from the server side
to the client. And again, I will repeat
this: use timestamps to prevent
replay attacks, where an attacker can replay
the message many times.
Provide SDKs for your users
so that they know how to implement the webhook.
Again, provide documentation, very good documentation,
listing
the best way to implement webhook security.
That will help a lot when developers
are trying to implement your webhooks. Perform logging.
Webhooks are part of event-driven architecture.
With events, you should be able to
trace a user through the system, from account creation
to whatever comes after.
The same thing should happen with webhooks.
You should perform logging and tracing
for webhooks, and that will give you a clear picture of your
webhooks. And finally, please provide webhook
event IDs so that you can trace a webhook
to a specific point in time, and also the origin
of that webhook. Those were
my parting shots, and I want to thank you. Asante sana. My name is
Marvin Collins. My Twitter handle is at Marvin Collins.