Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hey folks, welcome to this session. Thank you so much for tuning in.
I've been a part of Conf42 before, but not the Golang edition, so I'm
delighted to be speaking at this wonderful Go conference. My name
is Abhishek Gupta. I work on a bunch of things, but databases,
Go, and open source in general happen to be the key focus
areas in my role here as a developer advocate
at AWS. My social media activity is
kind of lukewarm, I would say, but I'm fairly active on GitHub because
that's where I keep pushing all my stuff that I'm building. But don't hesitate to
connect with me if you have questions or just want to chat about tech in
general.
Now this talk is geared towards folks who are looking to get started with
Redis and Go. Or perhaps you're already experienced
with both these topics, and in that case this might be
a good refresher. Now, I have a very, very simple agenda for
you. I'll start off by setting the context about Redis and
Go in general, followed by some hands-on stuff,
and wrap up with some gotchas and things you
should watch out for. Now, I cannot possibly go over
or cover everything in a single talk, so I'll leave you with some resources
at the end as well. And by the end of this presentation,
you should have a fairly good overview of Go
and Redis. And this includes things like
the client options you've got, how you can
implement common use cases and patterns, and, as you start
your journey with Redis and Go, hopefully some of the tips and tricks will come
in handy and help simplify that learning curve and
reduce some of that friction.
As you might have guessed already, I love Redis. I've been a
fan and a Redis user for a few years now, but I'm not the only
one out there. Redis is not brand new anymore,
but I still like to refer to it as relatively
young, because it was first open-sourced in 2009, which makes it
around 14 years old now. But I'm pretty confident there
are a lot of innovations we are going to witness in the years to come.
As far as Redis goes, anyway. Now, since it was released, it did
not take Redis too long to win the hearts and minds of the developer
community. As per DB-Engines' statistics,
Redis has actually been topping the charts since 2013.
That's just four years after its release. And in
the Stack Overflow annual survey, it's been voted one
of the most loved databases for many years in a row now.
I also remember seeing that Redis is one of
the most pulled Docker images on Docker Hub
as well. So all in all, pretty popular.
And I also love Go.
And in many ways, I've seen that Go has become this
lingua franca, or language of the cloud, so to speak. It powers
many, many cloud native projects. Apparently 75% of
CNCF projects are written in Go. And if you're using, say,
Docker or Kubernetes, Prometheus, or Terraform,
you're actually using Go indirectly, and the power of Go.
In fact, there are many, many databases that have been written in Go,
like InfluxDB, etcd, Vitess, et cetera.
And it is not just that. Go caters to a wide variety of
general purpose use cases as well, be it web apps or APIs
or data processing pipelines. Think about infrastructure
as code, SRE and DevOps solutions, and also
building CLIs. In fact, that happens to be one of the very, very popular use
cases for Go. So no wonder Go has become a very popular
language. Again, I love
referring to Stack Overflow here. Now, you might
think Go is down at the bottom,
but if you look carefully, it is the only statically
typed language in that list after Rust. And of course there's C#,
but that's further down the list. And in fact, this is from 2022.
If you look at the data from 2021 back
to 2018, you will notice that Go has actually
maintained its top-five spot pretty consistently.
Now, with that, let's talk about Redis and Go. They have
a few things in common, but in my experience, the thing that really
stands out to me is the simplicity.
Now, Redis is a key-value store, but the values can
be any of the data structures that you
see here. These are all data structures that we use as developers
day in, day out. Think of a list or a set, or a map or a
sorted set. So that kind of gives you the feeling that
Redis is just a simple extension of these
core programming language constructs that we use day
to day. And with Go, the simplicity actually comes
in various forms. It ranges from the
excellent yet simple Go tooling to the comprehensive
standard library and easy-to-use concurrency.
And sometimes it's
not about what is there, but what is not there.
There's this notion of not bloating Go with
unnecessary language features. Just to cite an example,
it took a while for generics to be added
to the language. Now, I'm not trying to trick you or sell you
on the idea that Go is simple, or for that matter that any programming
language is simple, right? That's not the case. But with Go,
the goal is really to give you simple components to
help build complex things, and to hide the complexity
of the language itself behind a simple facade.
And there are folks who have actually explained this in
a much better fashion and in much greater detail.
And I would really encourage you to watch the talk by Rob Pike,
one of the co-creators of Go. It is from 2015, but it is very
much applicable, and true to the essence and spirit
of Go, so to speak.
All right, so let's quickly get a very, very high-level
view of Redis, or rather an introduction to Redis, before
we move on. So at its very core,
like I alluded to earlier, Redis is nothing but a key-value store,
where the key can be a string, or even binary for that matter.
But the important thing to note is that as far as the value
is concerned, you can choose from a variety of data structures. These are data
structures like strings, hashes, lists, sets,
sorted sets, and so on and so forth. And by the way,
contrary to popular opinion, Redis is not just a cache.
In fact, it's a solid messaging platform as well. And you can
configure Redis to be highly available by setting up primary-replica
replication, which is asynchronous in nature, or perhaps take
it a step further with something like a Redis Cluster topology.
And Redis is primarily an in-memory system,
but you can configure it to persist
data to disk as well. And there are solutions like Amazon
MemoryDB that actually take it a lot further by
making it possible to use Redis as a durable primary
database. Now, since Redis is open source and
widely popular, you can get offerings from pretty much any
cloud provider, big or small, or perhaps even run it on Kubernetes,
on cloud, on premises, or even in hybrid mode. The bottom line is that with
Redis, if you want to put it in production, you can rest assured
that there will be no dearth of options if you
choose to run and operate it yourself.
Now, what you see here is a list of the core data structures.
The common one is string, which seems really
simple at first, but is actually quite powerful. It can be used
for something as simple as storing a key-value pair, all the way to implementing
advanced patterns like distributed locking, rate limiting, and
even session storage, and so on and so forth. Now, a hash is
very, very similar to a map in Java or a
dictionary in Python, and it is used to store object-like
structures. Perhaps you have a user profile or customer
information, things like that. Well,
a set behaves as promised. It helps you maintain a unique set of items,
with the ability to list them, count them, and
execute operations like union and intersection, and so on. So it's very,
very close to the mathematical version. And sorted
sets are like a big brother to sets, and they make it possible for
you to implement things like leaderboards, which are a very useful
kind of solution in areas like gaming. So, for example,
you can store player scores in a sorted set, and when
you need to get the top ten players, you can simply invoke
the specific sorted set
command and get that data. And the beauty is that all the sorting happens
on the database side, so there is no client-side logic you need to
apply here. And lists,
right, they are a very, very versatile data structure as well,
and you can use them to store many, many things,
but using them as a worker queue is a very common
use case and pattern. And there are popular open source solutions,
which perhaps you have used as well, like Sidekiq or Celery,
that already support Redis as a backend for job
queuing. Redis Streams were added
in Redis 5, if I remember correctly, and they can be
used for streaming use cases. And there is also ephemeral
messaging with Pub/Sub. It's a broadcast, a
publish-and-subscribe mechanism where you can send messages
to channels, and there are consumers which receive from those channels.
There's a geospatial data structure as well,
and a really, really cool data structure called HyperLogLog,
which is an alternative to a traditional set.
It can store millions of items while actually optimizing for
storage, and you can count the number of unique items with
really high accuracy. And like I mentioned earlier,
with all these data structures, Redis just feels like
an extension of the everyday
programming language constructs which we as developers
use day in, day out.
Moving on. So Redis has a relatively
simple protocol, and you can always use it from a
terminal, perhaps. But to build something useful you need clients,
and Redis has a very, very rich ecosystem of
clients. That applies to Go as well, so let's explore that.
Now, what you see here is a selection of
Go clients. There are others as well, but these are the key ones.
The one I'm going to talk about, or rather use heavily
in my demos and
discussion today, is go-redis. go-redis is by far
the most popular Go client for Redis
out there, and it has what you'd expect from a good client:
good features, decent documentation, and a very, very active
community. And to be honest,
go-redis was already super popular and widely
used, and it recently moved under the official Redis GitHub
organization, which is just icing on the cake, to be honest with you.
So like I said, I'll be using this client for
most of this presentation, so I'll not spend a ton of time on it here.
Moving on to another client named Redigo, which is
a fairly stable, fairly tenured client in the Go ecosystem.
It supports all the standard Redis stuff, as you
might expect: the Redis data types, and features such as transactions and
pipelining. And it is actually used in other Redis
Go client libraries as well, like the RediSearch and RedisTimeSeries
Go clients. That being said, the API
for this particular client is a bit too flexible,
at least in my personal opinion. Now, some may like it,
but for me it does not feel like a good
fit when I'm using a type-safe language like Go, right? And like I said,
it's just my personal opinion. But apart from this, the biggest drawback
to me is the fact that this client does not support
the Redis Cluster topology. So that's a
big letdown for me. And then there is another client
called rueidis. I hope I'm pronouncing this correctly;
forgive me if I'm not. I'd say
it's a relatively new client, at least at the time of recording this talk,
but it's very, very quickly evolving as well. It supports
RESP3 protocol features
like client-side caching, and supports a variety of Redis modules
as well. And as far as the API is concerned, this client
adopts a very, very interesting approach. It provides a Do
function, which is very similar to the Redigo client you just looked at,
but the way it allows you to create commands
is via a builder pattern, and this helps retain
strong typing, unlike the Redigo
client which you just saw. And to be honest with you, I haven't used this
client a lot, but it looks like it's packed with a lot
of useful features, so I can't really complain too much.
But for those who are looking to dive deeper into
the performance numbers and things of that nature, go to its
website. There is a benchmark comparison with the go-redis library,
which you might find pretty interesting.
Now let's take a look from a holistic
ecosystem perspective, not just from a Go point of view.
Now, for folks who are not actually using Go with Redis, or maybe
are not aware of it, the popularity of some Redis drivers
might come as a surprise to you. Java workloads
are a huge chunk of the Redis ecosystem, and Jedis
is the bread-and-butter client when it comes to Java applications. But I was
actually very, very surprised to see
Redisson, a Java client, topping the charts.
It is followed by the Node client, and then Python,
and back to Java. And again,
the point is not
to do chest thumping based on GitHub stars. I mean, they matter
to an extent, but this is meant to give you a sense of things,
right? And of course, another thing to
note here is that I was looking at repositories with more than 10,000 stars.
When I was prepping for this, the PHP Redis
client was very, very close. It had
around 9.6k stars. So that's that.
In terms of the ecosystem and where Go stands,
as of now, it's very close to the top of that list.
So anyway, let's switch
gears and move back from stars to the ground reality,
as they say. It's time to see some code now. So I'm going to switch
to the IDE. Okay,
so let me start off with the basics.
So what you're looking at is some sample demo code
here. I'm going to walk you through this code, and I'll keep it fairly
high-level and fairly quick, because I have to move
on to some of the use cases and patterns as well. All right,
so first off, if you look at this, I hope this is
big enough. If you look at this init function: we obviously need to
connect to Redis, right? So we have an init function for that,
where we use the NewClient function
to connect to a local Redis server.
Now, this does not actually establish the connection, and that is why
we ping. We issue a PING command,
which is a first-class Redis command, to make sure that we are indeed connected.
This is a good practice. It's not mandatory, but it's
good to fail fast, as they say. Now let's take a look
at how to work with strings. So here,
as you can see, I'm doing a very, very simple SET.
I'm just using the Set function to set
this key to a particular value, and then using
the Get function to extract that same value.
But you can also set a TTL,
or expiry. So here the TTL was zero,
which means that this key is never going to expire; you'll have to
delete it yourself. But if I were to set a TTL, perhaps a
two-second TTL in this case,
and after, say, 3 seconds, I were to try and
search for this key, I would not find it, because Redis
is going to automatically delete the key after
its expiry time. And this applies to any key in Redis, for that
matter. Okay. All right, so that's it in
terms of strings. Like I said, I'm moving quickly here, keeping things very high-level,
just giving you a sense of how to work with the client. Just give me
a second while I wrap this up. All right, so moving on to
the hash. Now, as you can see, I'm using HSet
to set a bunch of key-value
pairs. So this is the name of my hash,
the name of the key, and I am adding these key-value attributes:
the name of the website, the URL of the website, the date.
Now, HSet is actually very flexible. It can take key-value
pairs like what you see here, it can take a string slice, it can
take a map, and it can actually
take a struct as input as well. So if you
see here, if I were to scroll down real quick, I have declared
this Conf42 struct.
And notice these redis tags which I'm using.
Internally, the go-redis library uses these tags to map
to the names of the fields in the hash which it creates and deals
with. So I can then create an instance of this struct, like I
did here, and then pass it on to this HSet method. This
is a much easier, simpler, and more natural way of representing your
objects, and then letting the library
deal with the hash-level details, doing the HSet
and the get, and populating your structs accordingly. So it's much more
convenient than using a random set
of key-value attributes, which is kind of static, to be honest with
you. Moving on, let's look
at how to use a set. Again, as you might expect,
no surprises here. I'm using the SAdd function:
this is the name of my set, conf42,
and I'm adding a bunch of tags
to it. In this case, I have duplicates here as well.
So if I were to then use the SCard function
to print out how many tags there are for conf42,
it will only give me the unique
count. In this case it will be nine. And if I want to
check, hey, is this SRE tag included
for conf42? In this case it's not, so it's going to return
false, and so on and so forth. So sets are, again,
very simple to work with, conceptually and from the
client library as well. And pipelines,
they're another of my favorites, in addition
to sharded Pub/Sub and a few others. Because from
a conceptual point of view, Redis is primarily
a request-response system. You give it a command, like
SET or GET, and it gives you back a response. So there is this back
and forth going on, but there are use cases where that model doesn't fit.
Perhaps you want to bulk load some data, and using
the pipeline feature you can send Redis multiple commands. Think of it as
batching. You can send a bunch of commands
in one go, and get the responses back in a single round trip.
So as you can see here, I'm using a Redis pipeline.
I've got a pipeline object from the client object here,
and then what I'm doing is inserting one
million keys. And then all I do is execute the
pipeline. Now, this is a very, very important step: if you do not invoke the
Exec function, your pipeline will not get executed, and you'll
be wondering what the heck happened. So make
sure you call the Exec function. And yeah, that's it. I would
encourage you to actually try it out. Try how it works by
using, say, plain client Set calls
versus a pipeline to insert, in this case, one
million items, or maybe even more, and just compare the time it
actually takes. You'll be surprised by the speed and efficiency
of the pipeline here.
The other one I want to talk to you about: you saw how to
use a set, okay? Now, there's this data structure I
mentioned earlier very briefly called HyperLogLog,
and it's sets on steroids, so to speak.
It is a very, very efficient and enhanced version
of a set, because what it can do is store a
lot of data, much, much more than a set, with almost
fixed storage. So if you were to
run this code, and I'm not showing it to you, just in the interest of
time, but say you were to add one
million items. Perhaps there is a use case you have, right?
You want to track the number of unique views, and
you're tracking IPs. So if I were to add IPs to a
set and do the same thing to a HyperLogLog using the PFAdd
command, and you were to inspect within
Redis how much space these keys take, you will see
a huge difference. In my tests, with the
set, it took around 50 MB of memory, approximately 50
MB. With the HyperLogLog, I think it was
well under a megabyte, something like that.
You can try this out: try adding one
million entries. I'm using a faker library just to generate
random fake IP addresses. Run this for yourself and see
the kind of memory usage you notice in your application.
Okay, now moving on, let me show you some things in action,
a couple of use cases which I want to demonstrate to you.
The first one is the use case around worker queues. I told
you about lists and how they make that possible, so I'm going to run an application
to see that in action. Okay? Now, before that, I'm going to start a
Redis server here. I'm going to use Docker
to make it very simple here.
So just give me a second.
All right, so I'm going to start
this server. All right, so my Redis server here
is up and running in Docker, nice and quick. Now, another thing I'm
going to do is start a mock SMTP server,
right? Because the application I'm trying to show you is a very, very typical,
simple web application use case, where someone registers and
then they subsequently get a welcome email.
So that's what I'm trying to demonstrate here. So let me start off another
container here for the fake SMTP server.
Excuse me, this is not what I want to run. Let me copy over,
a lot of copy-paste going on in the interest of time. Don't worry
about this, it should be up and running pretty soon. So I have my SMTP
server running. Now what I'm going to do is run my worker
application, okay? So let
me hop over again and make sure I am in the
right folder, and then run
this command, go run, in the worker folder:
worker.go. So what you will see here is that a worker has
started, and it has published its unique ID. I've just given a
random unique ID to each worker. You'll see why I've done that
pretty soon. And I'm going to run the producer
application as well here, side by side: go run
producer.go. What it's going to do is just randomly
submit data here, and I'm going to show you the code as well, so that
gives you more clarity. But what's happening right now is my producer is
generating data within Redis,
and my worker applications are actually working on it. Okay,
what are they doing? They're sending emails. They are sending emails to the SMTP
server. So let's quickly switch over and take a look
at that server. Just give me a second real
quick while I make this happen.
Yeah. All right, we have this server here,
and if you see, this is my SMTP mail server,
and you will see a bunch of emails flying back and forth. Okay,
so this is what our worker application
is doing, and if you notice, here it says processed by this particular
worker. Right now we just have one instance of the worker running,
so all you will see is this particular worker instance
in these messages. Okay, that's all well and good.
Now let's add another worker instance. That's the
magic of the list data structure,
right? Scaling out horizontally. It's very simple. I'm going to run another
instance here: go run, another
worker. And there you go. So now this has a different ID;
notice it ends with 1119 and starts with 2a.
So what's going to happen now is everything happens automatically.
These worker instances are going to distribute the task of
sending emails amongst themselves. So now you have balance:
you have two workers crunching through all these new users
who are registering on your website. So let's go back to the email
client which we were using and just refresh it, and you'll
notice that there are different workers. So if you see, this is the second worker
which we had started, and if I were
to show you perhaps another one, this is the same worker, and
yeah, this is the first one which we had, if you look closely.
So like I said, work distribution. This can be
used in many use cases. Of course, I showed you a very simple one, of
website users registering and then an asynchronous
process starting where these notification
emails are sent out. Now let's look inside Redis at what
is happening, and also take a look at the code. So I'm going to go
back to my terminal here and fire up the redis-cli
real quick to
log into my local Redis instance. And let's
take a look at this key called celery.
Okay, so there is this key called celery. Let's look at
what type it is. So I'm going to type: type celery.
So it's of type list. What's actually happening behind the scenes is that
the application which I'm using, this Go library,
is using the Celery protocol to do all this magic. Internally
it creates a list called celery, and that's where all my jobs
go. Okay, so this is the list.
What's the length of this list? So, 139:
there are around 139 tasks at this moment. Now, this is
going to change very, very quickly. These are the tasks which are
in the list, and that's what our worker applications are actually processing.
Cool. Let's take a look at one of the
items. I can use a list command, LRANGE, and
take a look at the
first item here, the first task in the list. So this
is the body which you see; it's actually JSON-encoded.
Okay, so let's make this simple for ourselves and decode it.
I'm going to open another terminal here. What I'm going to
do is throw this in and do a base64
decoding. And this is just to show you what's going
on behind the scenes here. So if you see, this is the JSON
payload, the Celery-compatible payload, which is being
shared back and forth. So this is the name
of our user here that the producer application is sending.
And at the end of the day, the worker is churning through this data and
sending emails, fake emails in this case, of course. So with that said,
let me actually show you the application itself very quickly.
So here is the worker which we have.
If I were to quickly show you
the important stuff here,
the most important part is this Celery client
object, and how I register the
function against a specific named task
here. This is all going to shape the Celery message.
And the function which I'm binding here is the send email function, and
it's just using the SMTP libraries to send emails.
Very simple, nothing magical going on here. And if I were to look at the
producer application here: all it's doing is sending
messages,
in this case email addresses,
to the same task, and that's why the worker
is picking them up. And just FYI, this is the library which I'm
using. It's called gocelery.
So yeah, just a very quick example of
how the list capabilities are abstracted away
behind the scenes here. We are not using the native list commands directly;
we are using this library instead, which uses Redis lists
behind the scenes to do all this magic and task distribution
and all the good stuff. So let me close this and move
on to another example. I'm going to show you an application of the
Pub/Sub capabilities. At a very, very simple level,
Pub/Sub is about broadcasting messages. You publish
messages to a channel, and there are consumers which subscribe to that
channel and receive messages as long as they're online and as long as they are
subscribed to that channel. Now, this is very
powerful, but it can be combined with a real-time
technology called WebSocket to create much, much more powerful solutions.
And it actually helps overcome some of the limitations which
WebSocket has, because WebSocket is a very, very stateful protocol.
Now, I've built a very, very simple yet canonical
chat example, a very simple chat server.
So let's see how that works before we go into the code. I'm going
to quickly go into another directory
here and start off this application,
my chat server,
and you will see that my
chat application has started. Let's connect to it and exchange
a few messages, like you would do in a typical chat application.
Now, what I'm going to use is a
command-line WebSocket client,
and I'm connecting to my server as
user one. It's a sample user. Okay,
so this user is connected.
Now what I'm going to do is connect another user, but this time I'm going
to change the name and call it user two. Now, these users can
exchange messages by using the combination of WebSocket and
Redis Pub/Sub. So I say, hello there,
and this will go off. Hi, how are you? It doesn't matter
what you're typing; this is just for the demo, and so on and so
forth. And I can keep on adding users to this,
and this is going to ensure that all the users get all these
messages. So another user, called user three,
says hey there. Yeah, so you kind
of get a sense of where this is going. Okay, now I'm
going to show you the code, focusing just on the key parts, like
I did before. So let me close this and this, and
go over to my chat application. There are a few things which you
should note here. Of course, this is, like I
said, a very small application. But again, we are making connections to the
Redis client, and so on and so forth, setting up our routes here.
But the important thing is, let me scroll down
here real quick, the part where we create the WebSocket connection.
Mind you, let me just scroll up a bit. I'm using this
WebSocket client, part of the larger Gorilla
group of libraries.
So I'm using this WebSocket client, I create this WebSocket
connection, and what I do is associate the username
which I had entered with the actual connection object itself.
So this is a WebSocket connection
here. And when a message comes in via that
WebSocket connection, when that user sends hi, hey, hello,
what I actually do is publish that particular message
to a Redis channel. Okay? So I'm using
this function called Publish here.
And then, when a message is actually received
on that channel, I have this broadcaster function which
is running within a goroutine. What it does is, whenever a
message is received on this channel, it broadcasts it
to all the connected users, and it does that through the
WebSocket object, that session object which we have.
So it goes over the map where I had initially stored
the mapping of user to WebSocket connection, and it simply broadcasts.
Again, like you see,
we are combining WebSockets and a Redis channel to overcome some of
the constraints of WebSocket, and making this
simplified but useful chat application. This is fully functional,
by the way, and if someone were to exit,
that user is going to be deleted from the map
within the code, and everything is going to work as expected.
So of course you can try this out. But like I said, these were very, very simple
examples to demonstrate a couple of very, very common patterns and use cases:
one, the worker queue pattern, and the other, a
real-time Pub/Sub application combining
Pub/Sub and WebSocket. We only have so
much time; I did not cover a lot of things, to be very frank with
you, things like geospatial or Redis Streams. But I hope
you now have a fair idea of the concept of Redis data structures, and
how to use some of these commands: the hash,
the string, the set, the list, and so on and so forth, with
the go-redis client. So with that,
let's move on to the next part of this
talk. All right, back to the presentation.
Now, let's go over some of the most common
things you should watch out for when working with Redis, especially when
you're getting started. There are actually great resources around this
topic, and I happened to write a blog post which I will share at the
end of this presentation. Now, in order to do anything else,
you first need to connect to the Redis server. Kind of obvious,
but believe it or not, sometimes that can be challenging.
So let's take a look at a couple of low hanging fruits
in this area. And this first one is kind of
obvious. Now I'm going to call it out anyway because this is one
of the most common getting started mistakes which I often see folks
make. Now, the connection mode that you use in
your client application will depend on whether you're using a
standalone redis setup or a redis cluster, or perhaps even
redis sentinel if you're using it. For example,
if you're using the go redis client, you will need to
either use the new cluster client or the new client
function depending upon the topology. Now most Redis
clients draw a clear distinction between these
types of connections which you're making, but interestingly enough,
the go-redis client has this option of a universal
client, which is built to be flexible, right? It gives you that
flexibility of using one function to choose and connect
to any redis topology out there. And this
is actually available in go-redis v9, if I remember correctly.
Right, so v9 onwards. And if you don't use
the right mode of connection, obviously you'll get an error. But sometimes,
depending upon which redis instance you're
using, which provider and so on and so forth, the root cause actually might be
hidden. It might be hidden behind a generic error.
So you have to be watchful to
avoid frustrations and those challenges in getting started.
And now this is also something which trips up folks all the
time, especially when they are working with redis on cloud,
which is very common to be honest with you. Now say for example,
the example here which you see is with Amazon elasticache
where your redis nodes are
in a VPC. Now if you have a client application deployed
to a compute service like AWS Lambda,
or a Kubernetes cluster in EKS, or ECS
and so on and so forth, you need to make sure that you have the
right configuration in terms of the VPC and the security groups.
And of course this might depend,
this is going to vary depending upon the compute platform which you're ultimately using.
But something to bear in mind. And like Samantha rightly said,
please, please read the documentation for
your specific provider once you're setting all these
things up, including of course AWS.
Redis is actually wicked fast, and a single server
is actually going to take you pretty far in your journey.
But also know that Redis is primarily an in memory
system. I've alluded to that before. So you will actually want a
bigger machine with more RAM if you want to handle more
data. And at the same time, it's important to
know that Redis is a single threaded system.
So what that means is you can throw more CPU cores
at it, but it's probably not going to benefit you
a lot. So that's where you'll have to think about scaling. And as
is the case with most systems, you can either scale redis up
or out, right? Up and down, or in and out.
Scaling up and down is,
I would say, relatively simpler. Not simple,
relatively simpler. But you need to put a lot of thought into
scaling out horizontally, in and out,
especially if it's for a stateful system like a database,
like redis, for example. So you have to be very,
very clear about the type of workload you're
optimizing for. So perhaps you are looking to
solve a problem where you have a lot of reads,
so you can choose to add more replica nodes to
the existing primary redis node. Or perhaps you want to increase write
capacity. Your application has a write-heavy
workload, so you will find yourself getting
limited by the single primary node. And you should opt for a redis
cluster based setup instead. So you can increase the number of
shards in your cluster. And this is because
only the primary nodes can
accept writes, and each shard can only have one
primary node. But overall, once you have
this setup, this has the added benefit of increasing the overall
high availability of your redis setup
in general. And what you see here is an illustration
from elasticache documentation, but this is applicable to
any setup, whether in the cloud or on prem. Doesn't really matter,
right from a conceptual perspective. And once you
have scaled out, please, please don't
forget to actually make use of those replicas
in your code. That's what matters at the end. And I'm calling this out because
the default behavior in most redis cluster clients,
including the Redis CLI and the go client as well, is to
redirect all the reads to the primary node. Now, if you have
added read replicas to scale traffic with
the default mode, they're actually going to sit idle, believe it
or not. So you need to make sure that you switch to
a mode called the read only mode, wherein it's going to
ensure that the read replicas handle
all the read requests and they are not just passive participants.
Okay, so in the go-redis client, you can set this using
the attribute called ReadOnly. You just need to set it to
true. Or you can also use the attribute called
RouteByLatency, or RouteRandomly to route to
any node, and activating either of these is going to also
activate the read-only mode. So be careful about that and make sure
that you activate
these configuration options. Now, instead of using
consistent hashing like a lot of other distributed systems and databases,
redis actually uses this concept of a hash
slot. Now, there are 16,384
hash slots in total, and a range of hash slots is assigned to each
primary node in your cluster, and each
key belongs to a specific hash slot, and is thereby
assigned to a particular node inside
your cluster. But the thing to know is that multi-key
operations executed on a redis cluster
cannot work if the keys belong to different hash slots,
but you're not completely at the mercy of the redis
cluster here. It is actually possible to influence the key placement by
using something called hashtags. Of course, we know hashtags
from our social media applications,
but you can ensure that specific keys go to the same hash
slot. So, for example, say you're storing the orders
for customer ID 42 in a set, and you have
perhaps the customer profile information in another
hash called customer:42:profile. You can actually use
these curly braces to define the specific substring
which your client application uses for the
hashing, right, to allocate the key to that specific hash
slot. In this case, our hash tag is customer:42,
right? So we can be confident of the fact that the profile
and the orders for a specific customer are going to land up in
the same hash slot, right? So something to definitely bear in mind
and also apply to your client application code.
Sharded pub sub. This is actually one of my favorite features, to be honest with
you. This was, again,
relatively recent, I wouldn't say brand new, but relatively recent,
because it was introduced in Redis 7, and it actually helps scale
your traditional pub sub even further. But you'd actually be surprised
by the fact that many folks actually don't know about this. And I would
really, really encourage you to check out the documentation that covers this very, very well.
And the usage is very similar. If you have used pub/sub in your
client application before, it's very similar to the original subscribe and
publish methods or functions, with the exception that you can now
use those hash tags, right, those braces, in
the channel name in order to influence the shard placement.
So, like I said earlier, something to be mindful of:
make sure that you're using
sharded pub sub in your Redis cluster topology.
Now, there are cases where you need to actually execute bulk operations
on multiple keys across your clusters. Now, the redis
client has a handful of utilities here as well. For example,
as you can see, the ForEachShard function on the
cluster client. And the key thing to note here is the fact that this is
concurrent. This is all going to happen concurrently, so the library actually takes
care of fanning out whatever operation you
specify to multiple goroutines. Definitely helps with scalability
there. And you can also run an operation against each primary node,
so there is a function for that too. I think I've forgotten to mention that,
I apologize.
Anyway, so I think we have covered a few
scenarios so far, but in the interest of time, I'm going to pause. There are
a lot more, but I'm going to pause here, like I said, for time
and sort of try and wrap up things. But before
I do that, here are a few resources which
you might find interesting. A link to the GitHub repo
for the redis client I mentioned this before, as well
as the Discord channel where you can interact with the core maintainers,
ask questions, exchange ideas with the community members,
and the documentation. I actually find the redis documentation
to be very helpful in general. So it's something which you should
follow very closely once you're building anything with redis.
And in case you're using redis on AWS,
you might perhaps find some of these blog posts
pretty helpful as well, in terms of best practices and
optimizing your client applications to the fullest.
And while I am at it,
while I'm sharing all these resources, I might as well do
a bit of self promotion because I've put a fair bit of content out there.
Perhaps when you see this in future there'll be more. And all these
blog posts and content which I've put out, they all have
individual GitHub repos. You can sort of try everything step by step,
whether you're getting started or you're
interested in learning about good practices.
If you want to build practical solutions or try out advanced
use cases, give it a go and give me feedback.
I would love to hear more and whether there is something specific
you'd like me to cover around redis and go. So please don't hesitate
to reach out to me. All right,
that's that. I really, really hope you found this useful. I
would love to have your feedback. It should only take a minute,
I promise. So all you need to do is take out your phone,
scan this QR code, or you can use the link directly here as well.
And once you finish submitting your feedback, you should get a download
link to this presentation as well. So your feedback is actually going to
genuinely help me improve my content and at the end of the day understand your
needs better. So with that, I'm going to wrap this up.
Thank you once again for tuning in and enjoy the rest of your conference.