Transcript
This transcript was autogenerated. To make changes, submit a PR.
Everyone, welcome to DevOps 2024. My name is Indika Wimalasuriya. Today I am going to walk you through how you can maximize speed, reduce cost, and improve the user experience by leveraging AWS ElastiCache Serverless. AWS ElastiCache Serverless was one of the main service offerings AWS released as part of re:Invent 2023 a couple of weeks back. It is a great feature.
I'm pretty sure all of you are aware of ElastiCache, which was initially provided as ElastiCache for Memcached and ElastiCache for Redis. Now, with ElastiCache Serverless, it offers the kind of benefits you can obtain by using serverless capabilities.
As part of my presentation, I will walk you through the importance of performance. I'm pretty sure everyone is aware of it, but we'll quickly touch on why it is important and what it means for modern businesses, and then we'll dive in. This problem is not new; it has been around for a few decades, and we have found some solutions. One of those solutions is obviously caching. Caching has its own pros and cons, and we'll go through some of the challenges that typical, traditional caching has. Then we'll dive into our key topic: what ElastiCache Serverless is, how it works, and what capabilities it provides. Then we will move into an implementation overview. I promise you the implementation is very quick; you can do it in a matter of minutes, and that is what AWS promises as well. So that is again a great feature. And finally, we'll wrap up with some of the anti-patterns and best practices I think you should follow when you are starting this journey.
So moving on, I think you all understand the importance of performance. In today's world, business is very competitive; in every industry the competition is very high. When our customers come to our systems and use them, performance is key. For example, if a customer is going through an order journey and sees sluggish performance, it's a matter of seconds before that customer gives up and moves to a competitor's site. So performance is very important, and it's directly correlated with your revenue. If you have a sluggish system, you can't expect high revenue, because your customers will jump ship and go to competitors very quickly; there might be someone offering a better, faster service. So speed is key. In today's world, speed is everything.
If you look at some of the research findings related to performance and how it correlates with revenue, one of the most famous results comes from Walmart. Some time back, Walmart found that for every 1-second improvement in page load time, their conversion rate increased by 2%. That is remarkable, right? You make a 1-second improvement, and it correlates directly with revenue. COOK found something very similar: they were able to increase their conversion rate by 7% by reducing page load time by 0.85 seconds. Very specific. Mobify found very similar statistics as well, and another study found that even a few milliseconds can have a significant impact on end users, conversion rates, and ultimately revenue.
So you should understand by now that performance is key and directly correlated with revenue. It's not just about operations doing a good job, or about customer experience. At the end of the day, everyone is here to make money, and revenue is important. So if you are focusing on revenue and want to make your systems fast, you'll have to think about ways and approaches to give customers the speed they are looking for.
So moving on, as I said when starting this presentation, this is not a new problem in our industry. We all knew we had a problem: we have to speed up our systems and ensure our customers are getting what we have designed for them. When we went through this problem, we identified caching as one of the main solutions we can offer. Typically, if you have a system and you identify that some data is repeatedly accessed by end users, and that data generally does not change for a certain time, we call it frequently accessed data. It makes sense to put this frequently accessed data in a dedicated layer. Instead of going to the database every time this data is requested, making a database call, running a query, getting the data, and sending it across the network to the end user, we can cut down a lot of database processing time and round-trip time by keeping this frequently accessed data in a caching layer. The primary advantages of this design approach: it definitely provides low latency, because real-time responses are much faster and we largely eliminate the database overhead; and it provides high throughput, because we can serve a large volume of data straight from the cache layer. And since this layer sits in front of the database layer, we have more opportunity to scale it independently. In summary, caching frequently accessed data allows us to achieve low latency, high throughput, and high scalability.
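To make that concrete, here is a minimal cache-aside sketch in Python using redis-py. The endpoint, the key layout, and `fetch_product_from_db()` are illustrative assumptions, not anything from the talk itself:

```python
# A minimal cache-aside sketch, assuming a Redis-compatible cache and an
# illustrative fetch_product_from_db() placeholder.
import json
import redis

cache = redis.Redis(host="my-cache.example.com", port=6379, ssl=True)

def fetch_product_from_db(product_id: int) -> dict:
    # Placeholder for the real database query.
    return {"id": product_id, "name": "example", "price": 9.99}

def get_product(product_id: int) -> dict:
    key = f"product:{product_id}"
    cached = cache.get(key)                       # 1. try the cache first
    if cached is not None:
        return json.loads(cached)                 # cache hit: no DB round trip
    product = fetch_product_from_db(product_id)   # 2. cache miss: query the DB
    cache.set(key, json.dumps(product), ex=300)   # 3. populate, 5-minute TTL
    return product
```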
Generally, for the last several years, from the time this design pattern was identified, we have been using this caching solution. Some of the top use cases, like systems that depend on real-time analytics, financial trading systems, online transaction processing, recommendation engines, leaderboards, IoT data processing, and many other high-value use cases, rely on caching because it provides great benefit: it cuts down the overhead and increases performance.
So with this, AWS as a cloud provider has its own offerings for implementing caching solutions. Three of the main AWS cache solutions are ElastiCache for Memcached, ElastiCache for Redis, and MemoryDB for Redis. These offerings come with a lot of capabilities and will match a lot of use cases and needs. For example, ElastiCache for Memcached is a very simple, non-persistent caching layer. ElastiCache for Redis is persistent, has replication, and has a wider range of capabilities than Memcached; it's more enterprise level, I would say. And MemoryDB for Redis is optimized to provide ultra-low, sub-millisecond latency, so if you require that kind of performance, you can go with MemoryDB for Redis. I have put up a table comparing these three services and what each offers, and some of the capabilities depend on the use case. For example, ElastiCache for Memcached provides caching and session storage; ElastiCache for Redis, on top of that, supports queries, leaderboards, and transient data; and MemoryDB for Redis, on top of everything, supports real-time use cases with ultra performance. Typically great use cases. You can also choose based on your needs: whether you need Multi-AZ support, replicas, durability, data persistence, backups, automatic failover, or sharding, and what architecture, security, and monitoring capabilities you require. There are a lot of capabilities AWS offers, and based on what your design calls for, you can pick one of these; that will give you a better solution and a better end product for your customers.
We have been using these services since AWS launched them; they are very well known, and pretty much every enterprise-level system design leverages them.
So moving on. There are some challenges in server-based memory cache implementations, and one of them is managing capacity. Capacity is a very challenging thing, right? Because this is server based, when we are provisioning our cluster, either Memcached or Redis, we have to determine the capacity up front. That is challenging: whenever someone asks you to do a capacity assessment and come up with a capacity plan, there are a lot of unknown variables that can go against you. I'll talk through capacity in another slide or two; just remember that it's not easy. The second challenge is scaling complexity. Even though Memcached and Redis offer great flexibility, scaling is a challenge. You have to decide which metric to scale on, whether it's CPU or something else, and that requires overhead effort to make decisions; making those decisions is not easy and requires a lot of input variables.
So again, scaling is complex when you are doing it in a server-based memory cache implementation. There's also a lot of operational load: you have to take care of security, backups, and maintenance. That's again a lot of overhead. And high availability is a challenge, right? Server based means you have control over selecting your availability zones and you have the ability to go into multiple regions, but the challenge is that some degree of availability risk remains, and you'll have to put in a bit of effort to ensure you actually achieve that availability. And then there's cost.
While this solution seems very nice and decent, it involves a lot of cost. You have to manage a separate cluster, that cluster needs high-performance hardware, and there's significant software you have to install, fine-tune, and operate. That naturally means a lot of overhead; and when you say cost, there's infrastructure you have to manage and support, and that infrastructure is costly as well. Also, because we are bringing in a caching layer in front of the database, there are a lot of caching designs we can use, and that slows down your development side. When you are doing development, you have to be mindful that there's a caching layer on top of the database, think about what data to cache, and decide which cache design patterns to leverage. That involves quite a lot of planning, preparation, design, and implementation, which will definitely slow down your development activities. So these are some of the key challenges in standard server-based memory cache implementations. If I had to select one or two, I would say capacity is the key. It's not easy to identify capacity, and capacity directly correlates with cost. Because, as we discussed on the first slide, speed is everything: we want to make our systems faster so end users are happy and converting, and that generates money; it's about revenue, right? So we don't want to spend unnecessarily. Capacity is key here, and how we manage capacity is exactly where serverless offers its advantage.
When you come to capacity, what you normally do is look at your future forecast, do an assessment, baseline it, and provision capacity. You say, this is what I want, right? These are my upper capacity limits. If you have ever done capacity planning and an actual production implementation, you will fall into one of two groups. One group always over-provisions, and then there is extra cost, right? I request a certain capacity, but my systems never hit that level, so I'm continuously using less. In one sense that is good: from our end users' perspective there are no performance issues and the apps are fine. But we are bleeding unnecessary cost, because we are paying extra for what we over-provisioned. The other side of the coin is under-provisioning. Suppose, trying hard not to over-provision, we provision conservatively, but then we get spikes like the ones in this graph, where the capacity needed goes beyond our provisioned capacity limit. This is literally called under-provisioning. Every time there's a peak, there's customer impact, because our systems cannot serve those customer requests once we hit maximum capacity. So this is a very challenging thing to solve: either way you will under-provision or over-provision, and while you can come up with strategies, it's still very challenging and a big overhead. That is why I believe serverless caching will be the next big thing in the industry.
So moving on. Amazon ElastiCache Serverless has a great set of features. To go through a few of them: you are able to create a cache in under a minute. If you have created a Memcached or Redis cluster before, you know there are a lot of design aspects, a lot of decisions you have to take, a lot of capacity-related decisions. Now that burden is on AWS, not you. With a couple of clicks you can bring up either Redis or Memcached in under a minute, and that is a promise AWS is keeping. And you don't have to do capacity management: you just provision and start using it, AWS takes care of the rest, so there's no question of over-provisioning or under-provisioning. You use what you use, and you pay for what you use. It's hassle-free and takes a lot of overhead away from us. AWS quotes around 700 microseconds at p50 and 1.3 milliseconds at p99; those are great performance numbers, and most of our systems are happy to operate at those levels. AWS also supports up to 5 TB of storage, so you can have as much as 5 TB of data cached in your system. And one of the beauties is that it's pay-per-use; this is nothing new: when you say serverless, you understand you only pay for what you are using, so you just don't have to worry. These are some of the great benefits serverless brings. AWS also offers a 99.99% availability SLA, which is very decent, and sometimes even challenging for us to achieve in our own implementations. Another major thing is the single-endpoint experience: we don't have to worry about multiple endpoints; using one endpoint, we can plug in and start using it. It also comes with a lot of compliance coverage, so that is taken care of as well. In summary, these are some of the great features AWS ElastiCache Serverless offers, allowing you to do rapid development, rapid deployment, and low-overhead maintenance, with your application live and using the cache.
One question that always comes up with AWS services and serverless is: what about cost? AWS is, as usual, very transparent with its pricing. Here they introduce one new pricing unit, called the ElastiCache Processing Unit (ECPU). First, there is data stored: you are charged for how much data you store, and at the time of this presentation, the charge is around 0.125 US dollars per GB-hour. Then there are ElastiCache Processing Units: you pay for requests in ECPUs, which cover vCPU time and data transfer. Each read or write consumes one ECPU per kilobyte of data transferred; requests that use additional vCPU time or transfer more than 1 KB scale ECPUs proportionally. That is a very good model which keeps billing aligned with the pay-per-use approach. What you have to be mindful of is that pricing is per region; here I am looking at us-east-1. So be mindful of these numbers when you implement; that is how AWS is going to charge you.
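As a rough back-of-the-envelope illustration, here is how such an estimate might be computed. The $0.125 per GB-hour figure is from the talk; the ECPU rate below is an assumed placeholder, so check the current ElastiCache Serverless pricing page for your region's real numbers:

```python
# A rough, illustrative monthly cost estimate for ElastiCache Serverless.
GB_HOUR_RATE = 0.125          # USD per GB-hour of data stored (from the talk)
ECPU_RATE = 0.0000000034      # assumed USD per ECPU -- placeholder, verify!

data_stored_gb = 2            # average cache size
hours_per_month = 730
requests_per_month = 100_000_000
ecpus_per_request = 1         # 1 ECPU per read/write of up to 1 KB

storage_cost = data_stored_gb * hours_per_month * GB_HOUR_RATE
request_cost = requests_per_month * ecpus_per_request * ECPU_RATE
print(f"storage: ${storage_cost:.2f}, requests: ${request_cost:.2f}")
# storage: $182.50, requests: $0.34 (with the assumed ECPU rate)
```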
So moving on, I think this is the important part: let's see how we can do an implementation. All you have to do is go to the AWS console and search for ElastiCache, and you will come to the dashboard. From here you can select a cache engine, and you are starting your journey. Once you do, you will have this view, and this is all the information you have to provide. It initially asks whether you want serverless or whether you want to design your own cache; it's serverless we are going to test here, so we select serverless. You have to give it a name, and if required you can make some changes to the default settings, but otherwise you can hit create, and in under a minute AWS will create your serverless cache.
And so it does: you can see it comes up here. Once the cache is available, it's business as usual. If you are someone who has used Memcached or Redis previously, it's just a matter of taking the endpoint and plugging it into your application, and then you start using it. That's it, right? You don't have to wait a long time, and there's no overhead, no fine-tuning, no thinking about capacity; you just start using it. That is what serverless is about: AWS takes care of the infrastructure and everything else, and you are free to focus on actual implementation and development work. Here you can use some of the simplest commands, like connecting with the OpenSSL client, setting a value in the cache, and then getting that value back.
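If you would rather do the same set/get exercise from application code, a minimal sketch with redis-py might look like the following, assuming a Redis-flavored serverless cache; the endpoint is a placeholder, and note that ElastiCache Serverless requires encryption in transit:

```python
# A minimal set/get sketch against a serverless cache's single endpoint.
import redis

r = redis.Redis(
    host="my-cache-xxxxxx.serverless.use1.cache.amazonaws.com",  # placeholder
    port=6379,
    ssl=True,  # serverless caches require TLS
)

r.set("greeting", "hello from ElastiCache Serverless")
print(r.get("greeting"))  # b'hello from ElastiCache Serverless'
```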
So this is easy; it doesn't really require a live demo, and with these couple of screenshots you get the picture. I'm pretty sure this is among the easiest services AWS is offering. Moving on: now that you have a decent understanding of the AWS serverless caching solution, let's look further. Even though it's easy, there are some anti-patterns, just like with everything; even this beautiful and great solution has some challenges, and you have to be mindful not to fall into some of the standard anti-pattern traps.
One of the traps, I feel, is over-reliance on caching. In the days when we had to spend significant time building Memcached or Redis clusters, we would always think about whether caching was really needed for our application; we would spend a bit of time understanding the pros and cons and then come to a decision. But now that we can create a cache in less than a minute, someone might just go and create one and rely on it, and I believe that is not a good thing. You should always understand, for your application and your use case, whether caching is required. If it is, then of course you should go ahead; but if it is not clear, you should do a lot of due diligence to ensure it's right for you. Otherwise it can have unexpected drawbacks.
The second one is not handling cache misses. Once you have the cache in place, there can be operations where there are cache misses, right? You have to be very mindful: in order to get the benefit out of this solution, you have to figure out how to identify cache misses quickly and how to turn them around; one illustrative technique is sketched below.
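One illustrative way to soften repeated misses for keys that simply do not exist in the database is to cache a short-lived "negative" sentinel, so each miss does not turn into another database query. Negative caching is my example here, not something prescribed in the talk, and `lookup_in_db()` is a placeholder:

```python
# Negative-caching sketch: remember recent "not found" results briefly.
import redis

cache = redis.Redis(host="my-cache.example.com", port=6379, ssl=True)
MISSING = b"__missing__"

def lookup_in_db(key: str) -> bytes | None:
    ...  # placeholder for the real database query
    return None

def get_value(key: str) -> bytes | None:
    cached = cache.get(key)
    if cached == MISSING:
        return None                                # known-missing: skip the DB
    if cached is not None:
        return cached                              # cache hit
    value = lookup_in_db(key)                      # cache miss: fall back to DB
    cache.set(key, value if value is not None else MISSING, ex=30)
    return value
```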
The other important thing is security, right? Just like with everything AWS provides, AWS takes care of certain things, the infrastructure and so on, but the data residing here is your data. You will have to implement security policies, processes, and procedures to ensure you manage the security of that data, which is very important. Moving on: now that you have some understanding of the anti-patterns, here are some best practices you should follow. One is to optimize your data around its access patterns; that is how you get the maximum benefit out of serverless caching. You can use some of the standard caching designs: read-through, write-through, write-behind, cache-aside, and refresh-ahead. Patterns like write-behind can really give you high benefit, right? These are some of the standard, well-known design patterns, and you should really use them in your development; a small write-through sketch follows.
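As one example of these patterns, here is a minimal write-through sketch: every write goes to the source-of-truth database and to the cache in the same operation, so subsequent reads find fresh data. `save_user_to_db()` and the client setup are illustrative assumptions, not the talk's own code:

```python
# Minimal write-through sketch: update the database, then the cache.
import json
import redis

cache = redis.Redis(host="my-cache.example.com", port=6379, ssl=True)

def save_user_to_db(user: dict) -> None:
    ...  # placeholder for the real database write

def save_user(user: dict) -> None:
    save_user_to_db(user)                                        # 1. write DB
    cache.set(f"user:{user['id']}", json.dumps(user), ex=3600)   # 2. update cache
```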
The next one is to think about your serialization. You can reduce data transfer cost and improve overall performance by focusing on efficient serialization; think about MessagePack, Protocol Buffers, Avro, or JSON, to name some famous examples. That will actually give you a real performance benefit; a quick comparison follows.
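To illustrate why the format matters, here is a quick size comparison between JSON and MessagePack for a typical cached value. This is a minimal sketch, assuming the third-party `msgpack` package is installed (`pip install msgpack`):

```python
# Compare serialized payload sizes for the same value.
import json
import msgpack

value = {"id": 1234, "name": "example", "tags": ["a", "b", "c"], "price": 9.99}

as_json = json.dumps(value).encode("utf-8")
as_msgpack = msgpack.packb(value)

print(len(as_json), len(as_msgpack))  # MessagePack is typically the smaller payload
```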
Another thing is to come up with a good cache key naming convention, for example a predictable scheme like user:1234:profile rather than ad hoc strings. That will save you a lot of trouble and effort during a long stretch of development or when you are doing maintenance.
The other thing, just like with everything, is to monitor and analyze continuously: understanding your cache hit and miss rates, latencies, response times, cache eviction rates, data transfer volumes, and cache size will be really helpful. Those data points will give you ideas about what you should do with your cache and how you can optimize it; a small monitoring sketch follows.
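As a starting point for that monitoring, here is a minimal sketch of pulling hit/miss counts from CloudWatch with boto3. CacheHits and CacheMisses are standard ElastiCache metric names, but the dimension used here and the cache name are assumptions to verify against the metrics your serverless cache actually publishes:

```python
# Pull hourly cache hit/miss sums from CloudWatch and compute a hit rate.
from datetime import datetime, timedelta
import boto3

cloudwatch = boto3.client("cloudwatch")

def metric_sum(metric_name: str, cache_name: str) -> float:
    resp = cloudwatch.get_metric_statistics(
        Namespace="AWS/ElastiCache",
        MetricName=metric_name,
        Dimensions=[{"Name": "CacheClusterId", "Value": cache_name}],  # assumed dimension
        StartTime=datetime.utcnow() - timedelta(hours=1),
        EndTime=datetime.utcnow(),
        Period=3600,
        Statistics=["Sum"],
    )
    points = resp["Datapoints"]
    return points[0]["Sum"] if points else 0.0

hits = metric_sum("CacheHits", "my-serverless-cache")      # placeholder name
misses = metric_sum("CacheMisses", "my-serverless-cache")
print(f"hit rate: {hits / max(hits + misses, 1):.1%}")
```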
By following these four best practices, you can get the most benefit out of what AWS ElastiCache Serverless is offering.
This will allow you to develop a faster, better system that provides world-class service to end users, and of course you can reduce your operational cost. With this, I'm going to wind up my presentation. Thank you very much for spending this time, and have a nice day, everyone.