Transcript
Hello and welcome to my session, Reacting to an Event-Driven World, as part of Conf42. My name is Grace Janssen and I'm a developer advocate working at IBM, based in the UK. My Twitter handle is there on the slide if you'd like to follow me. I primarily focus most of my time in the Java and JVM ecosystem, mostly looking at things like reactive technologies and cloud native infrastructure and technologies. Hopefully in this session I'm going to be showing you how we can build, architect, and design our applications to be reactive to this event driven world that we've created. So let's get started. But obviously no
conference session is complete without a coffee demo. So we've got a coffee demo here
for you today. Kind of ironic, because I don't like coffee, but anyway,
so this coffee example is actually available as an open source
project. So if you'd like to check it out, the link is there on the slide. It was made by our cousins over at Red Hat. So in this coffee
example, we have coffee lovers coming in and making coffee
orders via HTTP requests to a coffee shop. Then the coffee shop sends that order to the baristas, via HTTP again, and the baristas can make the coffee. Now, the issue with this is that when the coffee lovers make this request, it's a blocking request by nature. The coffee lovers essentially have to wait at the till until the barista has completed their order before they can sit down with their coffee, and they can't place the next order until the barista has completed that first order. So it's a very blocking process in this regard.
But what people often suggest when they're looking at applications like this
to make it slightly more asynchronous and more non blocking, is to
introduce something called event driven architecture. Event driven architecture, often shortened to EDA, is a very popular architectural style
that really enables events to be placed at the heart of our systems and
our applications. When I talk about events, an event here is
essentially a record of something that's changed or something that's happened within
the system. So it's usually sort of a state change within
our system. And these events are immutable, so they cannot be changed, but a new event can be created to represent a new state change to that same entity or object. They're also ordered in the sequence of their creation. And here's a basic architectural diagram of what this represents.
So we've got that event driven messaging backbone. So this is that immutable log
of events placed at the heart of our system. And then we've got microservices
consuming and producing to this event log. So you
can see microservice one, there is publishing events to that event log,
microservices two and three both publish to and consume from that event log,
and service four is just consuming. So event driven architecture is just
about having sort of loosely coupled microservices that are able
to exchange information through this production and consumption
of events. So let's see what it looks like when we put it into our
barista example. When we put it in, as you can see here, we're still using
HTTP to communicate between the coffee lovers and the coffee shop. But now we've got
an event backbone between the coffee shop and the barista and a
new component called the board. So in this case, when our coffee order is made
by our coffee lovers, it goes to the coffee shop and gets added onto the event backbone, onto that queue or that topic. That means that the barista can read from that topic as and when they're ready, picking up the next record, the next coffee that needs to be made.
And then once the barista has made those coffees, they can then update the
board to basically announce when that coffee has been produced.
And the coffee lovers will then know to go and collect their order. What this
means is that the coffee lovers no longer have to wait at the till for each of their orders to be processed. Instead, they can leave the till knowing that their order has been placed and that it will eventually be made, so their request will eventually be satisfied. They can go and sit down, talk to their friends, probably discuss Java as we all love to do, and then once they see the board has updated, they can go and collect their coffee. So it's a much more asynchronous process here. We're trying to get rid of some of that blocking component from our original application by introducing this style of architecture. A tool that's often used here is Apache
Kafka. So if you've not come across Apache Kafka, it's an open source project
that's all about providing a distributed streaming platform. It enables things like stream history, that immutable data we were talking about for the event log in the center of the diagram. It enables really high availability of the information within your application
and it is extremely scalable. So if we were to introduce,
say, Kafka in order to create that event driven backbone in
our barista application, what does this really mean? Does this mean
that our coffee shop is now non blocking and highly responsive?
In other words, is our microservice system non blocking and highly responsive?
And people often assume that the answer to this question is: well, yes, I'm using Kafka, I've got this event driven architecture, so I must be highly responsive, scalable, and non blocking, because I've got this asynchronicity, this decoupling. But that's not the whole answer. There is more to this
application if we want it to really be non blocking and highly responsive
from end to end, than just shoving in tools like Kafka,
or architectural approaches like event driven architecture, and expecting it
to magically be non blocking and highly responsive all the way through.
Kafka is a great tool, but it isn't enough just to take a good tool and shove it in. We need to be using it in the right ways, and we
need to be building our applications in the right manner to really
enable non blocking behavior all the way through. So this is where the concept of reactive systems comes in. Reactive systems are really about looking at that high level approach, that system
level view of our application, to ensure that every
stage of our application is asynchronous and nonblocking,
and highly responsive and reactive. So this all stemmed from
a manifesto that was created back in 2013 by
a group of software engineers who really wanted to sort of lay
out the key characteristics and behaviors we need to be expressing
in our applications in order for them to be responsive
and reactive for our end users. And it's sort of based around
four key behaviors or characteristics. So the underlying
behavior in this manifesto is having this message driven
form of communication. Now, this is all about decoupling
the different components of your applications so that a potential failure in one
wouldn't cause a potential failure across the application and mean that your application is
unresponsive. It also enables us to have that asynchronicity
between components because we've decoupled them. So this message driven form of
communication really allows us to achieve this asynchronicity. But interestingly,
in the original manifesto, it was actually event driven.
And you can check out why they switched from the Reactive Manifesto 1.0 to the Reactive Manifesto 2.0, from event driven to message driven; there's an article online, you can just go and google it if you'd like to
find out more information about that. But what's the difference between the two? Because often, when we think of things like event driven architecture, the clue's kind of in the name: we think of it being based around events and being event driven. So what's the difference? Well,
we've got a definition here on the left for messages and a definition here on
the right for events. So we've defined a message as an item of data sent
to a specific location, whereas an event is more of a signal
emitted by a component upon reaching a given state. So less
location specific there. But interestingly, when you look at this, you can have a sort of mesh of the two, where a message contains an encoded event in its payload. So they're not necessarily distinct types of communication method. But when
it comes to Kafka, it doesn't really matter. Although Kafka was traditionally seen as event driven, the project actually doesn't reference events anywhere in its description of what it enables. Instead, it references something called records: it enables you to publish and subscribe to streams of records, store records in a durable way, and process streams of records as they occur. So they've made this deliberate move away from the word events and instead use records, because Kafka can be used for both event driven and message driven communication, or a hybrid between the
two. So when it comes to our Reactive Manifesto and enabling that message driven, asynchronous backbone of communication for our application, Kafka can be a really great tool to use. But there's a reason there are additional characteristics listed in the manifesto that need to be expressed by our applications in order for us to be reactive. So let's
take a look at our barista example and see how we might run into problems if we just shove in an event backbone and expect it to be magically nonblocking and asynchronous all the way through. So here's our original barista
example. But in this example, we've only got three coffee lovers.
Now, there are more than three coffee lovers in the world. So what happens when
we get an influx of coffee lovers coming into our application or to our
coffee shop trying to place coffee orders? So essentially what we've done here is
we've created a massive increase in load on our system. Now,
more coffee lovers obviously means more coffee orders. These coffee lovers aren't necessarily blocked, because once they make their order it's placed onto the backbone and they can go and sit down. But it does mean we've got this huge backlog of coffee orders on our event backbone, and unfortunately we've only got one barista to serve them, to make those coffees and complete those requests. So in this case we have a potential point of contention, a potential failure. Even if our barista doesn't go down or fail, or get stuck in a process, our barista is going to be slow at getting through all of those coffee orders, so our app isn't going to be as responsive as we would like.
Our coffee lovers are going to be waiting a while for their coffee. And this is where the next behavior in the Reactive Manifesto comes in: elasticity.
So, being able to scale both up and down the resources within
our applications, so that we can gracefully deal with load fluctuations
to our application. Again, this behavior changed between the Reactive Manifesto 1.0 and 2.0: it was originally scalable, and it got changed to elastic, because they realized it isn't just about scaling up resources. That's great when you've got increased load, but when you don't have that load on your system, it's really important that we're able to appropriately scale our application's resources back down so that we're as cost effective as possible.
So, this is where elasticity comes in. So, in our barista example,
if we were able to elastically scale up the number of barista microservices,
we could provide a much more responsive, much more reactive application for our coffee lovers, because we could get through those coffee orders much quicker and gracefully deal with that load without the barista becoming a potential point
of failure or contention within our application. Another example here,
in this barista example, we've added an additional component. So this additional
component is, we're calling it the coffee serving table. So, this is
representative of, say, a downstream microservice that's
perhaps needed for additional processing, or perhaps an external service that you're utilizing,
like an external database. So, in this case, for the barista to
be able to update the board and produce and serve the coffee,
they have to place the coffee on the serving table for the coffee lover
to come and collect. Now, what happens in the potential scenario where our
coffee lovers are being a little bit lazy, they're not coming to collect their coffee
orders very quickly. And so these coffee orders are building up on the table.
The serving table is now full, and the barista is juggling coffee
because they can't put it down on the serving table. That's essentially representing
that third party or downstream component going offline,
perhaps failing, or potentially getting stuck in a process. And that
prevents the barista from being able to produce any more coffees, because they're stuck with the coffees they're trying to offload to that downstream component. So in this case, this could potentially become an unresponsive app, because we're no longer able to get through any more coffee orders and we don't have resiliency built in. And that's the next characteristic:
being able to be resilient in the face of any potential failure that could occur
within your application. So if we were to introduce some resilient behaviors into our application, we could introduce, say, a circuit breaker, or perhaps even things like back pressure communication. So basically,
letting the application know that one of those downstream components is failing,
or is stuck with a load of some kind or a process. And being able
to build in resilient characteristics, like perhaps spinning up another barista
and serving table instance or replica, so that we could redirect requests
to them, or share the load across them, or perhaps rate limit the
number of orders coming in. All of these behaviors would help to prevent our application
from potentially failing and becoming unresponsive. And this leads to the last characteristic.
By enabling elasticity, resiliency, and that message driven,
asynchronous form of communication, we can enable the last characteristic of the
manifesto, which is being responsive. We need our applications to
be as responsive as possible to any state changes or events that are occurring
within our application. And by implementing these characteristics, we can essentially
achieve that reactive, non blocking behavior and characteristic we need for the event driven world that we now live
in. So, how do we go about actually building these systems, building these
types of applications? Everything I've mentioned so far has been fairly high
level, fairly based around sort of characteristics and behaviors of
our application. Let's talk about the practical side: how do we achieve this? So we've got
here a fairly basic application, just made up of three microservices,
just so we can go through the various different aspects and I guess layers
within it that we need to be considering. We've already looked at the message driven, asynchronous data layer and how Kafka can enable us to decouple the components within our application and provide that message driven, asynchronous backbone. So that's great for the data layer. But we also need to think about how our microservices are interacting together, whether that's through Kafka or not. And for this we can introduce reactive architecture design patterns. There are many of them; we've just listed four here. It's not an exhaustive list, these are just four that we've come into contact with fairly
regularly when looking at reactive applications. So we've got
here CQRS, which stands for Command Query Responsibility Segregation. That's essentially all about splitting out the read and the write APIs
so that we can have really high availability of the data within
our application. Then we've got circuit breaker. Circuit breaking in software is very similar to the concept in electrical engineering. An upstream component recognizes when a downstream component is under stress or load, or perhaps is failing, for example by identifying the same error message repeatedly coming out of its logs. That upstream component can then temporarily put a stop on the requests routed to that downstream component or microservice, and instead reroute those requests to an alternative replica microservice. Then, once the downstream component becomes healthy again, it can reinstate the route to it, which prevents it from becoming a potential point of contention or failure (there's a small code sketch of this after this list of patterns). Then we've got sagas. So sagas
are essentially a mechanism that takes what would have been a traditional transaction in a monolithic architecture and does it in a more distributed manner. We create multiple microtransactions that have fallback behavior to account for things potentially going wrong partway through. So it's more like a sequence of local transactions, where each local transaction updates data within a single service. And then we've
got backpressure. I mentioned backpressure earlier, as well as circuit breaker, as a potential solution for enabling that resiliency within our barista microservice application.
Backpressure is a form of feedback. So it's a feedback mechanism from
a downstream component to an upstream component to essentially help rate limit
so that that downstream component doesn't become overloaded or overworked
or stuck in a process and potentially become a point of contention or failure.
So it's communication and feedback so that we can rate limit the requests, messages, or events coming to that downstream component from the
upstream component. And many of these are utilized in reactive applications.
But as I said, you can go online and find many more reactive architecture design patterns that you can utilize when you're making sure that the communication between the components of your application is as asynchronous and nonblocking as possible.
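To make the circuit breaker pattern mentioned above a little more concrete, here's a minimal sketch using the MicroProfile Fault Tolerance annotations (MicroProfile comes up again later in this session). The class and method names here are hypothetical stand-ins for the barista example:

```java
import org.eclipse.microprofile.faulttolerance.CircuitBreaker;
import org.eclipse.microprofile.faulttolerance.Fallback;
import javax.enterprise.context.ApplicationScoped;

@ApplicationScoped
public class BaristaClient {

    // Open the circuit if 2 of the last 4 calls fail, and keep it open
    // for 10 seconds before letting a trial request through again.
    @CircuitBreaker(requestVolumeThreshold = 4, failureRatio = 0.5, delay = 10000)
    @Fallback(fallbackMethod = "orderFromBackupBarista")
    public String placeOrder(String order) {
        return downstreamBarista(order); // downstream call that may be failing
    }

    // Rerouted request: same signature as placeOrder, as @Fallback requires
    public String orderFromBackupBarista(String order) {
        return "backup barista is making: " + order;
    }

    private String downstreamBarista(String order) {
        // hypothetical call to the struggling barista microservice
        throw new RuntimeException("barista unavailable");
    }
}
```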
But what about within our microservices? That's the next level we need to be thinking about. We also need to ensure that the logic we're writing within our microservices, the logic within our application, is also highly responsive, nonblocking, and reactive. And to do this we can utilize something called reactive
programming. So here's a basic definition of reactive programming: it's all about asynchronicity again. It's a paradigm where the availability of new information is what drives the logic forward, rather than it being driven just by a thread of execution. And we can enable this through several different programming patterns. As a Java developer, you're probably familiar with the concept of futures. If you're not, a future is essentially a promise to hold the result of some operation until that operation is complete.
So that's really focusing on asynchronicity. Then we've got reactive programming libraries. These are all about composing asynchronous and event based programs, and they include examples like RxJava and SmallRye Mutiny; actually, lots of the frameworks we'll be going on to later utilize these programming libraries within them. And then we have the Reactive Streams specification. This specification is really
a community driven effort to provide a standard for handling
asynchronous data streams in a non blocking manner while providing back pressure
to stream publishers. And again, many of the frameworks we'll look at in a minute
utilize this community driven open source specification.
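As a quick illustration of the futures idea, here's a minimal sketch using Java's built-in CompletableFuture; the brewCoffee method is a hypothetical stand-in for some slow operation:

```java
import java.util.concurrent.CompletableFuture;

public class FutureExample {
    public static void main(String[] args) {
        // supplyAsync runs brewCoffee on another thread and immediately
        // returns a future: a promise of the eventual result
        CompletableFuture<String> order =
                CompletableFuture.supplyAsync(() -> brewCoffee("latte"))
                                 .thenApply(coffee -> coffee + " is ready");

        // register a callback instead of blocking on get()
        order.thenAccept(System.out::println);

        order.join(); // only so this demo doesn't exit before completion
    }

    private static String brewCoffee(String type) {
        return "one " + type; // hypothetical slow operation
    }
}
```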
So we've looked at the different layers of our application where we need to make sure we have this reactive behavior: the data layer, the logic within the microservice layer, and the communication between the components of the microservice layer. But we also need to make sure the configuration we're using with tools like Kafka enables these reactive behaviors. So let's take a look at how we can really utilize
Kafka as best as possible for these reactive systems.
So as I said before, Kafka can be a really great tool for enabling this message driven, asynchronous form of communication within our application. But what about the other characteristics and behaviors within the manifesto? Let's look at resiliency first: how do we enable greater resiliency in Kafka? Well, Kafka is really beneficial in that it has this inbuilt resiliency within it. It's different from traditional message queuing systems because it provides
this stream history. So when a consumer loads a record from Kafka,
the record isn't removed from the topic. So that means that both that consumer
and any other consumers can reconsume that record at a later time if
they need to. And that's really useful for enabling consuming applications
to recover really well from potential failures. And that's
where this sort of immutable data comes in. It doesn't get deleted, and that stream
history remains, which means we have a really resilient form
of data retention and data persistence for our application. Kafka also has inbuilt resiliency with regard to how it works itself.
When you start up a Kafka cluster, it will contain a set of Kafka brokers, and a cluster usually has a minimum of about three brokers. And we'll see the reasons behind
that later. So Kafka is then broken down further into topics.
And these topics are where we store our records. So a topic is sort of
a logical grouping of a similar kind of message or record.
And then it's up to you to define which messages or which records
will go into which topic. So for example, with our barista example,
we might have one topic for coffee orders, and we might have another topic for
user information updates. But it's up to you to define what those topics are and
which messages and records are going into which. Within a topic, you'll have one or
more partitions. Now, partitions are distributed across the Kafka brokers.
As you can see here, we've got partition one on broker one, partition two on broker two, and partition three on broker three. When it comes to the brokers themselves,
we have something called leaders and followers. For each topic, one of the brokers is elected as the leader; this is all automatic in Kafka. The other brokers are automatically assigned a follower status. Your application will connect to the leader broker, and the records being sent from the application to Kafka get replicated by the followers repeatedly fetching messages from the leader.
Again, this is all done automatically by Kafka. So this means that all
the apps will connect to the leader, they'll consume and produce to the leader,
and then that information will be duplicated across the different brokers
due to that follower behavior. But what happens in the potential scenario
where our leader broker, for example, goes down? Well, in that case, you might
think that the application's connection with Kafka would be broken. But, no, that's not the
case. Again, we've got this resilient behavior built in. So instead,
what happens is a leader election occurs. A leader election is where Kafka automatically assigns one of the followers to become the new leader and switches your application's communication over to that new leader broker. So it ensures that your application essentially continues to be able to communicate with Kafka to consume and produce.
And it means that we still have a replica broker, so we can still replicate that data, have that data persistence, and ensure that we've got that resiliency behavior built in in case that second leader were to go down, for example. It also gives enough time for the original broker to come back up again and become a new follower.
So that's really great: it means that we don't lose any data and we don't lose that connection with Kafka. It's that resilient behavior built in. So, we've talked about how Kafka itself is resilient. What about how we communicate
with it, with our application? So, when it comes to creating resilient producers,
we have two things that we need to be considering: our delivery guarantees and our configuration. When it comes to trying to be as resilient as possible, you can't just use a fire and forget method, because if the broker were to go down with a fire and forget methodology, your messages could be lost. So we need to be thinking about our delivery guarantees to make sure those records are being received. There are two options here: at most once and at least once. With at most once, you might lose some messages. It's not completely resilient, but if you don't mind some of the messages getting lost, it does give you greater throughput. It isn't the most resilient option we could use, though. Instead, we'd be looking at at least once, which ensures that there is guaranteed delivery of that record, but you may get duplicates.
So that is something to consider when you're trying to pick which delivery guarantee to
use. Then we've got the configuration: acks and retries. Acks is short for acknowledgements, and this is essentially acknowledging that the records or messages have been received by Kafka. You can set acks to zero, one, or all. Zero is for when you really don't care about getting the message acknowledged. It's really good for fast throughput, but not for resiliency. If you set acks to one, it makes sure that the leader successfully received the message, but doesn't bother about making sure the followers have received it. The danger with this is that if, for example, your leader broker were to go down after acknowledging that record, but before the followers had been able to replicate it, you could potentially lose that record. So it's, again, not the most resilient, but it is faster in terms of throughput than all. All is the last option, and that's where you wait for all of the in-sync replicas to confirm that they've successfully received the message or record. It's the most resilient configuration value for this setting. The other thing you need to consider is retries. Retries are for if the acknowledgement times out or fails: how often do you want to retry sending that record or message, how often do you retry producing that event? But what you need to think about is how those retries could potentially affect your ordering. So this is important to consider when you're setting this retries configuration.
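As a sketch of what those settings look like on a plain Java producer (the broker address and topic here are just placeholders):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ResilientProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", "all"); // wait for all in-sync replicas: most resilient
        props.put("retries", 3);  // resend if the acknowledgement fails or times out
        // retries can reorder records; limiting in-flight requests preserves ordering
        props.put("max.in.flight.requests.per.connection", 1);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("orders", "order-1", "latte"));
        }
    }
}
```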
So, let's take a look at consumers. So, with resilient consumers,
the configuration we need to be aware of is how we're committing our offsets. Again, because Kafka has this stream history and retains all of the data within it, it's got this persistence. That means we need to be aware of where we've read up to, so that we don't reread a message if our consumer was to go down, for example, and come back up again. To do this, we use something called offsets. An offset is just a value assigned to each record within a topic, and it increases over time. For example, if we added a new record to this particular topic, it might be five or six, depending on what you class the dotted line as. There are different methods of committing that offset, meaning when you commit it. For example, if we had a consumer and they'd read up to one, we'd want to commit one, so that if the consumer went offline and had to come back up again, they would start at two instead of at one. The two different methods of committing offsets are manual and automatic. So let's take a
look at our barista example again to see how these differ in terms
of their resiliency for our application. So with autocommit,
there's no code in the application that determines when this offset
is going to be committed. Instead, it's relying on default settings in
the underlying client, which in this case is probably a Java client. And that means
it's essentially based off a timer. So it will automatically commit
offsets for the messages that the consumer has read from Kafka based on
a default timer method. So in this case, the barista is looking at this
topic. And on our topic we have three records, which represent the three orders from our three coffee lovers: a coffee order, a cappuccino order, and a latte order. Our barista will start at the beginning and will
start producing the coffee order. Now, because we're on auto commit, after an allotted period of time our offset is going to be committed. So we're going to commit
that offset represented by a tick here. But unfortunately, whilst the barista
is making the coffee order, they trip and they spill the coffee. So they've got
no coffee left anymore. So this is representing that barista microservice going
down partway through processing that record. What this means is that actually when the
barista microservice comes back up again, it looks at where it's committed its offset.
It's already committed the first offset, that coffee order. So instead it starts
at cappuccino. They successfully make cappuccino and latte.
But that means at the end of it all, we've only got two coffee orders
instead of three. So we've lost a record somewhere along the way.
However, let's take a look at manual commit. Manual commit is the other option you can choose for when to commit that offset for your consumers, and this is where you write code into your application to determine when it's going to be committed. That might be pre-processing, midway through processing if you want, or post-processing. In this case, we're going to do it post-processing: we're going to only commit the offset when we've actually finished making the coffee order and served it up to our customer. So in this case, when we're making the coffee order, if we were to spill it, we haven't committed that offset yet, so when our barista microservice comes back up again, it would know to start back at the coffee order. Now we only put that tick up, we only commit that offset, when the coffee is served. And that means that by the end of it all we've got a much more resilient system, because we haven't lost any records, because we're committing our offsets at the point at which we expect that behavior to occur.
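Here's a rough sketch of what that manual, post-processing commit looks like with the plain Java consumer client; the topic name and the makeCoffee step are hypothetical:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ManualCommitBarista {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("group.id", "baristas");            // the consumer group id
        props.put("enable.auto.commit", "false");     // switch off the timer-based auto commit
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    makeCoffee(record.value()); // if we fail here, the offset is not yet committed
                    consumer.commitSync();      // commit only once the coffee has been served
                }
            }
        }
    }

    private static void makeCoffee(String order) {
        System.out.println("Making: " + order); // hypothetical processing step
    }
}
```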
So this is how Kafka has resiliency built in: through its stream history and its immutable data, and through things like its broker system and leader elections. And we're able to introduce greater resiliency through the configuration of the communication between our consumers and producers and Kafka, through things like acknowledgements and the commit method we're using. So let's take a look at the next behavior, elasticity. How do we enable this scalable behavior when utilizing Kafka?
So Kafka is designed to work well at scale and to be scalable itself, and for producing applications it does this using partitions. For any particular topic there are one or more partitions; in this case there are three, because when I created the topic I specified that I wanted three partitions. So it is something you need to specify. Kafka will then aim to spread the partitions for a particular topic across different brokers. This allows us to scale producers very easily, as they won't put the load for one particular topic on just one specific broker. And we can always add more brokers if we want to and spread the load out more. So it's really great for gracefully handling load in our application and enabling that scalability for producers.
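For illustration, here's one way the partition count can be specified when creating a topic, using Kafka's admin client; the names and counts here are placeholders:

```java
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateOrdersTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address

        try (AdminClient admin = AdminClient.create(props)) {
            // three partitions, replicated across three brokers
            NewTopic orders = new NewTopic("orders", 3, (short) 3);
            admin.createTopics(List.of(orders)).all().get();
        }
    }
}
```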
Actually, the interesting part when it comes to scalability, where we really need to make a conscious effort to enable this behavior, is in our consumers. Consuming in Kafka is made possible by something called consumer groups.
When you're consuming messages in Kafka, because of this stream history again, we need to think about how we're going to handle it: if we have multiple consumers all trying to consume from the same topic, we're running the risk of rereading records, duplicating processing, or not getting ordering guarantees, et cetera. So in Kafka we have this consumer group idea, to help ensure that we have the ordering guarantee and that we're not rereading messages from the same consumers. This really allows scalability of consumers by grouping them. And the grouping is enabled via a config value: you put in a group id so that you know which consumers are joining which consumer group.
So let's take a look at what this looks like in practice. On the left hand side here, we have our topic, with our different partitions and some of our records, and you can see the offset on them there. And on the right hand side, we have two consumer groups. Consumer group A has three consumers within it, and consumer group B has two consumers within it. So let's take a look at
how they're going to be linked up to the different partitions. Because you've got three consumers in consumer group A and three partitions, each consumer will be assigned a partition. That means that they're only reading from one partition, and that's important, because that's where the ordering guarantee comes in. Now, with consumer group B there's only two consumers, so one of those consumers will read from two partitions; you can see the bottom one is reading from partition one and partition two. What this means is that, by only reading from one partition, we can guarantee that consumers are not going to be reading duplicate messages. And from an ordering standpoint, Kafka provides us an ordering guarantee per partition. So it means that the first consumer, for example, will get the messages from its partition in the correct order. So we can ensure that that ordering guarantee remains.
What happens if we want to scale up the number of consumers, though? Here's an example: if we were to add a consumer to consumer group A, it would actually just sit there idle, and that's because there's no partition for it to read from. We've already got three consumers and three partitions matched up, and we can't assign the same partition to an additional consumer within the same consumer group, because we'd lose the guarantee that we're not duplicating records and we'd also lose that ordering guarantee.
So this consumer would just sit there idle. Now, it might be useful, for example, if you wanted a spare in case one of the other consumers went down; it could just pick up where the other one left off and be assigned that partition. But right now it's essentially just a waste of resource, because it's sitting there doing pretty much nothing. However, if you were to add it to consumer group B, it would be able to be assigned a partition, because there's a spare partition there, essentially, with one consumer currently consuming from two partitions.
So this is where you need to be careful when you're setting up your application.
You need to make sure you're thinking about how many partitions you're going to need for each topic, based on how many consumers you're potentially going to be scaling up to in your application. Now, you can add new partitions once your
system is already up and running, but an important thing to consider is that the
more partitions you have, the more load you're putting on a system when a leader
election happens, for example, and the ordering is only guaranteed
while the number of partitions remains the same. So as soon as you start
adding partitions, you lose that ordering guarantee once again. So when you're setting
up your system, be sure to think about the number of partitions you're going to
need before you set it up. So that's how we can enable scalability within producers
and consumers in our application. So we've looked at the different behaviors and how
we enable that using Kafka. We've looked at the different layers within our application
and where we need to be introducing these reactive behaviors for an end
to end non blocking reacting application or reacting system.
But how do we actually go about writing reactive Kafka applications?
What tools and technologies can we utilize? So there's obviously the standard
Java Kafka producer and consumer clients that you can take a
look at, but these aren't really designed or optimized for reactive systems. So instead, what we'd really suggest is taking a look at some of the open source reactive frameworks and toolkits that are available. These help to provide advantages like simplified Kafka APIs that are reactive, built-in back pressure (as we mentioned in the reactive architecture patterns), and the enabling of asynchronous per record processing.
Examples of these open source frameworks specifically for reactive Kafka interactions include, but are not limited to, Alpakka, MicroProfile, and Vert.x. These are the ones we're going to be taking a look at today. There are others, like Project Reactor, which you can definitely take a look at; we're just not going to be going into it in this presentation. So let's take a look at these and see what the differences are. So the Alpakka Kafka
connector is essentially a connector that allows consuming from and producing to Kafka with something called Akka Streams; it's part of the Akka framework, or toolkit. Akka, and the other libraries that interact with it, is actually based on something called the actor model, which is slightly different to something like a microservice based application. In the actor model, the actor is the primitive unit of computation. It's essentially the thing that receives a message and does some kind of computation based on it. Messages are sent asynchronously between actors and stored in a sort of mailbox, and that's how they communicate. It's a different process; it's not a one to one mapping between an actor and a microservice. So it can take a bit of a paradigm shift, a shift in your way of thinking, if you're going from a microservice based model to the actor model. That's something to bear in mind if you're considering Alpakka, the actor based model, and Akka. However, if you're already utilizing the actor based model for your application, it's definitely worth taking a look at the Akka framework, and at Alpakka specifically for its connection to Kafka. The next framework we're going to take a look at is Eclipse MicroProfile.
This is really an open source, community driven specification for enterprise Java microservices. It works with Java EE and Jakarta EE, and it's really built by the community: a huge range of individuals, organizations, and vendors contribute to the project. Within it, there are several different APIs that are offered to enable greater ease of building microservice based applications ready for cloud native, through the use of these APIs on top of something like Java EE or Jakarta EE. The bottom triangle here, the dark gray ones at the bottom in that triangle shape, are the standard APIs that come as part of the MicroProfile stack. The ones in the right hand corner here, the little sideways L shape, are the standalone projects that the MicroProfile community also works on. The hope is that these standalone projects will eventually become part of the standard stack, but right now they're separate projects that the same community works on, and they integrate together. So the one that we're actually interested in for reactive in this case is
the Reactive Messaging specification. The Reactive Messaging specification makes use of, and interoperates with, two other specifications. One is Reactive Streams Operators, which is another standalone project; it provides a basic set of operators to link different reactive components together and provide processing on the data that passes between them. And it interoperates with the Reactive Streams specification that we mentioned earlier. That one is not a MicroProfile specification; it's the community driven specification I mentioned in the reactive programming part of this presentation. So how
does MicroProfile Reactive Messaging actually work? Well, it works by providing annotations for you to utilize on your application's bean methods: there are Incoming and Outgoing annotations that you can use. These annotated methods are connected together by something called channels. Channels are essentially just opaque strings. If you're connecting internally within your application, it's just called a channel; if you're connecting to, say, an external messaging broker, that channel changes its name to a connector. Now, because MicroProfile is a specification, you'll need to look at the implementations to understand what connectors they offer, because each implementation will offer different types of connectors. So here you can see that method B, with its incoming annotation for order, is connected via that order channel to method A, with its outgoing annotation for order. So that's how they're connected, and that's how it enables this reactive behavior.
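To make that concrete, here's a minimal sketch of two bean methods wired together by an order channel. The payload type and method bodies are just illustrative; the RxJava Flowable works here because it implements the Reactive Streams Publisher interface:

```java
import io.reactivex.Flowable;
import java.util.concurrent.TimeUnit;
import javax.enterprise.context.ApplicationScoped;
import org.eclipse.microprofile.reactive.messaging.Incoming;
import org.eclipse.microprofile.reactive.messaging.Outgoing;

@ApplicationScoped
public class OrderProcessing {

    // method A: publishes a stream of payloads onto the "order" channel
    @Outgoing("order")
    public Flowable<String> methodA() {
        return Flowable.interval(1, TimeUnit.SECONDS)
                       .map(i -> "order-" + i);
    }

    // method B: invoked for each payload arriving on the "order" channel
    @Incoming("order")
    public void methodB(String order) {
        System.out.println("Processing " + order);
    }
}
```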
So let's take a look at the final framework, or rather toolkit, we're going to cover, and that's the Eclipse Vert.x project. The others, especially MicroProfile, are Java specific, whereas this is a polyglot toolkit: it can be used with Java, JavaScript, Scala, Kotlin, and many other languages. It's based on something called the reactor pattern, which is essentially a really event driven style of architecture. It uses a single threaded event loop which is non blocking on resources, emitting events and dispatching them to corresponding handlers and callbacks. So it's really event driven, it's non blocking, it runs on the JVM, it includes a distributed event bus, and it's single threaded. And actually
we utilized Vert.x with Kafka in one of our demo applications, when we were converting it from a standard Java Kafka application to a reactive Java Kafka application.
So let's take a look at that demo application. So this isn't necessarily
what you'd want to do in enterprise, but this application was designed
just to be able to help people test their Kafka integration.
So have they set up Kafka correctly? Are they able to produce and consume
from Kafka? Are they able to utilize it to its full advantage
in their application? So in this we have an application that
we actually converted to use Vert.x; it produces to a Kafka topic and then consumes back from that Kafka topic. We also have a front end application where we can put in custom data for our messages and custom topics, and we can start and stop producing and start and stop consuming, et cetera. This application is all open source, so if you'd like to utilize it to test your own application and your own Kafka integration, feel free. The front end is all open source as well.
Let's take a look at how this application works. So if we press play, hopefully we can see here that we can put in our custom message. We can then click that start producing button, and then we can click the start consuming button on the right hand side to see those records that have been sent to Kafka coming back and being consumed from Kafka. So we can see that here, but we can also stop consuming and stop producing again. So in this, as we said, we're sending these start and stop commands from the WebSocket, and we really needed a way to be able to start and stop that within Kafka. When we weren't using reactive frameworks, it became a bit of a threading nightmare, because we had to switch over threads depending on what we were doing, we had to instigate this pause and resume functionality, and it was quite complicated and difficult to achieve.
Actually, when we switched to using Vert.x, we found all sorts of advantages.
So for the producing side of things: in the original version of our application, we had this start command that was sent to the back end from the WebSocket. The back end then started the producer, which sent a record every 2 seconds. Then we'd send the stop command from the WebSocket to the back end, and the back end would stop the producer and no records would be produced. In this application, the standard producer client's call to produce a record was essentially a blocking call. But switching over to the Vert.x producer meant that we could now get a future back from the send function. So it was a fairly small change, but it enabled us to switch from a blocking call to a more asynchronous style of call. It meant we were able to asynchronously produce new records while waiting for the acknowledgement from Kafka of the in-flight records. So it's a fairly small code change, as you can see here, but changing from blocking to asynchronous is really important in reactive systems.
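Here's a rough sketch of what that change looks like with the Vert.x Kafka client, assuming Vert.x 4, where send returns a Future; the config values and topic are placeholders:

```java
import io.vertx.core.Vertx;
import io.vertx.kafka.client.producer.KafkaProducer;
import io.vertx.kafka.client.producer.KafkaProducerRecord;
import java.util.HashMap;
import java.util.Map;

public class VertxProducerExample {
    public static void main(String[] args) {
        Vertx vertx = Vertx.vertx();

        Map<String, String> config = new HashMap<>();
        config.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        config.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        config.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        config.put("acks", "1");

        KafkaProducer<String, String> producer = KafkaProducer.create(vertx, config);

        KafkaProducerRecord<String, String> record =
                KafkaProducerRecord.create("demo", "key", "value");

        // send() returns immediately with a Future, so we are free to carry on
        // producing while the acknowledgement for this record is in flight
        producer.send(record)
                .onSuccess(meta -> System.out.println("Acked at offset " + meta.getOffset()))
                .onFailure(Throwable::printStackTrace);
    }
}
```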
But let's take a look at consuming, because this is where we saw the greatest difference. In the original version of the application, we were having to use a for loop inside a while loop that resulted in this sort of flow: we were polling for records, that poll function returned a batch of records, and then we had to iterate through each record in the batch and, for each record, send the record along the WebSocket to the front end. Only after that entire iteration through all the records was complete would we be able to go and fetch new records. So it was quite blocking, because we couldn't do anything whilst those records were being processed. And it didn't feel very asynchronous, because we were grabbing batches of records rather than responding each time a new record came in. Instead, when we switched to the Vert.x Kafka client, we were able to use this concept of a handler function. Every time a new record arrived, this handler function was called. It made our flow a lot simpler, and we were able to hand off the step of polling for new records to Vert.x. So now we just receive a new record and then send that record along the WebSocket to the front end for processing. This essentially allowed us to process records on a per record basis rather than in batches, and it left us free to focus on the processing of records rather than the work of consuming them from Kafka.
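And here's a sketch of the handler-based consuming flow with the Vert.x Kafka client, again assuming Vert.x 4; the WebSocket step is reduced to a simple print here:

```java
import io.vertx.core.Vertx;
import io.vertx.kafka.client.consumer.KafkaConsumer;
import java.util.HashMap;
import java.util.Map;

public class VertxConsumerExample {
    public static void main(String[] args) {
        Vertx vertx = Vertx.vertx();

        Map<String, String> config = new HashMap<>();
        config.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        config.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        config.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        config.put("group.id", "demo-group");

        KafkaConsumer<String, String> consumer = KafkaConsumer.create(vertx, config);

        // Vert.x does the polling for us; the handler fires once per record,
        // so we process per record instead of iterating over batches
        consumer.handler(record ->
                System.out.println("Received: " + record.value())); // stand-in for the WebSocket send

        consumer.subscribe("demo");
    }
}
```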
If you want to find out about all of the benefits we experienced by switching from a non reactive to a reactive Java Kafka application, we've written up a blog that you can access on the IBM Developer site. Feel free to take a look, and hopefully it will show you some of the benefits you might be able to experience by switching to some of these open source reactive frameworks and toolkits. So hopefully, in summary, what I've shown you here is that taking a non reactive application and sticking Kafka in, or sticking some event driven architectural tool in, doesn't magically give you an asynchronous, non blocking, reactive application.
We need to be carefully considering the Kafka configuration that we use
to create the most reactive system possible. And we can utilize these open source toolkits and frameworks, which can provide additional benefits as well, to create the reactive characteristics and behaviors expected and needed for an asynchronous, end to end reactive application. But the open
source reactive community is on hand. As I said, there are many open source frameworks
and toolkits you can utilize depending on the architectural style of your application.
So I'm going to demonstrate just how easy it is to get started by
utilizing some of these reactive frameworks for Java Kafka based applications. So I've gone on to our interactive online environment here, where you can utilize some of our modules to get started with this. And I'm looking at module one here, Reactive Java Microservices, where we'll be utilizing MicroProfile Reactive Messaging to write reactive Java microservices that interact with Kafka. So I've already got the tab open
here. When you open it up, it will ask you to log in via a social account, and that could be any social media, really. I've already logged in. And when you log in, you get this online environment. On the left hand side here I've got the instructions; on the right hand side I've got an IDE. And if I go to Terminal, New Terminal, I have a new terminal turn up in the bottom right hand corner.
Now these labs will be available after this session, so feel free to
go through them at your own pace and go through the labs I don't
cover in this session today. So let's get started.
So the great thing about all of these different labs is that they actually utilize the same application. Here's a basic architectural diagram of what that looks like. We've got a system microservice and an inventory microservice, and they're communicating via Kafka in the middle there. The system microservice calculates and publishes an event that contains its average system load every 15 seconds, and it publishes that to Kafka. The inventory microservice then consumes that information and keeps an updated list of all the systems and their current system loads. We can access that list via the systems REST endpoint of the inventory microservice. So you can see here we're utilizing MicroProfile Reactive Messaging in this section of the application, and we're utilizing RESTful APIs to access that endpoint. So without further ado, let's get started.
So let's just make sure I'm in the right directory here, just in case. Great. Then we can do a git clone of our repository, and then we can cd into the correct directory, which is guide-microprofile-reactive-messaging. We're actually utilizing the guides here from the Open Liberty website. If you want to take a look at what the finished application should look like after we complete this lab, this module, you can check out the finish directory. But we're going to head into the start directory for this lab, because that's essentially where we do all the work, and by the end of it we should get to the same state as the finish directory. It's the same for all of our labs. So we're going to be utilizing these touch commands to make it a bit
easier to create our file system. And then if I head to the explorer
I should be able to see this U, which represents where I created that new file. So if we follow that along, it should open up for us and we should be able to see that file if I move this over. So there's the SystemService class that we're creating first. This is a class for the system microservice, which is the one that's producing the average system load every 15 seconds, so the CPU usage for the last minute. If we open that up, we should see an empty file, and then we can insert the code we've got available here for you. I'm just going to close the explorer so it's easier to see. In this environment you do need to save; there isn't an auto save, so make sure you save, because otherwise your application won't be able to build. You should see that dot disappear once it saves.
So because this is the system microservice, we're utilizing the MicroProfile Reactive Messaging Outgoing annotation, because we're producing to Kafka, and we're producing to the Kafka topic for system load. As you can see here, we're actually making use in this application of one of the reactive programming libraries, RxJava, which you can see we're importing here, as well as the MicroProfile Reactive Messaging specification. With this RxJava, we're utilizing a method called Flowable.interval, which you can see at the bottom here. This allows us to set the frequency of how often the system service publishes its calculation to the event stream. We're utilizing Flowable because it enables that inbuilt back pressure. The previous iteration of RxJava included Observable, and they introduced Flowable so that they could have that inbuilt back pressure as an option. So now you have a choice between Flowable and Observable, and we're utilizing Flowable in this application.
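The heart of that class looks something like the following sketch, assuming RxJava 2's Flowable and using the JVM's own system load reading; this is an approximation of the guide's code rather than a copy of it:

```java
import io.reactivex.Flowable;
import java.lang.management.ManagementFactory;
import java.util.concurrent.TimeUnit;
import org.eclipse.microprofile.reactive.messaging.Outgoing;
import org.reactivestreams.Publisher;

public class SystemService {

    // publish a system load reading onto the systemLoad channel every 15 seconds;
    // Flowable (unlike Observable) supports back pressure
    @Outgoing("systemLoad")
    public Publisher<Double> sendSystemLoad() {
        return Flowable.interval(15, TimeUnit.SECONDS)
                       .map(tick -> ManagementFactory.getOperatingSystemMXBean()
                                                     .getSystemLoadAverage());
    }
}
```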
So you can see here a basic architectural diagram of what we've just created in this code. Let's continue on and look at the inventory microservice.
So for the inventory microservice, we're just going to use this touch command again, and we're creating the InventoryResource class. Again, if we head to our explorer, close system, and head to inventory, we can see this U directs us on where to go, which is really helpful. Otherwise, you can take a look here in the instructions and it will show you where to go to find this new file that you've created. So let's open that up. And once we open it up, we can then input all the code that we've got down here. Again, you can utilize the copy buttons in the bottom right hand corner and just paste it into the IDE; that way is a bit easier than highlighting everything and copying it. Because this is the inventory microservice, we're utilizing the Incoming annotation, because we're consuming from Kafka, but we're still consuming from the same system load channel, hence why we've got the system load there. And you can see a basic architectural diagram, on the bottom left, of what this should look like when it's all connected together.
So now that we've created these two different classes, we need to create the MicroProfile config properties file for each microservice. Again, we can utilize the touch command to create the file that we need.
So this time we're going to system first, because that's the one we're looking at first. If we head to system, and instead of going into java here, we go into resources, there's the microprofile-config.properties file. So again, we open that up and then copy and paste the configuration we've got here in the instructions. And again, remember to save. I'll close my explorer for this. Here you can see that because we're utilizing the Outgoing annotation, you can see outgoing in our configuration. And I mentioned before how, if it's an external channel that we're trying to connect through, so we're connecting to an external messaging broker, it's actually called a connector, and each implementation's connectors are different. Because we're utilizing the Open Liberty implementation of MicroProfile, here we're utilizing the liberty-kafka connector. You can also see here where we specify which Kafka topic we want to connect to, in this case system load, and we've got a serializer here so we can convert our object into JSON for Kafka.
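The configuration follows the mp.messaging naming scheme, and looks roughly like this; the channel, topic, and serializer names here are illustrative rather than the guide's exact values:

```properties
# broker address for the liberty-kafka connector (placeholder)
mp.messaging.connector.liberty-kafka.bootstrap.servers=localhost:9093
# the outgoing channel "systemLoad" uses the liberty-kafka connector...
mp.messaging.outgoing.systemLoad.connector=liberty-kafka
# ...writes to this topic...
mp.messaging.outgoing.systemLoad.topic=system.load
# ...and serializes the payload to JSON with a hypothetical serializer class
mp.messaging.outgoing.systemLoad.value.serializer=io.example.SystemLoadSerializer
```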
same configuration file, but this time for the inventory microservice.
So if we use that touch command again, head to our explorer and this time
I'm going to close down system and open up inventory again. Make sure you're
not going into Java, we need to be going into resources instead.
And there's our microprofile configuration file. So if we open that
up and then we can copy and paste in our different configuration.
So for this one it's a little bit different: because we're consuming from Kafka, we have slightly different configuration. We're using the incoming annotation, so we've got incoming in the configuration here. We're still utilizing the same Liberty Kafka connector, because we're still connecting to Kafka using the Open Liberty implementation of MicroProfile, and we're still connecting to the same topic within Kafka. This time we've got a deserializer, because we want to turn the record from JSON back into an object. But we've also got an extra configuration value here that we didn't have in the previous configuration properties file: a group id. And this is what I was referring to when we were talking about Kafka having this idea of consumer groups. Because this is a consumer from Kafka, we need to assign it a group id so that if we were to spin up any other consumers, they would join the same consumer group, we wouldn't duplicate processing or reread the same record, and we would keep that ordering guarantee. So that's why we have that extra configuration value in this particular configuration for the inventory microservice; a sketch of it follows below.
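Roughly like this, again with illustrative broker, topic, deserializer, and group id values:

    mp.messaging.connector.liberty-kafka.bootstrap.servers=localhost:9093

    # The "systemLoad" channel consumes from the same Kafka topic
    mp.messaging.incoming.systemLoad.connector=liberty-kafka
    mp.messaging.incoming.systemLoad.topic=system.load
    mp.messaging.incoming.systemLoad.value.deserializer=io.openliberty.guides.models.SystemLoad$SystemLoadDeserializer
    # Consumer group id: any extra consumers joining this group share the
    # partitions rather than rereading the same records
    mp.messaging.incoming.systemLoad.group.id=system-load-status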
So the next step is to create a server.xml file. Again, we can just create this using the touch command
we've already got here. Head to our explorer. We're creating
this for the server microservice, sorry, for the system microservice
we've already created the same server configuration file
for the inventory, which is why you won't be doing it during the steps of
this guide. You can go and check that out if you would like to.
This time we're heading into system and then into liberty. And here you can see
the server XML file that we need. So if we open
that up and then I can copy and paste this code in and
you can see here the different features that we make use of. So you can see we're making use of the MicroProfile Reactive Messaging specification, but we're also making use of several other APIs that are offered as standards as part of the MicroProfile stack, things like MicroProfile Config. The configuration files we just made are actually external to our application logic, which is fantastic when you're trying to make cloud native applications, and we're making use of MicroProfile Config for that. We're also making use of things like MicroProfile Health, JSON-B and CDI; the feature list looks something like the sketch below.
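A minimal sketch of the featureManager section; the exact feature versions here are an assumption and may differ in the lab:

    <server description="System service">
        <featureManager>
            <feature>mpReactiveMessaging-1.0</feature>
            <feature>mpConfig-1.4</feature>
            <feature>mpHealth-2.2</feature>
            <feature>jsonb-1.0</feature>
            <feature>cdi-2.0</feature>
        </featureManager>
    </server>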
Again, all of these are part of that standard microprofile stack and you can go
and check them out if you want to do some additional work on understanding
those different APIs. We have guides on the open Liberty website that
you can take a look at for each of those. So as I
said, we've already created the same for the inventory microservice because it's exactly the
same file. So we're not going to bother doing that here. So now let's go ahead and create the Maven configuration file so we can actually build this project, again utilizing the touch command. This time I'm going to close down src, and I should see underneath system that pom.xml file I've just created. So again, if we open that up, I can copy and paste all of the configuration I need into this file from the instructions. And then remember to save. I'll just close my explorer so you can see it better.
So in this you can see all of the different dependencies that we make use of. For example, there's that RxJava dependency, the reactive programming library; we're also making use of Apache Kafka; and here's that MicroProfile Reactive Messaging specification and the standard MicroProfile stack as well. And we're also making use of Jakarta EE. So it's just interesting for you to see some of the dependencies that we're making use of in this project, sketched below.
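For reference, the core of the dependency section can look something like this sketch; the version numbers are assumptions and may differ in the lab:

    <dependencies>
        <!-- MicroProfile Reactive Messaging specification -->
        <dependency>
            <groupId>org.eclipse.microprofile.reactive.messaging</groupId>
            <artifactId>microprofile-reactive-messaging-api</artifactId>
            <version>1.0</version>
            <scope>provided</scope>
        </dependency>
        <!-- RxJava, the reactive programming library -->
        <dependency>
            <groupId>io.reactivex.rxjava3</groupId>
            <artifactId>rxjava</artifactId>
            <version>3.0.4</version>
        </dependency>
        <!-- Apache Kafka client -->
        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka-clients</artifactId>
            <version>2.8.0</version>
        </dependency>
    </dependencies>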
So let's go ahead and start creating this application. Let's run the Maven install and package goals so that we can build it. One mistake I've made there: make sure that you are in the start directory. You don't need to be in the start directory for any of the touch commands, because they automatically make sure they're in the right place, but now that we're building the application, we do need to be in the start directory. So now let's try and run those commands again. Now it's looking a bit better. Fantastic. And then we need to do a Maven package; the commands are sketched below.
So this docker pull Open Liberty command is just to make sure we've got the latest Open Liberty Docker image in our sort of shared online environment here. Because we're sharing it, we just want to make sure we've got the right edition and no one has messed with it, so it shouldn't take too long.
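Assuming the official Open Liberty image on Docker Hub, the command is roughly:

    docker pull open-liberty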
And the great thing about this environment is that if you're trying multiple labs, it actually saves your progress. That's why we have cleanup scripts at the end, so that different directories and different Docker images aren't messing up different labs. But it means that if you are doing this docker pull, the image will essentially stay there for any of the other labs you do, so you don't have to do this step for the other labs. It shouldn't take too long in the first place.
Anyway, there we go. We're complete. So now we can go
ahead and build our two microservices. So first we're going to build the system microservice
and then we're going to build the inventory microservice. Shouldn't take too long because they're
not very big. And then we're going to utilize this start containers script
that we've got which essentially starts up the different docker containers
that we need for this application. So we'll start up Docker containers for Kafka, ZooKeeper and the microservices in this project. The reason we're still using ZooKeeper is because, right now, the version of Kafka we're using needs ZooKeeper for metadata. The plan for future versions of Kafka is to enable that metadata to be stored in a topic within Kafka itself, so you won't necessarily need ZooKeeper in the future, but right now, unfortunately, we still need it. So that's why we're having to spin that up as well. So whilst we're waiting for this application to get up and running, the next step is to run a curl command so that we can ensure that our system microservice is successfully accessing and calculating that average system load and publishing it to Kafka, and that our inventory microservice is successfully consuming it from Kafka and updating our list of systems and their average system loads, which we can then access via that REST endpoint. There we are.
So, giving it a bit more time, we can see here that now we're able to access the list via the REST endpoint. We can see the hostname, so the name of the system, and the average system load. If you were to wait a little bit longer, you could then use the same curl command and you should see a change in that system load, because it's all being recalculated every 15 seconds. So you should see some sort of change; you can see here we've changed from 1.1719 to 1.33. The call and its output look something like the sketch below.
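For example, with an illustrative port, endpoint path, and hostname (the load value is the one from the demo):

    curl http://localhost:9085/inventory/systems

    [
      {
        "hostname": "system-service",
        "systemLoad": 1.33
      }
    ]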
So hopefully that should show you that it can actually be fairly easy, utilizing a reactive framework or toolkit, to create really reactive, responsive, Kafka-based Java applications. If you'd like a very easy read, this really short ebook that I helped author is available for free online, and it's essentially a summary of everything I've explained in this presentation around reactive: what reactive systems are, how you utilize them, how to go about building them, and why you'd want to. Feel free to download it, share it amongst your colleagues, or utilize it as a reference point for the different aspects of reactive I've covered here in this presentation. As if I haven't given you enough links
throughout this presentation, I've also got a bunch of resources for you to get started with. If you have any additional questions that I'm not able to answer here, please feel free to reach out to me on Twitter. My Twitter handle is @GraceJansen27, and I'd be happy to answer your questions.
Thanks very much for listening. Have a great day.