Transcript
Hello and welcome to my session, Reacting to an Event-Driven World, as part of Conf42. My name is Grace Janssen and I'm a developer advocate working at IBM, based in the UK. My Twitter handle is there on the slide if you'd like to follow me. I primarily focus most of my time in the Java and JVM ecosystem, mostly looking at things like reactive technologies and cloud native infrastructure and technologies. Hopefully in this session I'm going to be showing you how we can build, architect, and design our applications to be reactive to this event driven world that we've created. So let's get started. But obviously no
conference session is complete without a coffee demo. So we've got a coffee demo here
for you today. Kind of ironic, because I don't like coffee, but anyway,
so this coffee example is actually available as an open source
project. So if you'd like to check it out, the link is there on the slide. It was made by our cousins over at Red Hat. So in this coffee
example, we have coffee lovers coming in and making coffee
orders via HTTP requests to a coffee shop. Then the coffee shop sends that order to the baristas, via HTTP again, and the baristas can make the coffee. Now, the issue with this is that when the coffee lovers make this request, it's a blocking request by nature. The coffee lovers essentially have to wait at the till until the barista has completed their order before they can sit down with their coffee, and they can't place the next order until the barista has completed that first order. So it's a very blocking process in this regard.
But what people often suggest when they're looking at applications like this
to make it slightly more asynchronous and more non blocking, is to
introduce something called event driven architecture. Event driven architecture, often shortened to EDA, is a very popular architectural style
that really enables events to be placed at the heart of our systems and
our applications. When I talk about events, an event here is
essentially a record of something that's changed or something that's happened within
the system. So it's usually sort of a state change within
our system. And these events are immutable, so they cannot be changed, but a new event can be created to represent a new state change to that same entity or object. They're also ordered in the sequence of their creation. And here's a basic architectural diagram of what this represents.
So we've got that event driven messaging backbone. So this is that immutable log
of events placed at the heart of our system. And then we've got microservices
consuming and producing to this event log. So you
can see microservice one, there is publishing events to that event log,
microservices two and three both publish to and consume from that event log,
and service four is just consuming. So event driven architecture is just
about having sort of loosely coupled microservices that are able
to exchange information through this production and consumption
of events. So let's see what it looks like when we put it into our
barista example. When we put it in, as you can see here, we're still using
HTTP to communicate between the coffee lovers and the coffee shop. But now we've got
an event backbone between the coffee shop and the barista and a
new component called the board. So in this case, when our coffee order is made
by our coffee lovers, it goes to the coffee shop and gets added onto the event backbone, onto that queue or that topic. That means that the barista can read from that topic as and when they're ready, picking up the next record, the next coffee that needs to be made.
And then once the barista has made those coffees, they can then update the
board to basically announce when that coffee has been produced.
And the coffee lovers will then know to go and collect their order. What this
means is that the coffee lovers no longer have to wait at the till for each of their orders to be processed. Instead, they can leave the till knowing that their order has been placed and that it will eventually be made, so their request will eventually be satisfied. They can go and sit down, talk to their friends, probably discuss Java as we all love to do, and then once they see the board has updated, they can go and collect their coffee. So it's a much more asynchronous process here. We're trying to get rid of some of that blocking component from our original application by introducing this style of architecture. A tool that's often used here is Apache
Kafka. So if you've not come across Apache Kafka, it's an open source project
that's all about providing a distributed streaming platform. It enables things like stream history, that immutable data we were talking about for the event log in the center of the diagram. It enables really high availability of the information within your application
and it is extremely scalable. So if we were to introduce,
say, Kafka in order to create that event driven backbone in
our barista application, what does this really mean? Does this mean
that our coffee shop is now non blocking and highly responsive?
In other words, is our microservice system non blocking and highly responsive?
And people often assume that the answer to this question is: well, yes, I'm using Kafka, I've got this event driven architecture, so I must be highly responsive, scalable, and non blocking, because I've got this asynchronicity, this decoupling. But that's not the whole answer. There is more to this
application if we want it to really be non blocking and highly responsive
from end to end, than just shoving in tools like Kafka,
or architectural approaches like event driven architecture, and expecting it
to magically be non blocking and highly responsive all the way through.
Kafka is a great tool, but it isn't enough just to take a good tool and shove it in. We need to be using it in the right ways, and we
need to be building our applications in the right manner to really
enable non blocking behavior all the way through. So this is where the concept of reactive systems comes in. Reactive systems are really about looking at that high level approach, that system
level view of our application, to ensure that every
stage of our application is asynchronous and nonblocking,
and highly responsive and reactive. So this all stemmed from
a manifesto that was created back in 2013 by
a group of software engineers who really wanted to sort of lay
out the key characteristics and behaviors we need to be expressing
in our applications in order for them to be responsive
and reactive for our end users. And it's sort of based around
four key behaviors or characteristics. So the underlying
behavior in this manifesto is having this message driven
form of communication. Now, this is all about decoupling
the different components of your applications so that a potential failure in one
wouldn't cause a potential failure across the application and mean that your application is
unresponsive. It also enables us to have that asynchronicity
between components because we've decoupled them. So this message driven form of
communication really allows us to achieve this asynchronicity. But interestingly,
in the original manifesto, it was actually event driven.
And you can check out why they switched from the Reactive Manifesto 1.0 to the Reactive Manifesto 2.0, from event driven to message driven; there's an article online, you can just go and google it if you'd like to
find out more information about that. But what's the difference between the two? Because often, when we think of things like event driven architecture, the clue's kind of in the name: we think of it being based around events and being event driven. So what's the difference? Well,
we've got a definition here on the left for messages and a definition here on
the right for events. So we've defined a message as an item of data sent
to a specific location, whereas an event is more of a signal
emitted by a component upon reaching a given state. So less
location specific there. But interestingly, when you look at this, you can have a sort of mesh of the two, where a message contains an encoded event in its payload. So they're not necessarily distinct types of communication method. But when
it comes to Kafka, it doesn't really matter. Although Kafka was traditionally seen as event driven, the project actually doesn't reference events anywhere in its description of what it enables. Instead, it references something called records: it enables you to publish and subscribe to streams of records, store records in a durable way, and process streams of records as they occur. So they've made this deliberate move away from the word events and instead use records, because Kafka can be used for both event driven and message driven communication, or a hybrid between the
two. So when it comes to our Reactive Manifesto and enabling that message driven, asynchronous backbone of communication for our application, Kafka can be a really great tool to use. But there's a reason there are additional characteristics listed in the manifesto that need to be expressed by our applications in order for us to be reactive. So let's
take a look at our barista example and see how we might run into problems if we just shove in an event backbone and expect it to be magically nonblocking and asynchronous all the way through. So here's our original barista
example. But in this example, we've only got three coffee lovers.
Now, there are more than three coffee lovers in the world. So what happens when
we get an influx of coffee lovers coming into our application or to our
coffee shop trying to place coffee orders? So essentially what we've done here is
we've created a massive increase in load on our system. Now,
more coffee lovers obviously means more coffee orders. These coffee lovers aren't necessarily blocked, because once they make their order it's placed onto the backbone and they can go and sit down. But it does mean we've got this huge backlog of coffee orders on our event backbone, and unfortunately we've only got one barista to serve them, to make those coffees and complete those requests. So in this case we have a potential point of contention, a potential failure. Even if our barista doesn't go down or fail, or get stuck in a process, our barista is going to be slow at getting through all of those coffee orders, so our app isn't going to be as responsive as we would like.
Our coffee lovers are going to be waiting a while for their coffee. And this is where the next behavior in the Reactive Manifesto comes in: elasticity.
So, being able to scale both up and down the resources within
our applications, so that we can gracefully deal with load fluctuations
to our application. Again, this behavior changed between the Reactive Manifesto 1.0 and 2.0: it was originally scalable, and it got changed to elastic, because they realized it isn't just about scaling up resources. That's great when you've got increased load, but when you don't have that load on your system, it's really important that we're able to appropriately scale our application's resources back down so that we're as cost effective as possible.
So, this is where elasticity comes in. So, in our barista example,
if we were able to elastically scale up the number of barista microservices,
we could provide a much more responsive, much more reactive application for our coffee lovers, because we could get through those coffee orders much quicker and gracefully deal with that load without the barista becoming a potential point
of failure or contention within our application. Another example here,
in this barista example, we've added an additional component. So this additional
component is, we're calling it the coffee serving table. So, this is
representative of, say, a downstream microservice that's
perhaps needed for additional processing, or perhaps an external service that you're utilizing,
like an external database. So, in this case, for the barista to
be able to update the board and produce and serve the coffee,
they have to place the coffee on the serving table for the coffee lover
to come and collect. Now, what happens in the potential scenario where our
coffee lovers are being a little bit lazy, they're not coming to collect their coffee
orders very quickly. And so these coffee orders are building up on the table.
The serving table is now full, and the barista is juggling coffee
because they can't put it down on the serving table. That's essentially representing
that third party or downstream component going offline,
perhaps failing, or potentially getting stuck in a process. And that
prevents the barista from being able to produce any more coffees, because they're stuck with the coffees they're trying to offload to that downstream component. So in this case, this could potentially become an unresponsive app, because we're no longer able to get through any more coffee orders and we don't have resiliency built in. And that's the next characteristic:
being able to be resilient in the face of any potential failure that could occur
within your application. So if we were to introduce some resilient behaviors into our application, we could introduce, say, a circuit breaker, or perhaps even things like back pressure communication. So basically,
letting the application know that one of those downstream components is failing,
or is stuck with a load of some kind or a process. And being able
to build in resilient characteristics, like perhaps spinning up another barista
and serving table instance or replica, so that we could redirect requests
to them, or share the load across them, or perhaps rate limit the
number of orders coming in. All of these behaviors would help to prevent our application
from potentially failing and becoming unresponsive. And this leads to the last characteristic.
By enabling elasticity, resiliency, and that message driven,
asynchronous form of communication, we can enable the last characteristic of the
manifesto, which is being responsive. We need our applications to
be as responsive as possible to any state changes or events that are occurring
within our application. And by implementing these characteristics, we can essentially
achieve that reactive, non blocking behavior and characteristic we need for the event driven world that we now live
in. So, how do we go about actually building these systems, building these
types of applications? Everything I've mentioned so far has been fairly high
level, fairly based around sort of characteristics and behaviors of
our application. Let's talk about the practical side: how do we achieve this? So we've got
here a fairly basic application, just made up of three microservices,
just so we can go through the various different aspects and I guess layers
within it that we need to be considering. We've already looked at the message driven, asynchronous data layer and how Kafka can enable us to decouple the components within our application and provide that message driven, asynchronous backbone. So that's great for the data layer. But we also need to think about how our microservices are interacting together, whether that's through Kafka or not. And for this we can introduce reactive architecture design patterns. There are many of them; we've just listed four here. It's not an exhaustive list, these are just four that we've come into contact with fairly
regularly when looking at reactive applications. So we've got
here CQRS, which stands for Command Query Responsibility Segregation. That's essentially all about splitting out the read and the write APIs
so that we can have really high availability of the data within
our application. Then we've got circuit breaker. Circuit breaking in software is very similar to the concept in electrical engineering. An upstream component recognizes when a downstream component is under stress or load, or perhaps is failing, for example by identifying the same error message repeatedly coming out of its logs. That upstream component can then temporarily put a stop on the requests routed to that downstream component or microservice, and instead reroute those requests to an alternative replica microservice. Then, once the downstream component becomes healthy again, it can reinstate the route to it, which prevents it from becoming a potential point of contention or failure (there's a small code sketch of this after this list of patterns). Then we've got sagas. So sagas
are essentially a mechanism that takes what would have been a traditional transaction in a monolithic architecture and does it in a more distributed manner. We create multiple microtransactions that have fallback behavior to account for things potentially going wrong partway through. So it's more like a sequence of local transactions, where each local transaction updates data within a single service. And then we've
got backpressure. I mentioned backpressure earlier, as well as circuit breaker, as a potential solution for enabling that resiliency within our barista microservice application.
Backpressure is a form of feedback. So it's a feedback mechanism from
a downstream component to an upstream component to essentially help rate limit
so that that downstream component doesn't become overloaded or overworked
or stuck in a process and potentially become a point of contention or failure.
So it's communication and feedback so that we can rate limit the requests, messages, or events coming to that downstream component from the
upstream component. And many of these are utilized in reactive applications.
But as I said, you can go online and find many more reactive architecture design patterns that you can utilize when you're making sure that the communication between the components of your application is as asynchronous and nonblocking as possible.
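To make the circuit breaker pattern mentioned above a little more concrete, here's a minimal sketch using the MicroProfile Fault Tolerance annotations (MicroProfile comes up again later in this session). The class and method names here are hypothetical stand-ins for the barista example:

```java
import org.eclipse.microprofile.faulttolerance.CircuitBreaker;
import org.eclipse.microprofile.faulttolerance.Fallback;
import javax.enterprise.context.ApplicationScoped;

@ApplicationScoped
public class BaristaClient {

    // Open the circuit if 2 of the last 4 calls fail, and keep it open
    // for 10 seconds before letting a trial request through again.
    @CircuitBreaker(requestVolumeThreshold = 4, failureRatio = 0.5, delay = 10000)
    @Fallback(fallbackMethod = "orderFromBackupBarista")
    public String placeOrder(String order) {
        return downstreamBarista(order); // downstream call that may be failing
    }

    // Rerouted request: same signature as placeOrder, as @Fallback requires
    public String orderFromBackupBarista(String order) {
        return "backup barista is making: " + order;
    }

    private String downstreamBarista(String order) {
        // hypothetical call to the struggling barista microservice
        throw new RuntimeException("barista unavailable");
    }
}
```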
But what about within our microservices? That's the next level we need to be thinking about. We also need to ensure that the logic we're writing within our microservices, the logic within our application, is also highly responsive, nonblocking, and reactive. And to do this we can utilize something called reactive
programming. So here's a basic definition of reactive programming: it's all about asynchronicity again. It's a paradigm where the availability of new information is what drives the logic forward, rather than it being driven just by a thread of execution. And we can enable this through several different programming patterns. As a Java developer, you're probably familiar with the concept of futures. If you're not, a future is essentially a promise to hold the result of some operation until that operation is complete.
So that's really focusing on asynchronicity. Then we've got reactive programming libraries. These are all about composing asynchronous and event based programs, and they include examples like RxJava and SmallRye Mutiny; actually, lots of the frameworks we'll be going on to later utilize these programming libraries within them. And then we have the Reactive Streams specification. This specification is really
a community driven effort to provide a standard for handling
asynchronous data streams in a non blocking manner while providing back pressure
to stream publishers. And again, many of the frameworks we'll look at in a minute
utilize this community driven open source specification.
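As a quick illustration of the futures idea, here's a minimal sketch using Java's built-in CompletableFuture; the brewCoffee method is a hypothetical stand-in for some slow operation:

```java
import java.util.concurrent.CompletableFuture;

public class FutureExample {
    public static void main(String[] args) {
        // supplyAsync runs brewCoffee on another thread and immediately
        // returns a future: a promise of the eventual result
        CompletableFuture<String> order =
                CompletableFuture.supplyAsync(() -> brewCoffee("latte"))
                                 .thenApply(coffee -> coffee + " is ready");

        // register a callback instead of blocking on get()
        order.thenAccept(System.out::println);

        order.join(); // only so this demo doesn't exit before completion
    }

    private static String brewCoffee(String type) {
        return "one " + type; // hypothetical slow operation
    }
}
```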
So we've looked at the different layers of our application where we need to make sure we have this reactive behavior: the data layer, the logic within the microservice layer, and the communication between the components of the microservice layer. But we also need to make sure the configuration we're using with tools like Kafka enables these reactive behaviors. So let's take a look at how we can really utilize
Kafka as best as possible for these reactive systems.
So as I said before, Kafka can be a really great tool for enabling this message driven, asynchronous form of communication within our application. But what about the other characteristics and behaviors within the manifesto? Let's look at resiliency first: how do we enable greater resiliency in Kafka? Well, Kafka is really beneficial in that it has this inbuilt resiliency within it. It's different from traditional message queuing systems because it provides
this stream history. So when a consumer loads a record from Kafka,
the record isn't removed from the topic. So that means that both that consumer
and any other consumers can reconsume that record at a later time if
they need to. And that's really useful for enabling consuming applications
to recover really well from potential failures. And that's
where this sort of immutable data comes in. It doesn't get deleted, and that stream
history remains, which means we have a really resilient form
of data retention and data persistence for our application. Kafka also has inbuilt resiliency with regard to how it works itself.
When you start up a Kafka cluster, it will contain a set of Kafka brokers, and a cluster usually has a minimum of about three brokers. And we'll see the reasons behind
that later. So Kafka is then broken down further into topics.
And these topics are where we store our records. So a topic is sort of
a logical grouping of a similar kind of message or record.
And then it's up to you to define which messages or which records
will go into which topic. So for example, with our barista example,
we might have one topic for coffee orders, and we might have another topic for
user information updates. But it's up to you to define what those topics are and
which messages and records are going into which. Within a topic, you'll have one or
more partitions. Now, partitions are distributed across the Kafka brokers.
As you can see here, we've got partition one on broker one, partition two on broker two, and partition three on broker three. When it comes to the brokers themselves,
we have something called leaders and followers. For each topic, one of the brokers is elected as the leader; this is all automatic in Kafka. The other brokers are automatically assigned a follower status. Your application will connect to the leader broker, and the records being sent from the application to Kafka get replicated by the followers repeatedly fetching messages from the leader.
Again, this is all done automatically by Kafka. So this means that all
the apps will connect to the leader, they'll consume and produce to the leader,
and then that information will be duplicated across the different brokers
due to that follower behavior. But what happens in the potential scenario
where our leader broker, for example, goes down? Well, in that case, you might
think that the application's connection with Kafka would be broken. But, no, that's not the
case. Again, we've got this resilient behavior built in. So instead,
what happens is a leader election occurs. A leader election is where Kafka automatically assigns one of the followers to become the new leader and switches your application's communication over to that new leader broker. So it ensures that your application essentially continues to be able to communicate with Kafka to consume and produce.
And it means that we still have a replica broker, so we can still replicate that data, have that data persistence, and ensure that we've got that resiliency behavior built in in case that second leader were to go down, for example. It also gives enough time for the original broker to come back up again and become a new follower.
So that's really great: it means that we don't lose any data and we don't lose that connection with Kafka. It's that resilient behavior built in. So, we've talked about how Kafka itself is resilient. What about how we communicate
with it, with our application? So, when it comes to creating resilient producers,
we have two things that we need to be considering: our delivery guarantees and our configuration. When it comes to trying to be as resilient as possible, you can't just use a fire and forget method, because if the broker were to go down with a fire and forget methodology, your messages could be lost. So we need to be thinking about our delivery guarantees to make sure those records are being received. There are two options here: at most once and at least once. With at most once, you might lose some messages. It's not completely resilient, but if you don't mind some of the messages getting lost, it does give you greater throughput. It isn't the most resilient option we could use, though. Instead, we'd be looking at at least once, which ensures that there is guaranteed delivery of that record, but you may get duplicates.
So that is something to consider when you're trying to pick which delivery guarantee to
use. Then we've got the configuration: acks and retries. Acks is short for acknowledgements, and this is essentially acknowledging that the records or messages have been received by Kafka. You can set acks to zero, one, or all. Zero is for when you really don't care about getting the message acknowledged. It's really good for fast throughput, but not for resiliency. If you set acks to one, it makes sure that the leader successfully received the message, but doesn't bother about making sure the followers have received it. The danger with this is that if, for example, your leader broker were to go down after acknowledging that record, but before the followers had been able to replicate it, you could potentially lose that record. So it's, again, not the most resilient, but it is faster in terms of throughput than all. All is the last option, and that's where you wait for all of the in-sync replicas to confirm that they've successfully received the message or record. It's the most resilient configuration value for this setting. The other thing you need to consider is retries. Retries are for if the acknowledgement times out or fails: how often do you want to retry sending that record or message, how often do you retry producing that event? But what you need to think about is how those retries could potentially affect your ordering. So this is important to consider when you're setting this retries configuration.
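As a sketch of what those settings look like on a plain Java producer (the broker address and topic here are just placeholders):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ResilientProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", "all"); // wait for all in-sync replicas: most resilient
        props.put("retries", 3);  // resend if the acknowledgement fails or times out
        // retries can reorder records; limiting in-flight requests preserves ordering
        props.put("max.in.flight.requests.per.connection", 1);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("orders", "order-1", "latte"));
        }
    }
}
```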
So, let's take a look at consumers. So, with resilient consumers,
the configuration we need to be aware of is how we're committing our offsets. Again, because Kafka has this stream history and retains all of the data within it, it's got this persistence. That means we need to be aware of where we've read up to, so that we don't reread a message if our consumer was to go down, for example, and come back up again. To do this, we use something called offsets. An offset is just a value assigned to each record within a topic, and it increases over time. For example, if we added a new record to this particular topic, it might be five or six, depending on what you class the dotted line as. There are different methods of committing that offset, meaning when you commit it. For example, if we had a consumer and they'd read up to one, we'd want to commit one, so that if the consumer went offline and had to come back up again, they would start at two instead of at one. The two different methods of committing offsets are manual and automatic. So let's take a
look at our barista example again to see how these differ in terms
of their resiliency for our application. So with autocommit,
there's no code in the application that determines when this offset
is going to be committed. Instead, it's relying on default settings in
the underlying client, which in this case is probably a Java client. And that means
it's essentially based off a timer. So it will automatically commit
offsets for the messages that the consumer has read from Kafka based on
a default timer method. So in this case, the barista is looking at this
topic. And on our topic we have three records, which represent the three orders from our three coffee lovers: a coffee order, a cappuccino order, and a latte order. Our barista will start at the beginning and will
start producing the coffee order. Now, because we're on auto commit, after an allotted period of time our offset is going to be committed. So we're going to commit
that offset represented by a tick here. But unfortunately, whilst the barista
is making the coffee order, they trip and they spill the coffee. So they've got
no coffee left anymore. So this is representing that barista microservice going
down partway through processing that record. What this means is that actually when the
barista microservice comes back up again, it looks at where it's committed its offset.
It's already committed the first offset, that coffee order. So instead it starts
at cappuccino. They successfully make cappuccino and latte.
But that means at the end of it all, we've only got two coffee orders
instead of three. So we've lost a record somewhere along the way.
However, let's take a look at manual commit. Manual commit is the other option you can choose for when to commit that offset for your consumers, and this is where you write code into your application to determine when it's going to be committed. That might be pre-processing, midway through processing if you want, or post-processing. In this case, we're going to do it post-processing: we're going to only commit the offset when we've actually finished making the coffee order and served it up to our customer. So in this case, when we're making the coffee order, if we were to spill it, we haven't committed that offset yet, so when our barista microservice comes back up again, it would know to start back at the coffee order. Now we only put that tick up, we only commit that offset, when the coffee is served. And that means that by the end of it all we've got a much more resilient system, because we haven't lost any records, because we're committing our offsets at the point at which we expect that behavior to occur.
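Here's a rough sketch of what that manual, post-processing commit looks like with the plain Java consumer client; the topic name and the makeCoffee step are hypothetical:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ManualCommitBarista {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("group.id", "baristas");            // the consumer group id
        props.put("enable.auto.commit", "false");     // switch off the timer-based auto commit
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    makeCoffee(record.value()); // if we fail here, the offset is not yet committed
                    consumer.commitSync();      // commit only once the coffee has been served
                }
            }
        }
    }

    private static void makeCoffee(String order) {
        System.out.println("Making: " + order); // hypothetical processing step
    }
}
```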
So this is how Kafka has resiliency built in: through its stream history and its immutable data, and through things like its broker system and leader elections. And we're able to introduce greater resiliency through the configuration of the communication between our consumers and producers and Kafka, through things like acknowledgements and the commit method we're using. So let's take a look at the next behavior, elasticity. How do we enable this scalable behavior when utilizing Kafka?
So Kafka is designed to work well at scale and to be scalable itself, and for producing applications it does this using partitions. For any particular topic there are one or more partitions; in this case there are three, because when I created the topic I specified that I wanted three partitions. So it is something you need to specify. Kafka will then aim to spread the partitions for a particular topic across different brokers. This allows us to scale producers very easily, as they won't put the load for one particular topic on just one specific broker. And we can always add more brokers if we want to and spread the load out more. So it's really great for gracefully handling load in our application and enabling that scalability for producers.
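For illustration, here's one way the partition count can be specified when creating a topic, using Kafka's admin client; the names and counts here are placeholders:

```java
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateOrdersTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address

        try (AdminClient admin = AdminClient.create(props)) {
            // three partitions, replicated across three brokers
            NewTopic orders = new NewTopic("orders", 3, (short) 3);
            admin.createTopics(List.of(orders)).all().get();
        }
    }
}
```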
Actually, the interesting part when it comes to scalability, where we really need to make a conscious effort to enable this behavior, is in our consumers. Consuming in Kafka is made possible by something called consumer groups.
When you're consuming messages in Kafka, because of this stream history again, we need to think about how we're going to handle it: if we have multiple consumers all trying to consume from the same topic, we're running the risk of rereading records, duplicating processing, or not getting ordering guarantees, et cetera. So in Kafka we have this consumer group idea, to help ensure that we have the ordering guarantee and that we're not rereading messages from the same consumers. This really allows scalability of consumers by grouping them. And the grouping is enabled via a config value: you put in a group id so that you know which consumers are joining which consumer group.
So let's take a look at what this looks like in practice. On the left hand side here, we have our topic, with our different partitions and some of our records, and you can see the offset on them there. And on the right hand side, we have two consumer groups. Consumer group A has three consumers within it, and consumer group B has two consumers within it. So let's take a look at
how they're going to be linked up to the different partitions. Because you've got three consumers in consumer group A and three partitions, each consumer will be assigned a partition. That means that they're only reading from one partition, and that's important, because that's where the ordering guarantee comes in. Now, with consumer group B there's only two consumers, so one of those consumers will read from two partitions; you can see the bottom one is reading from partition one and partition two. What this means is that, by only reading from one partition, we can guarantee that consumers are not going to be reading duplicate messages. And from an ordering standpoint, Kafka provides us an ordering guarantee per partition. So it means that the first consumer, for example, will get the messages from its partition in the correct order. So we can ensure that that ordering guarantee remains.
What happens if we want to scale up the number of consumers, though? Here's an example: if we were to add a consumer to consumer group A, it would actually just sit there idle, and that's because there's no partition for it to read from. We've already got three consumers and three partitions matched up, and we can't assign the same partition to an additional consumer within the same consumer group, because we'd lose the guarantee that we're not duplicating records and we'd also lose that ordering guarantee.
So this consumer would just sit there idle. Now, it might be useful, for example, if you wanted a spare in case one of the other consumers went down; it could just pick up where the other one left off and be assigned that partition. But right now it's essentially just a waste of resource, because it's sitting there doing pretty much nothing. However, if you were to add it to consumer group B, it would be able to be assigned a partition, because there's a spare partition there, essentially, with one consumer currently consuming from two partitions.
So this is where you need to be careful when you're setting up your application.
You need to make sure you're thinking about how many partitions you're going to need for each topic, based on how many consumers you're potentially going to be scaling up to in your application. Now, you can add new partitions once your
system is already up and running, but an important thing to consider is that the
more partitions you have, the more load you're putting on a system when a leader
election happens, for example, and the ordering is only guaranteed
while the number of partitions remains the same. So as soon as you start
adding partitions, you lose that ordering guarantee once again. So when you're setting
up your system, be sure to think about the number of partitions you're going to
need before you set it up. So that's how we can enable scalability within producers
and consumers in our application. So we've looked at the different behaviors and how
we enable that using Kafka. We've looked at the different layers within our application
and where we need to be introducing these reactive behaviors for an end
to end non blocking reacting application or reacting system.
But how do we actually go about writing reactive Kafka applications?
What tools and technologies can we utilize? So there's obviously the standard
Java Kafka producer and consumer clients that you can take a
look at, but these aren't really designed or optimized for reactive systems. So instead, what we'd really suggest is taking a look at some of the open source reactive frameworks and toolkits that are available. These help to provide advantages like simplified Kafka APIs that are reactive, built-in back pressure (as we mentioned in the reactive architecture patterns), and the enabling of asynchronous per record processing.
Examples of these open source frameworks specifically for reactive Kafka interactions include, but are not limited to, Alpakka, MicroProfile, and Vert.x. These are the ones we're going to be taking a look at today. There are others, like Project Reactor, which you can definitely take a look at; we're just not going to be going into it in this presentation. So let's take a look at these and see what the differences are. So the Alpakka Kafka
connector is essentially a connector that allows consuming from and producing to Kafka with something called Akka Streams; it's part of the Akka framework, or toolkit. Akka, and the other libraries that interact with it, is actually based on something called the actor model, which is slightly different to something like a microservice based application. In the actor model, the actor is the primitive unit of computation. It's essentially the thing that receives a message and does some kind of computation based on it. Messages are sent asynchronously between actors and stored in a sort of mailbox, and that's how they communicate. It's a different process; it's not a one to one mapping between an actor and a microservice. So it can take a bit of a paradigm shift, a shift in your way of thinking, if you're going from a microservice based model to the actor model. That's something to bear in mind if you're considering Alpakka, the actor based model, and Akka. However, if you're already utilizing the actor based model for your application, it's definitely worth taking a look at the Akka framework, and at Alpakka specifically for its connection to Kafka. The next framework we're going to take a look at is Eclipse MicroProfile.
This is really an open source, community driven specification for enterprise Java microservices. It works with Java EE and Jakarta EE, and it's really built by the community: a huge range of individuals, organizations, and vendors contribute to the project. Within it, there are several different APIs that are offered to enable greater ease of building microservice based applications ready for cloud native, through the use of these APIs on top of something like Java EE or Jakarta EE. The bottom triangle here, the dark gray ones at the bottom in that triangle shape, are the standard APIs that come as part of the MicroProfile stack. The ones in the right hand corner here, the little sideways L shape, are the standalone projects that the MicroProfile community also works on. The hope is that these standalone projects will eventually become part of the standard stack, but right now they're separate projects that the same community works on, and they integrate together. So the one that we're actually interested in for reactive in this case is
the Reactive Messaging specification. The Reactive Messaging specification makes use of, and interoperates with, two other specifications. One is Reactive Streams Operators, which is another standalone project; it provides a basic set of operators to link different reactive components together and provide processing on the data that passes between them. And it interoperates with the Reactive Streams specification that we mentioned earlier. That one is not a MicroProfile specification; it's the community driven specification I mentioned in the reactive programming part of this presentation. So how
does MicroProfile Reactive Messaging actually work? Well, it works by providing annotations for you to utilize on your application's bean methods: there are Incoming and Outgoing annotations that you can use. These annotated methods are connected together by something called channels. Channels are essentially just opaque strings. If you're connecting internally within your application, it's just called a channel; if you're connecting to, say, an external messaging broker, that channel changes its name to a connector. Now, because MicroProfile is a specification, you'll need to look at the implementations to understand what connectors they offer, because each implementation will offer different types of connectors. So here you can see that method B, with its incoming annotation for order, is connected via that order channel to method A, with its outgoing annotation for order. So that's how they're connected, and that's how it enables this reactive behavior.
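To make that concrete, here's a minimal sketch of two bean methods wired together by an order channel. The payload type and method bodies are just illustrative; the RxJava Flowable works here because it implements the Reactive Streams Publisher interface:

```java
import io.reactivex.Flowable;
import java.util.concurrent.TimeUnit;
import javax.enterprise.context.ApplicationScoped;
import org.eclipse.microprofile.reactive.messaging.Incoming;
import org.eclipse.microprofile.reactive.messaging.Outgoing;

@ApplicationScoped
public class OrderProcessing {

    // method A: publishes a stream of payloads onto the "order" channel
    @Outgoing("order")
    public Flowable<String> methodA() {
        return Flowable.interval(1, TimeUnit.SECONDS)
                       .map(i -> "order-" + i);
    }

    // method B: invoked for each payload arriving on the "order" channel
    @Incoming("order")
    public void methodB(String order) {
        System.out.println("Processing " + order);
    }
}
```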
So let's take a look at the final framework, or rather toolkit, we're going to cover, and that's the Eclipse Vert.x project. The others, especially MicroProfile, are Java specific, whereas this is a polyglot toolkit: it can be used with Java, JavaScript, Scala, Kotlin, and many other languages. It's based on something called the reactor pattern, which is essentially a really event driven style of architecture. It uses a single threaded event loop which is non blocking on resources, emitting events and dispatching them to corresponding handlers and callbacks. So it's really event driven, it's non blocking, it runs on the JVM, it includes a distributed event bus, and it's single threaded. And actually
we utilized Vert.x with Kafka in one of our demo applications, when we were converting it from a standard Java Kafka application to a reactive Java Kafka application.
So let's take a look at that demo application. So this isn't necessarily
what you'd want to do in enterprise, but this application was designed
just to be able to help people test their Kafka integration.
So have they set up Kafka correctly? Are they able to produce and consume
from Kafka? Are they able to utilize it to its full advantage
in their application? So in this we have an application that
we actually converted to use Vert.x; it produces to a Kafka topic and then consumes back from that Kafka topic. We also have a front end application where we can put in custom data for our messages and custom topics, and we can start and stop producing and start and stop consuming, et cetera. This application is all open source, so if you'd like to utilize it to test your own application and your own Kafka integration, feel free. The front end is all open source as well.
Let's take a look at how this application works. So if we press play, hopefully we can see here that we can put in our custom message. We can then click that start producing button, and then we can click the start consuming button on the right hand side to see those records that have been sent to Kafka coming back and being consumed from Kafka. So we can see that here, but we can also stop consuming and stop producing again. So in this, as we said, we're sending these start and stop commands from the WebSocket, and we really needed a way to be able to start and stop that within Kafka. When we weren't using reactive frameworks, it became a bit of a threading nightmare, because we had to switch over threads depending on what we were doing, we had to instigate this pause and resume functionality, and it was quite complicated and difficult to achieve.
Actually, when we switched to using Vert.x, we found all sorts of advantages.
So for the producing side of things: in the original version of our application, we had this start command that was sent to the back end from the WebSocket. The back end then started the producer, which sent a record every 2 seconds. Then we'd send the stop command from the WebSocket to the back end, and the back end would stop the producer and no records would be produced. In this application, the standard producer client's call to produce a record was essentially a blocking call. But switching over to the Vert.x producer meant that we could now get a future back from the send function. So it was a fairly small change, but it enabled us to switch from a blocking call to a more asynchronous style of call. It meant we were able to asynchronously produce new records while waiting for the acknowledgement from Kafka of the in-flight records. So it's a fairly small code change, as you can see here, but changing from blocking to asynchronous is really important in reactive systems.
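Here's a rough sketch of what that change looks like with the Vert.x Kafka client, assuming Vert.x 4, where send returns a Future; the config values and topic are placeholders:

```java
import io.vertx.core.Vertx;
import io.vertx.kafka.client.producer.KafkaProducer;
import io.vertx.kafka.client.producer.KafkaProducerRecord;
import java.util.HashMap;
import java.util.Map;

public class VertxProducerExample {
    public static void main(String[] args) {
        Vertx vertx = Vertx.vertx();

        Map<String, String> config = new HashMap<>();
        config.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        config.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        config.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        config.put("acks", "1");

        KafkaProducer<String, String> producer = KafkaProducer.create(vertx, config);

        KafkaProducerRecord<String, String> record =
                KafkaProducerRecord.create("demo", "key", "value");

        // send() returns immediately with a Future, so we are free to carry on
        // producing while the acknowledgement for this record is in flight
        producer.send(record)
                .onSuccess(meta -> System.out.println("Acked at offset " + meta.getOffset()))
                .onFailure(Throwable::printStackTrace);
    }
}
```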
But let's take a look at consuming, because this is where we saw the greatest difference. In the original version of the application, we were having to use a for loop inside a while loop that resulted in this sort of flow: we were polling for records, that poll function returned a batch of records, and then we had to iterate through each record in the batch and, for each record, send the record along the WebSocket to the front end. Only after that entire iteration through all the records was complete would we be able to go and fetch new records. So it was quite blocking, because we couldn't do anything whilst those records were being processed. And it didn't feel very asynchronous, because we were grabbing batches of records rather than responding each time a new record came in. Instead, when we switched to the Vert.x Kafka client, we were able to use this concept of a handler function. Every time a new record arrived, this handler function was called. It made our flow a lot simpler, and we were able to hand off the step of polling for new records to Vert.x. So now we just receive a new record and then send that record along the WebSocket to the front end for processing. This essentially allowed us to process records on a per record basis rather than in batches, and it left us free to focus on the processing of records rather than the work of consuming them from Kafka.
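And here's a sketch of the handler-based consuming flow with the Vert.x Kafka client, again assuming Vert.x 4; the WebSocket step is reduced to a simple print here:

```java
import io.vertx.core.Vertx;
import io.vertx.kafka.client.consumer.KafkaConsumer;
import java.util.HashMap;
import java.util.Map;

public class VertxConsumerExample {
    public static void main(String[] args) {
        Vertx vertx = Vertx.vertx();

        Map<String, String> config = new HashMap<>();
        config.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        config.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        config.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        config.put("group.id", "demo-group");

        KafkaConsumer<String, String> consumer = KafkaConsumer.create(vertx, config);

        // Vert.x does the polling for us; the handler fires once per record,
        // so we process per record instead of iterating over batches
        consumer.handler(record ->
                System.out.println("Received: " + record.value())); // stand-in for the WebSocket send

        consumer.subscribe("demo");
    }
}
```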
If you want to find out about all of the benefits we experienced by switching from a non reactive to a reactive Java Kafka application, we've written up a blog that you can access on the IBM Developer site. Feel free to take a look, and hopefully it will show you some of the benefits you might be able to experience by switching to some of these open source reactive frameworks and toolkits. So hopefully, in summary, what I've shown you here is that taking a non reactive application and sticking Kafka in, or sticking some event driven architectural tool in, doesn't magically give you an asynchronous, non blocking, reactive application.
We need to be carefully considering the Kafka configuration that we use
to create the most reactive system possible. And we can utilize these open source toolkits and frameworks, which can provide additional benefits as well, to create the reactive characteristics and behaviors expected and needed for an asynchronous, end to end reactive application. But the open
source reactive community is on hand. As I said, there are many open source frameworks
and toolkits you can utilize depending on the architectural style of your application.
So I'm going to demonstrate just how easy it is to get started by
utilizing some of these reactive frameworks for Java Kafka based applications. So I've gone on to our interactive online environment here, where you can utilize some of our modules to get started with this. And I'm looking at module one here, Reactive Java Microservices, where we'll be utilizing MicroProfile Reactive Messaging to write reactive Java microservices that interact with Kafka. So I've already got the tab open
here. When you open it up, it will ask you to log in via a social account, and that could be any social media, really. I've already logged in. And when you log in, you get this online environment. On the left hand side here I've got the instructions; on the right hand side I've got an IDE. And if I go to Terminal, New Terminal, I have a new terminal turn up in the bottom right hand corner.
Now these labs will be available after this session, so feel free to
go through them at your own pace and go through the labs I don't
cover in this session today. So let's get started.
So the great thing about all of these different labs is that they actually utilize the same application. Here's a basic architectural diagram of what that looks like. We've got a system microservice and an inventory microservice, and they're communicating via Kafka in the middle there. The system microservice calculates and publishes an event that contains its average system load every 15 seconds, and it publishes that to Kafka. The inventory microservice then consumes that information and keeps an updated list of all the systems and their current system loads. We can access that list via the systems REST endpoint of the inventory microservice. So you can see here we're utilizing MicroProfile Reactive Messaging in this section of the application, and we're utilizing RESTful APIs to access that endpoint. So without further ado, let's get started.
So let's just make sure I'm in the right directory here, just in case. Great. Then we can do a git clone of our repository, and then we can cd into the correct directory, which is guide-microprofile-reactive-messaging. We're actually utilizing the guides here from the Open Liberty website. If you want to take a look at what the finished application should look like after we complete this lab, this module, you can check out the finish directory. But we're going to head into the start directory for this lab, because that's essentially where we do all the work, and by the end of it we should get to the same state as the finish directory. It's the same for all of our labs. So we're going to be utilizing these touch commands to make it a bit
easier to create our file system. And then if I head to the explorer
I should be able to see this U, which represents where I created that new file. So if we follow that along, it should open up for us and we should be able to see that file if I move this over. So there's the SystemService class that we're creating first. This is a class for the system microservice, which is the one that's producing the average system load every 15 seconds, so the CPU usage for the last minute. If we open that up, we should see an empty file, and then we can insert the code we've got available here for you. I'm just going to close the explorer so it's easier to see. In this environment you do need to save; there isn't an auto save, so make sure you save, because otherwise your application won't be able to build. You should see that dot disappear once it saves.
So because this is the system microservice, we're utilizing the MicroProfile Reactive Messaging Outgoing annotation, because we're producing to Kafka, and we're producing to the Kafka topic for system load. As you can see here, we're actually making use in this application of one of the reactive programming libraries, RxJava, which you can see we're importing here, as well as the MicroProfile Reactive Messaging specification. With this RxJava, we're utilizing a method called Flowable.interval, which you can see at the bottom here. This allows us to set the frequency of how often the system service publishes its calculation to the event stream. We're utilizing Flowable because it enables that inbuilt back pressure. The previous iteration of RxJava included Observable, and they introduced Flowable so that they could have that inbuilt back pressure as an option. So now you have a choice between Flowable and Observable, and we're utilizing Flowable in this application.
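The heart of that class looks something like the following sketch, assuming RxJava 2's Flowable and using the JVM's own system load reading; this is an approximation of the guide's code rather than a copy of it:

```java
import io.reactivex.Flowable;
import java.lang.management.ManagementFactory;
import java.util.concurrent.TimeUnit;
import org.eclipse.microprofile.reactive.messaging.Outgoing;
import org.reactivestreams.Publisher;

public class SystemService {

    // publish a system load reading onto the systemLoad channel every 15 seconds;
    // Flowable (unlike Observable) supports back pressure
    @Outgoing("systemLoad")
    public Publisher<Double> sendSystemLoad() {
        return Flowable.interval(15, TimeUnit.SECONDS)
                       .map(tick -> ManagementFactory.getOperatingSystemMXBean()
                                                     .getSystemLoadAverage());
    }
}
```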
So you can see here a basic architectural diagram of what we've just created in this code. Let's continue on and look at the inventory microservice.
So for the inventory microservice, we're just going to use this touch command again, and we're creating the InventoryResource class. Again, if we head to our explorer, close system, and head to inventory, we can see this U directs us on where to go, which is really helpful. Otherwise, you can take a look here in the instructions and it will show you where to go to find this new file that you've created. So let's open that up. And once we open it up, we can then input all the code that we've got down here. Again, you can utilize the copy buttons in the bottom right hand corner and just paste it into the IDE; that way is a bit easier than highlighting everything and copying it. Because this is the inventory microservice, we're utilizing the Incoming annotation, because we're consuming from Kafka, but we're still consuming from the same system load channel, hence why we've got the system load there. And you can see a basic architectural diagram, on the bottom left, of what this should look like when it's all connected together.
So now that we've created these two different classes, we need to create the MicroProfile config properties file for each microservice. Again, we can utilize the touch command to create the file that we need.
So this time we're going to system first, because that's the one we're looking at first. If we head to system, and instead of going into java here, we go into resources, there's the microprofile-config.properties file. So again, we open that up and then copy and paste the configuration we've got here in the instructions. And again, remember to save. I'll close my explorer for this. Here you can see that because we're utilizing the Outgoing annotation, you can see outgoing in our configuration. And I mentioned before how, if it's an external channel that we're trying to connect through, so we're connecting to an external messaging broker, it's actually called a connector, and each implementation's connectors are different. Because we're utilizing the Open Liberty implementation of MicroProfile, here we're utilizing the liberty-kafka connector. You can also see here where we specify which Kafka topic we want to connect to, in this case system load, and we've got a serializer here so we can convert our object into JSON for Kafka.
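The configuration follows the mp.messaging naming scheme, and looks roughly like this; the channel, topic, and serializer names here are illustrative rather than the guide's exact values:

```properties
# broker address for the liberty-kafka connector (placeholder)
mp.messaging.connector.liberty-kafka.bootstrap.servers=localhost:9093
# the outgoing channel "systemLoad" uses the liberty-kafka connector...
mp.messaging.outgoing.systemLoad.connector=liberty-kafka
# ...writes to this topic...
mp.messaging.outgoing.systemLoad.topic=system.load
# ...and serializes the payload to JSON with a hypothetical serializer class
mp.messaging.outgoing.systemLoad.value.serializer=io.example.SystemLoadSerializer
```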
same configuration file, but this time for the inventory microservice.
So if we use that touch command again, head to our explorer and this time
I'm going to close down system and open up inventory again. Make sure you're
not going into Java, we need to be going into resources instead.
And there's our microprofile configuration file. So if we open that
up and then we can copy and paste in our different configuration.
So for this one it's a little bit different: because we're consuming from Kafka, we have slightly different configuration. We're using the incoming annotation, so we've got incoming in the configuration here. We're still utilizing the same Liberty Kafka connector, because we're still connecting to Kafka using the Open Liberty implementation of MicroProfile, and we're still connecting to the same topic within Kafka. This time we've got a deserializer, because we want to turn the record from JSON back into an object. But we've also got an extra configuration value here that we didn't have in the previous configuration properties file: a group id. And this is what I was referring to when we were talking about Kafka having this idea of consumer groups. Because this is a consumer from Kafka, we need to assign it a group id so that if we were to spin up any other consumers, they would join the same consumer group, we wouldn't duplicate processing or reread the same record, and we would keep that ordering guarantee. So that's why we have that extra configuration value in this particular configuration for the inventory microservice; a sketch of it follows below.
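Roughly like this, again with illustrative broker, topic, deserializer, and group id values:

    mp.messaging.connector.liberty-kafka.bootstrap.servers=localhost:9093

    # The "systemLoad" channel consumes from the same Kafka topic
    mp.messaging.incoming.systemLoad.connector=liberty-kafka
    mp.messaging.incoming.systemLoad.topic=system.load
    mp.messaging.incoming.systemLoad.value.deserializer=io.openliberty.guides.models.SystemLoad$SystemLoadDeserializer
    # Consumer group id: any extra consumers joining this group share the
    # partitions rather than rereading the same records
    mp.messaging.incoming.systemLoad.group.id=system-load-status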
So the next step is to create a server.xml file. Again, we can just create this using the touch command
we've already got here. Head to our explorer. We're creating
this for the server microservice, sorry, for the system microservice
we've already created the same server configuration file
for the inventory, which is why you won't be doing it during the steps of
this guide. You can go and check that out if you would like to.
This time we're heading into system and then into liberty. And here you can see
the server XML file that we need. So if we open
that up and then I can copy and paste this code in and
you can see here the different features that we make use of. So you can see we're making use of the MicroProfile Reactive Messaging specification, but we're also making use of several other APIs that are offered as standards as part of the MicroProfile stack, things like MicroProfile Config. The configuration files we just made are actually external to our application logic, which is fantastic when you're trying to make cloud native applications, and we're making use of MicroProfile Config for that. We're also making use of things like MicroProfile Health, JSON-B and CDI; the feature list looks something like the sketch below.
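A minimal sketch of the featureManager section; the exact feature versions here are an assumption and may differ in the lab:

    <server description="System service">
        <featureManager>
            <feature>mpReactiveMessaging-1.0</feature>
            <feature>mpConfig-1.4</feature>
            <feature>mpHealth-2.2</feature>
            <feature>jsonb-1.0</feature>
            <feature>cdi-2.0</feature>
        </featureManager>
    </server>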
Again, all of these are part of that standard microprofile stack and you can go
and check them out if you want to do some additional work on understanding
those different APIs. We have guides on the open Liberty website that
you can take a look at for each of those. So as I
said, we've already created the same for the inventory microservice because it's exactly the
same file. So we're not going to bother doing that here. So now let's go ahead and create the Maven configuration file so we can actually build this project, again utilizing the touch command. This time I'm going to close down src, and I should see underneath system that pom.xml file I've just created. So again, if we open that up, I can copy and paste all of the configuration I need into this file from the instructions. And then remember to save. I'll just close my explorer so you can see it better.
So in this you can see all of the different dependencies that we make use of. For example, there's that RxJava dependency, the reactive programming library; we're also making use of Apache Kafka; and here's that MicroProfile Reactive Messaging specification and the standard MicroProfile stack as well. And we're also making use of Jakarta EE. So it's just interesting for you to see some of the dependencies that we're making use of in this project, sketched below.
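For reference, the core of the dependency section can look something like this sketch; the version numbers are assumptions and may differ in the lab:

    <dependencies>
        <!-- MicroProfile Reactive Messaging specification -->
        <dependency>
            <groupId>org.eclipse.microprofile.reactive.messaging</groupId>
            <artifactId>microprofile-reactive-messaging-api</artifactId>
            <version>1.0</version>
            <scope>provided</scope>
        </dependency>
        <!-- RxJava, the reactive programming library -->
        <dependency>
            <groupId>io.reactivex.rxjava3</groupId>
            <artifactId>rxjava</artifactId>
            <version>3.0.4</version>
        </dependency>
        <!-- Apache Kafka client -->
        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka-clients</artifactId>
            <version>2.8.0</version>
        </dependency>
    </dependencies>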
So let's go ahead and start creating this application. Let's run the Maven install and package goals so that we can build it. One mistake I've made there: make sure that you are in the start directory. You don't need to be in the start directory for any of the touch commands, because they automatically make sure they're in the right place, but now that we're building the application, we do need to be in the start directory. So now let's try and run those commands again. Now it's looking a bit better. Fantastic. And then we need to do a Maven package; the commands are sketched below.
So this docker pull Open Liberty command is just to make sure we've got the latest Open Liberty Docker image in our sort of shared online environment here. Because we're sharing it, we just want to make sure we've got the right edition and no one has messed with it, so it shouldn't take too long.
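Assuming the official Open Liberty image on Docker Hub, the command is roughly:

    docker pull open-liberty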
And the great thing about this environment is that if you're trying multiple labs, it actually saves your progress. That's why we have cleanup scripts at the end, so that different directories and different Docker images aren't messing up different labs. But it means that if you are doing this docker pull, the image will essentially stay there for any of the other labs you do, so you don't have to do this step for the other labs. It shouldn't take too long in the first place.
Anyway, there we go. We're complete. So now we can go
ahead and build our two microservices. So first we're going to build the system microservice
and then we're going to build the inventory microservice. Shouldn't take too long because they're
not very big. And then we're going to utilize this start containers script
that we've got which essentially starts up the different docker containers
that we need for this application. So we'll start up Docker containers for Kafka, ZooKeeper and the microservices in this project. The reason we're still using ZooKeeper is because, right now, the version of Kafka we're using needs ZooKeeper for metadata. The plan for future versions of Kafka is to enable that metadata to be stored in a topic within Kafka itself, so you won't necessarily need ZooKeeper in the future, but right now, unfortunately, we still need it. So that's why we're having to spin that up as well. So whilst we're waiting for this application to get up and running, the next step is to run a curl command so that we can ensure that our system microservice is successfully accessing and calculating that average system load and publishing it to Kafka, and that our inventory microservice is successfully consuming it from Kafka and updating our list of systems and their average system loads, which we can then access via that REST endpoint. There we are.
So, giving it a bit more time, we can see here that now we're able to access the list via the REST endpoint. We can see the hostname, so the name of the system, and the average system load. If you were to wait a little bit longer, you could then use the same curl command and you should see a change in that system load, because it's all being recalculated every 15 seconds. So you should see some sort of change; you can see here we've changed from 1.1719 to 1.33. The call and its output look something like the sketch below.
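For example, with an illustrative port, endpoint path, and hostname (the load value is the one from the demo):

    curl http://localhost:9085/inventory/systems

    [
      {
        "hostname": "system-service",
        "systemLoad": 1.33
      }
    ]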
So hopefully that should show you that it can actually be fairly easy, utilizing a reactive framework or toolkit, to create really reactive, responsive, Kafka-based Java applications. If you'd like a very easy read, this really short ebook that I helped author is available for free online, and it's essentially a summary of everything I've explained in this presentation around reactive: what reactive systems are, how you utilize them, how to go about building them, and why you'd want to. Feel free to download it, share it amongst your colleagues, or utilize it as a reference point for the different aspects of reactive I've covered here in this presentation. As if I haven't given you enough links
throughout this presentation, I've also got a bunch of resources for you to get started with. If you have any additional questions that I'm not able to answer here, please feel free to reach out to me on Twitter. My Twitter handle is @GraceJansen27, and I'd be happy to answer your questions.
Thanks very much for listening. Have a great day.