Transcript
This transcript was autogenerated. To make changes, submit a PR.
Event-driven architectures: orchestrating cloud-native workflows. This is a trending and important topic right now, right?
So I'm super excited to explore it.
EDA is all about using events, signals that something has happened, to orchestrate
workflows and create responsive, scalable systems.
So let's embark on this journey together and unlock the potential.
This is Mustafal Mahmoud, working as a software engineer
at Brain Station 23. I'm also proud to be
an AWS Community Builder in the serverless category for 2023.
I'm excited to be here today as we explore the fascinating
world of event-driven architecture.
Throughout my journey as a software engineer, I have had the
privilege of working on projects for leading brands
in various industries. Welcome to my session on event-driven
architecture. Today I will cover the basics
of EDA: I will break down what it is and
the key concepts that make it work, why use
event-driven architecture, and how to use it.
Orchestration versus choreography: I will compare these
two approaches to coordinating tasks in event-driven
architecture. I will discuss some key factors
to keep in mind when designing your own event-driven systems,
and I'll share some pro tips for building robust and
scalable event-driven applications. I will also
show you a live demo of error-handling
workflows following the EDA approach,
and I will wrap up at the end of this session with some
helpful resources to keep your EDA journey going. So let's
jump right in. In today's dynamic
world, applications are constantly responding to user actions and
system changes, and event-driven architecture places events
at the core of application development. Modern
applications are inherently event-driven. I mean,
events are everywhere: for example, a
customer placing an order, a social media post being
created, or a sensor reading being uploaded.
So event-driven architecture places these events at the core
of application development, transforming them
from a byproduct into a
powerful communication mechanism. Services publish events,
something happened, and subscribe to relevant events, I mean
react to what happened, leading to loose coupling and faster development
cycles. Cloud providers like AWS,
Azure, and GCP offer
a comprehensive toolkit for building serverless event-driven
architectures: services like AWS Lambda,
Amazon Kinesis for event streaming,
Amazon SNS, SQS, etc.
Azure provides Azure Functions for serverless and Azure App
Service for managed hosting, and GCP provides
Cloud Functions, and so on. There are
some core values of event-driven architecture;
these are all about building complex applications.
Unlike traditional architectures that constantly poll
for updates, event-driven architectures react to events
in real time. This shift unlocks some amazing benefits:
independent feature development, effortless feature
integration, loose coupling, and modularity.
So imagine microservices like building blocks: with EDA, teams can
work independently on services that publish and subscribe to events.
This reduces dependencies and lets
you roll out new features much faster. Then there's
effortless feature integration. I mean, adding new
features becomes a breeze, right? No need to modify existing applications:
new features can simply subscribe to existing events, like plugging
into a real-time information stream for innovation.
EDA's asynchronous publish/subscribe communication lets your system handle
massive volumes of events without bottlenecks. Plus, if one service fails,
it doesn't bring down the whole system, increasing resilience for
your applications. Services communicate through events,
leading to cleaner, more modular code with less complex
dependencies. This makes your application easier to understand,
maintain, and scale as your needs grow. I mean,
it enhances flexibility and maintainability by reducing the impact of
changes in one component on others.
These core values, working together, create a more flexible
and robust foundation for your applications.
So why do organizations adopt event-driven architectures?
There are some strong points. Real-time responsiveness:
event-driven architecture enables applications to detect
and respond instantly to events triggered by users and systems,
providing a seamless, interactive user experience
with low latency. This
real-time responsiveness enhances user engagement and satisfaction.
Event-driven architecture also facilitates seamless integration with external systems
and services. This extensibility allows
for integrating additional features and services to enhance the application's
functionality. Finally, we get the
opportunity to minimize resource
consumption. Unlike the traditional request-response model, event-driven architectures
minimize resource consumption by responding only to events,
reducing delays and improving server
efficiency. This optimization enhances overall performance
and resource utilization. Right? There are some
key concepts of event-driven architectures, I mean
building blocks that work together to create a powerful communication mechanism.
Let's break them down. The
key points are events, event producers,
event consumers, and event brokers. Events are
signals that something has happened: an OrderCreated
event, or a PaymentReceived event. These events are like
little snapshots in time describing a specific
change that has occurred. They are also immutable,
meaning once created, their content cannot be altered.
This is especially beneficial in complex systems because it
eliminates the need to constantly synchronize data across different components.
Next, event producers. These are the entities that create
and publish events. Think of them
as announcing something newsworthy. Event producers
can be various components like UIs,
microservices, IoT devices, and other enterprise services.
They can also be different kinds of SaaS applications.
Then, event consumers: on the receiving
end, we have event consumers, the
downstream components that get triggered by events.
An event can have multiple consumers, each reacting in
its own specific way. Event consumption can
involve starting workflows, running analytics, or updating
databases based on the received event. And the last
point is event brokers.
Imagine event brokers as the communication hub.
They act as intermediaries between producers and consumers,
managing the publishing and
subscribing of events. They buffer communication, ensuring producers
and consumers don't need to be in sync with each other. Event brokers
come in two forms, I mean event routers and event stores:
event routers actively push events to their subscribed consumers,
and event stores let consumers
pull events on demand. By understanding these key concepts
well, you're on your way to leveraging the power of
event-driven architectures in your applications. Now let's talk
about coupling: how tightly connected different parts of your
application are. On one end of the spectrum
we have tight coupling. So let's see how this
plays out: development challenges, scalability issues,
reduced fault tolerance. There is a lot of complexity.
So I mean, look at the image on the slide.
Imagine a traditional e-commerce application where order processing,
billing, shipping, and inventory all rely on
synchronous calls: a single service failure could
disrupt the entire flow. This is where event-driven architecture
comes in. EDA promotes loose coupling.
Here, components communicate through events instead of direct calls.
This approach offers significant advantages, which we
will explore in the next slide. I mean,
this is the power of events.
So let's see how event-driven architecture
addresses the challenges of tight coupling with loose coupling, making components
communicate through independent events instead of relying on direct
calls. With loose coupling,
components publish events about their state changes without requiring
an immediate response from others. Think of it as sending out messages,
as announcements: components declare what happened,
and interested parties can react accordingly, in
their own specific way.
In an event-driven approach, components only need to
be aware of the events they publish and subscribe to,
not the internal workings of other components. As long as
the event format remains consistent,
changes in one component won't affect other components.
Idempotency is a crucial part.
Imagine pushing a button and the action happens exactly
once, even if you press it multiple times.
That's the essence of idempotency: ensuring an operation
produces the same outcome after the first successful execution, regardless of
retries. So why is idempotency
important here? This property becomes
especially important in event-driven architectures
when dealing with retries,
a common practice for handling potential failures.
For example, let's say a Lambda function is triggered by an
OrderPlaced event. If the function encounters
an error during the initial execution, because
of a server failure, an internet issue, or anything
else, the Lambda service might automatically retry the invocation,
right? Without an idempotency safeguard, these retries could lead
to serious issues:
duplicate orders, corrupted data, anything else, you name
it. So how do we achieve idempotency
in this scenario? We can include
a unique identifier within each event as
an idempotency key. This key allows the system to
recognize if an event has already been processed,
preventing unintended consequences from retries. By incorporating
idempotency, you ensure data consistency and reliable
operations in your event-driven architecture.
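To make this concrete, here is a minimal Python sketch of an idempotency check. The event shape, function names, and in-memory store are my own illustrative assumptions; a real Lambda consumer would typically back the key store with something durable like DynamoDB.

```python
# Idempotency sketch: process each event at most once.
# The in-memory set stands in for a durable idempotency store.
processed_keys = set()

def handle_order_placed(event):
    key = event["idempotency_key"]  # unique identifier carried by the event
    if key in processed_keys:
        return "skipped"            # retry detected: do nothing
    processed_keys.add(key)
    # ... real side effects (create order, charge payment) would go here ...
    return "processed"

event = {"idempotency_key": "order-123", "total": 42}
print(handle_order_placed(event))  # first delivery: processed
print(handle_order_placed(event))  # retried delivery is a safe no-op
```

The key point is that the retry produces no second side effect, which is exactly what protects you from duplicate orders.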
So there are some common patterns of event-driven architectures.
First of all, point-to-point messaging.
Actually, it's a fundamental pattern.
It is like sending
a message with a specific recipient in mind:
this is the essence of point-to-point messaging.
In event-driven architectures, messages are often delivered asynchronously,
meaning the sender doesn't wait
for a response before continuing.
Queues like the one on the slide act
as the middle ground, like mailboxes. Producers,
I mean senders, put messages in the queue,
and consumers, the receivers, retrieve them
when they are ready, in their own specific
way. This asynchronous
approach ensures smooth communication
even if the receiver is temporarily unavailable.
Plus, these queues act as buffers,
preventing message loss if the receiver is overloaded,
right? There are several services that
can be used as message queues. Popular options include Amazon
SQS, I mean Simple Queue Service, and Amazon
MQ, powerful tools for reliable message delivery.
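As a rough sketch of the queue-as-mailbox idea, here is a tiny Python example where the standard library's `queue.Queue` stands in for SQS; the message shape and function names are my own assumptions.

```python
import queue

# The queue decouples producer from consumer, like SQS:
# the producer never waits for the consumer to be ready.
mailbox = queue.Queue()

def producer(order_id):
    mailbox.put({"type": "OrderPlaced", "order_id": order_id})

def consumer():
    # Drain whatever is buffered; the consumer reads at its own pace.
    handled = []
    while not mailbox.empty():
        handled.append(mailbox.get())
    return handled

producer(1)
producer(2)        # messages buffer even if no consumer is running yet
print(consumer())  # both OrderPlaced messages, in FIFO order
```

The buffering is what makes the communication asynchronous: the producer finished long before any consumer looked at the queue.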
And secondly, publish-subscribe messaging.
Publish-subscribe messaging is unlike point-to-point
messaging, where messages are targeted to a single consumer: publish-subscribe
messaging allows messages to be sent to multiple subscribers,
right? Instead of using queues, this pattern typically employs event routers
such as topics or event buses.
Examples of services supporting this pattern include
SNS, I mean Simple Notification Service, which is used for topics;
for event buses, Amazon EventBridge is used.
In detail, topics function
as simple hubs for distributing messages to subscribers, while
event buses can provide more complex routing based on message
attributes. Here on the slide we see
a blue and a green rule with message one and message two accordingly:
the event bus will check the rules, I mean the attributes,
and then send the messages to the targeted consumers, I mean
the subscribers. Now I'll talk about event
streaming. It involves continuous flows of events or
data, providing a way to abstract producers and consumers.
Unlike pub/sub messaging, where messages are pushed to consumers,
in event streaming consumers usually poll for new events,
so consumers maintain their own logic for filtering events and
keep track of their position in the stream.
Event streams can consist of individual events, like
location updates in a ride-share app, or data points collected over time
from IoT devices. The data source can be anything:
logs, business metrics, other AWS services.
And here in the middle, data streams, actually a subset
of event streams, represent data over time and are often used
for near-real-time data analytics
applications or data persistence use cases.
Services supporting event streaming
include Amazon Kinesis Data Streams and
Amazon MSK, I mean Amazon
Managed Streaming for Apache Kafka.
This is called Amazon MSK. Okay.
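A tiny sketch of the pull model, where each consumer keeps its own position in an append-only stream; the list-based stream and the names are illustrative assumptions, but Kinesis shard iterators work on the same principle.

```python
# Event streaming sketch: consumers poll and track their own position.
stream = []  # append-only log, standing in for a Kinesis shard

def publish(event):
    stream.append(event)

class StreamConsumer:
    def __init__(self):
        self.position = 0  # like a shard iterator / offset

    def poll(self):
        # Return everything since the last poll, then advance the offset.
        new_events = stream[self.position:]
        self.position = len(stream)
        return new_events

publish({"driver": "A", "lat": 23.8})
publish({"driver": "A", "lat": 23.9})

rider_app = StreamConsumer()
print(rider_app.poll())  # sees both buffered events
publish({"driver": "A", "lat": 24.0})
print(rider_app.poll())  # sees only the one new event
```

Note the inversion compared to pub/sub: the broker does not track subscribers at all; each consumer owns its offset, so two consumers can read the same stream at completely different paces.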
Next is choreography
and orchestration; these are common patterns.
Choreography and orchestration are the two models for how services
in a distributed application communicate with each
other. In the
choreography pattern, we can
say communication happens without a central controller:
events flow between services, and each service reacts
to events independently, without depending
on others. On the other hand, orchestration involves a central coordinating
service that controls the interaction and
order of service invocation. So we
can say that while choreography promotes decentralization and flexibility,
orchestration provides centralized control and coordination. Many applications
use a combination of both choreography and orchestration,
selecting the model that best fits
a specific use case; different use cases call for different patterns.
In this slide I will talk about bounded contexts.
A bounded context is a fundamental concept in domain-driven
design, representing a core pattern in its strategic
design approach; strategic design is dedicated to
handling complex models and teams efficiently.
The choreography pattern shines for communication between bounded contexts.
As the image on the
slide shows, there are two bounded
contexts, I mean the sales context and the support context.
Every
bounded context can have multiple
services. I mean, one
bounded context is one domain, and one domain can have
multiple services. For example, in the e-commerce example,
there are the order service and the inventory service. The order service focuses on
creating and placing orders, emitting an OrderPlaced event
with the relevant details. There are many events, but one
domain, the order service: this is one bounded context.
Another bounded context is the inventory service, I mean a
separate bounded context that subscribes to the events and manages stock
levels. Importantly, both services remain unaware
of each other's internals: the order service
simply sends the events, and the inventory service reacts
accordingly. This approach fosters loose coupling,
scalability, and flexibility as well. Event
buses such as Amazon EventBridge can be used for the
choreography. Okay, and now I'll
talk about orchestration in detail. Orchestration is
another key pattern in event-driven architectures.
It's particularly well suited for scenarios within
a bounded context where you need to control
the order of service calls, manage state, and handle errors or retries
effectively. To solve this problem, orchestration comes
in. For example, on the slide you
can see document processing for insurance claims. In
this example, consider the document-processing bounded context within an insurance claims
application. This context receives a DocumentUploaded
event, and an orchestrator service within the context first classifies
the uploaded document using a document classifier service. Based
on the classification, I mean driver's license or car image,
the orchestrator directs the workflow: if
it's a driver's license, an extract-driver's-license-info
service parses the information. Regardless of the
document type, the extracted data is
updated in a database. Finally, the document-processing domain emits
a DocumentAccepted event with all the extracted details.
Orchestration provides a centralized
control mechanism for complex workflows, and
it ensures proper service execution order, state management,
and error handling as well, leading to more reliable
and maintainable applications. Some
services to execute these workflows: AWS Step Functions
and Amazon Managed
Workflows for Apache Airflow (MWAA). So choreography
and orchestration are complementary, I mean not mutually exclusive.
Many applications benefit from
using both patterns for different scenarios.
There are a few main points when both come together:
I mean producer choreography, producer orchestration, consumer choreography,
and consumer orchestration. Here are some key
points. On the producer side, the producer
emits events via EventBridge following
the choreography approach, and in the same line,
the producer orchestration part utilizes Step
Functions within its bounded context for orchestrating
API calls to Amazon API Gateway.
On the receiving end, multiple consumers subscribe to
events via the choreography approach, I mean SNS topics, app
clients, SQS, Lambda, Amazon API
Gateway, Application Load Balancer; and
on the same receiving end there
is one consumer orchestration, I mean one consumer also leveraging Step
Functions for internal orchestration within its bounded context.
And it follows the
same process. The previous slides explored
choreography and orchestration independently; however, their true power
lies in their ability to be used together within the same application.
As illustrated in the enhanced example, a producer
can emit events via EventBridge for choreographed
consumption by various services;
simultaneously, the producer can leverage Step
Functions within its bounded context to orchestrate API calls.
On the receiving end, consumers can subscribe to events choreographically,
while one consumer might also employ Step Functions
for internal orchestration within its own bounded context.
The key takeaway:
by strategically combining choreography and orchestration, we
gain a powerful toolkit for building scalable,
loosely coupled, and adaptable event-driven architectures. This
approach empowers us to effectively model complex
workflows and interactions within our applications.
There are a few combining patterns.
One of them is fan-out: distributing a single event
to multiple subscribers. While individual patterns can address specific needs,
the true power of events lies in
combining them strategically. The fan-out
pattern is a fundamental concept; it allows
a producer to send a single message to multiple
subscribed consumers.
For example, consider a social media notification
system where a user's post creation
triggers a fan-out event. This event might be
of interest to various consumers,
such as a service for generating activity feeds,
another service for sending push notifications to followers,
and a service for timeline updates. The fan-out pattern ensures
all these consumers receive
the same event, enabling them to perform their tasks efficiently.
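The fan-out idea can be sketched in a few lines of Python; the topic name and handlers are my own illustrative assumptions, with SNS playing the topic role on AWS.

```python
# Fan-out sketch: one published event reaches every subscriber of a topic.
subscribers = {}  # topic name -> list of handler functions

def subscribe(topic, handler):
    subscribers.setdefault(topic, []).append(handler)

def publish(topic, event):
    for handler in subscribers.get(topic, []):
        handler(event)  # every subscriber gets the same event

feed, pushes = [], []
subscribe("post.created", lambda e: feed.append(e["post_id"]))
subscribe("post.created", lambda e: pushes.append(e["post_id"]))

publish("post.created", {"post_id": 7})
print(feed, pushes)  # both consumers received post 7
```

The producer publishes once and has no idea how many subscribers exist, which is what lets you bolt a new feature onto an existing event without touching the producer.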
Now I'll talk about event filtering and routing. It directs
events to specific targets based on predefined
criteria; it improves message relevance for consumers
and reduces unnecessary processing. I mean, event filtering and
routing is a cornerstone of flexible and
targeted communication in event-driven architecture. It enables us
to define criteria that determine which events get delivered to specific
consumers. This ensures that consumers only receive events relevant
to their domain functionality and
avoid getting unnecessary messages. So EventBridge can filter
and route events based on predefined
criteria, ensuring only the relevant events reach specific consumers.
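Conceptually, a rule lists allowed values per field, and an event matches if every field holds one of them. Here is a simplified sketch of that matching in Python; it covers only flat, exact-value patterns, a small subset of what real EventBridge event patterns support, and the rule and event contents are invented.

```python
# Simplified EventBridge-style filtering: a rule maps each field to a
# list of allowed values; an event matches if every field is allowed.
def matches(rule, event):
    return all(event.get(field) in allowed for field, allowed in rule.items())

blue_rule = {"source": ["orders"], "detail-type": ["OrderPlaced"]}

e1 = {"source": "orders", "detail-type": "OrderPlaced", "order_id": 1}
e2 = {"source": "payments", "detail-type": "PaymentReceived"}

print(matches(blue_rule, e1))  # True  -> routed to this rule's target
print(matches(blue_rule, e2))  # False -> filtered out
```

This is the mechanism behind the blue and green rules on the slide: each rule forwards only its matching messages to its own targets.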
And now I'll talk about event and message buffering.
It utilizes queues as a buffer to manage the message
volume for downstream consumers. It ensures messages are
delivered reliably, even if consumers are temporarily unavailable
or overloaded. So here the event and message buffering
pattern comes in: it promotes asynchronous
communication and improves overall system resiliency.
Now I'll talk about workflow orchestration, and it is
the most important part of this session.
This slide demonstrates how Step Functions
can be used to model a KYC workflow,
promoting an event-driven approach with loose coupling and low-code integration
via EventBridge. In
step number one, we have a new-account-request
event: the process begins with the
account system publishing an event signaling a new
account request. This event triggers the KYC workflow's execution within a dedicated KYC
service. The second step is KYC verification;
this might involve tasks like
identity verification and risk profile assessment.
Then step three, identity check completion, and
step four, conditional events based on the risk assessment. I mean,
after the risk profile assessment, the workflow publishes one of
two events depending on the outcome:
either AccountApproved or AccountRejected. And
step five, event consumption by downstream services, like both the account
and customer service domains; there are two domains actually.
So I have defined rules on the event bus to process
this KYC workflow's outcome events. I mean,
the account domain likely handles account
creation or further processing based on the AccountApproved
event, and the customer service domain might be notified for
potential outreach to the customer based on the AccountRejected event.
From this example we have a summary of the
benefits of Step Functions workflows in event-driven architecture.
This approach promotes loose coupling, as services don't
rely on direct communication; they simply publish
or subscribe to relevant events on the event bus. Another
thing we can see in this example is the modular workflow:
the Step Functions workflow orchestrates the KYC process in a defined sequence,
ensuring a clear and maintainable workflow.
And there's also low-code integration: utilizing EventBridge
simplifies integration between the KYC service and other domains, requiring
minimal custom code. So this is the
low-code integration, actually. And by leveraging Step Functions and EventBridge,
as we can see here, financial institutions
can establish robust and scalable KYC workflows, ensuring
regulatory compliance and efficient customer onboarding.
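The KYC flow described above can be sketched as a tiny orchestrator that runs the steps in a fixed order and then publishes one of the two outcome events. All of the step logic, field names, and the risk threshold here are invented for illustration; in the talk, this role is played by a Step Functions state machine publishing to EventBridge.

```python
# Orchestration sketch of the KYC workflow: fixed step order,
# then one of two outcome events. All step logic is illustrative.
published = []

def publish(event_type, detail):
    published.append({"type": event_type, "detail": detail})

def kyc_workflow(request):
    identity_ok = bool(request.get("document"))  # identity verification step
    risk = request.get("risk_score", 100)        # risk profile assessment step
    if identity_ok and risk < 50:                # conditional outcome
        publish("AccountApproved", {"customer": request["customer"]})
    else:
        publish("AccountRejected", {"customer": request["customer"]})

kyc_workflow({"customer": "alice", "document": "passport.png", "risk_score": 10})
print(published[-1]["type"])  # AccountApproved
```

Notice the shape: the orchestrator controls the step order and branching internally, but to the outside world it only speaks events, which is what keeps the account and customer service domains loosely coupled.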
So far we have explored various aspects of event-driven
architectures. Now let's introduce AWS Step Functions.
It's a serverless orchestration service that can seamlessly integrate
with event-driven architecture. There are some key
components of a Step Functions workflow,
I mean state machines, steps, and Task states.
A state machine represents your
entire workflow. There are a few types
of states, I mean Choice states,
Parallel states, and so on; essentially it's a series of event-driven
steps. Another
component: each step within the workflow is
called a state. Another component is the Task state. This state
represents a unit of work executed by other AWS services,
such as invoking a Lambda function, and tasks can interact
with any other AWS service, like SQS,
SNS, SES, API Gateway,
and anything else.
The core benefit of Step Functions is the visual workflow design:
the graphical console provides a
clear view of the application's workflow,
making it easier to understand and
manage complex event-driven processes.
Use cases for Step Functions could be
machine learning model workflows, ETL workflows, long-running
workflows, and so on. Step Functions provides
two main ways to interact with other services within
our workflows, I mean SDK integrations
and optimized integrations. And there are two
types of workflows in Step Functions, I mean
Standard and Express. For Standard workflows, the execution
time is up to one year
with exactly-once workflow execution,
and the pricing is per state transition.
As for use cases,
Standard is ideal for long-running, auditable
workflows where execution history and visual
debugging are crucial. Express workflows have
at-least-once workflow execution, and
the execution time is up to five minutes. As for
pricing, it's by the number and duration of executions, and
the use cases for Express workflows:
perfect for high-event-rate workloads such as streaming
data processing and IoT data ingestion, et cetera.
To choose the right workflow type between
Standard and Express, it actually depends on
your specific needs. Standard workflows are ideal for
scenarios requiring strict execution
order, auditability, and long-running processes. On the other hand,
Express workflows excel in high-throughput scenarios where
rapid event processing,
streaming data processing, and IoT data ingestion are essential.
All right, so let's dive into some practical applications for
Step Functions.
First of all, function orchestration.
Imagine a complex workflow like processing a customer order:
Step Functions excels at orchestrating these
tasks. You can define a sequence of Lambda functions where each
function performs a specific step, I mean calculate
total, update inventory, trigger shipment, et cetera. Step Functions visualizes
these interactions, ensuring everything happens in the correct
order, and you can easily verify the flow. Then there's
branching with the Choice state. Step Functions
allows you to incorporate decision making into your workflows. For instance,
in a credit card application process, a Choice state can evaluate the requested
credit limit. If it's below a threshold,
the application can be automatically approved;
however, exceeding the threshold can route the application for manual review
by a manager. So that's the branching example.
And for error handling, Step Functions offers mechanisms
to handle errors smoothly. For
example, in a customer registration where the chosen username is unavailable,
a Retry could automatically attempt registration
with a slightly modified username a couple of times.
Alternatively, a Catch could intercept
the error and suggest alternative usernames for the customer to choose from.
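In Amazon States Language, Retry and Catch are fields on a Task state. Here is a minimal sketch of such a state; the state names, the error name `UsernameUnavailable`, and the Lambda ARN placeholder are my own assumptions.

```json
{
  "RegisterUser": {
    "Type": "Task",
    "Resource": "arn:aws:lambda:region:account:function:register-user",
    "Retry": [
      { "ErrorEquals": ["UsernameUnavailable"],
        "IntervalSeconds": 1, "MaxAttempts": 2, "BackoffRate": 2.0 }
    ],
    "Catch": [
      { "ErrorEquals": ["States.ALL"], "Next": "SuggestAlternatives" }
    ],
    "Next": "Done"
  }
}
```

Retriers run first, matching specific error names; if all attempts are exhausted, a catcher such as the `States.ALL` one above routes execution to a fallback state. The built-in `States.Timeout` error name can be matched the same way, which is what the timeout demo later in this session relies on.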
Step Functions can even integrate human
interaction. Suppose you have a process requiring
manager approval. Step Functions can
utilize callbacks and task tokens to send
the task to a Lambda function that might notify the manager and
wait for the decision before continuing the workflow.
This is useful for scenarios where human intervention is necessary.
There are other use cases: parallel processing and dynamic parallelism
with the Map state. For parallel processing,
imagine converting a video file into different resolutions for
various devices. The Parallel state allows you to distribute this workload
among multiple Lambda functions simultaneously,
significantly speeding up the transcoding process compared
to doing it one resolution at a time. And
dynamic parallelism works
with the Map state. The Map state empowers us to
process collections of items in parallel using Lambda
functions. A good example is an order fulfillment scenario,
where you might have multiple items that need to be prepared
for shipping. I mean, the Map state can trigger Lambda
functions to process each item
concurrently: checking availability, gathering the item, packaging it for
shipment, et cetera. So now let's see the demo
on error handling, of course, and then I'll get back to the slides.
For error handling with a custom error, the Lambda function throws
a custom error with throw new Error; this is the custom error we have created.
In Step Functions we have two state machines,
I mean an error-handling state machine with
Retry and one with Catch. In the resource section
I have to copy the Lambda ARN and paste it;
if the ARN is mismatched, an error comes up. In the
error-handling parameters we can check the
execution input and output and
the Step Functions workflow. So this is a custom error,
and here the output comes
from the Lambda function. Start the
execution; to view the custom error message after
starting the execution, select the state in the Graph inspector panel and review
the input tab.
Now let's handle the failure using Catch.
Task, Map, and Parallel states may contain a field
named Catch. Start the execution of the task with the error:
copy and paste the Lambda ARN into
the error-handling state machine with the Catch Lambda.
Now the custom error is thrown, and the next workflow step is the
custom error fallback. When
it catches this error, it passes flow control to the fallback
step, CustomErrorFallback. Select
the CustomErrorFallback state in the Graph inspector panel and review the
input and output.
The custom error fallback message is:
this is a fallback from a custom Lambda function exception.
So this workflow works exactly
as expected. I mean, it should
show "this is a fallback from a custom Lambda function
exception", and it does. Next, let's use Catch for
a timeout error with the error-handling
timeout-function Lambda. Copy the ARN of this
Lambda and put the ARN in the
state machine where I want to trigger the error:
I mean, the function sleeps past the configured timeout, the Catch has
ErrorEquals set to States.Timeout, and the next step is the
timeout fallback. So let's see what happens.
Click Save and execute: accept the default input and
click Start execution. To view the
input of the fallback state, select the TimeoutFallback
step and check its
input and output: this is a fallback from a timeout error.
It works perfectly, as expected.
Okay, and after finishing this demo,
I'm deleting this stack. All right, let's talk
about best practices to consider while
building EDA applications. Firstly, event storming. It's a collaborative
technique that helps us visually map out a
system's behavior and identify the critical
events in event-driven architectures. The idea is to bring together
stakeholders from different areas of the system. We then facilitate
a workshop where we all work together to visualize
and discuss the system's events and actions.
The main goal here is to get everyone on the same page,
everyone under the same umbrella,
about how the system works, and to pinpoint those critical business events.
Next: notification events and ECST events, I mean event-carried
state transfer events. Notification events
are concise messages informing consumers that something
has happened. ECST events take a different approach:
they act more like data carriers,
containing a richer payload
with more information relevant to downstream consumers.
So imagine a user placing an order: in
an event-driven architecture we can create an OrderPlaced event
to notify downstream systems like inventory management or payment processing
that a new order exists. Notification events essentially
act as messengers,
informing interested consumers that something has happened.
On the contrary, ECST events
carry more information, more metadata,
the full detailed data. And another best
practice is the conformist pattern, a straightforward approach for
consuming events in event-driven architectures. In this pattern,
downstream consumers directly use the events published
by producers
without any modifications or transformations.
And another point: the ACL, the anti-corruption layer, I mean protecting
domain boundaries. Let's say two bounded contexts have distinct
data models, business logic, and potentially even language. An ACL
acts as a mediator between these contexts. It provides a
translation layer that safeguards data from one domain
against corruption or misuse when consumed by
another domain. And another best practice is OHS,
I mean shared language for communication.
OHS is the open host service pattern.
It promotes communication between bounded contexts by
establishing a shared public language. I mean, this language acts as a common ground
for data exchange, defined through
agreed-upon interfaces, contracts, schemas, or
anything like this. And another perspective
is event-first thinking. Remember,
event identification and
design are an ongoing process, right? So regularly
revisit your events to ensure
they remain relevant to evolving business events as well as
business needs. The final best practice
is ordered and unordered events.
Not all events need to arrive in a specific sequence; however,
some scenarios require a guaranteed order for
events to be processed correctly, and understanding
this distinction is vital for designing your EDA effectively.
For example, consider a scenario where an OrderUpdated
event must be processed only after the corresponding OrderCreated
event arrives. To achieve this,
you can leverage services that guarantee event order. Here are a couple
of options on AWS: with Kinesis Data Streams, order
is preserved for messages within a shard, and with Amazon
SQS FIFO, I mean first in, first out, events are delivered
in order within a message group ID. And
here are some helpful resources to keep your learning journey on event-driven
architectures going. I hope you found it informative,
and feel free to reach out if you have any
questions. So yeah, have a good day.