Transcript
Hi, and welcome to this talk. I'm Francesco Tisiot, developer advocate at Aiven.
In this session we will check how you can build event driven applications using
Apache Kafka and Python. If you are here at Conf42,
I believe you are somehow familiar with Python, but on the other
hand you may start wondering: what is Kafka? Why should
I care about Kafka? Well, the reality is that
if you are into Python, you are somehow either creating
a new shiny app or you inherited an old rusty
app that you have to support, maybe extend,
maybe take to the new world, no matter if it's new
or old, if it's shiny or rusty.
I've never seen an application, a Python program working in complete
isolation. You will have components written in Python that need to talk with each other, or you
are exposing this old application to the world. So you have this
old application talking with another application. Let me tell you a secret.
Kafka is a tool that makes this communication
easy and reliable at scale.
Why should we use Kafka? Well, we used to
have kind of what I call the old way of building
applications, which were an application that at a certain
point had to write the data somewhere. Where did
the application write to? Well, usually it was a database,
but the application wasn't writing to the database every single record.
It was taking records, packaging a few of them up, and then pushing them to the database.
Or, in the same way, when it was reading from the database, it was
reading a batch of records and then waiting a little bit before reading the following batch.
This means that basically every time we were using such
a way of communicating, we were adding a
custom delay, which was called batch time, between when
the event was available in the application and when it was pushed to
the database, or when the event was available in the database
and when it was read by the application.
Now we are living in a fast world and we
cannot wait batch time. You can imagine batch time going from
a few milliseconds or seconds to minutes or
hours, depending on the application and the use case. Now we live
in a fast world and we don't want to wait the batch time.
We want to build event driven applications. Those are
applications that, as soon as an event happens in real
life, want to know about it; they want to start
parsing it in order to extract the relevant information. And probably
they want to push the output of their parsing to
another application, which will most likely be another event driven
application, creating a chain of those applications. And we want to do it
immediately. But let's take a step back.
Let's try to understand what is an event. We are
all used to for example mobile phones and we are all used to
notifications. Notifications tell us that an
event happened. We receive a message. We made a
payment with our credit card and we received the notification.
Someone else stole our credit card details
and made a payment. We receive a worse notification.
As you might understand, we cannot wait a batch
time of two hours, ten minutes, five minutes.
We want to know immediately that someone stole our credit
card, and we, as persons, react as
an event driven application by immediately phoning
up our bank to block the credit card. This is why events
and event driven applications are important. But even without going into
the digital world, we are used to events happening in the real life
since a long time. Just imagine when your alarm beeps in
the morning: you wake up and you act as an event driven application.
You are not only passively receiving events, you are creating events. Just think:
when you change the time of your alarm, that will change your future,
your actions in the future. Well, going back to mobile
phones, especially in this time of pandemic, we have all been used to
for example ordering food from an app.
Well from the time that you open the app, you select the restaurant,
you select which pizzas you want, then you create an order. This will
create a chain of events because the order will be taken from the app
and sent to the restaurant which will act as an event driven application
and create the pizzas for you. And once the pizza is ready, boom. Another event
for probably the delivery people to come and pick it up
and take it to your place. So why are event driven applications important? Because, as I said, we live
in a fast world and the value of the information is strictly
related to the time that it takes to be delivered. If we go back to
the credit card example, I cannot wait half an hour
or 5 hours before knowing that my card has been stolen.
I want to know immediately. But even if we talk about food and you know
I'm Italian, so I'm passionate about food. For example,
if we are waiting for our pizzas at home and we want to know
where the pizza is, where the delivery person is, the information
about the position of the delivery person is useful only if, from the
time it is taken on the delivery person's mobile phone to the time
it lands on the map on my mobile phone, the delay is minimal, say
10 seconds. I couldn't care less what the position was ten minutes
ago. The information has value only if
delivered on time. So we need to have a way
to deliver this information, to create this sort of
communication between components, in real
time, highly available and at scale. How can
we do that? Well, we can use Apache Kafka. What is Apache
Kafka? Well, the idea of Apache Kafka is really,
really simple. The basic idea is the idea of a log file.
A log file where, as soon as an event is
created, we store it. So event number zero
happens: we store it as a message in the log file. Event number one happens:
we store it after event zero, then two, three, four and even more.
Kafka has the concept of a log file which is append only
and immutable. This means that once we store event zero
in the log, we cannot change it. It's not like a record in a database
that we can go and update it. Once the event is there,
it's there. If something changes in the reality that is
represented by event zero, we will store it
as a new event in our log. But of course we know that events
take multiple shapes and different types. Just think:
I could have events regarding pizza orders and I could have
other events regarding the delivery position. And Kafka allows me to
store them in different logs, which in Kafka
terms are called topics. Even more,
Kafka is not meant to run
on a single huge server. It's meant to be
distributed. So this means that when you create a Kafka instance,
most of the times you will create a cluster of nodes
which are called, in Kafka terms, brokers.
And the log information will be stored
across the brokers in your cluster not only one
time, but multiple times. The number of times that each log
will be stored in your cluster is defined by a parameter called replication
factor. In our case we have three copies of the first
log, so a replication factor of three, and two copies of
the second log, so a replication factor of two.
Why do we store multiple copies of each log?
Well, because we know that computers are not entirely reliable
so we could lose a node. But still, as you can see, we are
not going to lose any data. So let's have a deeper look at
what happens with Kafka. So we have our three
nodes, and let's go to a very simple version of it where we
have just one topic with two copies.
So replication factor of two. So now let's assume that
we have a failing node. What happens now?
Well, Kafka will detect that this node has failed
and will check which logs are available in my cluster. Well,
there is a log with only one copy but a replication
factor of two. So Kafka at that point will take care of
creating a second copy in order to keep the number of copies equal to
the replication factor. So once we have also the second copy,
even if now we lose a second node, we still are not
losing any information. So as of now we understood how Kafka
works and what Kafka is. But Kafka is something
to store events. What is an event for
Kafka? Well, for all that matters to Kafka, an event
is just a key value pair. A key value pair where
you can put whatever you want in key and value.
Kafka doesn't care. For Kafka it's just a series of bytes.
So you could go from very simple use cases where you
put a max temperature label as the key and 35.3 as the value itself.
Or you could go wild and use JSON formats in both the key and the value,
with the key describing the restaurant receiving your pizza order and the phone line
used to make the call, and the value containing the order id,
the name of the person calling and the list of pizzas. Usually the
payload, the message size, is around one megabyte, and you can use,
like in this case, JSON formats, which is really cool because I
can read through it, but which is on the other hand a little bit heavy when
you push it on the wire, because every field contains both the
field name and the field value. If you want to have a more compact
representation of the same information, you could use formats like Avro
or Protobuf, which detach the schema from the payload
and use a schema registry in order to tell Kafka how
I compacted my message. So when I
read the message, I can ask Kafka for the schema
and I can recreate the message properly.
So these are just methods; for all that matters to Kafka,
you are sending just a series of bytes. So how can
you send that series of bytes to Kafka? Well,
you are trying to write to Kafka, and in this case we will probably
have a Python application writing to Kafka, which is called a producer,
and just remembering the things that we said earlier,
it writes to a topic or multiple topics. In order to write
to Kafka, all the producer has to know is where to
find Kafka, a list of hostnames and ports, how to authenticate:
do I use SSL? Do I use SASL? Do I use other methods?
And then since I have, for example, my information available as
JSON objects, I need to encode that into
the raw series of bytes that Kafka understands. So I need to know how
to encode the information. On the other side, once I have my
data in Kafka, I want to read it, and if I want to read from
a topic with my Python application, that is called
a consumer. How the consumer works is that it will read
message number zero and then communicate back to Kafka:
hey, number zero done, let's move the offset to number one. It will
read number one and move the offset to number two. Read number two,
move the offset to number three. Why is moving the
offset, communicating back the offset, important? Well,
because we know, again, computers are not entirely
reliable, so the consumer could go down. So the next
time that the consumer pops up, Kafka still knows until
what point that particular consumer read in that
particular log. So the next time the consumer pops up, Kafka will probably send
it message number three, because it was the
first one not yet read by the consumer.
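As a rough sketch of this read-and-commit loop (assuming the kafka-python library used later in the demo; process is a hypothetical handler, and kafka-python actually commits offsets automatically unless you turn that off):

```python
# Conceptual sketch: read a message, handle it, then report the position
# back to Kafka so a restarted consumer resumes from the right offset.
for message in consumer:
    process(message)     # hypothetical handler for the event
    consumer.commit()    # "this offset is done, move on to the next one"
```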
In order to consume data from Kafka, the consumer has to know similar information as
the producer: where to find Kafka, hostname and port, how to authenticate.
Before, we were encoding; now we need to understand how to decode, and
we also need to know which topic or topics we
want to read from Kafka. So now, that was a
lot of content. Let's look at the demo.
What we will look at here is a series of notebooks that I built
in order to make our life easier. The first notebook that
I created is actually a notebook that allows me to create automatically
all the resources that I need for the follow-up within
Aiven. You can run this and you will have access to this series of notebooks
later on. But as of now, let me show you that I pre-created
two instances, one for Kafka and one for PostgreSQL, that we
will use later on. With Aiven, you can create
your open source data platform across many clouds.
In this case, we created a Kafka. Well, instead of showing you something that
is already created, let me create a new instance. As you can see,
you can create not only Kafka, but a lot of other open source data platforms.
And once you select which data platform you want to create, you can select which
cloud provider and, within the cloud provider, the cloud region,
so you can customize this as you need. At the bottom you can also select the
plan driving the amount of resources and the associated cost,
which is all inclusive. Finally, you can give a name to the instance and after
a few minutes, the instance will be up and running for you to have
a look. The good thing about Aiven is not only that you can create on
demand, but also that, if you have an instance like this,
you can upgrade it if a new version
of Kafka comes up, or you can change the plan to
upscale or downscale. Or you can migrate
the whole platform, while the service is online, to a different region
within the same cloud, or to a completely new cloud provider. So now,
instead of talking about Aiven, let's talk about how to create
and produce messages to Kafka. So let's start a producer.
The first thing that we will do is to install kafka-python,
which is the basic library that allows us to connect to Kafka.
And then we will create a producer. We create a producer by
saying where to find Kafka, a list of hostnames and ports,
how to connect using SSL and the three SSL certificates,
and how to encode the information, how to serialize it. So for both
key and value we will take the JSON and turn it
into a series of bytes encoded in ASCII.
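As a minimal sketch, the producer creation looks roughly like this (hostnames, ports and certificate file names are placeholders, not the exact values from the demo):

```python
import json
from kafka import KafkaProducer

# Placeholder connection details: replace with your own service URI
# and the three SSL files downloaded from your Kafka service.
producer = KafkaProducer(
    bootstrap_servers="kafka-host:12345",
    security_protocol="SSL",
    ssl_cafile="ca.pem",
    ssl_certfile="service.cert",
    ssl_keyfile="service.key",
    # serialize both key and value from JSON to ASCII-encoded bytes
    key_serializer=lambda k: json.dumps(k).encode("ascii"),
    value_serializer=lambda v: json.dumps(v).encode("ascii"),
)
```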
So let's create this. Okay,
and now we are ready to send our first message. We will send
a pizza order; again, I'm Italian, so pizza is key.
It's an order from myself, ordering a
pizza margherita. Okay, now the order, the message,
is sent to Kafka.
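The send itself is roughly the following sketch (the topic name and the exact fields are assumptions based on the narration):

```python
# Send one pizza order; key and value are plain Python dicts that the
# serializers above turn into bytes.
producer.send(
    "francesco-pizza",                      # assumed topic name
    key={"caller": "Francesco"},
    value={"caller": "Francesco", "pizza": "Margherita"},
)
producer.flush()  # make sure the message actually leaves the client
```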
How can we be sure about that? Well, let's create a consumer. Now let's
move the consumer on the right and let's close the list of
notebooks. We create a consumer. All we
have to say, apart from the group id that we will check later
and the client id, which I'm calling client one but could call whatever I want,
are the same properties in order to connect: hostname
and port, SSL with the three certificates,
and how to deserialize the data from
the raw series of bytes back to JSON, with the two deserializer
functions here. Okay, so let me create the consumer.
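A minimal sketch of that consumer, mirroring the producer settings (group id, client id and connection details are placeholders):

```python
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    bootstrap_servers="kafka-host:12345",
    group_id="pizza-makers",       # the group id we will come back to later
    client_id="client1",
    security_protocol="SSL",
    ssl_cafile="ca.pem",
    ssl_certfile="service.cert",
    ssl_keyfile="service.key",
    # deserialize key and value from raw bytes back to JSON
    key_deserializer=lambda k: json.loads(k.decode("ascii")),
    value_deserializer=lambda v: json.loads(v.decode("ascii")),
)
```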
Now I can check which topics are available in Kafka, and
I can see there are some internal topics together with a nice Francesco
pizza topic that I just created for this purpose.
I can subscribe to it and now I can start
reading. We can immediately see two things when
reading. The first one is that the
consumer thread never ends. This is because we
want to be there, ready: as soon as an order comes into
Kafka, we want to be there, ready to read it. And there is no
end time in streaming, there is no end date.
We will always be there, ready to consume the data.
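In code, the check-subscribe-read steps look roughly like this sketch (the topic name is an assumption):

```python
print(consumer.topics())                 # list the topics available in Kafka
consumer.subscribe(["francesco-pizza"])  # attach to the pizza topic

# Iterating over a KafkaConsumer polls forever: the loop never ends,
# it just waits for the next order to arrive.
for message in consumer:
    print(message.partition, message.offset, message.key, message.value)
```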
The second thing that we can notice is that, even if we sent the
first pizza order from Francesco, we are not receiving it
here. Why is that? Well, because by
default, when a consumer attaches to Kafka, it starts
consuming from the time that it attached to Kafka; it doesn't
go back in history. This is the default behavior and we will see how to
change this later on. But just bear in mind this is the default.
In order to show you that the whole pipeline, producer and consumer,
works, I'm going to send another couple of events:
an order for Adele with a pizza Hawaii, the pineapple
pizza, and an order for Mark with a pizza with chocolate.
I'm Italian, and both of those are not
what I would call right choices for pizza. However, I respect
your right to order whatever you want. Just try not to do that in
Italy. Okay, so let's produce the two orders
and if everything works, we should see them appearing on the consumer
side immediately. There we are. We see that both Adele's
and Mark's orders are appearing on the consumer side.
Our pipeline is working. So now let's go back
to a little bit more slides. Let's talk about the log size.
We want to send messages to a log. We want to send
huge messages, a huge number of messages, to a
log. But I told you that the log is stored on a broker.
Does this mean that we cannot have more messages
than fit on the biggest disk on the biggest
server in our cluster? Well, this is not going
to work. If we want to send massive amounts of events,
we don't want to have a trade off between disk space and
amount of data. We don't want to need to purchase
a huge disk in order to store the whole topic on one disk.
On the other side, we don't want to limit ourselves in the number of events
that we can send to a particular topic because of disk space. We are
lucky, because Kafka doesn't impose that trade off on us.
With Kafka we have the concept of partitions.
A partition is just a way of taking events of the same type, belonging
to the same topic, and dividing them into subtopics, sub-logs.
For example, if I have my pizza orders, I could
partition them based on the restaurant receiving the order, because I want
all the records for Luigi's restaurant, the blue ones,
to be together. But I don't really care what happens between the orders
of Luigi's restaurant and the orders of Mario's, the yellow ones, or
Francesco's, the red ones. Now, why are partitions
good for this kind of disk space trade off? Because the partition is
what is actually stored on a node. So this means that if we
what is actually stored on a node. So this means that if we
want to have a huge amount of events landing
in a topic, we just need more partition to fit
the wall topic into smaller disks. And again,
since it's distributed, we will have them stored across our
cluster in a number of copies.
In this case, the number of copies is the same for every partition, because the
replication factor is defined at topic level. And even if we lose a node again,
we will not lose any data from any of the partitions because it will be
available in the other copies in the other brokers. We said
initially that we push data to Kafka and then it
will be stored in the log forever. Well, this is not entirely
true, because we can set what are called topic
retention policies, so we can say for how long we want
to keep the data in Kafka. We could say that based on time. So we
can say, well, I want to keep the data on Kafka for two
weeks, six hours, 30 minutes, or forever.
Or we can say that based on log size. Basically,
I want to keep the events in Kafka until the
Kafka log reaches 10 GB, and then delete the oldest
chunk. I can also use both, and the first threshold that is
hit between time and size will dictate when I delete the
oldest set of records.
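As a hedged sketch, with kafka-python's admin client those retention settings can be passed when creating a topic; the topic name and thresholds below are illustrative only:

```python
from kafka.admin import KafkaAdminClient, NewTopic

admin = KafkaAdminClient(
    bootstrap_servers="kafka-host:12345",
    security_protocol="SSL",
    ssl_cafile="ca.pem",
    ssl_certfile="service.cert",
    ssl_keyfile="service.key",
)

admin.create_topics([
    NewTopic(
        name="pizza-orders",
        num_partitions=2,
        replication_factor=2,
        topic_configs={
            "retention.ms": str(14 * 24 * 60 * 60 * 1000),  # keep data for two weeks...
            "retention.bytes": str(10 * 1024 ** 3),          # ...or up to 10 GB per partition
        },
    )
])
```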
So we understood that partitions are good. How do you select a partition? Well, usually it's done with the key
component of the message. And what
Kafka does by default is that it hashes the key
and takes the result of the hash in order to select one partition,
ensuring that messages having the same key always land
in the same partition.
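Conceptually, the idea is the sketch below (not Kafka's actual murmur2-based implementation):

```python
def select_partition(key_bytes: bytes, num_partitions: int) -> int:
    # Same key -> same hash -> same partition, every time.
    return hash(key_bytes) % num_partitions
```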
Why is this useful? Well, it's useful for ordering. Let me show you
what happens when you start using partitions with this little example. I have
my producer which produces data to a topic with two partitions,
and then I have a consumer. Let's assume a very simple use case
where I have only three events, blue one happening first,
yellow one happening second, red one happening third. Now,
when pushing this data into our topic, the blue
event will land in partition zero, the yellow event will land in partition
one, and the red event will be in partition zero again.
Now, when reading data from the topic, it could happen,
it will not always be the case, but it could happen that I will read
events in this order: blue one first, red one
second, yellow one third. So if you check, the global ordering
is not correct. Why is that? Well, because when we start using partitions,
we have to give up on global ordering. Kafka ensures the correct ordering
only per partition. So this means that we have to start
thinking about for which events,
for which subset of events, the relative ordering
is necessary, and for which it is not. If we go back to our pizza order
example, it makes sense to keep all the orders of
the same restaurant together because we want to know which person ordered
before or after the other. But we don't really care if an
order for Luigi's pizza was done before another order for
Mario's pizza. So we understood that partitions
are good because they ease the trade off between disk
space and log size. But partitions are bad because we
have to give up on global ordering. But if you think
about partitions, and if you think about a log with a single partition,
it's just one unique thread appending one event
after the other. And you can think that the throughput is done by the single
thread doing the work. Now, if we have more partitions,
we have multiple independent threads that can append
data one after the other. So you can think, and I know that there
are other components involved, that the throughput of those three processes
is roughly three times the throughput of the original
process. So this means that we can have many more producers producing
data to Kafka, and also we can have many more threads consuming
data from Kafka. But still we want to consume all the events of
a certain topic, but we don't want to consume the same event twice.
How does Kafka handle that? Well, it does so by assigning
a non-overlapping subset of partitions to each consumer.
If these last few words didn't make a lot of sense for you, well,
let's check. In this demo we have two consumers and three partitions.
What Kafka will do is assign the top, the blue
partition, to consumer one, and the yellow and red partitions
to consumer two, ensuring that everything works
as expected. Even more, let me just focus
on this one. If consumer one now
dies, Kafka will understand that after a timeout and redirect
the blue arrow from consumer one which died,
to the consumer which is still available consumer two. So now
let me show you this behavior in a demo. Let me show you
partitioning in a demo. And this time let's create
a new producer which is similar
to the one above, nothing different, apart from the fact that now we
are also using the Kafka admin client to connect on the admin
side of Kafka and create a new topic
with two partitions. So we will force the number
of partitions here. Okay, now what we
want on the other side is we have a producer with
a topic of two partitions. Let's create two consumers that
are working one against the other to consume all
the messages from that topic. So let's move consumer
one here and consumer two
there. So I'm saying to Kafka
that I'm attaching those two consumers to the same topic.
Since I have two partitions and two consumers,
what Kafka should do is assign one consumer
to the top partition and one consumer to the bottom partition.
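A sketch of the two competing consumers: because they share the same group id, Kafka assigns each of them a non-overlapping subset of the two partitions (the helper below reuses the placeholder connection settings from the earlier sketches, and the topic name is assumed):

```python
def make_consumer(client_id):
    return KafkaConsumer(
        "pizza-orders-partitioned",          # assumed name of the two-partition topic
        bootstrap_servers="kafka-host:12345",
        group_id="pizza-makers",             # same group: the consumers compete
        client_id=client_id,
        security_protocol="SSL",
        ssl_cafile="ca.pem",
        ssl_certfile="service.cert",
        ssl_keyfile="service.key",
        key_deserializer=lambda k: json.loads(k.decode("ascii")),
        value_deserializer=lambda v: json.loads(v.decode("ascii")),
    )

consumer_one = make_consumer("client1")   # ends up with one partition
consumer_two = make_consumer("client2")   # ends up with the other partition
```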
If I go now, let me check that the
top one is started. Let me start also the bottom consumer.
So both consumers are started. Now let me
go back to the producer here and let me send
a couple of messages. If you remember what I
told you before, the partition is selected with the key.
So Kafka does a hash of the key and selects the partition.
What I'm doing here is using two records with
slightly different keys, id one and id zero, so I'm expecting them
to land in two different partitions. Let me try this
out. Exactly: I receive one record
on the top consumer and one record on the bottom consumer.
If you wonder what those two numbers mean, this means that the top consumer is
reading from partition zero, the offset zero. So the first record
of partition zero, the bottom consumer is reading from partition one,
offset zero, the first record of partition one. If I now
send another couple of messages from the
same producer, reusing the same keys, then, since, as I
told you before, messages with the same key
will end up in the same partition, I'm expecting Mark's order to
land in the same partition as Frank's, because they share the same
key, and the same for Jan and Adele. So let's
check this out. Run this, and, as expected,
Mark is landing in the same partition as Frank and Jan in the
same partition as Adele. So offset one,
second record of partition zero, and offset one, second record
of partition one. So let's check also the last bit.
What happens if a consumer now fails?
Kafka will somehow understand that after a timeout
and should redirect partition zero as well to the remaining consumer.
So let's check now what happens if I send another two events
to the topic? I'm expecting to read both of them at the bottom consumer.
Let's check this out. Exactly: now I'm reading both from
partition zero and partition one with the consumer
which was left alive. So everything is, as of
now, working as expected. Let's go back to a little bit
more slides. So as of now we saw a pretty linear
way of defining data pipelines. We had one or more
threads of producers, then Kafka, and one or more threads of consumers
that were fighting one against the other in order to read all
the messages from a Kafka topic. For example, if we go back to
the pizza case we had, those two consumers could be the
two pizza makers that are fighting one against the other in order to consume
all the orders from the pizza order topic. But they
don't want to make the same pizza twice. So they don't want to read the
same pizza order twice. However, when they read a message,
the message is not deleted from Kafka. This makes it available
for other applications to read. So for example, I could have my
billing person that wants to receive a copy of every order in order
to make the bill and it wants to read from the topic at its
own pace. How can I manage that? Well, with Kafka it's really simple.
It's the concept of consumer groups. So I define that the two
pizza makers are part of the same consumer group. And then I will create a
new application, called something like billing person, and
set it as part of a new consumer group, and Kafka
will understand that it's a new application and will start sending a copy of
the topic data to this new application, which will read at its own pace that
has nothing to do with the two pizza makers. Let's check this
out again. Let's go back to the notebook. And now
what we will see is we
will create a new consumer part of a new consumer group.
So if we go back to the original consumer we had
this group id that I told you that we would check later. Well,
it's time now. The top consumer's group was called pizza makers;
this is how you define event driven applications as being part of a consumer
group. The new consumer's group is called billing person,
so for Kafka this is a completely new application reading
data from the topic. If you remember, when
we had this original consumer, we managed to attach to the topic
but read not from the beginning of the topic, only from the time
when we attached to the topic. So we were missing
the order number one. Now with this new application
we also set auto_offset_reset equal to earliest. So we
say: we are attaching to a topic in Kafka and we want to read from
the beginning.
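A sketch of that second application: a different group id makes it a brand new consumer for Kafka, and auto_offset_reset tells it to start from the beginning of the topic (names and connection details are placeholders):

```python
billing_consumer = KafkaConsumer(
    bootstrap_servers="kafka-host:12345",
    group_id="billing-person",         # new group = new application for Kafka
    auto_offset_reset="earliest",      # read the topic from the very beginning
    security_protocol="SSL",
    ssl_cafile="ca.pem",
    ssl_certfile="service.cert",
    ssl_keyfile="service.key",
    key_deserializer=lambda k: json.loads(k.decode("ascii")),
    value_deserializer=lambda v: json.loads(v.decode("ascii")),
)
billing_consumer.subscribe(["francesco-pizza"])
```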
So when we now start this new application, we should receive the two messages above
plus also the first message, the original message of Francesco
order. There we are. We are receiving all the three messages since
we started reading from the beginning. Now if we go to the original
producer and we now send a new event to
the original topic, we should receive it in both, because those are
two different applications. Let's try this out. Exactly:
we receive the new order in both the top and the bottom application,
each one reading the data, because they are different applications reusing
the same topic. Now we understood also different
consumer groups. Let's check a little bit more.
What we saw so far was us writing some code
in order to produce data or to consume data. However, it's very
unlikely that Kafka will be your first tool, your first data
tool, in your company. You will probably want to integrate
Kafka with an existing set of data tools,
databases, data stores, any kind of data tool.
And believe me, you don't want to write your own code for each of those
connectors. There is something that solves this problem for you
and it's called Kafka Connect. Kafka Connect is a pre built framework
that allows you to take data from existing data sources
and put them in a Kafka topic, or take data from a
Kafka topic and push it to a set of data sinks.
And it's all driven by just a config file, and one
or more threads of Kafka Connect allow you to build
event driven applications. If we go back to one of the initial slides
where we had our producer producing data to a database, well,
we now want to include Kafka in the picture, but still we don't want
to change the original setup, which is still working. How can we include
Kafka in the picture? Well, with Kafka Connect and with a change
data capture solution, we can monitor all changes happening in
a set of tables in the database and propagate those changes as events,
as messages in Kafka. Very very easy. But also
we can use Kafka Connect in order to distribute events.
So for example, if we have our application that already writes to
Kafka and we have the data in Kafka in a topic,
well, is one team needing the data in a
database, any JDBC database? With Kafka Connect we can just
ship the data to that JDBC database. They want another copy in a
PostgreSQL database? There we are. They want a third copy in BigQuery?
Really easy. They want a fourth copy in S3
for long term storage? Just another Kafka Connect thread.
And again, what you need to do is just define a
config file saying from which topic you want to take the data and
where you want to bring it. And with Aiven,
Kafka Connect is also a managed service, so you just have to figure out
how to write the config file. So let's check also this as a
demo. If we go back to our notebook
we can now check the Kafka
Connect one. Let me close a few of my notebooks.
What we are doing here, we are creating a new producer. What we will
create now is a different topic and within each message
we are going to send both the schema of the
key and the value and the payload of the key and the value. Why do
we do that? Well, because we want to make sure that our Kafka
connect connector understand the structure of the record and
populates a downstream application that in this case it's postgres table
properly. So we define the schema of the key and
the value. And now we pass, we create three
records. We push three messages to Kafka containing both the schema
and the value itself. With Frank ordering a pizza margarita,
Dan ordering a pizza with fries and Jan ordering a pizza with mushrooms.
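One of those messages looks roughly like this sketch: the Kafka Connect JSON converter expects a schema plus payload envelope so the sink knows which columns and types to create (field names here are illustrative):

```python
order_value = {
    "schema": {
        "type": "struct",
        "fields": [
            {"field": "name",  "type": "string", "optional": False},
            {"field": "pizza", "type": "string", "optional": False},
        ],
    },
    "payload": {"name": "Frank", "pizza": "Margherita"},
}
# Assumed topic name; the key would carry a similar schema/payload envelope.
producer.send("francesco-pizza-schema", value=order_value)
producer.flush()
```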
Okay so we push the data into a topic.
Now we want to take the data from this topic and push it
to Postgres. In order to do that, I have everything scripted,
but I want to show you how you can do that with the Aiven web
UI. If I go to the config I can
check that there is a Kafka Connect setup waiting for me.
So this is the config file that you have to write in order to send
data from Kafka to any other data target. So let's check this out.
What do we have to show? We have to send the
data to a Postgres database using this connection, using a
very secure new PG user and a new password, one, two,
three, very secure. And what do we want to send to this
Postgres database? A topic called Francesco
pizza schema, and we are going to call this connector sink
Kafka Postgres. Additionally, we want to tell it that the
value is a JSON and that we are using a JDBC sink
connector, so we are pushing data to a JDBC target, and please, if
the target table doesn't exist, auto-create it.
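Expressed as a Python dict, the connector configuration is roughly the following; the connector class, credentials and hostnames are placeholders and depend on the Kafka Connect plugin installed:

```python
connector_config = {
    "name": "sink_kafka_postgres",
    "connector.class": "io.aiven.connect.jdbc.JdbcSinkConnector",  # assumed JDBC sink class
    "topics": "francesco-pizza-schema",
    "connection.url": "jdbc:postgresql://pg-host:5432/defaultdb?sslmode=require",
    "connection.user": "new_pg_user",
    "connection.password": "password123",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "auto.create": "true",   # create the target table if it doesn't exist
}
```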
Okay, let me show you the database side.
I have a Postgres database, called 42, that
I'm connecting to. It has a default DB, and in there the schemas,
the public schema, and a set of tables, which is empty.
Okay now let me take the config file that
I've been talking to you about earlier on, and let me go
to the Aiven web UI. I'm going into my Kafka
service. There is a connectors tab where I can create a new connector.
I select a JDBC sink, since I'm sinking the data to a JDBC database,
and I could fill in all the information here. Or, since I have my config file,
I can copy and paste it into the configuration section, and
it parses the information and fills in all the details as
shown before: I'm creating a connector with name sink Kafka Postgres,
class JDBC sink connector, the value is a JSON,
and I'm sending the Francesco pizza schema topic.
So now we can create a new connector and the connector is
running. So this means that now the data should be available in Postgres.
Let me check that out. We can see that my database
seemingly still shows no tables; let's refresh the list of tables, and I
see my Francesco pizza schema table, which has the same name as the topic.
If I double-click on it and
select the data, I can see the three orders from
Frank, Dan and Jan being populated correctly.
If I go back to the notebook, let's
go back to our Kafka Connect notebook and send
another order, for Giuseppe ordering a pizza Hawaii. This
is sent to Kafka. Let's go back to the database, let's
try to refresh, and also the order for Giuseppe ordering a pizza
Hawaii is there. So Kafka Connect managed to create the table,
populate the table, and keeps populating the table as soon as a
new row arrives in the Kafka topic.
So, going back to the last few slides: I believe
in less than an hour we saw a lot of things.
We saw how to produce and consume data, how to use partitions,
how to define multiple applications via multiple consumer groups
and what Kafka Connect is and how to make it work.
Now if you have more questions or if you have any
questions, I will give you some extra resources. First of
all, my Twitter handle: you can find me at ftisiot. My messages
are open, so if you have any question regarding any of the content that I've
been talking to you about so far, just shout. Second thing, if you want a
copy of the notebooks that I've been showing you, you can find
them as an open source GitHub repository.
If you want to try Kafka but you don't have a nice streaming data
source that you can use to push data to Kafka, well, I created a
fake pizza producer which creates even more complex
fake pizza orders than the ones that I've been showing you. The last one:
if you want to try Kafka but you don't have Kafka,
well, check out aiven.io, because we offer it as a managed
service. So you can just start your instance.
We will take care of it, and you will only have to take care of
your development. I hope this session was useful for you to
understand what Kafka is and how you can start using it with Python.
If you have any questions, I will be always there available for
you to help. Just reach out on LinkedIn or Twitter. Thank you very much and
Ciao from Francesco.