Transcript
This transcript was autogenerated. To make changes, submit a PR.
Hello everyone, and welcome to my presentation. Thanks for attending and thanks to
the organizing team of Conf 42 Python 2021 for putting together this great
event and for giving me the opportunity to present at it. Today I'm going to
talk about delivering successful API integrations with
Python. But before I do that, let me introduce myself.
My name is Jose and I work as an independent contractor based in
London. I work as a full stack developer, so usually working
both with front end applications and with backend applications as well.
And for the past few years I've had a long term affair with microservices
and APIs. So I've helped multiple clients build microservices architectures
and deliver API integrations. Along the way, I've made multiple mistakes, but
I've also learned a great deal about what it takes to deliver successful API
integrations.
As a result of this experience, I got the idea of writing a book
about developing microservice APIs with Python. The book is published
by Manning and it's currently available through the Manning Early Access
Program. If you want to get a copy of the book, you can use the following
discount code to get a 35% discount off the
price. I'm also the founder and creator of microapis.io. This is a website
that makes it easy to launch an API mock server with one click, with no signup
or login required. All you need to do is paste the API specification into the
input panel on the left side here on the website. This will trigger a
validation process of the API document on the server, and if the document is
valid you will be able to click on this button to get the mock server. You
will get a base server URL for the API as well as a sample URL from the
specification, which you can click to test that the server is working
correctly.
If you would like to connect with me, I try to be active on all of these
platforms: Twitter, GitHub, Medium and LinkedIn. So if you would like to
discuss anything related to software, microservices, APIs or Python, please
feel free to reach out to me on these platforms. So what's the agenda for
this presentation? First of all, I'm going to talk about the complexity of API
integrations. Why are they so difficult? Why is it so tricky to make them
succeed from the start? Then I'm going to introduce documentation-driven
development as an approach that can help us address the complexity of these
integrations and reduce the chances of failure. I'm going to show you how you
can introduce documentation-driven development within your API development
workflow by walking you through a simple example of a to-do application with a
REST API in front of it, which we are going to build with FastAPI. We're going
to test and validate the API with Dredd, and we are going to run mock servers
based on the specification. I'm going to show you how you can include this
process within your continuous integration server, and we are going to deploy
all of this to Heroku. If you're interested in checking out the code for this presentation,
it's available in my personal GitHub repository in the repo
that I prepared for this presentation. So why are API integrations so
tricky? If you picture the typical situation of
a team working on an API integration, we usually have a
team working on the API backend and a team working on
the API clients. The client can be anything from a frontend application
running in the browser to a mobile app or another microservice.
Typically what happens is both teams will be working in
parallel, and at some point we release the API server,
then we release the API clients, we get them talking
to each other and the integration fails. And this failure
can happen for multiple reasons, but very often it happens for simple
things like a misunderstanding between the frontend and
the backend about what the format of a certain field could be. In this
case, I'm showing the example of a misunderstanding with a date
field, which the client is sending with slashes separating
the different elements of the date, while the server is expecting dashes
in the date field. These kinds of errors are very common and they lead to all
sorts of problems in our API integrations.
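To make this concrete, here is a minimal hypothetical Python sketch of the
kind of date-format mismatch just described (the field name and formats are
illustrative, not taken from a real integration):

    from datetime import datetime

    client_payload = {"due_date": "2021/06/15"}  # client sends slashes
    SERVER_FORMAT = "%Y-%m-%d"                   # server expects dashes

    try:
        datetime.strptime(client_payload["due_date"], SERVER_FORMAT)
    except ValueError as error:
        # the server rejects the payload and the integration fails
        print(f"Invalid date: {error}")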
Now, how do we end up in this situation? What takes us here?
And obviously there is a large variety of factors that can
affect the outcome of our API integrations.
But when it comes to failures in communication between the frontend and the
backend, usually everything boils down to one of the following three factors.
It can be the case that we don't have documentation at all, and instead we are
working with JSON examples.
Now JSON examples are a very common resource when working
with API integrations, but they don't offer sufficient coverage of the
different payloads which are accepted or returned by the server. So in these
situations it is very difficult for
the API client development team to figure out the integration with the
server. In some other cases we have something like documentation,
but it's not really documentation, or it's not written in a standard format.
What I mean by this is that we often end up writing API documentation in
something like Google Docs, SharePoint or Confluence, and we do that in
non-standard formats. Or again, maybe we include JSON examples instead of
actual schemas of the API. The problem is that this type of documentation
cannot be leveraged with the whole ecosystem of tools and frameworks that
exists to make our life easier when it comes to maintaining and releasing our
APIs, and it doesn't help us test and validate our implementations, which
makes it difficult to manage our API integrations. Another problem
we often find as well is we don't have a proper design stage
of the API. So instead of having the frontend team sitting together
with the backend team and talking about what the API should look like
and what the requirements should be, often the API is kind of an afterthought
of the backend development process, and we end up releasing a messy collection
of random URLs and resources on the server, which is difficult for the client
development team to understand and figure out how to use in the clients. Now,
to avoid these situations, obviously the best thing is to use
proper API documentation, and what actually counts as API
documentation depends on the type of API that we are using
or the type of protocol. So if we are working with REST APIs, we want to use
the OpenAPI specification format. If we are working with a GraphQL API, that
should be the Schema Definition Language, and if we are working with something
like gRPC, we should be expecting the documentation to come in the form of
protocol buffers. So each protocol or type
of API has its own format, and we should be making an
effort to use that format, because then we will be able to use various tools
and frameworks to make our life easier when working with these
APIs. Now, what is documentation-driven development? This is an approach that
you may already know under different names. It's also known as design-first,
API-first or contract-first API development. The basic idea is that we design
and produce the API specification first, before we start working on the API client
and the server. Then we build the API client and
the server against the specification, and we use the specification
to validate our code before we release it. Now,
how does this work in practice? I'm going to show you with a simple example.
Like I said before, I prepared a simple example of a to-do application. It's a
very simple application that allows us to create a task. We can update a task,
we can delete it, or we can retrieve the details of one task or multiple tasks
from the server. The specification is written in the OpenAPI 3.0 format and
it's written in YAML. But to get a better visualization of it, let's move on
to Swagger UI.
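Before we do, to give you an idea, here is a minimal hypothetical sketch of
what the start of such a specification might look like in YAML (the paths and
descriptions here are assumptions; the actual file in the repo is more
complete):

    openapi: 3.0.0
    info:
      title: To-do API
      version: 1.0.0
    paths:
      /todo:
        get:
          summary: Returns the list of tasks
          responses:
            '200':
              description: A list of tasks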
So when working with API documents, I highly recommend using Swagger Editor.
Swagger Editor makes it really easy to work with API documents because, first
of all, it can validate the document.
So if there is any problem with it, it will tell you where the error
is and what type of error it is. So if we remove this line,
for example, it tells us that there is a structural problem in line
13 and it tells us what kind of problem it is. So let's put
this line back. Now the API specification is valid and we can see that it has
five simple endpoints.
We can also see the payloads, sorry, the schemas for the
payloads in this API. So we have a schema for error
messages. We have a schema for the type of payload that we
should be sending to the server when we create or update a task.
And we have a schema for the type of payload that we will get from
the server when we get the details of a task. Now let me show you how we can
leverage this documentation on the client side to run a mock server while
we're working on the client. I'm going to show you two ways of doing it: one
of them is running the mock server locally, and the other is running it in the
cloud. To run the server locally, I recommend using Prism, which is a CLI tool
built by Stoplight. It's a very powerful tool for running mock REST servers
locally from a specification document.
Prism is an npm package, so you will need to have npm as well as a Node.js
runtime available locally, and optionally, if you want to manage the
dependencies with Yarn, you will need to have Yarn as well. The npm
dependencies for this project are listed in the package.json file. So we have
the Prism CLI as well as Dredd; I will introduce Dredd a little bit later in
this presentation. To install the dependencies I'm going to use Yarn. All you
need to do is run the yarn command just like that, and after the dependencies
are installed you will see a node_modules folder available in your directory.
This folder contains all the dependencies that come with these libraries as
well as all the commands that are available with them. So the Prism CLI is
available within this folder, under .bin/prism. To run the mock server we use
the mock command available in this CLI and we give it the path to the OpenAPI
specification file.
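The invocation looks something like this (a minimal sketch, assuming the
specification file is named oas.yaml, which may differ from the actual repo):

    ./node_modules/.bin/prism mock oas.yaml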
If we run this command, the mock server is now running.
The first thing we see in the logs is a list of the endpoints
available in this mock server. So let's try out running
a request against one of these endpoints.
So I'm going to do it with curl, and I'm going to use jq to parse the JSON
payload returned by the server, to get a nice representation of it. As you can
see here, this is a list of items. Prism has included only one item in the
list, but that's a perfectly valid situation for this API. And the payload
contains all the expected attributes that we should be finding in this
payload.
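For reference, the request looks something like this (assuming Prism's default
address of http://127.0.0.1:4010 and a /todo endpoint, which may differ from
the actual demo):

    curl http://127.0.0.1:4010/todo | jq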
Now, this is how we would run the mock server locally. We can do it in the
cloud as well. I'm going to show you how to do it with microapis.io.
So what we need to do is copy the API specification and paste it into the
input panel, which
is here on the left. We paste it here, and the first thing the server does is
validate the API document. Now we can click on this button to launch the mock
server. Like I said before, we get the base URL as well as a sample URL of the
API. So if we click on this one we get a payload with a valid list of items
from the server. Let's try this in the terminal as well with curl. Let's clear
the terminal and do a curl on this. We will use jq as well to parse the JSON
payload, and what we get is a list of items. In this case we are getting more
than one item, so the experience is a little bit different from what you get
with Prism, but in both cases we're getting perfectly valid payloads from the
server.
So this is a very useful approach to building API integrations. I'm currently
using this mock service when I work with my clients in my current contract,
and it's making it so much easier for us to build the API integration, because
we can simulate the interactions with the server while the endpoints are being
implemented, and we can get a simulation of network latency issues. So
obviously there's some data that is not going to be available within
microseconds; it might take a little bit longer than that, and we can
implement the API client in a way that is able to handle these situations.
Now, this is how we leverage the benefits of documentation-driven development
on the client side. Let's see how we benefit from this on the backend. Like I
said before, we are going to build the API with FastAPI. For those of you who
don't know, FastAPI is a highly performant web API framework for Python. It's
a very recent project, but it's one of the most popular Python projects at the
moment. It's very well written, very well designed, it has a very intuitive
interface, and it is extraordinarily well documented. So if you haven't
checked out FastAPI yet, you should definitely check it out. I think it's
without any question the best API development framework for Python right now.
Now, FastAPI works with Pydantic to validate data payloads in the server.
Pydantic is another very popular Python library for data parsing and
validation. And as you can see in this example, the way we define the
validation rules for our payloads is by using type hints. This gives us a very
Pythonic and very intuitive way of defining the validation rules for our
models.
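As a hedged illustration, a model like the following (the model and field
names are assumptions, not the exact ones from the talk) rejects payloads
whose fields don't match the declared type hints:

    from pydantic import BaseModel, ValidationError

    class CreateTaskSchema(BaseModel):
        task: str        # must be a string
        priority: int    # must be an integer

    try:
        CreateTaskSchema(task="buy milk", priority="high")  # wrong type
    except ValidationError as error:
        print(error)  # Pydantic reports that priority is not a valid integer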
So let me show you the dependencies for the app. They are listed in a Pipfile.
We have a couple of development packages: dredd-hooks, which I will introduce
a little bit later, and datamodel-code-generator, which is a library that is
going to help us translate the OpenAPI specification schemas into Pydantic
models. And we have a couple of production packages: FastAPI, which is going
to help us build the API application, and Uvicorn, which is going to help us
run the web server. Now for those of you who are not familiar with it,
datamodel-code-generator is a library that helps you translate any kind of
JSON Schema specification into Pydantic models. It is extremely useful and
very powerful, and in most cases I would recommend that you don't write your
data models manually if you have a specification at hand. Instead, I recommend
that you use this tool to do the translation, since it can help you avoid many
mistakes in the translation process.
So first things first, let's stop the Prism mock server that we started
before. Let's clear the terminal and install the Python dependencies for this
project. Because we have dev dependencies, we do that with the dev flag.
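The command looks something like this (a minimal sketch, assuming Pipenv is
installed):

    pipenv install --dev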
So our dependencies are now installed, and next we have to activate the Pipenv
environment. We do that with pipenv shell. The environment is now activated
and we can start working on the project. Like I said, the first thing we want
to do is generate the Pydantic models. We do that with the datamodel-codegen
command. We need to give it the input, which is the OpenAPI specification
file, and we need to give it an output, which is the file where we want the
schemas to be written.
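The invocation looks something like this (assuming the specification file is
named oas.yaml and the output file schemas.py, which may differ from the
actual repo):

    datamodel-codegen --input oas.yaml --output schemas.py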
So we run the command, and the schemas file is now available. This is the
file. It contains some metadata, like the file we used to generate the models
as well as the time when we ran the command. datamodel-code-generator does an
excellent job of translating the OpenAPI schemas into Pydantic models. But if
we look closely at the file,
we see there are a couple of duplications. So in particular
these two classes are duplicated here below and really they should be
the same. So let's delete these two classes to make maintenance easier
in the future, and let's rename these classes
here. If you're not familiar with Pydantic, you may find this ellipsis here a
little bit confusing. This is just a way of telling Pydantic that a field that
has been declared with the Field class is required. Otherwise, any field that
is directly declared with a type hint is also supposed to be required, unless
you mark it explicitly with the Optional type hint.
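As a hedged sketch (the field names are assumptions, not the exact ones from
the generated file), the difference looks like this:

    from typing import Optional
    from pydantic import BaseModel, Field

    class GetTaskSchema(BaseModel):
        id: str = Field(...)            # Field(...) marks the field as required
        task: str                       # a bare type hint is also required
        due_date: Optional[str] = None  # Optional with a default is not required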
So now that we have our Pydantic models ready, let's move on to the API file.
I have created the API endpoints in this file, api.py. The first thing we have
is a list for the to-do items. To keep things simple, I've decided to use an
in-memory list implementation of the data instead of persistent storage.
If you're familiar with Flask, the interface here will look very familiar. The
way we define the endpoints is using decorators on the application object. The
decorators are named after the HTTP method of each endpoint in the API. So
this is for a GET endpoint on the to-do URL path. In cases where the API
endpoint returns data to the user, we want to use a response model class that
FastAPI can use to validate the data that we are sending to the user, and also
to serialize it. In cases where the API endpoint returns a status code which
is different from the standard 200 status code, we can specify that in the
status code parameter. If the API endpoint accepts a payload in the request
from the user, we can also tell FastAPI which Pydantic model should be used to
validate the request payload. And in cases where the URL contains a parameter,
we can also tell FastAPI what type should be used to validate the URL
parameter.
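To make this concrete, here is a minimal, self-contained sketch of the kind of
API layer just described (the paths, schema names and function names are
assumptions, not the exact ones from the repo):

    from typing import List
    from uuid import uuid4

    from fastapi import FastAPI, HTTPException
    from pydantic import BaseModel

    app = FastAPI(debug=True)

    class CreateTaskSchema(BaseModel):
        task: str

    class GetTaskSchema(CreateTaskSchema):
        id: str

    tasks: List[GetTaskSchema] = []  # in-memory storage to keep things simple

    @app.get("/todo", response_model=List[GetTaskSchema])
    def list_tasks():
        return tasks

    @app.post("/todo", response_model=GetTaskSchema, status_code=201)
    def create_task(payload: CreateTaskSchema):
        # the request payload is validated against CreateTaskSchema
        task = GetTaskSchema(id=str(uuid4()), **payload.dict())
        tasks.append(task)
        return task

    @app.get("/todo/{task_id}", response_model=GetTaskSchema)
    def get_task(task_id: str):
        # task_id is validated against the declared type of the URL parameter
        for task in tasks:
            if task.id == task_id:
                return task
        raise HTTPException(status_code=404, detail="Task not found")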
So this is all about the API layer. To initialize the server, all we need to
do is create an instance of the FastAPI class. We are running here in debug
mode, and we are also importing the API routes into this file to load them at
startup time. There are other ways of loading the API views: if you are
familiar with Flask blueprints, FastAPI has a similar concept, and that's
really the type of approach you want to use if you're building a production
application with FastAPI. In this case, just to keep things simple, I'm using this
approach. So we have the API implemented.
Let's now run it with Uvicorn. To run the application with Uvicorn, we have to
give it the path to the file in dot notation, and after a colon we have to
tell it the name of the variable that represents the application.
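For reference, the invocation looks something like this (assuming the file is
named server.py and the application object is called app, which may differ
from the actual repo):

    uvicorn server:app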
If we run this command, we get the API server running and we can check it out
in this URL. So let's go ahead and visit that URL. Now if we visit the to-do
endpoint, for example, we will get an empty list. FastAPI is capable of
generating an OpenAPI specification from the implementation, and also a
Swagger UI visualization of the API. So let's go ahead and visit that URL. And
this is the Swagger UI for the API that we have written in Python with
FastAPI. With this we can already interact with the server. So if
we run the GET endpoint, we again get an empty list. So let's go ahead and
create one item to verify things are working fine. If we execute this, we get
a valid response from the server with a 201 status code.
So if we run the GET endpoint again, now we have one item in the list. So the
API seems to
be working as expected. And typically now we would write
a bunch of unit test cases to make sure all the rest of the endpoints
are working fine and we cover some edge cases in the API.
Now, those unit tests are fine, but in addition to them, I encourage you to
also use a proper API test framework, because unless you are an absolute
expert in APIs, I can guarantee that you're going to miss some edge cases in
your test suite. So in this presentation I'm going to introduce you to Dredd.
There are some other test frameworks as well, which you can explore, but Dredd
is a classic API test framework that will get you covered with all possible
edge cases when it comes to API communication. It will test the payloads of
your server, and it will test the returned status codes and the payloads for
those status codes. So a combination of unit tests that make sure the
application is behaving correctly, together with a test suite run by Dredd,
will get you covered with your API applications. Now, how do we run Dredd?
First of all, Dredd is an npm package; I already listed Dredd in the
package.json file that I showed you before, so when we ran yarn to install the
dependencies earlier, we already installed Dredd. To run Dredd, we invoke the
Dredd CLI. The first input is the path to the API specification file that we
are going to use for validation. The second parameter is the URL where Dredd
will be able to reach the server to run the tests. And the third parameter,
passed with the server flag, is the command that Dredd needs to run to launch
the web server for the test suite.
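Putting it together, the invocation looks something like this (the file names
and port are assumptions that may differ from the actual repo):

    ./node_modules/.bin/dredd oas.yaml http://127.0.0.1:8000 \
      --server "uvicorn server:app"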
So if we run this now, Dredd starts testing the API, and it tells us that five
tests are passing but three are failing. Now let's have a
close look at those tests that are failing. This one, for example, is the
DELETE request on a specific resource that is failing. The PUT request on a
specific item is also failing, and the GET request on a specific item is
failing as well. All the other requests are fine; this one is just a
repetition of the request that I mentioned before that was failing. So if you
think about it, what is failing here is those requests that target specific
items on the server. So when Dredd is trying to perform an operation on an
existing item, it is failing. And the reason why this is happening is that
Dredd is using random ids to test the API. It's using one id for creating the
resource, and then it's using a different id to retrieve the resource or to
perform operations on it. Obviously, resources with random ids don't exist on
the server, so we have to customize the behavior of Dredd in such a way that
it always uses the same id that was returned when we created a resource on the
server. We can do that with dredd-hooks, which was the dependency that I
included here.
This package allows us to customize the behavior of Dredd and to keep track of
the state of the test suite. I've prepared a file with a collection of hooks
here that affect the behavior of Dredd. The first thing we are doing is
creating a dictionary which we are going to use to keep track of the state of
the test suite. And then what we are doing is telling Dredd that when we
create a task using the POST method, we need to save the id of that task. We
are going to use this id in the following methods that perform operations on a
specific task. So when Dredd wants to update a task, we are going to tell it
to use the existing task id that we obtained when we created the task. And
when we delete the task or get the details of the task, we're going to tell it
to use that same id.
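A hedged sketch of what such a hooks file might look like, using the
dredd_hooks package (the transaction names depend on the specification
document, so the ones below are assumptions, not the exact names from the
repo):

    import json

    import dredd_hooks as hooks

    # dictionary used to keep track of state across the test suite
    response_stash = {}

    @hooks.after("/todo > POST > 201 > application/json")
    def save_created_task_id(transaction):
        # save the id of the task that Dredd just created
        body = json.loads(transaction["real"]["body"])
        response_stash["task_id"] = body["id"]

    @hooks.before("/todo/{task_id} > GET > 200 > application/json")
    def use_real_task_id(transaction):
        # replace the random id in the request path with the saved one
        transaction["fullPath"] = "/todo/" + response_stash["task_id"]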
So the hooks file is ready. To use it, we run Dredd with the following
parameters: we need to use the hookfiles flag and point it to the path of the
file that contains the hooks, and we have to tell Dredd in which language the
hooks are written. In this case we wrote the hooks in Python, so we simply
tell it that the hooks are written in Python.
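The invocation looks something like this (the file names are assumptions that
may differ from the actual repo):

    ./node_modules/.bin/dredd oas.yaml http://127.0.0.1:8000 \
      --server "uvicorn server:app" --hookfiles=./hooks.py --language=python

If we run this command now,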
the test suite passes: no errors, eight tests, and all eight of them passing.
Now we have verified that our API implementation complies with the API
specification file. We would like to make sure that this test suite runs every
time we make any changes to the code, so that the code is validated before we
make a release. How do we do that? We can do it by incorporating this test
suite into our continuous integration server. I've prepared a Travis file for
this exercise. The Travis file contains instructions about how to install
Dredd to run the test suite. It also tells the server how to install the
Python dependencies, and it tells the server what command has to be run to run
the tests. We also provide some deployment configuration so that this can be
deployed to Heroku, and we are telling the server to use the correct version
of Python so that Pipenv doesn't complain.
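A hedged sketch of what such a .travis.yml might look like (the repo's actual
file is not shown in the talk, so the details below, including the file names
and the Heroku app name, are assumptions):

    language: python
    python: "3.8"
    install:
      - npm install               # installs Dredd and the Prism CLI
      - pip install pipenv
      - pipenv install --dev      # installs the Python dependencies
    script:
      - ./node_modules/.bin/dredd oas.yaml http://127.0.0.1:8000
        --server "pipenv run uvicorn server:app"
        --hookfiles=./hooks.py --language=python
    deploy:
      provider: heroku
      api_key: $HEROKU_API_KEY    # assumption: supplied via Travis settings
      app: my-todo-api            # hypothetical app name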
So to test how this works, we are going to make a commit here.
We are going to commit the changes we made to the schemas file before
and now we are going to push. To make it simple, I'm going to push directly to
master. So, git push origin master. I think we are on master. Yes, so git push
origin master.
So this is now on GitHub, and Travis is going to get a notification at some
point to run the test suite and then deploy the code if it is valid. So we
just need to give it a moment. Now the
build is starting, we're going to follow the logs.
So Travis has run the Dredd test suite against the application. The tests are
passing, and therefore Travis is moving on to the deployment stage of the
continuous integration process. We can monitor the deployment process in
Heroku, and it is already triggering the deployment; it's about to release.
And this is the URL that we can use to visit the application. We just need to
leave it a minute. Okay, so the build is finished now, and we can visit
the application on this URL. So let's go ahead and do that. We're going to
visit the docs endpoint, which as I mentioned before contains the Swagger UI
to interact with the application more easily. So let's go ahead and do that.
And this is the Swagger UI.
Now if we try the GET endpoint, the list is empty, obviously, because the
application was just released and we are visiting a new instance of the app.
So let's go ahead and create a task. We do that, we obtain a successful
response, and now we execute the GET endpoint again, and we have one task in
the list. So the server was validated;
it's working as expected. At the same time,
the client development team has been working against the
same specification. So things are never 100%
certain. But if we follow this approach, we can be fairly certain that at least
when it comes to API communication, the client and the server will be talking
to each other in the same language. So this is really everything I wanted to
introduce in this presentation.
Thanks again for listening. If you're interested in connecting with me, please
do so on any of these platforms: Twitter, GitHub, Medium or LinkedIn. And if
you want to get a 30% discount on my book, feel free to use the following
code.
Thank you and have a great day.