Transcript
This transcript was autogenerated. To make changes, submit a PR.
Okay, welcome to my talk about queryable APIs with GraphQL.
In this talk we are going to start off by explaining the
use case or the example that we are using. During the talk we
will then build this example using JAX-RS.
That should show us the problem of overfetching and underfetching.
And then we will convert that application to a GraphQL application to
show how GraphQL solves that. And then we will dive a
little bit deeper into what we can do with GraphQL before
we end with what's in the pipeline, or what we are working on
at the moment. Okay, so let's start with the use case
to explain this example. So what we have is an application
that's kind of like a gamification application or a reward system or scoring
system that can score users based on some action
that they've done. As an example, think of a travel system
where you get points or miles if you fly to
specific places. And that scores a person.
So we have a person object, which is quite a big object and it contains
quite a lot of fields and information about this person. And then we have
a score object that can define a type of score
and the value of that score. And then obviously we want to display
some of these fields on different UIs
depending on what the user is on. So let's start off by building this application
in REST, and we'll use JAX-RS to do that. So what I've
done is I went to code.quarkus.io
and I started off with just these three extensions.
But as I've built it I've added more extensions. This will
help you to bootstrap an application. Now the
high level design of this application is basically we have
different data stores for people and score.
So the person or people are being stored in
a relational database and are fronted by a person service, which actually
uses JPA to store and retrieve these entities.
And then the score service uses a flat file that gets stored on
disk. And then both of these are fronted
by a REST API that can then be consumed by
the user. And then obviously the object model exists that defines
person and score and everything that goes with it. It's actually quite a big object
model. Okay, so let's get into it.
I have already built the application,
like I said. So, the person service: here's the REST API.
You can see that this is a plain old JAX-RS REST API
that injects the backend service and then exposes getting
a person by id by basically just passing that along, and
getting everybody by calling the backend service. And similarly the
score service: to get
the scores from the score service, it just injects it and makes it available
via REST. So the application is already running in the
background. So if you go to localhost:8080/q/dev
on Quarkus, you get to the dev console.
You can see all the extensions that I
have. So we want to look at OpenAPI, or
specifically Swagger UI, so that we can see these REST endpoints that we've
built. Okay, let's clear the log file so
you can see I can go and get all people and
that will return all the people in the database. And it is quite a
lot of information. Like I said, there's a lot of information on person,
but I can also then get a specific person, like person one.
So after I've gotten this person with all the information, of which I
maybe only want the name and surname, I can also get
the id number, which is what I can use to
go and get the scores for this person, because this is
a separate system, like I said. So now I have the score. So I
had to make two HTTP calls. Both returned
too much data, but the first one didn't return
all the data that I wanted. I had to filter out the excess data and
then make a subsequent call to get even more data, which is again too much,
filter that out, and combine it all to actually display the page that
I want. That brings us to the actual problem,
the problem of over and under fetching. So overfetching is when I
fetch way too much data and I don't want to use
all of it and I want to actually define what exactly
I'm interested in. The web page might want more data,
but the mobile site might want less data because it has less
screen space to actually display it on. And then under fetching is when
I made a call, but there's not all the information in there that I
need, and I have to make subsequent calls to get all the
information that I need. So that brings us to GraphQL.
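Before moving on, the two-call flow just described can be sketched in plain Java. This is only an illustration: the in-memory maps stand in for the HTTP responses, and every name and value here (`fetchPerson`, `fetchScores`, the sample fields) is hypothetical rather than taken from the demo application.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simulates a client that only needs names, surname and scores for one page.
class RestFetchingDemo {
    static int calls = 0; // counts simulated HTTP round trips

    // Stand-in for the "get person by id" REST call: returns the whole payload.
    static Map<String, Object> fetchPerson(int id) {
        calls++;
        Map<String, Object> person = new LinkedHashMap<>();
        person.put("id", id);
        person.put("names", List.of("Jane"));
        person.put("surname", "Doe");
        person.put("idNumber", "ID-" + id);
        person.put("birthDate", "1980-01-31");
        person.put("email", "jane@example.com");
        return person;
    }

    // Stand-in for the "get scores by id number" REST call on the separate system.
    static List<Map<String, Object>> fetchScores(String idNumber) {
        calls++;
        Map<String, Object> score = new LinkedHashMap<>();
        score.put("name", "Travel");
        score.put("value", 42L);
        score.put("idNumber", idNumber);
        return List.of(score);
    }

    // Builds the view the page actually needs from the two responses.
    static Map<String, Object> buildPage(int id) {
        Map<String, Object> person = fetchPerson(id);      // overfetching: most fields go unused
        String idNumber = (String) person.get("idNumber"); // underfetching: scores are not included
        List<Map<String, Object>> scores = fetchScores(idNumber);

        Map<String, Object> page = new LinkedHashMap<>();
        page.put("names", person.get("names"));
        page.put("surname", person.get("surname"));
        page.put("scores", scores);
        return page;
    }

    public static void main(String[] args) {
        System.out.println(buildPage(1) + " after " + calls + " calls");
    }
}
```

The client pays for two round trips and discards most of what it received, which is the cost GraphQL is meant to remove.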
We're going to take that application and convert it to a GraphQL application.
Before we do that, let's quickly just go through a little bit of history of
GraphQL. It was developed and open sourced by Facebook.
It's just a specification, you can find it there
under that URL. It's been positioned as an alternative to
REST, even though you can use the two together.
It does declarative data fetching, which solves the problem that I've
just described. Facebook developed it because of the increased
mobile usage that they have on their platforms. And mobile is
much more sensitive towards network traffic. It also
allows a variety of different front end frameworks and
rapid feature development, because the separation between
your front end team and your back end team is much cleaner. Because I
can do queries, I do not have to go to a back end team to
say, I need this specific service, please build it for me so that I
can build my front end. They've been doing this since 2012 and
GraphQL as a specification has been available since 2015.
Okay, so if we look at our high level design again, what we're
going to do is take this one and
this one and replace them with one API,
the person GraphQL API. Okay, so let's get into that.
So we'll start off by just grabbing the person REST
API and copying and pasting it to a different namespace.
And now we'll start changing it. So we'll start off
by renaming it to be the person GraphQL
API. Let me just make it bigger.
And now we can actually remove all of these JAX-RS-specific
things. And we can say that this is a GraphQL
API, and we still want to inject the person
service because we still get the data from the backend service.
And now we can say this is going to be a query, which means
I'm fetching data. I don't have the
concept of a path parameter. And this will also
be a query.
And that's it. I can obviously remove all the unused
imports now. So I've now taken that service and made a GraphQL
endpoint out of it. So let's go have a look at it.
So back in my Dev UI you will see that there's also
a GraphQL UI, which is similar to Swagger UI. It's
just a front end that gives you access to your services.
So the first thing that you'll notice is that it's got a built-in
schema. With JAX-RS, you have to use something
on top of JAX-RS, like OpenAPI, to actually create
that schema. Now we have this built-in
schema. As you can see here, you can traverse the schema,
and because of that you also have code insight. So here I can do
people like we've done. But the difference now is that I can say
what fields I'm interested in. So if I'm only interested in
all the people's names, the payload obviously coming back from the
server is much smaller because I've already filtered out
all the fields that I don't want. And similarly I can do get
me person where
id is one. And now I can
say what do I want back? I want names and let's say
surname and then I only get that back.
So you can already see that the overfetching problem is solved.
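As a rough sketch, the converted class could look something like the following. So that the snippet compiles on its own, `GraphQLApi` and `Query` are declared here as local stand-ins for the MicroProfile GraphQL annotations of the same names (in a real Quarkus application they come from `org.eclipse.microprofile.graphql`, and the service would be injected). The `Person` fields, the `PersonService` methods and the sample data are assumptions for illustration.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.List;

// Local stand-ins so this compiles without dependencies; the real
// annotations live in org.eclipse.microprofile.graphql.
@Retention(RetentionPolicy.RUNTIME) @interface GraphQLApi {}
@Retention(RetentionPolicy.RUNTIME) @interface Query {}

class Person {
    final long id;
    final List<String> names;
    final String surname;
    final String idNumber;

    Person(long id, List<String> names, String surname, String idNumber) {
        this.id = id;
        this.names = names;
        this.surname = surname;
        this.idNumber = idNumber;
    }
}

// Hypothetical backend service; the talk's version uses JPA.
class PersonService {
    private final List<Person> people = List.of(
            new Person(1, List.of("Jane"), "Doe", "ID-1"),
            new Person(2, List.of("John"), "Smith", "ID-2"));

    Person getPerson(long id) {
        return people.stream().filter(p -> p.id == id).findFirst().orElse(null);
    }

    List<Person> getAllPeople() {
        return people;
    }
}

@GraphQLApi
class PersonGraphQLApi {
    // In Quarkus this would be an injected field rather than a direct instantiation.
    private final PersonService personService = new PersonService();

    @Query // the id is a plain method argument; there is no path parameter
    Person getPerson(long id) {
        return personService.getPerson(id);
    }

    @Query
    List<Person> getPeople() {
        return personService.getAllPeople();
    }
}
```

A query along the lines of `{ person(id: 1) { names surname } }` would then return only those two fields, even though the method itself still returns the full `Person`.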
So let's see if we can solve the underfetching problem. So what
we'll do is we'll take the score service,
no, not the score service, the actual REST service,
and we'll just copy this part and
paste it in here.
Okay, so we still obviously want to inject the backend service
because that's where we get our data from. But we want to change
this REST endpoint. The difference now is that we're not going to make this
a query. What we're going to do is we're going to say that
if there is a query where the source,
meaning the output of this query, the source
is a person,
so @Source Person p, then add a
list of scores as a field to that person. And then I
can say p.getIdNumber().
Before I save, what you'll notice here is that there's no scores
field at the moment.
But once I save and
refresh, you
will see that there's now a scores field on this as
a queryable field, and then I can obviously go deeper into that graph
to find exactly what I want. So what I've done now is
I've fixed both the over- and underfetching problems with
one call. I've got exactly the data that I want, and
that is what we are solving.
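A minimal sketch of such a source method follows. `GraphQLApi` and `Source` are again local stand-ins for the MicroProfile GraphQL annotations so that the snippet compiles by itself, and the `Score` fields, `ScoreService` and sample data are assumed for illustration.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.List;
import java.util.Map;

// Local stand-ins; the real annotations are in org.eclipse.microprofile.graphql.
@Retention(RetentionPolicy.RUNTIME) @interface GraphQLApi {}
@Retention(RetentionPolicy.RUNTIME) @interface Source {}

class Person {
    final String surname;
    final String idNumber;

    Person(String surname, String idNumber) {
        this.surname = surname;
        this.idNumber = idNumber;
    }

    String getIdNumber() {
        return idNumber;
    }
}

class Score {
    final String name;
    final long value;

    Score(String name, long value) {
        this.name = name;
        this.value = value;
    }
}

// Hypothetical score backend keyed by id number (the talk's version reads a flat file).
class ScoreService {
    private final Map<String, List<Score>> store =
            Map.of("ID-1", List.of(new Score("Travel", 42)));

    List<Score> getScores(String idNumber) {
        return store.getOrDefault(idNumber, List.of());
    }
}

@GraphQLApi
class PersonGraphQLApi {
    private final ScoreService scoreService = new ScoreService(); // injected in Quarkus

    // Not a query of its own: because the parameter is marked as the source,
    // a "scores" field is added to Person, and this method is only called
    // when a request actually selects that field.
    List<Score> getScores(@Source Person p) {
        return scoreService.getScores(p.getIdNumber());
    }
}
```

With this in place, a single request such as `{ person(id: 1) { surname scores { name value } } }` can return the person and the scores together.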
What is happening in the background here is that when
a query comes in we see that you want a person and we see
that you also want that score. Now basically
what's happening in the background is we make a call to this method and then
we make a call to this method. The problem comes in with something like
get people, where you might have a thousand people, and
we don't want to make 1000 method calls to this. That is just not efficient.
We can make it a batch-type scenario where we say, actually send
me the list of people. So do
this. So both get person and
get people will now call this with a list, even though for get person
the list will be of size one. And what we'll do
now here is we'll say p.stream().map(),
and then we say we want Person::getIdNumber,
and then we want to collect it
with Collectors.toList().
Right? So now I have a list
of ids with which
I can much more efficiently call the
back end service. Now again, before I click save here, let's actually
do people again.
So, and then we'll say, give me the names and the
surnames and scores,
and then you'll see the problem. We'll just do name.
Okay? So if I go to my log file now,
you'll see the problem: I went and got all the people,
but then I called that method a few times, and that's what we want to
try and solve so that we don't have to do that. So this
will solve that.
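The batched form described here can be sketched as follows. `Source` is a local stand-in for the MicroProfile GraphQL annotation, and `ScoreService`, its `backendCalls` counter and the sample data are hypothetical; the counter is only there to make the single backend round trip visible.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Local stand-in; the real annotation is org.eclipse.microprofile.graphql.Source.
@interface Source {}

class Person {
    final String idNumber;

    Person(String idNumber) {
        this.idNumber = idNumber;
    }

    String getIdNumber() {
        return idNumber;
    }
}

class Score {
    final String name;
    final long value;

    Score(String name, long value) {
        this.name = name;
        this.value = value;
    }
}

// Hypothetical backend that can look up many id numbers in one round trip.
class ScoreService {
    int backendCalls = 0;
    private final Map<String, List<Score>> store = Map.of(
            "ID-1", List.of(new Score("Travel", 42)),
            "ID-2", List.of(new Score("Travel", 7)));

    // Results are returned in the same order as the requested id numbers.
    List<List<Score>> getScores(List<String> idNumbers) {
        backendCalls++;
        return idNumbers.stream()
                .map(id -> store.getOrDefault(id, List.of()))
                .collect(Collectors.toList());
    }
}

class PersonGraphQLApi {
    final ScoreService scoreService = new ScoreService(); // injected in Quarkus

    // Batched source method: a list of people in, one list of scores per person out.
    List<List<Score>> getScores(@Source List<Person> people) {
        List<String> idNumbers = people.stream()
                .map(Person::getIdNumber)
                .collect(Collectors.toList());
        return scoreService.getScores(idNumbers);
    }
}
```

Whether the query asks for one person or a thousand, the score backend is now called once, with all the id numbers.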
So if I redo that call now, let's clear the log
first,
you can see that now I made that call with all the ids,
so it's a much more efficient call for the backend. Now that's called
batching. In GraphQL you
can also make multiple requests in one call. So as an example,
let me just close this one, let's go back to our
person query. So this returns
one person, but maybe I want person one and two. So I can
do something like this where I say person one will be this
and then I also want person two.
And maybe for person two I only
want the name and the surname, not the scores.
Maybe it's showing you your friends or whatever.
So names and surname and
again in one call I can combine these things
and this doesn't have to be from the same entity like a person.
So as an example, let's say that, like I said, this could be a system
that rewards you for traveling. And as an added benefit it
will show you the exchange rate of the country that you're
going to against your base currency. So for
instance, we can add another
source method. Let me do this. So we inject
an exchange rate service
and then we add an exchange rate field onto
person, similar to what we did with scores.
And here you can see that apart from the
source that you pass in, you can pass in even more query
parameters. So this should give us an exchange
rate. The other thing that we want to do is we want to also show
the weather. So let's add a
weather API. So that means when I travel to London
I can see the exchange rate and the weather of my destination.
So I'm going to add a new API,
the weather GraphQL API. And I'm just
going to cut and paste it because it's not important how
we built it necessarily.
But you can see it's again just a GraphQL API
injecting a backend service. This is making an actual weather call
to an API somewhere on the Internet
that will give me the weather for a city. So back
here I should now
actually say, well, let's say person one
is traveling to London. So here I want
to just refresh so that my schema updates.
So I want to get the exchange rate against
the pound and I just want to get the rate.
And then I also want to get weather, which is now a separate API.
It's not a field on person.
And let's see, what do I need to
pass in I think city and
London, right. And then let's see,
I want to return the description, the minimum and
the maximum.
Okay, so it took a bit longer because
it's actually making a call out onto the Internet. But you can see I've got
the weather, I've got the exchange rate and I've got the person. So I
can combine any
number of GraphQL endpoints into one
request to get exactly the data that I want.
Okay, so that brings us to the next point, which is asynchronous execution.
So at the moment what is happening is that as the request
comes into the server we go fetch person. We need person
to be able to get the exchange rate. So we have to do these two sequentially,
because I need the base currency
of this person to get the exchange rate. And that's why it's a
source field. But something like weather doesn't need to wait for person,
because I just need to get the weather for London. So we could theoretically actually
just call this and this concurrently. But at the moment it's
being called sequentially. Now the way in which we're going to change that
so that we can do it concurrently is just
to use asynchronicity.
So we'll start off with the weather service. We'll basically say that
this should return a CompletionStage
of Weather.
Now the actual backend service is already asynchronous.
So we can just do this.
And then for the person API we
now want to do the same here for get person.
And for now we'll just do something like this.
So that should allow us to actually call weather and
person at the same time. Now to see that in action
we need to also look at the log file.
So this was the previous call. You can see that I got the person
and then after I got the person I got the exchange rate and then I
got that back. And only then I started getting the weather for London.
And then I got the weather and then I got
the scores. So let's clear this
and do that call again and
get an error. Let me just,
I do get errors every now and then,
and it's usually the hot reload. So I'm
just restarting Quarkus in dev mode and
then we'll give that a try again.
Let's go. Okay, here we
go. So back
here, we'll do the call again.
It's still getting an error.
Oh, that's that error context. Not aware I
can actually quickly fix it. There's a problem that this needs to be
transactional for some reason. I'm not sure why,
actually that could come out. We don't want to show that now.
Um,
so actually let's see if it works. Now, I'm not going to get
into detail on why this is broken or not working.
I'm not transactional. But anyway. Okay, so I got the data back and let's look
at the log file now. And you can see
that I'm getting the weather and the person, and
then the exchange rate. Then I got the weather and the exchange rate.
So you can see that this is definitely a better way to do this,
splitting the work up into two.
Now,
what will happen if I basically say
that I want to travel to London
and New York? So I want to do this where
I say that this is Great
Britain pound
and this will be then USD.
So let's do USD here.
And then I want the weather for both of those as well. So I want
London's weather and then I want New York's weather.
Right,
New York. So now it
should call person, weather and
weather concurrently.
But what I also want to do is I want to call these two concurrently,
because again, once I have person, the one doesn't have to wait for
the other. So again, in person service,
I can even change the exchange rate
to be a CompletionStage.
And that will then allow all the fields that I've
added with a source to be fetched
concurrently. Right.
So now let's look at the log file.
You can see that it's getting the weather at the same time as the person,
and then it's getting the two exchange rates. The one is not waiting for the
other, which is obviously a much better design
in certain cases. Okay,
so what we have looked at so far is how GraphQL solves
over- and underfetching by using a query and
by using the source to add fields onto it. And we've looked
at batching, multiple requests and asynchronous execution.
So if I just recap batching: before batching,
or if you don't do batching, you can get into a situation where a
source field in a collection is
being called multiple times, and that's not necessarily efficient.
But then you can do something like that, where it's all batched up
in one call. And then with asynchronous execution,
we started off by looking at how, before we used any
CompletionStage, everything was
sequential. So we would get the person and then we'd get the exchange rate and
then we'd get the weather before we return.
If we change weather and person to
return a CompletionStage, then we can call those two concurrently,
and then we can go further to say that even the source fields,
so these two fields, can be called concurrently.
So that's all going to depend on your use case.
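The dependency structure in this recap can be sketched with plain `CompletableFuture` code. This is not what the GraphQL runtime does internally; it only illustrates which calls can overlap once the methods return a `CompletionStage`. The method names and the string results are placeholders.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

class AsyncFetchDemo {
    // Hypothetical backends; supplyAsync runs each one on another thread.
    static CompletionStage<String> getPerson(int id) {
        return CompletableFuture.supplyAsync(() -> "person-" + id);
    }

    static CompletionStage<String> getWeather(String city) {
        return CompletableFuture.supplyAsync(() -> "weather:" + city);
    }

    // Needs the person first, just like a source field.
    static CompletionStage<String> getExchangeRate(String person, String currency) {
        return CompletableFuture.supplyAsync(() -> person + "->" + currency);
    }

    static List<String> travelPlan(int personId) {
        // The top-level queries do not depend on each other, so all start immediately.
        CompletableFuture<String> person = getPerson(personId).toCompletableFuture();
        CompletableFuture<String> london = getWeather("London").toCompletableFuture();
        CompletableFuture<String> newYork = getWeather("New York").toCompletableFuture();

        // Both exchange rates start as soon as the person is available;
        // neither waits for the other.
        CompletableFuture<String> gbp = person.thenCompose(p -> getExchangeRate(p, "GBP"));
        CompletableFuture<String> usd = person.thenCompose(p -> getExchangeRate(p, "USD"));

        // join() blocks until each result is ready.
        return List.of(person.join(), gbp.join(), usd.join(), london.join(), newYork.join());
    }

    public static void main(String[] args) {
        System.out.println(travelPlan(1));
    }
}
```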
So let's dive a little bit deeper into GraphQL and see what
else we can do with GraphQL. Okay,
so we'll start off with errors and partial
response. So I'm just going to revert back to just
getting a person and scores.
What we'll do is we'll say, let's pretend that
the score system, the one that provides this data, is down.
So what we'll do is, back in our API,
here with scores,
we will just throw a runtime exception,
throw new
RuntimeException, and
we'll just say the score
system is down,
right. So what I don't want to do is penalize
the caller and give them nothing. So what
we have here is partial results, so that I can at least return what I have,
which is the person, but I do not have the
score. So I return null there. So I could still kind
of recover or show an alternative in a front
end case. And there could be multiple errors.
And you can see here that the message is not the one that I've typed
in my code. It's not saying the score system is down, it's saying server
error. And this is because I'm throwing a runtime exception.
So by default, runtime exceptions will hide the message,
mostly for security reasons, and checked exceptions
will actually pass it over the boundary, so you can see it. So if we
change this, for instance, to be a
new score-not-available exception,
which we will then throw,
this is now a checked exception. So now I
will get the message there. Now the behavior of both checked and
unchecked exceptions is configurable, meaning that
you can say that for this runtime exception I actually want to pass the
message over the boundary, and for this checked exception I don't.
So all of those combinations are configurable. But the default
is basically a blanket rule for all runtime and all checked
exceptions. So that's
errors and partial responses. You also have normal errors,
like if you put in something here that doesn't exist,
although you can already see it here. If you do a query, you'll get a
validation error that says that that's not a valid field.
It's a normal validation. And Bean Validation also works.
I'm not going to show that now because I don't have enough time.
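To make the partial-result behavior concrete, here is a small plain-Java simulation. It is not the runtime's implementation: `PartialResultDemo`, `resolve` and `clientMessage` are invented for this sketch, and they only mimic the default rule described above (unchecked exception messages are masked, checked exception messages are passed through).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;

// A checked exception, comparable to the "score not available" one in the talk.
class ScoresNotAvailableException extends Exception {
    ScoresNotAvailableException(String message) {
        super(message);
    }
}

class PartialResultDemo {
    // Mimics the default rule: hide unchecked messages, show checked ones.
    static String clientMessage(Exception e) {
        return (e instanceof RuntimeException) ? "Server Error" : e.getMessage();
    }

    // Resolves one field; on failure the field becomes null and an error is recorded.
    static void resolve(String field, Callable<Object> fetcher,
                        Map<String, Object> data, List<String> errors) {
        try {
            data.put(field, fetcher.call());
        } catch (Exception e) {
            data.put(field, null);
            errors.add(field + ": " + clientMessage(e));
        }
    }

    // Builds a response with both "data" and "errors", so the caller still
    // receives whatever could be resolved.
    static Map<String, Object> execute(Callable<Object> scoresFetcher) {
        Map<String, Object> data = new LinkedHashMap<>();
        List<String> errors = new ArrayList<>();
        resolve("surname", () -> "Doe", data, errors);
        resolve("scores", scoresFetcher, data, errors);

        Map<String, Object> response = new LinkedHashMap<>();
        response.put("data", data);
        response.put("errors", errors);
        return response;
    }
}
```

Calling `execute` with a fetcher that throws a `RuntimeException` yields the surname, a null `scores` entry and a masked error, while throwing `ScoresNotAvailableException` yields the same data with the original message in the error list.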
Next I want to show transformation and mapping. So up until now,
let me just quickly fix this before I continue.
Right, so now the system is back up and this
will work again. Now that I've saved,
the system is back up. Up until now we've basically
used the default mapping that GraphQL gives us out of the box,
where a long maps to a
BigInteger and a date maps
this way. But you can actually control that.
So we'll start off by looking at the schema here. So if you look at
person, you'll see that our id is mapped as a BigInteger.
Now that is because in our model, on
person, the id is declared as a
long. Now you might not
want to have it as a BigInteger
in the schema, and you can then do something like this where you would say
@ToScalar and then you say
which type you want this to map to.
So in our case we say that this should map to an Int.
Now if I go and look at this again,
the person id is now an Int.
Okay, so this is a basic scalar-to-scalar mapping that's available.
Then we support JSON-B out
of the box, but also our own annotations.
So if you don't have JSON-B, then you don't have to use it.
So let's quickly add birthday
here. Let's see, when is
person one's birthday? So you'll see the date format is year,
month, day. But I can then
go change it either with the JSON-B date format,
let's find the birthday,
or with the built-in MicroProfile GraphQL
one called @DateFormat. They act
exactly the same. So this should change it to
day, month, year.
Right? So now I've set the date format, and then depending on
where you place it, I've placed mine on the
field, which means that this applies to both input
and output. But if you put it on the setter, it'll only apply
to the input, and on the getter only to the output.
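As an illustration of the date formatting just described, the sketch below puts a format annotation on a field and renders the date with it. `DateFormat` is declared locally as a stand-in for the MicroProfile GraphQL annotation, the pattern `dd/MM/yyyy` is an assumed example (the talk does not spell out the exact pattern), and `DateFormatDemo.render` is a hypothetical helper that only approximates what a runtime would do with the annotation.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Field;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Local stand-in for org.eclipse.microprofile.graphql.DateFormat.
@Retention(RetentionPolicy.RUNTIME) @interface DateFormat {
    String value();
}

class Person {
    // Placed on the field, so per the talk it applies to both input and output.
    @DateFormat("dd/MM/yyyy") // assumed example pattern
    LocalDate birthDate = LocalDate.of(1990, 3, 7);
}

class DateFormatDemo {
    // Uses the annotated pattern if present, otherwise the ISO default (yyyy-MM-dd).
    static String render(Person p) {
        try {
            Field field = Person.class.getDeclaredField("birthDate");
            DateFormat format = field.getAnnotation(DateFormat.class);
            DateTimeFormatter formatter = (format == null)
                    ? DateTimeFormatter.ISO_LOCAL_DATE
                    : DateTimeFormatter.ofPattern(format.value());
            return p.birthDate.format(formatter);
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render(new Person()));
    }
}
```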
One of the more complex things that you can do is, if you look here,
you'll see that email is
actually a complex object here in this system.
And you
know that this can be represented as a string. So,
if I just show you this at the
moment: if I want to see email, because it's a complex object,
I need to actually get the value back, right. And I know that even
though it's a complex object, in GraphQL I want to represent it as a string.
So what I can do is I can then here
add @ToScalar
and then make it a string.
Now, there are some default rules for how it determines how to do this.
For marshalling to a string, it will just use toString. So you
need to make sure that your toString is implemented. For creating
an input from a string,
it will go and see if there's a constructor or a
set value or whatever the field is to
reconstruct this object.
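A minimal sketch of a class that satisfies both of those default rules might look like this. The `Email` shape is assumed, and the scalar annotation itself is left out so that the snippet stays free of dependencies.

```java
// A complex type that can be represented as a plain string.
class Email {
    private final String value;

    // Inbound: a single-String constructor lets the object be rebuilt from the scalar.
    Email(String value) {
        this.value = value;
    }

    // Outbound: toString is what gets written when the type is mapped to a string.
    @Override
    public String toString() {
        return value;
    }
}

class EmailScalarDemo {
    // Round trip: string in, object, string out again.
    static String roundTrip(String input) {
        Email email = new Email(input);
        return email.toString();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("jane@example.com"));
    }
}
```

If `toString` were left at the `Object` default, the client would receive a class name and hash code instead of the address, which is why the talk stresses implementing it.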
Right. So now this will be invalid because that's not a
complex object anymore. Wait, I didn't save,
um,
so you can see there already it says email and doesn't have a subfield because
we've now made it a string. Right? So that's that,
right.
So we
have looked at errors, we have looked
at transformation and mapping. Next, I want to talk about mutations.
Up until now we've only queried
data. Mutations are basically the rest of the CRUD
operations. When data is going to change, it mutates.
So that's basically when we
want to add a person or delete a person or update a
person. And that's very straightforward to do.
Let's go to our GraphQL endpoint and
add this. So it's basically just
a new annotation to say mutation. It also
returns a person, which means the result of a mutation
is also a queryable result.
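A mutation method along these lines could be sketched as follows. `GraphQLApi` and `Mutation` are local stand-ins for the MicroProfile GraphQL annotations, and the in-memory `PersonService` with its id generation is an assumption that replaces the JPA-backed service from the demo.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.HashMap;
import java.util.Map;

// Local stand-ins; the real annotations are in org.eclipse.microprofile.graphql.
@Retention(RetentionPolicy.RUNTIME) @interface GraphQLApi {}
@Retention(RetentionPolicy.RUNTIME) @interface Mutation {}

class Person {
    Long id;        // null until the backend assigns one
    String surname;

    Person(Long id, String surname) {
        this.id = id;
        this.surname = surname;
    }
}

// Hypothetical in-memory backend; the talk's version persists with JPA.
class PersonService {
    private final Map<Long, Person> db = new HashMap<>();
    private long nextId = 1;

    Person updateOrCreate(Person person) {
        if (person.id == null) {
            person.id = nextId++; // no id supplied, so treat it as a create
        }
        db.put(person.id, person);
        return person;
    }
}

@GraphQLApi
class PersonGraphQLApi {
    private final PersonService personService = new PersonService(); // injected in Quarkus

    // Returns the stored Person, so the caller can select fields on the
    // result in the same way as with a query.
    @Mutation
    Person updatePerson(Person person) {
        return personService.updateOrCreate(person);
    }
}
```

A request might then look like `mutation { updatePerson(person: {surname: "Doe"}) { id surname } }`, where the argument and field names are assumed to match this sketch.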
So as an example, let's see if this works. I'm going to
create a person. This might not work and then
I'll explain why not.
Yeah, schema is not configured for mutations,
so I'll just restart again.
Um,
the hot reload, for some reason, doesn't pick this up.
It used to, but the hot reload now
does some clever footwork around class paths, and for some reason this
isn't picked up. So that's something that we need to fix in the Quarkus hot
reloading.
But anyway, you can see here, I'm basically again just calling the backend
person service with an object. So what I'll do now with mutations
is I'm basically saying update the person and I
give it a name. And because I didn't give it an id it will generate
a new one, because it's an auto-generated id. And then I'm asking
for these fields back, so I should only get name and id
back because, um,
did I not save? So maybe the hot reload was working
actually and I just didn't save.
Yeah, so this is the other error that I knew about.
So what's happening there is that I didn't set up my
Postgres properly. So when I inserted the initial test
data, it didn't know what the next id is. So I get this duplication
error and I didn't bother to fix it. But the bottom
line is this is how I can create a person.
You can see now that I got the id from the
database, that's generated, and
then all of this is still set to null because I haven't
created those yet. But again, what I ask for is
queryable. So I can now go and update that person.
Now I'll pass in the id because I want to do an update,
and then those fields are populated.
And similarly you can do a delete. I'm not going to show a delete,
but you can do the rest of the CRUD operations.
Okay, what I want to show next is we'll
quickly do introspection because that's quite fast. And then we'll go on
to security. So introspection is an interesting concept
where I can use GraphQL
to query GraphQL's schema.
And again, this is because there is a strongly typed schema
available and I can now query that schema
with GraphQL. So I can say get me all the types,
all the names of the types in the schema and then you can get all
the names of the types and similarly you can query whatever
you want about the actual schema document.
That is quite an interesting feature that you have out of the box.
Okay, so let's move on to security.
Security is
not really part of the GraphQL specification.
It's up to the runtime to implement this. And because we make this
available over HTTP and because this is in Quarkus,
the normal Quarkus security works. What I'm going
to demo is just plain old JWT. So I
have a Keycloak server running in the background.
And what I'm going to do is I'm going to add
@RolesAllowed and only
allow employees to do this.
I think it must be okay,
which means if I now go and go back and do a
person id one
and names, I should get an error because
I'm not authorized, right? So unauthorized
exception. So what I'll do now, let me just do
this to show,
so I'll get a fresh token from Keycloak
here, and then I'll pass a request header
with a bearer token, basically copy it,
and then I can get the data back. And again, any of the
security models in Quarkus will work
with GraphQL.
Okay, let me remove this because this can get
quite annoying. And remove this.
So that's securing your endpoints.
And by the way, I'm not going to show that, but you
can add security or a different role around this method, and
then you will get a partial result back if your user is
not in that role. Okay,
next I want to show context.
Now, context allows you to inject
the context of this query anywhere in
your code. And this is useful if you look at our back end
as an example, our person service and
this service is making calls to a database.
And I can now go and inject this context.
And this context gives me all
sorts of things like the actual request, the query, the name,
the variables, and a whole lot more
things about the original request. Which means at
the moment when I do find person, the database still
returns all the fields on that person.
It's only after it came from the database that we filter out only
the fields that should go over the network to the client.
But this will allow you to inspect the original request and do
a better query on the back end. So the one
way is to do this, so inject it. The other way is
to use events. Now there are quite a few events that get fired,
including during startup. So as we build your schema, there's a
bunch of events that allow you to take part in that building of the schema.
And then there are events like this one where just before we
execute, we will fire this event
and you can take part in that, basically by saying @Observes.
And then you can get the context. As you can see here,
there are all sorts of hooks where you can hook into this execution.
So let's see if this works. I might have to restart
again,
right? So let's
just do this.
So here you can see that I've got the context and
I'm just printing out the query, but already I can see that only
names has been asked, which means I can actually optimize my
back end query. And that's the idea of the context.
Okay, the next thing I want to show is custom execution.
So up
until now we basically allowed execution based on
these methods, which serve incoming traffic
over HTTP. But you might be in a situation where you want to
do a GraphQL query on startup or on
some schedule or just programmatically for some other reason.
In my example I have this use case that I'm saying
that every morning I want to do a query to find all the scores so
that I can send out a mail to send the leaderboard out as an
example. So I have this leaderboard service which
is running on a schedule. So what you can do is you can inject
an execution service right this one
here, and then that gives you access to the actual service that's being called.
That's the same service that the
web request will call. So the only difference now is
that you build up your own query in here and then
this should make the call. So to test that,
in fact it should already have run, because it'll run on startup.
But let's clear the log. In the
Dev UI you can trigger a schedule, so I can invoke
it, and then you can see there that I have printed out
what that execution service executed.
Okay, so that is custom execution.
We have quite a lot of integrations already with MicroProfile in GraphQL.
JSON-B: I've shown a little bit when we used the date format, but properties
and hiding elements and all the other JSON-B properties are
supported. Security: I've shown how we've integrated that. Context:
I've shown how we can pass this context along using context propagation.
What I didn't show is Bean Validation. So you can mark either your POJOs
and/or your method signatures with Bean Validation and that will
apply, and throw a normal error which could result in partial responses.
We've also integrated with metrics and tracing, and for both of those
you don't have to do anything but include the extension in
your application. And then by default we will start emitting metrics
and tracing requests. And then generics support: your methods
can use generics and that will work. That brings us to the last part of
this, and that is what's in the pipeline. So what are we
working on at the moment? So we are nearly ready with a client,
or the client is actually already ready, but we're pulling it into Quarkus.
So this is not the execution service injection like I've shown here.
That gives you on the server side access to the execution service.
This is a proper client which allows you to make calls from
a client that will make a call over HTTP. There are two variants
of this client that we're pulling in. One is similar to the JAX-RS
client. We call it the dynamic
client, and it allows this builder pattern to build a query
up. And the other one is a typesafe client, which is similar to
REST Client in MicroProfile. So that's a typesafe one where
you define the objects that will make the call. The other interesting
thing we're working on is subscriptions, which is one of the outstanding
features that the GraphQL specification has. So the GraphQL specification
on a high level has queries, mutations, and then subscriptions.
And subscriptions basically allow you to subscribe to
a query, and you get push notifications
as data comes in that matches that query. We are busy
implementing it over WebSockets in Quarkus. That hopefully will
be available pretty soon. And then one of the other things that's high on our
priority list is to make paging and filtering easier. So at
the moment, if you want to do paging of big payloads, you will have
to code it in yourself by defining the page number
and number of pages that you want to return and so on as parameters.
But we want to make that easier by just allowing an @Paging
or @Filtering type of annotation. And that is
it for me. I know it was very fast.
So what I've shown is made up of these two examples,
and these examples contain even more things that I didn't show.
But you can go and have a look there, reach out to me, or find me
on the Internet if you have any questions.
And that's it. Thank you very much.