Writing queryable APIs with MicroProfile GraphQL

Abstract

REST is great, except you cannot ask for what you really want. You get what the server gives you. Find out how to make your API queryable, allowing your consumers to get exactly what they want. MicroProfile GraphQL fixes the over- and under-fetching problem that exists in today's APIs, and it's easy to write!

Summary

During the talk we will build an example using JAX-RS. That should show us the problem of over-fetching and under-fetching. We will then convert that application to a GraphQL application to show how GraphQL solves that.
In GraphQL you can combine multiple requests in one call. A problem comes in with something like get people, where you might have a thousand people; with batching you get a list of IDs with which you can call the backend service much more efficiently. You can also get the weather in the same request, even though that is a separate API.
GraphQL solves over- and under-fetching by using a query and by using the source to add fields onto it. We look at batching, multiple requests and asynchronous calls, and then dive a little bit deeper into GraphQL to see what else we can do with it.
Next comes transformation and mapping. Up to that point we basically use the default mapping that GraphQL gives us out of the box, but you can control that. You can change a date format either with the JSON-B date format or with the built-in MicroProfile GraphQL one called date format.
Email is a complex object in this system, but it can be represented as a string. For marshalling to a string, it will just use toString, so you need to make sure that your toString is implemented. For creating an input from a string, it looks for a constructor or a setter to reconstruct the object.
Next, I want to talk about mutations. Mutations are basically the rest of the CRUD operations: when data is going to change, it mutates. And then we'll go on to security. What I'm going to demo is just plain old JWT.
Context allows you to inject the context of the query anywhere in your code. This allows you to inspect the original request and do a better query on the back end. The other way to get at the context is to use events.
We already have quite a lot of integrations with MicroProfile in GraphQL, such as JSON-B. We've also integrated with metrics and tracing, and for both of those you don't have to do anything but include the extension in your application. Your methods can use generics and that will work.
We are nearly ready with a client; or rather, the client is already ready, but it is still being pulled into Quarkus. The other interesting thing we're working on is subscriptions, which is one of the outstanding features of the GraphQL specification. One of the other things that's high on our priority list is to make paging and filtering easier.

Transcript

This transcript was autogenerated. To make changes, submit a PR.

Okay, welcome to my talk about queryable APIs with GraphQL. In this talk we are going to start off by explaining the use case, or the example, that we are using. During the talk we will then build this example using JAX-RS. That should show us the problem of over-fetching and under-fetching. And then we will convert that application to a GraphQL application to show how GraphQL solves that. And then we will dive a little bit deeper into what we can do with GraphQL before we end with what's in the pipeline, or what we are working on at the moment.
Okay, so let's start with the use case to explain this example. What we have is an application that's kind of like a gamification application, or a reward system or scoring system, that can score users based on some action that they've done. As an example, a travel system where you get points or miles if you fly to specific places. And that scores a person. So we have a person object, which is quite a big object that contains quite a lot of fields and information about this person. And then we have a score object that defines a type of score and the value of that score. And then obviously we want to display some of these fields on different UIs, depending on what the user is on.
So let's start off by building this application in REST, and we'll use JAX-RS to do that. What I've done is I went to code.quarkus.io and I started off with just these three extensions. But as I've built it I've added more extensions. This will help you to bootstrap an application. Now the high-level design of this application is basically that we have different data stores for people and scores. The people are stored in a relational database, which is fronted by a person service that uses JPA to store and retrieve these entities. And then the score service is a flat file that gets stored on disk. And then both of these are fronted by a REST API that can then be consumed by the user.
And then obviously the object model exists that defines person and score and everything that goes with it. It's actually quite a big object model.
Okay, so let's get into it. I have already built the application, like I said. So here is the person service, and here's the REST API. You can see that this is a plain old JAX-RS REST API that injects the backend service and then exposes getting a person by ID, by basically just passing that along, and getting everybody by calling the backend service. And similarly for the score service: to get the scores off the score service, it just injects it and makes it available via REST.
The application is already running in the background. So if you go to localhost:8080/q/dev on Quarkus, you get to the dev console. You can see all the extensions that I have. We want to look at OpenAPI, or specifically Swagger UI, so that we can see these REST endpoints that we've built. Okay, let's clear the log file. You can see I can go and get all people, and that will return all the people in the database. And it is quite a lot of information. Like I said, there's a lot of information on person. But I can also get a specific person, like person one. After I've gotten this person with all the information, of which I maybe only want the name and surname, I can also get the ID number, which is what I can use to go and get the scores for this person, because that is a separate system, like I said. So now I have the score.
So I had to make two HTTP calls. Both returned too much data, but the first one didn't return all the data that I wanted. I had to filter out the excess data and then make a subsequent call to get even more data, which is again too much, filter that out, and combine the lot to actually display the page that I want. That brings us to the actual problem, the problem of over- and under-fetching. Over-fetching is when I fetch way too much data and I don't want to use all of it, and I want to actually define what exactly I'm interested in.
The web page might want more data, but the mobile site might want less data because it has less screen space to display it on. And then under-fetching is when I made a call, but not all the information that I need is in there, and I have to make subsequent calls to get all the information that I need.
So that brings us to GraphQL. We're going to take that application and convert it to a GraphQL application. Before we do that, let's quickly go through a little bit of the history of GraphQL. It was developed and open-sourced by Facebook. It's just a specification; you can find it there under that URL. It's been positioned as an alternative to REST, even though you can use the two together. It does declarative data fetching, which solves the problem that I've just described. Facebook developed it because of the increased mobile usage that they have on their platforms, and mobile is much more sensitive to network traffic. It also allows a variety of different front-end frameworks and rapid feature development, because the separation between your front-end team and your back-end team is much cleaner. Because I can do queries, I do not have to go to a back-end team to say, I need this specific service, please build it for me so that I can build my front end. They've been doing this since 2012, and GraphQL as a specification has been available since 2015.
Okay, so if we look at our high-level design again, what we're going to do is take this one and this one and replace them with one API, the person GraphQL API. Okay, so let's get into that. We'll start off by just grabbing the person REST API and copying and pasting it to a different namespace. And now we'll start changing it. We'll start off by renaming it to be the person GraphQL API. Let me just go bigger. And now we can actually remove all of these JAX-RS-specific things.
And we can say that this is a GraphQL API, and we still want to inject the person service because we still get the data from the backend service. And now we can say this is going to be a query, which means I'm fetching data. I don't have the concept of a path parameter. And this will also be a query. And that's it. I can obviously remove all the unused imports now. So I've now taken that service and made a GraphQL endpoint out of it.
So let's go have a look at it. Back in my dev UI you will see that there's also a GraphQL UI, which is similar to Swagger UI. It's just a front end that gives you access to your services. The first thing that you'll notice is that it's got a built-in schema. With JAX-RS, you have to use something on top of JAX-RS, like OpenAPI, to actually create that schema. Now we have this built-in schema. As you can see here, you can traverse the schema, and because of that you also have code insight. So here I can do people, like we've done. But the difference now is that I can say what fields I'm interested in. If I'm only interested in all the people's names, the payload coming back from the server is obviously much smaller, because I have already filtered out all the fields that I don't want. And similarly I can say, get me the person where ID is one. And now I can say what I want back: I want names and, let's say, surname, and then I only get that back. So you can already see that the over-fetching problem is solved.
So let's see if we can solve the under-fetching problem. What we'll do is take the score service, no, not the score service, the actual REST service, and we'll just copy this part and paste it in here. Okay, so we obviously still want to inject the backend service because that's where we get our data from. But we want to change this REST endpoint. Now the difference is that we're not going to make this queryable.
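Going back to the field selection shown in the GraphQL UI: the idea can be pictured with a small plain-Java sketch. This is not how MicroProfile GraphQL is implemented; it only models the caller naming the fields it wants and the server returning just those. The `FieldSelection` class, the `select` helper and the sample person data are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: the caller names the fields it wants,
// and only those fields are copied into the response.
final class FieldSelection {

    // Returns a copy of 'entity' that contains only the requested fields,
    // in the order they were requested. Unknown field names are rejected,
    // similar in spirit to a schema validation error.
    static Map<String, Object> select(Map<String, Object> entity, List<String> requested) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (String field : requested) {
            if (!entity.containsKey(field)) {
                throw new IllegalArgumentException("Unknown field: " + field);
            }
            result.put(field, entity.get(field));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the large person object.
        Map<String, Object> person = new LinkedHashMap<>();
        person.put("id", 1L);
        person.put("names", List.of("Jane"));
        person.put("surname", "Doe");
        person.put("idNumber", "ID-001");
        person.put("birthDate", "1990-01-31");

        // Comparable in spirit to the query { person(id: 1) { names surname } }
        System.out.println(select(person, List.of("names", "surname")));
    }
}
```

The REST endpoint always returns the whole map; here the caller decides, which is the difference the talk is pointing at.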
What we're going to do is say that if there is a query where the source, meaning the output of this query, is a person, so @Source Person p, then add a list of scores as a field to that person. And then I can say p.idNumber. Before I save, what you'll notice here is that there's no scores field at the moment. But once I save and refresh, you will see that there's now a scores field on this as a queryable field, and then I can obviously go deeper into that to find exactly what I want. So what I've done now is I've fixed both the over- and under-fetching problem with one call. I've got exactly the data that I want, and that is what we are solving.
What is happening in the background here is that when a query comes in, we see that you want a person and we see that you also want the scores. So basically what's happening in the background is we make a call to this method and then we make a call to this method. The problem comes in with something like get people, where you might have a thousand people, and we don't want to make 1000 method calls to this. That is just not efficient. We could make it a batch type scenario where we say, actually, send me the list of people. So do this. Both get person and get people will now call this with a list, even though for get person the list will be of size one. And what we'll do here is say p.stream().map, and then we say we want Person getIdNumber, and then we want to collect it with Collectors.toList. Right? So now I have a list of IDs with which I can much more efficiently call the backend service.
Now again, before I click save here, let's actually do people again. And then we'll say, give me the names and the surnames and scores, and then you'll see the problem. We'll just do names. Okay?
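The batching step described above can be sketched in plain Java. The annotations from the talk are left out so the example runs without any dependency; `Person`, `Score`, `ScoreService` and the call counter are hypothetical stand-ins, not the classes used in the demo.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of the batching idea: resolve the scores for a whole
// list of people with one backend call instead of one call per person.
final class BatchingSketch {

    record Person(String names, String idNumber) {}
    record Score(String name, long value) {}

    // Stand-in for the backend score service; it counts how often it is called.
    static final class ScoreService {
        int calls = 0;

        // Returns one list of scores per id, in the same order as the ids.
        List<List<Score>> getScores(List<String> idNumbers) {
            calls++;
            List<List<Score>> result = new ArrayList<>();
            for (String id : idNumbers) {
                result.add(List.of(new Score("Travel", id.length())));
            }
            return result;
        }
    }

    // Unbatched: one backend call for every person in the list.
    static List<List<Score>> scoresOneByOne(ScoreService service, List<Person> people) {
        List<List<Score>> result = new ArrayList<>();
        for (Person p : people) {
            result.add(service.getScores(List.of(p.idNumber())).get(0));
        }
        return result;
    }

    // Batched: collect all ids first, then make a single backend call.
    static List<List<Score>> scoresBatched(ScoreService service, List<Person> people) {
        List<String> ids = people.stream()
                .map(Person::idNumber)
                .collect(Collectors.toList());
        return service.getScores(ids);
    }

    public static void main(String[] args) {
        List<Person> people = List.of(
                new Person("Jane", "A1"), new Person("John", "B22"), new Person("Ann", "C333"));

        ScoreService oneByOne = new ScoreService();
        scoresOneByOne(oneByOne, people);
        ScoreService batched = new ScoreService();
        scoresBatched(batched, people);

        System.out.println("calls without batching: " + oneByOne.calls);
        System.out.println("calls with batching: " + batched.calls);
    }
}
```

Both variants return the same scores; only the number of backend calls differs, which is what the log file in the demo makes visible.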
So if I go to my log file now, you'll see the problem: I went and got all the people, but then I called that method a few times, and that's what we want to try and solve so that we don't have to do that. So this will solve that. If I redo that call now, let's clear the log first, you can see that now I made that call with all the IDs, so it's a much more efficient call for the backend. That's called batching.
In GraphQL you can actually make multiple requests in one call. As an example, let me just close this one, let's go back to our person query. This returns one person, but maybe I want person one and two. So I can do something like this, where I say person one will be this, and then I also want person two. And maybe for person two I only want the names and the surname, not the scores. Maybe it's showing you your friends or whatever. So names and surname, and again in one call I can combine these things. And this doesn't have to be from the same entity, like a person.
As an example, like I said, this could be a system that rewards you for traveling. And as an added benefit it will show you the exchange rate against your base currency for the country that you're going to. So for instance, we can add another source method. Let me do this. We inject an exchange rate service and then we add an exchange rate field onto person, similar to what we did with scores. And here you can see that apart from the source that you pass in, you can pass in even more queryable parameters. So this should give us an exchange rate. The other thing that we want to do is also show the weather. So let's add a weather API. That means when I travel to London I can see the exchange rate and the weather of my destination. So I'm going to add a new API, the weather GraphQL API. And I'm just going to cut and paste it, because it's not necessarily important how we built it.
But you can see it's again just a GraphQL API that injects a backend service. This is actually making a real weather call to an API somewhere on the Internet that will give me the weather. So back here I should now actually say, well, let's say person one is traveling to London. Here I just want to refresh so that my schema updates. So I want to get the exchange rate against the pound, and I just want to get the rate. And then I also want to get the weather, which is now a separate API. It's not a field on person. And let's see, what do I need to pass in? I think city, and London, right. And then let's see, I want to return the description, the minimum and the maximum. Okay, so it took a bit longer because it's actually making a call out onto the Internet. But you can see I've got the weather, I've got the exchange rate and I've got the person. So I can combine any number of GraphQL endpoints into one request to get exactly the data that I want.
Okay, so that brings us to the next point, which is asynchronous. At the moment what is happening is that as the request comes into the server, we go fetch the person. We need the person to be able to get the exchange rate. So we have to do these two sequentially, because I need the base currency of this person to get the exchange rate against it. And that's why it's a source field. But something like weather doesn't need to wait for person, because I just need to get the weather for London. So we could theoretically just call this and this concurrently. But at the moment they are being called sequentially. Now the way in which we're going to change that, so that we can do it concurrently, is just to use asynchronicity. We'll start off with the weather service. We'll basically say that this should be a CompletionStage of weather. The actual backend service is already asynchronous, so we can just do this. And then for the person API we now want to do the same here for person.
And for now we'll just do something like this. So that should allow us to actually call weather and person at the same time. Now, to see that in action we also need to look at the log file. So this was the previous call. You can see that I got the person, and then after I got the person I got the exchange rate, and then I got that back. And only then I started getting the weather for London. And then I got the weather, and then I got the scores.
So let's clear this and do that call again, and I get an error. I do get errors every now and then, and it's usually the hot reload. So I'm just restarting Quarkus in dev mode and then we'll give that a try again. Let's go. Okay, here we go. So back here, we'll do the call again. It's still getting an error. Oh, it's that context error. I can actually quickly fix it. There's a problem that this needs to be transactional for some reason. I'm not sure why; actually, that could come out. We don't want to show that now. So let's see if it works now. I'm not going to get into detail on why this is broken or not working when it's not transactional. But anyway.
Okay, so I got the data back, and let's look at the log file now. You can see that I'm getting the weather and the person, and then the exchange rate. Then I got the weather and the exchange rate. So you can see that splitting the work up into two is definitely a better way to do this.
Now, what will happen if I say that I want to travel to London and New York? So I want to do this, where I say that this is the Great Britain pound and this will then be USD. So let's do USD here. And then I want the weather for both of those as well. So I want London's weather and then I want New York's weather. Right, New York. So now it should call person, weather and weather concurrently. But what I also want to do is call these two concurrently, because again, once I have the person, the one doesn't have to wait for the other.
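The ordering described above (person and weather at the same time, exchange rate only once the person is known) can be sketched with the JDK's `CompletionStage` alone. The `fetchPerson`, `fetchWeather` and `fetchExchangeRate` methods and their return values are hypothetical stand-ins for the backend services in the demo.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

// Hypothetical sketch: independent lookups start together, dependent ones are chained.
final class ConcurrentFetchSketch {

    record Person(String names, String baseCurrency) {}
    record Result(Person person, double exchangeRate, String weather) {}

    // Made-up backend calls, each returning a CompletionStage.
    static CompletionStage<Person> fetchPerson(long id) {
        return CompletableFuture.supplyAsync(() -> new Person("Person " + id, "EUR"));
    }

    static CompletionStage<String> fetchWeather(String city) {
        return CompletableFuture.supplyAsync(() -> "Cloudy in " + city);
    }

    // Needs the person's base currency, so it can only start once the person is known.
    static CompletionStage<Double> fetchExchangeRate(String baseCurrency, String against) {
        return CompletableFuture.supplyAsync(() -> baseCurrency.equals(against) ? 1.0 : 0.85);
    }

    static CompletionStage<Result> fetchAll(long id, String city, String currency) {
        // Person and weather do not depend on each other, so both start right away.
        CompletionStage<Person> person = fetchPerson(id);
        CompletionStage<String> weather = fetchWeather(city);

        // The exchange rate is chained onto the person; the weather is combined
        // in once both the rate and the weather have completed.
        return person.thenCompose(p ->
                fetchExchangeRate(p.baseCurrency(), currency)
                        .thenCombine(weather, (rate, w) -> new Result(p, rate, w)));
    }

    public static void main(String[] args) {
        Result result = fetchAll(1, "London", "GBP").toCompletableFuture().join();
        System.out.println(result);
    }
}
```

Asking for a second destination would simply start another weather stage and chain another exchange rate stage onto the same person stage.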
So again, in the person service, I can even change the exchange rate to be a CompletionStage. And that will then allow all the fields that I've added with a source to be fetched concurrently. Right. So now let's look at the log file. You can see that it's getting the weathers at the same time as the person, and then it's getting the two exchange rates. The one is not waiting for the other, which is obviously a much better design in certain cases.
Okay, so what we have looked at so far is how GraphQL solves over- and under-fetching by using a query and by using the source to add fields onto it. And we've looked at batching, multiple requests and asynchronous. So if I just recap batching: before batching, or if you don't do batching, you can get into a situation where a source field in a collection is being called multiple times, and that's not necessarily efficient. But then you can do something like that, where it's all batched up in one call. And then with asynchronous, we started off by looking at how, before we used any CompletionStage, everything was sequential. So we would get the person, and then we'd get the exchange rate, and then we'd get the weather before we return. If we change weather and person to return a CompletionStage, then we can call those two concurrently, and then we can go further to say that even the source fields, so these two fields, can be called concurrently. So that's all going to depend on your use case.
So let's dive a little bit deeper into GraphQL and see what else we can do with it. Okay, so we'll start off with errors and partial responses. I'm just going to revert back to just getting a person and scores. What we'll do is we'll say, let's pretend that the score system, the one that provides this data, is down. So back in our API, here with scores, we will just throw a runtime exception, throw new RuntimeException, and we'll just say the score system is down, right.
So what I don't want to do is penalize the caller and give him nothing. So what we have here is partial results: I can at least return what I have, which is the person, but I do not have the score. So I return null there. So I could still kind of recover, or show an alternative in a front-end case. And there could be multiple errors. And you can see here that the message is not the one that I've typed in my code. It's not saying the score system is down, it's saying server error. And this is because I'm throwing a runtime exception. By default, runtime exceptions will hide the message, mostly for security reasons, and checked exceptions will actually pass it over the boundary, so you can see it. So if we change this, for instance, to be a new score-not-available exception, which we will then throw, this is now a checked exception. So now I will get the message there. Now the behavior of both checked and unchecked exceptions is configurable, meaning that you can say that for this runtime exception I actually want to pass the message over the boundary, and for this checked exception I don't. So all of those combinations are configurable. But the default is basically a blanket rule for all runtime and all checked exceptions. So that's errors and partial responses.
You also have normal errors, like if you put in something here that doesn't exist, although you can already see it here. If you do a query, you'll get a validation error that says that that's not a valid field; it's a normal validation. And Bean Validation also works. I'm not going to show that now because I don't have enough time.
Next I want to show transformation and mapping. Before I continue, let me just quickly fix this. Right, so now the system is back up and this will work again. Now I'll save, and the system is back up.
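The partial-result behaviour described above can be modelled in a few lines of plain Java. This is a sketch of the default rule as the talk states it (runtime exceptions hide their message, checked exceptions pass it on), not the runtime's actual code; the `resolve` helper, the `ScoresNotAvailableException` class, the response layout and the exact "Server Error" text are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;

// Hypothetical model of a partial response: a failing field becomes null
// and an error entry is added, instead of the whole response failing.
final class PartialResultSketch {

    // A checked exception, standing in for the talk's "score not available" exception.
    static final class ScoresNotAvailableException extends Exception {
        ScoresNotAvailableException(String message) {
            super(message);
        }
    }

    // Default rule as described in the talk: unchecked exceptions hide their
    // message, checked exceptions pass it on to the caller.
    static String visibleMessage(Exception e) {
        return (e instanceof RuntimeException) ? "Server Error" : e.getMessage();
    }

    // Resolves the person's names and scores; a failure while loading the
    // scores only affects the scores field.
    static Map<String, Object> resolve(String names, Callable<List<Integer>> scoresLoader) {
        Map<String, Object> person = new LinkedHashMap<>();
        List<String> errors = new ArrayList<>();
        person.put("names", names);
        try {
            person.put("scores", scoresLoader.call());
        } catch (Exception e) {
            person.put("scores", null);
            errors.add(visibleMessage(e));
        }
        Map<String, Object> response = new LinkedHashMap<>();
        response.put("data", person);
        response.put("errors", errors);
        return response;
    }

    public static void main(String[] args) {
        System.out.println(resolve("Jane", () -> {
            throw new RuntimeException("Score system down");
        }));
        System.out.println(resolve("Jane", () -> {
            throw new ScoresNotAvailableException("Score system down");
        }));
    }
}
```

In both cases the caller still receives the person, which is the point of a partial response; only the visibility of the message differs.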
Up until now we've basically used the default mapping that GraphQL gives us out of the box, to say that a long maps to a BigInteger and a date maps this way. But you can actually control that. So we'll start off by looking at the schema here. If you look at person, you'll see that our ID is mapped as a BigInteger. That is because in our model, on person, the ID is marked as a long. Now you might not want to have it as a BigInteger in the schema, and you can then do something like this, where you say @ToScalar and then you say which type you want this to map to. In our case we say that this should map to an Int. Now if I go and look at this again, the person ID is now an Int. Okay, so this is a basic scalar-to-scalar mapping that's available.
Then we support JSON-B out of the box, but we also have our own annotations. So if you don't have JSON-B, then you don't have to use it. So let's quickly add birthday here. Let's see, when is person one's birthday? You'll see the date format is year, month, day. But I can then go change it, either with the JSON-B date format, let's find the birthday, or with the built-in MicroProfile GraphQL one called date format. They act exactly the same. So this should change it to day, month, year. Right? So now I've done the date format. And then it depends on where you place it. I've placed mine on the field, which means that this applies to both input and output. But if you put it on the setter, it'll only apply to the input, and on the getter only to the output.
One of the more complex things that you can do is, if you look here, you'll see that email is actually a complex object in this system. And you know that this can be represented as a string. If I just show you this at the moment: if I want to see email, because it's a complex object, I need to actually get the value back, right.
And I know that even though it's a complex object, in GraphQL I want to represent it as a string. So what I can do is say @ToScalar here and then make it a String. Now, there are some default rules for how it determines how to do this. For marshalling to a string, it will just use toString, so you need to make sure that your toString is implemented. For creating an input from a string, it will go and see if there's a constructor or a setValue, or whatever the field is, to reconstruct this object. Right. So now this will be invalid, because that's not a complex object anymore. Wait, I didn't save. So you can see there it already says email and it doesn't have a subfield, because we've now made it a string. Right? So that's that.
Right, so we have looked at errors, and we have looked at transformation and mapping. Next, I want to talk about mutations. Up until now we've only queried data. Mutations are basically the rest of the CRUD operations. When data is going to change, it mutates. So that's basically when we want to add a person or delete a person or update a person. And that's very straightforward to do. Let's go to our GraphQL endpoint and add this. It's basically just a new annotation to say mutation. It also returns a person, which means the result of a mutation is also a queryable result.
As an example, let's see if this works. I'm going to create a person. This might not work, and then I'll explain why not. Yeah, schema is not configured for mutations, so I'll just restart again. The hot reload, for some reason, doesn't pick this up. It used to, but the hot reload now does some clever footwork around class paths, and for some reason this isn't picked up. So that's something that we need to fix in Quarkus's hot reloading. But anyway, you can see here, I'm basically again just calling the backend person service with an object.
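For the email mapping described above, the two conventions the talk mentions (toString for output, a constructor taking a string for input) look like this on a plain class. The `Email` class here is a hypothetical minimal version, not the one from the demo, and the scalar annotation itself is left out so the sketch has no dependencies.

```java
import java.util.Objects;

// Hypothetical sketch of a complex object that can round-trip through a string.
final class EmailScalarSketch {

    static final class Email {
        private final String value;

        // Input direction: rebuilds the object from the string sent by the client.
        Email(String value) {
            this.value = Objects.requireNonNull(value, "value");
        }

        String getValue() {
            return value;
        }

        // Output direction: this is what would appear as the string in the response.
        @Override
        public String toString() {
            return value;
        }
    }

    public static void main(String[] args) {
        Email email = new Email("jane@example.com");
        String asScalar = email.toString();     // complex object to string
        Email fromInput = new Email(asScalar);  // string back to complex object
        System.out.println(asScalar + " -> " + fromInput.getValue());
    }
}
```

If `toString` were left at the `Object` default, the output would be a class name and hash code rather than the address, which is why the talk stresses implementing it.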
So what I'll do now with mutations is I'm basically saying update the person, and I give it a name. And because I didn't give it an ID, it will generate a new one, because it's an auto-generated ID. And then I'm asking for these fields back, so I should only get names and ID back. Did I not save? So maybe the hot reload was actually working and I just didn't save. Yeah, so this is the other error that I knew about. What's happening there is that I didn't set up my Postgres properly. When I inserted the initial test data, it didn't know what the next ID is. So I get this duplication error, and I didn't bother to fix it. But the bottom line is, this is how I can create a person. You can see now that I got the ID from the database, which is generated, and all of this is still set to null because I haven't created those yet. But again, what I ask for is queryable. So I can now go and update that person. Now I'll pass in the ID because I want to do an update, and then those fields are populated. And similarly you can do a delete. I'm not going to show a delete, but you can do the rest of the CRUD operations.
Okay, what I want to show next is introspection, which we'll do quickly because it's quite fast. And then we'll go on to security. Introspection is an interesting concept where I can use GraphQL to query GraphQL's schema. And again, this is because there is a strongly typed schema available, and I can now query that schema with GraphQL. So I can say, get me all the names of the types in the schema, and then you get all the names of the types. And similarly you can query whatever you want about the actual schema document. That is quite an interesting feature that you have out of the box.
Okay, so let's move on to security. Security is not really part of the GraphQL specification. It's up to the runtime to implement this.
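The create-or-update behaviour of the mutation shown above (no ID means create with a generated ID, an ID means update, and the stored person comes back so the caller can pick fields from it) can be sketched with an in-memory store. `PersonStore` is a hypothetical stand-in for the JPA-backed person service, and the ID handling is a simplification.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a create-or-update mutation against an in-memory store.
final class MutationSketch {

    record Person(Long id, String names, String surname) {}

    static final class PersonStore {
        private final Map<Long, Person> people = new LinkedHashMap<>();
        private long nextId = 1;

        // Creates the person when no id is given, otherwise replaces the stored one.
        // The stored person is returned, so the result can be queried like any other.
        Person updatePerson(Person input) {
            long id;
            if (input.id() == null) {
                id = nextId++;
            } else {
                id = input.id();
                // Keep generated ids ahead of explicitly supplied ones.
                nextId = Math.max(nextId, id + 1);
            }
            Person stored = new Person(id, input.names(), input.surname());
            people.put(id, stored);
            return stored;
        }

        // Removes the person and returns it, or null if there was none.
        Person deletePerson(long id) {
            return people.remove(id);
        }

        int size() {
            return people.size();
        }
    }

    public static void main(String[] args) {
        PersonStore store = new PersonStore();
        Person created = store.updatePerson(new Person(null, "Jane", null));
        Person updated = store.updatePerson(new Person(created.id(), "Jane", "Doe"));
        System.out.println(created + " then " + updated);
    }
}
```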
And because we make this available over HTTP, and because this is in Quarkus, the normal Quarkus security works. What I'm going to demo is just plain old JWT. So I have a Keycloak server running in the background. And what I'm going to do is say @RolesAllowed here, to only allow employees to do this. I think it must be okay, which means if I now go back and do a person with ID one and names, I should get an error because I'm not authorized, right? So, unauthorized exception. What I'll do now, let me just do this to show it, is get a fresh token from Keycloak here, and then I'll pass a request header with a bearer token, basically copy it in, and then I can get the data back. And again, any of the security models in Quarkus will work with GraphQL. Okay, let me remove this, because this can get quite annoying. And remove this. So that's securing your endpoints. And by the way, I'm not going to show it, but you can add security or a different role around this method, and then you will get a partial result back if your user is not in that role.
Okay, next I want to show context. Context allows you to inject the context of this query anywhere in your code. And this is useful if you look at our back end as an example: our person service is making calls to a database, and I can now go and inject this context. And this context gives me all sorts of things, like the actual request, the query, the name, the variables, and a whole lot more about the original request. At the moment, when I do find person, the database still returns all the fields on that person. It's only after it came from the database that we filter out only the fields that should go over the network to the client. But this will allow you to inspect the original request and do a better query on the back end. So the one way is to do this, to inject it. The other way is to use events.
Now there are quite a few events that get fired. During startup, as we build your schema, there's a bunch of events that allow you to take part in the building of the schema. And then there are events like this one, where just before we execute we will fire this event, and you can take part in that by basically saying @Observes. And then you can get the context. As you can see here, there are all sorts of hooks where you can hook into this execution. So let's see if this works. I might have to restart again, right? So let's just do this. Here you can see that I've got the context and I'm just printing out the query, but already I can see that only names has been asked for, which means I can actually optimize my backend query. And that's the idea of the context.
Okay, the next thing I want to show is custom execution. Up until now we have basically allowed execution based on these methods, which serve incoming traffic over HTTP. But you might be in a situation where you want to do a GraphQL query on startup, or on some schedule, or just programmatically for some other reason. In my example I have this use case where every morning I want to do a query to find all the scores, so that I can send out a mail with the leaderboard, as an example. So I have this leaderboard service which is running on a schedule. What you can do is inject an execution service, this one here, and that gives you access to the actual service that's being called. That's the same service that the web request will call. The only difference is that you build up your own query in here, and then this should make the call. To test that, in fact it should already have run, because it'll run on startup. But let's clear the log. In the dev UI you can trigger a schedule, so I can invoke it, and then you can see there that I have printed out what that execution service executed. Okay, so that is custom execution.
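Coming back to the idea of using the context to optimize the backend query: once the requested field names are known, the backend can fetch only those columns. The sketch below assumes the field names have already been read from the context and uses a plain SQL string for illustration, whereas the demo uses JPA; the `COLUMNS` mapping and the `person` table are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: turn the fields requested by the client into a narrower backend query.
final class SelectedFieldsQuerySketch {

    // Whitelist that maps schema field names to column names (both made up).
    private static final Map<String, String> COLUMNS = Map.of(
            "id", "id",
            "names", "names",
            "surname", "surname",
            "birthDate", "birth_date");

    // Builds a SELECT statement that fetches only the requested columns.
    // Fields that are not in the whitelist (for example fields resolved by
    // another service) are ignored; if none remain, the id column is selected
    // so that the statement stays valid.
    static String buildSelect(List<String> requestedFields) {
        List<String> columns = new ArrayList<>();
        for (String field : requestedFields) {
            String column = COLUMNS.get(field);
            if (column != null && !columns.contains(column)) {
                columns.add(column);
            }
        }
        if (columns.isEmpty()) {
            columns.add("id");
        }
        return "SELECT " + String.join(", ", columns) + " FROM person WHERE id = ?";
    }

    public static void main(String[] args) {
        System.out.println(buildSelect(List.of("names")));
        System.out.println(buildSelect(List.of("names", "surname", "scores")));
    }
}
```

Only whitelisted names ever reach the statement, so client-supplied field names cannot be used to inject SQL.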
We have quite a lot of integrations already with MicroProfile in GraphQL. JSON-B: I've shown a little bit of it when we used the date format, but properties and hiding elements and all the other JSON-B properties are supported. Security: I've shown how we've integrated that. Context: I've shown how we can pass this context along using context propagation. What I didn't show is Bean Validation. You can mark your POJOs and/or your method signatures with Bean Validation and that will apply, and throw a normal error, which could result in partial responses. We've also integrated with metrics and tracing, and for both of those you don't have to do anything but include the extension in your application. Then by default we will start emitting metrics and tracing requests. And then generics support: your methods can use generics and that will work.
That brings us to the last part of this, and that is what's in the pipeline. So what are we working on at the moment? We are nearly ready with a client; or rather, the client is actually already ready, but we are still pulling it into Quarkus. This is not the execution service injection like I've shown here, which gives you access to the execution service on the server side. This is a proper client that makes its calls over HTTP. There are two variants of this client that we're pulling in. One is similar to the JAX-RS client. We call it the dynamic client, and it uses a builder pattern to build a query up. And the other one is a typesafe client, which is similar to the REST client in MicroProfile. So that's a typesafe one, where you define the objects and that will make the call.
The other interesting thing we're working on is subscriptions, which is one of the outstanding features of the GraphQL specification. So the GraphQL specification, on a high level, has queries, mutations, and then subscriptions.
And subscriptions basically allow you to subscribe to a query: you get push notifications as data comes in that matches that query. We are busy implementing it over WebSockets in Quarkus. That will hopefully be available pretty soon. And then one of the other things that's high on our priority list is to make paging and filtering easier. At the moment, if you want to do paging of big payloads, you have to code it yourself, defining the page number and the number of pages that you want to return and so on as parameters. But we want to make that easier by just allowing something like an @Paging or @Filtering type of annotation.
And that is it from me. I know it was very fast. What I've shown is made up of these two examples, and these examples contain even more things that I didn't show. But you can go and have a look there, and reach out to me or find me on the Internet if you have any questions. And that's it. Thank you very much.

Conf42 Enterprise Software 2021 - Online

March 25 2021

Phillip Krüger

Principal Software Engineer @ Red Hat