GraphQl - Commercetools - GraphQL - Booklet - EN
GraphQL for
Modern Commerce
Complement Your REST APIs
with the Power of Graphs
Kelly Goetsch
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. GraphQL for
Modern Commerce, the cover image, and related trade dress are trademarks of
O’Reilly Media, Inc.
The views expressed in this work are those of the author, and do not represent the
publisher’s views. While the publisher and the author have used good faith efforts to
ensure that the information and instructions contained in this work are accurate, the
publisher and the author disclaim all responsibility for errors or omissions, including
without limitation responsibility for damages resulting from the use of or reliance
on this work. Use of the information and instructions contained in this work is
at your own risk. If any code samples or other technology this work contains or
describes is subject to open source licenses or the intellectual property rights of others,
it is your responsibility to ensure that your use thereof complies with such licenses
and/or rights.
978-1-492-05684-3
[LSI]
To Oleg Ilyenko
Who introduced all of us at commercetools to GraphQL in 2015. His
passion for GraphQL and leadership in the open source community
served as a model for the rest of us. We are eternally grateful to have
had him as a friend, colleague, and collaborator. Rest in peace,
@easyangel.
Table of Contents
1. Introducing GraphQL
    Commerce Requires More Than REST
    What Is GraphQL?
    GraphQL Compared to REST APIs
    Final Thoughts
3. GraphQL Clients
    Low-Level Networking
    Batching
    Authentication
    Caching
    Language-Specific Bindings
    Frontend Framework Integration
    Final Thoughts
4. GraphQL Servers
    Server Implementations
    Monitoring
    Testing
    Security
    Merging Schemas
    Final Thoughts
CHAPTER 1
Introducing GraphQL
Commerce Requires More Than REST
In the 1990s, commerce platforms shipped with the frontend and
backend as one indivisible unit. In that decade, the web was the only
channel and developers needed only to build a website. Given the
immaturity of frontend frameworks, the commerce platforms had to
offer the frontend as part of the commerce platform. “Headless”
didn’t exist as a concept. REST hadn’t been invented yet and JavaScript
first appeared in 1995. These early commerce platforms contributed
to or invented much of the software used to dynamically
generate web pages, including JSP, JHTML, PHP, and ASP. The
major problem with this approach was that any change, no matter
how small, had to be handled by IT and slotted in as part
of a formal release. This upset marketers who wanted to be able to
make changes on their own.
In the 2000s, content management systems (CMSs) like Day, Interwoven,
and Vignette began to offer integrations with commerce
platforms, whereby the commerce platforms would serve the backend
and the CMS would serve the web frontend. This allowed IT to
manage the backend and marketing to manage the frontend.
Although this made life easier for marketers, the integration
between the frontend and backend was hardcoded with what could
only be described as spaghetti code. The two were permanently
wedded and inseparable. This was all premobile, so, again, the only
channel anyone cared about was the web.
In the 2010s, there was an explosion of consumer electronic devices
that put the internet in everyone’s pockets. Mobile phones with fully
featured browsers and native apps became widely used (see
Figure 1-1). The Internet of Things (IoT) became real as the price of
semiconductors and internet connectivity fell. Even devices as mundane
as TVs started to become connected to the internet and could
be used to facilitate commerce.
Commerce has transformed into something that’s part of our everyday
lives, embedded into the dozens of screens and other internet-connected
devices that we interact with on a daily basis.
While commerce was transforming, representational state transfer
(REST) APIs were emerging as the default means of exchanging data
between disparate systems. When compared to CORBA, SOAP, and
the other standards that preceded it, REST was an enormous
Figure 1-1. Daily hours spent with digital media per adult user, USA
(source: Bond)
Lack of a specification
REST doesn’t have a formal specification, because it’s more
of an architectural style than a formal standard. The term REST was
coined and its principles were put forth in Roy Fielding’s 2000 PhD
Underfetching data
Related to overfetching data, another problem with REST APIs is
underfetching data. To render an order history web page, for example,
you’ll first need to retrieve orders placed by a given customer.
Next, you’ll need to retrieve the shipment status of each order. Then,
you’ll need to find out what a customer can do with each order, such
as whether an order can be returned. Often, the requests to
the different APIs must be made serially because the data from one
API is needed to make the call to the next API, as demonstrated in
Figure 1-3.
Commerce Graphs
Commerce is full of graphs. There are objects (customers, orders,
products, etc.) and relationships between those objects (customers
place orders, which contain products). Figure 1-4 shows a graph
with five directed relationships.
In the Beginning…
In the 2010s, Facebook began to have the same problems with
REST as outlined earlier. The company had more than a billion
active users in 2012. Those billion-plus users accessed Facebook
from every conceivable device in every conceivable configuration.
• Provide a roadmap for the GraphQL specification
• Oversee changes to the specification
• Provide legal, marketing, and community facilitation support
Introducing GraphQL
GraphQL is a formal specification for retrieving and changing data,
similar to how SQL is used to change and retrieve data from database
tables. At a very high level, GraphQL describes the language
and grammar that should be used to define queries, the type system,
and the execution engine of that type system.
A very basic example of a GraphQL query is something like the
following:
query {
  product(id: "94695736", locale: "en_US") {
    brandIconURL
    name
    description
    price
    ableToSell
    averageReview
    numberOfReviews
    maxAllowableQty
    images {
      url
      altText
    }
  }
}
The GraphQL server would then respond with something like this:
{
  "data": {
    "product": {
      "brandIconURL": "https://www.legocdn.com/images/disney_icon.png",
      "name": "The Disney Castle",
      "description": "Welcome to the magical Disney Castle!....",
      "price": "349.99",
      ...
    }
  }
}
Rather than interacting with the multiple APIs that hold this data,
the client-side developers can make one GraphQL query against an
endpoint that’s typically exposed as /graphql. The GraphQL server
then validates the query and calls what are known as “resolvers” to
retrieve (or modify) data from the underlying APIs, protocol buffer
(protobuf) services, datastores, or any other source system. GraphQL doesn’t
take a stance on what programming language you should use, how
resolvers should retrieve data, or much of anything else. GraphQL
is a specification, not a specific implementation.
Benefits of GraphQL
GraphQL offers many benefits. Let’s explore some of them.
Figure 1-6. GraphiQL, the GraphQL IDE
Improved performance
Next, GraphQL dramatically improves the performance for end
customers.
GraphQL makes data from multiple source systems available in one
JSON document. Frontend developers specify exactly the data that
they need, and GraphQL provides it by making parallel requests to
the source systems that have that data. Within a datacenter, latency
is low, and bandwidth and computing power are plentiful. It’s far
more advantageous from a performance standpoint to make all of
those requests within a datacenter and then offer the client a single
document containing that data. Clients are often connected over
mobile networks, which are bandwidth and latency constrained. Client
devices are also often constrained by their computing power. Making
all those HTTP requests has a cost, and small IoT-style clients, wearables,
and other devices don’t have much computing power to
work with.
Also, GraphQL eliminates the problems of over- and underfetching
between the client and the GraphQL server. Frontend developers specify
exactly what they need and GraphQL provides it. There is likely to be
some over- and underfetching when the GraphQL layer calls the source
systems, but again, within a datacenter latency is low and bandwidth is
plentiful.
Drawbacks of GraphQL
Like all new technology, GraphQL isn’t a silver bullet that magically
fixes all of your problems.
GraphQL is another layer that must be maintained with its own
architecture, development, operations, and maintenance needs,
though it does eliminate the need for dozens or hundreds of backends
for frontends.
Also, security, as we discuss in Chapter 4, can be challenging with
GraphQL. The GraphQL specification leaves out security entirely,
leaving it up to each vendor.
The final challenge with GraphQL is that it can be difficult to combine
multiple GraphQL endpoints and schemas. Frontend developers
want one endpoint (/graphql) with one schema, but different
teams and different vendors will all have their own endpoints and
schemas. The commerce platform vendor can expose its
own /graphql endpoint and schema, for example. Your cloud vendor,
CMS vendor, search vendor, various internal teams, and so on might
all expose their own endpoints and schemas. Frontend developers
then need to access multiple endpoints. At that point, why even
bother with GraphQL when REST is already available? Fortunately,
GraphQL server vendors provide a way to give your frontend
developers a single GraphQL endpoint and schema, which we discuss
in detail in Chapter 4.
GraphQL Compared to REST APIs
Part of the reason REST has become so popular is that microservices
have emerged as the default architecture for building many
cloud-based applications, and commerce applications in particular.
As Figure 1-7 shows, microservices are individual pieces of business
functionality that are independently developed, deployed, and managed
by a small team of people from different disciplines. For more
on microservices, take a look at Microservices for Modern Commerce
(O’Reilly, 2017).
Final Thoughts
In this chapter, we discussed the challenges of REST and why commerce,
specifically, requires more than REST on its own. Then, we
introduced GraphQL and discussed its pros and cons. Finally, we
discussed why you should view GraphQL as a complement to REST,
not a replacement for it.
In Chapter 2, we cover the GraphQL specification.
As previously discussed, REST doesn’t have a specification. A REST
endpoint can return a response in a specific standards-compliant
version of JSON or XML, but you’ll find little agreement across
REST endpoints on how response bodies should be structured or
how errors are logged.
Evolvable
With change being the only constant in software development, it’s
inevitable that your GraphQL schema will need to change. Maybe
you add a new product type, add a new promotion attribute, or
must deprecate a payment method. Your GraphQL schema will
change.
When designing GraphQL, its founders strongly opted for evolving
schemas rather than versioning them. Evolving schemas means
adding nonbreaking changes to the schema, thereby slowly evolving
it over time. The GraphQL specification says:
Client Centric
GraphQL is unapologetically built for frontend developers who are
building experiences for clients. The first sentence of the actual
specification (June 2018 version) says:
GraphQL is a query language designed to build client applications
by providing an intuitive and flexible syntax and system for
describing their data requirements and interactions.
The server side is obviously important, but GraphQL was built for
and by frontend developers. Most of the GraphQL ecosystem,
including Facebook’s reference implementation, which underpins
many of the commercial and open source offerings, is written in
JavaScript. Facebook also developed and released the wildly popular
React frontend framework alongside GraphQL, with the two often
being used together.
Strongly Typed
A defining characteristic of the GraphQL specification is that it’s
strongly typed, meaning that the author of the GraphQL schema
precisely defines the types (objects like customers, orders, and products),
fields (attributes like firstName, shippingAddress, and
GraphQL Terminology
Before we get too far, we need to explain in more detail some
GraphQL terms that you’ve seen earlier.
Types
A type in GraphQL is an object, like a customer, order, or product.
It’s a “thing”—a noun.
Here’s an example:
type Product {
  name: String!
  description: String!
  price: Float!
  ...
}
In this example, Product is the type. Types are the building blocks of
GraphQL.
When modeling types, don’t fall into the trap of mapping each of your REST
APIs to its own type. REST APIs, backed by microservices,
are often extremely limited in scope to one type of data and/or one
piece of functionality. Because separate teams build each microservice,
and one of the central tenets of microservices is to avoid
dependencies between teams, you are unlikely to find many
relationships between REST APIs. REST APIs are typically designed
so that they are entirely self-contained, with few, if any, links to
other REST APIs. The whole point of GraphQL, however, is to allow
frontend developers to traverse a graph of connected objects. When
you design your types, make sure that you have as many type-based
references as possible. For example, you’d want your Customer type
to refer to your Order type, and your Order type to refer to your
Customer type:
type Customer {
  orders: [Order!]!
}
type Order {
  customer: Customer!
}
Developers then can traverse the graph by referencing
customer.orders or order.customer in their code.
Fields
A field is an attribute like name, description, price, and so on. Fields
belong to types. Here’s an example of some of Product’s fields:
type Product {
  name: String!
  description: String!
  price: Float!
  parentCategory: Category!
  reviews: [Review]
  ...
}
Following the name of the field, you’ll find its data type. Data types
can be scalars (primitives) of the following type:
• Strings (String)
• Integers (Int)
In the previous example, you can see that name and description are
Strings, whereas price is a Float.
Fields can also have a data type of another type within the same
GraphQL schema. In the previous example, the parentCategory
field points to a Category type. When you build your query, you can
retrieve field values from the referenced types:
query {
  product(id: "94695736") {
    parentCategory {
      name
    }
  }
}
This would return something like the following:
{
  "data": {
    "product": {
      "parentCategory": {
        "name": "Men's Belts"
      }
    }
  }
}
Fields can be marked as required by adding an exclamation point
after the data type. Here’s an example of a required field:
type Product {
  name: String!
}
Here’s an example of an optional field:
type Product {
  reviews: [Review]
}
Rather than having a singular value, GraphQL also supports fields
having multiple values through the use of brackets ([]). A field
holding a single Review would be represented as follows:
type Product {
  review: Review
}
Arguments
You can pass arguments to any GraphQL operation, and they are
often used for retrieving specific nodes in the graph. Going back to
the example we’ve been using in this chapter, the query begins with
an argument to retrieve the product whose ID is "94695736":
query {
  product(id: "94695736") {
    name
  }
}
Notice how the argument is named, unlike in many other programming
languages. In most others, you’d pass in "94695736" by position, but
GraphQL requires both the parameter name and value, which in
this case is id: "94695736".
You can pass in as many arguments as you’d like. Multiple arguments
are often passed in for pagination, or for mutations. Here’s
how you’d create a new product and then retrieve its ID and name
after it has been created:
mutation {
  createProduct(
    name: "The Disney Castle",
    description: "Welcome to the magical Disney Castle!...."
  ) {
    id
    name
  }
}
Variables
GraphQL allows you to pass in named variables to queries. Variables
are prefixed with a $ and are immediately followed by a colon, the data
type, and then an exclamation point if the variable is required. Queries
can declare as many variables as they need. Here’s an example of a query with
two variables:
query orderHistory($id: ID!, $year: Int) {
  ...
}
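To make the relationship between a query and its variables concrete, here is a minimal JavaScript sketch. It assumes the common convention of sending the query text and a separate variables map together in one JSON body. The customer and orders fields and the buildOrderHistoryRequest helper are hypothetical and exist only for illustration.

```javascript
// Query text with two declared variables: $id is required (ID!), $year is optional (Int).
// The customer/orders fields are hypothetical; substitute fields from your own schema.
const ORDER_HISTORY_QUERY = `
  query orderHistory($id: ID!, $year: Int) {
    customer(id: $id) {
      orders(year: $year) {
        id
      }
    }
  }
`;

// Builds the JSON body. The query text never changes; only the variables map does.
function buildOrderHistoryRequest(id, year) {
  if (id === undefined || id === null) {
    throw new Error("$id is declared as ID! and must be provided");
  }
  const variables = { id: String(id) };
  if (year !== undefined) {
    variables.year = year; // optional variable, left out when not supplied
  }
  return JSON.stringify({
    query: ORDER_HISTORY_QUERY,
    operationName: "orderHistory",
    variables,
  });
}
```

Because the query string stays constant, values such as customer IDs are never spliced into the query text by hand.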
Fragments
Sometimes, GraphQL queries can become repetitive by having to
call out the same fields over and over. Let’s take an address, for
example. Retrieving an address would look something like this:
query {
  customer(id: "47937102") {
    firstName
    lastName
    addresses {
      type
      address1
      address2
      address3
      city
      state
      zip
      country
      phone
    }
  }
}
Rather than typing those field names throughout your queries, you
can define a fragment as follows:
fragment addressInfo on Address {
  type
  address1
  address2
  address3
  city
  state
  zip
  country
  phone
}
A fragment is defined on exactly one type, in this case the Address
type. The query can then reference the fragment by name:
query {
  customer(id: "47937102") {
    firstName
    lastName
    addresses {
      ...addressInfo
    }
  }
}
Wherever you need to retrieve those specific fields from the Address
type, you can now use ...addressInfo and spare yourself from
having to type all those fields manually. An added benefit of using
fragments is that if you add additional fields to types, you can
update the fragment directly rather than having to update each
occurrence manually.
Interfaces
Like many object-oriented programming languages, GraphQL
allows you to define interfaces. An interface is basically a type that
can be implemented by other types or queried.
Suppose that you have an order type and want to add subtypes for
B2C and B2B orders. Most of the fields are generic enough to belong
to the order type, but some are specific to B2C orders and some are
Then, you’d define new types (in this case B2COrder and B2BOrder)
that implement the Order interface:
type B2COrder implements Order {
  facebookHandle: String
}
Common examples of interfaces in commerce include orders, products,
customers, payment methods, and so on.
Inputs
When working with mutations, the number of inputs can get to be
excessive. Going back to the order example, let’s define mutations to
create B2C and B2B orders:
type Mutation {
createB2COrder(date: String!, products: [Product!]!,
merchandiseTotal: Float!, shippingTotal: Float!, orderTotal:
Float!, payment: Payment!, facebookHandle: String): B2COrder
Queries
The vast majority of operations executed against a GraphQL server
are queries. A query is a simple retrieval of data, analogous to an
HTTP GET with REST or a SELECT with a database. Data is not
changed, it’s simply retrieved. Retrieving data is what GraphQL was
built and optimized for.
A query follows this structure:
query ProductDetailWebPage { # some query }
GraphQL Operations | 35
{
  "data": {
    "product": {
      "brandIconURL": "https://www.legocdn.com/images/disney_icon.png"
      ...
    }
  }
}
A response could have both "data" and "errors" in the same JSON
document.
Mutations
Mutations are changes to data (create, update, and/or delete) followed
by an optional query. Think of them as HTTP POST/PUT/PATCH/DELETE
methods or INSERT/UPDATE/DELETE commands
in a database.
As mentioned earlier in the design principles of GraphQL, GraphQL
is intended to be a data access layer. It is not intended to be a layer
that contains much, if any, business-level functionality. Many mutations
require executing business functionality. For example, when
you create a customer, you’ll also want to do the following:
And so on.
None of this code should be in GraphQL. The code should be in
your microservices layer. Mutations should be used selectively for
modifying data that doesn’t require much business logic. Product
catalog–related data is probably best for mutations, but anything
Mutations are structured like queries. Start with the operation type
(in this case mutation) followed by an optional name for your mutation
(in this case createDisneyCastle). Then, the underlying mutation
(in this case createProduct) is invoked. Finally, the mutation
calls for the retrieval of the id and name of the product that was just
created:
mutation createDisneyCastle {
  createProduct(
    name: "The Disney Castle",
    description: "Welcome to the magical Disney Castle!....",
    price: "349.99"
  ) {
    id
    name
  }
}
Mutations work with GraphQL concepts such as arguments, variables,
fragments, interfaces, and inputs. Inputs are used most often with
mutations, although any field that takes arguments can accept them.
Mutations can easily lead to unintended bulk changes of data. A
developer could define a deleteAllProducts mutation for testing
locally and forget to remove it before deploying to production. Unless
the mutation is properly secured by role, anyone with access could
easily invoke it in production.
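One way to secure a mutation by role is to wrap its resolver in a guard. The JavaScript sketch below is a minimal illustration under stated assumptions: the requireRole helper, the CATALOG_ADMIN role, and the context.user.roles shape are all hypothetical, and real deployments usually rely on the authorization features of their GraphQL server or gateway.

```javascript
// Wraps a mutation resolver so that it runs only for callers holding an allowed role.
// The context.user.roles shape is an assumption; adapt it to your authentication layer.
function requireRole(allowedRoles, resolver) {
  return function guardedResolver(obj, args, context, info) {
    const roles = (context && context.user && context.user.roles) || [];
    const permitted = roles.some((role) => allowedRoles.includes(role));
    if (!permitted) {
      throw new Error("Not authorized to perform this mutation");
    }
    return resolver(obj, args, context, info);
  };
}

// Hypothetical resolver map: the dangerous mutation is reachable only by catalog admins.
const mutationResolvers = {
  deleteAllProducts: requireRole(["CATALOG_ADMIN"], () => ({ deleted: true })),
};
```

Keeping the check in a wrapper means the rule is declared once, next to the mutation it protects, rather than being repeated inside each resolver body.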
Subscriptions
Subscriptions are real-time streams of data delivered over a persistent,
bi-directional connection, typically a single Transmission Control Protocol (TCP)
socket. Facebook originally built subscriptions to allow its users
to see real-time “Likes” without having to refresh the page.
Suppose that you want to be notified every time your inventory is
changed. Your subscription would look something like this:
subscription {
  inventoryChange(productId: "94695736", location: "FulfillmentCenter32") {
    inventoryCount
  }
}
Every time the underlying inventory is changed, the new value for
inventoryCount will be pushed over a WebSocket or some other
type of persistent connection.
A subscription differs from a query in only two ways: it uses the
operation type subscription, and the server pushes results on an ongoing
basis rather than the client making a one-time pull, as with a query.
Introspection
As discussed earlier, a defining characteristic of GraphQL is that it’s
introspective, meaning it’s possible to query the schema to understand
the following:
CHAPTER 3
GraphQL Clients
Formal GraphQL clients offer basic HTTP request handling plus the
following:
• Low-level networking
• Batching
• Authentication
• Caching
• Language-specific bindings
• Frontend framework integration
Let’s explore each one of these in greater depth.
Low-Level Networking
Making an HTTP request sounds simple, but things can go wrong
in the long, complicated journey from a client to the GraphQL
server and back. The GraphQL server might not respond, the
response might be slow, there might be errors calling the underlying
datastore, the response size could be very large, and so on. The real
world is full of problems. The networking stack of your client can
greatly help with these issues by allowing you to configure the
following:
• Retry policies (how many times to retry, how long between each
retry, etc.)
• Limits on response sizes
• Timeout limits
Depending on your client, you can even swap out HTTP for another
protocol.
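As a rough illustration of the settings listed above, the following JavaScript sketch models a retry policy with exponential backoff alongside timeout and response-size limits. The names backoffDelays, networkPolicy, and nextRetryDelay are hypothetical, and the numbers are placeholders, not recommendations. Real GraphQL clients expose similar options under their own names.

```javascript
// Exponential backoff schedule: baseMs, 2 * baseMs, 4 * baseMs, ... capped at maxMs.
function backoffDelays(maxRetries, baseMs, maxMs) {
  const delays = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(Math.min(baseMs * 2 ** attempt, maxMs));
  }
  return delays;
}

// Hypothetical policy object collecting the three kinds of settings.
const networkPolicy = {
  retryDelaysMs: backoffDelays(3, 200, 1000), // how long to wait before each retry
  timeoutMs: 5000,                            // give up on a single attempt after this long
  maxResponseBytes: 1024 * 1024,              // reject responses larger than 1 MiB
};

// Returns the delay before the next retry, or null when the retries are used up.
function nextRetryDelay(policy, retriesSoFar) {
  return retriesSoFar < policy.retryDelaysMs.length
    ? policy.retryDelaysMs[retriesSoFar]
    : null;
}

// Returns true when a response exceeds the configured size limit.
function isResponseTooLarge(policy, byteLength) {
  return byteLength > policy.maxResponseBytes;
}
```

Capping the delay keeps a long outage from pushing the wait between retries to unreasonable lengths.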
Batching
The entire point of GraphQL is to allow your clients to retrieve
everything needed to render a page or other experience with one
request. In an ideal world, you’d have one HTTP request per page.
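Frontend frameworks like React encourage modularization. Each
“component” in React should be more or less self-contained, including
fetching data. Here’s an example of a very simple component:
import React from 'react'
import {Query} from 'react-apollo'
import gql from 'graphql-tag'
// ProductPrice and its query are illustrative
const PRICE_QUERY = gql`query ($id: ID!) { product(id: $id) { price } }`
const ProductPrice = ({id}) => (
  <Query query={PRICE_QUERY} variables={{id}}>
    {({loading, error, data}) => {
      if (loading || error) return null
      return data.product.price
    }}
  </Query>
)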
Imagine having a product detail page composed of 10 of these components,
each with its own GraphQL query. Even though this
makes the frontend code more manageable from a development
standpoint, you’re back to the same problem as with REST APIs: many
HTTP requests to render a single page.
Some clients allow you to batch together multiple GraphQL queries
so that you can still have the modularity, but multiple queries are
sent to the GraphQL server in one batch. The client will gather all of
the GraphQL requests over a period of tens of milliseconds and then
submit one request to the server.
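The batching idea can be sketched in a few lines of JavaScript. QueryBatcher and its send callback are hypothetical names used only for this illustration. The sketch also assumes that the server accepts an array of operations in one request, which is a convention supported by some servers and clients, not something the GraphQL specification defines.

```javascript
// Collects queries from independent components and sends them as one request.
// QueryBatcher and the send callback are hypothetical; real clients flush on a short timer.
class QueryBatcher {
  constructor(send) {
    this.send = send;   // function that receives an array of { query, variables }
    this.pending = [];
  }

  // Called by each component instead of issuing its own HTTP request.
  enqueue(query, variables = {}) {
    this.pending.push({ query, variables });
  }

  // Sends everything queued so far in a single call and returns the batch size.
  flush() {
    if (this.pending.length === 0) {
      return 0;
    }
    const batch = this.pending;
    this.pending = [];
    this.send(batch);
    return batch.length;
  }
}
```

In a browser, flush would typically be scheduled with a timer a few tens of milliseconds after the first enqueue, so that components rendered in the same pass share one request.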
Authentication
Most GraphQL servers require that clients authenticate
before being able to execute queries. Because GraphQL is
served almost exclusively over HTTP, the authentication scheme
you’ve been using to secure your REST APIs (typically OAuth 2.0)
can easily be reused for GraphQL. Chapter 4 examines this in
greater depth.
Authentication typically requires inserting an HTTP header, as
follows:
import { HttpLink } from 'apollo-link-http';

// token is the access token obtained from your authentication flow
const httpLink = new HttpLink({
  uri: "https://api.myserver.com/graphql",
  headers: {
    authorization: `Bearer ${token}`
  }
});
All HTTP clients, whether GraphQL focused or not, allow you to
insert custom HTTP headers.
Caching
Most of the requests handled by a GraphQL server are for queries.
As discussed in Chapter 1, queries simply retrieve data; they do not
change it. Therefore, many of these queries can be cached locally.
One of the advantages of REST is that it uses the HTTP ecosystem
around caching. You could make an HTTP GET to /ProductCatalog/Product/12345,
receive a Cache-Control response header of “max-age=180”, and
your client (or an intermediary) could cache the results. Every piece
of software touching the HTTP request between the client and
server knows how to handle that HTTP request. With GraphQL, all
operations are typically submitted over HTTP GET or POST
(though, as we’ll discuss later, GraphQL is transport layer agnostic).
HTTP is used as a tunneling mechanism and you’re not able to use
the native verbs and caching mechanisms. It’s a different paradigm
entirely.
Let’s go back to the very first GraphQL response from Chapter 1:
{
  "data": {
    "product": {
      "brandIconURL": "https://www.legocdn.com/images/disney_icon.png",
      "name": "The Disney Castle",
      "description": "Welcome to the magical Disney Castle!....",
      "price": "349.99",
      "ableToSell": true,
      "averageReview": 4.5,
      "numberOfReviews": 208,
      "maxAllowableQty": 5,
      "images": [
        {
          "url": "https://www.legocdn.com/images/products/94695736/1.png",
          "altText": "Fully assembled castle"
        },
        {
          "url": "https://www.legocdn.com/images/products/94695736/2.png",
          "altText": "Castle in the box"
        }
      ]
    }
  }
}
Here’s how you’d cache an entire image type for a whole day using
the Apollo GraphQL implementation:
type Image @cacheControl(maxAge: 86400) {
  id: ID!
  url: String!
  altText: String!
}
Here’s how you’d cache individual fields within a type:
type Product {
  id: ID!
  brandIconURL: String @cacheControl(maxAge: 300)
  name: String! @cacheControl(maxAge: 300)
  description: String! @cacheControl(maxAge: 300)
  price: Float!
  ableToSell: Boolean!
  averageReview: Float
  numberOfReviews: Int
  maxAllowableQty: Int @cacheControl(maxAge: 300)
  images: [Image] @cacheControl(maxAge: 300)
}
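Field-level hints raise the question of how long a whole response may be cached. One conservative policy is that a response is only as cacheable as its least cacheable field. The JavaScript sketch below implements that policy under a simplifying assumption, namely that a field with no hint is treated as uncacheable. The responseMaxAge helper is hypothetical and is not meant to describe the exact algorithm of Apollo or any other server.

```javascript
// Returns how many seconds a response may be cached, given the maxAge hints of the
// fields it contains. A field with no hint is treated as uncacheable (0 seconds).
// This is a simplified, assumed policy, not a description of any specific server.
function responseMaxAge(fieldHints, requestedFields) {
  let maxAge = Infinity;
  for (const field of requestedFields) {
    const hint = fieldHints[field];
    maxAge = Math.min(maxAge, hint === undefined ? 0 : hint);
  }
  return maxAge === Infinity ? 0 : maxAge; // nothing requested means nothing to cache
}

// Hints copied from the product type above; price and the review fields carry no hint.
const productHints = {
  brandIconURL: 300,
  name: 300,
  description: 300,
  maxAllowableQty: 300,
  images: 300,
};
```

Under this policy, a query for name and description could be cached for five minutes, whereas adding price to the selection would make the whole response uncacheable.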
cached. Therefore, you really need to use an actual GraphQL client,
rather than a traditional HTTP client.
Language-Specific Bindings
Frontend developers want to be able to call the GraphQL server
using the programming language in which they’re writing the clients.
JavaScript developers want a GraphQL client written in JavaScript.
Swift developers want a GraphQL client written in Swift. And
so on. Even within the programming languages, there are “flavors.”
In the JavaScript world, there are Angular, Vue, Meteor, Ember, and
others.
A good client offers the same functionality regardless of the programming
language the developer chooses.
Final Thoughts
In this chapter, we explained what GraphQL clients are, why they’re
better than clients that just handle HTTP requests, and what value
specifically GraphQL clients offer.
In Chapter 4, we discuss GraphQL servers.
CHAPTER 4
GraphQL Servers
The HTTP server accepts the GraphQL queries and then passes
them to the core GraphQL engine. When the engine responds, the
HTTP server then passes the JSON response back to the client.
Let’s discuss this in more detail.
• Authentication
• Authorization
• Protection from malicious queries
• Monitoring
• Metrics collection
• Health checking
And so on. There are hundreds of feature-rich, mature HTTP
servers available, so it makes sense to take advantage of that ecosystem
to front your GraphQL engine.
Most GraphQL servers are written in JavaScript and therefore
require a JavaScript-based HTTP server such as Express, Koa, or
Hapi. You can easily embed these HTTP servers in your application.
You can also layer your HTTP servers with other intermediaries that
are capable of working with HTTP. For example, you could put AWS
Elastic Load Balancer (ELB) in front and have that route HTTP
requests down to Express. AWS ELB could have the logic for security,
monitoring, and more, and Express could serve as a pass-through
to the GraphQL engine.
By custom, GraphQL is exposed as /graphql. All requests are posted
to that single URI, typically as HTTP POST, though HTTP GET is
often used, as well.
Here are the parameters that you’ll need to post:
query
The actual query, like "query ($id: ID!) {product(id: $id) {price}}". This is
required.
operationName
The name of the query to execute, in case there are multiple
queries provided. Recall that in Chapter 2 we discussed that
multiple queries are possible in the same string. This is required
only if multiple queries are provided.
variables
A map of key/value pairs that are used as variables.
Here’s an example of what you’d post to a /graphql URI over HTTP
POST:
{
  "query": "query ($id: ID!) {product(id: $id) {price}}",
  "variables": { "id": "94695736" }
}
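On the receiving side, the HTTP server has to check these three parameters before handing the request to the GraphQL engine. The JavaScript sketch below shows one way that check might look. parseGraphQLRequestBody is a hypothetical helper, and production HTTP frameworks and GraphQL server libraries perform equivalent validation for you.

```javascript
// Validates the JSON body of a POST to /graphql and normalizes its three parameters.
// parseGraphQLRequestBody is a hypothetical helper, not part of any library.
function parseGraphQLRequestBody(rawBody) {
  let body;
  try {
    body = JSON.parse(rawBody);
  } catch (err) {
    throw new Error("Request body is not valid JSON");
  }
  if (body === null || typeof body !== "object" || Array.isArray(body)) {
    throw new Error("Request body must be a JSON object");
  }
  // "query" is the only required parameter.
  if (typeof body.query !== "string" || body.query.trim() === "") {
    throw new Error('Missing required "query" parameter');
  }
  // "variables" is an optional map of key/value pairs.
  const variables = body.variables == null ? {} : body.variables;
  if (typeof variables !== "object" || Array.isArray(variables)) {
    throw new Error('"variables" must be a map of key/value pairs');
  }
  // "operationName" matters only when the query string holds several operations.
  const operationName =
    typeof body.operationName === "string" ? body.operationName : null;
  return { query: body.query, operationName, variables };
}
```

Rejecting malformed requests at this stage keeps obviously invalid input away from the parsing and validation steps that follow.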
Next, it’s time for the GraphQL engine to parse the query.
Figure 4-3. An AST representation of a sample query (from https://astexplorer.net)
Executing Queries
Now that the server has been started and the query has been parsed
and validated, it’s time to actually execute it.
The GraphQL server starts by crawling the AST it produced earlier.
It then executes what are called “resolver” functions for each type
and field to retrieve the data from the underlying source. Finally, the
GraphQL server constructs a single JSON response that is passed
back to the HTTP server in front.
Let’s spend some time on resolvers because they’re really the heart of
GraphQL. A resolver is the function that calls the underlying REST
APIs, databases, legacy backends, or any other source of data and
returns the data in the proper format. Resolvers shouldn’t contain any
business logic, because GraphQL is strictly an intermediary.
Here’s an example of a simple query:
query {
  product(id: "94695736", locale: "en_US") {
    name
  }
}
And here’s a simple response:
{
  "data": {
    "product": {
      "name": "The Disney Castle"
    }
  }
}
In the GraphQL server, there’s a resolver function for the “name”
field that returns the actual name of the product. Here’s what that
function would look like in GraphQL.js, the JavaScript-based
GraphQL reference implementation from Facebook:
name(obj, args, context, info) {
if (context.product == null) {
fetch('https://api.myserver.com/product/
94695736')
.then(resp => resp.json())
.then(context.product = resp)
}
return context.product.name;
)
In this example, name is the field name, obj is the parent object (in
this example, the product returned by the product resolver), args
holds any arguments provided to the field (like locale: "en_US" on
the product field), context is a catch-all object that can be used to
store long-lived objects of value to other resolver functions, and
info describes the current execution state. Caching the product on
context matters because you wouldn’t want to call a REST API for
every field.
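One way to avoid a REST call per field is to cache the in-flight request on context, sketched below. The loadProduct helper, the fake backend, and the placeholder description are all hypothetical; a production server would more likely use a purpose-built batching and caching library.

```javascript
// Hypothetical helper: fetch a product at most once per request by caching the
// in-flight promise on the context object shared by all resolvers.
function loadProduct(context, id, fetchProduct) {
  if (!context.productPromises) {
    context.productPromises = new Map();
  }
  if (!context.productPromises.has(id)) {
    context.productPromises.set(id, fetchProduct(id));
  }
  return context.productPromises.get(id);
}

// Stand-in for a REST call; counts how often the backend is hit.
let backendCalls = 0;
const fakeFetchProduct = async (id) => {
  backendCalls += 1;
  return { id, name: 'The Disney Castle', description: 'Placeholder description' };
};

// Field resolvers that share one backend call through context.
const productResolvers = {
  name: async (obj, args, context) =>
    (await loadProduct(context, obj.id, fakeFetchProduct)).name,
  description: async (obj, args, context) =>
    (await loadProduct(context, obj.id, fakeFetchProduct)).description,
};

const context = {};
const parent = { id: '94695736' };
const pending = [
  productResolvers.name(parent, {}, context),
  productResolvers.description(parent, {}, context),
];
// Both resolvers have started, yet the backend was called only once.
```

Caching the promise rather than the resolved value means a second resolver that starts before the first response arrives still reuses the same request.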
Server Implementations
GraphQL is a specification, not a specific implementation. Anyone
can write a GraphQL server using whatever programming language
and implementation methodology, so long as it adheres to the speci‐
fication. As we’ve discussed, the GraphQL specification is fairly
silent on how the internals of a GraphQL server should work.
Much of the GraphQL community uses GraphQL.js either directly
or indirectly as the GraphQL server. GraphQL.js was released by
Facebook (along with the original specification) in 2015 and is
Monitoring
Like all services in production, you must monitor GraphQL for
availability, functionality, and performance. Unlike REST APIs, for
which each endpoint can be monitored separately, all GraphQL
requests are GETs or POSTs to /graphql. A query can be for one field
or for an entire product catalog.
The key to monitoring GraphQL is to monitor your resolver func‐
tions because that’s where the real work happens. The GraphQL
server itself is very unlikely to fail. Within each resolver function, it’s
best practice to log a correlation ID, the operation type, and the
query complexity (which we discuss soon) to a log file and/or a time
series database like InfluxDB or Prometheus. You then can layer an
analytics platform like Grafana on top to aggregate and analyze the
data.
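As a rough sketch of that practice, a wrapper can record one log entry per resolver call. The withMonitoring name, the log sink, and the correlationId, operationType, and complexity properties on context are assumptions; your server would need to populate them for each request.

```javascript
// Hypothetical wrapper: records one log entry per resolver call.
// `log` is any sink, such as a file writer or a metrics client.
function withMonitoring(fieldName, resolver, log, now = Date.now) {
  return (obj, args, context, info) => {
    const start = now();
    const record = () =>
      log({
        field: fieldName,
        correlationId: context.correlationId, // assumed to be set per request
        operationType: context.operationType, // 'query' or 'mutation'
        complexity: context.complexity, // assumed to be computed before execution
        durationMs: now() - start,
      });
    let result;
    try {
      result = resolver(obj, args, context, info);
    } catch (err) {
      record();
      throw err;
    }
    // Asynchronous resolvers are logged once their promise settles.
    if (result && typeof result.then === 'function') {
      return result.finally(record);
    }
    record();
    return result;
  };
}

// Demo with an in-memory log and a fake clock that advances 5 ms per reading.
const entries = [];
let fakeTime = 0;
const monitoredName = withMonitoring(
  'Product.name',
  (obj) => obj.name,
  (entry) => entries.push(entry),
  () => (fakeTime += 5)
);
const value = monitoredName(
  { name: 'The Disney Castle' },
  {},
  { correlationId: 'req-1', operationType: 'query', complexity: 2 },
  {}
);
```

Wrapping resolvers this way keeps the logging in one place instead of repeating it inside every resolver body.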
Per the GraphQL specification, GraphQL servers may execute the
fields of a query concurrently but must execute the top-level fields
of a mutation serially. Therefore, the latency of any given GraphQL
query is bounded by its slowest chain of resolvers, whereas the
latency of any given GraphQL mutation is roughly the sum of its
top-level resolvers.
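The arithmetic behind that difference can be sketched in a few lines. The function name and the timings are made up, and the model ignores nesting and server overhead.

```javascript
// Rough latency model: independent query fields can overlap, so the slowest
// one dominates; top-level mutation fields run one after another, so they add up.
function estimateLatencyMs(operationType, resolverDurationsMs) {
  if (operationType === 'mutation') {
    return resolverDurationsMs.reduce((sum, ms) => sum + ms, 0);
  }
  return Math.max(0, ...resolverDurationsMs);
}

const durations = [40, 120, 75]; // made-up timings for three top-level fields
const queryLatency = estimateLatencyMs('query', durations); // 120
const mutationLatency = estimateLatencyMs('mutation', durations); // 235
```

This is why a slow resolver hurts a mutation more predictably than a query: in a mutation every millisecond is added to the total.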
Commercial GraphQL vendors have full monitoring solutions in
place already, so it’s best to use that functionality if it’s available.
Testing
Every time you touch your schema, resolvers, or underlying data
sources (REST APIs, databases, legacy backends, or any other source
of data) you need to retest your entire GraphQL layer to make sure
you didn’t introduce any errors.
Fortunately, GraphQL is easy to test in local environments as well as
in integration or QA environments. With GraphQL, you have a
fixed set of inputs (queries, mutations, and variables) and a fixed set
of outputs (in nicely formatted JSON). It’s easy to write test cases
with real or mocked data that exercise every type and field in your
schema. Writing tests should be mandatory for every new type and
field introduced to your schema.
If you want to provide your frontend developers with some mocked
data so that they can build their frontends as the backend is being
built in parallel, it’s easy to have each resolver return some mocked
Security
A topic of particular importance to those adopting GraphQL is
security. GraphQL’s centralization makes security easier (authentica‐
tion, authorization, etc.) but also more challenging (expensive quer‐
ies, destructive mutations, etc.).
Let’s explore the security-related topics that you’ll need to address.
Authentication
Though the /graphql URI is often publicly available, you don’t want
just anyone to be able to call it. Users should be required to properly
authenticate. Authentication ensures that a user, whether a human
or another system, is who they claim to be.
Authentication should be performed in a layer that sits atop your
GraphQL server. The HTTP servers embedded within a typical
GraphQL server tend to be fairly minimalist and therefore might
not support the additional authentication-related features that a
more robust HTTP server/load balancer/reverse proxy would be
able to support. You also want to shield your GraphQL server from
abusive queries and denial of service attacks.
Because GraphQL is served over HTTP, you can take advantage of
all of the common authentication schemes and tooling available for
traditional REST APIs. See APIs for Modern Commerce (O’Reilly
2017) for more information.
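In practice this check usually lives in a reverse proxy or API gateway, but the logic can be sketched in a few lines. The authenticate function, the bearer-token scheme, and the verifyToken stand-in are assumptions for illustration only.

```javascript
// Hypothetical pre-check that runs before a request reaches the GraphQL server.
// `verifyToken` is an assumed function that returns a user object or null.
function authenticate(headers, verifyToken) {
  const header = headers.authorization || '';
  const match = /^Bearer (.+)$/.exec(header);
  if (!match) {
    return { status: 401, user: null };
  }
  const user = verifyToken(match[1]);
  return user ? { status: 200, user } : { status: 401, user: null };
}

// Stand-in verifier with a single made-up token.
const verifyToken = (token) =>
  token === 'valid-token' ? { id: 'u1', role: 'admin' } : null;

const accepted = authenticate({ authorization: 'Bearer valid-token' }, verifyToken);
const rejected = authenticate({}, verifyToken);
```

Only requests that come back with a user would be forwarded to the GraphQL server, typically with that user attached to context for later authorization checks.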
Authorization
After you’ve authenticated your client, you must now authorize that
client to call specific operations (query, mutation, subscription,
introspection), specific types (products, orders, inventory, etc.) and
specific fields (price, quantity, availableToSell).
Here are some common attributes you’d base authorization rules on:
• User ID/name
• Role(s)
• Organization
Within each resolver, you can then apply limited business logic.
Here’s a very simple example of how you’d prevent merchandising
team members from viewing the products field of an order:
products(obj, args, context, info) {
  // Hide products from unauthenticated users and the merchandising role
  if (context.user == null || context.user.role == "merchandising") {
    return null;
  }
  return context.order.products;
}
If the user isn’t attached to context or the user has the wrong role,
you could return null (as in this example), an empty value/array, or
throw an error to the client. It’s up to you.
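If many fields need the same rule, one option is to write the check once as a wrapper. The sketch below assumes the same context.user and context.order shape as the example above; denyRoles and the support role are hypothetical names.

```javascript
// Hypothetical higher-order function: wraps a resolver with a role check so the
// rule is written once rather than repeated inside every resolver.
function denyRoles(deniedRoles, resolver) {
  return (obj, args, context, info) => {
    const user = context.user;
    if (user == null || deniedRoles.includes(user.role)) {
      return null; // could also return an empty array or throw an error
    }
    return resolver(obj, args, context, info);
  };
}

const orderResolvers = {
  products: denyRoles(['merchandising'], (obj, args, context) => context.order.products),
};

const order = { products: ['94695736'] };
const forSupport = orderResolvers.products({}, {}, { user: { role: 'support' }, order }, {});
const forMerch = orderResolvers.products({}, {}, { user: { role: 'merchandising' }, order }, {});
const forAnonymous = orderResolvers.products({}, {}, { order }, {});
```

The wrapped resolver stays free of authorization code, which keeps the limited business logic in resolvers easy to audit.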
Expensive Queries
One of the benefits of GraphQL is that it allows you to query and
mutate large amounts of data with a single line of text. Can you
imagine the amount of work a GraphQL server would need in order
to serve this response?
query lotsOfData {
allProducts
allSKUs
allCategories
allCustomers
allOrders
}
That single query would quickly peg any server’s CPU. The response
size would be well into the gigabytes. If someone were to acciden‐
tally run that a few times, it could very quickly bring down an entire
GraphQL server.
Another challenge with GraphQL is that its ability to traverse a
graph can lead to deep recursions that burn valuable resources. You
could define a product with a reference to category and a category
with a reference to all of its products as follows:
type Product {
category: Category!
}
type Category {
products: [Product]
}
Now imagine a query like this:
query somethingMalicious {
  allProducts {
    category {
      products {
        category {
          products {
            category
          }
        }
      }
    }
  }
}
Fortunately, there are well-established ways to protect your
GraphQL server from overly complex queries, whether malicious or
not.
The simplest is to filter queries by size, measured by any of the
following:
• Number of characters
• Number of bytes
• Number of unique types and/or fields requested
This filtering can be done in the HTTP server above your GraphQL
server.
Size is a crude proxy for cost, so this isn’t very effective on its own,
but queries that are dramatically larger than others should be
filtered. You might want to block all HTTP requests that are larger
than 250 kilobytes, for example.
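A byte-size check of this kind takes only a few lines. The sketch below assumes a Node runtime (for Buffer) and uses the 250-kilobyte figure purely as the example limit; the function and constant names are hypothetical.

```javascript
const MAX_BODY_BYTES = 250 * 1024; // the 250-kilobyte example limit

// Returns true when the raw request body exceeds the byte limit.
function isBodyTooLarge(rawBody, maxBytes = MAX_BODY_BYTES) {
  return Buffer.byteLength(rawBody, 'utf8') > maxBytes;
}

const smallBody = JSON.stringify({ query: '{ product(id: "94695736") { name } }' });
// An artificially bloated query, well over the limit.
const hugeBody = JSON.stringify({ query: '{ ' + 'allProducts '.repeat(30000) + '}' });

const smallRejected = isBodyTooLarge(smallBody);
const hugeRejected = isBodyTooLarge(hugeBody);
```

Most HTTP servers and reverse proxies expose an equivalent setting, so in production you would likely configure the limit there rather than write it yourself.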
Timeouts
Another way to protect your GraphQL server is to kill queries that
are taking too long to execute. If a query is running for 10 seconds,
something is probably wrong and the query should be terminated.
It’s best to terminate long-running queries both at the HTTP server
above the GraphQL server and within the GraphQL server itself.
Depending on the implementation, it is possible to halt the
execution of the resolvers, but there might be unexpected errors as
connections to data sources are abruptly terminated.
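Within the GraphQL server, one cooperative approach is to store a deadline on context and have resolvers check it before starting expensive work. This is a sketch under assumptions: createDeadline and the deadline property are hypothetical, and a fake clock stands in for real time.

```javascript
// Hypothetical per-request deadline stored on context. Resolvers call check()
// before starting expensive work so execution stops once the budget is spent.
function createDeadline(budgetMs, now = Date.now) {
  const expiresAt = now() + budgetMs;
  return {
    check() {
      if (now() > expiresAt) {
        throw new Error(`Query exceeded its ${budgetMs} ms time budget`);
      }
    },
  };
}

// A fake clock makes the behavior easy to see without waiting.
let clock = 0;
const context = { deadline: createDeadline(10000, () => clock) };

const nameResolver = (obj, args, ctx) => {
  ctx.deadline.check();
  return obj.name;
};

const early = nameResolver({ name: 'The Disney Castle' }, {}, context); // within budget
clock = 10001; // simulate just over 10 seconds passing
let lateError = null;
try {
  nameResolver({ name: 'The Disney Castle' }, {}, context);
} catch (err) {
  lateError = err.message;
}
```

Because resolvers stop at a point of their own choosing, this approach avoids cutting off a data source connection mid-call, at the price of not interrupting a call that is already in flight.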
Allowlists
Rather than allow all GraphQL queries, another approach is to cre‐
ate an allowlist of acceptable queries. Only queries on the list are
allowed to be executed.
If you have control over your clients, you can run tools like Persist‐
GraphQL to analyze your code and pull out any GraphQL queries.
Those queries are then added to the list. If you were to pull up
GraphiQL and arbitrarily execute queries, they wouldn’t work.
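The runtime half of an allowlist can be sketched with a set of query hashes. This is not how any particular tool works; the function names are hypothetical, and the whitespace normalization is a simplification that would also collapse whitespace inside string arguments.

```javascript
const crypto = require('node:crypto');

// Collapse whitespace so formatting differences do not change the hash.
function normalize(query) {
  return query.replace(/\s+/g, ' ').trim();
}

function hashQuery(query) {
  return crypto.createHash('sha256').update(normalize(query)).digest('hex');
}

// Build the allowlist once, for example at build time from extracted queries.
function createAllowlist(queries) {
  const hashes = new Set(queries.map(hashQuery));
  return (query) => hashes.has(hashQuery(query));
}

const isAllowed = createAllowlist([
  'query { product(id: "94695736") { name } }',
]);

const sameQueryReformatted = `query {
  product(id: "94695736") { name }
}`;
const adHocQuery = 'query { allCustomers }';

const knownAccepted = isAllowed(sameQueryReformatted);
const adHocAccepted = isAllowed(adHocQuery);
```

Storing hashes rather than full query text keeps the list small, and a client could send just the hash if both sides agree on the scheme.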
Query depth
Another option for protecting your GraphQL server is to check the
depth of your queries before you execute them. Using something
like graphql-depth-limit, you can see how many levels deep your
queries are.
For example, this query is one level deep:
query somethingMalicious {
allProducts
}
This query is two levels deep:
query somethingMalicious {
allProducts {
category
}
}
And so on.
Part of the value of GraphQL is that you can form these large, nested
queries. But there needs to be a limit. A query five levels deep is
verging on abusive. A query 10 levels deep is definitely abusive.
An advantage of checking query depth is that it forces your develop‐
ers to come up with more elegant solutions. Nobody wants to write
or debug a query that’s five levels deep.
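To show the idea without depending on a library, here is a simplified depth counter that tracks braces in the query text. It is a sketch, not the algorithm used by graphql-depth-limit: it ignores braces inside string arguments or comments, counts input-object braces as levels, and does not expand fragments. The function names are hypothetical.

```javascript
// Counts the deepest level of nested selection sets by tracking braces.
function queryDepth(query) {
  let depth = 0;
  let maxDepth = 0;
  for (const ch of query) {
    if (ch === '{') {
      depth += 1;
      maxDepth = Math.max(maxDepth, depth);
    } else if (ch === '}') {
      depth -= 1;
    }
  }
  return maxDepth;
}

// Rejects a query before execution when it nests too deeply.
function assertDepthAllowed(query, maxAllowed) {
  const depth = queryDepth(query);
  if (depth > maxAllowed) {
    throw new Error(`Query depth ${depth} exceeds the limit of ${maxAllowed}`);
  }
  return depth;
}

const oneLevel = queryDepth('query somethingMalicious { allProducts }');
const twoLevels = queryDepth('query somethingMalicious { allProducts { category } }');
```

A production check would walk the parsed AST instead of the raw text, which handles fragments and string literals correctly.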
Query complexity/cost
Similar to query depth, you can estimate the cost of a query before
it’s executed by looking at its complexity. Here are some factors that
influence a query’s complexity:
• Nesting depth
• How many fields are requested
• How many types are requested
query {
product(id: "94695736", locale: "en_US") { # cost=1
name # cost=1
}
}
You could assign a higher weight to specific fields. For example,
retrieving a product and its attributes is pretty straightforward. Now
let’s add in price:
query {
  product(id: "94695736", locale: "en_US") { # cost=1
    name                                     # cost=1
    description                              # cost=1
    brandIconURL                             # cost=1
    price                                    # cost=5
  }
}
This query comes out with a cost of 9. Price has a higher complexity
score because retrieving it requires another call to a different REST
API. Network hops are always more expensive and therefore cost
more.
You then can set a maximum cost for a query. For example, you
could set a cost limit of 100. As with query depth, enforcing a maxi‐
mum cost forces your developers to come up with more elegant sol‐
utions.
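The cost arithmetic for the priced query can be sketched as a recursive sum. The selection tree below is a hand-written stand-in for the parsed query, and FIELD_COSTS, queryCost, and MAX_COST are hypothetical names; real cost-analysis libraries typically read weights from the schema.

```javascript
// Hypothetical per-field weights; any field not listed costs 1.
const FIELD_COSTS = { price: 5 };

// Simplified selection tree: product { name description brandIconURL price }
const selections = [
  {
    name: 'product',
    selections: [
      { name: 'name' },
      { name: 'description' },
      { name: 'brandIconURL' },
      { name: 'price' },
    ],
  },
];

// Sum the cost of every field, recursing into nested selections.
function queryCost(fields) {
  return fields.reduce((total, field) => {
    const own = FIELD_COSTS[field.name] ?? 1;
    return total + own + queryCost(field.selections || []);
  }, 0);
}

const MAX_COST = 100; // the example limit
const cost = queryCost(selections); // 1 + 1 + 1 + 1 + 5 = 9
const allowed = cost <= MAX_COST;
```

Because the cost is computed from the query alone, the server can reject an expensive request before calling a single resolver.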
Merging Schemas
The greatest value of GraphQL is that there’s a single schema for
frontend developers to execute queries against. Developers can get
any data they want from any backend datasource with one query.
The GraphQL schema and the interconnectedness of the types and
fields is what enables this. Unfortunately, the GraphQL layer can
quickly become a monolith. Imagine having 25 different microser‐
vice teams, each trying to contribute to a single GraphQL schema as
depicted in Figure 4-4. It quickly becomes complicated. The whole
“monolith in the pipes” problem is what led to service-oriented
architecture’s decline.
Given this tension between centralization and decentralization, how
do you allow individual teams to work in parallel while exposing a
single cohesive GraphQL schema to your clients?
Separate Files
A very simple yet effective way of distributing the ownership of your
GraphQL schema is to break apart your schema into separate physi‐
cal files. When you instantiate your GraphQL server, you need only
to pass it a string containing your schema definition. That string
could be retrieved from 1 file or 100 files; it doesn’t matter to your
server, which needs a single string. At runtime or at build time, you
could easily combine multiple files. Popular libraries like
graphql-import can automate this process for you.
In this model, you would have your pricing microservice team
own pricing.graphql, while your product catalog team would own
product_catalog.graphql, and so on. Different teams will still be
touching other teams’ schema definitions, but at least there’s some
ownership and physical separation of definitions.
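Combining files can be as simple as reading and concatenating them, as sketched below with Node's standard library. The loadTypeDefs function and the schema contents are hypothetical, and plain concatenation only works when the files do not define the same type twice.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Read every .graphql file in a directory and join them into one schema string.
function loadTypeDefs(dir) {
  return fs
    .readdirSync(dir)
    .filter((file) => file.endsWith('.graphql'))
    .sort() // a stable order keeps builds reproducible
    .map((file) => fs.readFileSync(path.join(dir, file), 'utf8').trim())
    .join('\n\n');
}

// Demo: two team-owned files written to a temporary directory.
const schemaDir = fs.mkdtempSync(path.join(os.tmpdir(), 'schema-'));
fs.writeFileSync(path.join(schemaDir, 'pricing.graphql'), 'type Price { amount: Float }\n');
fs.writeFileSync(
  path.join(schemaDir, 'product_catalog.graphql'),
  'type Product { name: String price: Price }\n'
);

const typeDefs = loadTypeDefs(schemaDir);
```

The resulting string is what you would pass to your GraphQL server at startup. If two teams both need to add fields to a shared type such as Query, each file would have to extend that type rather than redefine it.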
Schema Stitching
To this point, we’ve only discussed having different microservice
teams within an organization contributing to the same schema.
With many software vendors exposing their own GraphQL end‐
points, your frontend developers could end up having to call
different /graphql endpoints based on what data they want to
retrieve. Suppose that you have a commerce platform vendor, a
CMS vendor, and some microservices you’ve built in-house. Each
organization exposes its own /graphql endpoint as follows:
Commerce platform: http://commerce-platform-vendor.com/graphql
query product {
product(id: "94695736") {
displayName
}
}
CMS: http://cms-vendor.com/graphql
query content {
content(productId: "94695736") {
longDescription
images
}
}
Your own microservices: http://your-company.com/graphql
query inventory {
inventory(productId: "94695736") {
quantity
}
}
Imagine the challenges your frontend developers would have calling
each of those endpoints, each with their own authentication and
authorization schemes. You might as well go back to calling individ‐
ual REST endpoints. You can run into the same issue if you have
multiple teams within an organization each exposing their own
GraphQL endpoint.
With GraphQL schema stitching (again, not part of the GraphQL
specification), you can combine those three queries into one:
query productDetailPage {
  product(id: "94695736") {
    displayName
  }
  inventory(productId: "94695736") {
    quantity
  }
  content(productId: "94695736") {
    longDescription
    images
  }
}
Schema Federation
Increasingly, schema federation is replacing schema stitching.
Schema federation allows multiple teams/vendors to contribute to a
single type, so that clients need to query only one type.
Here’s an example of how you’d instantiate your Apollo-based
GraphQL server with different types from different sources:
const { ApolloServer } = require('apollo-server');
const { ApolloGateway } = require('@apollo/gateway');

// Each entry points the gateway at one federated /graphql endpoint
const gateway = new ApolloGateway({
  serviceList: [
    { name: 'product', url: 'http://commerce-platform-vendor.com/graphql' },
    { name: 'content', url: 'http://cms-vendor.com/graphql' },
    { name: 'inventory', url: 'http://your-company.com/graphql' }
  ]
});

(async () => {
  // Compose the federated schema, then serve it from a single endpoint
  const { schema, executor } = await gateway.load();
  const server = new ApolloServer({ schema, executor });
  server.listen();
})();
You can then query a single type (in this case product) and it will
magically pull fields from the appropriate types from the appropri‐
ate /graphql endpoints provided the @key and @external directives
are properly used:
query productDetailPage {
  product(id: "94695736") {
    displayName      # from http://commerce-platform-vendor.com/graphql
    longDescription  # from http://cms-vendor.com/graphql
    images           # from http://cms-vendor.com/graphql
    inventory        # from http://your-company.com/graphql
  }
}
Schema federation at scale is difficult but worth it due to the
autonomy it gives each team.
Final Thoughts
By now you should have a firm understanding of the shortcomings
of using plain REST for commerce, graphs and how they are used in
everyday life, the origins of GraphQL, the GraphQL specification,
and how GraphQL clients and servers work.
Adopting GraphQL will give your frontend developers a substantial
competitive advantage by allowing them to get exactly the data they
want, without the burden of trying to find and retrieve data from
various REST APIs. Even though GraphQL is an additional layer to
maintain, many will find that the relief of not having to support
many diverse, rapidly changing frontends far outweighs the overhead
incurred by adopting it.
Happy GraphQL’ing!
Acknowledgments
Thank you to Tori Hall, Stephanie Sprinkle, Rob Senn, Laura Luiz,
Yann Simon, Tamas Piros, and Camilo Jimenez.