March 23, 2016

Building a GraphQL store from first principles

Sashko Stubailo

Last week in my post about converting the Discourse API to GraphQL, I promised a part 2 this week. But we’re going to take a detour first and think about one answer to the question: what does it look like to cache a GraphQL result on the client? How does this relate to similar systems like Redux and Meteor?

Welcome to Building Apollo, a publication where we share what we’ve learned about all things GraphQL as we work on Apollo: a data stack for modern apps.

Let’s think about some of the goals we might have for caching on the client:

  1. Reducing the amount of data we load by using data we already have on the client
  2. Refetching data that might have changed when we mutate the server-side data
  3. Optimistic UI by operating on the cache directly while we wait for (2)
  4. Pre-loading data we might want later

This points to the need for a relatively smart library on the client that can manage your data lifecycle so that you don’t have to.


Option 1: cache the whole query result

Let’s remind ourselves what a GraphQL query and response looks like:

// The query
query {
  item(id: "5") {
    id
    stringField
    numberField
    nestedObj {
      id
      stringField
      numberField
    }
  }
}

// The result
{
  item: {
    id: '5',
    stringField: 'Hello',
    numberField: 6,
    nestedObj: {
      id: '7',
      stringField: 'World',
      numberField: 99
    }
  }
}

OK, so let’s start with the first idea that comes to mind for reducing data loading: store the whole query result, and have the cache key be the query string, or maybe a query ID.
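As a minimal sketch, a whole-result cache is just a map from query strings to results. The `fetchGraphQL` function here is a hypothetical stand-in for whatever actually sends the query to the server:

```javascript
// A minimal sketch of a whole-result cache, keyed by the query string.
// `fetchGraphQL` is a hypothetical function that sends a query to a server.
const resultCache = new Map();

function cachedQuery(query, fetchGraphQL) {
  if (resultCache.has(query)) {
    return resultCache.get(query); // cache hit: no network roundtrip
  }
  const result = fetchGraphQL(query);
  resultCache.set(query, result);
  return result;
}
```

The second call with the same query string returns the stored result without touching the network.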

This will actually already help a lot to reduce roundtrips to the server. For example, if we were looking at a particular view in our app, navigated away, then returned immediately, it’s likely that we would be interested in exactly the same result as before. I’m pretty sure that this will work well for a lot of apps.

This achieves goal 1, because we load less data than before. What about the others — refetching after a mutation, optimistic UI, and preloading?

Refetching after a mutation

Let’s say we have done a mutation that we expect will mutate the nested object in the result, the one with an ID of ‘7’. How would we refetch just that part of our object tree?

We’d have to work backwards and use the original query somehow, or do a new query and patch that into the response. Wouldn’t it be nice if we could just reach in and say “update field ‘stringField’ on object with ID 7”? Same goes for optimistic UI, where it is probably easiest to think in terms of fields on specific objects. What if our cache let us do that? Well, maybe it can.
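To make the problem concrete: with a whole-result cache, updating the object with ID ‘7’ means hunting down and throwing away every cached result that happens to contain it. A sketch, assuming results are plain nested objects:

```javascript
// Returns true if any object in the result tree has the given id.
function containsId(value, id) {
  if (value === null || typeof value !== 'object') return false;
  if (value.id === id) return true;
  return Object.values(value).some((child) => containsId(child, id));
}

// Evict every cached query result that contains the object with this id;
// each evicted result must be refetched in full, even if only one field changed.
function invalidateObject(resultCache, id) {
  for (const [query, result] of resultCache) {
    if (containsId(result, id)) {
      resultCache.delete(query);
    }
  }
}
```

That coarse granularity is exactly the limitation that motivates the next option.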


Option 2: Cache objects and fields

On the server, GraphQL objects are defined using GraphQLObjectType. They have a certain set of fields. If you follow the Relay Object Identification spec, they even have IDs. Dealing with objects like this is much more useful than just having a tree-shaped blob of JSON!

Let’s make a basic sketch of what a cache that stores objects and fields might look like:

{
  '5': {
    id: '5',
    stringField: 'Hello',
    numberField: 6,
    nestedObj: '7'
  },
  '7': {
    id: '7',
    stringField: 'World',
    numberField: 99
  }
}

It turns out this contains all of the information we need to reconstruct the original result given the query! It also has some nice properties:

  1. It’s trivial to integrate the results of new queries into this cache, for example the result of refetching an object after a mutation.
  2. Making an optimistic update on this data is simple — you just specify which object you want to temporarily modify, and which fields you want to set on that object. The format of the cache is simple enough to understand directly without needing special tools to manipulate it.
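Both properties fall out of a simple normalization step. Here is a sketch (the function names are mine, not Apollo’s) that flattens a result tree into the ID-keyed cache shown above, plus the one-line shape of an optimistic patch:

```javascript
// Flatten a result tree into a cache keyed by object ID. Nested objects
// that carry an `id` are stored under their own key and replaced by a
// reference (their ID) in the parent.
function normalize(obj, cache = {}) {
  const flat = {};
  for (const [field, value] of Object.entries(obj)) {
    if (value !== null && typeof value === 'object' && value.id) {
      normalize(value, cache); // recurse into the nested object
      flat[field] = value.id;  // store a reference, not the object itself
    } else {
      flat[field] = value;
    }
  }
  cache[obj.id] = flat;
  return cache;
}

// An optimistic update is then just a shallow patch on one object:
function patchObject(cache, id, fields) {
  return { ...cache, [id]: { ...cache[id], ...fields } };
}
```

Running `normalize` on the example result from earlier produces exactly the two-entry cache sketched above, and `patchObject` leaves the original cache untouched, which is handy for rolling back an optimistic update.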

Prior art for normalized stores: Relay, Redux, and Meteor

Let’s be clear — I didn’t invent this concept, this comes straight out of how Relay’s store works, as described in this spectacular article by Huey Petersen.

It also happens to be extremely similar to the recommendation from the Redux docs:

In a more complex app, you’re going to want different entities to reference each other. We suggest that you keep your state as normalized as possible, without any nesting. Keep every entity in an object stored with an ID as a key, and use IDs to reference it from other entities, or lists. Think of the app’s state as a database.

Sounds like we’re really on to something! On top of that, this is very similar to how Meteor’s client-side cache, Minimongo, is usually used, where people rely on tools like publish-composite to load tree-shaped data into a flat cache.


Implementation

These are just the first steps. What does it look like when you expand the concept? Well, you’ll see very soon, since we are hard at work on a simple implementation of a caching GraphQL client.

It uses a cache structure very similar to the above, and stores the cache data in a Redux store to get nice features like the redux dev tools, time traveling, and optimistic UI via a transactional library like Redux-Optimist.
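To give a flavor of what keeping the cache in Redux might look like: a reducer that merges newly fetched normalized objects into the flat store. The action type and payload shape here are hypothetical illustrations, not Apollo’s actual API:

```javascript
// A sketch of holding the normalized cache in a Redux-style reducer.
// `action.objects` is assumed to be a map of object ID -> fields, as
// produced by a normalization step. Action name is hypothetical.
function cacheReducer(state = {}, action) {
  switch (action.type) {
    case 'QUERY_RESULT_RECEIVED': {
      // Merge newly-fetched objects into the flat cache, field by field,
      // without mutating the previous state (so time travel still works).
      const next = { ...state };
      for (const [id, fields] of Object.entries(action.objects)) {
        next[id] = { ...next[id], ...fields };
      }
      return next;
    }
    default:
      return state;
  }
}
```

Because every update returns a fresh state object, tooling like the Redux devtools and a transactional layer like Redux-Optimist can snapshot and roll back the cache for free.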

We’re focusing on making all of the components very easy to understand, so there is no magic under the hood about how the queries are managed. You’ll be able to inspect how your query results are put into the cache, how they are read out, and how queries are diffed against the existing data to reduce unnecessary fetching.

Here’s a quick sneak peek, hot off the presses (literally got this working yesterday!), of what it looks like in the Redux devtools when a query result arrives from the server:

We’re working towards having an end-to-end demo of a real app built with our tools as soon as we can — get excited! I know some of you are wondering where you’ll get a GraphQL server; that’s the other half of what we’re up to.

See you next time on Building Apollo!
