---
title: Normalized Caching
order: 1
---

# Normalized Caching

In GraphQL, like its name suggests, we create schemas that express the relational nature of our
data. When we create and query against a `Query` type we walk a graph that starts at the root
`Query` type and walks through relational types. Rather than querying for normalized data, in
GraphQL our queries request a specific shape of denormalized data, a view into our relational data
that can be re-normalized automatically.

As the GraphQL API walks our query documents it may read from a relational database and _entities_
and scalar values are copied into a JSON document that matches our query document. The type
information of our entities isn't lost however. A query document may still ask the GraphQL API about
what entity it's dealing with using the `__typename` field, which dynamically introspects an
entity's type. This means that GraphQL clients can automatically re-normalize data as results come
back from the API by using the `__typename` field and keyable fields like an `id` or `_id` field,
which are already common conventions in GraphQL schemas. In other words, normalized caches can build
up a relational database of tables in-memory for our application.

For our apps normalized caches can enable more sophisticated use-cases, where different API requests
update data in other parts of the app and automatically update data in our cache as we query our
GraphQL API. Normalized caches can essentially keep the UI of our applications up-to-date when
relational data is detected across multiple queries, mutations, or subscriptions.

## Normalizing Relational Data

As previously mentioned, a GraphQL schema creates a tree of types where our application's data
always starts from the `Query` root type and is modified by other data that's incoming from either a
selection on `Mutation` or `Subscription`.
All data that we query from the `Query` type will contain
relations between "entities", JSON objects that are hierarchical.

A normalized cache seeks to turn this denormalized JSON blob back into a relational data structure,
which stores all entities by a key that can be looked up directly. Since GraphQL documents give the
API a strict specification on how it traverses a schema, the JSON data that the cache receives from
the API will always match the GraphQL query document that has been used to query this data.
A common misconception is that normalized caches in GraphQL store data by the query document somehow,
however, the only thing a normalized cache cares about is that it can use our GraphQL query documents
to walk the structure of the JSON data it received from the API.

```graphql
{
  __typename
  todo(id: 1) {
    __typename
    id
    title
    author {
      __typename
      id
      name
    }
  }
}
```

```json
{
  "__typename": "Query",
  "todo": {
    "__typename": "Todo",
    "id": 1,
    "title": "implement graphcache",
    "author": {
      "__typename": "Author",
      "id": 1,
      "name": "urql-team"
    }
  }
}
```

Above, we see an example of a GraphQL query document and a corresponding JSON result from a GraphQL
API. In GraphQL, we never lose access to the underlying types of the data. Normalized caches can
ask for the `__typename` field in selection sets automatically and will find out which type a JSON
object corresponds to.

Generally, a normalized cache must do one of two things with a query document like the above:

- It must be able to walk the query document and JSON data of the result and cache the data,
  normalizing it in the process and storing it in relational tables.
- It must later be able to walk the query document and recreate this JSON data just by reading data
  from its cache, by reading entries from its in-memory relational tables.
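
The two tasks above can be sketched in a few lines of plain JavaScript. This is a simplified
illustration and not Graphcache's implementation: `keyOf`, `write`, and `read` are hypothetical
helpers, every non-root entity is assumed to carry `__typename` and `id`, and the sketch walks only
the JSON result, whereas a real normalized cache walks the query document alongside it (which is
also how it learns about field arguments).

```js
// Hypothetical helpers for illustration; not Graphcache's internal API.
const keyOf = (data) =>
  data.__typename === 'Query' ? 'Query' : `${data.__typename}:${data.id}`;

const isEntity = (value) => value !== null && typeof value === 'object';

// Walk a JSON result, storing scalars as "records" and relations as "links".
function write(store, data) {
  const key = keyOf(data);
  const records = (store.records[key] = store.records[key] || {});
  const links = (store.links[key] = store.links[key] || {});
  for (const [field, value] of Object.entries(data)) {
    if (Array.isArray(value) && value.every(isEntity)) {
      links[field] = value.map((item) => write(store, item));
    } else if (!Array.isArray(value) && isEntity(value)) {
      links[field] = write(store, value);
    } else {
      records[field] = value;
    }
  }
  return key;
}

// Recreate the JSON shape from the tables. A real cache follows the query
// document here, which also bounds the recursion on cyclic data.
function read(store, key) {
  const result = { ...store.records[key] };
  for (const [field, link] of Object.entries(store.links[key] || {})) {
    result[field] = Array.isArray(link)
      ? link.map((linkedKey) => read(store, linkedKey))
      : read(store, link);
  }
  return result;
}

const store = { records: {}, links: {} };
write(store, {
  __typename: 'Query',
  todo: {
    __typename: 'Todo',
    id: 1,
    title: 'implement graphcache',
    author: { __typename: 'Author', id: 1, name: 'urql-team' },
  },
});

const roundTrip = read(store, 'Query');
```

After the `write` call, `store.links` relates `Query` to `Todo:1` and `Todo:1` to `Author:1`, and
`read` rebuilds the nested result from those tables alone.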

While the normalized cache can't know the exact type of each field, thanks to the GraphQL query
language it can make a couple of assumptions. The normalized cache can walk the query document. Each
field that has no selection set (like `title` in the above example) must be a "record", a field that
may only be set to a scalar. Each field that does have a selection set must be another "entity" or a
list of "entities". The latter fields with selection sets are our relations between entities, like a
foreign key in relational databases.
Furthermore, the normalized cache can then read the `__typename` field on related entities. This is
called _Type Name Introspection_ and is how it finds out about the types of each entity.
From the above document we can assume the following relations:

- `Query.todo(id: 1)` → `Todo`
- `Todo.author` → `Author`

However, this isn't quite enough yet to store the relations from GraphQL results. The normalized
cache must also generate primary keys for each entity so that it can store them in table-like data
structures. This is for instance why
[Relay enforces](https://relay.dev/docs/guides/graphql-server-specification/#object-identification) that
each entity must have an `id` field. This allows it to assume that there's an obvious primary key
for each entity it may query. Instead, `urql`'s Graphcache and Apollo assume that there _may_ be an
`id` or `_id` field in a given selection set. If Graphcache can't find these two fields it'll issue
a warning, however a custom `keys` configuration may be used to generate custom keys for a given
type. With this logic the normalized cache will actually create the following "links" between its
relational data:

- `"Query"`, `.todo(id: 1)` → `"Todo:1"`
- `"Todo:1"`, `.author` → `"Author:1"`

As we can see, the `Query` root type itself has a constant key of `"Query"`.
All relational data
originates here, since the GraphQL schema is a graph and, like a tree, all selections on a GraphQL
query document originate from it.
Internally, the normalized cache now stores field values on entities by their primary keys. The
above can also be said or written as:

- The `Query` entity's `todo` field with `{"id": 1}` arguments points to the `Todo:1` entity.
- The `Todo:1` entity's `author` field points to the `Author:1` entity.

In Graphcache, these "links" are stored in a nested structure per-entity. "Records" are kept
separate from this relational data.

![Normalization is based on types, keys, and relations. This information can all be inferred from
the query document.](../assets/query-document-info.png)

## Storing Normalized Data

At its core, normalizing data means that we take individual fields and store them in a table. In our
case we store all values of fields in a dictionary of their primary key, generated from an ID or
other key and type name, and the field’s name and arguments, if it has any.

| Primary Key            | Field                                           | Value                    |
| ---------------------- | ----------------------------------------------- | ------------------------ |
| Type name and ID (Key) | Field name (not alias) and optionally arguments | Scalar value or relation |

To reiterate we have three pieces of information that are stored in tables:

- The entity's key can be derived from its type name via the `__typename` field and a keyable field.
  By default _Graphcache_ will check the `id` and `_id` fields, however this is configurable.
- The field's name (like `todo`) and optional arguments. If the field has any arguments then we can
  normalize it by JSON stringifying the arguments, making sure that the JSON key is stable by
  sorting its keys.
- Lastly, we may store relations as either `null`, a primary key that refers to another entity, or a
  list of such.
  For storing "records" we can store the scalars in a separate table.

In _Graphcache_ the data structure for these tables looks a little like the following, where each
entity has a record from fields to other entity keys:

```js
{
  links: Map {
    'Query': Record {
      'todo({"id":1})': 'Todo:1'
    },
    'Todo:1': Record {
      'author': 'Author:1'
    },
    'Author:1': Record { },
  }
}
```

We can see how the normalized cache is now able to traverse a GraphQL query by starting on the
`Query` entity and retrieve relations for other fields.
To retrieve "records" which are all fields with scalar values and no selection sets, _Graphcache_
keeps a second table around with an identical structure. This table only contains scalar values,
which keeps our non-relational data away from our "links":

```js
{
  records: Map {
    'Query': Record {
      '__typename': 'Query'
    },
    'Todo:1': Record {
      '__typename': 'Todo',
      'id': 1,
      'title': 'implement graphcache'
    },
    'Author:1': Record {
      '__typename': 'Author',
      'id': 1,
      'name': 'urql-team'
    },
  }
}
```

This is very similar to how we'd go about creating a state management store manually, except that
_Graphcache_ can use the GraphQL document to perform this normalization automatically.

What we gain from this normalization is that we have a data structure that we can both read from and
write to, to reproduce the API results for GraphQL query documents. Any mutation or subscription can
also be written to this data structure. Once _Graphcache_ finds a keyable entity in their results
it's written to its relational table which may update other queries in our application.
Similarly queries may share data between one another which means that they effectively share
entities using this approach and can update one another.
In other words, once we have a primary key like `"Todo:1"` we may find this primary key again in
other entities in other GraphQL results.

## Custom Keys and Non-Keyable Entities

In the above introduction we've learned that while _Graphcache_ doesn't enforce `id` fields on each
entity, it checks for the `id` and `_id` fields by default. There are many situations in which
entities may either not have a key field or have different keys.

As _Graphcache_ traverses JSON data and a GraphQL query document to write data to the cache you may
see a warning from it along the lines of ["Invalid key: [...] No key could be generated for the data
at this field."](./errors.md/#15-invalid-key) _Graphcache_ has many warnings like these that attempt
to detect undesirable behaviour and help us to update our configuration or queries accordingly.

In the simplest cases, we may simply have forgotten to add the `id` field to the selection set of
our GraphQL query document. However, what if the field is instead called `uuid` and our query looks
accordingly different?

```graphql
{
  item {
    uuid
  }
}
```

In the above selection set we have an `item` field that has a `uuid` field rather than an `id`
field. This means that _Graphcache_ won't automatically be able to generate a primary key for this
entity. Instead, we have to help it generate a key by passing it a custom `keys` config:

```js
cacheExchange({
  keys: {
    Item: data => data.uuid,
  },
});
```

We may add a function as an entry to the `keys` configuration. The property here, `"Item"`, must be
the typename of the entity for which we're generating a key. The function may return an arbitrarily
generated key. So for our `item` field, which in our example schema gives us an `Item` entity, we
can create a `keys` configuration entry that creates a key from the `uuid` field rather than the
`id` field.
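
To make the key rules concrete, here is a minimal sketch of how an entity key and a field key could
be derived. It is an assumption-laden illustration rather than Graphcache's own code: `entityKey`,
`fieldKey`, and `stableStringify` are hypothetical names, and the `Type:id` and `field(args)`
formats simply mirror the keys shown in the tables under "Storing Normalized Data".

```js
// Hypothetical helpers for illustration; not Graphcache's internal API.

// Entity key: a custom `keys` entry wins, otherwise fall back to `id`, then `_id`.
function entityKey(data, keys = {}) {
  const custom = keys[data.__typename];
  const id = custom ? custom(data) : (data.id ?? data._id);
  return id == null ? null : `${data.__typename}:${id}`;
}

// Stringify with sorted object keys so that argument order doesn't matter.
function stableStringify(value) {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const entries = Object.keys(value)
      .sort()
      .map((name) => `${JSON.stringify(name)}:${stableStringify(value[name])}`);
    return `{${entries.join(',')}}`;
  }
  return JSON.stringify(value);
}

// Field key: the field name, plus the stable arguments if there are any.
function fieldKey(name, args) {
  return args && Object.keys(args).length > 0
    ? `${name}(${stableStringify(args)})`
    : name;
}

const keys = { Item: (data) => data.uuid };

const todoKey = entityKey({ __typename: 'Todo', id: 1 }, keys);
const itemKey = entityKey({ __typename: 'Item', uuid: 'abc' }, keys);
const unkeyed = entityKey({ __typename: 'Item', uuid: 'abc' });
const todoField = fieldKey('todo', { id: 1 });
```

With the `keys` entry in place the `Item` entity becomes keyable (`itemKey` is `'Item:abc'`), while
without it no key can be generated and the sketch returns `null`.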

This also raises a question, **what does _Graphcache_ do with unkeyable data by default? And, what
if my data has no key?**<br />
This special case is what we call "embedded data". Not all types in a GraphQL schema will have
keyable fields and some types may just abstract data without themselves being relational. They may
be "edges", entities that have a field pointing to other entities that simply connect two entities,
or data types like a `GeoJson` or `Image` type.

In these cases, where the normalized cache encounters unkeyable types, it will create an embedded
key by using the parent's primary key and combining it with the field key. This means that
"embedded entities" are only reachable from a specific field on their parent entities. They're
globally unique and aren't strictly speaking relational data.

```graphql
{
  __typename
  todo(id: 1) {
    id
    image {
      url
      width
      height
    }
  }
}
```

In the above example we're querying an `Image` type on a `Todo`. This imaginary `Image` type has no
key because the image is embedded data and will only ever be associated to this `Todo`. In other
words, the API's schema doesn't consider it necessary to have a primary key field for this type.
Maybe it doesn't even have an ID in our backend's database. We _could_ assign this type an imaginary
key (maybe based on the `url`) but in fact if it's not shared data it wouldn't make much sense to
do so.

When _Graphcache_ attempts to store this entity it will issue the previously mentioned warning.
Internally, it'll then generate an embedded key for this entity based on the parent entity. If
the parent entity's key is `Todo:1` then the embedded key for our `Image` will become
`Todo:1.image`.
This is also how this entity will be stored internally by _Graphcache_:

```js
{
  records: Map {
    'Todo:1.image': Record {
      '__typename': 'Image',
      'url': '...',
      'width': 1024,
      'height': 768
    },
  }
}
```

This doesn't however mute the warning that _Graphcache_ outputs, since it believes we may have made a
mistake. The warning itself gives us advice on how to mute it:

> If this is intentional, create a keys config for `Image` that always returns null.

Meaning, that we can add an entry to our `keys` config for our non-keyable type that explicitly
returns `null`, which tells _Graphcache_ that the entity has no key:

```js
cacheExchange({
  keys: {
    Image: () => null,
  },
});
```

### Flexible Key Generation

In some cases, you may want to create a pattern for your key generation. For instance, you may want
to say "create a special key for every type ending in `'Node'`". In such a case we recommend creating
a small JS `Proxy` to take care of key generation for you and making the keys functional.

```js
cacheExchange({
  keys: new Proxy(
    {
      Image: () => null,
    },
    {
      get(target, prop, receiver) {
        if (prop.endsWith('Node')) {
          return data => data.uid;
        }
        const fallback = data => data.uuid;
        return target[prop] || fallback;
      },
    }
  ),
});
```

In the above example, we dynamically change the key generator depending on the typename. When
a typename ends in `'Node'`, we return a key generator that uses the `uid` field. We still fall back
to an object of manual key generation functions however. Lastly though, when a type doesn't have
a predefined key generator, we change the default behavior from using `id` and `_id` fields to using
`uuid` fields.
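
The lookup behaviour of such a `Proxy` can be checked on its own, without a cache. The sketch below
rebuilds the same three rules and looks up key generators for a few typenames; `UserNode` and `Item`
are made-up typenames for illustration, and the `typeof prop === 'string'` guard is an addition in
this sketch so that symbol lookups on the proxy don't throw.

```js
// Standalone illustration of the lookup rules; typenames are hypothetical.
const keys = new Proxy(
  {
    Image: () => null,
  },
  {
    get(target, prop) {
      // Symbols have no `endsWith`, so only pattern-match string properties.
      if (typeof prop === 'string' && prop.endsWith('Node')) {
        return (data) => data.uid;
      }
      const fallback = (data) => data.uuid;
      return target[prop] || fallback;
    },
  }
);

// Rule 1: typenames ending in 'Node' are keyed by `uid`.
const nodeKey = keys.UserNode({ uid: 'u-1', uuid: 'ignored' });
// Rule 2: explicit entries on the target object still apply.
const imageKey = keys.Image({ url: '...' });
// Rule 3: everything else falls back to `uuid`.
const itemKey = keys.Item({ uuid: 'i-1' });
```

Here `nodeKey` is `'u-1'`, `imageKey` is `null`, and `itemKey` is `'i-1'`, matching the three cases
described above.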

## Non-Automatic Relations and Updates

While _Graphcache_ is able to store and update our entities in an in-memory relational data
structure, which keeps the same entities in singular unique locations, a GraphQL API may make a lot
of implicit changes to the relations of data as it runs or have trivial relations that our cache
doesn't need to see to resolve. Like with the `keys` config, we have two more configuration options
to combat this: `resolvers` and `updates`.

### Manually resolving entities

Some fields in our configuration can be resolved without checking the GraphQL API for relations. The
`resolvers` config allows us to create a list of client-side resolvers where we can read from the
cache directly as _Graphcache_ creates a local GraphQL result from its cached data.

```graphql
{
  todo(id: 1) {
    id
  }
}
```

Previously we've looked at the above query to illustrate how data from a GraphQL API may be written
to _Graphcache_'s relational data structure to store the links and entities in a result against this
GraphQL query document. However, it may be possible for another query to have already written this
`Todo` entity to the cache. So, **how do we resolve a relation manually?**

In such a case, _Graphcache_ may have seen and stored the `Todo` entity but isn't aware of the
relation between `Query.todo({"id":1})` and the `Todo:1` entity. However, we can tell _Graphcache_
which entity it should look for when it accesses the `Query.todo` field by creating a resolver for
it:

```js
cacheExchange({
  resolvers: {
    Query: {
      todo(parent, args, cache, info) {
        return { __typename: 'Todo', id: args.id };
      },
    },
  },
});
```

A resolver is a function that's similar to [GraphQL.js' resolvers on the
server-side](https://www.graphql-tools.com/docs/resolvers/).
They receive the parent data, the
field's arguments, access to _Graphcache_'s cached data, and an `info` object. [The entire function
signature and more explanations can be found in the API docs.](../api/graphcache.md#resolvers-option)
Since it can access the field's arguments from the GraphQL query document, we can return a partial
`Todo` entity. As long as this
object is keyable, it will tell _Graphcache_ what the key of the returned entity is. In other words,
we've told it how to get to a `Todo` from the `Query.todo` field.

This mechanism is immensely more powerful than this example. We have other use-cases that
resolvers may be used for:

- Resolvers can be applied to fields with records, which means that it can be used to change or
  transform scalar values. For instance, we can update a string or parse a `Date` right inside a
  resolver.
- Resolvers can return deeply nested results, which will be layered on top of the in-memory
  relational cached data of _Graphcache_, which means that it can emulate infinite pagination and
  other complex behaviour.
- Resolvers can change when a cache miss or hit occurs. Returning `null` means that a field’s value
  is literally `null`, which will not cause a cache miss, while returning `undefined` will mean
  a field’s value is uncached.
- Resolvers can return either partial entities or keys, so we can chain `cache.resolve` calls to
  read fields from the cache, even when a field is pointing at another entity, since we can return
  keys to the other entity directly.

[Read more about resolvers on the following page about "Local Resolvers".](./local-resolvers.md)

### Manual cache updates

While `resolvers`, as shown above, operate while _Graphcache_ is reading from its in-memory cache,
`updates` are a configuration option that operate while _Graphcache_ is writing to its cached data.
Specifically, these functions can be used to add more updates onto what a `Mutation` or
`Subscription` may automatically update.

As stated before, a GraphQL schema's data may undergo a lot of implicit changes when we send it a
`Mutation` or `Subscription`. A new item that we create may for instance manipulate a completely
different item or even a list. Often mutations and subscriptions alter relations that their
selection sets wouldn't necessarily see. Since mutations and subscriptions operate on a different
root type, rather than the `Query` root type, we often need to update links in the rest of our data
when a mutation is executed.

```graphql
query TodosList {
  todos {
    id
    title
  }
}

mutation AddTodo($title: String!) {
  addTodo(title: $title) {
    id
    title
  }
}
```

In a simple example, like the one above, we have a list of todos in a query and create a new todo
using the `Mutation.addTodo` mutation field. When the mutation is executed and we get the result
back, _Graphcache_ already writes the `Todo` item to its normalized cache. However, we also want to
add the new `Todo` item to the list on `Query.todos`:

```js
import { gql } from '@urql/core';

cacheExchange({
  updates: {
    Mutation: {
      addTodo(result, args, cache, info) {
        const query = gql`
          {
            todos {
              id
            }
          }
        `;
        cache.updateQuery({ query }, data => {
          data.todos.push(result.addTodo);
          return data;
        });
      },
    },
  },
});
```

In this code example we can first see that the signature of the `updates` entry is very similar to
the one of `resolvers`. However, we're seeing the `cache` in use for the first time. The `cache`
object (as [documented in the API docs](../api/graphcache.md#cache)) gives us
access to _Graphcache_'s mechanisms directly.
Not only can we resolve data using it, we can directly
start sub-queries or sub-writes manually. These are full normalized cache runs inside other runs. In
this case we're calling `cache.updateQuery` on a list of `Todo` items while the `Mutation` that
added the `Todo` is already being written to the cache.

As we can see, we may perform manual changes inside of `updates` functions, which can be used to
affect other parts of the cache (like `Query.todos` here) beyond the automatic updates that a
normalized cache is expected to perform.

We get methods like `cache.updateQuery`, `cache.writeFragment`, and `cache.link` in our updater
functions, which aren't available to us in local resolvers, and can only be used in these `updates`
entries to change the data that the cache holds.

[Read more about writing cache updates on the "Cache Updates" page.](./cache-updates.md)

## Deterministic Cache Updates

Above, in [the "Storing Normalized Data" section](#storing-normalized-data), we've talked about how
Graphcache is able to store normalized data. However, apart from storing this data there are a
couple of caveats that many applications simply ignore, skip, or simplify when they implement a
store to cache their data in.

Amongst features like [Optimistic Updates](./cache-updates.md#optimistic-updates) and [Offline
Support](./offline.md), Graphcache supports several features that allow our API results to be more
unreliable. Essentially we don't expect API results to always come back in order or on time.
However, we expect Graphcache to prevent us from making "non-deterministic cache updates", meaning
that we expect it to gracefully handle API results that come back delayed or in a random order.

In terms of the ["Manual Cache Updates"](#manual-cache-updates) that we've talked about above and
[Optimistic Updates](./cache-updates.md#optimistic-updates) the limitations are pretty simple at
first and if we use Graphcache as usual we may not even notice them:

- When we make an _optimistic_ change, we define what a mutation's result may look like once the API
  responds in the future and apply this temporary result immediately. We store this temporary data
  in a separate "layer". Once the real result comes back this layer can be deleted and the real API
  result can be applied as usual.
- When multiple _optimistic updates_ are made at the same time, we never allow these layers to be
  deleted separately. Instead Graphcache waits for all mutations to complete before deleting the
  optimistic layers and applying the real API result. This means that a mutation update cannot
  accidentally commit optimistic data to the cache permanently.
- While an _optimistic update_ has been applied, Graphcache stops refetching any queries that contain
  this optimistic data so that it doesn't "flip back" to its non-optimistic state without the
  optimistic update being applied. Otherwise we'd see a "flicker" in the UI.

These three principles are the basic mechanisms we can expect from Graphcache. The summary is:
**Graphcache groups optimistic mutations and pauses queries so that optimistic updates look as
expected,** which is an implementation detail we can mostly ignore when using it.

However, one implementation detail we cannot ignore is the last mechanism in Graphcache which is
called **"Commutativity"**. As we can tell, "optimistic updates" need to store their normalized
results on a separate layer. This means that the previous data structure we've seen in Graphcache is
actually more like a list, with many tables of links and entities.
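
As a rough mental model of these layers, consider the sketch below. It is an illustrative toy under
stated assumptions, not Graphcache's actual data structure: `createStore`, `applyOptimistic`,
`applyResult`, and `read` are hypothetical names, and fields are stored under flat string keys for
brevity. Reads check the optimistic layers first, and real results are only committed once every
in-flight optimistic mutation has completed.

```js
// Toy model for illustration; all names here are hypothetical.
function createStore() {
  return { base: new Map(), layers: [], pending: [] };
}

// An optimistic change goes into its own layer, newest layer first.
function applyOptimistic(store, mutationId, fields) {
  store.layers.unshift({ mutationId, fields, done: false });
}

// Reads prefer optimistic layers and fall through to the base data.
function read(store, key) {
  for (const layer of store.layers) {
    if (key in layer.fields) return layer.fields[key];
  }
  return store.base.get(key);
}

// Real results are held back until all optimistic mutations are done,
// then the layers are dropped together and the results are committed.
function applyResult(store, mutationId, fields) {
  const layer = store.layers.find((entry) => entry.mutationId === mutationId);
  if (layer) layer.done = true;
  store.pending.push(fields);
  if (store.layers.every((entry) => entry.done)) {
    store.layers = [];
    for (const result of store.pending) {
      for (const [key, value] of Object.entries(result)) {
        store.base.set(key, value);
      }
    }
    store.pending = [];
  }
}

const store = createStore();
store.base.set('Todo:1.title', 'old title');

applyOptimistic(store, 1, { 'Todo:1.title': 'optimistic A' });
applyOptimistic(store, 2, { 'Todo:2.title': 'optimistic B' });
const whilePending = read(store, 'Todo:1.title');

applyResult(store, 1, { 'Todo:1.title': 'server A' });
const afterFirstResult = read(store, 'Todo:1.title');

applyResult(store, 2, { 'Todo:2.title': 'server B' });
const afterAllResults = read(store, 'Todo:1.title');
```

In this model `afterFirstResult` still reads the optimistic value because the second mutation is in
flight, and only `afterAllResults` reflects the committed server data.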

Each layer may contain optimistic results and have an order of preference. However, this order also
applies to queries. Since queries are run in one order but their API results can come back to us in
a very different order, if we access enough pages in a random order things can sometimes look rather
weird. We may see that in an application on a slow network connection the results may vary depending
on when their results came back.

![Commutativity means that we store data in separate layers.](../assets/commutative-layers.png)

Instead, Graphcache actually uses layers for any API result it receives. In case an API result
arrives out-of-order, it sorts them by precedence, or rather by when they've been requested.
Overall, we don't have to worry about this, but Graphcache has mechanisms that keep our updates
safe.

## Reading on

This concludes the introduction to Graphcache with a short overview of how it works, what it
supports, and some hidden mechanisms and internals. Next we may want to learn more about how to use
it and more of its features:

- [How do we write "Local Resolvers"?](./local-resolvers.md)
- [How to set up "Cache Updates" and "Optimistic Updates"?](./cache-updates.md)
- [What is Graphcache's "Schema Awareness" feature for?](./schema-awareness.md)
- [How do I enable "Offline Support"?](./offline.md)