+262
-151
doc/tutorial.md
···
28
28
29
29
The JSON Pointer `/users/0/name` refers to the string `"Alice"`.
30
30
31
+
In OCaml, this is represented by the `Jsont_pointer.t` type: a sequence
32
+
of navigation steps from the document root to a target value.
33
+
31
34
## Syntax: Reference Tokens
32
35
33
36
RFC 6901, Section 3 defines the syntax:
···
79
82
80
83
Multiple tokens navigate deeper into nested structures.
81
84
82
-
### Invalid Syntax
83
-
84
-
What happens if a pointer doesn't start with `/`?
85
-
86
-
```sh
87
-
$ jsonpp parse "foo"
88
-
ERROR: Invalid JSON Pointer: must be empty or start with '/': foo
89
-
```
90
-
91
-
The RFC is strict: non-empty pointers MUST start with `/`.
92
-
93
-
## Escaping Special Characters
94
-
95
-
RFC 6901, Section 3 explains the escaping rules:
96
-
97
-
> Because the characters '~' (%x7E) and '/' (%x2F) have special meanings
98
-
> in JSON Pointer, '~' needs to be encoded as '~0' and '/' needs to be
99
-
> encoded as '~1' when these characters appear in a reference token.
100
-
101
-
Why these specific characters?
102
-
- `/` separates tokens, so it must be escaped inside a token
103
-
- `~` is the escape character itself, so it must also be escaped
104
-
105
-
The escape sequences are:
106
-
- `~0` represents `~` (tilde)
107
-
- `~1` represents `/` (forward slash)
108
-
109
-
Let's see escaping in action:
110
-
111
-
```sh
112
-
$ jsonpp escape "hello"
113
-
hello
114
-
```
115
-
116
-
No special characters, no escaping needed.
117
-
118
-
```sh
119
-
$ jsonpp escape "a/b"
120
-
a~1b
121
-
```
122
-
123
-
The `/` becomes `~1`.
124
-
125
-
```sh
126
-
$ jsonpp escape "a~b"
127
-
a~0b
128
-
```
129
-
130
-
The `~` becomes `~0`.
85
+
### The Index Type
131
86
132
-
```sh
133
-
$ jsonpp escape "~/"
134
-
~0~1
135
-
```
136
-
137
-
Both characters are escaped.
138
-
139
-
### Unescaping
140
-
141
-
And the reverse process:
142
-
143
-
```sh
144
-
$ jsonpp unescape "a~1b"
145
-
OK: a/b
146
-
```
87
+
Each reference token becomes an `Index.t` value in the library:
147
88
148
-
```sh
149
-
$ jsonpp unescape "a~0b"
150
-
OK: a~b
89
+
```ocaml
90
+
type t =
91
+
| Mem of string (* Object member access *)
92
+
| Nth of int (* Array index access *)
93
+
| End (* The special "-" marker for append operations *)
151
94
```
152
95
153
-
### The Order Matters!
96
+
The `Mem` variant holds the **unescaped** member name: you work with the
97
+
actual key string (like `"a/b"`) and the library handles any escaping needed
98
+
for the JSON Pointer string representation.
154
99
155
-
RFC 6901, Section 4 is careful to specify the unescaping order:
156
-
157
-
> Evaluation of each reference token begins by decoding any escaped
158
-
> character sequence. This is performed by first transforming any
159
-
> occurrence of the sequence '~1' to '/', and then transforming any
160
-
> occurrence of the sequence '~0' to '~'. By performing the substitutions
161
-
> in this order, an implementation avoids the error of turning '~01' first
162
-
> into '~1' and then into '/', which would be incorrect (the string '~01'
163
-
> correctly becomes '~1' after transformation).
100
+
### Invalid Syntax
164
101
165
-
Let's verify this tricky case:
102
+
What happens if a pointer doesn't start with `/`?
166
103
167
104
```sh
168
-
$ jsonpp unescape "~01"
169
-
OK: ~1
105
+
$ jsonpp parse "foo"
106
+
ERROR: Invalid JSON Pointer: must be empty or start with '/': foo
170
107
```
171
108
172
-
If we unescaped `~0` first, `~01` would become `~1`, which would then become
173
-
`/`. But that's wrong! The sequence `~01` should become the literal string
174
-
`~1` (a tilde followed by the digit one).
175
-
176
-
Invalid escape sequences are rejected:
177
-
178
-
```sh
179
-
$ jsonpp unescape "~2"
180
-
ERROR: Invalid JSON Pointer: invalid escape sequence ~2
181
-
```
182
-
183
-
```sh
184
-
$ jsonpp unescape "hello~"
185
-
ERROR: Invalid JSON Pointer: incomplete escape sequence at end
186
-
```
109
+
The RFC is strict: non-empty pointers MUST start with `/`.
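To make the rule concrete, here is a minimal sketch of how a pointer string could be split into its raw (still escaped) reference tokens. `split_pointer` is a hypothetical helper written for this illustration and is not part of the `jsont-pointer` API; the library's own parser may be structured differently.

```ocaml
(* Hypothetical helper, not part of jsont-pointer: split a JSON Pointer
   string into its raw reference tokens, without unescaping them. *)
let split_pointer (s : string) : (string list, string) result =
  if s = "" then Ok [] (* the empty pointer refers to the whole document *)
  else if s.[0] <> '/' then Error ("must be empty or start with '/': " ^ s)
  else
    (* "/a/b" splits into [""; "a"; "b"]; drop the empty head that
       precedes the leading '/'. *)
    match String.split_on_char '/' s with
    | "" :: tokens -> Ok tokens
    | _ -> Error "unreachable: the string starts with '/'"
```

Note that `"/"` yields a single empty token, which is why it addresses the key `""` rather than the whole document.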
187
110
188
111
## Evaluation: Navigating JSON
189
112
···
195
118
> the document. Each reference token in the JSON Pointer is evaluated
196
119
> sequentially.
197
120
121
+
In the library, this is the `Jsont_pointer.get` function:
122
+
123
+
```ocaml
124
+
val get : t -> Jsont.json -> Jsont.json
125
+
```
126
+
198
127
Let's use the example JSON document from RFC 6901, Section 5:
199
128
200
129
```sh
···
222
151
OK: {"foo":["bar","baz"],"":0,"a/b":1,"c%d":2,"e^f":3,"g|h":4,"i\\j":5,"k\"l":6," ":7,"m~n":8}
223
152
```
224
153
225
-
The empty pointer returns the whole document.
154
+
The empty pointer returns the whole document. In OCaml, this is
155
+
`Jsont_pointer.root`:
156
+
157
+
```ocaml
158
+
val root : t
159
+
(** The empty pointer that references the whole document. *)
160
+
```
226
161
227
162
### Object Member Access
228
163
···
263
198
264
199
### Keys with Special Characters
265
200
266
-
Now for the escape sequences:
201
+
The RFC example includes keys with `/` and `~` characters:
267
202
268
203
```sh
269
204
$ jsonpp eval rfc6901_example.json "/a~1b"
270
205
OK: 1
271
206
```
272
207
273
-
The token `a~1b` unescapes to `a/b`, which is the key name.
208
+
The token `a~1b` refers to the key `a/b`. We'll explain this escaping
209
+
[below](#escaping-special-characters).
274
210
275
211
```sh
276
212
$ jsonpp eval rfc6901_example.json "/m~0n"
277
213
OK: 8
278
214
```
279
215
280
-
The token `m~0n` unescapes to `m~n`.
216
+
The token `m~0n` refers to the key `m~n`.
217
+
218
+
**Important**: When using the OCaml library programmatically, you don't need
219
+
to worry about escaping. The `Index.Mem` variant holds the literal key name:
220
+
221
+
```ocaml
222
+
(* To access the key "a/b", just use the literal string *)
223
+
let pointer = Jsont_pointer.make [Mem "a/b"]
224
+
225
+
(* The library escapes it when converting to string *)
226
+
let s = Jsont_pointer.to_string pointer (* "/a~1b" *)
227
+
```
281
228
282
229
### Other Special Characters (No Escaping Needed)
283
230
···
329
276
$ jsonpp eval rfc6901_example.json "/foo/0/invalid"
330
277
ERROR: JSON Pointer: cannot index into string with 'invalid'
331
278
File "-":
279
+
```
280
+
281
+
The library provides exception-raising, result-returning, and option-returning variants:
282
+
283
+
```ocaml
284
+
val get : t -> Jsont.json -> Jsont.json
285
+
val get_result : t -> Jsont.json -> (Jsont.json, Jsont.Error.t) result
286
+
val find : t -> Jsont.json -> Jsont.json option
332
287
```
333
288
334
289
### Array Index Rules
···
393
348
394
349
But we'll see later that `-` is very useful for mutation operations!
395
350
396
-
## URI Fragment Encoding
397
-
398
-
JSON Pointers can be embedded in URIs. RFC 6901, Section 6 explains:
399
-
400
-
> A JSON Pointer can be represented in a URI fragment identifier by
401
-
> encoding it into octets using UTF-8, while percent-encoding those
402
-
> characters not allowed by the fragment rule in RFC 3986.
403
-
404
-
This adds percent-encoding on top of the `~0`/`~1` escaping:
405
-
406
-
```sh
407
-
$ jsonpp uri-fragment "/foo"
408
-
OK: /foo -> /foo
409
-
```
410
-
411
-
Simple pointers often don't need percent-encoding.
412
-
413
-
```sh
414
-
$ jsonpp uri-fragment "/a~1b"
415
-
OK: /a~1b -> /a~1b
416
-
```
417
-
418
-
The `~1` escape stays as-is (it's valid in URI fragments).
419
-
420
-
```sh
421
-
$ jsonpp uri-fragment "/c%d"
422
-
OK: /c%d -> /c%25d
423
-
```
424
-
425
-
The `%` character must be percent-encoded as `%25` in URIs!
426
-
427
-
```sh
428
-
$ jsonpp uri-fragment "/ "
429
-
OK: / -> /%20
430
-
```
431
-
432
-
Spaces become `%20`.
433
-
434
-
Here's the RFC example showing the URI fragment forms:
435
-
436
-
| JSON Pointer | URI Fragment | Value |
437
-
|-------------|-------------|-------|
438
-
| `""` | `#` | whole document |
439
-
| `"/foo"` | `#/foo` | `["bar", "baz"]` |
440
-
| `"/foo/0"` | `#/foo/0` | `"bar"` |
441
-
| `"/"` | `#/` | `0` |
442
-
| `"/a~1b"` | `#/a~1b` | `1` |
443
-
| `"/c%d"` | `#/c%25d` | `2` |
444
-
| `"/ "` | `#/%20` | `7` |
445
-
| `"/m~0n"` | `#/m~0n` | `8` |
446
-
447
351
## Mutation Operations
448
352
449
353
While RFC 6901 defines JSON Pointer for read-only access, RFC 6902
···
459
363
{"foo":"bar","baz":"qux"}
460
364
```
461
365
366
+
In OCaml:
367
+
368
+
```ocaml
369
+
val add : t -> Jsont.json -> value:Jsont.json -> Jsont.json
370
+
```
371
+
462
372
For arrays, `add` inserts BEFORE the specified index:
463
373
464
374
```sh
···
538
448
false
539
449
```
540
450
451
+
## Escaping Special Characters
452
+
453
+
RFC 6901, Section 3 explains the escaping rules:
454
+
455
+
> Because the characters '\~' (%x7E) and '/' (%x2F) have special meanings
456
+
> in JSON Pointer, '\~' needs to be encoded as '\~0' and '/' needs to be
457
+
> encoded as '\~1' when these characters appear in a reference token.
458
+
459
+
Why these specific characters?
460
+
- `/` separates tokens, so it must be escaped inside a token
461
+
- `~` is the escape character itself, so it must also be escaped
462
+
463
+
The escape sequences are:
464
+
- `~0` represents `~` (tilde)
465
+
- `~1` represents `/` (forward slash)
466
+
467
+
### The Library Handles Escaping Automatically
468
+
469
+
**Important**: When using `jsont-pointer` programmatically, you rarely need
470
+
to think about escaping. The `Index.Mem` variant stores unescaped strings,
471
+
and escaping happens automatically during serialization:
472
+
473
+
```ocaml
474
+
(* Create a pointer to key "a/b" - no escaping needed *)
475
+
let p = Jsont_pointer.make [Mem "a/b"]
476
+
477
+
(* Serialize to string - escaping happens automatically *)
478
+
let s = Jsont_pointer.to_string p (* Returns "/a~1b" *)
479
+
480
+
(* Parse from string - unescaping happens automatically *)
481
+
let p' = Jsont_pointer.of_string "/a~1b"
482
+
(* p' contains [Mem "a/b"] - the unescaped key *)
483
+
```
484
+
485
+
The `Token` module exposes the escaping functions if you need them:
486
+
487
+
```ocaml
488
+
module Token : sig
489
+
val escape : string -> string (* "a/b" -> "a~1b" *)
490
+
val unescape : string -> string (* "a~1b" -> "a/b" *)
491
+
end
492
+
```
493
+
494
+
### Escaping in Action
495
+
496
+
Let's see escaping with the CLI tool:
497
+
498
+
```sh
499
+
$ jsonpp escape "hello"
500
+
hello
501
+
```
502
+
503
+
No special characters, no escaping needed.
504
+
505
+
```sh
506
+
$ jsonpp escape "a/b"
507
+
a~1b
508
+
```
509
+
510
+
The `/` becomes `~1`.
511
+
512
+
```sh
513
+
$ jsonpp escape "a~b"
514
+
a~0b
515
+
```
516
+
517
+
The `~` becomes `~0`.
518
+
519
+
```sh
520
+
$ jsonpp escape "~/"
521
+
~0~1
522
+
```
523
+
524
+
Both characters are escaped.
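The escaping rule is small enough to sketch in a few lines of plain OCaml. `escape_token` below is a hypothetical stand-in written for this tutorial, assuming nothing beyond the standard library; it is not the library's `Token.escape` implementation.

```ocaml
(* Hypothetical sketch of RFC 6901 token escaping:
   '~' becomes "~0" and '/' becomes "~1"; everything else is copied. *)
let escape_token (s : string) : string =
  let buf = Buffer.create (String.length s) in
  String.iter
    (function
      | '~' -> Buffer.add_string buf "~0"
      | '/' -> Buffer.add_string buf "~1"
      | c -> Buffer.add_char buf c)
    s;
  Buffer.contents buf
```

Because each input character is handled exactly once, there is no ordering problem on the escaping side: a `~` produced by escaping `/` is never re-escaped.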
525
+
526
+
### Unescaping
527
+
528
+
And the reverse process:
529
+
530
+
```sh
531
+
$ jsonpp unescape "a~1b"
532
+
OK: a/b
533
+
```
534
+
535
+
```sh
536
+
$ jsonpp unescape "a~0b"
537
+
OK: a~b
538
+
```
539
+
540
+
### The Order Matters!
541
+
542
+
RFC 6901, Section 4 is careful to specify the unescaping order:
543
+
544
+
> Evaluation of each reference token begins by decoding any escaped
545
+
> character sequence. This is performed by first transforming any
546
+
> occurrence of the sequence '~1' to '/', and then transforming any
547
+
> occurrence of the sequence '~0' to '~'. By performing the substitutions
548
+
> in this order, an implementation avoids the error of turning '~01' first
549
+
> into '~1' and then into '/', which would be incorrect (the string '~01'
550
+
> correctly becomes '~1' after transformation).
551
+
552
+
Let's verify this tricky case:
553
+
554
+
```sh
555
+
$ jsonpp unescape "~01"
556
+
OK: ~1
557
+
```
558
+
559
+
If we unescaped `~0` first, `~01` would become `~1`, which would then become
560
+
`/`. But that's wrong! The sequence `~01` should become the literal string
561
+
`~1` (a tilde followed by the digit one).
562
+
563
+
Invalid escape sequences are rejected:
564
+
565
+
```sh
566
+
$ jsonpp unescape "~2"
567
+
ERROR: Invalid JSON Pointer: invalid escape sequence ~2
568
+
```
569
+
570
+
```sh
571
+
$ jsonpp unescape "hello~"
572
+
ERROR: Invalid JSON Pointer: incomplete escape sequence at end
573
+
```
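One way to get both the ordering and the error cases right is to scan the token once, left to right, instead of doing two global substitutions. This is a different technique from the two-step replacement the RFC describes, but it gives the same result for valid input, since every `~` is consumed together with the character that follows it. `unescape_token` is a hypothetical helper for illustration, and the error strings are assumptions modelled on the CLI output above.

```ocaml
(* Hypothetical single-pass unescaper, not the library's implementation.
   Each '~' must be followed by '0' or '1'. *)
let unescape_token (s : string) : (string, string) result =
  let n = String.length s in
  let buf = Buffer.create n in
  let rec go i =
    if i >= n then Ok (Buffer.contents buf)
    else if s.[i] <> '~' then (Buffer.add_char buf s.[i]; go (i + 1))
    else if i + 1 >= n then Error "incomplete escape sequence at end"
    else
      match s.[i + 1] with
      | '0' -> Buffer.add_char buf '~'; go (i + 2)
      | '1' -> Buffer.add_char buf '/'; go (i + 2)
      | c -> Error (Printf.sprintf "invalid escape sequence ~%c" c)
  in
  go 0
```

For `"~01"`, the scanner consumes `~0` as one unit, emits `~`, and then copies the trailing `1` unchanged, so the output can never be reinterpreted as a second escape.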
574
+
575
+
## URI Fragment Encoding
576
+
577
+
JSON Pointers can be embedded in URIs. RFC 6901, Section 6 explains:
578
+
579
+
> A JSON Pointer can be represented in a URI fragment identifier by
580
+
> encoding it into octets using UTF-8, while percent-encoding those
581
+
> characters not allowed by the fragment rule in RFC 3986.
582
+
583
+
This adds percent-encoding on top of the `~0`/`~1` escaping:
584
+
585
+
```sh
586
+
$ jsonpp uri-fragment "/foo"
587
+
OK: /foo -> /foo
588
+
```
589
+
590
+
Simple pointers often don't need percent-encoding.
591
+
592
+
```sh
593
+
$ jsonpp uri-fragment "/a~1b"
594
+
OK: /a~1b -> /a~1b
595
+
```
596
+
597
+
The `~1` escape stays as-is (it's valid in URI fragments).
598
+
599
+
```sh
600
+
$ jsonpp uri-fragment "/c%d"
601
+
OK: /c%d -> /c%25d
602
+
```
603
+
604
+
The `%` character must be percent-encoded as `%25` in URIs!
605
+
606
+
```sh
607
+
$ jsonpp uri-fragment "/ "
608
+
OK: / -> /%20
609
+
```
610
+
611
+
Spaces become `%20`.
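The examples above can be reproduced with a short sketch that percent-encodes every byte not permitted by the `fragment` rule of RFC 3986. Both function names are hypothetical and written only for this illustration; the library's `to_uri_fragment` may choose to encode a larger set of characters.

```ocaml
(* Characters RFC 3986 allows unencoded in a fragment:
   unreserved, sub-delims, ':', '@', '/' and '?'. *)
let allowed_in_fragment = function
  | 'A' .. 'Z' | 'a' .. 'z' | '0' .. '9' -> true
  | '-' | '.' | '_' | '~' -> true
  | '!' | '$' | '&' | '\'' | '(' | ')' | '*' | '+' | ',' | ';' | '=' -> true
  | ':' | '@' | '/' | '?' -> true
  | _ -> false

(* Hypothetical helper: percent-encode an already escaped pointer string,
   byte by byte, so UTF-8 input is encoded one octet at a time. *)
let percent_encode_fragment (pointer : string) : string =
  let buf = Buffer.create (String.length pointer) in
  String.iter
    (fun c ->
      if allowed_in_fragment c then Buffer.add_char buf c
      else Buffer.add_string buf (Printf.sprintf "%%%02X" (Char.code c)))
    pointer;
  Buffer.contents buf
```

The input here is the pointer's string form, so `~0` and `~1` escapes are already in place and pass through untouched because `~` is an unreserved character.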
612
+
613
+
The library provides functions for URI fragment encoding:
614
+
615
+
```ocaml
616
+
val to_uri_fragment : t -> string
617
+
val of_uri_fragment : string -> t
618
+
val jsont_uri_fragment : t Jsont.t
619
+
```
620
+
621
+
Here's the RFC example showing the URI fragment forms:
622
+
623
+
| JSON Pointer | URI Fragment | Value |
624
+
|-------------|-------------|-------|
625
+
| `""` | `#` | whole document |
626
+
| `"/foo"` | `#/foo` | `["bar", "baz"]` |
627
+
| `"/foo/0"` | `#/foo/0` | `"bar"` |
628
+
| `"/"` | `#/` | `0` |
629
+
| `"/a~1b"` | `#/a~1b` | `1` |
630
+
| `"/c%d"` | `#/c%25d` | `2` |
631
+
| `"/ "` | `#/%20` | `7` |
632
+
| `"/m~0n"` | `#/m~0n` | `8` |
633
+
541
634
## Deeply Nested Structures
542
635
543
636
JSON Pointer handles arbitrarily deep nesting:
···
561
654
{"arr":[[1,99,2],[3,4]]}
562
655
```
563
656
657
+
## Jsont Integration
658
+
659
+
The library integrates with the `Jsont` codec system for typed access:
660
+
661
+
```ocaml
662
+
(* Codec for JSON Pointers as JSON strings *)
663
+
val jsont : t Jsont.t
664
+
665
+
(* Query combinators *)
666
+
val path : ?absent:'a -> t -> 'a Jsont.t -> 'a Jsont.t
667
+
val set_path : ?allow_absent:bool -> 'a Jsont.t -> t -> 'a -> Jsont.json Jsont.t
668
+
val update_path : ?absent:'a -> t -> 'a Jsont.t -> Jsont.json Jsont.t
669
+
val delete_path : ?allow_absent:bool -> t -> Jsont.json Jsont.t
670
+
```
671
+
672
+
These allow you to use JSON Pointers with typed codecs rather than raw
673
+
`Jsont.json` values.
674
+
564
675
## Summary
565
676
566
677
JSON Pointer (RFC 6901) provides a simple but powerful way to address
567
678
values within JSON documents:
568
679
569
680
1. **Syntax**: Pointers are strings of `/`-separated reference tokens
570
-
2. **Escaping**: Use `~0` for `~` and `~1` for `/` in tokens
681
+
2. **Escaping**: Use `~0` for `~` and `~1` for `/` in tokens (handled automatically by the library)
571
682
3. **Evaluation**: Tokens navigate through objects (by key) and arrays (by index)
572
683
4. **URI Encoding**: Pointers can be percent-encoded for use in URIs
573
684
5. **Mutations**: Combined with JSON Patch (RFC 6902), pointers enable structured updates