-276
docs/compatibility.md
# CRDT Library - json-joy Compatibility Specification

This document defines the compatibility scope for the OCaml CRDT library with the
[json-joy](https://github.com/streamich/json-joy) TypeScript implementation.

## Reference Implementation

- **Repository**: https://github.com/streamich/json-joy
- **Source**: `packages/json-joy/src/`
- **Test Fixtures**: https://github.com/streamich/json-crdt-traces

## CRDT Node Types (7 types)

All node types from `json-crdt/nodes/` must be implemented:

| Type | Description | json-joy Source | Priority |
|------|-------------|-----------------|----------|
| `con` | Constant/immutable value | `nodes/const/` | P1 |
| `val` | Mutable reference (LWW register) | `nodes/val/` | P1 |
| `obj` | Mutable string-keyed map (LWW per key) | `nodes/obj/` | P1 |
| `vec` | Mutable indexed tuple (0-255 slots) | `nodes/vec/` | P1 |
| `arr` | Mutable ordered list (RGA-based) | `nodes/arr/` | P1 |
| `str` | Mutable text string (RGA-based) | `nodes/str/` | P1 |
| `bin` | Mutable binary data (RGA-based) | `nodes/bin/` | P1 |

### Node Interface

Each node must implement:
- `id: Timestamp` - Creation timestamp (unique identifier)
- `name(): string` - Type name for debugging
- `view(): Value.t` - Materialize current value
- `children(): Node list` - Child nodes for traversal

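One possible shape for the node interface, shown as a minimal sketch. It assumes a closed variant covering only three of the seven node kinds, a cut-down `value` type, and an empty `vec` slot viewing as `Undefined`; none of these names or choices come from json-joy or from the library itself.

```ocaml
(* Hypothetical, reduced types used only for this illustration. *)
type timestamp = { sid : int; time : int }

type value =
  | Null
  | Undefined
  | Int of int
  | String of string
  | Array of value list

(* Three of the seven node kinds, modelled as one variant. *)
type node =
  | Con of { id : timestamp; value : value }
  | Val of { id : timestamp; mutable target : node }
  | Vec of { id : timestamp; slots : node option array }

(* id: creation timestamp of the node. *)
let id = function Con { id; _ } | Val { id; _ } | Vec { id; _ } -> id

(* name: type name for debugging. *)
let name = function Con _ -> "con" | Val _ -> "val" | Vec _ -> "vec"

(* children: direct child nodes, for traversal. *)
let children = function
  | Con _ -> []
  | Val { target; _ } -> [ target ]
  | Vec { slots; _ } -> List.filter_map Fun.id (Array.to_list slots)

(* view: materialize the current value (empty slot -> Undefined, an assumption). *)
let rec view = function
  | Con { value; _ } -> value
  | Val { target; _ } -> view target
  | Vec { slots; _ } ->
    let slot_view = function Some n -> view n | None -> Undefined in
    Array (List.map slot_view (Array.to_list slots))
```

A module type with one implementation per node kind would work equally well; the variant form is used here only because it keeps traversal code in one place.
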
## Patch Operations (18 opcodes)

All operations from `json-crdt-patch/operations.ts`:

### Creation Operations (7)

| Opcode | Name | Description |
|--------|------|-------------|
| 0 | `new_con` | Create constant node |
| 1 | `new_val` | Create mutable value node |
| 2 | `new_obj` | Create object node |
| 3 | `new_vec` | Create vector/tuple node |
| 4 | `new_str` | Create string node |
| 5 | `new_bin` | Create binary node |
| 6 | `new_arr` | Create array node |

### Insertion Operations (7)

| Opcode | Name | Description |
|--------|------|-------------|
| 7 | `ins_val` | Set value node reference |
| 8 | `ins_obj` | Insert/update object key |
| 9 | `ins_vec` | Set vector slot |
| 10 | `ins_str` | Insert text into string |
| 11 | `ins_bin` | Insert bytes into binary |
| 12 | `ins_arr` | Insert element into array |
| 13 | `upd_arr` | Update array element in place |

### Other Operations (4)

| Opcode | Name | Description |
|--------|------|-------------|
| 14 | `del` | Delete range (RGA tombstone) |
| 15 | `nop` | No-op (padding/alignment) |

**Note**: Opcodes 16-17 are reserved for future use. Together with `del` and `nop` they make up the four opcodes counted in this group.

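The opcode tables above map naturally onto a variant with integer conversions. The following is a sketch only: the constructor names and the `to_int`/`of_int` helpers are hypothetical, and the numbering simply mirrors the tables in this document rather than being checked against json-joy's source.

```ocaml
(* Hypothetical variant mirroring the 16 defined opcodes listed above. *)
type opcode =
  | New_con | New_val | New_obj | New_vec | New_str | New_bin | New_arr
  | Ins_val | Ins_obj | Ins_vec | Ins_str | Ins_bin | Ins_arr | Upd_arr
  | Del | Nop

(* Numbering copied from the tables in this document. *)
let to_int = function
  | New_con -> 0 | New_val -> 1 | New_obj -> 2 | New_vec -> 3
  | New_str -> 4 | New_bin -> 5 | New_arr -> 6
  | Ins_val -> 7 | Ins_obj -> 8 | Ins_vec -> 9 | Ins_str -> 10
  | Ins_bin -> 11 | Ins_arr -> 12 | Upd_arr -> 13
  | Del -> 14 | Nop -> 15

let all =
  [ New_con; New_val; New_obj; New_vec; New_str; New_bin; New_arr;
    Ins_val; Ins_obj; Ins_vec; Ins_str; Ins_bin; Ins_arr; Upd_arr;
    Del; Nop ]

(* Reserved (16-17) and out-of-range values decode to None. *)
let of_int n = List.find_opt (fun op -> to_int op = n) all
```

Returning `None` for the reserved values lets a decoder reject unknown opcodes explicitly instead of raising.
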
## Value Types (CBOR Extended)

The library must support JSON values plus CBOR extensions:

```ocaml
type t =
  | Null
  | Undefined                     (* CBOR extension - not in JSON *)
  | Bool of bool
  | Int of int                    (* 53-bit safe integer *)
  | Float of float
  | String of string
  | Bytes of bytes                (* CBOR extension - not in JSON *)
  | Array of t list
  | Object of (string * t) list
  | Timestamp_ref of Timestamp.t  (* Reference to another node *)
```

### Encoding Notes

- `Undefined` encodes as CBOR undefined (0xf7), JSON `null`
- `Bytes` encodes as CBOR bytes, JSON base64 string
- `Timestamp_ref` encodes as 2-element array `[sid, time]`

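The JSON side of these encoding rules can be sketched as a mapping into a small JSON tree. This is an illustration under assumptions: the `json` type, `to_json`, and `base64` are hypothetical names, and the base64 variant is plain RFC 4648 with padding, which may differ from the exact string form json-joy emits.

```ocaml
type timestamp = { sid : int; time : int }

type value =
  | Null
  | Undefined
  | Bool of bool
  | Int of int
  | Float of float
  | String of string
  | Bytes of bytes
  | Array of value list
  | Object of (string * value) list
  | Timestamp_ref of timestamp

(* Hypothetical minimal JSON tree used as the encoding target. *)
type json =
  | J_null
  | J_bool of bool
  | J_num of float
  | J_str of string
  | J_arr of json list
  | J_obj of (string * json) list

(* Standard base64 with '=' padding (assumed variant). *)
let base64 (b : bytes) : string =
  let alphabet =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/" in
  let n = Bytes.length b in
  let buf = Buffer.create ((n + 2) / 3 * 4) in
  let byte i = if i < n then Char.code (Bytes.get b i) else 0 in
  let i = ref 0 in
  while !i < n do
    let triple =
      ((byte !i) lsl 16) lor ((byte (!i + 1)) lsl 8) lor (byte (!i + 2)) in
    let sextet k = alphabet.[(triple lsr (18 - (6 * k))) land 63] in
    Buffer.add_char buf (sextet 0);
    Buffer.add_char buf (sextet 1);
    Buffer.add_char buf (if !i + 1 < n then sextet 2 else '=');
    Buffer.add_char buf (if !i + 2 < n then sextet 3 else '=');
    i := !i + 3
  done;
  Buffer.contents buf

let rec to_json = function
  | Null | Undefined -> J_null               (* Undefined degrades to null *)
  | Bool b -> J_bool b
  | Int n -> J_num (float_of_int n)
  | Float f -> J_num f
  | String s -> J_str s
  | Bytes b -> J_str (base64 b)              (* bytes become a base64 string *)
  | Array vs -> J_arr (List.map to_json vs)
  | Object kvs -> J_obj (List.map (fun (k, v) -> (k, to_json v)) kvs)
  | Timestamp_ref { sid; time } ->           (* [sid, time] pair *)
    J_arr [ J_num (float_of_int sid); J_num (float_of_int time) ]
```

Note that this direction is lossy: `Undefined` and `Null` both map to JSON `null`, so a JSON decoder cannot recover the distinction without extra type information.
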
## Clock and Timestamp Types

From `json-crdt-patch/clock/`:

### Timestamp

```ocaml
type timestamp = {
  sid: int;   (* Session ID, 53-bit safe integer *)
  time: int;  (* Logical time, 53-bit safe integer *)
}
```

### Timespan

```ocaml
type timespan = {
  sid: int;
  time: int;
  span: int;  (* Length of the span *)
}
```

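Two operations these types typically need are a total order on timestamps and a containment test for spans. The sketch below assumes an ordering by `time` first with `sid` as tie-breaker, and a half-open span `[time, time + span)`; both are assumptions for illustration, and the module layout is hypothetical.

```ocaml
module Timestamp = struct
  type t = { sid : int; time : int }

  (* Assumed order: logical time first, session ID breaks ties. *)
  let compare a b =
    match Int.compare a.time b.time with
    | 0 -> Int.compare a.sid b.sid
    | c -> c
end

module Timespan = struct
  type t = { sid : int; time : int; span : int }

  (* True when ts lies in the same session and inside [time, time + span). *)
  let contains (s : t) (ts : Timestamp.t) =
    s.sid = ts.Timestamp.sid
    && ts.Timestamp.time >= s.time
    && ts.Timestamp.time < s.time + s.span
end
```

Keeping the two records in separate modules avoids the field-name clash that arises when `timestamp` and `timespan` share `sid` and `time` in one scope.
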
### Session ID Constants

From `json-crdt-patch/constants.ts`:

| Name | Value | Description |
|------|-------|-------------|
| `SYSTEM` | 0 | Reserved for system use |
| `SERVER` | 1 | Server clock mode |
| `GLOBAL` | 2 | Global/schema patches |
| `LOCAL` | 3 | Local-only data |
| `MAX` | 9007199254740991 | 53-bit max (2^53 - 1) |

### ClockVector

Full vector clock implementation supporting multiple concurrent sessions:

```ocaml
type clock_vector = {
  local: logical_clock;     (* This session's clock *)
  peers: (int, int) Map.t;  (* Session ID -> observed time *)
}
```

Methods:
- `tick()` - Increment local clock
- `observe(timestamp)` - Update peer knowledge
- `fork(new_sid)` - Create independent replica
- `clone()` - Deep copy

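The methods listed above could look roughly like this in a purely functional style. This is a sketch under assumptions: `Int_map` stands in for the unspecified `Map.t`, `logical_clock` is taken to be a plain timestamp, `tick` returns the timestamp it hands out, and the exact `observe`/`fork` semantics are a guess at reasonable behavior rather than a transcription of json-joy.

```ocaml
module Int_map = Map.Make (Int)

type timestamp = { sid : int; time : int }

type clock_vector = {
  local : timestamp;       (* this session's clock *)
  peers : int Int_map.t;   (* session ID -> highest observed time *)
}

let create sid = { local = { sid; time = 0 }; peers = Int_map.empty }

(* tick: hand out the current timestamp and advance the local clock. *)
let tick cv =
  let stamp = cv.local in
  (stamp, { cv with local = { stamp with time = stamp.time + 1 } })

(* observe: record a peer's timestamp and move the local clock past it. *)
let observe cv (ts : timestamp) =
  let peers =
    if ts.sid = cv.local.sid then cv.peers
    else
      Int_map.update ts.sid
        (function Some t when t >= ts.time -> Some t | _ -> Some ts.time)
        cv.peers
  in
  let local =
    if ts.time >= cv.local.time then { cv.local with time = ts.time + 1 }
    else cv.local
  in
  { local; peers }

(* fork: new session that starts at the parent's time and remembers the parent. *)
let fork cv new_sid =
  let peers =
    if cv.local.time > 0 then
      Int_map.add cv.local.sid (cv.local.time - 1) cv.peers
    else cv.peers
  in
  { local = { sid = new_sid; time = cv.local.time }; peers }
```

Because the record and map are immutable, `clone()` needs no separate function in this style: any `clock_vector` value can be shared safely.
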
## Codec Formats

### Patch Codecs

| Format | Description | File Extension |
|--------|-------------|----------------|
| Verbose JSON | Human-readable, explicit types | `.verbose.json` |
| Compact JSON | Array-based, smaller | `.compact.json` |
| Binary | CBOR-based, wire-efficient | `.bin` |

### Document Codecs

| Format | Description | Use Case |
|--------|-------------|----------|
| Sidecar | View + separate metadata | Non-CRDT compatibility |
| Verbose Structural | Tree with explicit nodes | Debugging |
| Compact Structural | Array-based tree | Storage |
| Binary Structural | CBOR tree | Wire transfer |
| Indexed | Flat node map | Incremental sync |

## Conformance Testing

The library must pass all tests from [json-crdt-traces](https://github.com/streamich/json-crdt-traces):

### Test Categories

1. **Codec Roundtrip**
   - Decode each format (verbose, compact, binary)
   - Re-encode and compare
   - Binary must match byte-for-byte

2. **Patch Application**
   - Apply patches from trace to empty model
   - Compare final view to `view.json`

3. **Model Codecs**
   - Decode model snapshots
   - Verify view matches expected
   - Re-encode and compare

4. **Convergence**
   - Apply patches in different orders
   - Verify all replicas converge to same state

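The convergence category can be illustrated with a toy last-writer-wins register standing in for a full model. Everything here is hypothetical scaffolding (the `write` record, `replay`, `converges`) and the tie-breaking order is an assumption; a real test would replay trace patches against the actual model instead.

```ocaml
type timestamp = { sid : int; time : int }

(* A toy "patch": one write to a single LWW register. *)
type write = { id : timestamp; value : string }

(* Assumed order: higher time wins, session ID breaks ties. *)
let newer a b = compare (a.time, a.sid) (b.time, b.sid) > 0

(* Apply one write; keep the current value unless the write is newer. *)
let apply reg w =
  match reg with
  | Some cur when not (newer w.id cur.id) -> reg
  | _ -> Some w

(* Replay a list of writes against an empty register. *)
let replay ws = List.fold_left apply None ws

(* All orderings of a list of distinct elements. *)
let rec permutations = function
  | [] -> [ [] ]
  | l ->
    List.concat_map
      (fun x ->
        let rest = List.filter (fun y -> y <> x) l in
        List.map (fun p -> x :: p) (permutations rest))
      l

(* Convergence: every application order yields the same final state. *)
let converges ws =
  match List.map replay (permutations ws) with
  | [] -> true
  | first :: rest -> List.for_all (( = ) first) rest
```

Exhaustive permutation only scales to a handful of patches; for longer traces, sampling a fixed number of random orders is the usual compromise.
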
### Trace Types

- `text/` - Plain text editing traces
- `json/` - JSON document editing traces
- `rich-text/` - Peritext rich text traces
- `fuzzer/` - Randomized operation traces

## Extensions (Optional)

Higher-order CRDT types built on the core:

| Extension | Description | Built On |
|-----------|-------------|----------|
| Counter (`cnt`) | PN-counter | `vec` node |
| Multi-value (`mval`) | All concurrent writes visible | `arr` node |
| Peritext | Rich text with formatting | `str` + annotations |

## JSON-Rx RPC Protocol

For OCaml-to-OCaml communication:

### Message Types

```ocaml
type message =
  | Request of { id: int; method_: string; data: Value.t option }
  | Response of { id: int; data: Value.t }
  | Error of { id: int; error: Value.t }
  | Notification of { method_: string; data: Value.t option }
  | Subscribe of { id: int; channel: string }
  | Unsubscribe of { id: int }
  | Data of { id: int; data: Value.t }
  | Complete of { id: int }
```

### Sync Protocol

1. Client subscribes to document
2. Server sends initial snapshot
3. Bidirectional patch streaming
4. Conflict-free merge on both sides

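The first two steps of the sync protocol can be sketched as a server-side dispatch over the message type. This is an illustration only: `value` is a string stand-in for `Value.t`, and `handle_client_message` and its `snapshot` callback are hypothetical names, not part of JSON-Rx or of this library.

```ocaml
(* Stand-in for Value.t so the sketch is self-contained. *)
type value = string

type message =
  | Request of { id : int; method_ : string; data : value option }
  | Response of { id : int; data : value }
  | Error of { id : int; error : value }
  | Notification of { method_ : string; data : value option }
  | Subscribe of { id : int; channel : string }
  | Unsubscribe of { id : int }
  | Data of { id : int; data : value }
  | Complete of { id : int }

(* Replies the server sends for one incoming client message.
   snapshot: hypothetical lookup from channel name to encoded document. *)
let handle_client_message ~(snapshot : string -> value) = function
  | Subscribe { id; channel } ->
    (* Steps 1-2: a subscription is answered with the initial snapshot. *)
    [ Data { id; data = snapshot channel } ]
  | Unsubscribe { id } ->
    (* Closing a subscription ends its stream. *)
    [ Complete { id } ]
  | Request { id; method_; _ } ->
    (* No RPC methods are defined in this sketch. *)
    [ Error { id; error = "unknown method: " ^ method_ } ]
  | Notification _ | Response _ | Error _ | Data _ | Complete _ -> []
```

Patch streaming (step 3) would reuse `Data` messages under the same subscription `id` in both directions, with each side applying received patches to its local model.
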
## OCaml-Specific Design

### Effects-Based IO

Use OCaml 5.4 effects for IO-agnostic design:

```ocaml
type _ Effect.t +=
  | Read_bytes : int -> bytes Effect.t
  | Write_bytes : bytes -> unit Effect.t
  | Yield : unit Effect.t
```

Effect handlers for:
- Eio (recommended)
- Lwt (legacy)
- Direct/blocking (simple scripts)

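To show how a handler plugs into these effects, here is a minimal in-memory handler written with the standard `Effect.Deep` API. It is a sketch for tests and simple scripts, not one of the planned Eio or Lwt backends; `run_in_memory` and `copy_three` are hypothetical names.

```ocaml
open Effect
open Effect.Deep

type _ Effect.t +=
  | Read_bytes : int -> bytes Effect.t
  | Write_bytes : bytes -> unit Effect.t
  | Yield : unit Effect.t

(* Run f with reads served from input and writes collected in a buffer.
   Returns f's result together with everything it wrote. *)
let run_in_memory ~(input : bytes) (f : unit -> 'r) : 'r * bytes =
  let pos = ref 0 in
  let out = Buffer.create 16 in
  let result =
    try_with f ()
      { effc =
          (fun (type a) (eff : a Effect.t) ->
            match eff with
            | Read_bytes n ->
              Some
                (fun (k : (a, _) continuation) ->
                  (* Serve at most n bytes; fewer at end of input. *)
                  let n = max 0 (min n (Bytes.length input - !pos)) in
                  let chunk = Bytes.sub input !pos n in
                  pos := !pos + n;
                  continue k chunk)
            | Write_bytes b ->
              Some
                (fun (k : (a, _) continuation) ->
                  Buffer.add_bytes out b;
                  continue k ())
            | Yield ->
              (* Nothing to schedule in a blocking handler. *)
              Some (fun (k : (a, _) continuation) -> continue k ())
            | _ -> None) }
  in
  (result, Buffer.to_bytes out)

(* IO-agnostic code: reads three bytes and echoes them. *)
let copy_three () =
  let chunk = perform (Read_bytes 3) in
  perform Yield;
  perform (Write_bytes chunk);
  Bytes.length chunk
```

The same `copy_three` function would run unchanged under a different handler, which is the point of keeping codec code free of direct IO calls.
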
### Module Structure

```
lib/
  crdt.ml           (* Top-level module *)
  value/            (* Value.t and JSON Pointer *)
  clock/            (* Timestamps, ClockVector *)
  patch/            (* Operations, Patch, Builder *)
  model/            (* Nodes, Model, API *)
  codec/            (* All codec implementations *)
  rx/               (* JSON-Rx RPC *)
```

### Dependencies

- `jsont` - JSON encoding/decoding
- `jsont.bytesrw` - Binary/streaming support

## Version Compatibility

This specification targets json-joy compatibility as of:
- **json-joy version**: Latest main branch (Dec 2024)
- **json-crdt-traces version**: Latest main branch

Wire format compatibility is required - documents and patches encoded by
either implementation must be readable by the other.