An atproto PDS written in F# (.NET 9)
pds fsharp giraffe dotnet atproto
# DAG-CBOR Implementation Notes

DAG-CBOR is the canonical data serialization format for the AT Protocol.
It is a strict subset of CBOR (RFC 8949) with specific rules for determinism and linking.

## 1. Canonicalization Rules

To ensure consistent Content IDs (CIDs) for the same data, specific canonicalization rules must be followed during encoding.

### Map Key Sorting

Maps must be sorted by keys. The sorting order is **NOT** standard lexicographical order.

1. **Length**: Shorter keys come first.
2. **Bytes**: Keys of the same length are sorted lexicographically by their UTF-8 byte representation.

**Example:**

- `"a"` (len 1) comes before `"aa"` (len 2).
- `"b"` (len 1) comes before `"aa"` (len 2).
- `"a"` comes before `"b"`.

### Integer Encoding

Integers must be encoded using the smallest possible representation.

`System.Formats.Cbor` (in Strict mode) generally handles this, but care must be taken to treat `int`, `int64`, and `uint64` consistently.

## 2. Content Addressing (CIDs)

Links to other nodes (CIDs) are encoded using **CBOR Tag 42**.

### Format

1. **Tag**: `42` (Major type 6, value 42).
2. **Payload**: A byte string containing:
   - The `0x00` byte (Multibase identity prefix, required by IPLD specs for binary CID inclusion).
   - The raw bytes of the CID.

## 3. Known Gotchas

- **Float vs Int**:
  AT Protocol generally discourages floats where integers suffice.
  F# types must be matched carefully to avoid encoding `2.0` instead of `2`.
- **String Encoding**:
  Must be UTF-8. Indefinite-length strings are prohibited in DAG-CBOR.
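
## 4. Worked Sketch

The rules above (map key ordering, smallest-possible integer heads, and the Tag 42 link layout) can be illustrated with a small hand-rolled sketch. This is not the project's actual encoder and deliberately does not use `System.Formats.Cbor`; the names `compareKeys`, `encodeHead`, and `encodeLink` are hypothetical helpers introduced only for illustration. It assumes that key length is measured in UTF-8 bytes and that `cidBytes` already holds the binary form of a CID.

```fsharp
open System.Text

/// Hypothetical helper: compares two map keys in DAG-CBOR order.
/// Assumption: "length" means the length of the UTF-8 encoding in bytes.
let compareKeys (a: string) (b: string) : int =
    let ba = Encoding.UTF8.GetBytes a
    let bb = Encoding.UTF8.GetBytes b
    if ba.Length <> bb.Length then
        // Rule 1: shorter keys come first.
        compare ba.Length bb.Length
    else
        // Rule 2: same length, so compare byte by byte (bytes are unsigned).
        Seq.compareWith compare ba bb

/// Hypothetical helper: encodes a CBOR head (major type plus argument)
/// using the smallest representation that fits the value.
let encodeHead (majorType: int) (value: uint64) : byte[] =
    let mt = byte (majorType <<< 5)
    if value < 24UL then
        [| mt ||| byte value |]                       // argument fits in the initial byte
    elif value <= 0xFFUL then
        [| mt ||| 24uy; byte value |]                 // 1-byte argument
    elif value <= 0xFFFFUL then
        [| mt ||| 25uy; byte (value >>> 8); byte value |]  // 2-byte argument
    elif value <= 0xFFFFFFFFUL then
        Array.append [| mt ||| 26uy |]
                     [| for s in [24; 16; 8; 0] -> byte (value >>> s) |]
    else
        Array.append [| mt ||| 27uy |]
                     [| for s in [56; 48; 40; 32; 24; 16; 8; 0] -> byte (value >>> s) |]

/// Hypothetical helper: encodes a CID link as Tag 42 wrapping a byte string
/// that starts with the 0x00 multibase identity prefix.
let encodeLink (cidBytes: byte[]) : byte[] =
    Array.concat [
        encodeHead 6 42UL                              // tag 42 (major type 6)
        encodeHead 2 (uint64 (cidBytes.Length + 1))    // byte string head, +1 for the prefix
        [| 0x00uy |]                                   // multibase identity prefix
        cidBytes
    ]
```

With these definitions, `List.sortWith compareKeys ["aa"; "b"; "a"]` yields `["a"; "b"; "aa"]`, matching the example in section 1. In a real encoder the same comparison would be applied to map entries just before they are written, so that equal data always produces equal bytes and therefore equal CIDs.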