PDSharp.Docs/mst.md at main · desertthunder.dev/PDSharp

an atproto pds written in F# (.NET 9) 🦒
pds fsharp giraffe dotnet atproto
PDSharp / PDSharp.Docs / mst.md
at main 3.3 kB view raw view rendered
 1# Merkle Search Tree (MST) Implementation Notes
 2
 3The Merkle Search Tree (MST) is a probabilistic, balanced search tree used by the AT Protocol to store repository records.
 4
 5## Overview
 6
 7MSTs combine properties of B-trees and Merkle trees to ensure:
 8
 91. **Determinism**: The tree structure is determined by the keys (and their hashes), not insertion order.
102. **Verifyability**: Every node is content-addressed (CID), allowing the entire state to be verified via a single root hash.
113. **Efficiency**: Efficient key-value lookups and delta-based sync (subtrees that haven't changed share the same CIDs).
12
13## Core Concepts
14
15### Layering (Probabilistic Balance)
16
17MSTs do not use rotation for balancing. Instead, they assign every key a "layer" based on its hash.
18
19- **Formula**:
20  `Layer(key) = countLeadingZeros(SHA256(key)) / 2`.
21- **Fanout**:
22  The divisor `2` implies a fanout of roughly 4 (2 bits per layer increment).
23- Keys with higher layers appear higher in the tree, splitting the range of keys below them.
24
25### Data Structure (`MstNode`)
26
27An MST node consists of:
28
29- **Left Child (`l`)**: Use to traverse to keys lexicographically smaller than the first entry in this node.
30- **Entries (`e`)**: A sorted list of entries. Each entry contains:
31    - **Prefix Length (`p`)**: Length of the shared prefix with the *previous* key in the node (or the split key).
32    - **Key Suffix (`k`)**: The remaining bytes of the key.
33    - **Value (`v`)**: The CID of the record data.
34    - **Tree (`t`)**: (Optional) CID of the subtree containing keys between this entry and the next.
35
36**Serialization**: The node is serialized as a DAG-CBOR array: `[l, [e1, e2, ...]]`.
37
38## Algorithms
39
40### Insertion (`Put`)
41
42Insertion relies on the "Layer" property:
43
441. Calculate `Layer(newKey)`.
452. Traverse the tree from the root.
463. **Split Condition**: If `Layer(newKey)` is **greater** than the layer of the current node, the new key belongs *above* this node.
47    - The current node is **split** into two children (Left and Right) based on the `newKey`.
48    - The `newKey` becomes a new node pointing to these two children.
494. **Recurse**: If `Layer(newKey)` is **less** than the current node, find the correct child subtree (based on key comparison) and recurse.
505. **Same Layer**: If `Layer(newKey)` equals the current node's layer:
51    - Insert perfectly into the sorted entries list.
52    - Any existing child pointer at that position must be split and redistributed if necessary (though spec usually implies layers are unique enough or handled by standard BST insert at that level).
53
54### Deletion
55
561. Locate the key.
572. Remove the entry.
583. **Merge**:
59   Since the key acted as a separator for two subtrees (its "Left" previous child and its "Tree" child), removing it requires merging these two adjacent subtrees into a single valid MST node to preserve the tree's density and structure.
60
61### Determinism & Prefix Compression
62
63- **Canonical Order**: Keys must always be sorted.
64- **Prefix Compression**:
65  Crucial for space saving.
66  The prefix length `p` is calculated relative to the *immediately preceding key* in the node.
67- **Issues**:
68  Insertion order *should not* matter (commutativity).
69  However, implementations must be careful with `Split` and `Merge` operations to ensure exactly the same node boundaries are created regardless of history.