A Transparent and Verifiable Way to Sync the AT Protocol's PLC Directory

preview warning on spec (+13 -11)
README.md (+1 -1)

@@ -30,7 +30,7 @@
 >
 > This project and plcbundle specification is currently unstable and under heavy development. Things can break at any time. Bundle hashes or data formats may change. **Do not** use this for production systems. Please wait for the **`1.0`** release.
 
-plcbundle archives AT Protocol's DID PLC Directory operations into immutable, cryptographically-chained bundles of 10,000 operations. Each bundle is hashed (SHA-256), compressed (zstd), and linked to the previous bundle, creating a verifiable chain of DID operations.
+plcbundle archives AT Protocol's [DID PLC Directory](https://plc.directory/) operations into immutable, cryptographically-chained bundles of 10,000 operations. Each bundle is hashed (SHA-256), compressed (zstd), and linked to the previous bundle, creating a verifiable chain of DID operations.
 
 This repository contains a reference library and a CLI tool written in Go language.
 
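To illustrate the README's claim that each bundle is SHA-256-hashed and zstd-compressed, here is a minimal verification sketch in Go. It is illustrative only and not the repository's reference library or CLI: the function name, file name, and expected hash values are placeholders, and it assumes the `github.com/klauspost/compress/zstd` package for decompression. It checks the two hashes the specification below defines for a bundle: the hash of the compressed `.jsonl.zst` file and the hash of the uncompressed JSONL content.

```go
// Minimal sketch of verifying a downloaded bundle; not the plcbundle reference API.
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"log"
	"os"

	"github.com/klauspost/compress/zstd" // assumed zstd library
)

// sha256hex returns the lowercase hex SHA-256 of b.
func sha256hex(b []byte) string {
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])
}

// verifyBundle checks a .jsonl.zst file against the SHA-256 of its compressed
// bytes and of its uncompressed JSONL content. In practice both expected
// values would come from the plc_bundles.json index; here they are placeholders.
func verifyBundle(path, wantCompressed, wantContent string) error {
	compressed, err := os.ReadFile(path)
	if err != nil {
		return err
	}

	// Compressed hash: verifies the integrity of the downloaded file itself.
	if got := sha256hex(compressed); got != wantCompressed {
		return fmt.Errorf("compressed hash mismatch: got %s", got)
	}

	// Decompress and hash the raw JSONL, which identifies the bundle's data.
	dec, err := zstd.NewReader(nil)
	if err != nil {
		return err
	}
	defer dec.Close()
	jsonl, err := dec.DecodeAll(compressed, nil)
	if err != nil {
		return err
	}
	if got := sha256hex(jsonl); got != wantContent {
		return fmt.Errorf("content hash mismatch: got %s", got)
	}
	return nil
}

func main() {
	// Placeholder file name and hash values for illustration only.
	if err := verifyBundle("000001.jsonl.zst", "<compressed_hash_hex>", "<content_hash_hex>"); err != nil {
		log.Fatal(err)
	}
	fmt.Println("bundle verified")
}
```

Per the terminology in the specification below, the compressed hash is what verifies a download, while the content hash is what uniquely identifies the bundle's data.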
SPECIFICATION.md (+12 -10)

@@ -1,17 +1,19 @@
-# plcbundle V1 Specification
+# plcbundle V1 (draft) Specification
+
+> ⚠️ **Preview Version - Request for Comments!**
 
 ## 1. Abstract
 
-`plcbundle` is a system for archiving and distributing DID PLC (Placeholder) directory operations in a secure, verifiable, and efficient manner. It groups chronological operations from the PLC directory into immutable, compressed bundles. These bundles are cryptographically linked, forming a verifiable chain of history. This specification details the V1 format for the bundles, the index file that describes them, and the processes for creating and verifying them to ensure interoperability between implementations.
+`plcbundle` is a system for archiving and distributing [DID PLC (Placeholder) directory](plc.directory) operations in a secure, verifiable, and efficient manner. It groups chronological operations from the PLC directory into immutable, compressed bundles. These bundles are cryptographically linked, forming a verifiable chain of history. This specification details the V1 format for the bundles, the index file that describes them, and the processes for creating and verifying them to ensure interoperability between implementations.
 
 ---
 
 ## 2. Key Terminology
 
-* **Operation:** A single DID PLC operation, as exported from a PLC directory's `/export` endpoint. It is represented as a single JSON object.
+* **Operation:** A single DID PLC operation, as exported from a PLC directory's [`/export` endpoint](https://web.plc.directory/api/redoc#operation/Export). It is represented as a single JSON object.
 * **Bundle:** A single compressed file containing a fixed number of operations.
 * **Index:** A JSON file named `plc_bundles.json` that contains metadata for all available bundles in the repository. It is the entry point for discovering and verifying bundles.
-* **Content Hash:** The SHA-256 hash of the *uncompressed* JSONL content of a single bundle. This hash uniquely identifies the bundle's data.
+* **Content Hash:** The [SHA-256](https://en.wikipedia.org/wiki/SHA-2) hash of the *uncompressed* [JSONL](https://jsonlines.org/) content of a single bundle. This hash uniquely identifies the bundle's data.
 * **Chain Hash:** A cumulative SHA-256 hash that links a bundle to its predecessor, ensuring the integrity and order of the entire chain.
 * **Compressed Hash:** The SHA-256 hash of the *compressed* `.jsonl.zst` bundle file. This is used to verify file integrity during downloads.
 
@@ -30,7 +32,7 @@
 * **Naming Convention:** Bundles are named sequentially with six-digit zero-padding, following the format `%06d.jsonl.zst`.
 * *Examples:* `000001.jsonl.zst`, `000123.jsonl.zst`.
 * **Content:** Each bundle contains exactly **10,000** PLC operations.
-* **Compression:** The JSONL content is compressed using Zstandard (zstd).
+* **Compression:** The JSONL content is compressed using [Zstandard](https://facebook.github.io/zstd/) (zstd).
 
 ### 4.2. Serialization and Data Integrity
 
@@ -90,16 +92,16 @@
 
 ### 6.1. Collecting Operations
 
-1. **Mempool:** Operations are fetched from the PLC directory's `/export` endpoint and collected into a temporary staging area, or "mempool".
-2. **Chronological Validation:** The mempool must enforce that operations are added in chronological order, as described in Section 3.
+1. **Mempool:** Operations are fetched from the PLC directory's [`/export` endpoint](https://web.plc.directory/api/redoc#operation/Export) and collected into a temporary staging area, or "mempool".
+2. **Chronological Validation:** The mempool must enforce that operations are added in chronological order, as described in [Section 3](#3-operation-order-and-reproducibility).
 3. **Boundary Deduplication:** To prevent including the same operation in two adjacent bundles, the system must use a "boundary CID" mechanism. When creating bundle `N+1`, it must ignore any fetched operations whose `createdAt` timestamp and `CID` match those from the very end of bundle `N`.
 4. **Filling the Mempool:** The process continues fetching and deduplicating operations until at least 10,000 are collected in the mempool.
 
 ### 6.2. Creating a Bundle File
 
 1. **Take Operations:** Exactly 10,000 operations are taken from the front of the mempool.
-2. **Serialize:** These operations are serialized into a single block of newline-delimited JSON (JSONL), adhering to the integrity rules in Section 4.2.
-3. **Compress and Save:** The JSONL data is compressed using Zstandard and saved to a file with the appropriate sequential name (e.g., `000001.jsonl.zst`).
+2. **Serialize:** These operations are serialized into a single block of newline-delimited JSON ([JSONL](https://jsonlines.org/)), adhering to the integrity rules in [Section 4.2](#42-serialization-and-data-integrity).
+3. **Compress and Save:** The JSONL data is compressed using [Zstandard](https://facebook.github.io/zstd/) and saved to a file with the appropriate sequential name (e.g., `000001.jsonl.zst`).
 
 ### 6.3. Hash Calculation
 
@@ -123,7 +125,7 @@
 
 ### 6.4. Updating the Index
 
-1. A new `BundleMetadata` object is created for the new bundle, populated with all the information described in Section 5.2.
+1. A new `BundleMetadata` object is created for the new bundle, populated with all the information described in [Section 5.2](#52-bundlemetadata-object).
 2. This metadata object is appended to the `bundles` array in the main `Index` object.
 3. The `Index` object's top-level fields (`last_bundle`, `updated_at`, `total_size_bytes`) are updated to reflect the new state.
 4. The entire `Index` object is serialized to JSON and saved, atomically overwriting the existing `plc_bundles.json` file.
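As a rough illustration of the bundle-creation steps shown in Sections 6.2 and the hash terminology of Section 2, the sketch below takes exactly 10,000 operations, serializes them as newline-delimited JSON, computes the content hash over the uncompressed JSONL, compresses with zstd, writes the `%06d.jsonl.zst` file, and computes the compressed hash. It is a hypothetical sketch, not the reference implementation: `writeBundle` and its signature are invented for this example, the full serialization rules of Section 4.2 and the chain-hash calculation of Section 6.3 fall outside the excerpt above and are not attempted here, and the zstd encoder comes from the assumed `github.com/klauspost/compress/zstd` package.

```go
// Minimal sketch of the Section 6.2 bundle-creation steps; illustrative only.
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"log"
	"os"

	"github.com/klauspost/compress/zstd" // assumed zstd library
)

const bundleSize = 10000 // each bundle contains exactly 10,000 operations

// writeBundle serializes raw operation JSON objects into newline-delimited
// JSON, computes the content hash over the uncompressed JSONL, compresses it
// with zstd, writes the sequentially named %06d.jsonl.zst file, and computes
// the compressed hash. The chain hash (Section 6.3) is deliberately omitted.
func writeBundle(seq int, ops [][]byte) (contentHash, compressedHash string, err error) {
	if len(ops) != bundleSize {
		return "", "", fmt.Errorf("expected %d operations, got %d", bundleSize, len(ops))
	}

	// Serialize: one JSON object per line (full integrity rules are in Section 4.2).
	var jsonl bytes.Buffer
	for _, op := range ops {
		jsonl.Write(op)
		jsonl.WriteByte('\n')
	}

	// Content hash: SHA-256 of the uncompressed JSONL block.
	c := sha256.Sum256(jsonl.Bytes())
	contentHash = hex.EncodeToString(c[:])

	// Compress with Zstandard.
	enc, err := zstd.NewWriter(nil)
	if err != nil {
		return "", "", err
	}
	compressed := enc.EncodeAll(jsonl.Bytes(), nil)
	if err := enc.Close(); err != nil {
		return "", "", err
	}

	// Compressed hash: SHA-256 of the .jsonl.zst bytes.
	h := sha256.Sum256(compressed)
	compressedHash = hex.EncodeToString(h[:])

	// Six-digit zero-padded sequential name, e.g. 000001.jsonl.zst.
	name := fmt.Sprintf("%06d.jsonl.zst", seq)
	if err := os.WriteFile(name, compressed, 0o644); err != nil {
		return "", "", err
	}
	return contentHash, compressedHash, nil
}

func main() {
	// Illustrative usage with placeholder operations, not real PLC data.
	ops := make([][]byte, bundleSize)
	for i := range ops {
		ops[i] = []byte(fmt.Sprintf(`{"placeholder":%d}`, i))
	}
	contentHash, compressedHash, err := writeBundle(1, ops)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("content hash:   ", contentHash)
	fmt.Println("compressed hash:", compressedHash)
}
```

Both hashes would then go into the new `BundleMetadata` entry described in Section 6.4, alongside whatever chain-hash and size fields Section 5.2 requires.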