**Data Migration Spec**

Device Swap, Recovery & Key Migration

v0.2: Shamir Model Update + Mobile Cross-References

March 2026

Companion to Provisioning API Spec v0.2

**Changelog**

```
v0.2 Changes: Shamir Model Update + Mobile Cross-References

FIX  Shamir share model: Share 3 is user's choice (device-local or BIP-39)
NEW  Cross-references to mobile spec §7 for phone recovery
FIX  Milestone alignment with unified-milestone-map.md
```

**1. Overview**

This document specifies the data migration system for the desktop PDS application. It covers two primary scenarios: planned device swaps (user voluntarily moves to a new machine) and unplanned device loss (hardware failure, theft, or accidental damage). Both scenarios share core infrastructure but diverge in their recovery ceremony.

The migration system builds on three foundational components from the existing architecture: the Shamir secret sharing scheme for DID key protection, the relay layer's configurable caching behavior, and the Iroh peer-to-peer transport.

**1.1 Design Principles**

- **Zero key exposure:** DID signing keys never transit the network unencrypted, even during migration. All key material is wrapped before leaving the source device or reconstructed only on the destination device.

- **Sovereignty preserved:** The relay never holds sufficient key material to impersonate the user. Recovery always requires at least two of three Shamir shares, and the relay holds at most one.

- **Grandma-proof UX:** The planned swap happy path requires entering a 6-digit code. Unplanned recovery requires signing into iCloud on the new device (which most users have already done).

- **Tier-aware restoration:** Paid users get full repo mirrors for instant recovery. Free users reconstruct from the network, accepting possible blob loss.

**1.2 Migration Assets**

Each asset has a distinct risk profile and recovery strategy:

| Asset | Risk if Lost | Recovery Source | Notes |
|---|---|---|---|
| DID signing key | **Catastrophic** | Shamir reconstruction | Identity is permanently lost without 2-of-3 shares |
| ATProto repo (CAR) | High | Relay mirror or network crawl | Signed commit history; can be re-fetched if crawled |
| Blob store | Medium | Relay mirror or CDN | Images/media; may be lost if never crawled by AppView |
| App config | Low | Relay account metadata | Handle, relay endpoint, preferences; easily reconstructed |
| iCloud Keychain share | Low (redundant) | Apple iCloud sync | Auto-syncs to new device via Apple ID |

**2. Shamir Key Recovery Model**

The DID signing key is split into three shares using Shamir's Secret Sharing (2-of-3 threshold). Any two shares are sufficient to reconstruct the key; no single share reveals any information about the key.
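
The 2-of-3 split and reconstruction described above can be illustrated with a minimal sketch. It assumes a prime-field construction: the function names `split_2_of_3` and `reconstruct` and the prime 2^521 - 1 are illustrative choices, not part of this spec, and the real app may use a different field (many implementations work byte-wise over GF(2^8)) or an audited library.

```python
import secrets

# Assumption: a prime field comfortably larger than a 256-bit key.
# 2**521 - 1 is a Mersenne prime; the spec does not name a field.
PRIME = 2**521 - 1


def split_2_of_3(secret: int) -> list[tuple[int, int]]:
    """Split `secret` into three (x, y) shares; any two reconstruct it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in [0, PRIME)")
    # A threshold of 2 means a degree-1 polynomial f(x) = secret + a1 * x,
    # so exactly one random coefficient is needed.
    a1 = secrets.randbelow(PRIME)
    return [(x, (secret + a1 * x) % PRIME) for x in (1, 2, 3)]


def reconstruct(shares: list[tuple[int, int]]) -> int:
    """Recover f(0) by Lagrange interpolation over the prime field."""
    if len({x for x, _ in shares}) < 2:
        raise ValueError("need at least two distinct shares")
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total


# Stand-in for the DID signing key: a random 256-bit integer.
key = secrets.randbits(256)
shares = split_2_of_3(key)
assert reconstruct([shares[0], shares[2]]) == key
```

Any pair of shares recovers the key, while a single share on its own is a uniformly random field element that is consistent with every possible secret.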

**2.1 Share Distribution**

| Share | Holder | Storage | Recovery Access |
|---|---|---|---|
| Share 1 | iCloud Keychain | Keychain (E2E encrypted by Apple) | Available on any device signed into the same iCloud account |
| Share 2 | Relay service | Encrypted at rest, server-side HSM-wrapped | Released after account authentication (email + password) |
| Share 3 | User's choice (device-local OR BIP-39) | Secure Enclave / Keychain (device-local) or paper/USB export | Auto-available on configured device, or manual entry from backup |

**2.2 Share Holder Rationale**

The user's choice for share 3 balances convenience and resilience. For device-local storage, the user designates a second device (e.g., iPad) where share 3 is stored in the Secure Enclave/Keychain, making recovery seamless across their ecosystem if they lose their primary device. For BIP-39 backup, power users who want full air-gap sovereignty can export share 3 as a recovery phrase for paper or USB storage. The recovery ceremony code is identical in both cases; only the share retrieval step differs.

iCloud Keychain (share 1) was chosen as the default anchor for UX simplicity: most macOS users are already signed into iCloud, making unplanned recovery require zero additional user action beyond installing the app on a new device.

**2.3 Threat Model**

- **Relay compromise alone:** Attacker obtains share 2 only. Insufficient for key reconstruction.

- **iCloud compromise alone:** Attacker obtains share 1 only. Insufficient.

- **Relay + iCloud compromise:** Attacker can reconstruct key. Mitigation: relay share is HSM-wrapped and requires account auth; iCloud Keychain is E2E encrypted and requires Apple ID + device passcode. Combined compromise is a sophisticated, targeted attack.

- **Device theft (unlocked):** Attacker has share 3 if device-local, or nothing if BIP-39 is in a separate location. Mitigation: biometric/password gate on the app's key export flow.

**3. Planned Device Swap**

The happy path. The user's old machine is still accessible. This flow uses a direct Iroh peer connection for local transfer, with relay-mediated fallback for remote swaps.

**3.1 Flow**

1. **Initiate transfer.** User opens Settings → Transfer to New Device on the old machine. The app generates a one-time 6-digit transfer code and displays it on screen. Internally, the app bundles: full repo snapshot (CAR file export), blob archive, DID signing key (encrypted with the transfer code as symmetric key via AES-256-GCM), app config (handle, relay endpoint, preferences), and a manifest with checksums.

2. **Establish peer connection.** User installs the app on the new machine, selects "Transfer from Existing Device," and enters the 6-digit code. The app uses Iroh's peer discovery to find the old machine on the local network. If both machines are on the same LAN, the transfer is direct (no relay involvement). If remote, the transfer routes through the Iroh relay, encrypted end-to-end with the transfer code as the shared secret.

3. **Transfer and verify.** The bundle streams from old → new. The new machine verifies the manifest checksums, decrypts the DID key, and validates it can sign a test commit against the repo's Merkle root.

4. **Device lease handover.** The new machine calls POST /v1/devices/:id/lease to acquire the primary device lease from the relay. The old machine's lease is released. The relay begins routing traffic to the new device's Iroh node ID.

5. **Shamir share rotation.** The new machine generates a fresh Shamir split of the DID key and updates share 1 (iCloud Keychain), share 2 (relay via PUT /v1/keys/shares/:id), and share 3 (device-local Keychain or BIP-39 export). This ensures the old machine's local share is invalidated.

6. **Decommission old device.** The old machine's app detects the lease release and prompts: "Transfer complete. Wipe local data?" On confirmation, it securely erases the local repo, blobs, and key material.

**3.2 Transfer Code Security**

The 6-digit code provides approximately 20 bits of entropy, which is intentionally low for usability. Security relies on the transfer window being short-lived (default: 10 minutes), the Iroh connection requiring the code for handshake, and rate limiting on connection attempts (3 failures = code invalidated, regenerate required). For power users, a "Show full code" option reveals a 24-character alphanumeric code for higher entropy.

**4. Unplanned Device Loss**

The old machine is gone. Recovery depends entirely on the Shamir shares and the relay's cached data.

**4.1 Recovery Ceremony**

1. **Install and select recovery.** User installs the app on a new machine and selects "Recover Existing Identity."

2. **Authenticate with relay.** User signs in with their account credentials (email + password from initial provisioning). The relay verifies identity and releases Shamir share 2.

3. **Retrieve share 3.** If the user chose device-local storage: the app attempts to retrieve share 3 from the configured backup device (via iCloud Keychain sync or local network if available). If the configured device is inaccessible, the user can enter a BIP-39 backup phrase if one was exported during setup. If the user chose BIP-39 backup: the app prompts for manual entry of the recovery phrase.

4. **Reconstruct DID key.** Shares 2 + 3 are combined via Shamir reconstruction. The app verifies the key by checking its public component against the DID document retrieved from the PLC directory.

5. **Restore repo and blobs.** Restoration behavior depends on the user's tier (see section 4.2).

6. **Re-establish relay presence.** Register new device lease, publish DID rotation operation if key was rotated, and resume Iroh tunnel to relay.

7. **Rotate Shamir shares.** Same as planned swap step 5: generate fresh split, update all three share holders. This invalidates the lost device's share 3.

**4.2 Tier-Based Repo Restoration**

| Tier | Behavior | Tradeoffs |
|---|---|---|
| **Paid** | Relay holds a full repo mirror (synced continuously). On recovery, the relay streams the complete CAR file + blobs to the new device. Target: \< 5 min for a typical repo. | Near-instant, zero data loss. Relay storage cost scales with repo size. User pays for this via subscription. |
| **Free** | Relay holds only the recent activity buffer (configurable, default 7 days). On recovery, the app: (a) imports the buffer from the relay, (b) calls com.atproto.sync.getRepo against the AppView/BGS to fetch the historical repo, (c) attempts to recover blobs from known CDN endpoints. | Slower recovery (minutes to hours depending on repo size and network). Blobs that were never crawled by the AppView are permanently lost. Commit history intact if the BGS indexed it. |

**4.3 The Blob Loss Problem (Free Tier)**

On the free tier, blobs (images, media) that were uploaded but never crawled by the AppView or any relay are unrecoverable after device loss. This is an inherent tradeoff of not paying for relay-side mirroring.

Mitigations:

- **Proactive crawl requests:** After every blob upload, the app calls requestCrawl to the configured AppView, increasing the likelihood the blob is indexed before any loss event.

- **Blob inventory warnings:** The health monitor tracks which blobs have been confirmed crawled vs. uncrawled. Settings → Data Health shows a "X blobs not yet backed up by the network" count, with an upgrade prompt.

- **Grace period on tier downgrade:** If a paid user downgrades to free, the relay retains the full mirror for 30 days before pruning to buffer-only.

**4.4 Phone Recovery**

Phone-to-phone recovery uses the same Shamir infrastructure as desktop recovery. The mobile architecture spec (§7) details the iOS-specific flow:

1. New phone signs into iCloud → Share 1 is available
2. User authenticates with relay → Share 2 is available
3. Relay reconstructs rotation key from 2 shares
4. Relay re-generates signing key, updates DID document
5. New phone stores new rotation key in Secure Enclave

The key difference from desktop recovery: in phone recovery, the relay already holds the repo (it's the PDS in mobile-only mode), so there's no repo transfer step. Recovery is purely a key reconstruction + DID update operation.

See: mobile-architecture-spec-v1.3 §7.2 for the complete flow.

**5. API Surface**

These endpoints extend the existing provisioning API. All endpoints require bearer token authentication.

**5.1 Transfer Endpoints**

| Endpoint | Method | Purpose |
|---|---|---|
| /v1/transfer/initiate | POST | Generate transfer session + code. Returns session ID and encrypted bundle metadata. |
| /v1/transfer/accept | POST | New device submits transfer code. Relay brokers Iroh peer introduction if direct connection fails. |
| /v1/transfer/complete | POST | Finalize transfer. Triggers lease handover and old device notification. |

**5.2 Recovery Endpoints**

| Endpoint | Method | Purpose |
|---|---|---|
| /v1/recovery/initiate | POST | Begin recovery ceremony. Requires account credentials. Returns share 2 (encrypted to session key). |
| /v1/recovery/verify-key | POST | Client proves it reconstructed the correct DID key by signing a challenge. Unlocks repo restoration. |
| /v1/recovery/restore | GET (streaming) | Stream repo + blobs from relay cache. Paid tier: full mirror. Free tier: buffer only. |

**5.3 Key Management Endpoints**

| Endpoint | Method | Purpose |
|---|---|---|
| /v1/keys/shares/:id | PUT | Update the relay-held Shamir share after rotation. Requires proof of current key ownership. |
| /v1/keys/shares/:id | DELETE | Permanently delete relay-held share (account deletion flow). |
| /v1/keys/rotation-log | GET | Audit log of all Shamir rotations and share updates for the account. |

**6. Sequence Summaries**

**6.1 Planned Swap Sequence**

Old Device → generates transfer code → bundles repo + encrypted key

New Device → enters code → discovers old device via Iroh LAN / relay fallback

Old Device → streams bundle → New Device

New Device → verifies checksums + decrypts key → POST /v1/devices/:id/lease

New Device → generates fresh Shamir split → updates all 3 share holders

Old Device → detects lease release → prompts wipe → securely erases

**6.2 Unplanned Loss Sequence**

New Device → POST /v1/recovery/initiate (credentials) → receives share 2

New Device → retrieves share 3 from device-local storage (automatic) or paper (manual)

New Device → Shamir reconstruct → POST /v1/recovery/verify-key (signed challenge)

New Device → GET /v1/recovery/restore (streaming) → imports repo + blobs

New Device → POST /v1/devices/:id/lease → DID rotation → Iroh tunnel up

New Device → fresh Shamir split → updates all 3 share holders

**7. Edge Cases and Risks**

**7.1 Lost Share 3 (Paper or Device)**

If the user chose BIP-39 backup and loses the paper, only shares 1 + 2 (iCloud + relay) remain. If the user chose device-local and loses the backup device, share 3 is inaccessible. Mitigation: during onboarding, the app clearly explains both options and recommends exporting a BIP-39 backup even for device-local mode. The app also offers periodic "recovery key health check" reminders.

**7.2 Relay Downtime During Recovery**

If the relay is unreachable when the user attempts recovery, share 2 is temporarily inaccessible. Mitigation: the relay is the only infrastructure component the user depends on, and its SLA is part of the service tier. Multi-region relay failover (designed in the provisioning spec) covers this. The recovery ceremony gracefully retries with exponential backoff.

**7.3 Stale Relay Mirror**

For paid users, the relay mirror may lag behind the device's latest commits if the device was actively posting when it was lost. The relay sync interval determines the maximum data loss window. Default: 5-minute sync interval, meaning up to 5 minutes of commits could be lost. Configurable per account.

**7.4 Concurrent Recovery Attempts**

If an attacker attempts recovery while the legitimate user is also recovering, the relay's recovery/initiate endpoint enforces a single active session per account. Second attempts return 409 Conflict with a "recovery already in progress" message. Sessions expire after 30 minutes.

**7.5 Transfer Interrupted Mid-Stream**

If the Iroh connection drops during a planned swap, the transfer session remains valid for 10 minutes. The new device can reconnect and resume from the last acknowledged chunk (the bundle is transferred in content-addressed blocks). After timeout, a new transfer code must be generated.

**8. Implementation Milestones**

**v0.1: Basic Migration + Shamir Generation**

- Planned device swap (LAN transfer via Iroh, 6-digit code)
- Shamir share generation during account creation
- Share 1 → iCloud Keychain storage
- Share 2 → relay escrow
- Share 3 → user's choice (device-local or BIP-39)
- Note: Share GENERATION is v0.1. Share RECOVERY is v1.0.

**v1.0: Full Recovery**

- Unplanned device loss recovery ceremony
- Shamir reconstruction (2-of-3)
- DID key rotation after recovery
- Recovery UI in mobile app
- Relay-side recovery session management

**Later**

- Multi-device sync (share key across devices without migration)

See unified-milestone-map.md for how these milestones align with the architecture, provisioning API, and mobile spec phases.

**9. Design Decisions Log**

| Decision | Rationale | Alternatives Considered |
|---|---|---|
| iCloud Keychain as share 1 | Best UX for target audience (non-technical macOS users). Zero user action on recovery. E2E encrypted by Apple. | Paper-only (too fragile for grandma), relay holds 2 shares (breaks sovereignty model). |
| Share 3 as user's choice | Balances convenience (device-local for multi-device users) with resilience (BIP-39 for air-gap backup). | Single-option model (less flexible for different user preferences). |
| 6-digit transfer code for planned swap | Balances usability (easy to read aloud) with security (short-lived session, rate-limited attempts). | QR code (requires camera), Bluetooth pairing (unreliable), pre-shared key (complex UX). |
| Shamir rotation on every migration | Ensures the old device's local share cannot be used even if physically recovered by an attacker after the swap. | Reuse shares (simpler but leaves old share 1 valid indefinitely). |
| Configurable relay caching per tier | Aligns cost with value. Full mirror is expensive; free users accept the tradeoff. Upgrade path is clear. | Full mirror for all (unsustainable at scale), no caching (too risky). |
| Proactive requestCrawl after blob upload | Reduces blob loss risk on free tier without requiring relay storage. Leverages existing ATProto infrastructure. | Accept blob loss (bad UX), require paid tier for blob uploads (too restrictive). |
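
The transfer-code decision logged above, together with the figures in section 3.2, can be sanity-checked with a short calculation. This is a rough sketch only: the function names are hypothetical, and the 36-symbol alphabet for the 24-character code is an assumption, since the spec says only "alphanumeric".

```python
import math


def code_entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy in bits of a code drawn uniformly at random."""
    return length * math.log2(alphabet_size)


def guess_probability(length: int, alphabet_size: int, attempts: int) -> float:
    """Chance that `attempts` distinct guesses include the correct code."""
    return min(1.0, attempts / alphabet_size ** length)


# 6-digit code from section 3.2: roughly 20 bits.
print(f"6-digit code: {code_entropy_bits(6, 10):.1f} bits")    # 19.9 bits
# Assumption: 24 characters from a 36-symbol alphabet (a-z, 0-9).
print(f"24-char code: {code_entropy_bits(24, 36):.1f} bits")   # 124.1 bits
# The 3-failure limit caps an online attacker at 3 guesses per code.
print(f"3-guess success: {guess_probability(6, 10, 3):.6f}")   # 0.000003
```

Three guesses against a million possible codes give about a 3-in-a-million chance per issued code, which is why the design leans on the 10-minute window and rate limiting rather than on code length.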