AT Protocol Research Notes#

Reference material for AT Protocol integration. For implementation details, see todo.md.

OAuth 2.1 Specification#

AT Protocol uses a specific profile of OAuth 2.1 for client↔PDS authorization.

Required Components#

  • Client Metadata Endpoint: Serve client-metadata.json at a public HTTPS URL (this URL becomes the client_id)

    {
      "client_id": "https://your-app.com/oauth/client-metadata.json",
      "application_type": "web",
      "grant_types": ["authorization_code", "refresh_token"],
      "scope": "atproto transition:generic",
      "response_types": ["code"],
      "redirect_uris": ["https://your-app.com/oauth/callback"],
      "dpop_bound_access_tokens": true,
      "client_name": "Malfestio",
      "client_uri": "https://your-app.com"
    }
    
  • PKCE (Mandatory): Generate code_verifier and code_challenge (S256 only)

  • DPoP (Mandatory): Bind tokens to client instances with proof-of-possession JWTs

  • Handle/DID Resolution: Resolve user identity to discover their PDS

  • Token Exchange: Authorization code flow with token refresh
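The PKCE step above can be sketched with the standard library alone. This is a minimal illustration, not the project's implementation; the helper names `b64url` and `make_pkce_pair` are made up for this example.

```python
import base64
import hashlib
import secrets


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as PKCE and JWTs expect."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) for the S256 method."""
    # 32 random bytes encode to 43 characters, the minimum verifier length.
    verifier = b64url(secrets.token_bytes(32))
    # S256: the challenge is the base64url SHA-256 digest of the verifier.
    challenge = b64url(hashlib.sha256(verifier.encode("ascii")).digest())
    return verifier, challenge


verifier, challenge = make_pkce_pair()
```

The client keeps `verifier` private until the token exchange and sends only `challenge` (with `code_challenge_method=S256`) in the authorization request.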

DPoP (Demonstrating Proof-of-Possession)#

DPoP (RFC 9449) binds access tokens to specific client instances, preventing token theft/replay.

Proof JWT Structure:

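  • Header: typ: dpop+jwt, alg: ES256 (the algorithm every atproto client and authorization server must support; use others such as EdDSA only if the server advertises them), jwk: <public key>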
  • Payload Claims:
    • jti — Unique random identifier per proof, used for replay detection (distinct from the nonce claim below)
    • htm — HTTP method (e.g., "POST", "GET")
    • htu — HTTP target URI (without query/fragment)
    • iat — Issued-at timestamp
    • ath — Base64url-encoded SHA-256 hash of the access token (for resource requests)
    • nonce — Server-provided nonce (if required)
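As a rough sketch, the payload claims listed above could be assembled like this. Signing is deliberately left out, because it needs a JOSE/crypto library that these notes do not name; `dpop_claims` and `b64url` are hypothetical helper names.

```python
import base64
import hashlib
import secrets
import time
from urllib.parse import urlsplit, urlunsplit


def b64url(data: bytes) -> str:
    """Base64url-encode without padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def dpop_claims(method, url, access_token=None, nonce=None):
    """Build the (unsigned) payload of a DPoP proof for one HTTP request."""
    parts = urlsplit(url)
    claims = {
        "jti": b64url(secrets.token_bytes(16)),  # fresh identifier per proof
        "htm": method.upper(),
        # htu excludes the query string and fragment
        "htu": urlunsplit((parts.scheme, parts.netloc, parts.path, "", "")),
        "iat": int(time.time()),
    }
    if access_token is not None:
        # ath binds the proof to the access token on resource requests
        claims["ath"] = b64url(hashlib.sha256(access_token.encode("ascii")).digest())
    if nonce is not None:
        claims["nonce"] = nonce  # value from the server's DPoP-Nonce header
    return claims
```

A real proof wraps these claims in a JWT whose header carries the public JWK and is signed with the matching private key.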

Usage:

  1. Client generates DPoP keypair per session (not reused across devices/users)
  2. Each request includes Authorization: DPoP <token> and DPoP: <proof JWT>
  3. Server validates signature, checks claims match request, verifies token binding

Server Behavior:

  • May return a DPoP-Nonce header; the client must include that value as the nonce claim in subsequent proofs
  • Validates jti uniqueness to prevent replay attacks
  • Checks ath matches provided access token
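The server-side checks above might look roughly like the following, assuming the proof's signature has already been verified elsewhere. `ReplayCache`, `check_dpop_claims`, and the 300-second window are illustrative choices, not values taken from a spec.

```python
import time


class ReplayCache:
    """In-memory record of recently seen jti values (single process only)."""

    def __init__(self, window_seconds=300):
        self.window = window_seconds
        self._seen = {}  # jti -> time first seen

    def check_and_store(self, jti, now):
        """Return True if jti is new inside the window, False if replayed."""
        # Forget entries older than the window before checking.
        self._seen = {j: t for j, t in self._seen.items() if now - t <= self.window}
        if jti in self._seen:
            return False
        self._seen[jti] = now
        return True


def check_dpop_claims(claims, method, url, cache, expected_ath=None,
                      expected_nonce=None, now=None):
    """Return a list of problems; an empty list means these checks passed."""
    now = time.time() if now is None else now
    problems = []
    if claims.get("htm") != method:
        problems.append("htm mismatch")
    if claims.get("htu") != url:
        problems.append("htu mismatch")
    if abs(now - claims.get("iat", 0)) > cache.window:
        problems.append("iat outside window")
    if expected_nonce is not None and claims.get("nonce") != expected_nonce:
        problems.append("nonce mismatch")
    if expected_ath is not None and claims.get("ath") != expected_ath:
        problems.append("ath mismatch")
    if not cache.check_and_store(claims.get("jti", ""), now):
        problems.append("jti replayed")
    return problems
```

A multi-process deployment would need a shared store for the jti cache instead of a per-process dictionary.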

Record Publishing#

XRPC Endpoints#

  • com.atproto.repo.putRecord — Create or update records
  • com.atproto.repo.deleteRecord — Remove records
  • com.atproto.repo.uploadBlob — Upload media attachments
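To make the endpoint list concrete, here is a sketch that describes a putRecord call as plain data without sending it. The function name `build_put_record` and the example record fields are hypothetical; the body keys (`repo`, `collection`, `rkey`, `record`) follow the com.atproto.repo.putRecord input as I understand it.

```python
import json


def build_put_record(pds_url, access_token, dpop_proof, repo, collection, rkey, record):
    """Describe the HTTP request for com.atproto.repo.putRecord (not sent here)."""
    body = {
        "repo": repo,              # DID (or handle) of the repository owner
        "collection": collection,  # NSID of the record type
        "rkey": rkey,
        "record": {"$type": collection, **record},
    }
    return {
        "method": "POST",
        "url": pds_url.rstrip("/") + "/xrpc/com.atproto.repo.putRecord",
        "headers": {
            "Content-Type": "application/json",
            "Authorization": f"DPoP {access_token}",
            "DPoP": dpop_proof,
        },
        "body": json.dumps(body),
    }


request = build_put_record(
    "https://pds.example", "ACCESS_TOKEN", "PROOF_JWT",
    repo="did:plc:abc123",
    collection="org.stormlightlabs.malfestio.deck",
    rkey="3k5abc123",
    record={"title": "Example deck"},  # hypothetical field
)
```

The returned dictionary can be handed to whichever HTTP client the application already uses.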

Record Keys#

Use TIDs (timestamp-based identifiers) as record keys, per the Lexicon spec.
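For illustration, a TID can be generated as sketched below. This assumes the layout I understand from the atproto record-key documentation: a 64-bit value with a zero top bit, 53 bits of microseconds since the Unix epoch, and a 10-bit clock identifier, encoded as 13 characters of sortable base32. The names `encode_tid` and `new_tid` are made up here.

```python
import secrets
import time

# "base32-sortable" alphabet: character order matches numeric order
B32_SORTABLE = "234567abcdefghijklmnopqrstuvwxyz"


def encode_tid(timestamp_us: int, clock_id: int) -> str:
    """Encode a microsecond timestamp and a 10-bit clock id as a 13-char TID."""
    value = ((timestamp_us & ((1 << 53) - 1)) << 10) | (clock_id & 0x3FF)
    chars = []
    for _ in range(13):
        chars.append(B32_SORTABLE[value & 0x1F])  # take the lowest 5 bits
        value >>= 5
    return "".join(reversed(chars))  # most significant character first


def new_tid() -> str:
    """TID for the current time with a random clock id."""
    return encode_tid(time.time_ns() // 1000, secrets.randbelow(1024))
```

Because the encoding preserves ordering, comparing two TIDs as strings compares their timestamps. A production generator would also guarantee that successive TIDs from one process never go backwards.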

AT-URIs#

Format: at://<did>/<collection>/<rkey>

Example: at://did:plc:abc123/org.stormlightlabs.malfestio.deck/3k5abc123
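A small parser for the record form of AT-URI shown above might look like this. It handles only the three-segment `at://<did>/<collection>/<rkey>` shape from these notes; `AtUri` and `parse_at_uri` are hypothetical names.

```python
from typing import NamedTuple


class AtUri(NamedTuple):
    authority: str   # the DID (or handle) that owns the repo
    collection: str  # NSID of the record type
    rkey: str        # record key


def parse_at_uri(uri: str) -> AtUri:
    """Split at://<did>/<collection>/<rkey> into its three parts."""
    prefix = "at://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an AT-URI: {uri!r}")
    parts = uri[len(prefix):].split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected authority/collection/rkey: {uri!r}")
    return AtUri(*parts)


deck = parse_at_uri("at://did:plc:abc123/org.stormlightlabs.malfestio.deck/3k5abc123")
```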

Firehose / Jetstream#

Overview#

The AT Protocol provides two main options for consuming real-time repository events:

  1. Raw Firehose (com.atproto.sync.subscribeRepos) - Full-fidelity, CBOR-encoded, cryptographically signed
  2. Jetstream - Simplified JSON format, lower bandwidth, easier to consume

Raw Firehose#

  • WebSocket: Subscribe to com.atproto.sync.subscribeRepos from a Relay
  • CBOR Decoding: Decode DAG-CBOR frames; commit events embed a CAR slice containing the changed MST and record blocks
  • Cryptographic Verification: Validate commit signatures against DID signing keys
  • Cursor Management: Track seq position for reliable reconnection

Event Types:

  • #commit - Repository changes (record create/update/delete)
  • #identity - DID/handle updates
  • #account - Account status changes (active, deactivated, etc.)

Jetstream (Simplified)#

Bluesky's simplified JSON firehose - ideal for indexing and discovery:

  • JSON format: No CBOR decoding required
  • zstd compression: Reduced bandwidth (enable with compress=true)
  • Collection filtering: Subscribe to specific NSIDs
  • DID filtering: Watch specific accounts
  • Cursor-based reconnection: Microsecond timestamps

Public Endpoints:

  • wss://jetstream1.us-east.bsky.network/subscribe
  • wss://jetstream2.us-west.bsky.network/subscribe
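The filtering and compression options above translate into query parameters on the subscribe URL. The sketch below assumes the parameter names `wantedCollections`, `wantedDids`, and `cursor` from my reading of the Jetstream README (only `compress=true` appears in these notes), so verify them before relying on this. `jetstream_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def jetstream_url(base, collections=(), dids=(), cursor=None, compress=False):
    """Build a Jetstream subscribe URL with optional filters."""
    # Repeated keys express multiple collections or DIDs.
    params = [("wantedCollections", c) for c in collections]
    params += [("wantedDids", d) for d in dids]
    if cursor is not None:
        params.append(("cursor", str(cursor)))  # microsecond timestamp
    if compress:
        params.append(("compress", "true"))     # zstd-compressed frames
    return f"{base}?{urlencode(params)}" if params else base


url = jetstream_url(
    "wss://jetstream1.us-east.bsky.network/subscribe",
    collections=["org.stormlightlabs.malfestio.deck"],
    cursor=1725911162329308,
    compress=True,
)
```

Note that decoding compressed frames requires a zstd library, which is outside the standard library.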

Tradeoffs:

  • ⚠️ Events are NOT cryptographically signed (trust the Jetstream operator)
  • ⚠️ Not self-authenticating data
  • ✅ Much simpler to implement
  • ✅ Lower bandwidth and compute requirements

Reliable Synchronization#

Cursor Tracking:

  • Store cursor position (microsecond timestamp) per endpoint
  • Resume from last processed cursor on reconnect
  • Handle gaps by fetching missing commits via com.atproto.sync.getRepo if needed
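A minimal in-memory sketch of per-endpoint cursor tracking follows. The class name `CursorStore` and the five-second rewind on resume are assumptions made for illustration: rewinding trades a few duplicate events for a lower chance of missing one, which is safe only if event handling is idempotent.

```python
class CursorStore:
    """Track the latest processed cursor (microsecond timestamp) per endpoint."""

    def __init__(self, rewind_us=5_000_000):
        self.rewind_us = rewind_us
        self._cursors = {}  # endpoint URL -> last processed time_us

    def record(self, endpoint, time_us):
        """Remember the newest cursor seen for this endpoint."""
        self._cursors[endpoint] = max(time_us, self._cursors.get(endpoint, 0))

    def resume_from(self, endpoint):
        """Cursor to request on reconnect, or None to start from live."""
        last = self._cursors.get(endpoint)
        return None if last is None else max(0, last - self.rewind_us)
```

A real deployment would persist these values (for example in the local database) so they survive restarts.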

Per-Repo Revision Tracking:

  • Track latest rev (TID) for each DID
  • Compare incoming rev against stored value to detect gaps
  • Use since field to detect out-of-order events

Deletion Handling:

  • Handle operation: "delete" in commit events
  • Mark records as deleted (soft or hard delete)
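Deletion handling can be sketched against a Jetstream-style commit event. The event shape used here (`kind`, `did`, and a `commit` object with `operation`, `collection`, `rkey`, `record`) reflects my understanding of Jetstream's JSON output; the dictionary-backed `index` and the function `apply_commit` are stand-ins for a real database layer.

```python
def apply_commit(index, event, soft_delete=True):
    """Apply one Jetstream commit event to a dict keyed by AT-URI."""
    if event.get("kind") != "commit":
        return  # identity and account events are handled elsewhere
    commit = event["commit"]
    uri = f"at://{event['did']}/{commit['collection']}/{commit['rkey']}"
    operation = commit["operation"]
    if operation in ("create", "update"):
        index[uri] = {"record": commit["record"], "deleted": False}
    elif operation == "delete":
        if soft_delete:
            if uri in index:
                index[uri]["deleted"] = True  # keep the row, hide it from queries
        else:
            index.pop(uri, None)              # hard delete


index = {}
base = {"did": "did:plc:abc123", "kind": "commit"}
apply_commit(index, {**base, "commit": {
    "operation": "create", "collection": "org.stormlightlabs.malfestio.deck",
    "rkey": "3k5abc123", "record": {"title": "Example deck"}}})
apply_commit(index, {**base, "commit": {
    "operation": "delete", "collection": "org.stormlightlabs.malfestio.deck",
    "rkey": "3k5abc123"}})
```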

Best Practices:

  • Process events sequentially per-DID (partition by DID)
  • Ignore events with rev ≤ stored latest rev
  • Validate records against Lexicon schema before indexing
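The revision rules above can be sketched as a small tracker. It relies on TIDs sorting lexicographically, so plain string comparison orders revisions, and it treats `since` as the rev of the commit that preceded the incoming one. `RevTracker` and its return values are illustrative names.

```python
class RevTracker:
    """Track the latest processed rev per DID and classify incoming events."""

    def __init__(self):
        self.latest = {}  # did -> latest rev (TID string)

    def observe(self, did, rev, since=None):
        """Return "stale", "gap", or "ok" for an incoming commit."""
        stored = self.latest.get(did)
        if stored is not None and rev <= stored:
            return "stale"  # already processed or out of order: ignore
        self.latest[did] = rev
        if stored is not None and since is not None and since != stored:
            return "gap"    # at least one commit in between was missed
        return "ok"
```

On a "gap" result, an indexer would typically re-sync that one repository rather than the whole stream.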

Well-Known Endpoints#

  • /.well-known/atproto-did — Domain verification for handle claims
  • /.well-known/oauth-protected-resource — PDS OAuth metadata
  • /.well-known/oauth-authorization-server — Auth server metadata
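As a sketch, the three paths above map onto URLs as follows. The handle check is served from the handle's own domain and returns the DID as plain text, while the two OAuth documents live on the PDS and the authorization server respectively. The function names and example hosts are hypothetical.

```python
def well_known_urls(handle, pds_origin, auth_server_origin):
    """Build the well-known URLs for a handle, its PDS, and its auth server."""
    return {
        "handle": f"https://{handle}/.well-known/atproto-did",
        "resource": f"{pds_origin.rstrip('/')}/.well-known/oauth-protected-resource",
        "auth_server": f"{auth_server_origin.rstrip('/')}/.well-known/oauth-authorization-server",
    }


def parse_atproto_did(body: str) -> str:
    """Validate the plain-text body returned by /.well-known/atproto-did."""
    did = body.strip()
    if not did.startswith("did:"):
        raise ValueError(f"not a DID: {did!r}")
    return did


urls = well_known_urls("alice.example.com", "https://pds.example/", "https://auth.example")
```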

Labelers#

Architecture:

  1. Labels = metadata (source DID + subject AT-URI + value string)
  2. User Subscription = users subscribe to labelers; clients list the subscribed labelers in API requests (atproto-accept-labelers header)
  3. Label Interpretation = per-user config to hide, warn, or ignore content

Structure:

{
  "src": "did:plc:labeler",
  "uri": "at://did:user/app.bsky.feed.post/123",
  "val": "spam",
  "cts": "2026-01-01T00:00:00Z"
}
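Label interpretation can be sketched as a lookup from label value to a per-user action, counting only labels from labelers the user subscribes to. The preference format and the function `moderation_action` are assumptions for this example; real clients carry richer settings.

```python
# Strongest action wins when several labels apply.
RANK = {"ignore": 0, "warn": 1, "hide": 2}


def moderation_action(labels, subscribed, prefs, default="ignore"):
    """Return "hide", "warn", or "ignore" for content carrying these labels."""
    action = "ignore"
    for label in labels:
        if label["src"] not in subscribed:
            continue  # labels from unsubscribed labelers have no effect
        choice = prefs.get(label["val"], default)
        if RANK[choice] > RANK[action]:
            action = choice
    return action


labels = [{
    "src": "did:plc:labeler",
    "uri": "at://did:user/app.bsky.feed.post/123",
    "val": "spam",
    "cts": "2026-01-01T00:00:00Z",
}]
action = moderation_action(labels, {"did:plc:labeler"}, {"spam": "hide"})
```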

Feeds#

Core Flow:

  1. User requests a feed by the AT-URI of its declaration record
  2. PDS resolves the AT-URI to the Feed Generator's DID document
  3. PDS sends getFeedSkeleton to the service endpoint declared there (authenticated by a JWT signed with the user's repo signing key)
  4. Feed Generator returns a skeleton (list of post URIs + cursor)
  5. PDS hydrates the skeleton with full content (via AppView)
  6. Hydrated feed is returned to the user
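Steps 4 and 5 can be illustrated with plain data. The skeleton shape (`feed` as a list of `{"post": <at-uri>}` items plus a `cursor`) follows the getFeedSkeleton output as I understand it; the `posts_by_uri` dictionary stands in for the AppView's index, and `hydrate` is a hypothetical name.

```python
def hydrate(skeleton, posts_by_uri):
    """Replace skeleton entries with full post views, preserving feed order."""
    items = []
    for entry in skeleton["feed"]:
        post = posts_by_uri.get(entry["post"])
        if post is not None:  # skip posts that were deleted or never indexed
            items.append(post)
    return {"feed": items, "cursor": skeleton.get("cursor")}


skeleton = {
    "feed": [
        {"post": "at://did:plc:abc123/app.bsky.feed.post/1"},
        {"post": "at://did:plc:abc123/app.bsky.feed.post/2"},
    ],
    "cursor": "page-2",
}
posts_by_uri = {
    "at://did:plc:abc123/app.bsky.feed.post/2": {"text": "second post"},
}
page = hydrate(skeleton, posts_by_uri)
```

The Feed Generator only decides which posts appear and in what order; everything a client renders comes from the hydration step.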

AppView#

Responsibilities:

  1. Record Processing & Indexing - consume firehose, build indices for likes, threads, follows
  2. Moderation Enforcement - apply labels from subscribed labelers
  3. Query Interface - expose XRPC API (proxied through PDS)
  4. Media CDN - fetch/cache blobs from upstream PDSes, generate thumbnails
  5. Search & Discovery - full-text search, type-ahead, content ranking

Patterns from Real AT Protocol Apps#

plyr.fm (Music)#

  • OAuth 2.1 via @atproto/oauth-client library
  • Records synced to PDS: tracks, likes, playlists
  • Separate moderation service (Rust labeler)

leaflet.pub (Writing)#

  • React/Next.js frontend with Supabase + Replicache for sync
  • Bluesky integration via dedicated lexicons/ and appview/ directories

wisp.place (Static Sites)#

  • Stores site files as place.wisp.fs records in user's PDS
  • Firehose consumer to index and serve sites
  • CDN layer caches content from PDS

Common Patterns#

  1. Local database for fast queries + PDS for portable, signed records
  2. Firehose consumption for discovery/aggregation
  3. OAuth 2.1 for production auth (app passwords only for development)
  4. Lexicons define the public contract; internal state stays private

References#