
QuickDID - Development Guide for Claude

Overview

QuickDID is a high-performance AT Protocol identity resolution service written in Rust. It provides bidirectional resolution (handle-to-DID and DID-to-handle) with multi-layer caching (Redis, SQLite, in-memory), queue processing, metrics support, proactive cache refreshing, and real-time cache updates via a Jetstream consumer.

Configuration

QuickDID follows the 12-factor app methodology and uses environment variables exclusively for configuration. There are no command-line arguments except for --version and --help.

Configuration is validated at startup. If validation fails, the service exits and reports one of the following error identifiers:

  • error-quickdid-config-1: Missing required environment variable
  • error-quickdid-config-2: Invalid configuration value
  • error-quickdid-config-3: Invalid TTL value (must be positive)
  • error-quickdid-config-4: Invalid timeout value (must be positive)
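
As a rough illustration of how these identifiers might map onto typed errors, the sketch below validates a required variable and a TTL. It is not the project's code: the `ConfigError` variant names and the `require` and `parse_ttl` helpers are hypothetical, and `Display` is written by hand so the example builds with the standard library alone (in the codebase, `thiserror` would derive it).

```rust
use std::fmt;

/// Hypothetical error type mirroring the four identifiers listed above.
#[derive(Debug, PartialEq)]
pub enum ConfigError {
    MissingVar(String),
    InvalidValue(String),
    InvalidTtl(String),
    InvalidTimeout(String),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::MissingVar(d) => {
                write!(f, "error-quickdid-config-1 Missing required environment variable: {}", d)
            }
            ConfigError::InvalidValue(d) => {
                write!(f, "error-quickdid-config-2 Invalid configuration value: {}", d)
            }
            ConfigError::InvalidTtl(d) => {
                write!(f, "error-quickdid-config-3 Invalid TTL value (must be positive): {}", d)
            }
            ConfigError::InvalidTimeout(d) => {
                write!(f, "error-quickdid-config-4 Invalid timeout value (must be positive): {}", d)
            }
        }
    }
}

impl std::error::Error for ConfigError {}

/// Hypothetical helper: a required variable must be set and non-empty.
pub fn require(name: &str, raw: Option<&str>) -> Result<String, ConfigError> {
    match raw {
        Some(value) if !value.trim().is_empty() => Ok(value.to_string()),
        _ => Err(ConfigError::MissingVar(name.to_string())),
    }
}

/// Hypothetical helper: parse a TTL in seconds, falling back to `default`
/// when the variable is unset and rejecting zero or negative values.
pub fn parse_ttl(name: &str, raw: Option<&str>, default: u64) -> Result<u64, ConfigError> {
    let raw = match raw {
        Some(raw) => raw,
        None => return Ok(default),
    };
    let value: i64 = raw
        .trim()
        .parse()
        .map_err(|_| ConfigError::InvalidValue(format!("{}={}", name, raw)))?;
    if value <= 0 {
        return Err(ConfigError::InvalidTtl(format!("{}={}", name, raw)));
    }
    Ok(value as u64)
}
```

A caller would typically pass `std::env::var(name).ok().as_deref()` as `raw`, which keeps the helpers easy to test without touching the process environment.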

Common Commands

Building and Running

# Build the project
cargo build

# Run in debug mode (requires environment variables)
HTTP_EXTERNAL=localhost:3007 cargo run

# Run tests
cargo test

# Type checking
cargo check

# Linting
cargo clippy

# Show version
cargo run -- --version

# Show help
cargo run -- --help

Development with VS Code

The project includes a .vscode/launch.json configuration for debugging with Redis integration. Use the "Debug executable 'quickdid'" launch configuration.

Architecture

Core Components

  1. Handle Resolution (src/handle_resolver/)

    • BaseHandleResolver: Core resolution using DNS and HTTP
    • RateLimitedHandleResolver: Semaphore-based rate limiting with optional timeout
    • CachingHandleResolver: In-memory caching layer with bidirectional support
    • RedisHandleResolver: Redis-backed persistent caching with bidirectional lookups
    • SqliteHandleResolver: SQLite-backed persistent caching with bidirectional support
    • ProactiveRefreshResolver: Automatically refreshes cache entries before expiration
    • All resolvers implement the HandleResolver trait, which provides:
      • resolve: Handle-to-DID resolution
      • purge: Remove entries by handle or DID
      • set: Manually update handle-to-DID mappings
    • Uses binary serialization via HandleResolutionResult for space efficiency
    • Resolution stack: Cache → ProactiveRefresh (optional) → RateLimited (optional) → Base → DNS/HTTP
    • Includes resolution timing measurements for metrics
  2. Binary Serialization (src/handle_resolution_result.rs)

    • Compact storage format using bincode
    • Strips DID prefixes for did:web and did:plc methods
    • Stores: timestamp (u64), method type (i16), payload (String)
  3. Queue System (src/queue/)

    • Supports MPSC (in-process), Redis, SQLite, and no-op adapters
    • HandleResolutionWork items processed asynchronously
    • Redis uses the reliable queue pattern (LPUSH/RPOPLPUSH/LREM)
    • SQLite provides persistent queue with work shedding capabilities
  4. HTTP Server (src/http/)

    • XRPC endpoints for AT Protocol compatibility
    • Health check endpoint
    • Static file serving from configurable directory (default: www)
    • Serves .well-known files as static content
    • CORS headers support for cross-origin requests
    • Cache-Control headers with configurable max-age and stale directives
    • ETag support with configurable seed for cache invalidation
  5. Metrics System (src/metrics.rs)

    • Pluggable metrics publishing with StatsD support
    • Tracks counters, gauges, and timings
    • Configurable tags for environment/service identification
    • No-op adapter for development environments
    • Metrics for Jetstream event processing
  6. Jetstream Consumer (src/jetstream_handler.rs)

    • Consumes AT Protocol firehose events via WebSocket
    • Processes Account events (purges deleted/deactivated accounts)
    • Processes Identity events (updates handle-to-DID mappings)
    • Automatic reconnection with exponential backoff
    • Comprehensive metrics for event processing
    • Spawned as cancellable task using task manager
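
The reconnection behavior of the Jetstream consumer can be pictured with a small backoff helper. This is a minimal sketch of capped exponential backoff in general, not the consumer's actual policy: the `Backoff` type, the doubling factor, and the base and maximum delays are all assumptions made for illustration.

```rust
use std::time::Duration;

/// Hypothetical capped exponential backoff: base, 2x base, 4x base, ... up to `max`.
pub struct Backoff {
    base: Duration,
    max: Duration,
    attempt: u32,
}

impl Backoff {
    pub fn new(base: Duration, max: Duration) -> Self {
        Self { base, max, attempt: 0 }
    }

    /// Delay to wait before the next reconnection attempt.
    pub fn next_delay(&mut self) -> Duration {
        // Saturating arithmetic keeps long outages from overflowing.
        let factor = 2u32.saturating_pow(self.attempt);
        let delay = self.base.saturating_mul(factor).min(self.max);
        self.attempt = self.attempt.saturating_add(1);
        delay
    }

    /// Call after a successful connection so the next failure starts from `base`.
    pub fn reset(&mut self) {
        self.attempt = 0;
    }
}
```

With a base of 1 second and a cap of 10 seconds, successive calls to `next_delay` yield 1, 2, 4, 8, and then 10 seconds until `reset` is called.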

Key Technical Details

DID Method Types

  • did:web: Web-based DIDs, prefix stripped for storage
  • did:plc: PLC directory DIDs, prefix stripped for storage
  • Other DID methods stored with full identifier
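
The prefix stripping described above can be sketched as a pair of free functions. The names `DidMethodType`, `parse_did`, and `to_did` echo identifiers mentioned elsewhere in this guide, but the signatures and variant names here are assumptions; the real definitions live in src/handle_resolution_result.rs and may differ.

```rust
/// Assumed variant names; the guide only says the method type is stored alongside the payload.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DidMethodType {
    Web,
    Plc,
    Other,
}

/// Split a DID into (method type, payload). The prefix is dropped for
/// did:web and did:plc; any other method keeps its full identifier.
pub fn parse_did(did: &str) -> (DidMethodType, String) {
    if let Some(rest) = did.strip_prefix("did:web:") {
        (DidMethodType::Web, rest.to_string())
    } else if let Some(rest) = did.strip_prefix("did:plc:") {
        (DidMethodType::Plc, rest.to_string())
    } else {
        (DidMethodType::Other, did.to_string())
    }
}

/// Rebuild the full DID from the stored method type and payload.
pub fn to_did(method: DidMethodType, payload: &str) -> String {
    match method {
        DidMethodType::Web => format!("did:web:{}", payload),
        DidMethodType::Plc => format!("did:plc:{}", payload),
        DidMethodType::Other => payload.to_string(),
    }
}
```

Round-tripping through `parse_did` and `to_did` returns the original DID for all three cases, which is a convenient property to assert when adding a new method type.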

Redis Integration

  • Bidirectional Caching:
    • Stores both handle→DID and DID→handle mappings
    • Uses MetroHash64 for key generation
    • Binary data storage for efficiency
    • Automatic synchronization of both directions
  • Queuing: Reliable queue with processing/dead letter queues
  • Key Prefixes: Configurable via QUEUE_REDIS_PREFIX environment variable

Handle Resolution Flow

  1. Check cache (Redis/SQLite/in-memory based on configuration)
  2. If cache miss and rate limiting enabled:
    • Acquire semaphore permit (with optional timeout)
    • If timeout configured and exceeded, return error
  3. Perform DNS TXT lookup or HTTP well-known query
  4. Cache result with appropriate TTL in both directions (handle→DID and DID→handle)
  5. Return DID or error
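
The flow above amounts to resolvers wrapping one another. The sketch below shows that layering with a simplified, synchronous trait; it is an assumption-laden illustration rather than the project's `HandleResolver` (which is async and also exposes purge and set). `StaticResolver` is a hypothetical stand-in for the DNS/HTTP layer, errors are plain strings for brevity, and TTLs, the reverse mapping, and rate limiting are omitted.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the resolver trait: resolve only, synchronous.
pub trait HandleResolver {
    fn resolve(&mut self, handle: &str) -> Result<String, String>;
}

/// Hypothetical base layer: a fixed table instead of DNS TXT / HTTP well-known lookups.
pub struct StaticResolver {
    pub records: HashMap<String, String>,
    pub lookups: usize,
}

impl HandleResolver for StaticResolver {
    fn resolve(&mut self, handle: &str) -> Result<String, String> {
        self.lookups += 1;
        self.records
            .get(handle)
            .cloned()
            .ok_or_else(|| format!("error-quickdid-resolve-1 Failed to resolve subject: {}", handle))
    }
}

/// Caching layer: answers from memory, otherwise delegates to the wrapped resolver.
pub struct CachingResolver<R: HandleResolver> {
    inner: R,
    cache: HashMap<String, String>,
}

impl<R: HandleResolver> CachingResolver<R> {
    pub fn new(inner: R) -> Self {
        Self { inner, cache: HashMap::new() }
    }

    pub fn inner(&self) -> &R {
        &self.inner
    }
}

impl<R: HandleResolver> HandleResolver for CachingResolver<R> {
    fn resolve(&mut self, handle: &str) -> Result<String, String> {
        // Step 1: cache hit short-circuits the rest of the stack.
        if let Some(did) = self.cache.get(handle) {
            return Ok(did.clone());
        }
        // Steps 2-3: cache miss falls through to the next layer.
        let did = self.inner.resolve(handle)?;
        // Step 4: remember the result (errors are not cached in this sketch).
        self.cache.insert(handle.to_string(), did.clone());
        Ok(did)
    }
}
```

Because each layer only knows about the trait, further layers such as rate limiting can be added by wrapping again, for example `CachingResolver::new(RateLimited::new(base))` with a hypothetical `RateLimited` type.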

Cache Management Operations

  • Purge: Removes entries by either handle or DID
    • Uses atproto_identity::resolve::parse_input for identifier detection
    • Removes both handle→DID and DID→handle mappings
    • Chains through all resolver layers
  • Set: Manually updates handle-to-DID mappings
    • Updates both directions in cache
    • Normalizes handles to lowercase
    • Chains through all resolver layers
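
The purge and set semantics can be sketched with two in-memory maps. This is only an illustration: `BidirectionalCache` is a hypothetical type, and a plain `did:` prefix check stands in for `atproto_identity::resolve::parse_input`, whose actual behavior is not reproduced here.

```rust
use std::collections::HashMap;

/// Hypothetical two-way cache: every entry is reachable by handle and by DID.
#[derive(Default)]
pub struct BidirectionalCache {
    handle_to_did: HashMap<String, String>,
    did_to_handle: HashMap<String, String>,
}

impl BidirectionalCache {
    /// Set: store both directions, normalizing the handle to lowercase.
    pub fn set(&mut self, handle: &str, did: &str) {
        let handle = handle.to_lowercase();
        // If the handle previously pointed at another DID, drop that reverse entry.
        if let Some(old_did) = self.handle_to_did.insert(handle.clone(), did.to_string()) {
            if old_did != did {
                self.did_to_handle.remove(&old_did);
            }
        }
        // If the DID previously belonged to another handle, drop that forward entry.
        if let Some(old_handle) = self.did_to_handle.insert(did.to_string(), handle.clone()) {
            if old_handle != handle {
                self.handle_to_did.remove(&old_handle);
            }
        }
    }

    /// Purge: accept either a handle or a DID and remove both directions.
    pub fn purge(&mut self, subject: &str) {
        if subject.starts_with("did:") {
            if let Some(handle) = self.did_to_handle.remove(subject) {
                self.handle_to_did.remove(&handle);
            }
        } else if let Some(did) = self.handle_to_did.remove(&subject.to_lowercase()) {
            self.did_to_handle.remove(&did);
        }
    }

    pub fn did_for(&self, handle: &str) -> Option<&str> {
        self.handle_to_did.get(&handle.to_lowercase()).map(String::as_str)
    }

    pub fn handle_for(&self, did: &str) -> Option<&str> {
        self.did_to_handle.get(did).map(String::as_str)
    }
}
```

In the layered design described above, each resolver would apply the same operation to its own store and then forward it to the resolver it wraps.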

Environment Variables

Required

  • HTTP_EXTERNAL: External hostname for service endpoints (e.g., localhost:3007)

Optional - Core Configuration

  • HTTP_PORT: Server port (default: 8080)
  • PLC_HOSTNAME: PLC directory hostname (default: plc.directory)
  • RUST_LOG: Logging level (e.g., debug, info)
  • STATIC_FILES_DIR: Directory for serving static files (default: www)

Optional - Caching

  • REDIS_URL: Redis connection URL for caching
  • SQLITE_URL: SQLite database URL for caching (e.g., sqlite:./quickdid.db)
  • CACHE_TTL_MEMORY: TTL for in-memory cache in seconds (default: 600, i.e. 10 minutes)
  • CACHE_TTL_REDIS: TTL for Redis cache in seconds (default: 7776000, i.e. 90 days)
  • CACHE_TTL_SQLITE: TTL for SQLite cache in seconds (default: 7776000, i.e. 90 days)

Optional - Queue Configuration

  • QUEUE_ADAPTER: Queue type - 'mpsc', 'redis', 'sqlite', 'noop', or 'none' (default: mpsc)
  • QUEUE_REDIS_PREFIX: Redis key prefix for queues (default: queue:handleresolver:)
  • QUEUE_WORKER_ID: Worker ID for queue operations (default: worker1)
  • QUEUE_BUFFER_SIZE: Buffer size for MPSC queue (default: 1000)
  • QUEUE_SQLITE_MAX_SIZE: Max queue size for SQLite work shedding (default: 10000)
  • QUEUE_REDIS_TIMEOUT: Redis blocking timeout in seconds (default: 5)
  • QUEUE_REDIS_DEDUP_ENABLED: Enable queue deduplication to prevent duplicate handles (default: false)
  • QUEUE_REDIS_DEDUP_TTL: TTL for deduplication keys in seconds (default: 60)

Optional - Rate Limiting

  • RESOLVER_MAX_CONCURRENT: Maximum concurrent handle resolutions (default: 0 = disabled)
  • RESOLVER_MAX_CONCURRENT_TIMEOUT_MS: Timeout for acquiring a rate limit permit, in milliseconds (default: 0 = no timeout)

Optional - HTTP Cache Control

  • CACHE_MAX_AGE: Max-age for Cache-Control header in seconds (default: 86400, i.e. 1 day)
  • CACHE_STALE_IF_ERROR: Stale-if-error directive in seconds (default: 172800, i.e. 2 days)
  • CACHE_STALE_WHILE_REVALIDATE: Stale-while-revalidate directive in seconds (default: 86400, i.e. 1 day)
  • CACHE_MAX_STALE: Max-stale directive in seconds (default: 86400, i.e. 1 day)
  • ETAG_SEED: Seed value for ETag generation (default: application version)
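
As a rough picture of how the four cache settings could combine into one header value, consider the sketch below. The `CacheControl` struct and the order of directives are assumptions; the service may emit additional directives or format the header differently.

```rust
/// Hypothetical holder for the four cache settings, all in seconds.
pub struct CacheControl {
    pub max_age: u64,
    pub stale_if_error: u64,
    pub stale_while_revalidate: u64,
    pub max_stale: u64,
}

impl Default for CacheControl {
    /// Defaults taken from the list above.
    fn default() -> Self {
        Self {
            max_age: 86_400,
            stale_if_error: 172_800,
            stale_while_revalidate: 86_400,
            max_stale: 86_400,
        }
    }
}

impl CacheControl {
    /// Render the settings as a Cache-Control header value.
    pub fn header_value(&self) -> String {
        format!(
            "max-age={}, stale-while-revalidate={}, stale-if-error={}, max-stale={}",
            self.max_age, self.stale_while_revalidate, self.stale_if_error, self.max_stale
        )
    }
}
```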

Optional - Metrics

  • METRICS_ADAPTER: Metrics adapter type - 'noop' or 'statsd' (default: noop)
  • METRICS_STATSD_HOST: StatsD host and port (required when METRICS_ADAPTER=statsd, e.g., localhost:8125)
  • METRICS_STATSD_BIND: Bind address for the StatsD UDP socket (default: [::]:0, the IPv6 wildcard address; use 0.0.0.0:0 for IPv4)
  • METRICS_PREFIX: Prefix for all metrics (default: quickdid)
  • METRICS_TAGS: Comma-separated tags (e.g., env:prod,service:quickdid)

Optional - Proactive Refresh

  • PROACTIVE_REFRESH_ENABLED: Enable proactive cache refreshing (default: false)
  • PROACTIVE_REFRESH_THRESHOLD: Refresh when TTL remaining is below this threshold (0.0-1.0, default: 0.8)

Optional - Jetstream Consumer

  • JETSTREAM_ENABLED: Enable Jetstream consumer for real-time cache updates (default: false)
  • JETSTREAM_HOSTNAME: Jetstream WebSocket hostname (default: jetstream.atproto.tools)

Error Handling

All error strings must use this format:

error-quickdid-<domain>-<number> <message>: <details>

Current error domains and examples:

  • config: Configuration errors (e.g., error-quickdid-config-1 Missing required environment variable)
  • resolve: Handle resolution errors (e.g., error-quickdid-resolve-1 Failed to resolve subject)
  • queue: Queue operation errors (e.g., error-quickdid-queue-1 Failed to push to queue)
  • cache: Cache-related errors (e.g., error-quickdid-cache-1 Redis pool creation failed)
  • result: Serialization errors (e.g., error-quickdid-result-1 System time error)
  • task: Task processing errors (e.g., error-quickdid-task-1 Queue adapter health check failed)

Errors should be represented as enums using the thiserror library.

Avoid creating new errors with the anyhow!(...) or bail!(...) macros.
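
One way to keep error strings honest about the format above is a small checker that tests can run against every error's rendered message. The `parse_error_id` helper below is hypothetical and not part of the codebase; it assumes domains are lowercase ASCII words, as in the list of current domains.

```rust
/// Extract (domain, number) from a string that starts with
/// `error-quickdid-<domain>-<number>`; returns None if the identifier is malformed.
pub fn parse_error_id(message: &str) -> Option<(&str, u32)> {
    // The identifier is the first whitespace-delimited token.
    let id = message.split_whitespace().next()?;
    let rest = id.strip_prefix("error-quickdid-")?;
    // Split on the last '-' so the number is whatever follows the domain.
    let (domain, number) = rest.rsplit_once('-')?;
    if domain.is_empty() || !domain.chars().all(|c| c.is_ascii_lowercase()) {
        return None;
    }
    Some((domain, number.parse().ok()?))
}
```

A unit test could then assert, for each variant of an error enum, that `parse_error_id(&err.to_string())` returns the expected domain and a unique number.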

Testing

Running Tests

# Run all tests
cargo test

# Run with Redis integration tests
TEST_REDIS_URL=redis://localhost:6379 cargo test

# Run specific test module
cargo test handle_resolver::tests

Test Coverage Areas

  • Handle resolution with various DID methods
  • Binary serialization/deserialization
  • Redis caching and expiration with bidirectional lookups
  • Queue processing logic
  • HTTP endpoint responses
  • Jetstream event handler processing
  • Purge and set operations across resolver layers

Development Patterns

Error Handling

  • Uses strongly-typed errors with thiserror for all modules
  • Each error has a unique identifier following the pattern error-quickdid-<domain>-<number>
  • Graceful fallbacks when Redis/SQLite is unavailable
  • Detailed tracing for debugging
  • Avoid the anyhow!() and bail!() macros; use proper error types instead

Performance Optimizations

  • Binary serialization reduces storage by ~40%
  • MetroHash64 for fast key generation
  • Connection pooling for Redis
  • Configurable TTLs for cache entries
  • Rate limiting via semaphore-based concurrency control
  • HTTP caching with ETag and Cache-Control headers
  • Resolution timing metrics for performance monitoring

Code Style

  • Follow existing Rust idioms and patterns
  • Use tracing for logging, not println!
  • Prefer Arc for shared state across async tasks
  • Handle errors explicitly; avoid .unwrap() in production code
  • Use httpdate crate for HTTP date formatting (not chrono)

Common Tasks

Adding a New DID Method

  1. Update DidMethodType enum in handle_resolution_result.rs
  2. Modify parse_did() and to_did() methods
  3. Add test cases for the new method type

Modifying Cache TTL

  • For in-memory: Set CACHE_TTL_MEMORY environment variable
  • For Redis: Set CACHE_TTL_REDIS environment variable
  • For SQLite: Set CACHE_TTL_SQLITE environment variable

Configuring Metrics

  1. Set METRICS_ADAPTER=statsd and METRICS_STATSD_HOST=localhost:8125
  2. Configure tags with METRICS_TAGS=env:prod,service:quickdid
  3. Use Telegraf + TimescaleDB for aggregation (see docs/telegraf-timescaledb-metrics-guide.md)
  4. Railway deployment resources available in railway-resources/telegraf/

Debugging Resolution Issues

  1. Enable debug logging: RUST_LOG=debug
  2. Check Redis cache:
    • Handle lookup: redis-cli GET "handle:<hash>"
    • DID lookup: redis-cli GET "handle:<hash>" (same key format)
  3. Check SQLite cache: sqlite3 quickdid.db "SELECT * FROM handle_resolution_cache;"
  4. Monitor queue processing in logs
  5. Check rate limiting: Look for "Rate limit permit acquisition timed out" errors
  6. Verify DNS/HTTP connectivity to AT Protocol infrastructure
  7. Monitor metrics for resolution timing and cache hit rates
  8. Check Jetstream consumer status:
    • Look for "Jetstream consumer" log entries
    • Monitor jetstream.* metrics
    • Check reconnection attempts in logs

Dependencies

  • atproto-identity: Core AT Protocol identity resolution
  • atproto-jetstream: AT Protocol Jetstream event consumer
  • bincode: Binary serialization
  • deadpool-redis: Redis connection pooling
  • metrohash: Fast non-cryptographic hashing
  • tokio: Async runtime
  • axum: Web framework
  • httpdate: HTTP date formatting (replacing chrono)
  • cadence: StatsD metrics client
  • thiserror: Error handling