# QuickDID - Development Guide for Claude

## Overview
QuickDID is a high-performance AT Protocol identity resolution service written in Rust. It provides bidirectional handle-to-DID and DID-to-handle resolution with multi-layer caching (Redis, SQLite, in-memory), queue processing, metrics support, proactive cache refreshing, and real-time cache updates via a Jetstream consumer.

## Configuration

QuickDID follows the 12-factor app methodology and uses environment variables exclusively for configuration. There are no command-line arguments except for `--version` and `--help`.

Configuration is validated at startup. If validation fails, the service exits and reports one of the following error identifiers:
- `error-quickdid-config-1`: Missing required environment variable
- `error-quickdid-config-2`: Invalid configuration value
- `error-quickdid-config-3`: Invalid TTL value (must be positive)
- `error-quickdid-config-4`: Invalid timeout value (must be positive)
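
The startup checks above can be sketched roughly as follows. This is a minimal, dependency-free illustration, not the project's actual code: `ConfigError`, `require`, and `parse_ttl` are hypothetical names, `Display` is written by hand where the project would use `thiserror`, and mapping a non-numeric TTL to `config-2` rather than `config-3` is an assumption.

```rust
use std::fmt;

/// Hypothetical error type mirroring the four identifiers listed above.
#[derive(Debug, PartialEq)]
enum ConfigError {
    MissingRequired(String),
    InvalidValue(String),
    InvalidTtl(String),
    InvalidTimeout(String),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::MissingRequired(name) => write!(
                f,
                "error-quickdid-config-1 Missing required environment variable: {}",
                name
            ),
            ConfigError::InvalidValue(detail) => {
                write!(f, "error-quickdid-config-2 Invalid configuration value: {}", detail)
            }
            ConfigError::InvalidTtl(name) => write!(
                f,
                "error-quickdid-config-3 Invalid TTL value (must be positive): {}",
                name
            ),
            ConfigError::InvalidTimeout(name) => write!(
                f,
                "error-quickdid-config-4 Invalid timeout value (must be positive): {}",
                name
            ),
        }
    }
}

/// Rejects a missing or empty required variable.
fn require(name: &str, value: Option<String>) -> Result<String, ConfigError> {
    match value {
        Some(v) if !v.trim().is_empty() => Ok(v),
        _ => Err(ConfigError::MissingRequired(name.to_string())),
    }
}

/// Parses a TTL in seconds; zero and negative values are rejected.
fn parse_ttl(name: &str, raw: &str) -> Result<u64, ConfigError> {
    match raw.trim().parse::<i64>() {
        Ok(secs) if secs > 0 => Ok(secs as u64),
        Ok(_) => Err(ConfigError::InvalidTtl(name.to_string())),
        // Assumption: unparseable input is reported as a generic invalid value.
        Err(_) => Err(ConfigError::InvalidValue(format!("{}={}", name, raw))),
    }
}
```

In this sketch the caller would pass the environment in explicitly, for example `require("HTTP_EXTERNAL", std::env::var("HTTP_EXTERNAL").ok())`, which keeps the validation logic testable without touching process state.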

## Common Commands

### Building and Running
```bash
# Build the project
cargo build

# Run in debug mode (requires environment variables)
HTTP_EXTERNAL=localhost:3007 cargo run

# Run tests
cargo test

# Type checking
cargo check

# Linting
cargo clippy

# Show version
cargo run -- --version

# Show help
cargo run -- --help
```

### Development with VS Code
The project includes a `.vscode/launch.json` configuration for debugging with Redis integration. Use the "Debug executable 'quickdid'" launch configuration.

## Architecture

### Core Components

1. **Handle Resolution** (`src/handle_resolver/`)
   - `BaseHandleResolver`: Core resolution using DNS and HTTP
   - `RateLimitedHandleResolver`: Semaphore-based rate limiting with optional timeout
   - `CachingHandleResolver`: In-memory caching layer with bidirectional support
   - `RedisHandleResolver`: Redis-backed persistent caching with bidirectional lookups
   - `SqliteHandleResolver`: SQLite-backed persistent caching with bidirectional support
   - `ProactiveRefreshResolver`: Automatically refreshes cache entries before expiration
   - All resolvers implement `HandleResolver` trait with:
     - `resolve`: Handle-to-DID resolution
     - `purge`: Remove entries by handle or DID
     - `set`: Manually update handle-to-DID mappings
   - Uses binary serialization via `HandleResolutionResult` for space efficiency
   - Resolution stack: Cache → ProactiveRefresh (optional) → RateLimited (optional) → Base → DNS/HTTP
   - Includes resolution timing measurements for metrics

2. **Binary Serialization** (`src/handle_resolution_result.rs`)
   - Compact storage format using bincode
   - Strips DID prefixes for did:web and did:plc methods
   - Stores: timestamp (u64), method type (i16), payload (String)

3. **Queue System** (`src/queue/`)
   - Supports MPSC (in-process), Redis, SQLite, and no-op adapters
   - `HandleResolutionWork` items processed asynchronously
   - Redis uses reliable queue pattern (LPUSH/RPOPLPUSH/LREM)
   - SQLite provides persistent queue with work shedding capabilities

4. **HTTP Server** (`src/http/`)
   - XRPC endpoints for AT Protocol compatibility
   - Health check endpoint
   - Static file serving from configurable directory (default: www)
   - Serves .well-known files as static content
   - CORS headers support for cross-origin requests
   - Cache-Control headers with configurable max-age and stale directives
   - ETag support with configurable seed for cache invalidation

5. **Metrics System** (`src/metrics.rs`)
   - Pluggable metrics publishing with StatsD support
   - Tracks counters, gauges, and timings
   - Configurable tags for environment/service identification
   - No-op adapter for development environments
   - Metrics for Jetstream event processing

6. **Jetstream Consumer** (`src/jetstream_handler.rs`)
   - Consumes AT Protocol firehose events via WebSocket
   - Processes Account events (purges deleted/deactivated accounts)
   - Processes Identity events (updates handle-to-DID mappings)
   - Automatic reconnection with exponential backoff
   - Comprehensive metrics for event processing
   - Spawned as cancellable task using task manager
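
The Jetstream consumer's reconnection with exponential backoff can be illustrated with a small delay calculation. This is a sketch under assumptions: `backoff_delay` is a hypothetical helper, and the base delay and cap are chosen by the caller here because the source does not state the values QuickDID uses.

```rust
use std::time::Duration;

/// Delay before reconnect attempt `attempt` (0-based): `base * 2^attempt`, capped at `max`.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    // checked_pow guards against overflow when the attempt count grows large.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    // checked_mul returns None on overflow, in which case the cap applies.
    base.checked_mul(factor).map_or(max, |delay| delay.min(max))
}
```

With a one-second base and a sixty-second cap, attempts 0, 1, 2, 3 wait 1, 2, 4, and 8 seconds, and every attempt from the sixth retry onward waits the full sixty seconds.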

## Key Technical Details

### DID Method Types
- `did:web`: Web-based DIDs, prefix stripped for storage
- `did:plc`: PLC directory DIDs, prefix stripped for storage
- Other DID methods stored with full identifier
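
The prefix handling described above might look like the following. The names `DidMethodType`, `parse_did`, and `to_did` appear elsewhere in this guide, but the variants and signatures shown here are assumptions made for illustration, and the bincode encoding of the result is omitted.

```rust
/// Assumed variants; the real enum may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DidMethodType {
    Web,
    Plc,
    Other,
}

/// Splits a DID into its method type and the payload kept in storage.
fn parse_did(did: &str) -> (DidMethodType, String) {
    if let Some(rest) = did.strip_prefix("did:web:") {
        (DidMethodType::Web, rest.to_string())
    } else if let Some(rest) = did.strip_prefix("did:plc:") {
        (DidMethodType::Plc, rest.to_string())
    } else {
        // Unknown methods keep the full identifier.
        (DidMethodType::Other, did.to_string())
    }
}

/// Rebuilds the full DID from the stored method type and payload.
fn to_did(method: DidMethodType, payload: &str) -> String {
    match method {
        DidMethodType::Web => format!("did:web:{}", payload),
        DidMethodType::Plc => format!("did:plc:{}", payload),
        DidMethodType::Other => payload.to_string(),
    }
}
```

Stripping the prefix saves eight bytes per entry for the two common methods, and the round trip through `to_did` restores the original identifier.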

### Redis Integration
- **Bidirectional Caching**:
  - Stores both handle→DID and DID→handle mappings
  - Uses MetroHash64 for key generation
  - Binary data storage for efficiency
  - Automatic synchronization of both directions
- **Queuing**: Reliable queue with processing/dead letter queues
- **Key Prefixes**: Configurable via `QUEUE_REDIS_PREFIX` environment variable
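
To show why the LPUSH/RPOPLPUSH/LREM pattern is considered reliable, here is an in-memory model of it. This is only a simulation using two `VecDeque`s in place of Redis lists; `ReliableQueue` and its methods are hypothetical, the `recover` step is an assumption about how unacknowledged work could be returned, and dead-letter handling is left out.

```rust
use std::collections::VecDeque;

/// In-memory stand-in for two Redis lists: pending work and in-flight work.
#[derive(Default)]
struct ReliableQueue {
    pending: VecDeque<String>,
    processing: VecDeque<String>,
}

impl ReliableQueue {
    /// Models LPUSH: add work at the head of the pending list.
    fn push(&mut self, item: &str) {
        self.pending.push_front(item.to_string());
    }

    /// Models RPOPLPUSH: move the tail of pending to the head of processing,
    /// so the item is never absent from both lists.
    fn claim(&mut self) -> Option<String> {
        let item = self.pending.pop_back()?;
        self.processing.push_front(item.clone());
        Some(item)
    }

    /// Models LREM: drop a finished item from the processing list.
    fn ack(&mut self, item: &str) -> bool {
        match self.processing.iter().position(|i| i.as_str() == item) {
            Some(pos) => {
                self.processing.remove(pos);
                true
            }
            None => false,
        }
    }

    /// Assumed recovery step: return unacknowledged items to pending,
    /// for example after a worker crash.
    fn recover(&mut self) -> usize {
        let count = self.processing.len();
        while let Some(item) = self.processing.pop_back() {
            self.pending.push_back(item);
        }
        count
    }
}
```

Because a claimed item stays in the processing list until it is acknowledged, a worker that dies mid-task leaves evidence behind that another process can pick up.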

### Handle Resolution Flow
1. Check cache (Redis/SQLite/in-memory based on configuration)
2. If cache miss and rate limiting enabled:
   - Acquire semaphore permit (with optional timeout)
   - If timeout configured and exceeded, return error
3. Perform DNS TXT lookup or HTTP well-known query
4. Cache result with appropriate TTL in both directions (handle→DID and DID→handle)
5. Return DID or error
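
Step 2 of the flow above, acquiring a permit with a timeout, can be sketched with a small blocking semaphore. This is a standard-library stand-in chosen so the example has no dependencies; the service itself is async, so the real resolver presumably relies on an async semaphore instead. `Semaphore` and `resolve_rate_limited` are hypothetical names.

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// Minimal counting semaphore built on Mutex + Condvar.
struct Semaphore {
    permits: Mutex<usize>,
    available: Condvar,
}

impl Semaphore {
    fn new(permits: usize) -> Self {
        Semaphore { permits: Mutex::new(permits), available: Condvar::new() }
    }

    /// Waits up to `timeout` for a permit. Returns false if none became available.
    fn acquire_timeout(&self, timeout: Duration) -> bool {
        let guard = self.permits.lock().expect("semaphore mutex poisoned");
        let (mut guard, _result) = self
            .available
            .wait_timeout_while(guard, timeout, |permits| *permits == 0)
            .expect("semaphore mutex poisoned");
        if *guard == 0 {
            return false; // timed out with no permit free
        }
        *guard -= 1;
        true
    }

    /// Returns a permit and wakes one waiter.
    fn release(&self) {
        *self.permits.lock().expect("semaphore mutex poisoned") += 1;
        self.available.notify_one();
    }
}

/// Runs `resolve` only while holding a permit; errors out if the wait times out.
fn resolve_rate_limited(
    sem: &Semaphore,
    timeout: Duration,
    resolve: impl FnOnce() -> Result<String, String>,
) -> Result<String, String> {
    if !sem.acquire_timeout(timeout) {
        return Err("Rate limit permit acquisition timed out".to_string());
    }
    let result = resolve();
    sem.release();
    result
}
```

The number of permits corresponds to `RESOLVER_MAX_CONCURRENT`, and the timeout to `RESOLVER_MAX_CONCURRENT_TIMEOUT_MS`; both are passed in by the caller in this sketch.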

### Cache Management Operations
- **Purge**: Removes entries by either handle or DID
  - Uses `atproto_identity::resolve::parse_input` for identifier detection
  - Removes both handle→DID and DID→handle mappings
  - Chains through all resolver layers
- **Set**: Manually updates handle-to-DID mappings
  - Updates both directions in cache
  - Normalizes handles to lowercase
  - Chains through all resolver layers
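
The way `resolve`, `purge`, and `set` chain through layers while keeping both directions in sync can be sketched as below. Several simplifications are assumptions: the trait is synchronous and uses `String` errors, `StaticResolver` is a hypothetical stand-in for the DNS/HTTP layer, TTLs are omitted, and a `did:` prefix check replaces `atproto_identity::resolve::parse_input`.

```rust
use std::collections::HashMap;

/// Simplified, synchronous stand-in for the `HandleResolver` trait.
trait HandleResolver {
    fn resolve(&mut self, handle: &str) -> Result<String, String>;
    fn purge(&mut self, subject: &str);
    fn set(&mut self, handle: &str, did: &str);
}

/// Stand-in for the base layer: answers from a fixed table and counts lookups.
struct StaticResolver {
    known: HashMap<String, String>,
    calls: usize,
}

impl HandleResolver for StaticResolver {
    fn resolve(&mut self, handle: &str) -> Result<String, String> {
        self.calls += 1;
        self.known.get(handle).cloned().ok_or_else(|| {
            format!("error-quickdid-resolve-1 Failed to resolve subject: {}", handle)
        })
    }
    fn purge(&mut self, _subject: &str) {} // nothing is cached at this layer
    fn set(&mut self, _handle: &str, _did: &str) {}
}

/// In-memory layer that keeps handle→DID and DID→handle maps in sync.
struct CachingResolver<R: HandleResolver> {
    inner: R,
    handle_to_did: HashMap<String, String>,
    did_to_handle: HashMap<String, String>,
}

impl<R: HandleResolver> CachingResolver<R> {
    fn new(inner: R) -> Self {
        CachingResolver { inner, handle_to_did: HashMap::new(), did_to_handle: HashMap::new() }
    }

    /// Writes both directions, dropping any reverse entry the handle had before.
    fn store(&mut self, handle: &str, did: &str) {
        if let Some(old_did) = self.handle_to_did.insert(handle.to_string(), did.to_string()) {
            self.did_to_handle.remove(&old_did);
        }
        self.did_to_handle.insert(did.to_string(), handle.to_string());
    }
}

impl<R: HandleResolver> HandleResolver for CachingResolver<R> {
    fn resolve(&mut self, handle: &str) -> Result<String, String> {
        let handle = handle.to_lowercase();
        if let Some(did) = self.handle_to_did.get(&handle) {
            return Ok(did.clone()); // cache hit
        }
        let did = self.inner.resolve(&handle)?; // cache miss: ask the next layer
        self.store(&handle, &did);
        Ok(did)
    }

    fn purge(&mut self, subject: &str) {
        // Simplified identifier detection (assumption, see lead-in).
        if subject.starts_with("did:") {
            if let Some(handle) = self.did_to_handle.remove(subject) {
                self.handle_to_did.remove(&handle);
            }
        } else if let Some(did) = self.handle_to_did.remove(&subject.to_lowercase()) {
            self.did_to_handle.remove(&did);
        }
        self.inner.purge(subject); // chain to the next layer
    }

    fn set(&mut self, handle: &str, did: &str) {
        let handle = handle.to_lowercase();
        self.store(&handle, did);
        self.inner.set(&handle, did); // chain to the next layer
    }
}
```

Because each layer owns the next one and forwards every call, stacking further layers is a matter of wrapping one resolver in another.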

## Environment Variables

### Required
- `HTTP_EXTERNAL`: External hostname for service endpoints (e.g., `localhost:3007`)

### Optional - Core Configuration
- `HTTP_PORT`: Server port (default: 8080)
- `PLC_HOSTNAME`: PLC directory hostname (default: plc.directory)
- `RUST_LOG`: Logging level (e.g., debug, info)
- `STATIC_FILES_DIR`: Directory for serving static files (default: www)

### Optional - Caching
- `REDIS_URL`: Redis connection URL for caching
- `SQLITE_URL`: SQLite database URL for caching (e.g., `sqlite:./quickdid.db`)
- `CACHE_TTL_MEMORY`: TTL for in-memory cache in seconds (default: 600)
- `CACHE_TTL_REDIS`: TTL for Redis cache in seconds (default: 7776000)
- `CACHE_TTL_SQLITE`: TTL for SQLite cache in seconds (default: 7776000)

### Optional - Queue Configuration
- `QUEUE_ADAPTER`: Queue type - 'mpsc', 'redis', 'sqlite', 'noop', or 'none' (default: mpsc)
- `QUEUE_REDIS_PREFIX`: Redis key prefix for queues (default: queue:handleresolver:)
- `QUEUE_WORKER_ID`: Worker ID for queue operations (default: worker1)
- `QUEUE_BUFFER_SIZE`: Buffer size for MPSC queue (default: 1000)
- `QUEUE_SQLITE_MAX_SIZE`: Max queue size for SQLite work shedding (default: 10000)
- `QUEUE_REDIS_TIMEOUT`: Redis blocking timeout in seconds (default: 5)
- `QUEUE_REDIS_DEDUP_ENABLED`: Enable queue deduplication to prevent duplicate handles (default: false)
- `QUEUE_REDIS_DEDUP_TTL`: TTL for deduplication keys in seconds (default: 60)

### Optional - Rate Limiting
- `RESOLVER_MAX_CONCURRENT`: Maximum concurrent handle resolutions (default: 0 = disabled)
- `RESOLVER_MAX_CONCURRENT_TIMEOUT_MS`: Timeout for acquiring rate limit permit in ms (default: 0 = no timeout)

### Optional - HTTP Cache Control
- `CACHE_MAX_AGE`: Max-age for Cache-Control header in seconds (default: 86400)
- `CACHE_STALE_IF_ERROR`: Stale-if-error directive in seconds (default: 172800)
- `CACHE_STALE_WHILE_REVALIDATE`: Stale-while-revalidate directive in seconds (default: 86400)
- `CACHE_MAX_STALE`: Max-stale directive in seconds (default: 86400)
- `ETAG_SEED`: Seed value for ETag generation (default: application version)
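
As an illustration of how the four cache settings could combine into a single header value, consider the sketch below. The `CacheControl` struct and `header_value` method are hypothetical, and both the directive order and the absence of any further directives are assumptions; the server's actual header may differ. ETag generation is not covered.

```rust
/// Values mirror the documented defaults; all are in seconds.
struct CacheControl {
    max_age: u64,
    stale_if_error: u64,
    stale_while_revalidate: u64,
    max_stale: u64,
}

impl Default for CacheControl {
    fn default() -> Self {
        CacheControl {
            max_age: 86_400,
            stale_if_error: 172_800,
            stale_while_revalidate: 86_400,
            max_stale: 86_400,
        }
    }
}

impl CacheControl {
    /// Joins the configured directives into one Cache-Control value.
    fn header_value(&self) -> String {
        format!(
            "max-age={}, stale-while-revalidate={}, stale-if-error={}, max-stale={}",
            self.max_age, self.stale_while_revalidate, self.stale_if_error, self.max_stale
        )
    }
}
```

With the defaults, a client may reuse a response for one day and fall back to a stale copy for up to two days if the origin returns an error.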

### Optional - Metrics
- `METRICS_ADAPTER`: Metrics adapter type - 'noop' or 'statsd' (default: noop)
- `METRICS_STATSD_HOST`: StatsD host and port (required when METRICS_ADAPTER=statsd, e.g., localhost:8125)
- `METRICS_STATSD_BIND`: Bind address for StatsD UDP socket (default: [::]:0 for IPv6, can use 0.0.0.0:0 for IPv4)
- `METRICS_PREFIX`: Prefix for all metrics (default: quickdid)
- `METRICS_TAGS`: Comma-separated tags (e.g., env:prod,service:quickdid)
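
The `METRICS_TAGS` format can be parsed into key/value pairs along these lines. `parse_tags` is a hypothetical helper, and silently skipping malformed entries is an assumption; the service may instead reject them at startup.

```rust
/// Splits "key:value,key:value" into pairs, ignoring entries without both parts.
fn parse_tags(raw: &str) -> Vec<(String, String)> {
    raw.split(',')
        .filter_map(|entry| {
            let (key, value) = entry.trim().split_once(':')?;
            if key.is_empty() || value.is_empty() {
                return None; // assumption: incomplete tags are dropped
            }
            Some((key.to_string(), value.to_string()))
        })
        .collect()
}
```

For the example value from the list above, this yields the pairs `("env", "prod")` and `("service", "quickdid")`.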

### Optional - Proactive Refresh
- `PROACTIVE_REFRESH_ENABLED`: Enable proactive cache refreshing (default: false)
- `PROACTIVE_REFRESH_THRESHOLD`: Refresh when TTL remaining is below this threshold (0.0-1.0, default: 0.8)

### Optional - Jetstream Consumer
- `JETSTREAM_ENABLED`: Enable Jetstream consumer for real-time cache updates (default: false)
- `JETSTREAM_HOSTNAME`: Jetstream WebSocket hostname (default: jetstream.atproto.tools)

## Error Handling

All error strings must use this format:

    error-quickdid-<domain>-<number> <message>: <details>

Current error domains and examples:

* `config`: Configuration errors (e.g., error-quickdid-config-1 Missing required environment variable)
* `resolve`: Handle resolution errors (e.g., error-quickdid-resolve-1 Failed to resolve subject)
* `queue`: Queue operation errors (e.g., error-quickdid-queue-1 Failed to push to queue)
* `cache`: Cache-related errors (e.g., error-quickdid-cache-1 Redis pool creation failed)
* `result`: Serialization errors (e.g., error-quickdid-result-1 System time error)
* `task`: Task processing errors (e.g., error-quickdid-task-1 Queue adapter health check failed)

Errors should be represented as enums using the `thiserror` library.

Avoid creating new errors with the `anyhow!(...)` or `bail!(...)` macros.

## Testing

### Running Tests
```bash
# Run all tests
cargo test

# Run with Redis integration tests
TEST_REDIS_URL=redis://localhost:6379 cargo test

# Run specific test module
cargo test handle_resolver::tests
```

### Test Coverage Areas
- Handle resolution with various DID methods
- Binary serialization/deserialization
- Redis caching and expiration with bidirectional lookups
- Queue processing logic
- HTTP endpoint responses
- Jetstream event handler processing
- Purge and set operations across resolver layers

## Development Patterns

### Error Handling
- Uses strongly-typed errors with `thiserror` for all modules
- Each error has a unique identifier following the pattern `error-quickdid-<domain>-<number>`
- Graceful fallbacks when Redis/SQLite is unavailable
- Detailed tracing for debugging
- Avoid using `anyhow!()` or `bail!()` macros - use proper error types instead

### Performance Optimizations
- Binary serialization reduces storage by ~40%
- MetroHash64 for fast key generation
- Connection pooling for Redis
- Configurable TTLs for cache entries
- Rate limiting via semaphore-based concurrency control
- HTTP caching with ETag and Cache-Control headers
- Resolution timing metrics for performance monitoring

### Code Style
- Follow existing Rust idioms and patterns
- Use `tracing` for logging, not `println!`
- Prefer `Arc` for shared state across async tasks
- Handle errors explicitly, avoid `.unwrap()` in production code
- Use `httpdate` crate for HTTP date formatting (not `chrono`)

## Common Tasks

### Adding a New DID Method
1. Update `DidMethodType` enum in `handle_resolution_result.rs`
2. Modify `parse_did()` and `to_did()` methods
3. Add test cases for the new method type

### Modifying Cache TTL
- For in-memory: Set `CACHE_TTL_MEMORY` environment variable
- For Redis: Set `CACHE_TTL_REDIS` environment variable
- For SQLite: Set `CACHE_TTL_SQLITE` environment variable

### Configuring Metrics
1. Set `METRICS_ADAPTER=statsd` and `METRICS_STATSD_HOST=localhost:8125`
2. Configure tags with `METRICS_TAGS=env:prod,service:quickdid`
3. Use Telegraf + TimescaleDB for aggregation (see `docs/telegraf-timescaledb-metrics-guide.md`)
4. Railway deployment resources available in `railway-resources/telegraf/`

### Debugging Resolution Issues
1. Enable debug logging: `RUST_LOG=debug`
2. Check Redis cache:
   - Handle lookup: `redis-cli GET "handle:<hash>"`
   - DID lookup: `redis-cli GET "handle:<hash>"` (same key format)
3. Check SQLite cache: `sqlite3 quickdid.db "SELECT * FROM handle_resolution_cache;"`
4. Monitor queue processing in logs
5. Check rate limiting: Look for "Rate limit permit acquisition timed out" errors
6. Verify DNS/HTTP connectivity to AT Protocol infrastructure
7. Monitor metrics for resolution timing and cache hit rates
8. Check Jetstream consumer status:
   - Look for "Jetstream consumer" log entries
   - Monitor `jetstream.*` metrics
   - Check reconnection attempts in logs

## Dependencies
- `atproto-identity`: Core AT Protocol identity resolution
- `atproto-jetstream`: AT Protocol Jetstream event consumer
- `bincode`: Binary serialization
- `deadpool-redis`: Redis connection pooling
- `metrohash`: Fast non-cryptographic hashing
- `tokio`: Async runtime
- `axum`: Web framework
- `httpdate`: HTTP date formatting (replacing chrono)
- `cadence`: StatsD metrics client
- `thiserror`: Error handling