QuickDID is a high-performance AT Protocol identity resolution service written in Rust. It provides handle-to-DID resolution with Redis-backed caching and queue processing.

documentation: Updating documentation with recent config and features

+48 -8
CLAUDE.md
··· 1 1 # QuickDID - Development Guide for Claude 2 2 3 3 ## Overview 4 - QuickDID is a high-performance AT Protocol identity resolution service written in Rust. It provides handle-to-DID resolution with multi-layer caching (Redis, SQLite, in-memory), queue processing, metrics support, and proactive cache refreshing. 4 + QuickDID is a high-performance AT Protocol identity resolution service written in Rust. It provides bidirectional handle-to-DID and DID-to-handle resolution with multi-layer caching (Redis, SQLite, in-memory), queue processing, metrics support, proactive cache refreshing, and real-time cache updates via Jetstream consumer. 5 5 6 6 ## Configuration 7 7 ··· 49 49 1. **Handle Resolution** (`src/handle_resolver/`) 50 50 - `BaseHandleResolver`: Core resolution using DNS and HTTP 51 51 - `RateLimitedHandleResolver`: Semaphore-based rate limiting with optional timeout 52 - - `CachingHandleResolver`: In-memory caching layer 53 - - `RedisHandleResolver`: Redis-backed persistent caching 54 - - `SqliteHandleResolver`: SQLite-backed persistent caching 52 + - `CachingHandleResolver`: In-memory caching layer with bidirectional support 53 + - `RedisHandleResolver`: Redis-backed persistent caching with bidirectional lookups 54 + - `SqliteHandleResolver`: SQLite-backed persistent caching with bidirectional support 55 55 - `ProactiveRefreshResolver`: Automatically refreshes cache entries before expiration 56 + - All resolvers implement `HandleResolver` trait with: 57 + - `resolve`: Handle-to-DID resolution 58 + - `purge`: Remove entries by handle or DID 59 + - `set`: Manually update handle-to-DID mappings 56 60 - Uses binary serialization via `HandleResolutionResult` for space efficiency 57 61 - Resolution stack: Cache → ProactiveRefresh (optional) → RateLimited (optional) → Base → DNS/HTTP 58 62 - Includes resolution timing measurements for metrics ··· 82 86 - Tracks counters, gauges, and timings 83 87 - Configurable tags for environment/service identification 84 
88 - No-op adapter for development environments 89 + - Metrics for Jetstream event processing 90 + 91 + 6. **Jetstream Consumer** (`src/jetstream_handler.rs`) 92 + - Consumes AT Protocol firehose events via WebSocket 93 + - Processes Account events (purges deleted/deactivated accounts) 94 + - Processes Identity events (updates handle-to-DID mappings) 95 + - Automatic reconnection with exponential backoff 96 + - Comprehensive metrics for event processing 97 + - Spawned as cancellable task using task manager 85 98 86 99 ## Key Technical Details 87 100 ··· 91 104 - Other DID methods stored with full identifier 92 105 93 106 ### Redis Integration 94 - - **Caching**: Uses MetroHash64 for key generation, stores binary data 107 + - **Bidirectional Caching**: 108 + - Stores both handle→DID and DID→handle mappings 109 + - Uses MetroHash64 for key generation 110 + - Binary data storage for efficiency 111 + - Automatic synchronization of both directions 95 112 - **Queuing**: Reliable queue with processing/dead letter queues 96 113 - **Key Prefixes**: Configurable via `QUEUE_REDIS_PREFIX` environment variable 97 114 ··· 101 118 - Acquire semaphore permit (with optional timeout) 102 119 - If timeout configured and exceeded, return error 103 120 3. Perform DNS TXT lookup or HTTP well-known query 104 - 4. Cache result with appropriate TTL 121 + 4. Cache result with appropriate TTL in both directions (handle→DID and DID→handle) 105 122 5. 
Return DID or error 123 + 124 + ### Cache Management Operations 125 + - **Purge**: Removes entries by either handle or DID 126 + - Uses `atproto_identity::resolve::parse_input` for identifier detection 127 + - Removes both handle→DID and DID→handle mappings 128 + - Chains through all resolver layers 129 + - **Set**: Manually updates handle-to-DID mappings 130 + - Updates both directions in cache 131 + - Normalizes handles to lowercase 132 + - Chains through all resolver layers 106 133 107 134 ## Environment Variables 108 135 ··· 154 181 - `PROACTIVE_REFRESH_ENABLED`: Enable proactive cache refreshing (default: false) 155 182 - `PROACTIVE_REFRESH_THRESHOLD`: Refresh when TTL remaining is below this threshold (0.0-1.0, default: 0.8) 156 183 184 + ### Optional - Jetstream Consumer 185 + - `JETSTREAM_ENABLED`: Enable Jetstream consumer for real-time cache updates (default: false) 186 + - `JETSTREAM_HOSTNAME`: Jetstream WebSocket hostname (default: jetstream.atproto.tools) 187 + 157 188 ## Error Handling 158 189 159 190 All error strings must use this format: ··· 190 221 ### Test Coverage Areas 191 222 - Handle resolution with various DID methods 192 223 - Binary serialization/deserialization 193 - - Redis caching and expiration 224 + - Redis caching and expiration with bidirectional lookups 194 225 - Queue processing logic 195 226 - HTTP endpoint responses 227 + - Jetstream event handler processing 228 + - Purge and set operations across resolver layers 196 229 197 230 ## Development Patterns 198 231 ··· 239 272 240 273 ### Debugging Resolution Issues 241 274 1. Enable debug logging: `RUST_LOG=debug` 242 - 2. Check Redis cache: `redis-cli GET "handle:<hash>"` 275 + 2. Check Redis cache: 276 + - Handle lookup: `redis-cli GET "handle:<hash>"` 277 + - DID lookup: `redis-cli GET "handle:<hash>"` (same key format) 243 278 3. Check SQLite cache: `sqlite3 quickdid.db "SELECT * FROM handle_resolution_cache;"` 244 279 4. Monitor queue processing in logs 245 280 5. 
Check rate limiting: Look for "Rate limit permit acquisition timed out" errors 246 281 6. Verify DNS/HTTP connectivity to AT Protocol infrastructure 247 282 7. Monitor metrics for resolution timing and cache hit rates 283 + 8. Check Jetstream consumer status: 284 + - Look for "Jetstream consumer" log entries 285 + - Monitor `jetstream.*` metrics 286 + - Check reconnection attempts in logs 248 287 249 288 ## Dependencies 250 289 - `atproto-identity`: Core AT Protocol identity resolution 290 + - `atproto-jetstream`: AT Protocol Jetstream event consumer 251 291 - `bincode`: Binary serialization 252 292 - `deadpool-redis`: Redis connection pooling 253 293 - `metrohash`: Fast non-cryptographic hashing
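The `set` and `purge` semantics described in the CLAUDE.md changes above (both directions updated together, handles normalized to lowercase, purge by either handle or DID) can be sketched with a small in-memory stand-in. This is only an illustration under stated assumptions: `BidirectionalCache`, `handle_for`, and the `did:` prefix check are hypothetical and are not QuickDID's actual types; the real code detects the identifier type with `atproto_identity::resolve::parse_input` and chains through the resolver layers.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for one bidirectional cache layer.
#[derive(Default)]
struct BidirectionalCache {
    handle_to_did: HashMap<String, String>,
    did_to_handle: HashMap<String, String>,
}

impl BidirectionalCache {
    /// Store a handle-to-DID mapping in both directions.
    fn set(&mut self, handle: &str, did: &str) {
        // Handles are normalized to lowercase before they are stored.
        let handle = handle.to_lowercase();

        // If this DID previously had another handle, drop that stale forward entry.
        if let Some(old_handle) = self.did_to_handle.insert(did.to_string(), handle.clone()) {
            if old_handle != handle {
                self.handle_to_did.remove(&old_handle);
            }
        }
        // If this handle previously pointed at another DID, drop that stale reverse entry.
        if let Some(old_did) = self.handle_to_did.insert(handle, did.to_string()) {
            if old_did != did {
                self.did_to_handle.remove(&old_did);
            }
        }
    }

    /// Remove both directions, given either a handle or a DID.
    fn purge(&mut self, subject: &str) {
        // Assumption: a simple prefix check stands in for real identifier parsing.
        if subject.starts_with("did:") {
            if let Some(handle) = self.did_to_handle.remove(subject) {
                self.handle_to_did.remove(&handle);
            }
        } else if let Some(did) = self.handle_to_did.remove(&subject.to_lowercase()) {
            self.did_to_handle.remove(&did);
        }
    }

    /// Handle-to-DID lookup.
    fn resolve(&self, handle: &str) -> Option<&str> {
        self.handle_to_did
            .get(&handle.to_lowercase())
            .map(String::as_str)
    }

    /// DID-to-handle lookup.
    fn handle_for(&self, did: &str) -> Option<&str> {
        self.did_to_handle.get(did).map(String::as_str)
    }
}
```

With this sketch, calling `set("Alice.Example.com", "did:plc:abc")` makes both `resolve("alice.example.com")` and `handle_for("did:plc:abc")` succeed, and a later `purge("did:plc:abc")` clears both entries.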
+29 -1
README.md
··· 21 21 ## Features 22 22 23 23 - **Fast Handle Resolution**: Resolves AT Protocol handles to DIDs using DNS TXT records and HTTP well-known endpoints 24 + - **Bidirectional Caching**: Supports both handle-to-DID and DID-to-handle lookups with automatic cache synchronization 24 25 - **Multi-Layer Caching**: Flexible caching with three tiers: 25 26 - In-memory caching with configurable TTL (default: 600 seconds) 26 27 - Redis-backed persistent caching (default: 90-day TTL) 27 28 - SQLite-backed persistent caching (default: 90-day TTL) 29 + - **Jetstream Consumer**: Real-time cache updates from AT Protocol firehose: 30 + - Processes Account and Identity events 31 + - Automatically purges deleted/deactivated accounts 32 + - Updates handle-to-DID mappings in real-time 33 + - Comprehensive metrics for event processing 34 + - Automatic reconnection with backoff 28 35 - **HTTP Caching**: Client-side caching support with: 29 36 - ETag generation with configurable seed for cache invalidation 30 37 - Cache-Control headers with max-age, stale-while-revalidate, and stale-if-error directives ··· 39 46 - **Metrics & Monitoring**: 40 47 - StatsD metrics support for counters, gauges, and timings 41 48 - Resolution timing measurements 49 + - Jetstream event processing metrics 42 50 - Configurable tags for environment/service identification 43 51 - Integration guides for Telegraf and TimescaleDB 44 52 - Configurable bind address for StatsD UDP socket (IPv4/IPv6) ··· 51 59 - Redis-based deduplication for queue items 52 60 - Prevents duplicate handle resolution work 53 61 - Configurable TTL for deduplication keys 62 + - **Cache Management APIs**: 63 + - `purge` method for removing entries by handle or DID 64 + - `set` method for manually updating handle-to-DID mappings 65 + - Chainable operations across resolver layers 54 66 - **AT Protocol Compatible**: Implements XRPC endpoints for seamless integration with AT Protocol infrastructure 55 67 - **Comprehensive Error Handling**: 
Structured errors with unique identifiers (e.g., `error-quickdid-config-1`), health checks, and graceful shutdown 56 68 - **12-Factor App**: Environment-based configuration following cloud-native best practices ··· 164 176 - `PROACTIVE_REFRESH_ENABLED`: Enable proactive cache refreshing (default: false) 165 177 - `PROACTIVE_REFRESH_THRESHOLD`: Refresh when TTL remaining is below this threshold (0.0-1.0, default: 0.8) 166 178 179 + #### Jetstream Consumer 180 + - `JETSTREAM_ENABLED`: Enable Jetstream consumer for real-time cache updates (default: false) 181 + - `JETSTREAM_HOSTNAME`: Jetstream WebSocket hostname (default: jetstream.atproto.tools) 182 + 167 183 #### Static Files 168 184 - `STATIC_FILES_DIR`: Directory for serving static files (default: www) 169 185 ··· 172 188 173 189 ### Production Examples 174 190 175 - #### Redis-based with Metrics (Multi-instance/HA) 191 + #### Redis-based with Metrics and Jetstream (Multi-instance/HA) 176 192 ```bash 177 193 HTTP_EXTERNAL=quickdid.example.com \ 178 194 HTTP_PORT=3000 \ ··· 187 203 METRICS_PREFIX=quickdid \ 188 204 METRICS_TAGS=env:prod,service:quickdid \ 189 205 CACHE_MAX_AGE=86400 \ 206 + JETSTREAM_ENABLED=true \ 207 + JETSTREAM_HOSTNAME=jetstream.atproto.tools \ 190 208 RUST_LOG=info \ 191 209 ./target/release/quickdid 192 210 ``` ··· 213 231 ↓ ↓ ↓ ↓ 214 232 Memory/Redis/ Background Semaphore AT Protocol 215 233 SQLite Refresher (optional) Infrastructure 234 + 235 + Jetstream Consumer ← Real-time Updates from AT Protocol Firehose 216 236 ``` 217 237 218 238 ### Cache Priority ··· 221 241 2. SQLite (if configured) - Best for single-instance with persistence 222 242 3. 
In-memory (fallback) - Always available 223 243 244 + ### Real-time Cache Updates 245 + When Jetstream is enabled, QuickDID maintains cache consistency by: 246 + - Processing Account events to purge deleted/deactivated accounts 247 + - Processing Identity events to update handle-to-DID mappings 248 + - Automatically reconnecting with exponential backoff on connection failures 249 + - Tracking metrics for successful and failed event processing 250 + 224 251 ### Deployment Strategies 225 252 226 253 - **Single-instance**: Use SQLite for both caching and queuing 227 254 - **Multi-instance/HA**: Use Redis for distributed caching and queuing 228 255 - **Development**: Use in-memory caching with MPSC queuing 256 + - **Real-time sync**: Enable Jetstream consumer for live cache updates 229 257 230 258 ## API Endpoints 231 259
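The event handling summarized in the README changes above (Account events purge deleted or deactivated accounts, Identity events update the handle-to-DID mapping) can be sketched as a pure decision function. The `JetstreamEvent` and `CacheAction` types below are hypothetical simplifications for illustration; they are not the types exposed by `atproto-jetstream` or used in `src/jetstream_handler.rs`. Ignoring other account statuses follows the configuration reference in this same changeset.

```rust
/// Hypothetical, simplified view of the two event kinds the consumer cares about.
enum JetstreamEvent {
    Account { did: String, status: Option<String> },
    Identity { did: String, handle: Option<String> },
}

/// What the cache layer should do in response to an event.
#[derive(Debug, PartialEq)]
enum CacheAction {
    /// Remove the handle and DID mappings for this subject from all caches.
    Purge { subject: String },
    /// Store the mapping in both directions.
    Set { handle: String, did: String },
    /// Nothing to do for this event.
    Ignore,
}

fn plan_action(event: &JetstreamEvent) -> CacheAction {
    match event {
        // Deleted or deactivated accounts are purged; other statuses are ignored.
        JetstreamEvent::Account { did, status } => match status.as_deref() {
            Some("deleted") | Some("deactivated") => CacheAction::Purge {
                subject: did.clone(),
            },
            _ => CacheAction::Ignore,
        },
        // Identity events carrying a handle refresh the mapping (lowercased).
        JetstreamEvent::Identity {
            did,
            handle: Some(handle),
        } => CacheAction::Set {
            handle: handle.to_lowercase(),
            did: did.clone(),
        },
        // Assumption: an identity event without a handle leaves the cache untouched.
        JetstreamEvent::Identity { handle: None, .. } => CacheAction::Ignore,
    }
}
```

Keeping the decision separate from the cache calls is a design choice of this sketch: it makes the mapping from events to `purge` and `set` easy to unit test without a WebSocket connection.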
+119 -2
docs/configuration-reference.md
··· 10 10 - [Queue Configuration](#queue-configuration) 11 11 - [Rate Limiting Configuration](#rate-limiting-configuration) 12 12 - [HTTP Caching Configuration](#http-caching-configuration) 13 + - [Metrics Configuration](#metrics-configuration) 14 + - [Proactive Refresh Configuration](#proactive-refresh-configuration) 15 + - [Jetstream Consumer Configuration](#jetstream-consumer-configuration) 16 + - [Static Files Configuration](#static-files-configuration) 13 17 - [Configuration Examples](#configuration-examples) 14 18 - [Validation Rules](#validation-rules) 15 19 ··· 761 765 - TTL=3600s (1 hour), threshold=0.8: Refresh after 48 minutes 762 766 - TTL=86400s (1 day), threshold=0.8: Refresh after 19.2 hours 763 767 768 + ## Jetstream Consumer Configuration 769 + 770 + ### `JETSTREAM_ENABLED` 771 + 772 + **Required**: No 773 + **Type**: Boolean 774 + **Default**: `false` 775 + 776 + Enable Jetstream consumer for real-time cache updates from the AT Protocol firehose. When enabled, QuickDID connects to the Jetstream WebSocket service to receive live updates about account and identity changes. 
777 + 778 + **How it works**: 779 + - Subscribes to Account and Identity events from the firehose 780 + - Processes Account events to purge deleted/deactivated accounts 781 + - Processes Identity events to update handle-to-DID mappings 782 + - Automatically reconnects with exponential backoff on connection failures 783 + - Tracks metrics for successful and failed event processing 784 + 785 + **Examples**: 786 + ```bash 787 + # Enable Jetstream consumer (recommended for production) 788 + JETSTREAM_ENABLED=true 789 + 790 + # Disable Jetstream consumer (default) 791 + JETSTREAM_ENABLED=false 792 + ``` 793 + 794 + **Benefits**: 795 + - Real-time cache synchronization with AT Protocol network 796 + - Automatic removal of deleted/deactivated accounts 797 + - Immediate handle change updates 798 + - Reduces stale data in cache 799 + 800 + **Considerations**: 801 + - Requires stable WebSocket connection 802 + - Increases network traffic (incoming events) 803 + - Best for services requiring up-to-date handle mappings 804 + - Automatically handles reconnection on failures 805 + 806 + ### `JETSTREAM_HOSTNAME` 807 + 808 + **Required**: No 809 + **Type**: String 810 + **Default**: `jetstream.atproto.tools` 811 + 812 + The hostname of the Jetstream WebSocket service to connect to for real-time AT Protocol events. Only used when `JETSTREAM_ENABLED=true`. 
813 + 814 + **Examples**: 815 + ```bash 816 + # Production firehose (default) 817 + JETSTREAM_HOSTNAME=jetstream.atproto.tools 818 + 819 + # Staging environment 820 + JETSTREAM_HOSTNAME=jetstream-staging.atproto.tools 821 + 822 + # Local development firehose 823 + JETSTREAM_HOSTNAME=localhost:6008 824 + 825 + # Custom deployment 826 + JETSTREAM_HOSTNAME=jetstream.example.com 827 + ``` 828 + 829 + **Event Processing**: 830 + - **Account events**: 831 + - `status: deleted` → Purges handle and DID from all caches 832 + - `status: deactivated` → Purges handle and DID from all caches 833 + - Other statuses → Ignored 834 + 835 + - **Identity events**: 836 + - Updates handle-to-DID mapping in cache 837 + - Removes old handle mapping if changed 838 + - Maintains bidirectional cache consistency 839 + 840 + **Metrics Tracked** (when metrics are enabled): 841 + - `jetstream.events.received`: Total events received 842 + - `jetstream.events.processed`: Successfully processed events 843 + - `jetstream.events.failed`: Failed event processing 844 + - `jetstream.connections.established`: Successful connections 845 + - `jetstream.connections.failed`: Failed connection attempts 846 + 847 + **Reconnection Behavior**: 848 + - Initial retry delay: 1 second 849 + - Maximum retry delay: 60 seconds 850 + - Exponential backoff with jitter 851 + - Automatic recovery on transient failures 852 + 853 + **Recommendations**: 854 + - **Production**: Use default `jetstream.atproto.tools` 855 + - **Development**: Consider local firehose for testing 856 + - **High availability**: Monitor connection metrics 857 + - **Network issues**: Check WebSocket connectivity 858 + 764 859 ## Static Files Configuration 765 860 766 861 ### `STATIC_FILES_DIR` ··· 1026 1121 PROACTIVE_REFRESH_ENABLED=true 1027 1122 PROACTIVE_REFRESH_THRESHOLD=0.8 1028 1123 1124 + # Jetstream Consumer (optional, recommended for real-time sync) 1125 + JETSTREAM_ENABLED=true 1126 + JETSTREAM_HOSTNAME=jetstream.atproto.tools 1127 + 1029 
1128 # HTTP Caching (Cache-Control headers) 1030 1129 CACHE_MAX_AGE=86400 # 24 hours 1031 1130 CACHE_STALE_IF_ERROR=172800 # 48 hours ··· 1059 1158 # Rate Limiting (optional, recommended for production) 1060 1159 RESOLVER_MAX_CONCURRENT=100 1061 1160 RESOLVER_MAX_CONCURRENT_TIMEOUT_MS=5000 # 5 second timeout 1161 + 1162 + # Jetstream Consumer (optional, recommended for real-time sync) 1163 + JETSTREAM_ENABLED=true 1164 + JETSTREAM_HOSTNAME=jetstream.atproto.tools 1062 1165 1063 1166 # HTTP Caching (Cache-Control headers) 1064 1167 CACHE_MAX_AGE=86400 # 24 hours ··· 1110 1213 # Proactive Refresh (recommended for HA) 1111 1214 PROACTIVE_REFRESH_ENABLED=true 1112 1215 PROACTIVE_REFRESH_THRESHOLD=0.7 # More aggressive for HA 1216 + 1217 + # Jetstream Consumer (recommended for real-time sync in HA) 1218 + JETSTREAM_ENABLED=true 1219 + JETSTREAM_HOSTNAME=jetstream.atproto.tools 1113 1220 1114 1221 # Logging 1115 1222 RUST_LOG=warn ··· 1157 1264 CACHE_TTL_REDIS: 86400 1158 1265 QUEUE_ADAPTER: redis 1159 1266 QUEUE_REDIS_TIMEOUT: 5 1267 + JETSTREAM_ENABLED: true 1268 + JETSTREAM_HOSTNAME: jetstream.atproto.tools 1160 1269 RUST_LOG: info 1161 1270 ports: 1162 1271 - "8080:8080" ··· 1186 1295 QUEUE_ADAPTER: sqlite 1187 1296 QUEUE_BUFFER_SIZE: 5000 1188 1297 QUEUE_SQLITE_MAX_SIZE: 10000 1298 + JETSTREAM_ENABLED: true 1299 + JETSTREAM_HOSTNAME: jetstream.atproto.tools 1189 1300 RUST_LOG: info 1190 1301 ports: 1191 1302 - "8080:8080" ··· 1281 1392 2. **Single-instance deployments**: Use SQLite for persistent caching and queuing 1282 1393 3. **Development/testing**: Use memory-only caching with MPSC queuing 1283 1394 4. **Hybrid setups**: Configure both Redis and SQLite for redundancy 1284 - 5. **Queue adapter guidelines**: 1395 + 5. **Real-time sync**: Enable Jetstream consumer for live cache updates 1396 + 6. 
**Queue adapter guidelines**: 1285 1397 - Redis: Best for multi-instance deployments with distributed processing 1286 1398 - SQLite: Best for single-instance deployments needing persistence 1287 1399 - MPSC: Best for single-instance deployments without persistence needs 1288 - 6. **Cache TTL guidelines**: 1400 + 7. **Cache TTL guidelines**: 1289 1401 - Redis: Shorter TTLs (1-7 days) for frequently updated handles 1290 1402 - SQLite: Longer TTLs (7-90 days) for stable single-instance caching 1291 1403 - Memory: Short TTLs (5-30 minutes) as fallback 1404 + 8. **Jetstream guidelines**: 1405 + - Production: Enable for real-time cache synchronization 1406 + - High-traffic: Essential for reducing stale data 1407 + - Development: Can be disabled for simpler testing 1408 + - Monitor WebSocket connection health in production 1292 1409 1293 1410 ### Monitoring 1294 1411
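The reconnection behavior listed in the configuration reference above (1 second initial delay, 60 second cap, exponential backoff with jitter) can be sketched as follows. The function names and the jitter formula are assumptions made for illustration: the documentation does not say which jitter scheme QuickDID uses, so this sketch scales each delay into the range of 50% to 100% of the capped exponential value, with the random fraction supplied by the caller.

```rust
use std::time::Duration;

const INITIAL_DELAY: Duration = Duration::from_secs(1);
const MAX_DELAY: Duration = Duration::from_secs(60);

/// Capped exponential delay for a 0-based reconnect attempt, without jitter.
fn base_delay(attempt: u32) -> Duration {
    // 2^attempt, saturating instead of overflowing for large attempt counts.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    INITIAL_DELAY
        .checked_mul(factor)
        .unwrap_or(MAX_DELAY)
        .min(MAX_DELAY)
}

/// Delay with jitter applied. `jitter` is a caller-supplied random fraction
/// in [0.0, 1.0]; values outside that range are clamped.
/// Assumption: the jitter scheme here is illustrative, not QuickDID's own.
fn jittered_delay(attempt: u32, jitter: f64) -> Duration {
    let fraction = jitter.clamp(0.0, 1.0);
    base_delay(attempt).mul_f64(0.5 + 0.5 * fraction)
}
```

Under these assumptions the unjittered delays run 1, 2, 4, 8, 16, 32 seconds and then stay at 60 seconds. Passing the random fraction in as a parameter keeps the function deterministic and testable; a caller would draw it from whatever random source the service already uses.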
+33 -4
docs/production-deployment.md
··· 296 296 PROACTIVE_REFRESH_THRESHOLD=0.8 297 297 298 298 # ---------------------------------------------------------------------------- 299 + # JETSTREAM CONSUMER CONFIGURATION 300 + # ---------------------------------------------------------------------------- 301 + 302 + # Enable Jetstream consumer for real-time cache updates (default: false) 303 + # When enabled, connects to AT Protocol firehose for live updates 304 + # Processes Account events (deleted/deactivated) and Identity events (handle changes) 305 + # Automatically reconnects with exponential backoff on connection failures 306 + JETSTREAM_ENABLED=false 307 + 308 + # Jetstream WebSocket hostname (default: jetstream.atproto.tools) 309 + # The firehose service to connect to for real-time AT Protocol events 310 + # Examples: 311 + # - jetstream.atproto.tools (production firehose) 312 + # - jetstream-staging.atproto.tools (staging environment) 313 + # - localhost:6008 (local development) 314 + JETSTREAM_HOSTNAME=jetstream.atproto.tools 315 + 316 + # ---------------------------------------------------------------------------- 299 317 # STATIC FILES CONFIGURATION 300 318 # ---------------------------------------------------------------------------- 301 319 ··· 413 431 414 432 ## Docker Compose Setup 415 433 416 - ### Redis-based Production Setup 434 + ### Redis-based Production Setup with Jetstream 417 435 418 - Create a `docker-compose.yml` file for a complete production setup with Redis: 436 + Create a `docker-compose.yml` file for a complete production setup with Redis and optional Jetstream consumer: 419 437 420 438 ```yaml 421 439 version: '3.8' ··· 504 522 driver: local 505 523 ``` 506 524 507 - ### SQLite-based Single-Instance Setup 525 + ### SQLite-based Single-Instance Setup with Jetstream 508 526 509 - For single-instance deployments without Redis, create a simpler `docker-compose.sqlite.yml`: 527 + For single-instance deployments without Redis, create a simpler `docker-compose.sqlite.yml` with 
optional Jetstream consumer: 510 528 511 529 ```yaml 512 530 version: '3.8' ··· 524 542 QUEUE_ADAPTER: sqlite 525 543 QUEUE_BUFFER_SIZE: 5000 526 544 QUEUE_SQLITE_MAX_SIZE: 10000 545 + # Optional: Enable Jetstream for real-time cache updates 546 + # JETSTREAM_ENABLED: true 547 + # JETSTREAM_HOSTNAME: jetstream.atproto.tools 527 548 RUST_LOG: info 528 549 ports: 529 550 - "8080:8080" ··· 905 926 2. **SQLite** (persistent, best for single-instance) 906 927 3. **Memory** (fast, but lost on restart) 907 928 929 + **Real-time Updates with Jetstream**: When `JETSTREAM_ENABLED=true`, QuickDID: 930 + - Connects to AT Protocol firehose for live cache updates 931 + - Processes Account events to purge deleted/deactivated accounts 932 + - Processes Identity events to update handle-to-DID mappings 933 + - Automatically reconnects with exponential backoff on failures 934 + - Tracks metrics for successful and failed event processing 935 + 908 936 **Recommendations by Deployment Type**: 909 937 - **Single instance, persistent**: Use SQLite for both caching and queuing (`SQLITE_URL=sqlite:./quickdid.db`, `QUEUE_ADAPTER=sqlite`) 910 938 - **Multi-instance, HA**: Use Redis for both caching and queuing (`REDIS_URL=redis://redis:6379/0`, `QUEUE_ADAPTER=redis`) 939 + - **Real-time sync**: Enable Jetstream consumer (`JETSTREAM_ENABLED=true`) for live cache updates 911 940 - **Testing/development**: Use memory-only caching with MPSC queuing (`QUEUE_ADAPTER=mpsc`) 912 941 - **Hybrid**: Configure both Redis and SQLite for redundancy 913 942
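The two Jetstream variables documented in the deployment guide above, `JETSTREAM_ENABLED` (default `false`) and `JETSTREAM_HOSTNAME` (default `jetstream.atproto.tools`), could be read along these lines. This is a minimal sketch, not QuickDID's configuration code: `JetstreamSettings` and both functions are hypothetical, and the set of values treated as "enabled" (`true` or `1`, case-insensitive) is an assumption.

```rust
/// Hypothetical holder for the two Jetstream settings.
#[derive(Debug, PartialEq)]
struct JetstreamSettings {
    enabled: bool,
    hostname: String,
}

const DEFAULT_JETSTREAM_HOSTNAME: &str = "jetstream.atproto.tools";

/// Build settings from raw values, applying the documented defaults.
fn jetstream_settings(enabled: Option<&str>, hostname: Option<&str>) -> JetstreamSettings {
    // Assumption: only "true" and "1" enable the consumer; anything else disables it.
    let enabled = matches!(
        enabled.map(|v| v.trim().to_ascii_lowercase()).as_deref(),
        Some("true") | Some("1")
    );
    // An unset or blank hostname falls back to the documented default.
    let hostname = hostname
        .map(str::trim)
        .filter(|h| !h.is_empty())
        .unwrap_or(DEFAULT_JETSTREAM_HOSTNAME)
        .to_string();
    JetstreamSettings { enabled, hostname }
}

/// Read the same settings from the process environment.
fn jetstream_settings_from_env() -> JetstreamSettings {
    let enabled = std::env::var("JETSTREAM_ENABLED").ok();
    let hostname = std::env::var("JETSTREAM_HOSTNAME").ok();
    jetstream_settings(enabled.as_deref(), hostname.as_deref())
}
```

Splitting the parsing from the environment access keeps the default handling testable without mutating process-wide state.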
+4 -1
src/bin/quickdid.rs
···
633 633   compression: false,
634 634   zstd_dictionary_location: String::new(),
635 635   jetstream_hostname: jetstream_hostname.clone(),
636     - collections: vec![], // Listen to all collections
    636 + // Listen to the "community.lexicon.collection.fake" collection
    637 + // so that we keep an active connection open but only for
    638 + // account and identity events.
    639 + collections: vec!["community.lexicon.collection.fake".to_string()],
637 640   dids: vec![],
638 641   max_message_size_bytes: None,
639 642   cursor: None,