QuickDID is a high-performance AT Protocol identity resolution service written in Rust. It provides handle-to-DID resolution with Redis-backed caching and queue processing.
# QuickDID Production Deployment Guide

This guide describes how to deploy QuickDID in a production environment using Docker. QuickDID supports multiple caching strategies: Redis (distributed), SQLite (single-instance), or in-memory caching.

## Table of Contents

- [Prerequisites](#prerequisites)
- [Environment Configuration](#environment-configuration)
- [Docker Deployment](#docker-deployment)
- [Docker Compose Setup](#docker-compose-setup)
- [Health Monitoring](#health-monitoring)
- [Security Considerations](#security-considerations)
- [Troubleshooting](#troubleshooting)
- [Maintenance](#maintenance)
- [Performance Optimization](#performance-optimization)
- [Configuration Validation](#configuration-validation)
- [Support and Resources](#support-and-resources)

## Prerequisites

- Docker 20.10.0 or higher
- Docker Compose 2.0.0 or higher (optional, for multi-container setup)
- Redis 6.0 or higher (optional, for persistent caching and queue management)
- SQLite 3.35 or higher (optional, alternative to Redis for single-instance caching)
- Valid SSL certificates for HTTPS (recommended for production)
- Domain name configured with appropriate DNS records

## Environment Configuration

Create a `.env` file in your deployment directory with the following configuration:

```bash
# ============================================================================
# QuickDID Production Environment Configuration
# ============================================================================

# ----------------------------------------------------------------------------
# REQUIRED CONFIGURATION
# ----------------------------------------------------------------------------

# External hostname for service endpoints
# This should be your public domain name with port if non-standard
# Examples:
# - quickdid.example.com
# - quickdid.example.com:8080
# - localhost:3007 (for testing only)
HTTP_EXTERNAL=quickdid.example.com

# ----------------------------------------------------------------------------
# NETWORK CONFIGURATION
# ----------------------------------------------------------------------------

# HTTP server port (default: 8080)
# This is the port the service will bind to inside the container
# Map this to your desired external port in docker-compose.yml
HTTP_PORT=8080

# PLC directory hostname (default: plc.directory)
# Change this if using a custom PLC directory or testing environment
PLC_HOSTNAME=plc.directory

# ----------------------------------------------------------------------------
# CACHING CONFIGURATION
# ----------------------------------------------------------------------------

# Redis connection URL for caching (recommended for production)
# Format: redis://[username:password@]host:port/database
# Examples:
# - redis://localhost:6379/0 (local Redis, no auth)
# - redis://user:pass@redis.example.com:6379/0 (remote with auth)
# - redis://redis:6379/0 (Docker network)
# - rediss://secure-redis.example.com:6380/0 (TLS)
# Benefits: Persistent cache, distributed caching, better performance
REDIS_URL=redis://redis:6379/0

# SQLite database URL for caching (alternative to Redis for single-instance deployments)
# Format: sqlite:path/to/database.db
# Examples:
# - sqlite:./quickdid.db (file-based database)
# - sqlite::memory: (in-memory database for testing)
# - sqlite:/var/lib/quickdid/cache.db (absolute path)
# Benefits: Persistent cache, single-file storage, no external dependencies
# Note: Cache priority is Redis > SQLite > Memory (first available is used)
# SQLITE_URL=sqlite:./quickdid.db

# TTL for in-memory cache in seconds (default: 600 = 10 minutes)
# Range: 60-3600 recommended
# Lower = fresher data, more DNS/HTTP lookups
# Higher = better performance, potentially stale data
CACHE_TTL_MEMORY=600

# TTL for Redis cache in seconds (default: 7776000 = 90 days)
# Range: 3600-31536000 (1 hour to 1 year)
# Recommendations:
# - 86400 (1 day) for frequently changing data
# - 604800 (1 week) for balanced performance
# - 7776000 (90 days) for stable data
CACHE_TTL_REDIS=86400

# TTL for SQLite cache in seconds (default: 7776000 = 90 days)
# Range: 3600-31536000 (1 hour to 1 year)
# Same recommendations as Redis TTL
# Only used when SQLITE_URL is configured
CACHE_TTL_SQLITE=86400

# ----------------------------------------------------------------------------
# QUEUE CONFIGURATION
# ----------------------------------------------------------------------------

# Queue adapter type: 'mpsc', 'redis', 'sqlite', 'noop', or 'none' (default: mpsc)
# - 'mpsc': In-memory queue for single-instance deployments
# - 'redis': Distributed queue for multi-instance or HA deployments
# - 'sqlite': Persistent queue for single-instance deployments
# - 'noop': Disable queue processing (testing only)
# - 'none': Alias for 'noop'
QUEUE_ADAPTER=redis

# Redis URL for queue adapter (uses REDIS_URL if not set)
# Set this if you want to use a separate Redis instance for queuing
# QUEUE_REDIS_URL=redis://queue-redis:6379/1

# Redis key prefix for queues (default: queue:handleresolver:)
# Useful when sharing Redis instance with other services
QUEUE_REDIS_PREFIX=queue:quickdid:prod:

# Redis blocking timeout for queue operations in seconds (default: 5)
# Range: 1-60 recommended
# Lower = more responsive to shutdown, more polling
# Higher = less polling overhead, slower shutdown
QUEUE_REDIS_TIMEOUT=5

# Enable deduplication for Redis queue to prevent duplicate handles (default: false)
# When enabled, uses Redis SET with TTL to track handles being processed
# Prevents the same handle from being queued multiple times within the TTL window
QUEUE_REDIS_DEDUP_ENABLED=false

# TTL for Redis queue deduplication keys in seconds (default: 60)
# Range: 10-300 recommended
# Determines how long to prevent duplicate handle resolution requests
QUEUE_REDIS_DEDUP_TTL=60

# Worker ID for Redis queue (defaults to "worker1")
# Set this for predictable worker identification in multi-instance deployments
# Examples: worker-001, prod-us-east-1, $(hostname)
QUEUE_WORKER_ID=prod-worker-1

# Buffer size for MPSC queue (default: 1000)
# Range: 100-100000
# Increase for high-traffic deployments using MPSC adapter
QUEUE_BUFFER_SIZE=5000

# Maximum queue size for SQLite adapter work shedding (default: 10000)
# Range: 100-1000000 (recommended)
# When exceeded, oldest entries are deleted to maintain this limit
# Set to 0 to disable work shedding (unlimited queue size)
# Benefits: Prevents unbounded disk usage, maintains recent work items
QUEUE_SQLITE_MAX_SIZE=10000

# ----------------------------------------------------------------------------
# HTTP CLIENT CONFIGURATION
# ----------------------------------------------------------------------------

# HTTP User-Agent header
# Identifies your service to other AT Protocol services
# Default: Auto-generated with current version from Cargo.toml
# Format: quickdid/{version} (+https://github.com/smokesignal.events/quickdid)
USER_AGENT=quickdid/1.0.0-rc.5 (+https://quickdid.example.com)

# Custom DNS nameservers (comma-separated)
# Use for custom DNS resolution or to bypass local DNS
# Examples:
# - 8.8.8.8,8.8.4.4 (Google DNS)
# - 1.1.1.1,1.0.0.1 (Cloudflare DNS)
# DNS_NAMESERVERS=1.1.1.1,1.0.0.1

# Additional CA certificates (comma-separated file paths)
# Use when connecting to services with custom CA certificates
# CERTIFICATE_BUNDLES=/certs/custom-ca.pem,/certs/internal-ca.pem

# ----------------------------------------------------------------------------
# LOGGING AND MONITORING
# ----------------------------------------------------------------------------

# Logging level (debug, info, warn, error)
# Use 'info' for production, 'debug' for troubleshooting
RUST_LOG=info

# Structured logging format (optional)
# Set to 'json' for machine-readable logs
# RUST_LOG_FORMAT=json

# ----------------------------------------------------------------------------
# RATE LIMITING CONFIGURATION
# ----------------------------------------------------------------------------

# Maximum concurrent handle resolutions (default: 0 = disabled)
# When > 0, enables semaphore-based rate limiting
# Range: 0-10000 (0 = disabled)
# Protects upstream DNS/HTTP services from being overwhelmed
RESOLVER_MAX_CONCURRENT=0

# Timeout for acquiring rate limit permit in milliseconds (default: 0 = no timeout)
# When > 0, requests will timeout if they can't acquire a permit within this time
# Range: 0-60000 (max 60 seconds)
# Prevents requests from waiting indefinitely when rate limiter is at capacity
RESOLVER_MAX_CONCURRENT_TIMEOUT_MS=0

# ----------------------------------------------------------------------------
# HTTP CACHING CONFIGURATION
# ----------------------------------------------------------------------------

# ETAG seed for cache invalidation (default: application version)
# Used to generate ETAG checksums for HTTP responses
# Changing this value invalidates all client-cached responses
# Examples:
# - prod-2024-01-15 (deployment-specific)
# - v1.0.0-1705344000 (version with timestamp)
# - config-update-2024-01-15 (after configuration changes)
# Default uses the application version from Cargo.toml
# ETAG_SEED=prod-2024-01-15

# Maximum age for HTTP Cache-Control header in seconds (default: 86400 = 24 hours)
# Set to 0 to disable Cache-Control header
# Controls how long clients and intermediate caches can cache responses
CACHE_MAX_AGE=86400

# Stale-if-error directive for Cache-Control in seconds (default: 172800 = 48 hours)
# Allows stale content to be served if backend errors occur
# Provides resilience during service outages
CACHE_STALE_IF_ERROR=172800

# Stale-while-revalidate directive for Cache-Control in seconds (default: 86400 = 24 hours)
# Allows stale content to be served while fetching fresh content in background
# Improves perceived performance for users
CACHE_STALE_WHILE_REVALIDATE=86400

# Max-stale directive for Cache-Control in seconds (default: 172800 = 48 hours)
# Maximum time client will accept stale responses
# Provides upper bound on cached content age
CACHE_MAX_STALE=172800

# Min-fresh directive for Cache-Control in seconds (default: 3600 = 1 hour)
# Minimum time response must remain fresh
# Clients won't accept responses expiring within this time
CACHE_MIN_FRESH=3600

# ----------------------------------------------------------------------------
# METRICS CONFIGURATION
# ----------------------------------------------------------------------------

# Metrics adapter type: 'noop' or 'statsd' (default: noop)
# - 'noop': No metrics collection (default)
# - 'statsd': Send metrics to StatsD server
METRICS_ADAPTER=statsd

# StatsD host and port (required when METRICS_ADAPTER=statsd)
# Format: hostname:port
# Examples:
# - localhost:8125 (local StatsD)
# - statsd.example.com:8125 (remote StatsD)
METRICS_STATSD_HOST=localhost:8125

# Bind address for StatsD UDP socket (default: [::]:0)
# Controls which local address to bind for sending UDP packets
# Examples:
# - [::]:0 (IPv6 any address, random port - default)
# - 0.0.0.0:0 (IPv4 any address, random port)
# - 192.168.1.100:0 (specific interface)
METRICS_STATSD_BIND=[::]:0

# Prefix for all metrics (default: quickdid)
# Used to namespace metrics in your monitoring system
# Examples:
# - quickdid (default)
# - prod.quickdid
# - us-east-1.quickdid
METRICS_PREFIX=quickdid

# Tags for all metrics (comma-separated key:value pairs)
# Added to all metrics for filtering and grouping
# Examples:
# - env:production,service:quickdid
# - env:staging,region:us-east-1,version:1.0.0
METRICS_TAGS=env:production,service:quickdid

# ----------------------------------------------------------------------------
# PROACTIVE REFRESH CONFIGURATION
# ----------------------------------------------------------------------------

# Enable proactive cache refresh (default: false)
# When enabled, cache entries nearing expiration are automatically refreshed
# in the background to prevent cache misses for frequently accessed handles
PROACTIVE_REFRESH_ENABLED=false

# Threshold for proactive refresh as percentage of TTL (default: 0.8)
# Range: 0.0-1.0 (0% to 100% of TTL)
# Example: 0.8 means refresh when 80% of TTL has elapsed
# Lower values = more aggressive refreshing, higher load
# Higher values = less aggressive refreshing, more cache misses
PROACTIVE_REFRESH_THRESHOLD=0.8

# ----------------------------------------------------------------------------
# JETSTREAM CONSUMER CONFIGURATION
# ----------------------------------------------------------------------------

# Enable Jetstream consumer for real-time cache updates (default: false)
# When enabled, connects to AT Protocol firehose for live updates
# Processes Account events (deleted/deactivated) and Identity events (handle changes)
# Automatically reconnects with exponential backoff on connection failures
JETSTREAM_ENABLED=false

# Jetstream WebSocket hostname (default: jetstream.atproto.tools)
# The firehose service to connect to for real-time AT Protocol events
# Examples:
# - jetstream.atproto.tools (production firehose)
# - jetstream-staging.atproto.tools (staging environment)
# - localhost:6008 (local development)
JETSTREAM_HOSTNAME=jetstream.atproto.tools

# ----------------------------------------------------------------------------
# STATIC FILES CONFIGURATION
# ----------------------------------------------------------------------------

# Directory path for serving static files (default: www)
# This directory should contain:
# - index.html (landing page)
# - .well-known/atproto-did (service DID identifier)
# - .well-known/did.json (DID document)
# In Docker, this defaults to /app/www
# You can mount custom files via Docker volumes
STATIC_FILES_DIR=/app/www

# ----------------------------------------------------------------------------
# PERFORMANCE TUNING
# ----------------------------------------------------------------------------

# Tokio runtime worker threads (defaults to CPU count)
# Adjust based on your container's CPU allocation
# TOKIO_WORKER_THREADS=4

# Maximum concurrent connections (optional)
# Helps prevent resource exhaustion
# MAX_CONNECTIONS=10000

# ----------------------------------------------------------------------------
# DOCKER-SPECIFIC CONFIGURATION
# ----------------------------------------------------------------------------

# Container restart policy (for docker-compose)
# Options: no, always, on-failure, unless-stopped
RESTART_POLICY=unless-stopped

# Resource limits (for docker-compose)
# Adjust based on your available resources
MEMORY_LIMIT=512M
CPU_LIMIT=1.0
```

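To make the `PROACTIVE_REFRESH_THRESHOLD` comment concrete: with a threshold of 0.8, an entry becomes eligible for background refresh once 80% of its TTL has elapsed. The sketch below works out that age for the TTLs configured above. The helper name `refresh_after` is hypothetical and only illustrates the arithmetic; it is not part of QuickDID.

```bash
# refresh_after TTL_SECONDS THRESHOLD
# Prints the entry age (in whole seconds) at which a proactive refresh
# would become due: TTL * THRESHOLD, rounded to the nearest second.
refresh_after() {
    awk -v ttl="$1" -v threshold="$2" \
        'BEGIN { printf "%d\n", int(ttl * threshold + 0.5) }'
}

refresh_after 86400 0.8   # CACHE_TTL_REDIS above: prints 69120
refresh_after 600 0.8     # CACHE_TTL_MEMORY above: prints 480
```

With a one-day Redis TTL, entries would therefore be refreshed after 19.2 hours rather than expiring at 24 hours.
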
## Docker Deployment

### Building the Docker Image

Create a `Dockerfile` in your project root:

```dockerfile
# Build stage
FROM rust:1.75-slim AS builder

# Install build dependencies
RUN apt-get update && apt-get install -y \
    pkg-config \
    libssl-dev \
    && rm -rf /var/lib/apt/lists/*

# Create app directory
WORKDIR /app

# Copy source files
COPY Cargo.toml Cargo.lock ./
COPY src ./src

# Build the application
RUN cargo build --release

# Runtime stage
FROM debian:bookworm-slim

# Install runtime dependencies
RUN apt-get update && apt-get install -y \
    ca-certificates \
    libssl3 \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Create non-root user
RUN useradd -m -u 1000 quickdid

# Copy binary from builder
COPY --from=builder /app/target/release/quickdid /usr/local/bin/quickdid

# Set ownership and permissions
RUN chown quickdid:quickdid /usr/local/bin/quickdid

# Switch to non-root user
USER quickdid

# Expose default port
EXPOSE 8080

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8080/health || exit 1

# Run the application
ENTRYPOINT ["quickdid"]
```

Build the image:

```bash
docker build -t quickdid:latest .
```

### Running a Single Instance

```bash
# Run with environment file
docker run -d \
  --name quickdid \
  --env-file .env \
  -p 8080:8080 \
  --restart unless-stopped \
  quickdid:latest
```

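After `docker run -d` returns, the container may still be starting. A deployment script can wait for the `HEALTHCHECK` defined in the Dockerfile above to report success before continuing. The following is a minimal sketch: the function name `wait_healthy` and the `WAIT_INTERVAL` variable are made up for this example, while `docker inspect --format '{{.State.Health.Status}}'` is the standard way to read a container's health state.

```bash
# wait_healthy CONTAINER [MAX_TRIES]
# Polls the status reported by the image's HEALTHCHECK. Returns 0 once Docker
# reports "healthy", or 1 if that does not happen within MAX_TRIES polls.
# WAIT_INTERVAL (seconds between polls, default 2) exists only in this sketch.
wait_healthy() (
    container="$1"
    max_tries="${2:-30}"
    interval="${WAIT_INTERVAL:-2}"
    try=1
    while [ "$try" -le "$max_tries" ]; do
        status="$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null)"
        if [ "$status" = "healthy" ]; then
            echo "$container is healthy"
            exit 0
        fi
        # Sleep only when another poll will follow
        if [ "$try" -lt "$max_tries" ]; then
            sleep "$interval"
        fi
        try=$((try + 1))
    done
    echo "$container did not become healthy (last status: ${status:-unknown})" >&2
    exit 1
)
```

For example, `wait_healthy quickdid 30 || docker logs quickdid` would print the container logs if the service has not become healthy after roughly a minute.
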
## Docker Compose Setup

### Redis-based Production Setup with Jetstream

Create a `docker-compose.yml` file for a complete production setup with Redis and optional Jetstream consumer:

```yaml
version: '3.8'

services:
  quickdid:
    image: quickdid:latest
    container_name: quickdid
    env_file: .env
    ports:
      - "8080:8080"
    depends_on:
      redis:
        condition: service_healthy
    networks:
      - quickdid-network
    restart: ${RESTART_POLICY:-unless-stopped}
    deploy:
      resources:
        limits:
          memory: ${MEMORY_LIMIT:-512M}
          cpus: ${CPU_LIMIT:-1.0}
        reservations:
          memory: 256M
          cpus: '0.5'
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 3s
      retries: 3
      start_period: 10s
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

  redis:
    image: redis:7-alpine
    container_name: quickdid-redis
    command: redis-server --appendonly yes --maxmemory 256mb --maxmemory-policy allkeys-lru
    volumes:
      - redis-data:/data
    networks:
      - quickdid-network
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 3s
      retries: 3
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

  # Optional: Nginx reverse proxy with SSL
  nginx:
    image: nginx:alpine
    container_name: quickdid-nginx
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - ./certs:/etc/nginx/certs:ro
      - ./acme-challenge:/var/www/acme:ro
    depends_on:
      - quickdid
    networks:
      - quickdid-network
    restart: unless-stopped
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

networks:
  quickdid-network:
    driver: bridge

volumes:
  redis-data:
    driver: local
```

### SQLite-based Single-Instance Setup with Jetstream

For single-instance deployments without Redis, create a simpler `docker-compose.sqlite.yml` with optional Jetstream consumer:

```yaml
version: '3.8'

services:
  quickdid:
    image: quickdid:latest
    container_name: quickdid-sqlite
    environment:
      HTTP_EXTERNAL: quickdid.example.com
      HTTP_PORT: 8080
      SQLITE_URL: sqlite:/data/quickdid.db
      CACHE_TTL_MEMORY: 600
      CACHE_TTL_SQLITE: 86400
      QUEUE_ADAPTER: sqlite
      QUEUE_BUFFER_SIZE: 5000
      QUEUE_SQLITE_MAX_SIZE: 10000
      # Optional: Enable Jetstream for real-time cache updates
      # (quote boolean values so YAML keeps them as strings)
      # JETSTREAM_ENABLED: "true"
      # JETSTREAM_HOSTNAME: jetstream.atproto.tools
      RUST_LOG: info
    ports:
      - "8080:8080"
    volumes:
      - quickdid-data:/data
    networks:
      - quickdid-network
    restart: ${RESTART_POLICY:-unless-stopped}
    deploy:
      resources:
        limits:
          memory: ${MEMORY_LIMIT:-256M}
          cpus: ${CPU_LIMIT:-0.5}
        reservations:
          memory: 128M
          cpus: '0.25'
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 3s
      retries: 3
      start_period: 10s
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

  # Optional: Nginx reverse proxy with SSL
  nginx:
    image: nginx:alpine
    container_name: quickdid-nginx
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - ./certs:/etc/nginx/certs:ro
      - ./acme-challenge:/var/www/acme:ro
    depends_on:
      - quickdid
    networks:
      - quickdid-network
    restart: unless-stopped
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

networks:
  quickdid-network:
    driver: bridge

volumes:
  quickdid-data:
    driver: local
```

### Nginx Configuration (nginx.conf)

```nginx
events {
    worker_connections 1024;
}

http {
    upstream quickdid {
        server quickdid:8080;
    }

    server {
        listen 80;
        server_name quickdid.example.com;

        # ACME challenge for Let's Encrypt
        location /.well-known/acme-challenge/ {
            root /var/www/acme;
        }

        # Redirect HTTP to HTTPS
        location / {
            return 301 https://$server_name$request_uri;
        }
    }

    server {
        listen 443 ssl http2;
        server_name quickdid.example.com;

        ssl_certificate /etc/nginx/certs/fullchain.pem;
        ssl_certificate_key /etc/nginx/certs/privkey.pem;
        ssl_protocols TLSv1.2 TLSv1.3;
        ssl_ciphers HIGH:!aNULL:!MD5;

        location / {
            proxy_pass http://quickdid;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            # WebSocket support (if needed)
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";

            # Timeouts
            proxy_connect_timeout 60s;
            proxy_send_timeout 60s;
            proxy_read_timeout 60s;
        }

        # Health check endpoint
        location /health {
            proxy_pass http://quickdid/health;
            access_log off;
        }
    }
}
```

### Starting the Stack

```bash
# Start Redis-based stack
docker-compose up -d

# Start SQLite-based stack
docker-compose -f docker-compose.sqlite.yml up -d

# View logs
docker-compose logs -f
# or for SQLite setup
docker-compose -f docker-compose.sqlite.yml logs -f

# Check service status
docker-compose ps

# Stop all services
docker-compose down
# or for SQLite setup
docker-compose -f docker-compose.sqlite.yml down
```

## Health Monitoring

QuickDID provides health check endpoints for monitoring:

### Basic Health Check

```bash
curl http://quickdid.example.com/health
```

Expected response:
```json
{
  "status": "healthy",
  "version": "1.0.0",
  "uptime_seconds": 3600
}
```

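For cron jobs or external monitors, it can be useful to turn that response into an exit code. The sketch below assumes the response has the shape shown above, with a string `status` field. The helper names `health_status` and `check_health` are hypothetical, and the `sed` extraction is deliberately simple; a JSON tool such as `jq` would be more robust where it is available.

```bash
# health_status: reads a /health JSON body on stdin and prints the value of
# its "status" field (prints nothing if the field is missing).
health_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# check_health URL: succeeds only when the endpoint answers within 3 seconds
# and reports "healthy".
check_health() {
    [ "$(curl -fsS --max-time 3 "$1" | health_status)" = "healthy" ]
}
```

A monitor could then run `check_health http://quickdid.example.com/health || echo "QuickDID is unhealthy"`.
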
### Monitoring with Prometheus (Optional)

Add to your `docker-compose.yml`:

```yaml
  # Add under the existing `services:` key
  prometheus:
    image: prom/prometheus:latest
    container_name: quickdid-prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus-data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
    ports:
      - "9090:9090"
    networks:
      - quickdid-network
    restart: unless-stopped

# Merge into the existing top-level `volumes:` key
volumes:
  prometheus-data:
    driver: local
```

## Security Considerations

### 1. Service Key Protection

- **Never commit** sensitive configuration to version control
- Store keys in a secure secret management system (e.g., HashiCorp Vault, AWS Secrets Manager)
- Rotate keys regularly
- Use different keys for different environments

### 2. Network Security

- Use HTTPS in production with valid SSL certificates
- Implement rate limiting at the reverse proxy level
- Use firewall rules to restrict access to Redis
- Enable Redis authentication in production

### 3. Container Security

- Run containers as non-root user (already configured in Dockerfile)
- Keep base images updated
- Scan images for vulnerabilities regularly
- Use read-only filesystems where possible

### 4. Redis Security

```bash
# Add to Redis configuration for production
requirepass your_strong_password_here
maxclients 10000
timeout 300
```

### 5. Environment Variables

- Use Docker secrets or external secret management
- Avoid logging sensitive environment variables
- Implement proper access controls

## Troubleshooting

### Common Issues and Solutions

#### 1. Container Won't Start

```bash
# Check logs
docker logs quickdid

# Verify environment variables
docker exec quickdid env | grep -E "HTTP_EXTERNAL|HTTP_PORT"

# Test that Redis is responding (redis-cli ships in the Redis container, not in the QuickDID image)
docker exec quickdid-redis redis-cli ping
```

#### 2. Handle Resolution Failures

```bash
# Enable debug logging: set RUST_LOG=debug in .env, then recreate the container.
# (Running `export` through `docker exec` does not affect the running process.)
docker-compose up -d --force-recreate quickdid

# Check DNS resolution (getent is part of the Debian base image)
docker exec quickdid getent hosts plc.directory

# Verify Redis cache (if using Redis)
docker exec -it quickdid-redis redis-cli
> KEYS handle:*
> TTL handle:example_key

# Check SQLite cache (if using SQLite; requires the sqlite3 CLI in the image,
# which the Dockerfile above does not install)
docker exec quickdid sqlite3 /data/quickdid.db ".tables"
docker exec quickdid sqlite3 /data/quickdid.db "SELECT COUNT(*) FROM handle_resolution_cache;"
```

#### 3. Performance Issues

```bash
# Monitor Redis memory usage (if using Redis)
docker exec quickdid-redis redis-cli INFO memory

# Check SQLite database size (if using SQLite)
docker exec quickdid ls -lh /data/quickdid.db
docker exec quickdid sqlite3 /data/quickdid.db "PRAGMA page_count; PRAGMA page_size;"

# Check container resource usage
docker stats quickdid

# Analyze slow queries (with debug logging)
docker logs quickdid | grep "resolution took"
```

#### 4. Health Check Failures

```bash
# Manual health check
docker exec quickdid curl -v http://localhost:8080/health

# Check service binding (requires net-tools in the image, which the Dockerfile above does not install)
docker exec quickdid netstat -tlnp | grep 8080
```

### Debugging Commands

```bash
# Interactive shell in container
docker exec -it quickdid /bin/bash

# Test handle resolution
curl "http://localhost:8080/xrpc/com.atproto.identity.resolveHandle?handle=example.bsky.social"

# Check Redis keys (if using Redis)
docker exec quickdid-redis redis-cli --scan --pattern "handle:*" | head -20

# Check SQLite cache entries (if using SQLite)
docker exec quickdid sqlite3 /data/quickdid.db "SELECT COUNT(*) as total_entries, MIN(updated) as oldest, MAX(updated) as newest FROM handle_resolution_cache;"

# Check SQLite queue entries (if using SQLite queue adapter)
docker exec quickdid sqlite3 /data/quickdid.db "SELECT COUNT(*) as queue_entries, MIN(queued_at) as oldest, MAX(queued_at) as newest FROM handle_resolution_queue;"

# Monitor real-time logs
docker-compose logs -f quickdid | grep -E "ERROR|WARN"
```

## Maintenance

### Backup and Restore

#### Redis Backup
```bash
# Backup Redis data
mkdir -p ./backups
docker exec quickdid-redis redis-cli BGSAVE
# BGSAVE runs in the background; re-run LASTSAVE until its timestamp changes before copying
docker exec quickdid-redis redis-cli LASTSAVE
docker cp quickdid-redis:/data/dump.rdb ./backups/redis-$(date +%Y%m%d).rdb

# Restore Redis data (stop the container first so Redis cannot overwrite the file on shutdown)
# Note: with --appendonly yes, Redis loads the append-only file at startup if one exists,
# in preference to dump.rdb
docker stop quickdid-redis
docker cp ./backups/redis-backup.rdb quickdid-redis:/data/dump.rdb
docker start quickdid-redis
```

#### SQLite Backup
```bash
# Backup SQLite database
mkdir -p ./backups
docker exec quickdid sqlite3 /data/quickdid.db ".backup /tmp/backup.db"
docker cp quickdid:/tmp/backup.db ./backups/sqlite-$(date +%Y%m%d).db

# Alternative: Copy database file directly (service must be stopped)
docker-compose -f docker-compose.sqlite.yml stop quickdid
docker cp quickdid:/data/quickdid.db ./backups/sqlite-$(date +%Y%m%d).db
docker-compose -f docker-compose.sqlite.yml start quickdid

# Restore SQLite database
docker-compose -f docker-compose.sqlite.yml stop quickdid
docker cp ./backups/sqlite-backup.db quickdid:/data/quickdid.db
docker-compose -f docker-compose.sqlite.yml start quickdid
```

### Updates and Rollbacks

```bash
# Update to new version (keep the current image as a rollback target)
docker tag quickdid:latest quickdid:previous
docker pull quickdid:new-version
docker tag quickdid:new-version quickdid:latest
docker-compose down
docker-compose up -d

# Rollback if needed
docker-compose down
docker tag quickdid:previous quickdid:latest
docker-compose up -d
```

### Log Rotation

Configure Docker's built-in log rotation in `/etc/docker/daemon.json`. Restart the Docker daemon after editing the file; the settings apply only to containers created afterwards.

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
```

## Performance Optimization

### Caching Strategy Selection

**Cache Priority**: QuickDID uses the first available cache in this order:
1. **Redis** (distributed, best for multi-instance)
2. **SQLite** (persistent, best for single-instance)
3. **Memory** (fast, but lost on restart)

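The priority order above can be checked against a given environment before deployment. This is a rough sketch that treats "available" as "the URL variable is set and non-empty", which is an assumption; QuickDID itself may apply further checks. The function name `cache_backend` is made up for this example.

```bash
# cache_backend: prints which cache backend the documented priority
# (Redis > SQLite > Memory) would select for the current environment.
# Assumption: a backend counts as available when its URL variable is non-empty.
cache_backend() {
    if [ -n "${REDIS_URL:-}" ]; then
        echo redis
    elif [ -n "${SQLITE_URL:-}" ]; then
        echo sqlite
    else
        echo memory
    fi
}
```

Running `cache_backend` in a shell that has loaded the `.env` values shows at a glance whether a stray `REDIS_URL` is overriding an intended SQLite setup.
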
**Real-time Updates with Jetstream**: When `JETSTREAM_ENABLED=true`, QuickDID:
- Connects to AT Protocol firehose for live cache updates
- Processes Account events to purge deleted/deactivated accounts
- Processes Identity events to update handle-to-DID mappings
- Automatically reconnects with exponential backoff on failures
- Tracks metrics for successful and failed event processing

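As an illustration of what exponential backoff means for reconnect timing, the sketch below doubles the wait after each failed attempt and caps it at a maximum. The 1-second base and 60-second cap are illustrative assumptions, not QuickDID's actual settings, and `backoff_delay` is a hypothetical helper.

```bash
# backoff_delay ATTEMPT [BASE_SECONDS] [MAX_SECONDS]
# Prints the delay before reconnect attempt number ATTEMPT (0-based):
# BASE * 2^ATTEMPT, capped at MAX. Defaults are assumptions for this sketch.
backoff_delay() (
    attempt="$1"
    delay="${2:-1}"
    max="${3:-60}"
    i=0
    # Double the delay once per previous attempt, stopping early at the cap
    while [ "$i" -lt "$attempt" ] && [ "$delay" -lt "$max" ]; do
        delay=$((delay * 2))
        i=$((i + 1))
    done
    if [ "$delay" -gt "$max" ]; then
        delay="$max"
    fi
    echo "$delay"
)

# Delays for the first eight attempts: 1 2 4 8 16 32 60 60
for n in 0 1 2 3 4 5 6 7; do backoff_delay "$n"; done
```

The cap keeps a long outage from pushing the reconnect interval out indefinitely.
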
**Recommendations by Deployment Type**:
- **Single instance, persistent**: Use SQLite for both caching and queuing (`SQLITE_URL=sqlite:./quickdid.db`, `QUEUE_ADAPTER=sqlite`)
- **Multi-instance, HA**: Use Redis for both caching and queuing (`REDIS_URL=redis://redis:6379/0`, `QUEUE_ADAPTER=redis`)
- **Real-time sync**: Enable Jetstream consumer (`JETSTREAM_ENABLED=true`) for live cache updates
- **Testing/development**: Use memory-only caching with MPSC queuing (`QUEUE_ADAPTER=mpsc`)
- **Hybrid**: Configure both Redis and SQLite for redundancy

### Queue Strategy Selection

**Queue Adapter Options**:
1. **Redis** (`QUEUE_ADAPTER=redis`) - Distributed queuing, best for multi-instance deployments
2. **SQLite** (`QUEUE_ADAPTER=sqlite`) - Persistent queuing, best for single-instance deployments
3. **MPSC** (`QUEUE_ADAPTER=mpsc`) - In-memory queuing, lightweight for single-instance without persistence needs
4. **No-op** (`QUEUE_ADAPTER=noop`) - Disable queuing entirely (testing only)

### Redis Optimization

```redis
# Add to redis.conf or pass as command arguments
maxmemory 2gb
maxmemory-policy allkeys-lru
# Disable RDB snapshots for cache-only usage
# (redis.conf does not allow a comment on the same line as a directive)
save ""
tcp-keepalive 300
timeout 0
```

### System Tuning

```bash
# Add to host system's /etc/sysctl.conf
net.core.somaxconn = 1024
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 10000 65000
fs.file-max = 100000
```

## Configuration Validation

QuickDID validates all configuration at startup. The following rules are enforced:

### Required Fields

- **HTTP_EXTERNAL**: Must be provided

### Value Constraints

1. **TTL Values** (`CACHE_TTL_MEMORY`, `CACHE_TTL_REDIS`, `CACHE_TTL_SQLITE`):
   - Must be positive integers (> 0)
   - Recommended minimum: 60 seconds

2. **Timeout Values** (`QUEUE_REDIS_TIMEOUT`):
   - Must be positive integers (> 0)
   - Recommended range: 1-60 seconds

3. **Queue Adapter** (`QUEUE_ADAPTER`):
   - Must be one of: `mpsc`, `redis`, `sqlite`, `noop`, `none`
   - Case-sensitive

4. **Rate Limiting** (`RESOLVER_MAX_CONCURRENT`):
   - Must be between 0 and 10000
   - 0 = disabled (default)
   - When > 0, limits concurrent handle resolutions

5. **Rate Limiting Timeout** (`RESOLVER_MAX_CONCURRENT_TIMEOUT_MS`):
   - Must be between 0 and 60000 (milliseconds)
   - 0 = no timeout (default)
   - Maximum: 60000ms (60 seconds)

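The rules above can also be checked before the container starts, for example in a CI step. The following pre-flight sketch mirrors the listed constraints in plain shell. It is not QuickDID's own validator; the function names are made up for this example, and unset optional variables are skipped on the assumption that the documented defaults apply.

```bash
# is_uint VALUE: succeeds when VALUE is a non-empty string of digits.
is_uint() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# is_positive VALUE: succeeds when VALUE is an integer greater than 0.
is_positive() { is_uint "$1" && [ "$1" -gt 0 ]; }

# in_range VALUE MIN MAX: succeeds when VALUE is an integer within [MIN, MAX].
in_range() { is_uint "$1" && [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]; }

# check_config: prints one line per violated rule to stderr and returns
# non-zero if any rule fails.
check_config() (
    errors=0
    fail() { echo "invalid: $1" >&2; errors=$((errors + 1)); }

    [ -n "${HTTP_EXTERNAL:-}" ] || fail "HTTP_EXTERNAL must be provided"

    for name in CACHE_TTL_MEMORY CACHE_TTL_REDIS CACHE_TTL_SQLITE QUEUE_REDIS_TIMEOUT; do
        eval "value=\${$name:-}"
        [ -z "$value" ] || is_positive "$value" || fail "$name must be a positive integer"
    done

    # QUEUE_ADAPTER is case-sensitive; mpsc is the documented default
    case "${QUEUE_ADAPTER:-mpsc}" in
        mpsc|redis|sqlite|noop|none) ;;
        *) fail "QUEUE_ADAPTER must be one of mpsc, redis, sqlite, noop, none" ;;
    esac

    [ -z "${RESOLVER_MAX_CONCURRENT:-}" ] \
        || in_range "$RESOLVER_MAX_CONCURRENT" 0 10000 \
        || fail "RESOLVER_MAX_CONCURRENT must be between 0 and 10000"
    [ -z "${RESOLVER_MAX_CONCURRENT_TIMEOUT_MS:-}" ] \
        || in_range "$RESOLVER_MAX_CONCURRENT_TIMEOUT_MS" 0 60000 \
        || fail "RESOLVER_MAX_CONCURRENT_TIMEOUT_MS must be between 0 and 60000"

    [ "$errors" -eq 0 ]
)
```

One way to use it is to load the file into a subshell first: `( set -a; . ./.env; check_config )`. This works only if every value in `.env` is valid shell syntax, so values containing spaces or parentheses (such as the `USER_AGENT` example) would need quoting for this purpose.
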
### Validation Errors

If validation fails, QuickDID will exit with one of these error codes:

- `error-quickdid-config-1`: Missing required environment variable
- `error-quickdid-config-2`: Invalid configuration value
- `error-quickdid-config-3`: Invalid TTL value (must be positive)
- `error-quickdid-config-4`: Invalid timeout value (must be positive)

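When a container exits at startup, these codes can be picked out of its logs automatically. The sketch below assumes the code appears verbatim somewhere in the log output, which this guide does not confirm; the helper name `explain_config_error` is hypothetical and the descriptions are taken from the list above.

```bash
# explain_config_error: reads log text on stdin, finds the first
# error-quickdid-config-N code, and prints its documented meaning.
# Returns non-zero when no known code is found.
explain_config_error() (
    code="$(grep -o 'error-quickdid-config-[0-9][0-9]*' | head -n 1)"
    case "$code" in
        error-quickdid-config-1) echo "$code: missing required environment variable" ;;
        error-quickdid-config-2) echo "$code: invalid configuration value" ;;
        error-quickdid-config-3) echo "$code: invalid TTL value (must be positive)" ;;
        error-quickdid-config-4) echo "$code: invalid timeout value (must be positive)" ;;
        '') echo "no configuration error code found"; exit 1 ;;
        *) echo "$code: unknown configuration error code"; exit 1 ;;
    esac
)
```

For example: `docker logs quickdid 2>&1 | explain_config_error`.
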
### Testing Configuration

```bash
# Validate configuration without starting service
HTTP_EXTERNAL=test quickdid --help

# Test with specific values (will fail validation)
CACHE_TTL_MEMORY=0 quickdid --help

# Debug configuration parsing
RUST_LOG=debug HTTP_EXTERNAL=test quickdid
```

## Support and Resources

- **Documentation**: [QuickDID GitHub Repository](https://github.com/smokesignal.events/quickdid)
- **Configuration Reference**: See [configuration-reference.md](./configuration-reference.md) for detailed documentation of all options
- **AT Protocol Specs**: [atproto.com](https://atproto.com)
- **Issues**: Report bugs via GitHub Issues
- **Community**: Join the AT Protocol Discord server

## License

QuickDID is licensed under the MIT License. See LICENSE file for details.