QuickDID is a high-performance AT Protocol identity resolution service written in Rust. It provides handle-to-DID resolution with Redis-backed caching and queue processing.
# QuickDID Production Deployment Guide

This guide provides comprehensive instructions for deploying QuickDID in a production environment using Docker. QuickDID supports multiple caching strategies: Redis (distributed), SQLite (single-instance), or in-memory caching.

## Table of Contents

- [Prerequisites](#prerequisites)
- [Environment Configuration](#environment-configuration)
- [Docker Deployment](#docker-deployment)
- [Docker Compose Setup](#docker-compose-setup)
- [Health Monitoring](#health-monitoring)
- [Security Considerations](#security-considerations)
- [Troubleshooting](#troubleshooting)

## Prerequisites

- Docker 20.10.0 or higher
- Docker Compose 2.0.0 or higher (optional, for multi-container setup)
- Redis 6.0 or higher (optional, for persistent caching and queue management)
- SQLite 3.35 or higher (optional, alternative to Redis for single-instance caching)
- Valid SSL certificates for HTTPS (recommended for production)
- Domain name configured with appropriate DNS records

## Environment Configuration

Create a `.env` file in your deployment directory with the following configuration:

```bash
# ============================================================================
# QuickDID Production Environment Configuration
# ============================================================================

# ----------------------------------------------------------------------------
# REQUIRED CONFIGURATION
# ----------------------------------------------------------------------------

# External hostname for service endpoints
# This should be your public domain name with port if non-standard
# Examples:
#   - quickdid.example.com
#   - quickdid.example.com:8080
#   - localhost:3007 (for testing only)
HTTP_EXTERNAL=quickdid.example.com

# ----------------------------------------------------------------------------
# NETWORK CONFIGURATION
# ----------------------------------------------------------------------------

# HTTP server port (default: 8080)
# This is the port the service will bind to inside the container
# Map this to your desired external port in docker-compose.yml
HTTP_PORT=8080

# PLC directory hostname (default: plc.directory)
# Change this if using a custom PLC directory or testing environment
PLC_HOSTNAME=plc.directory

# ----------------------------------------------------------------------------
# CACHING CONFIGURATION
# ----------------------------------------------------------------------------

# Redis connection URL for caching (recommended for production)
# Format: redis://[username:password@]host:port/database
# Examples:
#   - redis://localhost:6379/0 (local Redis, no auth)
#   - redis://user:pass@redis.example.com:6379/0 (remote with auth)
#   - redis://redis:6379/0 (Docker network)
#   - rediss://secure-redis.example.com:6380/0 (TLS)
# Benefits: Persistent cache, distributed caching, better performance
REDIS_URL=redis://redis:6379/0

# SQLite database URL for caching (alternative to Redis for single-instance deployments)
# Format: sqlite:path/to/database.db
# Examples:
#   - sqlite:./quickdid.db (file-based database)
#   - sqlite::memory: (in-memory database for testing)
#   - sqlite:/var/lib/quickdid/cache.db (absolute path)
# Benefits: Persistent cache, single-file storage, no external dependencies
# Note: Cache priority is Redis > SQLite > Memory (first available is used)
# SQLITE_URL=sqlite:./quickdid.db

# TTL for in-memory cache in seconds (default: 600 = 10 minutes)
# Range: 60-3600 recommended
# Lower = fresher data, more DNS/HTTP lookups
# Higher = better performance, potentially stale data
CACHE_TTL_MEMORY=600

# TTL for Redis cache in seconds (default: 7776000 = 90 days)
# Range: 3600-31536000 (1 hour to 1 year)
# Recommendations:
#   - 86400 (1 day) for frequently changing data
#   - 604800 (1 week) for balanced performance
#   - 7776000 (90 days) for stable data
CACHE_TTL_REDIS=86400

# TTL for SQLite cache in seconds (default: 7776000 = 90 days)
# Range: 3600-31536000 (1 hour to 1 year)
# Same recommendations as Redis TTL
# Only used when SQLITE_URL is configured
CACHE_TTL_SQLITE=86400

# ----------------------------------------------------------------------------
# QUEUE CONFIGURATION
# ----------------------------------------------------------------------------

# Queue adapter type: 'mpsc', 'redis', 'sqlite', 'noop', or 'none' (default: mpsc)
# - 'mpsc': In-memory queue for single-instance deployments
# - 'redis': Distributed queue for multi-instance or HA deployments
# - 'sqlite': Persistent queue for single-instance deployments
# - 'noop': Disable queue processing (testing only)
# - 'none': Alias for 'noop'
QUEUE_ADAPTER=redis

# Redis URL for queue adapter (uses REDIS_URL if not set)
# Set this if you want to use a separate Redis instance for queuing
# QUEUE_REDIS_URL=redis://queue-redis:6379/1

# Redis key prefix for queues (default: queue:handleresolver:)
# Useful when sharing Redis instance with other services
QUEUE_REDIS_PREFIX=queue:quickdid:prod:

# Redis blocking timeout for queue operations in seconds (default: 5)
# Range: 1-60 recommended
# Lower = more responsive to shutdown, more polling
# Higher = less polling overhead, slower shutdown
QUEUE_REDIS_TIMEOUT=5

# Enable deduplication for Redis queue to prevent duplicate handles (default: false)
# When enabled, uses Redis SET with TTL to track handles being processed
# Prevents the same handle from being queued multiple times within the TTL window
QUEUE_REDIS_DEDUP_ENABLED=false

# TTL for Redis queue deduplication keys in seconds (default: 60)
# Range: 10-300 recommended
# Determines how long to prevent duplicate handle resolution requests
QUEUE_REDIS_DEDUP_TTL=60

# Worker ID for Redis queue (defaults to "worker1")
# Set this for predictable worker identification in multi-instance deployments
# Examples: worker-001, prod-us-east-1, $(hostname)
QUEUE_WORKER_ID=prod-worker-1

# Buffer size for MPSC queue (default: 1000)
# Range: 100-100000
# Increase for high-traffic deployments using MPSC adapter
QUEUE_BUFFER_SIZE=5000

# Maximum queue size for SQLite adapter work shedding (default: 10000)
# Range: 100-1000000 (recommended)
# When exceeded, oldest entries are deleted to maintain this limit
# Set to 0 to disable work shedding (unlimited queue size)
# Benefits: Prevents unbounded disk usage, maintains recent work items
QUEUE_SQLITE_MAX_SIZE=10000

# ----------------------------------------------------------------------------
# HTTP CLIENT CONFIGURATION
# ----------------------------------------------------------------------------

# HTTP User-Agent header
# Identifies your service to other AT Protocol services
# Default: Auto-generated with current version from Cargo.toml
# Format: quickdid/{version} (+https://github.com/smokesignal.events/quickdid)
USER_AGENT=quickdid/1.0.0-rc.5 (+https://quickdid.example.com)

# Custom DNS nameservers (comma-separated)
# Use for custom DNS resolution or to bypass local DNS
# Examples:
#   - 8.8.8.8,8.8.4.4 (Google DNS)
#   - 1.1.1.1,1.0.0.1 (Cloudflare DNS)
# DNS_NAMESERVERS=1.1.1.1,1.0.0.1

# Additional CA certificates (comma-separated file paths)
# Use when connecting to services with custom CA certificates
# CERTIFICATE_BUNDLES=/certs/custom-ca.pem,/certs/internal-ca.pem

# ----------------------------------------------------------------------------
# LOGGING AND MONITORING
# ----------------------------------------------------------------------------

# Logging level (debug, info, warn, error)
# Use 'info' for production, 'debug' for troubleshooting
RUST_LOG=info

# Structured logging format (optional)
# Set to 'json' for machine-readable logs
# RUST_LOG_FORMAT=json

# ----------------------------------------------------------------------------
# RATE LIMITING CONFIGURATION
# ----------------------------------------------------------------------------

# Maximum concurrent handle resolutions (default: 0 = disabled)
# When > 0, enables semaphore-based rate limiting
# Range: 0-10000 (0 = disabled)
# Protects upstream DNS/HTTP services from being overwhelmed
RESOLVER_MAX_CONCURRENT=0

# Timeout for acquiring rate limit permit in milliseconds (default: 0 = no timeout)
# When > 0, requests will timeout if they can't acquire a permit within this time
# Range: 0-60000 (max 60 seconds)
# Prevents requests from waiting indefinitely when rate limiter is at capacity
RESOLVER_MAX_CONCURRENT_TIMEOUT_MS=0

# ----------------------------------------------------------------------------
# HTTP CACHING CONFIGURATION
# ----------------------------------------------------------------------------

# ETAG seed for cache invalidation (default: application version)
# Used to generate ETAG checksums for HTTP responses
# Changing this value invalidates all client-cached responses
# Examples:
#   - prod-2024-01-15 (deployment-specific)
#   - v1.0.0-1705344000 (version with timestamp)
#   - config-update-2024-01-15 (after configuration changes)
# Default uses the application version from Cargo.toml
# ETAG_SEED=prod-2024-01-15

# Maximum age for HTTP Cache-Control header in seconds (default: 86400 = 24 hours)
# Set to 0 to disable Cache-Control header
# Controls how long clients and intermediate caches can cache responses
CACHE_MAX_AGE=86400

# Stale-if-error directive for Cache-Control in seconds (default: 172800 = 48 hours)
# Allows stale content to be served if backend errors occur
# Provides resilience during service outages
CACHE_STALE_IF_ERROR=172800

# Stale-while-revalidate directive for Cache-Control in seconds (default: 86400 = 24 hours)
# Allows stale content to be served while fetching fresh content in background
# Improves perceived performance for users
CACHE_STALE_WHILE_REVALIDATE=86400

# Max-stale directive for Cache-Control in seconds (default: 172800 = 48 hours)
# Maximum time client will accept stale responses
# Provides upper bound on cached content age
CACHE_MAX_STALE=172800

# Min-fresh directive for Cache-Control in seconds (default: 3600 = 1 hour)
# Minimum time response must remain fresh
# Clients won't accept responses expiring within this time
CACHE_MIN_FRESH=3600

# ----------------------------------------------------------------------------
# METRICS CONFIGURATION
# ----------------------------------------------------------------------------

# Metrics adapter type: 'noop' or 'statsd' (default: noop)
# - 'noop': No metrics collection (default)
# - 'statsd': Send metrics to StatsD server
METRICS_ADAPTER=statsd

# StatsD host and port (required when METRICS_ADAPTER=statsd)
# Format: hostname:port
# Examples:
#   - localhost:8125 (local StatsD)
#   - statsd.example.com:8125 (remote StatsD)
METRICS_STATSD_HOST=localhost:8125

# Bind address for StatsD UDP socket (default: [::]:0)
# Controls which local address to bind for sending UDP packets
# Examples:
#   - [::]:0 (IPv6 any address, random port - default)
#   - 0.0.0.0:0 (IPv4 any address, random port)
#   - 192.168.1.100:0 (specific interface)
METRICS_STATSD_BIND=[::]:0

# Prefix for all metrics (default: quickdid)
# Used to namespace metrics in your monitoring system
# Examples:
#   - quickdid (default)
#   - prod.quickdid
#   - us-east-1.quickdid
METRICS_PREFIX=quickdid

# Tags for all metrics (comma-separated key:value pairs)
# Added to all metrics for filtering and grouping
# Examples:
#   - env:production,service:quickdid
#   - env:staging,region:us-east-1,version:1.0.0
METRICS_TAGS=env:production,service:quickdid

# ----------------------------------------------------------------------------
# PROACTIVE REFRESH CONFIGURATION
# ----------------------------------------------------------------------------

# Enable proactive cache refresh (default: false)
# When enabled, cache entries nearing expiration are automatically refreshed
# in the background to prevent cache misses for frequently accessed handles
PROACTIVE_REFRESH_ENABLED=false

# Threshold for proactive refresh as percentage of TTL (default: 0.8)
# Range: 0.0-1.0 (0% to 100% of TTL)
# Example: 0.8 means refresh when 80% of TTL has elapsed
# Lower values = more aggressive refreshing, higher load
# Higher values = less aggressive refreshing, more cache misses
PROACTIVE_REFRESH_THRESHOLD=0.8

# ----------------------------------------------------------------------------
# JETSTREAM CONSUMER CONFIGURATION
# ----------------------------------------------------------------------------

# Enable Jetstream consumer for real-time cache updates (default: false)
# When enabled, connects to AT Protocol firehose for live updates
# Processes Account events (deleted/deactivated) and Identity events (handle changes)
# Automatically reconnects with exponential backoff on connection failures
JETSTREAM_ENABLED=false

# Jetstream WebSocket hostname (default: jetstream.atproto.tools)
# The firehose service to connect to for real-time AT Protocol events
# Examples:
#   - jetstream.atproto.tools (production firehose)
#   - jetstream-staging.atproto.tools (staging environment)
#   - localhost:6008 (local development)
JETSTREAM_HOSTNAME=jetstream.atproto.tools

# ----------------------------------------------------------------------------
# STATIC FILES CONFIGURATION
# ----------------------------------------------------------------------------

# Directory path for serving static files (default: www)
# This directory should contain:
#   - index.html (landing page)
#   - .well-known/atproto-did (service DID identifier)
#   - .well-known/did.json (DID document)
# In Docker, this defaults to /app/www
# You can mount custom files via Docker volumes
STATIC_FILES_DIR=/app/www

# ----------------------------------------------------------------------------
# PERFORMANCE TUNING
# ----------------------------------------------------------------------------

# Tokio runtime worker threads (defaults to CPU count)
# Adjust based on your container's CPU allocation
# TOKIO_WORKER_THREADS=4

# Maximum concurrent connections (optional)
# Helps prevent resource exhaustion
# MAX_CONNECTIONS=10000

# ----------------------------------------------------------------------------
# DOCKER-SPECIFIC CONFIGURATION
# ----------------------------------------------------------------------------

# Container restart policy (for docker-compose)
# Options: no, always, on-failure, unless-stopped
RESTART_POLICY=unless-stopped

# Resource limits (for docker-compose)
# Adjust based on your available resources
MEMORY_LIMIT=512M
CPU_LIMIT=1.0
```

## Docker Deployment

### Building the Docker Image

Create a `Dockerfile` in your project root:

```dockerfile
# Build stage
FROM rust:1.75-slim AS builder

# Install build dependencies
RUN apt-get update && apt-get install -y \
    pkg-config \
    libssl-dev \
    && rm -rf /var/lib/apt/lists/*

# Create app directory
WORKDIR /app

# Copy source files
COPY Cargo.toml Cargo.lock ./
COPY src ./src

# Build the application
RUN cargo build --release

# Runtime stage
FROM debian:bookworm-slim

# Install runtime dependencies
RUN apt-get update && apt-get install -y \
    ca-certificates \
    libssl3 \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Create non-root user
RUN useradd -m -u 1000 quickdid

# Copy binary from builder
COPY --from=builder /app/target/release/quickdid /usr/local/bin/quickdid

# Set ownership and permissions
RUN chown quickdid:quickdid /usr/local/bin/quickdid

# Switch to non-root user
USER quickdid

# Expose default port
EXPOSE 8080

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8080/health || exit 1

# Run the application
ENTRYPOINT ["quickdid"]
```

Build the image:

```bash
docker build -t quickdid:latest .
```

### Running a Single Instance

```bash
# Run with environment file
docker run -d \
  --name quickdid \
  --env-file .env \
  -p 8080:8080 \
  --restart unless-stopped \
  quickdid:latest
```

## Docker Compose Setup

### Redis-based Production Setup with Jetstream

Create a `docker-compose.yml` file for a complete production setup with Redis and optional Jetstream consumer:

```yaml
version: '3.8'

services:
  quickdid:
    image: quickdid:latest
    container_name: quickdid
    env_file: .env
    ports:
      - "8080:8080"
    depends_on:
      redis:
        condition: service_healthy
    networks:
      - quickdid-network
    restart: ${RESTART_POLICY:-unless-stopped}
    deploy:
      resources:
        limits:
          memory: ${MEMORY_LIMIT:-512M}
          cpus: ${CPU_LIMIT:-1.0}
        reservations:
          memory: 256M
          cpus: '0.5'
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 3s
      retries: 3
      start_period: 10s
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

  redis:
    image: redis:7-alpine
    container_name: quickdid-redis
    command: redis-server --appendonly yes --maxmemory 256mb --maxmemory-policy allkeys-lru
    volumes:
      - redis-data:/data
    networks:
      - quickdid-network
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 3s
      retries: 3
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

  # Optional: Nginx reverse proxy with SSL
  nginx:
    image: nginx:alpine
    container_name: quickdid-nginx
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - ./certs:/etc/nginx/certs:ro
      - ./acme-challenge:/var/www/acme:ro
    depends_on:
      - quickdid
    networks:
      - quickdid-network
    restart: unless-stopped
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

networks:
  quickdid-network:
    driver: bridge

volumes:
  redis-data:
    driver: local
```

### SQLite-based Single-Instance Setup with Jetstream

For single-instance deployments without Redis, create a simpler `docker-compose.sqlite.yml` with optional Jetstream consumer:

```yaml
version: '3.8'

services:
  quickdid:
    image: quickdid:latest
    container_name: quickdid-sqlite
    environment:
      HTTP_EXTERNAL: quickdid.example.com
      HTTP_PORT: 8080
      SQLITE_URL: sqlite:/data/quickdid.db
      CACHE_TTL_MEMORY: 600
      CACHE_TTL_SQLITE: 86400
      QUEUE_ADAPTER: sqlite
      QUEUE_BUFFER_SIZE: 5000
      QUEUE_SQLITE_MAX_SIZE: 10000
      # Optional: Enable Jetstream for real-time cache updates
      # JETSTREAM_ENABLED: true
      # JETSTREAM_HOSTNAME: jetstream.atproto.tools
      RUST_LOG: info
    ports:
      - "8080:8080"
    volumes:
      - quickdid-data:/data
    networks:
      - quickdid-network
    restart: ${RESTART_POLICY:-unless-stopped}
    deploy:
      resources:
        limits:
          memory: ${MEMORY_LIMIT:-256M}
          cpus: ${CPU_LIMIT:-0.5}
        reservations:
          memory: 128M
          cpus: '0.25'
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 3s
      retries: 3
      start_period: 10s
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

  # Optional: Nginx reverse proxy with SSL
  nginx:
    image: nginx:alpine
    container_name: quickdid-nginx
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - ./certs:/etc/nginx/certs:ro
      - ./acme-challenge:/var/www/acme:ro
    depends_on:
      - quickdid
    networks:
      - quickdid-network
    restart: unless-stopped
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

networks:
  quickdid-network:
    driver: bridge

volumes:
  quickdid-data:
    driver: local
```

### Nginx Configuration (nginx.conf)

```nginx
events {
    worker_connections 1024;
}

http {
    upstream quickdid {
        server quickdid:8080;
    }

    server {
        listen 80;
        server_name quickdid.example.com;

        # ACME challenge for Let's Encrypt
        location /.well-known/acme-challenge/ {
            root /var/www/acme;
        }

        # Redirect HTTP to HTTPS
        location / {
            return 301 https://$server_name$request_uri;
        }
    }

    server {
        listen 443 ssl http2;
        server_name quickdid.example.com;

        ssl_certificate /etc/nginx/certs/fullchain.pem;
        ssl_certificate_key /etc/nginx/certs/privkey.pem;
        ssl_protocols TLSv1.2 TLSv1.3;
        ssl_ciphers HIGH:!aNULL:!MD5;

        location / {
            proxy_pass http://quickdid;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            # WebSocket support (if needed)
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";

            # Timeouts
            proxy_connect_timeout 60s;
            proxy_send_timeout 60s;
            proxy_read_timeout 60s;
        }

        # Health check endpoint
        location /health {
            proxy_pass http://quickdid/health;
            access_log off;
        }
    }
}
```

### Starting the Stack

```bash
# Start Redis-based stack
docker-compose up -d

# Start SQLite-based stack
docker-compose -f docker-compose.sqlite.yml up -d

# View logs
docker-compose logs -f
# or for SQLite setup
docker-compose -f docker-compose.sqlite.yml logs -f

# Check service status
docker-compose ps

# Stop all services
docker-compose down
# or for SQLite setup
docker-compose -f docker-compose.sqlite.yml down
```

## Health Monitoring

QuickDID provides health check endpoints for monitoring:

### Basic Health Check

```bash
curl http://quickdid.example.com/health
```

Expected response:
```json
{
  "status": "healthy",
  "version": "1.0.0",
  "uptime_seconds": 3600
}
```

### Monitoring with Prometheus (Optional)

Add to your `docker-compose.yml`:

```yaml
  prometheus:
    image: prom/prometheus:latest
    container_name: quickdid-prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus-data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
    ports:
      - "9090:9090"
    networks:
      - quickdid-network
    restart: unless-stopped

volumes:
  prometheus-data:
    driver: local
```

## Security Considerations

### 1. Service Key Protection

- **Never commit** sensitive configuration to version control
- Store keys in a secure secret management system (e.g., HashiCorp Vault, AWS Secrets Manager)
- Rotate keys regularly
- Use different keys for different environments

### 2. Network Security

- Use HTTPS in production with valid SSL certificates
- Implement rate limiting at the reverse proxy level
- Use firewall rules to restrict access to Redis
- Enable Redis authentication in production

### 3. Container Security

- Run containers as non-root user (already configured in Dockerfile)
- Keep base images updated
- Scan images for vulnerabilities regularly
- Use read-only filesystems where possible

### 4. Redis Security

```bash
# Add to Redis configuration for production
requirepass your_strong_password_here
maxclients 10000
timeout 300
```

### 5. Environment Variables

- Use Docker secrets or external secret management
- Avoid logging sensitive environment variables
- Implement proper access controls

## Troubleshooting

### Common Issues and Solutions

#### 1. Container Won't Start

```bash
# Check logs
docker logs quickdid

# Verify environment variables
docker exec quickdid env | grep -E "HTTP_EXTERNAL|HTTP_PORT"

# Test Redis connectivity
# (redis-cli ships with the Redis container, not with the QuickDID image built above)
docker exec quickdid-redis redis-cli ping
```

#### 2. Handle Resolution Failures

```bash
# Enable debug logging: set RUST_LOG=debug in .env, then recreate the container
# (exporting a variable through `docker exec` does not affect the running process)
docker-compose up -d --force-recreate quickdid

# Check DNS resolution
# (getent is part of the Debian base image; nslookup is not installed by the Dockerfile above)
docker exec quickdid getent hosts plc.directory

# Verify Redis cache (if using Redis)
docker exec -it quickdid-redis redis-cli
> KEYS handle:*
> TTL handle:example_key

# Check SQLite cache (if using SQLite)
# Requires the sqlite3 CLI in the container; the Dockerfile above does not install it
docker exec quickdid sqlite3 /data/quickdid.db ".tables"
docker exec quickdid sqlite3 /data/quickdid.db "SELECT COUNT(*) FROM handle_resolution_cache;"
```

#### 3. Performance Issues

```bash
# Monitor Redis memory usage (if using Redis)
docker exec quickdid-redis redis-cli INFO memory

# Check SQLite database size (if using SQLite; the PRAGMA query requires the sqlite3 CLI)
docker exec quickdid ls -lh /data/quickdid.db
docker exec quickdid sqlite3 /data/quickdid.db "PRAGMA page_count; PRAGMA page_size;"

# Check container resource usage
docker stats quickdid

# Analyze slow queries (with debug logging)
docker logs quickdid | grep "resolution took"
```

#### 4. Health Check Failures

```bash
# Manual health check
docker exec quickdid curl -v http://localhost:8080/health

# Check service binding
# (requires net-tools in the container; the Dockerfile above does not install it)
docker exec quickdid netstat -tlnp | grep 8080
```

### Debugging Commands

```bash
# Interactive shell in container
docker exec -it quickdid /bin/bash

# Test handle resolution
curl "http://localhost:8080/xrpc/com.atproto.identity.resolveHandle?handle=example.bsky.social"

# Check Redis keys (if using Redis)
docker exec quickdid-redis redis-cli --scan --pattern "handle:*" | head -20

# Check SQLite cache entries (if using SQLite)
docker exec quickdid sqlite3 /data/quickdid.db "SELECT COUNT(*) as total_entries, MIN(updated) as oldest, MAX(updated) as newest FROM handle_resolution_cache;"

# Check SQLite queue entries (if using SQLite queue adapter)
docker exec quickdid sqlite3 /data/quickdid.db "SELECT COUNT(*) as queue_entries, MIN(queued_at) as oldest, MAX(queued_at) as newest FROM handle_resolution_queue;"

# Monitor real-time logs
docker-compose logs -f quickdid | grep -E "ERROR|WARN"
```

## Maintenance

### Backup and Restore

#### Redis Backup
```bash
# Backup Redis data
# BGSAVE runs in the background; wait until LASTSAVE changes before copying the file
docker exec quickdid-redis redis-cli BGSAVE
docker exec quickdid-redis redis-cli LASTSAVE
docker cp quickdid-redis:/data/dump.rdb ./backups/redis-$(date +%Y%m%d).rdb

# Restore Redis data (stop Redis first so it does not overwrite the file on shutdown)
# Note: with --appendonly yes, Redis loads the AOF rather than dump.rdb at startup
docker stop quickdid-redis
docker cp ./backups/redis-backup.rdb quickdid-redis:/data/dump.rdb
docker start quickdid-redis
```

#### SQLite Backup
```bash
# Backup SQLite database (requires the sqlite3 CLI in the container)
docker exec quickdid sqlite3 /data/quickdid.db ".backup /tmp/backup.db"
docker cp quickdid:/tmp/backup.db ./backups/sqlite-$(date +%Y%m%d).db

# Alternative: Copy database file directly (service must be stopped)
docker-compose -f docker-compose.sqlite.yml stop quickdid
docker cp quickdid:/data/quickdid.db ./backups/sqlite-$(date +%Y%m%d).db
docker-compose -f docker-compose.sqlite.yml start quickdid

# Restore SQLite database
docker-compose -f docker-compose.sqlite.yml stop quickdid
docker cp ./backups/sqlite-backup.db quickdid:/data/quickdid.db
docker-compose -f docker-compose.sqlite.yml start quickdid
```

### Updates and Rollbacks

```bash
# Update to new version
# Keep the current image under a separate tag so a rollback is possible
docker tag quickdid:latest quickdid:previous
docker pull quickdid:new-version
# The compose files reference quickdid:latest, so retag the new image
docker tag quickdid:new-version quickdid:latest
docker-compose down
docker-compose up -d

# Rollback if needed
docker-compose down
docker tag quickdid:previous quickdid:latest
docker-compose up -d
```

### Log Rotation

Configure Docker's built-in log rotation in `/etc/docker/daemon.json`:

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
```

## Performance Optimization

### Caching Strategy Selection

**Cache Priority**: QuickDID uses the first available cache in this order:
1. **Redis** (distributed, best for multi-instance)
2. **SQLite** (persistent, best for single-instance)
3. **Memory** (fast, but lost on restart)

**Real-time Updates with Jetstream**: When `JETSTREAM_ENABLED=true`, QuickDID:
- Connects to AT Protocol firehose for live cache updates
- Processes Account events to purge deleted/deactivated accounts
- Processes Identity events to update handle-to-DID mappings
- Automatically reconnects with exponential backoff on failures
- Tracks metrics for successful and failed event processing

**Recommendations by Deployment Type**:
- **Single instance, persistent**: Use SQLite for both caching and queuing (`SQLITE_URL=sqlite:./quickdid.db`, `QUEUE_ADAPTER=sqlite`)
- **Multi-instance, HA**: Use Redis for both caching and queuing (`REDIS_URL=redis://redis:6379/0`, `QUEUE_ADAPTER=redis`)
- **Real-time sync**: Enable Jetstream consumer (`JETSTREAM_ENABLED=true`) for live cache updates
- **Testing/development**: Use memory-only caching with MPSC queuing (`QUEUE_ADAPTER=mpsc`)
- **Hybrid**: Configure both Redis and SQLite for redundancy

### Queue Strategy Selection

**Queue Adapter Options**:
1. **Redis** (`QUEUE_ADAPTER=redis`) - Distributed queuing, best for multi-instance deployments
2. **SQLite** (`QUEUE_ADAPTER=sqlite`) - Persistent queuing, best for single-instance deployments
3. **MPSC** (`QUEUE_ADAPTER=mpsc`) - In-memory queuing, lightweight for single-instance without persistence needs
4. **No-op** (`QUEUE_ADAPTER=noop`) - Disable queuing entirely (testing only)

### Redis Optimization

```redis
# Add to redis.conf or pass as command arguments
maxmemory 2gb
maxmemory-policy allkeys-lru
# Disable persistence for cache-only usage
# (redis.conf does not allow a comment on the same line as a directive)
save ""
tcp-keepalive 300
timeout 0
```

### System Tuning

```bash
# Add to host system's /etc/sysctl.conf
net.core.somaxconn = 1024
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 10000 65000
fs.file-max = 100000
```

## Configuration Validation

QuickDID validates all configuration at startup. The following rules are enforced:

### Required Fields

- **HTTP_EXTERNAL**: Must be provided

### Value Constraints

1. **TTL Values** (`CACHE_TTL_MEMORY`, `CACHE_TTL_REDIS`, `CACHE_TTL_SQLITE`):
   - Must be positive integers (> 0)
   - Recommended minimum: 60 seconds

2. **Timeout Values** (`QUEUE_REDIS_TIMEOUT`):
   - Must be positive integers (> 0)
   - Recommended range: 1-60 seconds

3. **Queue Adapter** (`QUEUE_ADAPTER`):
   - Must be one of: `mpsc`, `redis`, `sqlite`, `noop`, `none`
   - Case-sensitive

4. **Rate Limiting** (`RESOLVER_MAX_CONCURRENT`):
   - Must be between 0 and 10000
   - 0 = disabled (default)
   - When > 0, limits concurrent handle resolutions

5. **Rate Limiting Timeout** (`RESOLVER_MAX_CONCURRENT_TIMEOUT_MS`):
   - Must be between 0 and 60000 (milliseconds)
   - 0 = no timeout (default)
   - Maximum: 60000ms (60 seconds)

### Validation Errors

If validation fails, QuickDID will exit with one of these error codes:

- `error-quickdid-config-1`: Missing required environment variable
- `error-quickdid-config-2`: Invalid configuration value
- `error-quickdid-config-3`: Invalid TTL value (must be positive)
- `error-quickdid-config-4`: Invalid timeout value (must be positive)

### Testing Configuration

```bash
# Validate configuration without starting service
HTTP_EXTERNAL=test quickdid --help

# Test with specific values (will fail validation)
CACHE_TTL_MEMORY=0 quickdid --help

# Debug configuration parsing
RUST_LOG=debug HTTP_EXTERNAL=test quickdid
```

## Support and Resources

- **Documentation**: [QuickDID GitHub Repository](https://github.com/smokesignal.events/quickdid)
- **Configuration Reference**: See [configuration-reference.md](./configuration-reference.md) for detailed documentation of all options
- **AT Protocol Specs**: [atproto.com](https://atproto.com)
- **Issues**: Report bugs via GitHub Issues
- **Community**: Join the AT Protocol Discord server

## License

QuickDID is licensed under the MIT License. See LICENSE file for details.
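
The value constraints listed under Configuration Validation can also be checked before a container is started. The following is a minimal sketch, not part of QuickDID: the function names (`is_uint`, `check_positive_int`, `check_range`, `check_queue_adapter`, `check_config`) are hypothetical, the messages are illustrative rather than QuickDID's own error codes, and the fallback values are the defaults documented in this guide.

```bash
#!/usr/bin/env sh
# Pre-flight check that mirrors the documented validation rules.
# All function names are hypothetical; QuickDID still validates at startup.

# Succeeds when $1 consists only of decimal digits.
is_uint() {
  case "$1" in
    '' | *[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Succeeds when $1 is an integer greater than zero (TTL and timeout rule).
check_positive_int() {
  is_uint "$1" && [ "$1" -gt 0 ]
}

# Succeeds when $1 is an integer in the inclusive range [$2, $3].
check_range() {
  is_uint "$1" && [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# Succeeds when $1 is one of the documented queue adapters (case-sensitive).
check_queue_adapter() {
  case "$1" in
    mpsc | redis | sqlite | noop | none) return 0 ;;
    *) return 1 ;;
  esac
}

# Checks the current environment; unset variables fall back to documented defaults.
check_config() {
  _failed=0
  [ -n "${HTTP_EXTERNAL:-}" ] || { echo "HTTP_EXTERNAL must be provided" >&2; _failed=1; }
  check_positive_int "${CACHE_TTL_MEMORY:-600}" || { echo "CACHE_TTL_MEMORY must be > 0" >&2; _failed=1; }
  check_positive_int "${CACHE_TTL_REDIS:-7776000}" || { echo "CACHE_TTL_REDIS must be > 0" >&2; _failed=1; }
  check_positive_int "${CACHE_TTL_SQLITE:-7776000}" || { echo "CACHE_TTL_SQLITE must be > 0" >&2; _failed=1; }
  check_positive_int "${QUEUE_REDIS_TIMEOUT:-5}" || { echo "QUEUE_REDIS_TIMEOUT must be > 0" >&2; _failed=1; }
  check_queue_adapter "${QUEUE_ADAPTER:-mpsc}" || { echo "QUEUE_ADAPTER is not a known adapter" >&2; _failed=1; }
  check_range "${RESOLVER_MAX_CONCURRENT:-0}" 0 10000 || { echo "RESOLVER_MAX_CONCURRENT must be 0-10000" >&2; _failed=1; }
  check_range "${RESOLVER_MAX_CONCURRENT_TIMEOUT_MS:-0}" 0 60000 || { echo "RESOLVER_MAX_CONCURRENT_TIMEOUT_MS must be 0-60000" >&2; _failed=1; }
  return "$_failed"
}
```

Calling `check_config` with `CACHE_TTL_MEMORY=0` exported prints a message on stderr and returns a non-zero status, so a deploy script could use it to gate `docker-compose up -d`.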