# QuickDID Configuration Reference This document provides a comprehensive reference for all configuration options available in QuickDID. ## Table of Contents - [Required Configuration](#required-configuration) - [Network Configuration](#network-configuration) - [Caching Configuration](#caching-configuration) - [Queue Configuration](#queue-configuration) - [Rate Limiting Configuration](#rate-limiting-configuration) - [HTTP Caching Configuration](#http-caching-configuration) - [Metrics Configuration](#metrics-configuration) - [Proactive Refresh Configuration](#proactive-refresh-configuration) - [Jetstream Consumer Configuration](#jetstream-consumer-configuration) - [Static Files Configuration](#static-files-configuration) - [Configuration Examples](#configuration-examples) - [Validation Rules](#validation-rules) ## Required Configuration These environment variables MUST be set for QuickDID to start. ### `HTTP_EXTERNAL` **Required**: Yes **Type**: String **Format**: Hostname with optional port The external hostname where this service will be accessible. This is used to generate the service DID and for AT Protocol identity resolution. **Examples**: ```bash # Production domain HTTP_EXTERNAL=quickdid.example.com # With non-standard port HTTP_EXTERNAL=quickdid.example.com:8080 # Development/testing HTTP_EXTERNAL=localhost:3007 ``` **Constraints**: - Must be a valid hostname or hostname:port combination - Port (if specified) must be between 1-65535 ## Network Configuration ### `HTTP_PORT` **Required**: No **Type**: String **Default**: `8080` **Range**: 1-65535 The port number for the HTTP server to bind to. **Examples**: ```bash HTTP_PORT=8080 # Default HTTP_PORT=3000 # Common alternative HTTP_PORT=80 # Standard HTTP (requires root/privileges) ``` ### `PLC_HOSTNAME` **Required**: No **Type**: String **Default**: `plc.directory` The hostname of the PLC directory service for DID resolution. **Examples**: ```bash PLC_HOSTNAME=plc.directory # Production (default) PLC_HOSTNAME=test.plc.directory # Testing environment PLC_HOSTNAME=localhost:2582 # Local PLC server ``` ### `DNS_NAMESERVERS` **Required**: No **Type**: String (comma-separated IP addresses) **Default**: System DNS Custom DNS nameservers for handle resolution via TXT records. **Examples**: ```bash # Google DNS DNS_NAMESERVERS=8.8.8.8,8.8.4.4 # Cloudflare DNS DNS_NAMESERVERS=1.1.1.1,1.0.0.1 # Multiple providers DNS_NAMESERVERS=8.8.8.8,1.1.1.1 # Local DNS DNS_NAMESERVERS=192.168.1.1 ``` ### `USER_AGENT` **Required**: No **Type**: String **Default**: `quickdid/{version} (+https://github.com/smokesignal.events/quickdid)` HTTP User-Agent header for outgoing requests. **Examples**: ```bash # Custom agent USER_AGENT="MyService/1.0.0 (+https://myservice.com)" # With contact info USER_AGENT="quickdid/1.0.0 (+https://quickdid.example.com; admin@example.com)" ``` ### `CERTIFICATE_BUNDLES` **Required**: No **Type**: String (comma-separated file paths) **Default**: System CA certificates Additional CA certificate bundles for TLS connections. **Examples**: ```bash # Single certificate CERTIFICATE_BUNDLES=/etc/ssl/certs/custom-ca.pem # Multiple certificates CERTIFICATE_BUNDLES=/certs/ca1.pem,/certs/ca2.pem # Corporate CA CERTIFICATE_BUNDLES=/usr/local/share/ca-certificates/corporate-ca.crt ``` ## Caching Configuration ### `REDIS_URL` **Required**: No (recommended for multi-instance production) **Type**: String **Format**: Redis connection URL Redis connection URL for persistent caching. Enables distributed caching and better performance. 
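Before handing a value to QuickDID, the connection string can be sanity-checked from the shell. This is a minimal sketch assuming `redis-cli` is installed; the URL shown is illustrative.

```bash
# Confirm the configured Redis instance is reachable with the same URL QuickDID will use
REDIS_URL=redis://localhost:6379/0
redis-cli -u "$REDIS_URL" PING   # expected reply: PONG
```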
**Examples**: ```bash # Local Redis (no auth) REDIS_URL=redis://localhost:6379/0 # With authentication REDIS_URL=redis://user:password@redis.example.com:6379/0 # Using database 1 REDIS_URL=redis://localhost:6379/1 # Redis Sentinel REDIS_URL=redis-sentinel://sentinel1:26379,sentinel2:26379/mymaster/0 # TLS connection REDIS_URL=rediss://secure-redis.example.com:6380/0 ``` ### `SQLITE_URL` **Required**: No (recommended for single-instance production) **Type**: String **Format**: SQLite database URL SQLite database URL for persistent caching. Provides single-file persistent storage without external dependencies. **Examples**: ```bash # File-based database (recommended) SQLITE_URL=sqlite:./quickdid.db # With absolute path SQLITE_URL=sqlite:/var/lib/quickdid/cache.db # In-memory database (testing only) SQLITE_URL=sqlite::memory: # Alternative file syntax SQLITE_URL=sqlite:///path/to/database.db ``` **Cache Priority**: QuickDID uses the first available cache: 1. Redis (if `REDIS_URL` is configured) 2. SQLite (if `SQLITE_URL` is configured) 3. In-memory cache (fallback) ### `CACHE_TTL_MEMORY` **Required**: No **Type**: Integer (seconds) **Default**: `600` (10 minutes) **Range**: 60-3600 (recommended) **Constraints**: Must be > 0 Time-to-live for in-memory cache entries in seconds. Used when Redis is not available. **Examples**: ```bash CACHE_TTL_MEMORY=300 # 5 minutes (aggressive refresh) CACHE_TTL_MEMORY=600 # 10 minutes (default, balanced) CACHE_TTL_MEMORY=1800 # 30 minutes (less frequent updates) CACHE_TTL_MEMORY=3600 # 1 hour (stable data) ``` **Recommendations**: - Lower values: Fresher data, more DNS/HTTP lookups, higher load - Higher values: Better performance, potentially stale data - Production with Redis: Can use lower values (300-600) - Production without Redis: Use higher values (1800-3600) ### `CACHE_TTL_REDIS` **Required**: No **Type**: Integer (seconds) **Default**: `7776000` (90 days) **Range**: 3600-31536000 (1 hour to 1 year) **Constraints**: Must be > 0 Time-to-live for Redis cache entries in seconds. **Examples**: ```bash CACHE_TTL_REDIS=3600 # 1 hour (frequently changing data) CACHE_TTL_REDIS=86400 # 1 day (recommended for active handles) CACHE_TTL_REDIS=604800 # 1 week (balanced) CACHE_TTL_REDIS=2592000 # 30 days (stable handles) CACHE_TTL_REDIS=7776000 # 90 days (default, maximum stability) ``` ### `CACHE_TTL_SQLITE` **Required**: No **Type**: Integer (seconds) **Default**: `7776000` (90 days) **Range**: 3600-31536000 (1 hour to 1 year) **Constraints**: Must be > 0 Time-to-live for SQLite cache entries in seconds. Only used when `SQLITE_URL` is configured. **Examples**: ```bash CACHE_TTL_SQLITE=3600 # 1 hour (frequently changing data) CACHE_TTL_SQLITE=86400 # 1 day (recommended for active handles) CACHE_TTL_SQLITE=604800 # 1 week (balanced) CACHE_TTL_SQLITE=2592000 # 30 days (stable handles) CACHE_TTL_SQLITE=7776000 # 90 days (default, maximum stability) ``` **TTL Recommendations**: - Social media handles: 1-7 days - Corporate/stable handles: 30-90 days - Test environments: 1 hour - Single-instance deployments: Can use longer TTLs (30-90 days) - Multi-instance deployments: Use shorter TTLs (1-7 days) ## Queue Configuration ### `QUEUE_ADAPTER` **Required**: No **Type**: String **Default**: `mpsc` **Values**: `mpsc`, `redis`, `sqlite`, `noop`, `none` The type of queue adapter for background handle resolution. 
**Options**: - `mpsc`: In-memory multi-producer single-consumer queue (default) - `redis`: Redis-backed distributed queue - `sqlite`: SQLite-backed persistent queue - `noop`: Disable queue processing (testing only) - `none`: Alias for `noop` **Examples**: ```bash # Single instance deployment QUEUE_ADAPTER=mpsc # Multi-instance or high availability QUEUE_ADAPTER=redis # Single instance with persistence QUEUE_ADAPTER=sqlite # Testing without background processing QUEUE_ADAPTER=noop # Alternative syntax for disabling QUEUE_ADAPTER=none ``` ### `QUEUE_REDIS_URL` **Required**: No **Type**: String **Default**: Falls back to `REDIS_URL` Dedicated Redis URL for queue operations. Use when separating cache and queue Redis instances. **Examples**: ```bash # Separate Redis for queues QUEUE_REDIS_URL=redis://queue-redis:6379/2 # With different credentials QUEUE_REDIS_URL=redis://queue_user:queue_pass@redis.example.com:6379/1 ``` ### `QUEUE_REDIS_PREFIX` **Required**: No **Type**: String **Default**: `queue:handleresolver:` Redis key prefix for queue operations. Use to namespace queues when sharing Redis. **Examples**: ```bash # Default QUEUE_REDIS_PREFIX=queue:handleresolver: # Environment-specific QUEUE_REDIS_PREFIX=prod:queue:hr: QUEUE_REDIS_PREFIX=staging:queue:hr: # Version-specific QUEUE_REDIS_PREFIX=quickdid:v1:queue: # Instance-specific QUEUE_REDIS_PREFIX=us-east-1:queue:hr: ``` ### `QUEUE_REDIS_TIMEOUT` **Required**: No **Type**: Integer (seconds) **Default**: `5` **Range**: 1-60 (recommended) **Constraints**: Must be > 0 Redis blocking timeout for queue operations in seconds. Controls how long to wait for new items. **Examples**: ```bash QUEUE_REDIS_TIMEOUT=1 # Very responsive, more polling QUEUE_REDIS_TIMEOUT=5 # Default, balanced QUEUE_REDIS_TIMEOUT=10 # Less polling, slower shutdown QUEUE_REDIS_TIMEOUT=30 # Minimal polling, slow shutdown ``` ### `QUEUE_REDIS_DEDUP_ENABLED` **Required**: No **Type**: Boolean **Default**: `false` Enable deduplication for Redis queue to prevent duplicate handles from being queued multiple times within the TTL window. When enabled, uses Redis SET with TTL to track handles currently being processed. **Examples**: ```bash # Enable deduplication (recommended for production) QUEUE_REDIS_DEDUP_ENABLED=true # Disable deduplication (default) QUEUE_REDIS_DEDUP_ENABLED=false ``` **Use cases**: - **Production**: Enable to prevent duplicate work and reduce load - **High-traffic**: Essential to avoid processing the same handle multiple times - **Development**: Can be disabled for simpler debugging ### `QUEUE_REDIS_DEDUP_TTL` **Required**: No **Type**: Integer (seconds) **Default**: `60` **Range**: 10-300 (recommended) **Constraints**: Must be > 0 when deduplication is enabled TTL for Redis queue deduplication keys in seconds. Determines how long to prevent duplicate handle resolution requests. **Examples**: ```bash # Quick deduplication window (10 seconds) QUEUE_REDIS_DEDUP_TTL=10 # Default (1 minute) QUEUE_REDIS_DEDUP_TTL=60 # Extended deduplication (5 minutes) QUEUE_REDIS_DEDUP_TTL=300 ``` **Recommendations**: - **Fast processing**: 10-30 seconds - **Normal processing**: 60 seconds (default) - **Slow processing or high load**: 120-300 seconds ### `QUEUE_WORKER_ID` **Required**: No **Type**: String **Default**: `worker1` Worker identifier for queue operations. Used in logs and monitoring. 
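When correlating worker logs with Redis state, the deduplication behavior described above can also be exercised by hand. This is a sketch only; the key name is illustrative and does not reflect QuickDID's internal key layout.

```bash
# First SET claims the handle for the dedup window (reply: OK)
redis-cli SET "queue:handleresolver:dedup:alice.example.com" pending NX EX 60
# A second SET inside the TTL window is rejected (reply: nil), so the duplicate is not re-queued
redis-cli SET "queue:handleresolver:dedup:alice.example.com" pending NX EX 60
```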
**Examples**: ```bash # Simple numbering QUEUE_WORKER_ID=worker-001 # Environment-based QUEUE_WORKER_ID=prod-us-east-1 QUEUE_WORKER_ID=staging-worker-2 # Hostname-based QUEUE_WORKER_ID=$(hostname) # Pod name in Kubernetes QUEUE_WORKER_ID=$HOSTNAME ``` ### `QUEUE_BUFFER_SIZE` **Required**: No **Type**: Integer **Default**: `1000` **Range**: 100-100000 (recommended) Buffer size for the MPSC queue adapter. Only used when `QUEUE_ADAPTER=mpsc`. **Examples**: ```bash QUEUE_BUFFER_SIZE=100 # Minimal memory, may block QUEUE_BUFFER_SIZE=1000 # Default, balanced QUEUE_BUFFER_SIZE=5000 # High traffic QUEUE_BUFFER_SIZE=10000 # Very high traffic ``` ### `QUEUE_SQLITE_MAX_SIZE` **Required**: No **Type**: Integer **Default**: `10000` **Range**: 100-1000000 (recommended) **Constraints**: Must be >= 0 Maximum queue size for SQLite adapter work shedding. When the queue exceeds this limit, the oldest entries are automatically deleted to maintain the specified size limit, preserving the most recently queued work items. **Work Shedding Behavior**: - New work items are always accepted - When queue size exceeds `QUEUE_SQLITE_MAX_SIZE`, oldest entries are deleted - Deletion happens atomically with insertion in a single transaction - Essential for long-running deployments to prevent unbounded disk growth - Set to `0` to disable work shedding (unlimited queue size) **Examples**: ```bash QUEUE_SQLITE_MAX_SIZE=0 # Unlimited (disable work shedding) QUEUE_SQLITE_MAX_SIZE=1000 # Small deployment, frequent processing QUEUE_SQLITE_MAX_SIZE=10000 # Default, balanced for most deployments QUEUE_SQLITE_MAX_SIZE=100000 # High-traffic deployment with slower processing QUEUE_SQLITE_MAX_SIZE=1000000 # Very high-traffic, maximum recommended ``` **Recommendations**: - **Small deployments**: 1000-5000 entries - **Production deployments**: 10000-50000 entries - **High-traffic deployments**: 50000-1000000 entries - **Development/testing**: 100-1000 entries - **Disk space concerns**: Lower values (1000-5000) - **High ingestion rate**: Higher values (50000-1000000) ## Rate Limiting Configuration ### `RESOLVER_MAX_CONCURRENT` **Required**: No **Type**: Integer **Default**: `0` (disabled) **Range**: 0-10000 **Constraints**: Must be between 0 and 10000 Maximum concurrent handle resolutions allowed. When set to a value greater than 0, enables semaphore-based rate limiting to protect upstream DNS and HTTP services from being overwhelmed. **How it works**: - Uses a semaphore to limit concurrent resolutions - Applied between the base resolver and caching layers - Requests wait for an available permit before resolution - Helps prevent overwhelming upstream services **Examples**: ```bash # Disabled (default) RESOLVER_MAX_CONCURRENT=0 # Light rate limiting RESOLVER_MAX_CONCURRENT=10 # Moderate rate limiting RESOLVER_MAX_CONCURRENT=50 # Heavy traffic with rate limiting RESOLVER_MAX_CONCURRENT=100 # Maximum allowed RESOLVER_MAX_CONCURRENT=10000 ``` **Recommendations**: - **Development**: 0 (disabled) or 10-50 for testing - **Production (low traffic)**: 50-100 - **Production (high traffic)**: 100-500 - **Production (very high traffic)**: 500-1000 - **Testing rate limiting**: 1-5 to observe behavior **Placement in resolver stack**: ``` Request → Cache → RateLimited → Base → DNS/HTTP ``` ### `RESOLVER_MAX_CONCURRENT_TIMEOUT_MS` **Required**: No **Type**: Integer (milliseconds) **Default**: `0` (no timeout) **Range**: 0-60000 **Constraints**: Must be between 0 and 60000 (60 seconds max) Timeout for acquiring a rate limit permit in milliseconds. 
When set to a value greater than 0, requests will timeout if they cannot acquire a permit within the specified time, preventing them from waiting indefinitely when the rate limiter is at capacity. **How it works**: - Applied when `RESOLVER_MAX_CONCURRENT` is enabled (> 0) - Uses `tokio::time::timeout` to limit permit acquisition time - Returns an error if timeout expires before permit is acquired - Prevents request queue buildup during high load **Examples**: ```bash # No timeout (default) RESOLVER_MAX_CONCURRENT_TIMEOUT_MS=0 # Quick timeout for responsive failures (100ms) RESOLVER_MAX_CONCURRENT_TIMEOUT_MS=100 # Moderate timeout (1 second) RESOLVER_MAX_CONCURRENT_TIMEOUT_MS=1000 # Longer timeout for production (5 seconds) RESOLVER_MAX_CONCURRENT_TIMEOUT_MS=5000 # Maximum allowed (60 seconds) RESOLVER_MAX_CONCURRENT_TIMEOUT_MS=60000 ``` **Recommendations**: - **Development**: 100-1000ms for quick feedback - **Production (low latency)**: 1000-5000ms - **Production (high latency tolerance)**: 5000-30000ms - **Testing**: 100ms to quickly identify bottlenecks - **0**: Use when you want requests to wait indefinitely **Error behavior**: When a timeout occurs, the request fails with: ``` Rate limit permit acquisition timed out after {timeout}ms ``` ## Metrics Configuration ### `METRICS_ADAPTER` **Required**: No **Type**: String **Default**: `noop` **Values**: `noop`, `statsd` Metrics adapter type for collecting and publishing metrics. **Options**: - `noop`: No metrics collection (default) - `statsd`: Send metrics to StatsD server **Examples**: ```bash # No metrics (default) METRICS_ADAPTER=noop # Enable StatsD metrics METRICS_ADAPTER=statsd ``` ### `METRICS_STATSD_HOST` **Required**: Yes (when METRICS_ADAPTER=statsd) **Type**: String **Format**: hostname:port StatsD server host and port for metrics collection. **Examples**: ```bash # Local StatsD METRICS_STATSD_HOST=localhost:8125 # Remote StatsD METRICS_STATSD_HOST=statsd.example.com:8125 # Docker network METRICS_STATSD_HOST=statsd:8125 ``` ### `METRICS_STATSD_BIND` **Required**: No **Type**: String **Default**: `[::]:0` Bind address for StatsD UDP socket. Controls which local address to bind for sending UDP packets. **Examples**: ```bash # IPv6 any address, random port (default) METRICS_STATSD_BIND=[::]:0 # IPv4 any address, random port METRICS_STATSD_BIND=0.0.0.0:0 # Specific interface METRICS_STATSD_BIND=192.168.1.100:0 # Specific port METRICS_STATSD_BIND=[::]:8126 ``` ### `METRICS_PREFIX` **Required**: No **Type**: String **Default**: `quickdid` Prefix for all metrics. Used to namespace metrics in your monitoring system. **Examples**: ```bash # Default METRICS_PREFIX=quickdid # Environment-specific METRICS_PREFIX=prod.quickdid METRICS_PREFIX=staging.quickdid # Region-specific METRICS_PREFIX=us-east-1.quickdid METRICS_PREFIX=eu-west-1.quickdid # Service-specific METRICS_PREFIX=api.quickdid ``` ### `METRICS_TAGS` **Required**: No **Type**: String (comma-separated key:value pairs) **Default**: None Default tags for all metrics. Added to all metrics for filtering and grouping. 
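Before expecting tags to show up in dashboards, the UDP path to the collector configured via `METRICS_STATSD_HOST` can be smoke-tested by hand. The metric name is illustrative, and this assumes a netcat variant that supports `-u`.

```bash
# Emit a single throwaway counter toward the StatsD collector
echo "quickdid.smoke_test:1|c" | nc -u -w1 localhost 8125
```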
**Examples**: ```bash # Basic tags METRICS_TAGS=env:production,service:quickdid # Detailed tags METRICS_TAGS=env:production,service:quickdid,region:us-east-1,version:1.0.0 # Deployment-specific METRICS_TAGS=env:staging,cluster:k8s-staging,namespace:quickdid ``` **Common tag patterns**: - `env`: Environment (production, staging, development) - `service`: Service name - `region`: Geographic region - `version`: Application version - `cluster`: Kubernetes cluster name - `instance`: Instance identifier ## Proactive Refresh Configuration ### `PROACTIVE_REFRESH_ENABLED` **Required**: No **Type**: Boolean **Default**: `false` Enable proactive cache refresh for frequently accessed handles. When enabled, cache entries that have reached the refresh threshold will be queued for background refresh to keep the cache warm. **Examples**: ```bash # Enable proactive refresh (recommended for production) PROACTIVE_REFRESH_ENABLED=true # Disable proactive refresh (default) PROACTIVE_REFRESH_ENABLED=false ``` **Benefits**: - Prevents cache misses for popular handles - Maintains consistent response times - Reduces latency spikes during cache expiration **Considerations**: - Increases background processing load - More DNS/HTTP requests to upstream services - Best for high-traffic services with predictable access patterns ### `PROACTIVE_REFRESH_THRESHOLD` **Required**: No **Type**: Float **Default**: `0.8` **Range**: 0.0-1.0 **Constraints**: Must be between 0.0 and 1.0 Threshold as a percentage (0.0-1.0) of cache TTL when to trigger proactive refresh. For example, 0.8 means refresh when an entry has lived for 80% of its TTL. **Examples**: ```bash # Very aggressive (refresh at 50% of TTL) PROACTIVE_REFRESH_THRESHOLD=0.5 # Moderate (refresh at 70% of TTL) PROACTIVE_REFRESH_THRESHOLD=0.7 # Default (refresh at 80% of TTL) PROACTIVE_REFRESH_THRESHOLD=0.8 # Conservative (refresh at 90% of TTL) PROACTIVE_REFRESH_THRESHOLD=0.9 # Very conservative (refresh at 95% of TTL) PROACTIVE_REFRESH_THRESHOLD=0.95 ``` **Recommendations**: - **High-traffic services**: 0.5-0.7 (aggressive refresh) - **Normal traffic**: 0.8 (default, balanced) - **Low traffic**: 0.9-0.95 (conservative) - **Development**: 0.5 (test refresh behavior) **Impact on different cache TTLs**: - TTL=600s (10 min), threshold=0.8: Refresh after 8 minutes - TTL=3600s (1 hour), threshold=0.8: Refresh after 48 minutes - TTL=86400s (1 day), threshold=0.8: Refresh after 19.2 hours ## Jetstream Consumer Configuration ### `JETSTREAM_ENABLED` **Required**: No **Type**: Boolean **Default**: `false` Enable Jetstream consumer for real-time cache updates from the AT Protocol firehose. When enabled, QuickDID connects to the Jetstream WebSocket service to receive live updates about account and identity changes. 
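The stream can also be inspected directly before enabling the consumer. This sketch uses the third-party `websocat` and `jq` tools and assumes the public Jetstream `/subscribe` endpoint and its `kind` event field.

```bash
# Watch the identity and account events that QuickDID would consume
websocat "wss://jetstream.atproto.tools/subscribe" \
  | jq -c 'select(.kind == "identity" or .kind == "account")'
```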
**How it works**: - Subscribes to Account and Identity events from the firehose - Processes Account events to purge deleted/deactivated accounts - Processes Identity events to update handle-to-DID mappings - Automatically reconnects with exponential backoff on connection failures - Tracks metrics for successful and failed event processing **Examples**: ```bash # Enable Jetstream consumer (recommended for production) JETSTREAM_ENABLED=true # Disable Jetstream consumer (default) JETSTREAM_ENABLED=false ``` **Benefits**: - Real-time cache synchronization with AT Protocol network - Automatic removal of deleted/deactivated accounts - Immediate handle change updates - Reduces stale data in cache **Considerations**: - Requires stable WebSocket connection - Increases network traffic (incoming events) - Best for services requiring up-to-date handle mappings - Automatically handles reconnection on failures ### `JETSTREAM_HOSTNAME` **Required**: No **Type**: String **Default**: `jetstream.atproto.tools` The hostname of the Jetstream WebSocket service to connect to for real-time AT Protocol events. Only used when `JETSTREAM_ENABLED=true`. **Examples**: ```bash # Production firehose (default) JETSTREAM_HOSTNAME=jetstream.atproto.tools # Staging environment JETSTREAM_HOSTNAME=jetstream-staging.atproto.tools # Local development firehose JETSTREAM_HOSTNAME=localhost:6008 # Custom deployment JETSTREAM_HOSTNAME=jetstream.example.com ``` **Event Processing**: - **Account events**: - `status: deleted` → Purges handle and DID from all caches - `status: deactivated` → Purges handle and DID from all caches - Other statuses → Ignored - **Identity events**: - Updates handle-to-DID mapping in cache - Removes old handle mapping if changed - Maintains bidirectional cache consistency **Metrics Tracked** (when metrics are enabled): - `jetstream.events.received`: Total events received - `jetstream.events.processed`: Successfully processed events - `jetstream.events.failed`: Failed event processing - `jetstream.connections.established`: Successful connections - `jetstream.connections.failed`: Failed connection attempts **Reconnection Behavior**: - Initial retry delay: 1 second - Maximum retry delay: 60 seconds - Exponential backoff with jitter - Automatic recovery on transient failures **Recommendations**: - **Production**: Use default `jetstream.atproto.tools` - **Development**: Consider local firehose for testing - **High availability**: Monitor connection metrics - **Network issues**: Check WebSocket connectivity ## Static Files Configuration ### `STATIC_FILES_DIR` **Required**: No **Type**: String (directory path) **Default**: `www` Directory path for serving static files. This directory should contain the landing page and AT Protocol well-known files. 
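Once the service is running, the well-known files from the directory structure shown below can be verified over HTTP; the domain is illustrative.

```bash
# Both files should be served from the static directory
curl -s "https://quickdid.example.com/.well-known/atproto-did"
curl -s "https://quickdid.example.com/.well-known/did.json"
```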
**Directory Structure**: ``` www/ ├── index.html # Landing page ├── .well-known/ │ ├── atproto-did # Service DID identifier │ └── did.json # DID document └── (other static assets) ``` **Examples**: ```bash # Default (relative to working directory) STATIC_FILES_DIR=www # Absolute path STATIC_FILES_DIR=/var/www/quickdid # Docker container path STATIC_FILES_DIR=/app/www # Custom directory STATIC_FILES_DIR=./public ``` **Docker Volume Mounting**: ```yaml volumes: # Mount entire custom directory - ./custom-www:/app/www:ro # Mount specific files - ./custom-index.html:/app/www/index.html:ro - ./well-known:/app/www/.well-known:ro ``` **Generating Well-Known Files**: ```bash # Generate .well-known files for your domain HTTP_EXTERNAL=your-domain.com ./generate-wellknown.sh ``` ## HTTP Caching Configuration ### `CACHE_MAX_AGE` **Required**: No **Type**: Integer (seconds) **Default**: `86400` (24 hours) **Range**: 0-31536000 (0 to 1 year) Maximum age for HTTP Cache-Control header in seconds. When set to 0, the Cache-Control header is disabled and will not be added to responses. This controls how long clients and intermediate caches can cache responses. **Examples**: ```bash # Default (24 hours) CACHE_MAX_AGE=86400 # Aggressive caching (7 days) CACHE_MAX_AGE=604800 # Conservative caching (1 hour) CACHE_MAX_AGE=3600 # Disable Cache-Control header CACHE_MAX_AGE=0 ``` ### `CACHE_STALE_IF_ERROR` **Required**: No **Type**: Integer (seconds) **Default**: `172800` (48 hours) Allows stale content to be served if the backend encounters an error. This provides resilience during service outages. **Examples**: ```bash # Default (48 hours) CACHE_STALE_IF_ERROR=172800 # Extended error tolerance (7 days) CACHE_STALE_IF_ERROR=604800 # Minimal error tolerance (1 hour) CACHE_STALE_IF_ERROR=3600 ``` ### `CACHE_STALE_WHILE_REVALIDATE` **Required**: No **Type**: Integer (seconds) **Default**: `86400` (24 hours) Allows stale content to be served while fresh content is being fetched in the background. This improves perceived performance. **Examples**: ```bash # Default (24 hours) CACHE_STALE_WHILE_REVALIDATE=86400 # Quick revalidation (1 hour) CACHE_STALE_WHILE_REVALIDATE=3600 # Extended revalidation (7 days) CACHE_STALE_WHILE_REVALIDATE=604800 ``` ### `CACHE_MAX_STALE` **Required**: No **Type**: Integer (seconds) **Default**: `172800` (48 hours) Maximum time a client will accept stale responses. This provides an upper bound on how old cached content can be. **Examples**: ```bash # Default (48 hours) CACHE_MAX_STALE=172800 # Extended staleness (7 days) CACHE_MAX_STALE=604800 # Strict freshness (1 hour) CACHE_MAX_STALE=3600 ``` ### `CACHE_MIN_FRESH` **Required**: No **Type**: Integer (seconds) **Default**: `3600` (1 hour) Minimum time a response must remain fresh. Clients will not accept responses that will expire within this time. 
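These directives are combined into a single response header (the exact format is shown after the examples below). Against a running instance, the emitted header can be inspected with curl; the URL is illustrative.

```bash
# Dump response headers and pick out Cache-Control
curl -s -o /dev/null -D - "https://quickdid.example.com/" | grep -i '^cache-control:'
```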
**Examples**: ```bash # Default (1 hour) CACHE_MIN_FRESH=3600 # Strict freshness (24 hours) CACHE_MIN_FRESH=86400 # Relaxed freshness (5 minutes) CACHE_MIN_FRESH=300 ``` **Cache-Control Header Format**: When `CACHE_MAX_AGE` is greater than 0, the following Cache-Control header is added to responses: ``` Cache-Control: public, max-age=86400, stale-while-revalidate=86400, stale-if-error=172800, max-stale=172800, min-fresh=3600 ``` **Recommendations**: - **High-traffic services**: Use longer max-age (86400-604800) to reduce load - **Frequently changing data**: Use shorter max-age (3600-14400) - **Critical services**: Set higher stale-if-error for resilience - **Performance-sensitive**: Enable stale-while-revalidate for better UX - **Disable caching**: Set CACHE_MAX_AGE=0 for real-time data ### `ETAG_SEED` **Required**: No **Type**: String **Default**: Application version (from `CARGO_PKG_VERSION`) Seed value for ETAG generation to allow cache invalidation. This value is incorporated into ETAG checksums, allowing server administrators to invalidate client-cached responses after major changes or deployments. **How it works**: - Combined with response content to generate ETAG checksums - Uses MetroHash64 for fast, non-cryptographic hashing - Generates weak ETags (W/"hash") for HTTP caching - Changing the seed invalidates all client caches **Examples**: ```bash # Default (uses application version) # ETAG_SEED is automatically set to the version # Deployment-specific seed ETAG_SEED=prod-2024-01-15 # Version with timestamp ETAG_SEED=v1.0.0-1705344000 # Environment-specific ETAG_SEED=staging-v2 # Force cache invalidation after config change ETAG_SEED=config-update-2024-01-15 ``` **Use cases**: - **Major configuration changes**: Update seed to invalidate all cached responses - **Data migration**: Force clients to refetch after backend changes - **Security updates**: Ensure clients get fresh data after security fixes - **A/B testing**: Different seeds for different deployment groups - **Rollback scenarios**: Revert to previous seed to restore cache behavior **Recommendations**: - **Default**: Use the application version (automatic) - **Production**: Include deployment date or config version - **Staging**: Use environment-specific seeds - **After incidents**: Update seed to force fresh data - **Routine deployments**: Keep the same seed if no data changes ## Configuration Examples ### Minimal Development Configuration ```bash # .env.development HTTP_EXTERNAL=localhost:3007 RUST_LOG=debug ``` ### Standard Production Configuration (Redis) ```bash # .env.production.redis # Required HTTP_EXTERNAL=quickdid.example.com # Network HTTP_PORT=8080 USER_AGENT=quickdid/1.0.0 (+https://quickdid.example.com) # Caching (Redis-based) REDIS_URL=redis://redis:6379/0 CACHE_TTL_MEMORY=600 CACHE_TTL_REDIS=86400 # 1 day # Queue QUEUE_ADAPTER=redis QUEUE_REDIS_TIMEOUT=5 QUEUE_BUFFER_SIZE=5000 QUEUE_REDIS_DEDUP_ENABLED=true # Prevent duplicate work QUEUE_REDIS_DEDUP_TTL=60 # Rate Limiting (optional, recommended for production) RESOLVER_MAX_CONCURRENT=100 RESOLVER_MAX_CONCURRENT_TIMEOUT_MS=5000 # 5 second timeout # Metrics (optional, recommended for production) METRICS_ADAPTER=statsd METRICS_STATSD_HOST=localhost:8125 METRICS_PREFIX=quickdid METRICS_TAGS=env:prod,service:quickdid # Proactive Refresh (optional, recommended for high-traffic) PROACTIVE_REFRESH_ENABLED=true PROACTIVE_REFRESH_THRESHOLD=0.8 # Jetstream Consumer (optional, recommended for real-time sync) JETSTREAM_ENABLED=true 
JETSTREAM_HOSTNAME=jetstream.atproto.tools # HTTP Caching (Cache-Control headers) CACHE_MAX_AGE=86400 # 24 hours CACHE_STALE_IF_ERROR=172800 # 48 hours CACHE_STALE_WHILE_REVALIDATE=86400 # 24 hours # Logging RUST_LOG=info ``` ### Standard Production Configuration (SQLite) ```bash # .env.production.sqlite # Required HTTP_EXTERNAL=quickdid.example.com # Network HTTP_PORT=8080 USER_AGENT=quickdid/1.0.0 (+https://quickdid.example.com) # Caching (SQLite-based for single instance) SQLITE_URL=sqlite:/data/quickdid.db CACHE_TTL_MEMORY=600 CACHE_TTL_SQLITE=86400 # 1 day # Queue (SQLite for single instance with persistence) QUEUE_ADAPTER=sqlite QUEUE_BUFFER_SIZE=5000 QUEUE_SQLITE_MAX_SIZE=10000 # Rate Limiting (optional, recommended for production) RESOLVER_MAX_CONCURRENT=100 RESOLVER_MAX_CONCURRENT_TIMEOUT_MS=5000 # 5 second timeout # Jetstream Consumer (optional, recommended for real-time sync) JETSTREAM_ENABLED=true JETSTREAM_HOSTNAME=jetstream.atproto.tools # HTTP Caching (Cache-Control headers) CACHE_MAX_AGE=86400 # 24 hours CACHE_STALE_IF_ERROR=172800 # 48 hours CACHE_STALE_WHILE_REVALIDATE=86400 # 24 hours # Logging RUST_LOG=info ``` ### High-Availability Configuration (Redis) ```bash # .env.ha.redis # Required HTTP_EXTERNAL=quickdid.example.com # Network HTTP_PORT=8080 DNS_NAMESERVERS=8.8.8.8,8.8.4.4,1.1.1.1,1.0.0.1 # Caching (separate Redis instances) REDIS_URL=redis://cache-redis:6379/0 CACHE_TTL_MEMORY=300 CACHE_TTL_REDIS=3600 # Queue (dedicated Redis) QUEUE_ADAPTER=redis QUEUE_REDIS_URL=redis://queue-redis:6379/0 QUEUE_REDIS_PREFIX=prod:queue: QUEUE_WORKER_ID=${HOSTNAME:-worker1} QUEUE_REDIS_TIMEOUT=10 QUEUE_REDIS_DEDUP_ENABLED=true # Essential for multi-instance QUEUE_REDIS_DEDUP_TTL=120 # Longer TTL for HA # Performance QUEUE_BUFFER_SIZE=10000 # Rate Limiting (important for HA deployments) RESOLVER_MAX_CONCURRENT=500 RESOLVER_MAX_CONCURRENT_TIMEOUT_MS=10000 # 10 second timeout for HA # Metrics (recommended for HA monitoring) METRICS_ADAPTER=statsd METRICS_STATSD_HOST=statsd:8125 METRICS_PREFIX=quickdid.prod METRICS_TAGS=env:prod,service:quickdid,cluster:ha # Proactive Refresh (recommended for HA) PROACTIVE_REFRESH_ENABLED=true PROACTIVE_REFRESH_THRESHOLD=0.7 # More aggressive for HA # Jetstream Consumer (recommended for real-time sync in HA) JETSTREAM_ENABLED=true JETSTREAM_HOSTNAME=jetstream.atproto.tools # Logging RUST_LOG=warn ``` ### Hybrid Configuration (Redis + SQLite Fallback) ```bash # .env.hybrid # Required HTTP_EXTERNAL=quickdid.example.com # Network HTTP_PORT=8080 # Caching (Redis primary, SQLite fallback) REDIS_URL=redis://redis:6379/0 SQLITE_URL=sqlite:/data/fallback.db CACHE_TTL_MEMORY=600 CACHE_TTL_REDIS=86400 CACHE_TTL_SQLITE=604800 # 1 week (longer for fallback) # Queue QUEUE_ADAPTER=redis QUEUE_REDIS_TIMEOUT=5 # Logging RUST_LOG=info ``` ### Docker Compose Configuration (Redis) ```yaml # docker-compose.redis.yml version: '3.8' services: quickdid: image: quickdid:latest environment: HTTP_EXTERNAL: quickdid.example.com HTTP_PORT: 8080 REDIS_URL: redis://redis:6379/0 CACHE_TTL_MEMORY: 600 CACHE_TTL_REDIS: 86400 QUEUE_ADAPTER: redis QUEUE_REDIS_TIMEOUT: 5 JETSTREAM_ENABLED: true JETSTREAM_HOSTNAME: jetstream.atproto.tools RUST_LOG: info ports: - "8080:8080" depends_on: - redis redis: image: redis:7-alpine command: redis-server --maxmemory 256mb --maxmemory-policy allkeys-lru ``` ### Docker Compose Configuration (SQLite) ```yaml # docker-compose.sqlite.yml version: '3.8' services: quickdid: image: quickdid:latest environment: HTTP_EXTERNAL: quickdid.example.com HTTP_PORT: 
8080 SQLITE_URL: sqlite:/data/quickdid.db CACHE_TTL_MEMORY: 600 CACHE_TTL_SQLITE: 86400 QUEUE_ADAPTER: sqlite QUEUE_BUFFER_SIZE: 5000 QUEUE_SQLITE_MAX_SIZE: 10000 JETSTREAM_ENABLED: true JETSTREAM_HOSTNAME: jetstream.atproto.tools RUST_LOG: info ports: - "8080:8080" volumes: - quickdid-data:/data volumes: quickdid-data: driver: local ``` ## Validation Rules QuickDID validates configuration at startup. The following rules are enforced: ### Required Fields 1. **HTTP_EXTERNAL**: Must be provided ### Value Constraints 1. **TTL Values** (`CACHE_TTL_MEMORY`, `CACHE_TTL_REDIS`, `CACHE_TTL_SQLITE`): - Must be positive integers (> 0) - Recommended minimum: 60 seconds 2. **Timeout Values** (`QUEUE_REDIS_TIMEOUT`): - Must be positive integers (> 0) - Recommended range: 1-60 seconds 3. **Queue Adapter** (`QUEUE_ADAPTER`): - Must be one of: `mpsc`, `redis`, `sqlite`, `noop`, `none` - Case-sensitive 4. **Rate Limiting** (`RESOLVER_MAX_CONCURRENT`): - Must be between 0 and 10000 - 0 = disabled (default) - Values > 10000 will fail validation 5. **Rate Limiting Timeout** (`RESOLVER_MAX_CONCURRENT_TIMEOUT_MS`): - Must be between 0 and 60000 (milliseconds) - 0 = no timeout (default) - Values > 60000 will fail validation 6. **Port** (`HTTP_PORT`): - Must be valid port number (1-65535) - Ports < 1024 require elevated privileges ### Validation Errors If validation fails, QuickDID will exit with one of these error codes: - `error-quickdid-config-1`: Missing required environment variable - `error-quickdid-config-2`: Invalid configuration value - `error-quickdid-config-3`: Invalid TTL value (must be positive) - `error-quickdid-config-4`: Invalid timeout value (must be positive) ### Testing Configuration Test your configuration without starting the service: ```bash # Validate configuration HTTP_EXTERNAL=test quickdid --help # Test with specific values CACHE_TTL_MEMORY=0 quickdid --help # Will fail validation # Check parsed configuration (with debug logging) RUST_LOG=debug HTTP_EXTERNAL=test quickdid ``` ## Best Practices ### Security 1. Use environment-specific configuration management 2. Use TLS for Redis connections in production (`rediss://`) 3. Never commit sensitive configuration to version control 4. Implement network segmentation for Redis access ### Performance 1. **With Redis**: Use lower memory cache TTL (300-600s) 2. **With SQLite**: Use moderate memory cache TTL (600-1800s) 3. **Without persistent cache**: Use higher memory cache TTL (1800-3600s) 4. **High traffic**: Increase QUEUE_BUFFER_SIZE (5000-10000) 5. **Multi-region**: Use region-specific QUEUE_WORKER_ID ### Caching and Queue Strategy 1. **Multi-instance/HA deployments**: Use Redis for distributed caching and queuing 2. **Single-instance deployments**: Use SQLite for persistent caching and queuing 3. **Development/testing**: Use memory-only caching with MPSC queuing 4. **Hybrid setups**: Configure both Redis and SQLite for redundancy 5. **Real-time sync**: Enable Jetstream consumer for live cache updates 6. **Queue adapter guidelines**: - Redis: Best for multi-instance deployments with distributed processing - SQLite: Best for single-instance deployments needing persistence - MPSC: Best for single-instance deployments without persistence needs 7. **Cache TTL guidelines**: - Redis: Shorter TTLs (1-7 days) for frequently updated handles - SQLite: Longer TTLs (7-90 days) for stable single-instance caching - Memory: Short TTLs (5-30 minutes) as fallback 8.
**Jetstream guidelines**: - Production: Enable for real-time cache synchronization - High-traffic: Essential for reducing stale data - Development: Can be disabled for simpler testing - Monitor WebSocket connection health in production ### Monitoring 1. Set descriptive QUEUE_WORKER_ID for log correlation 2. Use structured logging with appropriate RUST_LOG levels 3. Monitor Redis memory usage and adjust TTLs accordingly 4. Track cache hit rates to optimize TTL values ### Deployment 1. Use `.env` files for local development 2. Use secrets management for production configurations 3. Set resource limits in container orchestration 4. Use health checks to monitor service availability 5. Implement gradual rollouts with feature flags 6. **SQLite deployments**: Ensure persistent volume for database file 7. **Redis deployments**: Configure Redis persistence and backup 8. **Hybrid deployments**: Test fallback scenarios (Redis unavailable)
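For items 4 and 8 above, a simple external probe is usually enough. This is a sketch assuming a local deployment on the default port, using the landing page as the probe target and the `redis` service name from the Docker Compose example; adjust all three to match your deployment.

```bash
# Basic liveness probe
curl -fsS "http://localhost:8080/" > /dev/null && echo "quickdid is responding"

# Exercise the Redis-unavailable fallback in a hybrid setup: stop Redis, confirm the service still answers
docker compose stop redis
curl -fsS "http://localhost:8080/" > /dev/null && echo "fallback OK"
docker compose start redis
```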