Security Best Practices#

Overview#

This document outlines security guidelines for implementing and deploying the ATP Keyserver encryption protocol. Following these practices helps protect user data and credentials. Note that these are general guidelines and should be adapted to specific use cases.

Threat Model#

Assets to Protect#

1. Encryption Keys

32-byte symmetric keys for group encryption
64-byte asymmetric private keys for signing
Historical key versions for decrypting old content

2. Service Auth Tokens

Short-lived JWT tokens for keyserver authentication
Signed with user's ATProto signing key
Valid for 60 seconds (default)

3. Plaintext Content

Decrypted message content
User-generated data before encryption
Temporary plaintexts during crypto operations

Threat Actors#

External Attackers:

Network eavesdroppers (mitigated by HTTPS)
Malicious relays or PDSes (cannot decrypt)
Compromised keyserver (never sees plaintext)

Insider Threats:

Keyserver operator (only sees encrypted content and key requests)
PDS operator (only sees encrypted content and issues auth tokens)
Relay operator (only sees encrypted content)

Client-Side Threats:

Malware on user device
Memory dumping attacks
Credential theft from local storage
XSS attacks in web applications

Attack Scenarios#

Compromised Encryption Key:

Impact: All content encrypted with that key version exposed
Mitigation: Key rotation limits exposure to content encrypted before the rotation; new content uses a new key version
Detection: Access logs show unusual patterns

Stolen Service Auth Token:

Impact: Attacker can request keys until the token expires (60 seconds by default)
Mitigation: Short token lifetime limits window
Detection: Access logs show unexpected IP/user-agent

Man-in-the-Middle:

Impact: Attacker intercepts HTTP traffic
Mitigation: HTTPS required for all communications
Detection: Certificate pinning (optional)

Memory Extraction:

Impact: Keys and plaintexts in memory exposed
Mitigation: Minimize key lifetime, zero memory after use
Detection: Difficult to detect

Security Properties#

End-to-End Encryption#

What it provides:

Only authorized group members can decrypt content
Keyserver never sees plaintext
PDS never sees plaintext
Relays never see plaintext

What it doesn't provide:

Forward secrecy (old keys retained for compatibility)
Protection from compromised endpoints
Protection from malicious clients

Service Auth Token Security#

Audience Binding:

Token aud claim specifies intended keyserver
Keyserver MUST reject tokens with wrong aud
Prevents token reuse across services

Method Binding (Optional):

Token lxm claim restricts to specific endpoint
Recommended for sensitive operations (rotation, member management)
Keyserver MUST verify lxm if present

Short-Lived Tokens:

Default 60 second expiry
Compromised token quickly becomes useless
Limits window for stolen token abuse

Signature Verification:

Signed with user's ATProto signing key
Keyserver verifies signature using DID document
No shared secrets between PDS and keyserver
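The claim checks described above can be sketched as a small pure function. This is an illustration under assumptions, not the keyserver's actual code: the `ServiceAuthClaims` shape, the function name, and the rejection reasons are hypothetical, and signature verification against the DID document is assumed to have already happened before the claims are trusted.

```typescript
// Hypothetical shape of a decoded service auth token payload.
// Field names follow the aud / lxm / exp claims described above;
// iss is assumed to carry the DID of the signing user.
interface ServiceAuthClaims {
  iss: string
  aud: string   // DID of the intended keyserver
  exp: number   // expiry, seconds since epoch
  lxm?: string  // optional method binding
}

type ClaimCheck = { ok: true } | { ok: false; reason: string }

// Checks claims only. The token signature must be verified separately.
function checkServiceAuthClaims(
  claims: ServiceAuthClaims,
  expectedAud: string,
  requestedMethod: string,
  nowSeconds: number
): ClaimCheck {
  if (claims.aud !== expectedAud) {
    return { ok: false, reason: 'wrong audience' }
  }
  if (claims.exp <= nowSeconds) {
    return { ok: false, reason: 'token expired' }
  }
  // lxm is optional, but when present it must match the called endpoint
  if (claims.lxm !== undefined && claims.lxm !== requestedMethod) {
    return { ok: false, reason: 'method not permitted' }
  }
  return { ok: true }
}
```

Passing the current time in as a parameter keeps the check deterministic and easy to test.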

Key Versioning Security#

Purpose: Limits damage from compromised keys by rotating to new versions.

Properties:

Old keys remain accessible (backward compatibility)
New content uses new key
Attacker with old key cannot decrypt new content
Attacker with old key CAN decrypt old content (by design)

Trade-off: Forward secrecy is sacrificed so that old content stored across ATProto's distributed network remains decryptable.
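The versioning properties can be illustrated with a minimal in-memory key ring. The class name and the choice to number versions from 1 are assumptions made for this sketch; a real keyserver would back this with persistent storage.

```typescript
// Hypothetical in-memory key ring illustrating versioned keys.
class VersionedKeyRing {
  private versions: Uint8Array[] = []

  // Adds a new key and returns its version number (assumed to start at 1).
  rotate(newKey: Uint8Array): number {
    this.versions.push(newKey)
    return this.versions.length
  }

  // Version that new content should be encrypted with.
  currentVersion(): number {
    return this.versions.length
  }

  // Old versions stay retrievable so existing content remains decryptable.
  get(version: number): Uint8Array | undefined {
    return this.versions[version - 1]
  }
}
```

After a rotation, `currentVersion()` points at the new key while `get(1)` still returns the old one, which is exactly why an attacker holding an old key can read old content but not new content.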

Client Implementation Guidelines#

Memory Management#

Encryption Keys:

DO:
- Store keys in memory only
- Clear keys from memory after use
- Use typed arrays (Uint8Array) for key material
- Zero memory before garbage collection (if possible)

DON'T:
- Persist keys to localStorage/sessionStorage
- Log keys to console or files
- Store keys in plain strings (use Uint8Array)
- Keep keys in memory longer than necessary
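One way to follow the "clear keys after use" guidance is to scope key usage to a callback and zero the bytes afterwards. The helper name `withKey` is hypothetical, and zeroing is best-effort in JavaScript because the runtime may hold copies the program cannot reach.

```typescript
// Runs fn with the key, then overwrites the key bytes even if fn throws.
// Intended for synchronous work only: if fn returns a Promise, the key is
// zeroed before the asynchronous work completes.
function withKey<T>(key: Uint8Array, fn: (key: Uint8Array) => T): T {
  try {
    return fn(key)
  } finally {
    key.fill(0)
  }
}
```

This only works because the key is held in a `Uint8Array`; a key stored as a string could not be overwritten in place.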

Service Auth Tokens:

DO:
- Store tokens in memory only
- Clear tokens on logout
- Refresh tokens before expiry
- Use separate cache keys per service

DON'T:
- Persist tokens to disk
- Log tokens to console or files
- Share tokens between services
- Reuse expired tokens

Plaintext Content:

DO:
- Process plaintext immediately
- Drop references to sensitive strings after use (JavaScript strings are immutable and cannot be zeroed)
- Minimize time plaintext exists in memory

DON'T:
- Log plaintext content
- Store plaintext alongside ciphertext
- Keep plaintext in global variables
- Display plaintext in debug messages

Token Caching#

Service Auth Token Cache:

// GOOD: Memory-only cache with proper expiry
class ServiceAuthCache {
  private tokens = new Map<string, {
    token: string
    expiresAt: number
  }>()

  getToken(aud: string): string | null {
    const cached = this.tokens.get(aud)
    const now = Math.floor(Date.now() / 1000)

    // Refresh if <10 seconds remain
    if (cached && cached.expiresAt > now + 10) {
      return cached.token
    }

    return null
  }

  // Clear on logout
  clear(): void {
    this.tokens.clear()
  }
}

Encryption Key Cache:

// GOOD: Memory-only cache with TTL
class KeyCache {
  private keys = new Map<string, {
    key: Uint8Array       // raw key bytes, never a plain string
    expiresAt: number     // milliseconds since epoch
    version: number
  }>()

  // Active keys: 1 hour TTL
  // Historical keys: 24 hour TTL
  // ttl is given in seconds
  set(groupId: string, key: Uint8Array, version: number, ttl: number): void {
    this.keys.set(`${groupId}:${version}`, {
      key,
      version,
      expiresAt: Date.now() + (ttl * 1000)
    })
  }

  // Clear on logout, zeroing key bytes before dropping them
  clear(): void {
    this.keys.forEach((entry) => entry.key.fill(0))
    this.keys.clear()
  }
}

Cache Configuration:

Active keys: 1 hour TTL (balance freshness vs performance)
Historical keys: 24 hour TTL (immutable, safe to cache longer)
Service auth tokens: 60 second TTL (or server-specified exp)
Max cache size: 1000 entries (prevent memory exhaustion)
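The cache classes above do not enforce the maximum size. A minimal sketch of a cache that applies both the TTL and the entry limit follows; the class name is hypothetical, and eviction here is by insertion order (oldest first), which is an assumption rather than a documented policy.

```typescript
// Hypothetical bounded cache: TTL per entry plus a hard cap on entry count.
class BoundedTtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>()
  private maxEntries: number
  private now: () => number

  // now is injectable so expiry can be tested without waiting
  constructor(maxEntries: number = 1000, now: () => number = () => Date.now()) {
    this.maxEntries = maxEntries
    this.now = now
  }

  set(key: string, value: V, ttlSeconds: number): void {
    // Deleting first moves a re-inserted key to the end of insertion order
    this.entries.delete(key)
    if (this.entries.size >= this.maxEntries) {
      // Map iterates in insertion order, so the first key is the oldest
      const oldest = this.entries.keys().next().value
      if (oldest !== undefined) this.entries.delete(oldest)
    }
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 })
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key)
    if (!entry) return undefined
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key)
      return undefined
    }
    return entry.value
  }

  get size(): number {
    return this.entries.size
  }
}
```

With the values from the configuration above, active keys would be stored with `ttlSeconds` of 3600 and historical keys with 86400.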

Error Handling#

Information Disclosure:

DO:
- Return generic error messages to UI
- Log detailed errors server-side only
- Sanitize error messages before display

DON'T:
- Include keys or tokens in error messages
- Expose stack traces to users
- Log plaintext content in errors
- Show decryption failures with key details

Error Message Examples:

// GOOD: Generic error for user
"Cannot decrypt message. You may no longer have access."

// BAD: Leaks information
"Decryption failed: key version 2 not found for group did:plc:abc123#followers"
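A small mapping function can keep internal detail out of the UI. This sketch assumes errors carry a numeric `statusCode` property, as in the key fetch examples elsewhere in this document; the function names and the fallback message are hypothetical.

```typescript
// Extracts a numeric statusCode from an unknown error value, if present.
function statusCodeOf(error: unknown): number | undefined {
  if (typeof error === 'object' && error !== null && 'statusCode' in error) {
    const code = (error as { statusCode: unknown }).statusCode
    return typeof code === 'number' ? code : undefined
  }
  return undefined
}

// Returns a generic message; the original error text is never passed through.
function toUserMessage(error: unknown): string {
  const code = statusCodeOf(error)
  if (code === 403 || code === 404) {
    return 'Cannot decrypt message. You may no longer have access.'
  }
  return 'Something went wrong. Please try again.'
}
```

The detailed error can still be logged where appropriate, as long as it contains no keys, tokens, or plaintext.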

HTTPS Requirements#

All communications MUST use HTTPS:

PDS to client
Keyserver to client
Token issuance
Key retrieval

Certificate Validation:

Verify certificate chain
Check certificate expiry
Validate hostname
Consider certificate pinning for sensitive deployments
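Certificate chain, expiry, and hostname checks are normally performed by the platform's TLS stack. What client code can add is a guard against accidentally configuring a plain HTTP endpoint. A minimal sketch, with a hypothetical function name:

```typescript
// Parses an endpoint URL and rejects anything that is not HTTPS.
function requireHttps(rawUrl: string): URL {
  const url = new URL(rawUrl) // throws TypeError on malformed input
  if (url.protocol !== 'https:') {
    throw new Error('Refusing non-HTTPS endpoint')
  }
  return url
}
```

Calling this once when a PDS or keyserver URL is configured means later requests cannot silently fall back to HTTP.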

Logout Procedures#

Complete cleanup on logout:

function logout() {
  // 1. Clear service auth token cache
  serviceAuthCache.clear()

  // 2. Clear encryption key cache
  keyCache.clear()

  // 3. Clear session tokens
  localStorage.removeItem('pds_session')

  // 4. Clear any plaintext in memory
  // (JavaScript GC will handle, but can null references)

  // 5. Redirect to login page
  window.location.href = '/login'
}

Server Deployment Guidelines#

Keyserver Security#

HTTPS Configuration:

Use TLS 1.2 or higher
Strong cipher suites only
Valid certificate from trusted CA
HSTS header enabled

DID Verification:

Validate all DIDs before storage/lookup
Only allow did:plc and did:web methods
Sanitize DID strings to prevent injection
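Allow-list validation is usually simpler and safer than trying to sanitize arbitrary input. The sketch below assumes that `did:plc` identifiers are 24 lowercase base32 characters and restricts `did:web` to a bare hostname with at least one dot and no path; the length cap and function name are also assumptions, so check them against the DID formats your deployment accepts.

```typescript
// Assumed formats (verify against the specifications you target):
// did:plc:<24 chars of a-z and 2-7>, did:web:<hostname with at least one dot>
const DID_PLC = /^did:plc:[a-z2-7]{24}$/
const DID_WEB = /^did:web:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?(?:\.[a-z0-9](?:[a-z0-9-]*[a-z0-9])?)+$/i

// Returns true only for DIDs matching one of the allowed patterns.
function isAllowedDid(input: string): boolean {
  // Cap length before running the regular expressions
  if (input.length > 2048) return false
  return DID_PLC.test(input) || DID_WEB.test(input)
}
```

Because anything outside the two patterns is rejected outright, characters such as quotes, slashes, or whitespace never reach storage or lookup code. Validated DIDs should still be passed to the database through parameterized queries.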

Database Security:

SQLite file permissions: 600 (owner read/write only)
Store database on encrypted filesystem
Regular database backups (encrypted)
WAL mode for better concurrency

Access Logging:

Log all key access requests
Include: timestamp, DID, key version, IP, user-agent
Monitor for unusual patterns
Alert on suspicious activity

Rate Limiting:

Limit requests per IP
Limit requests per DID
Implement exponential backoff
Block malicious actors
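Per-IP and per-DID limits can share one limiter keyed by a string. The following is a minimal fixed-window sketch with a hypothetical class name; it does not implement exponential backoff or blocking, and it never evicts old keys, so a production version would need periodic cleanup.

```typescript
// Hypothetical fixed-window rate limiter keyed by IP address or DID.
class FixedWindowLimiter {
  private windows = new Map<string, { start: number; count: number }>()
  private limit: number
  private windowMs: number
  private now: () => number

  // now is injectable so window rollover can be tested deterministically
  constructor(limit: number, windowMs: number, now: () => number = () => Date.now()) {
    this.limit = limit
    this.windowMs = windowMs
    this.now = now
  }

  // Returns true if the request should be allowed.
  allow(key: string): boolean {
    const t = this.now()
    let w = this.windows.get(key)
    if (!w || t - w.start >= this.windowMs) {
      // Start a fresh window for this key
      w = { start: t, count: 0 }
      this.windows.set(key, w)
    }
    if (w.count >= this.limit) return false
    w.count += 1
    return true
  }
}
```

A request would typically be checked twice, once with the client IP and once with the authenticated DID, and rejected if either check fails.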

Environment Variables#

Required:

DID: Keyserver's own DID (public)

Optional:

PORT: HTTP port (default 4000)

Never Include:

Private keys (generated and stored in database)
PDS credentials (keyserver doesn't need them)
Admin passwords (keyserver has no admin interface)
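Reading the configuration through a single validating function makes a missing `DID` or a malformed `PORT` fail at startup rather than at request time. This is a sketch with hypothetical names; the environment is passed in as a plain object so the function can be tested without touching the real process environment.

```typescript
interface KeyserverConfig {
  did: string
  port: number
}

// env is a plain string map; in Node this would be process.env.
function loadConfig(env: Record<string, string | undefined>): KeyserverConfig {
  const did = env['DID']
  if (!did) {
    throw new Error('DID environment variable is required')
  }
  const rawPort = env['PORT']
  // PORT is optional and defaults to 4000
  const port = rawPort === undefined ? 4000 : Number(rawPort)
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error('PORT must be an integer between 1 and 65535')
  }
  return { did, port }
}
```

Note that the function reads only `DID` and `PORT`, in line with the guidance that private keys and credentials do not belong in the environment.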

Monitoring#

Key Metrics:

Request rate per endpoint
Error rate by type
Key access patterns per DID
Token verification failures
Database size growth

Alerts:

Unusual access patterns (e.g., 10,000 requests/hour from single DID)
High error rates (potential attack)
Database failures
Certificate expiry warnings
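The per-DID alert can be sketched as a scan over recent access log entries. The entry shape and function name are hypothetical, and the threshold is left to the caller (the 10,000 requests per hour figure above is one example).

```typescript
// Hypothetical minimal access log entry.
interface AccessLogEntry {
  did: string
  timestampMs: number
}

// Returns the DIDs with at least `threshold` requests in the hour before nowMs.
function findNoisyDids(
  entries: AccessLogEntry[],
  nowMs: number,
  threshold: number
): string[] {
  const hourAgo = nowMs - 60 * 60 * 1000
  const counts = new Map<string, number>()
  for (const entry of entries) {
    if (entry.timestampMs > hourAgo && entry.timestampMs <= nowMs) {
      counts.set(entry.did, (counts.get(entry.did) ?? 0) + 1)
    }
  }
  const noisy: string[] = []
  counts.forEach((count, did) => {
    if (count >= threshold) noisy.push(did)
  })
  return noisy
}
```

In a real deployment this aggregation would more likely run as a database query or in the monitoring system than in application memory.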

Common Pitfalls#

Don't Log Sensitive Data#

// BAD: Logs encryption key
console.log('Encrypting with key:', secretKey)

// BAD: Logs service auth token
console.log('Auth token:', authToken)

// BAD: Logs plaintext
console.log('Decrypted content:', plaintext)

// GOOD: Logs only non-sensitive info
console.log('Encrypting message for group:', groupId)
console.log('Using key version:', version)

Don't Persist Keys Without Encryption#

// BAD: Keys in localStorage
localStorage.setItem('encryption_keys', JSON.stringify(keys))

// BAD: Keys in cookies
document.cookie = `key=${secretKey}`

// GOOD: Memory-only storage, with key bytes held in a Uint8Array
const keyCache = new Map<string, Uint8Array>()

Don't Skip Version Validation#

// BAD: Assumes key_version exists
const version = post.key_version
fetchKey(groupId, version) // version is undefined if key_version is missing!

// GOOD: Validates before use
if (!Number.isInteger(post.key_version)) {
  throw new Error('Missing or invalid key version')
}
const version = post.key_version
fetchKey(groupId, version)

Don't Assume Key Fetch Succeeds#

// BAD: No error handling
const key = await fetchKey(groupId)
decrypt(ciphertext, key)

// GOOD: Handles all error cases
try {
  const key = await fetchKey(groupId)
  const plaintext = decrypt(ciphertext, key)
  return plaintext
} catch (error) {
  if (error.statusCode === 403) {
    // User lost access
    return null
  } else if (error.statusCode === 404) {
    // Group deleted
    return null
  } else {
    // Network or server error
    throw error
  }
}

Don't Retry Decryption Infinitely#

// BAD: Infinite retry on decryption failure
while (true) {
  try {
    return decrypt(ciphertext, key)
  } catch (error) {
    // Retries forever!
  }
}

// GOOD: No retry for decryption failures
try {
  return decrypt(ciphertext, key)
} catch (error) {
  // Decryption failure is permanent
  // Could be corrupted data or wrong key
  throw new Error('Cannot decrypt message')
}

Don't Reuse Expired Tokens#

// BAD: Doesn't check expiry
function getToken(): string {
  return cachedToken
}

// GOOD: Checks expiry before returning
function getToken(): string | null {
  const now = Math.floor(Date.now() / 1000)
  if (cachedToken && cachedToken.expiresAt > now + 10) {
    return cachedToken.token
  }
  return null // Needs refresh
}

Don't Share Tokens Between Services#

// BAD: Uses same token for different services
const token = await getServiceAuth({ aud: 'did:web:service1.com' })
// Uses token for different service!
fetch('https://service2.com/api', {
  headers: { Authorization: `Bearer ${token}` }
})

// GOOD: Separate tokens per service
const token1 = await getServiceAuth({ aud: 'did:web:service1.com' })
const token2 = await getServiceAuth({ aud: 'did:web:service2.com' })

Security Audit Checklist#

Client Implementation#

Keys stored in memory only, never persisted to disk
Service auth tokens stored in memory only
Keys and tokens cleared on logout
HTTPS used for all network requests
Certificate validation enabled
Error messages don't leak sensitive data
No keys or tokens logged to console/files
Proper cache TTLs configured (1h active, 24h historical)
Service auth tokens refreshed before expiry
Audience binding verified for all tokens
Version validation before key fetch
Graceful error handling for authorization failures
No infinite retry loops for decryption
AT-URI used as AAD for all encryption/decryption
Nonces generated with cryptographically secure randomness

Server Implementation#

HTTPS with valid certificate
TLS 1.2 or higher
Strong cipher suites only
HSTS header enabled
DID validation on all inputs
SQLite file permissions set to 600
Database on encrypted filesystem
Regular encrypted backups
Access logging enabled
Rate limiting configured
Monitoring and alerting in place
JWT audience verification enforced
JWT method binding enforced (if present)
JWT signature verification working
DID resolution caching enabled
Error messages don't leak sensitive data

Infrastructure#

Reverse proxy configured (nginx/caddy)
SSL termination at proxy
Firewall rules limiting access
Server in secure datacenter
Regular security updates applied
Incident response plan documented
Backup and recovery procedures tested
Monitoring dashboards configured
Log aggregation and analysis in place

Incident Response#

Suspected Key Compromise#

Detection:

Unusual access patterns in logs
User reports unauthorized decryption
Security researcher disclosure

Response:

Rotate the compromised key immediately
Notify affected users
Review access logs for suspicious activity
Document incident for post-mortem
Update security controls to prevent recurrence

Suspected Token Theft#

Detection:

Access from unexpected IP/location
Multiple simultaneous sessions
User reports unauthorized activity

Response:

Revoke user's PDS session (forces new auth)
Token expires in ≤60 seconds (automatic mitigation)
Review access logs
Notify user if suspicious activity confirmed

Data Breach#

Server compromise:

Encrypted content exposure: LOW risk (server only stores keys, not content)
Key database exposure: HIGH risk (all keys compromised)

Response:

Immediately take keyserver offline
Rotate ALL keys in system
Notify all users
Conduct forensic analysis
Update security controls
Gradual restoration after security verified

Vulnerability Disclosure#

Responsible Disclosure Process:

Acknowledge receipt within 24 hours
Assess severity and impact
Develop and test fix
Coordinate disclosure timeline with reporter
Release fix before public disclosure
Credit reporter (if desired)

Compliance Considerations#

Data Protection Regulations#

GDPR (EU) Compliance:

The keyserver implements GDPR Article 17 (Right to Erasure) through cryptographic erasure:

Personal Data Classification:
- Encryption keys are personal data
- DIDs are personal identifiers
- Access logs contain IP addresses (personal data)
- Group memberships are personal data

Right to Deletion (Article 17):
- Users can delete their account via /xrpc/dev.atpkeyserver.alpha.account.delete
- Requires explicit confirmation: "DELETE_ALL_MY_DATA"
- Deletes all keys, groups, memberships, and access logs
- Implements cryptographic erasure (explained below)

Cryptographic Erasure:

Concept: Encrypted data without keys is computationally infeasible to decrypt, effectively rendering it useless random bytes.

Legal Basis: EU data protection authorities recognize cryptographic erasure as effective deletion under GDPR:
- Cloud providers (AWS, Azure, Google Cloud) use this approach
- Ciphertext without keys is no longer "personal data"
- Satisfies right to erasure while preserving network integrity

Implementation:
- Keyserver deletes all key versions (personal and group keys)
- Clients should delete encrypted posts from PDS (ATProto deletion protocol)
- Deletion notices may be honored by relays (not guaranteed)
- Even if ciphertext persists, it becomes permanently unreadable without keys
- No technical means to recover deleted keys
- Irreversible process (as required by GDPR)

Access Log Retention:
- 90-day automatic retention policy (configurable)
- Logs cleaned up on server startup
- Balances security monitoring with privacy requirements
- Can be adjusted based on jurisdiction (30-180 days)

Data Minimization:
- Only essential data collected (DIDs, keys, timestamps)
- No unnecessary personal information stored
- IP addresses in logs are optional (can be disabled)
- User agents logged for security, not tracking

Breach Notification:
- Required within 72 hours under GDPR
- Users should be notified if keys are compromised
- Access logs help identify affected users
- Rotation functionality limits damage
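The access log retention policy described above (90 days by default, configurable) amounts to a cutoff filter. A minimal sketch over an in-memory list follows; the entry shape and function name are hypothetical, and a real keyserver would more likely issue an equivalent delete against its database at startup.

```typescript
// Hypothetical minimal log entry; only the timestamp matters for retention.
interface TimestampedEntry {
  timestampMs: number
}

// Returns only the entries that are still within the retention period.
function pruneOldLogs<T extends TimestampedEntry>(
  entries: T[],
  nowMs: number,
  retentionDays: number = 90
): T[] {
  const cutoff = nowMs - retentionDays * 24 * 60 * 60 * 1000
  return entries.filter((entry) => entry.timestampMs >= cutoff)
}
```

Passing a different `retentionDays` value covers the jurisdiction-specific range mentioned above.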

CCPA (California) Compliance:

Users have right to know what data is collected (documented in API)
Users have right to deletion (account deletion endpoint)
Security practices must be documented (this document)
No sale of personal data (keys never shared)

Industry Standards#

OWASP Top 10:

Protect against injection (DID validation)
Proper authentication (service auth)
Sensitive data exposure (HTTPS, no logging)
Access control (group membership checks)
Security misconfiguration (secure defaults)

NIST Guidelines:

Use approved cryptographic algorithms (XChaCha20-Poly1305, Ed25519)
Key management best practices (versioning, rotation)
Secure communication (HTTPS, TLS 1.2+)
Access control (JWT authentication)

Resources#

Security Contact#

For security issues, please contact:

Email: [To be configured by deployment team]
PGP Key: [To be configured by deployment team]
Disclosure Policy: Responsible disclosure preferred

Do not disclose security vulnerabilities publicly without coordination.
