Secure storage and distribution of cryptographic keys in ATProto applications

Security Best Practices#

Overview#

This document outlines security guidelines for implementing and deploying the ATP Keyserver encryption protocol. Following these practices helps protect user data and credentials. Note that these are general guidelines and should be adapted to specific use cases.

Threat Model#

Assets to Protect#

1. Encryption Keys

  • 32-byte symmetric keys for group encryption
  • 64-byte asymmetric private keys for signing
  • Historical key versions for decrypting old content

2. Service Auth Tokens

  • Short-lived JWT tokens for keyserver authentication
  • Signed with user's ATProto signing key
  • Valid for 60 seconds (default)

3. Plaintext Content

  • Decrypted message content
  • User-generated data before encryption
  • Temporary plaintexts during crypto operations

Threat Actors#

External Attackers:

  • Network eavesdroppers (mitigated by HTTPS)
  • Malicious relays or PDSes (cannot decrypt)
  • Compromised keyserver (never sees plaintext)

Insider Threats:

  • Keyserver operator (only sees encrypted content and key requests)
  • PDS operator (only sees encrypted content and issues auth tokens)
  • Relay operator (only sees encrypted content)

Client-Side Threats:

  • Malware on user device
  • Memory dumping attacks
  • Credential theft from local storage
  • XSS attacks in web applications

Attack Scenarios#

Compromised Encryption Key:

  • Impact: All content encrypted with that key version exposed
  • Mitigation: Key rotation confines the damage to content encrypted before the rotation; content encrypted with newer key versions stays protected
  • Detection: Access logs show unusual patterns

Stolen Service Auth Token:

  • Impact: Attacker can request keys until the token expires (up to 60 seconds by default)
  • Mitigation: Short token lifetime limits window
  • Detection: Access logs show unexpected IP/user-agent

Man-in-the-Middle:

  • Impact: Attacker intercepts HTTP traffic
  • Mitigation: HTTPS required for all communications
  • Detection: Certificate validation failures; certificate pinning (optional) makes forged certificates easier to catch

Memory Extraction:

  • Impact: Keys and plaintexts in memory exposed
  • Mitigation: Minimize key lifetime, zero memory after use
  • Detection: Difficult to detect

Security Properties#

End-to-End Encryption#

What it provides:

  • Only authorized group members can decrypt content
  • Keyserver never sees plaintext
  • PDS never sees plaintext
  • Relays never see plaintext

What it doesn't provide:

  • Forward secrecy (old keys retained for compatibility)
  • Protection from compromised endpoints
  • Protection from malicious clients

Service Auth Token Security#

Audience Binding:

  • Token aud claim specifies intended keyserver
  • Keyserver MUST reject tokens with wrong aud
  • Prevents token reuse across services

Method Binding (Optional):

  • Token lxm claim restricts to specific endpoint
  • Recommended for sensitive operations (rotation, member management)
  • Keyserver MUST verify lxm if present

Short-Lived Tokens:

  • Default 60 second expiry
  • Compromised token quickly becomes useless
  • Limits window for stolen token abuse

Signature Verification:

  • Signed with user's ATProto signing key
  • Keyserver verifies signature using DID document
  • No shared secrets between PDS and keyserver
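
The audience, method, and expiry checks above can be sketched as one validation step that runs after signature verification. This is a minimal sketch, not the keyserver's actual code: `ServiceAuthClaims`, `validateClaims`, and the example DID and method names are hypothetical, and verifying the signature against the DID document is assumed to have happened already.

```typescript
// Hypothetical shape of an already signature-verified service auth payload.
interface ServiceAuthClaims {
  iss: string   // DID of the user the token was issued for
  aud: string   // DID of the service the token is intended for
  exp: number   // expiry, Unix time in seconds
  lxm?: string  // optional: the single method the token is bound to
}

type ClaimCheck = { ok: true } | { ok: false; reason: string }

// Checks audience, expiry, and (if present) method binding.
// nowSeconds is a parameter so the check is easy to test.
function validateClaims(
  claims: ServiceAuthClaims,
  expectedAud: string,
  requestedMethod: string,
  nowSeconds: number = Math.floor(Date.now() / 1000)
): ClaimCheck {
  if (claims.aud !== expectedAud) {
    return { ok: false, reason: 'wrong audience' }
  }
  if (claims.exp <= nowSeconds) {
    return { ok: false, reason: 'token expired' }
  }
  if (claims.lxm !== undefined && claims.lxm !== requestedMethod) {
    return { ok: false, reason: 'method not permitted' }
  }
  return { ok: true }
}
```

Returning a reason string keeps the rejection cause available for server-side logs while the HTTP response can stay generic.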

Key Versioning Security#

Purpose: Limits damage from compromised keys by rotating to new versions.

Properties:

  • Old keys remain accessible (backward compatibility)
  • New content uses new key
  • Attacker with old key cannot decrypt new content
  • Attacker with old key CAN decrypt old content (by design)

Trade-off: Forward secrecy sacrificed for ATProto's distributed storage model.
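
To illustrate these properties, the sketch below keeps every key version, encrypts with the newest one, and looks up the recorded version for decryption. `KeyRing` and its methods are hypothetical names for illustration and are not part of the keyserver API; the sketch also assumes each new key is generated independently at random, since a key derived from its predecessor would not protect new content.

```typescript
// Hypothetical in-memory key ring: one entry per key version of a group.
class KeyRing {
  private versions = new Map<number, Uint8Array>()
  private current = 0

  // Rotation adds a new version; older versions stay available.
  rotate(newKey: Uint8Array): number {
    this.current += 1
    this.versions.set(this.current, newKey)
    return this.current
  }

  // New content is always encrypted with the latest version.
  keyForEncryption(): { version: number; key: Uint8Array } {
    const key = this.versions.get(this.current)
    if (!key) throw new Error('No key available')
    return { version: this.current, key }
  }

  // Old content is decrypted with the version recorded alongside it.
  keyForDecryption(version: number): Uint8Array {
    const key = this.versions.get(version)
    if (!key) throw new Error('Unknown key version')
    return key
  }
}
```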

Client Implementation Guidelines#

Memory Management#

Encryption Keys:

DO:
- Store keys in memory only
- Clear keys from memory after use
- Use typed arrays (Uint8Array) for key material
- Zero memory before garbage collection (if possible)

DON'T:
- Persist keys to localStorage/sessionStorage
- Log keys to console or files
- Store keys in plain strings (use Uint8Array)
- Keep keys in memory longer than necessary
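
One way to keep key lifetime short is to scope the key to a single callback and overwrite the bytes afterwards. The helper below is a sketch under stated assumptions: `withKey` is a hypothetical name, and zeroing a `Uint8Array` only wipes this one buffer, not any copies the runtime or other code may have made.

```typescript
// Runs `use` with the key, then overwrites the key bytes with zeros,
// even if `use` throws. `use` must be synchronous: an async callback
// would find the key zeroed before it finishes.
function withKey<T>(key: Uint8Array, use: (key: Uint8Array) => T): T {
  try {
    return use(key)
  } finally {
    key.fill(0)
  }
}
```

After the call returns, the caller's buffer holds only zeros, so the key cannot be reused by accident.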

Service Auth Tokens:

DO:
- Store tokens in memory only
- Clear tokens on logout
- Refresh tokens before expiry
- Use separate cache keys per service

DON'T:
- Persist tokens to disk
- Log tokens to console or files
- Share tokens between services
- Reuse expired tokens

Plaintext Content:

DO:
- Process plaintext immediately
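- Drop references to sensitive strings after use (JavaScript strings are immutable and cannot be zeroed; prefer Uint8Array for data that must be wiped)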
- Minimize time plaintext exists in memory

DON'T:
- Log plaintext content
- Store plaintext alongside ciphertext
- Keep plaintext in global variables
- Display plaintext in debug messages

Token Caching#

Service Auth Token Cache:

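// GOOD: Memory-only cache with proper expiry
class ServiceAuthCache {
  // Keyed by audience (the service DID) so tokens are never shared across services
  private tokens = new Map<string, {
    token: string
    expiresAt: number // Unix time in seconds (the JWT exp claim)
  }>()

  setToken(aud: string, token: string, expiresAt: number): void {
    this.tokens.set(aud, { token, expiresAt })
  }

  getToken(aud: string): string | null {
    const cached = this.tokens.get(aud)
    const now = Math.floor(Date.now() / 1000)

    // Reuse only if more than 10 seconds remain; otherwise the caller must refresh
    if (cached && cached.expiresAt > now + 10) {
      return cached.token
    }

    // Drop the stale entry so expired tokens do not linger in memory
    this.tokens.delete(aud)
    return null
  }

  // Clear on logout
  clear(): void {
    this.tokens.clear()
  }
}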

Encryption Key Cache:

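// GOOD: Memory-only cache with TTL
class KeyCache {
  private keys = new Map<string, {
    key: Uint8Array // raw key bytes, never a string (see Memory Management)
    expiresAt: number // Unix time in milliseconds
    version: number
  }>()

  // Active keys: 1 hour TTL
  // Historical keys: 24 hour TTL
  // ttl is given in seconds
  set(groupId: string, key: Uint8Array, version: number, ttl: number): void {
    this.keys.set(`${groupId}:${version}`, {
      key,
      version,
      expiresAt: Date.now() + (ttl * 1000)
    })
  }

  // Returns null on a miss or once the TTL has elapsed
  get(groupId: string, version: number): Uint8Array | null {
    const id = `${groupId}:${version}`
    const entry = this.keys.get(id)
    if (!entry) return null
    if (entry.expiresAt <= Date.now()) {
      entry.key.fill(0)
      this.keys.delete(id)
      return null
    }
    return entry.key
  }

  // Clear on logout, zeroing key bytes before dropping them
  clear(): void {
    for (const entry of this.keys.values()) entry.key.fill(0)
    this.keys.clear()
  }
}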

Cache Configuration:

  • Active keys: 1 hour TTL (balance freshness vs performance)
  • Historical keys: 24 hour TTL (immutable, safe to cache longer)
  • Service auth tokens: 60 second TTL (or server-specified exp)
  • Max cache size: 1000 entries (prevent memory exhaustion)
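
The entry cap in the configuration above can be enforced in several ways. A minimal one, assuming oldest-first eviction is acceptable, relies on `Map` preserving insertion order. `BoundedCache` is a hypothetical helper for illustration, not part of the protocol.

```typescript
// Hypothetical size-capped cache: evicts the oldest entry once full.
class BoundedCache<V> {
  private entries = new Map<string, V>()
  private readonly maxEntries: number

  constructor(maxEntries: number = 1000) {
    this.maxEntries = maxEntries
  }

  set(id: string, value: V): void {
    // Re-inserting moves the entry to the "newest" position.
    this.entries.delete(id)
    this.entries.set(id, value)
    if (this.entries.size > this.maxEntries) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value
      if (oldest !== undefined) this.entries.delete(oldest)
    }
  }

  get(id: string): V | undefined {
    return this.entries.get(id)
  }

  get size(): number {
    return this.entries.size
  }
}
```

When the cached values are key bytes, the evicted buffer should also be zeroed before it is dropped; that step is omitted here to keep the helper generic.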

Error Handling#

Information Disclosure:

DO:
- Return generic error messages to UI
- Log detailed errors server-side only
- Sanitize error messages before display

DON'T:
- Include keys or tokens in error messages
- Expose stack traces to users
- Log plaintext content in errors
- Show decryption failures with key details

Error Message Examples:

// GOOD: Generic error for user
"Cannot decrypt message. You may no longer have access."

// BAD: Leaks information
"Decryption failed: key version 2 not found for group did:plc:abc123#followers"
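
One way to keep details out of the UI is to translate internal errors into a fixed set of user-facing messages and send the specifics to a separate sink. In this sketch, `toUserMessage`, the `FetchError` type, the `report` callback, and the fallback message are all hypothetical; treating 403 and 404 as loss of access mirrors the key fetch example in the Common Pitfalls section.

```typescript
// Hypothetical error shape: an Error that may carry an HTTP status code.
type FetchError = Error & { statusCode?: number }

// Returns a generic message for display and passes the detail to `report`,
// which should write to a log that users cannot see. The detail itself
// must still never contain keys, tokens, or plaintext.
function toUserMessage(
  error: FetchError,
  report: (detail: string) => void
): string {
  report(`${error.name}: ${error.message}`)
  if (error.statusCode === 403 || error.statusCode === 404) {
    return 'Cannot decrypt message. You may no longer have access.'
  }
  return 'Something went wrong. Please try again later.'
}
```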

HTTPS Requirements#

All communications MUST use HTTPS:

  • PDS to client
  • Keyserver to client
  • Token issuance
  • Key retrieval

Certificate Validation:

  • Verify certificate chain
  • Check certificate expiry
  • Validate hostname
  • Consider certificate pinning for sensitive deployments
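
Certificate chain, expiry, and hostname checks are normally performed by the browser or the runtime's TLS stack, so the main thing application code can add is refusing non-HTTPS endpoints before any request is made. A minimal sketch, with `assertHttps` as a hypothetical helper name:

```typescript
// Throws unless the endpoint is an absolute https:// URL.
function assertHttps(endpoint: string): URL {
  const url = new URL(endpoint) // throws on malformed input
  if (url.protocol !== 'https:') {
    throw new Error('Refusing non-HTTPS endpoint')
  }
  return url
}
```

Calling this once where the PDS and keyserver URLs are configured keeps a stray `http://` value from ever reaching the network layer.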

Logout Procedures#

Complete cleanup on logout:

function logout() {
  // 1. Clear service auth token cache
  serviceAuthCache.clear()

  // 2. Clear encryption key cache
  keyCache.clear()

  // 3. Clear session tokens
  localStorage.removeItem('pds_session')

  // 4. Drop references to any plaintext still held in memory
  // (the JavaScript GC reclaims it once nothing references it)

  // 5. Redirect to login page
  window.location.href = '/login'
}

Server Deployment Guidelines#

Keyserver Security#

HTTPS Configuration:

  • Use TLS 1.2 or higher
  • Strong cipher suites only
  • Valid certificate from trusted CA
  • HSTS header enabled

DID Verification:

  • Validate all DIDs before storage/lookup
  • Only allow did:plc and did:web methods
  • Sanitize DID strings to prevent injection
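
The DID rules above can be sketched as an allow-list check that rejects anything not matching a known-good pattern, rather than trying to repair bad input. The patterns are assumptions that should be checked against the current ATProto DID specification: `did:plc` is taken to be 24 lowercase base32 characters, and `did:web` is restricted to a lowercase hostname with no path or port. `isAllowedDid` is a hypothetical name.

```typescript
// Assumed did:plc format: 24 lowercase base32 characters (a-z, 2-7).
const DID_PLC = /^did:plc:[a-z2-7]{24}$/

// Assumed did:web format: a lowercase hostname only. This is stricter than
// the generic did:web method, which also allows paths and encoded ports.
const DID_WEB = /^did:web:(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$/

// Returns true only for strings matching one of the two allowed methods.
function isAllowedDid(input: unknown): input is string {
  if (typeof input !== 'string' || input.length > 2048) return false
  return DID_PLC.test(input) || DID_WEB.test(input)
}
```

A pattern check like this complements parameterized database queries; it does not replace them.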

Database Security:

  • SQLite file permissions: 600 (owner read/write only)
  • Store database on encrypted filesystem
  • Regular database backups (encrypted)
  • WAL mode for better concurrency

Access Logging:

  • Log all key access requests
  • Include: timestamp, DID, key version, IP, user-agent
  • Monitor for unusual patterns
  • Alert on suspicious activity

Rate Limiting:

  • Limit requests per IP
  • Limit requests per DID
  • Implement exponential backoff
  • Block malicious actors
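
Per-IP and per-DID limits can share one mechanism keyed by an arbitrary string. The sketch below uses a fixed window, which is one simple choice among several (token bucket and sliding window are common alternatives); `FixedWindowLimiter` is a hypothetical class, and exponential backoff and blocking are not covered.

```typescript
// Hypothetical fixed-window limiter: at most `limit` requests per key
// (an IP address or a DID) in each window of `windowMs` milliseconds.
class FixedWindowLimiter {
  private windows = new Map<string, { start: number; count: number }>()
  private readonly limit: number
  private readonly windowMs: number

  constructor(limit: number, windowMs: number) {
    this.limit = limit
    this.windowMs = windowMs
  }

  // Returns true if the request is allowed. `now` is injectable for tests.
  allow(key: string, now: number = Date.now()): boolean {
    const w = this.windows.get(key)
    if (!w || now - w.start >= this.windowMs) {
      // First request for this key, or the previous window has ended.
      this.windows.set(key, { start: now, count: 1 })
      return true
    }
    if (w.count >= this.limit) return false
    w.count += 1
    return true
  }
}
```

A long-running server would also need to prune windows for keys that have gone quiet, since this map otherwise grows with every new IP or DID seen.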

Environment Variables#

Required:

  • DID: Keyserver's own DID (public)

Optional:

  • PORT: HTTP port (default 4000)

Never Include:

  • Private keys (generated and stored in database)
  • PDS credentials (keyserver doesn't need them)
  • Admin passwords (keyserver has no admin interface)

Monitoring#

Key Metrics:

  • Request rate per endpoint
  • Error rate by type
  • Key access patterns per DID
  • Token verification failures
  • Database size growth

Alerts:

  • Unusual access patterns (e.g., 10,000 requests/hour from single DID)
  • High error rates (potential attack)
  • Database failures
  • Certificate expiry warnings

Common Pitfalls#

Don't Log Sensitive Data#

// BAD: Logs encryption key
console.log('Encrypting with key:', secretKey)

// BAD: Logs service auth token
console.log('Auth token:', authToken)

// BAD: Logs plaintext
console.log('Decrypted content:', plaintext)

// GOOD: Logs only non-sensitive info
console.log('Encrypting message for group:', groupId)
console.log('Using key version:', version)

Don't Persist Keys Without Encryption#

// BAD: Keys in localStorage
localStorage.setItem('encryption_keys', JSON.stringify(keys))

// BAD: Keys in cookies
document.cookie = `key=${secretKey}`

// GOOD: Memory-only storage
const keyCache = new Map<string, Uint8Array>()

Don't Skip Version Validation#

// BAD: Assumes key_version exists
const version = post.key_version
fetchKey(groupId, version) // Undefined if key_version missing!

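// GOOD: Validates before use
// (an explicit type check, so a legitimate version 0 would not be rejected)
if (!Number.isInteger(post.key_version)) {
  throw new Error('Missing or invalid key version')
}
const version = post.key_version
fetchKey(groupId, version)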

Don't Assume Key Fetch Succeeds#

// BAD: No error handling
const key = await fetchKey(groupId)
decrypt(ciphertext, key)

// GOOD: Handles all error cases
try {
  const key = await fetchKey(groupId)
  const plaintext = decrypt(ciphertext, key)
  return plaintext
} catch (error) {
  if (error.statusCode === 403) {
    // User lost access
    return null
  } else if (error.statusCode === 404) {
    // Group deleted
    return null
  } else {
    // Network or server error
    throw error
  }
}

Don't Retry Decryption Infinitely#

// BAD: Infinite retry on decryption failure
while (true) {
  try {
    return decrypt(ciphertext, key)
  } catch (error) {
    // Retries forever!
  }
}

// GOOD: No retry for decryption failures
try {
  return decrypt(ciphertext, key)
} catch (error) {
  // Decryption failure is permanent
  // Could be corrupted data or wrong key
  throw new Error('Cannot decrypt message')
}

Don't Reuse Expired Tokens#

// BAD: Doesn't check expiry
function getToken(): string {
  return cachedToken
}

// GOOD: Checks expiry before returning
function getToken(): string | null {
  const now = Math.floor(Date.now() / 1000)
  if (cachedToken && cachedToken.expiresAt > now + 10) {
    return cachedToken.token
  }
  return null // Needs refresh
}

Don't Share Tokens Between Services#

// BAD: Uses same token for different services
const token = await getServiceAuth({ aud: 'did:web:service1.com' })
// Uses token for different service!
fetch('https://service2.com/api', {
  headers: { Authorization: `Bearer ${token}` }
})

// GOOD: Separate tokens per service
const token1 = await getServiceAuth({ aud: 'did:web:service1.com' })
const token2 = await getServiceAuth({ aud: 'did:web:service2.com' })

Security Audit Checklist#

Client Implementation#

  • Keys stored in memory only, never persisted to disk
  • Service auth tokens stored in memory only
  • Keys and tokens cleared on logout
  • HTTPS used for all network requests
  • Certificate validation enabled
  • Error messages don't leak sensitive data
  • No keys or tokens logged to console/files
  • Proper cache TTLs configured (1h active, 24h historical)
  • Service auth tokens refreshed before expiry
  • Audience binding verified for all tokens
  • Version validation before key fetch
  • Graceful error handling for authorization failures
  • No infinite retry loops for decryption
  • AT-URI used as AAD for all encryption/decryption
  • Nonces generated with cryptographically secure randomness

Server Implementation#

  • HTTPS with valid certificate
  • TLS 1.2 or higher
  • Strong cipher suites only
  • HSTS header enabled
  • DID validation on all inputs
  • SQLite file permissions set to 600
  • Database on encrypted filesystem
  • Regular encrypted backups
  • Access logging enabled
  • Rate limiting configured
  • Monitoring and alerting in place
  • JWT audience verification enforced
  • JWT method binding enforced (if present)
  • JWT signature verification working
  • DID resolution caching enabled
  • Error messages don't leak sensitive data

Infrastructure#

  • Reverse proxy configured (nginx/caddy)
  • SSL termination at proxy
  • Firewall rules limiting access
  • Server in secure datacenter
  • Regular security updates applied
  • Incident response plan documented
  • Backup and recovery procedures tested
  • Monitoring dashboards configured
  • Log aggregation and analysis in place

Incident Response#

Suspected Key Compromise#

Detection:

  • Unusual access patterns in logs
  • User reports unauthorized decryption
  • Security researcher disclosure

Response:

  1. Rotate the compromised key immediately
  2. Notify affected users
  3. Review access logs for suspicious activity
  4. Document incident for post-mortem
  5. Update security controls to prevent recurrence

Suspected Token Theft#

Detection:

  • Access from unexpected IP/location
  • Multiple simultaneous sessions
  • User reports unauthorized activity

Response:

  1. Revoke user's PDS session (forces new auth)
  2. Token expires in ≤60 seconds (automatic mitigation)
  3. Review access logs
  4. Notify user if suspicious activity confirmed

Data Breach#

Server compromise:

  • Encrypted content exposure: LOW risk (server only stores keys, not content)
  • Key database exposure: HIGH risk (all keys compromised)

Response:

  1. Immediately take keyserver offline
  2. Rotate ALL keys in system
  3. Notify all users
  4. Conduct forensic analysis
  5. Update security controls
  6. Gradual restoration after security verified

Vulnerability Disclosure#

Responsible Disclosure Process:

  1. Acknowledge receipt within 24 hours
  2. Assess severity and impact
  3. Develop and test fix
  4. Coordinate disclosure timeline with reporter
  5. Release fix before public disclosure
  6. Credit reporter (if desired)

Compliance Considerations#

Data Protection Regulations#

GDPR (EU) Compliance:

The keyserver implements GDPR Article 17 (Right to Erasure) through cryptographic erasure:

  1. Personal Data Classification:

    • Encryption keys are personal data
    • DIDs are personal identifiers
    • Access logs contain IP addresses (personal data)
    • Group memberships are personal data
  2. Right to Deletion (Article 17):

    • Users can delete their account via /xrpc/dev.atpkeyserver.alpha.account.delete
    • Requires explicit confirmation: "DELETE_ALL_MY_DATA"
    • Deletes all keys, groups, memberships, and access logs
    • Implements cryptographic erasure (explained below)
  3. Cryptographic Erasure:

    Concept: Encrypted data without keys is computationally infeasible to decrypt, effectively rendering it useless random bytes.

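    Legal Basis: Cryptographic erasure is widely used as a deletion mechanism, but whether it satisfies the right to erasure in a particular deployment should be confirmed with legal counsel or the relevant supervisory authority:

    • Cloud providers (AWS, Azure, Google Cloud) use this approach
    • The underlying argument is that ciphertext without keys no longer functions as "personal data"
    • Aims to satisfy the right to erasure while preserving network integrity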

    Implementation:

    • Keyserver deletes all key versions (personal and group keys)
    • Clients should delete encrypted posts from PDS (ATProto deletion protocol)
    • Deletion notices may be honored by relays (not guaranteed)
    • Even if ciphertext persists, it becomes permanently unreadable without keys
    • No technical means to recover deleted keys
    • Irreversible process (as required by GDPR)
  4. Access Log Retention:

    • 90-day automatic retention policy (configurable)
    • Logs cleaned up on server startup
    • Balances security monitoring with privacy requirements
    • Can be adjusted based on jurisdiction (30-180 days)
  5. Data Minimization:

    • Only essential data collected (DIDs, keys, timestamps)
    • No unnecessary personal information stored
    • IP addresses in logs are optional (can be disabled)
    • User agents logged for security, not tracking
  6. Breach Notification:

    • Required within 72 hours under GDPR
    • Users should be notified if keys are compromised
    • Access logs help identify affected users
    • Rotation functionality limits damage
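
The access log retention policy in item 4 amounts to deleting every entry older than a cutoff. A sketch over an in-memory array follows; a real keyserver would presumably run the equivalent delete against its database. `AccessLogEntry`, its field names, and `pruneAccessLogs` are hypothetical.

```typescript
// Hypothetical access log record; field names are illustrative.
interface AccessLogEntry {
  timestamp: number // Unix time in milliseconds
  did: string
}

const DAY_MS = 24 * 60 * 60 * 1000

// Returns only the entries that are still inside the retention window.
function pruneAccessLogs(
  entries: AccessLogEntry[],
  retentionDays: number = 90,
  now: number = Date.now()
): AccessLogEntry[] {
  const cutoff = now - retentionDays * DAY_MS
  return entries.filter((entry) => entry.timestamp >= cutoff)
}
```

Passing `retentionDays` explicitly makes the jurisdiction-specific adjustment (30 to 180 days) a configuration value rather than a code change.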

CCPA (California) Compliance:

  • Users have right to know what data is collected (documented in API)
  • Users have right to deletion (account deletion endpoint)
  • Security practices must be documented (this document)
  • No sale of personal data (keys never shared)

Industry Standards#

OWASP Top 10:

  • Protect against injection (DID validation)
  • Proper authentication (service auth)
  • Sensitive data exposure (HTTPS, no logging)
  • Access control (group membership checks)
  • Security misconfiguration (secure defaults)

NIST Guidelines:

  • Use well-vetted cryptographic algorithms (Ed25519 is approved in FIPS 186-5; XChaCha20-Poly1305 is widely deployed but is not itself a NIST-approved algorithm)
  • Key management best practices (versioning, rotation)
  • Secure communication (HTTPS, TLS 1.2+)
  • Access control (JWT authentication)

Resources#

Security Contact#

For security issues, please contact:

  • Email: [To be configured by deployment team]
  • PGP Key: [To be configured by deployment team]
  • Disclosure Policy: Responsible disclosure preferred

Do not disclose security vulnerabilities publicly without coordination.