# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

Slices is an AT Protocol (ATProto) indexing and querying service that allows developers to create custom slices (subsets) of the ATProto network data. It indexes records from the Bluesky/ATProto network via Jetstream, validates them against Lexicon schemas, and provides flexible querying capabilities through an XRPC API.

## Development Setup

### Database Connection

The application uses PostgreSQL. You can connect to the database using:

1. **Docker Compose** (recommended for local development):

   ```bash
   docker-compose up postgres
   ```

   This starts PostgreSQL on port 5432 with:
   - Database: `slices`
   - User: `slices`
   - Password: `slices`

2. **Environment Variables** (`.env` file):

   Create an `api/.env` file (copy from `api/.env.example`):

   ```
   DATABASE_URL=postgresql://slices:slices@localhost:5432/slices
   SYSTEM_SLICE_URI=at://did:plc:bcgltzqazw5tb6k2g3ttenbj/network.slices.slice/3lymhd4jhrd2z
   AUTH_BASE_URL=http://localhost:8081
   RELAY_ENDPOINT=https://relay1.us-west.bsky.network
   PROCESS_TYPE=all # Optional: all (default), app, worker
   ```

### Process Types

The application supports running different components in separate processes for better resource isolation and scaling:

- `all` (default): Everything (HTTP API + Jetstream + sync workers)
- `app`: HTTP API server + Jetstream real-time indexing
- `worker`: Background sync job processing only

Set via `PROCESS_TYPE` environment variable (or `FLY_PROCESS_GROUP` on Fly.io).
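The process-type selection above can be sketched as follows. The enum, function names, and fallback order are illustrative assumptions based on the documented behavior (`PROCESS_TYPE`, then `FLY_PROCESS_GROUP`, defaulting to `all`), not the repository's actual types:

```rust
use std::env;

/// Which components a process runs. Variant names are a sketch based on
/// the documented `PROCESS_TYPE` values, not the real type in this repo.
#[derive(Debug, PartialEq)]
enum ProcessType {
    All,    // HTTP API + Jetstream + sync workers
    App,    // HTTP API server + Jetstream real-time indexing
    Worker, // background sync job processing only
}

/// Map a raw `PROCESS_TYPE` string to a variant; unrecognized values
/// fall back to running everything, matching the documented default.
fn parse_process_type(raw: &str) -> ProcessType {
    match raw {
        "app" => ProcessType::App,
        "worker" => ProcessType::Worker,
        _ => ProcessType::All,
    }
}

/// Resolve the process type from `PROCESS_TYPE`, falling back to
/// `FLY_PROCESS_GROUP` (Fly.io), then to `all`.
fn process_type_from_env() -> ProcessType {
    let raw = env::var("PROCESS_TYPE")
        .or_else(|_| env::var("FLY_PROCESS_GROUP"))
        .unwrap_or_else(|_| "all".to_string());
    parse_process_type(&raw)
}

fn main() {
    println!("running as {:?}", process_type_from_env());
}
```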
## Common Development Commands

```bash
# Type checking and validation
cargo check

# Run development server
cargo run

# Run sync script
./scripts/sync.sh http://localhost:3000       # Local dev
./scripts/sync.sh https://api.slices.network  # Production

# Database setup
sqlx database create

# Database migrations
sqlx migrate run
sqlx migrate add

# sqlx query cache (run after changing queries)
cargo sqlx prepare

# Build for production
cargo build --release
```

## High-Level Architecture

### Data Flow

1. **Real-time Indexing:** Jetstream → JetstreamConsumer → Lexicon Validation → Database → Index
2. **XRPC Query:** HTTP Request → OAuth Verification → Dynamic Handler → Database Query → Response
3. **Background Sync:** Trigger → Job Queue → SyncService → ATProto Relay → Validation → Database

### Key Architectural Decisions

- **Single-table design** for records using PostgreSQL with JSONB for flexibility across arbitrary lexicons
- **Dynamic XRPC endpoint generation** - routes like `/{collection}.createRecord` are generated at runtime
- **Dual indexing strategy** - real-time via Jetstream and bulk sync via background jobs
- **Cursor-based pagination** using `base64(sort_value::indexed_at::cid)` for stable pagination
- **OAuth DPoP authentication** integrated with AIP server for ATProto authentication
- **Multi-tier caching** with Redis (if configured) or in-memory fallback for performance optimization

### Module Organization

- `src/api/` - HTTP handlers for XRPC endpoints (actors, records, oauth, sync, etc.)
- `src/main.rs` - Application entry point, server setup, Jetstream startup
- `src/database.rs` - All database operations, query building, cursor pagination
- `src/jetstream.rs` - Real-time event processing from the ATProto firehose
- `src/sync.rs` - Bulk synchronization operations with the ATProto relay
- `src/auth.rs` - OAuth verification and DPoP authentication setup
- `src/cache.rs` - Generic caching interface and in-memory cache implementation
- `src/redis_cache.rs` - Redis cache implementation for distributed caching
- `src/errors.rs` - Error type definitions (reference for new errors)

## Error Handling

All error strings must use this format: `error-slices-<category>-<number> <description>: <details>`
Example errors:

- `error-slices-resolve-1 Multiple DIDs resolved for method`
- `error-slices-plc-1 HTTP request failed: https://google.com/ Not Found`
- `error-slices-key-1 Error decoding key: invalid`

Errors should be represented as enums using the `thiserror` library when possible, using `src/errors.rs` as a reference and example. Avoid creating new errors with the `anyhow!(...)` macro.

## Time, Date, and Duration

Use the `chrono` crate for time, date, and duration logic. Use the `duration_str` crate for parsing string duration values.

All stored dates and times must be in UTC. UTC should be used whenever determining the current time and computing values such as expirations.

## HTTP Handler Organization

HTTP handlers should be organized as Rust source files in the `src/api` directory. Each handler should have its own request and response types and helper functionality.

- After updating, run `cargo check` and fix any errors and warnings
- Don't leave dead code; if it's not used, remove it

## Caching Architecture

The application uses a flexible caching system that supports both Redis and in-memory caching with automatic fallback.

### Cache Configuration

Configure caching via environment variables:

```bash
# Redis configuration (optional)
REDIS_URL=redis://localhost:6379
REDIS_TTL_SECONDS=3600
```

If `REDIS_URL` is not set, the application automatically falls back to in-memory caching.
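A minimal sketch of the in-memory fallback half of this design, assuming a string-keyed, TTL-aware interface. The real `Cache` trait and `SliceCache` wrapper in `src/cache.rs` may differ (e.g. async methods, typed values); this version only illustrates lazy TTL eviction:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Sketch of a cache interface; the real trait in `src/cache.rs`
/// may have a different shape.
trait Cache {
    fn get(&mut self, key: &str) -> Option<String>;
    fn set(&mut self, key: &str, value: &str, ttl: Option<Duration>);
}

/// In-memory fallback used when `REDIS_URL` is unset. Entries are
/// evicted lazily: an expired value is dropped on the next `get`.
struct MemoryCache {
    entries: HashMap<String, (String, Option<Instant>)>,
}

impl MemoryCache {
    fn new() -> Self {
        MemoryCache { entries: HashMap::new() }
    }
}

impl Cache for MemoryCache {
    fn get(&mut self, key: &str) -> Option<String> {
        // First decide whether the entry exists and has expired.
        let expired = match self.entries.get(key) {
            Some((_, Some(deadline))) => Instant::now() >= *deadline,
            Some((_, None)) => false, // no TTL: permanent (e.g. actor cache)
            None => return None,
        };
        if expired {
            self.entries.remove(key);
            return None;
        }
        self.entries.get(key).map(|(value, _)| value.clone())
    }

    fn set(&mut self, key: &str, value: &str, ttl: Option<Duration>) {
        let deadline = ttl.map(|d| Instant::now() + d);
        self.entries.insert(key.to_string(), (value.to_string(), deadline));
    }
}

fn main() {
    let mut cache = MemoryCache::new();
    // Prefixed key format as documented, e.g. `actor:{did}:{slice_uri}`.
    cache.set("actor:did:plc:example:at://slice", "cached-actor",
              Some(Duration::from_secs(7200)));
    println!("{:?}", cache.get("actor:did:plc:example:at://slice"));
}
```

Trait-based dispatch is what allows the Redis and in-memory implementations to share one interface, so callers don't care which backend is active.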
### Cache Types and TTLs

- **Actor Cache** (Jetstream): No TTL (permanent cache for slice actors)
- **Lexicon Cache**: 2 hours (7200s) - lexicons change infrequently
- **Domain Cache**: 4 hours (14400s) - slice domain mappings rarely change
- **Collections Cache**: 2 hours (7200s) - slice collections change infrequently
- **Auth Cache**: 5 minutes (300s) - OAuth tokens and AT Protocol sessions
- **DID Resolution Cache**: 24 hours (86400s) - DID documents change rarely

### Cache Implementation

- `src/cache.rs` - Defines the `Cache` trait and `SliceCache` wrapper with domain-specific methods
- `src/redis_cache.rs` - Redis implementation of the `Cache` trait
- Both implementations provide the same interface through `SliceCache`
- Cache keys use prefixed formats (e.g., `actor:{did}:{slice_uri}`, `oauth_userinfo:{token}`)

### Cache Usage Patterns

- **Jetstream Consumer**: Creates 4 separate cache instances for actors, lexicons, domains, and collections
- **Auth System**: Uses a dedicated auth cache for OAuth and AT Protocol session caching
- **Actor Resolution**: Caches DID resolution results to avoid repeated lookups
- **Automatic Fallback**: Redis failures automatically fall back to in-memory caching without errors

## Deployment

### Fly.io (Multi-Process Architecture)

The application is configured to run different components in separate process groups on Fly.io:

```toml
[processes]
app = "app"       # HTTP API + Jetstream
worker = "worker" # Sync job processing
```

**Scale processes independently:**

```bash
# Scale app instances (HTTP + Jetstream)
fly scale count app=2

# Scale sync workers (for heavy backfills)
fly scale count worker=5

# Different VM sizes per workload
fly scale vm shared-cpu-1x --process-group app
fly scale vm shared-cpu-2x --process-group worker
```
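## Pagination Cursor Sketch

The cursor layout noted under Key Architectural Decisions (`base64(sort_value::indexed_at::cid)`) can be sketched as below. The base64 step is omitted here (the real code would use a base64 crate), and the struct and field names are illustrative assumptions, not the actual types in `src/database.rs`:

```rust
/// Illustrative cursor matching the documented
/// `sort_value::indexed_at::cid` layout (pre-base64).
#[derive(Debug, PartialEq)]
struct Cursor {
    sort_value: String,
    indexed_at: String, // stored timestamps are UTC per the conventions above
    cid: String,
}

impl Cursor {
    fn encode(&self) -> String {
        // The real service base64-encodes this joined string so the
        // cursor handed to clients is opaque and URL-safe.
        format!("{}::{}::{}", self.sort_value, self.indexed_at, self.cid)
    }

    fn decode(raw: &str) -> Option<Cursor> {
        // splitn(3, ..) leaves any later "::" inside the cid untouched.
        let mut parts = raw.splitn(3, "::");
        Some(Cursor {
            sort_value: parts.next()?.to_string(),
            indexed_at: parts.next()?.to_string(),
            cid: parts.next()?.to_string(),
        })
    }
}

fn main() {
    let cursor = Cursor {
        sort_value: "42".to_string(),
        indexed_at: "2024-01-01T00:00:00Z".to_string(),
        cid: "bafyexamplecid".to_string(),
    };
    let raw = cursor.encode();
    assert_eq!(Cursor::decode(&raw), Some(cursor));
    println!("{raw}");
}
```

Including `indexed_at` and `cid` alongside the sort value is what makes pagination stable: ties on the sort value are broken deterministically, so pages never skip or repeat records.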