# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

Slices is an AT Protocol (ATProto) indexing and querying service that allows developers to create custom slices (subsets) of the ATProto network data. It indexes records from the Bluesky/ATProto network via Jetstream, validates them against Lexicon schemas, and provides flexible querying capabilities through an XRPC API.

## Development Setup

### Database Connection

The application uses PostgreSQL. You can connect to the database using:

1. **Docker Compose** (recommended for local development):

   ```bash
   docker-compose up postgres
   ```

   This starts PostgreSQL on port 5432 with:
   - Database: `slices`
   - User: `slices`
   - Password: `slices`

2. **Environment Variables** (`.env` file):

   Create an `api/.env` file (copy from `api/.env.example`):

   ```
   DATABASE_URL=postgresql://slices:slices@localhost:5432/slices
   SYSTEM_SLICE_URI=at://did:plc:bcgltzqazw5tb6k2g3ttenbj/network.slices.slice/3lymhd4jhrd2z
   AUTH_BASE_URL=http://localhost:8081
   RELAY_ENDPOINT=https://relay1.us-west.bsky.network
   PROCESS_TYPE=all # Optional: all (default), app, worker
   ```

### Process Types

The application supports running different components in separate processes for better resource isolation and scaling:

- `all` (default): Everything (HTTP API + Jetstream + sync workers)
- `app`: HTTP API server + Jetstream real-time indexing
- `worker`: Background sync job processing only

Set via `PROCESS_TYPE` environment variable (or `FLY_PROCESS_GROUP` on Fly.io).
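The process-type selection above can be sketched as follows. The enum, function names, and fallback order are illustrative assumptions based on the documented behavior (`PROCESS_TYPE`, then `FLY_PROCESS_GROUP`, defaulting to `all`), not the repository's actual types:

```rust
use std::env;

/// Which components a process runs. Variant names are a sketch based on
/// the documented `PROCESS_TYPE` values, not the real type in this repo.
#[derive(Debug, PartialEq)]
enum ProcessType {
    All,    // HTTP API + Jetstream + sync workers
    App,    // HTTP API server + Jetstream real-time indexing
    Worker, // background sync job processing only
}

/// Map a raw `PROCESS_TYPE` string to a variant; unrecognized values
/// fall back to running everything, matching the documented default.
fn parse_process_type(raw: &str) -> ProcessType {
    match raw {
        "app" => ProcessType::App,
        "worker" => ProcessType::Worker,
        _ => ProcessType::All,
    }
}

/// Resolve the process type from `PROCESS_TYPE`, falling back to
/// `FLY_PROCESS_GROUP` (Fly.io), then to `all`.
fn process_type_from_env() -> ProcessType {
    let raw = env::var("PROCESS_TYPE")
        .or_else(|_| env::var("FLY_PROCESS_GROUP"))
        .unwrap_or_else(|_| "all".to_string());
    parse_process_type(&raw)
}

fn main() {
    println!("running as {:?}", process_type_from_env());
}
```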
## Common Development Commands

```bash
# Type checking and validation
cargo check

# Run development server
cargo run

# Run sync script
./scripts/sync.sh http://localhost:3000       # Local dev
./scripts/sync.sh https://api.slices.network  # Production

# Database setup
sqlx database create

# Database migrations
sqlx migrate run
sqlx migrate add

# sqlx query cache (run after changing queries)
cargo sqlx prepare

# Build for production
cargo build --release
```

## High-Level Architecture

### Data Flow

1. **Real-time Indexing:** Jetstream → JetstreamConsumer → Lexicon Validation → Database → Index
2. **XRPC Query:** HTTP Request → OAuth Verification → Dynamic Handler → Database Query → Response
3. **Background Sync:** Trigger → Job Queue → SyncService → ATProto Relay → Validation → Database

### Key Architectural Decisions

- **Single-table design** for records using PostgreSQL with JSONB for flexibility across arbitrary lexicons
- **Dynamic XRPC endpoint generation** - routes like `/{collection}.createRecord` are generated at runtime
- **Dual indexing strategy** - real-time via Jetstream and bulk sync via background jobs
- **Cursor-based pagination** using `base64(sort_value::indexed_at::cid)` for stable pagination
- **OAuth DPoP authentication** integrated with AIP server for ATProto authentication
- **Multi-tier caching** with Redis (if configured) or in-memory fallback for performance optimization

### Module Organization

- `src/api/` - HTTP handlers for XRPC endpoints (actors, records, oauth, sync, etc.)
- `src/main.rs` - Application entry point, server setup, Jetstream startup
- `src/database.rs` - All database operations, query building, cursor pagination
- `src/jetstream.rs` - Real-time event processing from the ATProto firehose
- `src/sync.rs` - Bulk synchronization operations with the ATProto relay
- `src/auth.rs` - OAuth verification and DPoP authentication setup
- `src/cache.rs` - Generic caching interface and in-memory cache implementation
- `src/redis_cache.rs` - Redis cache implementation for distributed caching
- `src/errors.rs` - Error type definitions (reference for new errors)

## Error Handling

All error strings must use this format: `error-slices-<category>-<number> <description>: <details>`
Example errors:

- `error-slices-resolve-1 Multiple DIDs resolved for method`
- `error-slices-plc-1 HTTP request failed: https://google.com/ Not Found`
- `error-slices-key-1 Error decoding key: invalid`

Errors should be represented as enums using the `thiserror` library when possible, using `src/errors.rs` as a reference and example. Avoid creating new errors with the `anyhow!(...)` macro.

## Time, Date, and Duration

Use the `chrono` crate for time, date, and duration logic. Use the `duration_str` crate for parsing string duration values.

All stored dates and times must be in UTC. UTC should be used whenever determining the current time and computing values such as expirations.

## HTTP Handler Organization

HTTP handlers should be organized as Rust source files in the `src/api` directory. Each handler should have its own request and response types and helper functionality.

- After updating, run `cargo check` and fix any errors and warnings
- Don't leave dead code; if it's not used, remove it

## Caching Architecture

The application uses a flexible caching system that supports both Redis and in-memory caching with automatic fallback.

### Cache Configuration

Configure caching via environment variables:

```bash
# Redis configuration (optional)
REDIS_URL=redis://localhost:6379
REDIS_TTL_SECONDS=3600
```

If `REDIS_URL` is not set, the application automatically falls back to in-memory caching.
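A minimal sketch of the in-memory fallback half of this design, assuming a string-keyed, TTL-aware interface. The real `Cache` trait and `SliceCache` wrapper in `src/cache.rs` may differ (e.g. async methods, typed values); this version only illustrates lazy TTL eviction:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Sketch of a cache interface; the real trait in `src/cache.rs`
/// may have a different shape.
trait Cache {
    fn get(&mut self, key: &str) -> Option<String>;
    fn set(&mut self, key: &str, value: &str, ttl: Option<Duration>);
}

/// In-memory fallback used when `REDIS_URL` is unset. Entries are
/// evicted lazily: an expired value is dropped on the next `get`.
struct MemoryCache {
    entries: HashMap<String, (String, Option<Instant>)>,
}

impl MemoryCache {
    fn new() -> Self {
        MemoryCache { entries: HashMap::new() }
    }
}

impl Cache for MemoryCache {
    fn get(&mut self, key: &str) -> Option<String> {
        // First decide whether the entry exists and has expired.
        let expired = match self.entries.get(key) {
            Some((_, Some(deadline))) => Instant::now() >= *deadline,
            Some((_, None)) => false, // no TTL: permanent (e.g. actor cache)
            None => return None,
        };
        if expired {
            self.entries.remove(key);
            return None;
        }
        self.entries.get(key).map(|(value, _)| value.clone())
    }

    fn set(&mut self, key: &str, value: &str, ttl: Option<Duration>) {
        let deadline = ttl.map(|d| Instant::now() + d);
        self.entries.insert(key.to_string(), (value.to_string(), deadline));
    }
}

fn main() {
    let mut cache = MemoryCache::new();
    // Prefixed key format as documented, e.g. `actor:{did}:{slice_uri}`.
    cache.set("actor:did:plc:example:at://slice", "cached-actor",
              Some(Duration::from_secs(7200)));
    println!("{:?}", cache.get("actor:did:plc:example:at://slice"));
}
```

Trait-based dispatch is what allows the Redis and in-memory implementations to share one interface, so callers don't care which backend is active.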
### Cache Types and TTLs

- **Actor Cache** (Jetstream): No TTL (permanent cache for slice actors)
- **Lexicon Cache**: 2 hours (7200s) - lexicons change infrequently
- **Domain Cache**: 4 hours (14400s) - slice domain mappings rarely change
- **Collections Cache**: 2 hours (7200s) - slice collections change infrequently
- **Auth Cache**: 5 minutes (300s) - OAuth tokens and AT Protocol sessions
- **DID Resolution Cache**: 24 hours (86400s) - DID documents change rarely

### Cache Implementation

- `src/cache.rs` - Defines the `Cache` trait and `SliceCache` wrapper with domain-specific methods
- `src/redis_cache.rs` - Redis implementation of the `Cache` trait
- Both implementations provide the same interface through `SliceCache`
- Cache keys use prefixed formats (e.g., `actor:{did}:{slice_uri}`, `oauth_userinfo:{token}`)

### Cache Usage Patterns

- **Jetstream Consumer**: Creates 4 separate cache instances for actors, lexicons, domains, and collections
- **Auth System**: Uses a dedicated auth cache for OAuth and AT Protocol session caching
- **Actor Resolution**: Caches DID resolution results to avoid repeated lookups
- **Automatic Fallback**: Redis failures automatically fall back to in-memory caching without errors

## Deployment

### Fly.io (Multi-Process Architecture)

The application is configured to run different components in separate process groups on Fly.io:

```toml
[processes]
app = "app"       # HTTP API + Jetstream
worker = "worker" # Sync job processing
```

**Scale processes independently:**

```bash
# Scale app instances (HTTP + Jetstream)
fly scale count app=2

# Scale sync workers (for heavy backfills)
fly scale count worker=5

# Different VM sizes per workload
fly scale vm shared-cpu-1x --process-group app
fly scale vm shared-cpu-2x --process-group worker
```
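## Pagination Cursor Sketch

The cursor layout noted under Key Architectural Decisions (`base64(sort_value::indexed_at::cid)`) can be sketched as below. The base64 step is omitted here (the real code would use a base64 crate), and the struct and field names are illustrative assumptions, not the actual types in `src/database.rs`:

```rust
/// Illustrative cursor matching the documented
/// `sort_value::indexed_at::cid` layout (pre-base64).
#[derive(Debug, PartialEq)]
struct Cursor {
    sort_value: String,
    indexed_at: String, // stored timestamps are UTC per the conventions above
    cid: String,
}

impl Cursor {
    fn encode(&self) -> String {
        // The real service base64-encodes this joined string so the
        // cursor handed to clients is opaque and URL-safe.
        format!("{}::{}::{}", self.sort_value, self.indexed_at, self.cid)
    }

    fn decode(raw: &str) -> Option<Cursor> {
        // splitn(3, ..) leaves any later "::" inside the cid untouched.
        let mut parts = raw.splitn(3, "::");
        Some(Cursor {
            sort_value: parts.next()?.to_string(),
            indexed_at: parts.next()?.to_string(),
            cid: parts.next()?.to_string(),
        })
    }
}

fn main() {
    let cursor = Cursor {
        sort_value: "42".to_string(),
        indexed_at: "2024-01-01T00:00:00Z".to_string(),
        cid: "bafyexamplecid".to_string(),
    };
    let raw = cursor.encode();
    assert_eq!(Cursor::decode(&raw), Some(cursor));
    println!("{raw}");
}
```

Including `indexed_at` and `cid` alongside the sort value is what makes pagination stable: ties on the sort value are broken deterministically, so pages never skip or repeat records.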