Monorepo for wisp.place. A static site hosting service built on top of the AT Protocol. wisp.place

update claude.md

Changed files
+420 -25
claude.md
- Wisp.place - Decentralized Static Site Hosting

- Architecture Overview

- Wisp.Place is a two-service application that provides static site hosting on the AT
- Protocol. Wisp aims to be a CDN for static sites where the content is ultimately owned by the user at their repo. The microservice is responsible for ingesting firehose events and serving an on-disk cache of the latest site files.

- Service 1: Main App (Port 8000, Bun runtime, elysia.js)
- - User-facing editor and API
- - OAuth authentication (AT Protocol)
- - File upload processing (gzip + base64 encoding)
- - Domain management (subdomains + custom domains)
- - DNS verification worker
- - React frontend

- Service 2: Hosting Service (Port 3001, Node.js runtime, hono.js)
- - AT Protocol Firehose listener for real-time updates
- - Serves hosted websites from local cache
- - Multi-domain routing (custom domains, wisp.place subdomains, sites subdomain)
- - Distributed locking for multi-instance coordination

- Tech Stack

- - Backend: Bun/Node.js, Elysia.js, PostgreSQL, AT Protocol SDK
- - Frontend: React 19, Tailwind CSS v4, Shadcn UI

- Key Features

- - AT Protocol Integration: Sites stored as place.wisp.fs records in user repos
- - File Processing: Validates, compresses (gzip), encodes (base64), uploads to user's PDS
- - Domain Management: wisp.place subdomains + custom BYOD domains with DNS verification
- - Real-time Sync: Firehose worker listens for site updates and caches files locally
- - Atomic Updates: Safe cache swapping without downtime
+ # Wisp.place - Codebase Overview
+
+ **Project URL**: https://wisp.place
+
+ A decentralized static site hosting service built on the AT Protocol (Bluesky). Users can host static websites directly in their AT Protocol accounts, keeping full control and ownership while benefiting from fast CDN distribution.
+
+ ---
+
+ ## 🏗️ Architecture Overview
+
+ ### Multi-Part System
+ 1. **Main Backend** (`/src`) - OAuth, site management, custom domains
+ 2. **Hosting Service** (`/hosting-service`) - Microservice that serves cached sites
+ 3. **CLI Tool** (`/cli`) - Rust CLI for direct site uploads to PDS
+ 4. **Frontend** (`/public`) - React UI for onboarding, editor, admin
+
+ ### Tech Stack
+ - **Backend**: Elysia (Bun) + TypeScript + PostgreSQL
+ - **Frontend**: React 19 + Tailwind CSS 4 + Radix UI
+ - **CLI**: Rust with Jacquard (AT Protocol library)
+ - **Database**: PostgreSQL for session/domain/site caching
+ - **AT Protocol**: OAuth 2.0 + custom lexicons for storage
+
+ ---
+
+ ## 📂 Directory Structure
+
+ ### `/src` - Main Backend Server
+ **Purpose**: Core server handling OAuth, site management, custom domains, admin features
+
+ **Key Routes**:
+ - `/api/auth/*` - OAuth signin/callback/logout/status
+ - `/api/domain/*` - Custom domain management (BYOD)
+ - `/wisp/*` - Site upload and management
+ - `/api/user/*` - User info and site listing
+ - `/api/admin/*` - Admin console (logs, metrics, DNS verification)
+
+ **Key Files**:
+ - `index.ts` - Express-like Elysia app setup with middleware (CORS, CSP, security headers)
+ - `lib/oauth-client.ts` - OAuth client setup with session/state persistence
+ - `lib/db.ts` - PostgreSQL schema and queries for all tables
+ - `lib/wisp-auth.ts` - Cookie-based authentication middleware
+ - `lib/wisp-utils.ts` - File compression (gzip), manifest creation, blob handling
+ - `lib/sync-sites.ts` - Syncs user's place.wisp.fs records from PDS to database cache
+ - `lib/dns-verify.ts` - DNS verification for custom domains (TXT + CNAME)
+ - `lib/dns-verification-worker.ts` - Background worker that checks domain verification every 10 minutes
+ - `lib/admin-auth.ts` - Simple username/password admin authentication
+ - `lib/observability.ts` - Logging, error tracking, metrics collection
+ - `routes/auth.ts` - OAuth flow handlers
+ - `routes/wisp.ts` - File upload and site creation (/wisp/upload-files)
+ - `routes/domain.ts` - Domain claiming/verification API
+ - `routes/user.ts` - User status/info/sites listing
+ - `routes/site.ts` - Site metadata and file retrieval
+ - `routes/admin.ts` - Admin dashboard API (logs, system health, manual DNS trigger)
+
+ ### `/lexicons` & `src/lexicons/`
+ **Purpose**: AT Protocol Lexicon definitions for custom data types
+
+ **Key File**: `fs.json` - Defines `place.wisp.fs` record format
+ - **structure**: Virtual filesystem manifest with tree structure
+ - **site**: string identifier
+ - **root**: directory object containing entries
+ - **file**: blob reference + metadata (encoding, mimeType, base64 flag)
+ - **directory**: array of entries (recursive)
+ - **entry**: name + node (file or directory)
+
+ **Important**: Files are gzip-compressed and base64-encoded before upload to bypass PDS content sniffing
+
+ ### `/hosting-service`
+ **Purpose**: Lightweight microservice that serves cached sites from disk
+
+ **Architecture**:
+ - Routes by domain lookup in PostgreSQL
+ - Caches site content locally on first access or firehose event
+ - Listens to AT Protocol firehose for new site records
+ - Automatically downloads and caches files from PDS
+ - SSRF-protected fetch (timeout, size limits, private IP blocking)
+
+ **Routes**:
+ 1. Custom domains (`/*`) → lookup custom_domains table
+ 2. Wisp subdomains (`/*.wisp.place/*`) → lookup domains table
+ 3. DNS hash routing (`/hash.dns.wisp.place/*`) → lookup custom_domains by hash
+ 4. Direct serving (`/s.wisp.place/:identifier/:site/*`) → fetch from PDS if not cached
+
+ **HTML Path Rewriting**: Absolute paths in HTML (`/style.css`) automatically rewritten to relative (`/:identifier/:site/style.css`)
+
+ ### `/cli`
+ **Purpose**: Rust CLI tool for direct site uploads using app password or OAuth
+
+ **Flow**:
+ 1. Authenticate with handle + app password or OAuth
+ 2. Walk directory tree, compress files
+ 3. Upload blobs to PDS via agent
+ 4. Create place.wisp.fs record with manifest
+ 5. Store site in database cache
+
+ **Auth Methods**:
+ - `--password` flag for app password auth
+ - OAuth loopback server for browser-based auth
+ - Supports both (password preferred if provided)
+
+ ---
+
+ ## Key Concepts
+
+ ### Custom Domains (BYOD - Bring Your Own Domain)
+ **Process**:
+ 1. User claims custom domain via API
+ 2. System generates hash (SHA256(domain + secret))
+ 3. User adds DNS records:
+    - TXT at `_wisp.example.com` = their DID
+    - CNAME at `example.com` = `{hash}.dns.wisp.place`
+ 4. Background worker checks verification every 10 minutes
+ 5. Once verified, custom domain routes to their hosted sites
+
+ **Tables**: `custom_domains` (id, domain, did, rkey, verified, last_verified_at)
+
+ ### Wisp Subdomains
+ **Process**:
+ 1. Handle claimed on first signup (e.g., alice → alice.wisp.place)
+ 2. Stored in `domains` table mapping domain → DID
+ 3. Served by hosting service
+
+ ### Site Storage
+ **Locations**:
+ - **Authoritative**: PDS (AT Protocol repo) as `place.wisp.fs` record
+ - **Cache**: PostgreSQL `sites` table (rkey, did, site_name, created_at)
+ - **File Cache**: Hosting service caches downloaded files on disk
+
+ **Limits**:
+ - MAX_SITE_SIZE: 300MB total
+ - MAX_FILE_SIZE: 100MB per file
+ - MAX_FILE_COUNT: 2000 files
+
+ ### File Compression Strategy
+ **Why**: Bypass PDS content sniffing issues (was treating HTML as images)
+
+ **Process**:
+ 1. All files gzip-compressed (level 9)
+ 2. Compressed content base64-encoded
+ 3. Uploaded as `application/octet-stream` MIME type
+ 4. Blob metadata stores original MIME type + encoding flag
+ 5. Hosting service decompresses on serve
+
+ ---
+
+ ## 🔄 Data Flow
+
+ ### User Registration → Site Upload
+ ```
+ 1. OAuth signin → state/session stored in DB
+ 2. Cookie set with DID
+ 3. Sync sites from PDS to cache DB
+ 4. If no sites/domain → redirect to onboarding
+ 5. User creates site → POST /wisp/upload-files
+ 6. Files compressed, uploaded as blobs
+ 7. place.wisp.fs record created
+ 8. Site cached in DB
+ 9. Hosting service notified via firehose
+ ```
+
+ ### Custom Domain Setup
+ ```
+ 1. User claims domain (DB check + allocation)
+ 2. System generates hash
+ 3. User adds DNS records (_wisp.domain TXT + CNAME)
+ 4. Background worker verifies every 10 min
+ 5. Hosting service routes based on verification status
+ ```
+
+ ### Site Access
+ ```
+ Hosting Service:
+ 1. Request arrives at custom domain or *.wisp.place
+ 2. Domain lookup in PostgreSQL
+ 3. Check cache for site files
+ 4. If not cached:
+    - Fetch from PDS using DID + rkey
+    - Decompress files
+    - Save to disk cache
+ 5. Serve files (with HTML path rewriting)
+ ```
+
+ ---
+
+ ## 🛠️ Important Implementation Details
+
+ ### OAuth Implementation
+ - **State & Session Storage**: PostgreSQL (with expiration)
+ - **Key Rotation**: Periodic rotation + expiration cleanup (hourly)
+ - **OAuth Flow**: Redirects to PDS, returns to /api/auth/callback
+ - **Session Timeout**: 30 days
+ - **State Timeout**: 1 hour
+
+ ### Security Headers
+ - X-Frame-Options: DENY
+ - X-Content-Type-Options: nosniff
+ - Strict-Transport-Security: max-age=31536000
+ - Content-Security-Policy (configured for Elysia + React)
+ - X-XSS-Protection: 1; mode=block
+ - Referrer-Policy: strict-origin-when-cross-origin
+
+ ### Admin Authentication
+ - Simple username/password (hashed with bcrypt)
+ - Session-based cookie auth (24hr expiration)
+ - Separate `admin_session` cookie
+ - Initial setup prompted on startup
+
+ ### Observability
+ - **Logging**: Structured logging with service tags + event types
+ - **Error Tracking**: Captures error context (message, stack, etc.)
+ - **Metrics**: Request counts, latencies, error rates
+ - **Log Levels**: debug, info, warn, error
+ - **Collection**: Centralized log collector with in-memory buffer
+
+ ---
+
+ ## Database Schema
+
+ ### oauth_states
+ - key (primary key)
+ - data (JSON)
+ - created_at, expires_at (timestamps)

+ ### oauth_sessions
+ - sub (primary key - subject/DID)
+ - data (JSON with OAuth session)
+ - updated_at, expires_at

+ ### oauth_keys
+ - kid (primary key - key ID)
+ - jwk (JSON Web Key)
+ - created_at
+
+ ### domains
+ - domain (primary key - e.g., alice.wisp.place)
+ - did (unique - user's DID)
+ - rkey (optional - record key)
+ - created_at
+
+ ### custom_domains
+ - id (primary key - UUID)
+ - domain (unique - e.g., example.com)
+ - did (user's DID)
+ - rkey (optional)
+ - verified (boolean)
+ - last_verified_at (timestamp)
+ - created_at
+
+ ### sites
+ - id, did, rkey, site_name
+ - created_at, updated_at
+ - Indexes on (did), (did, rkey), (rkey)
+
+ ### admin_users
+ - username (primary key)
+ - password_hash (bcrypt)
+ - created_at
+
+ ---
+
+ ## 🚀 Key Workflows
+
+ ### Sign In Flow
+ 1. POST /api/auth/signin with handle
+ 2. System generates state token
+ 3. Redirects to PDS OAuth endpoint
+ 4. PDS redirects back to /api/auth/callback?code=X&state=Y
+ 5. Validate state (CSRF protection)
+ 6. Exchange code for session
+ 7. Store session in DB, set DID cookie
+ 8. Sync sites from PDS
+ 9. Redirect to /editor or /onboarding
+
+ ### File Upload Flow
+ 1. POST /wisp/upload-files with siteName + files
+ 2. Validate site name (rkey format rules)
+ 3. For each file:
+    - Check size limits
+    - Read as ArrayBuffer
+    - Gzip compress
+    - Base64 encode
+ 4. Upload all blobs in parallel via agent.com.atproto.repo.uploadBlob()
+ 5. Create manifest with all blob refs
+ 6. putRecord() for place.wisp.fs with manifest
+ 7. Upsert to sites table
+ 8. Return URI + CID

+ ### Domain Verification Flow
+ 1. POST /api/custom-domains/claim
+ 2. Generate hash = SHA256(domain + secret)
+ 3. Store in custom_domains with verified=false
+ 4. Return hash for user to configure DNS
+ 5. Background worker periodically:
+    - Query custom_domains where verified=false
+    - Verify TXT record at _wisp.domain
+    - Verify CNAME points to hash.dns.wisp.place
+    - Update verified flag + last_verified_at
+ 6. Hosting service routes when verified=true

+ ---

+ ## 🎨 Frontend Structure

+ ### `/public`
+ - **index.tsx** - Landing page with sign-in form
+ - **editor/editor.tsx** - Site editor/management UI
+ - **admin/admin.tsx** - Admin dashboard
+ - **components/ui/** - Reusable components (Button, Card, Dialog, etc.)
+ - **styles/global.css** - Tailwind + custom styles

+ ### Page Flow
+ 1. `/` - Landing page (sign in / get started)
+ 2. `/editor` - Main app (requires auth)
+ 3. `/admin` - Admin console (requires admin auth)
+ 4. `/onboarding` - First-time user setup

+ ---
+
+ ## Notable Implementation Patterns
+
+ ### File Handling
+ - Files stored as base64-encoded gzip in PDS blobs
+ - Metadata preserves original MIME type
+ - Hosting service decompresses on serve
+ - Workaround for PDS image pipeline issues with HTML
+
+ ### Error Handling
+ - Comprehensive logging with context
+ - Graceful degradation (e.g., site sync failure doesn't break auth)
+ - Structured error responses with details
+
+ ### Performance
+ - Site sync: Batch fetch up to 100 records per request
+ - Blob upload: Parallel promises for all files
+ - DNS verification: Batched background worker (10 min intervals)
+ - Caching: Two-tier (DB + disk in hosting service)
+
+ ### Validation
+ - Lexicon validation on manifest creation
+ - Record type checking
+ - Domain format validation
+ - Site name format validation (AT Protocol rkey rules)
+ - File size limits enforced before upload
+
+ ---
+
+ ## 🐛 Known Quirks & Workarounds
+
+ 1. **PDS Content Sniffing**: Files must be uploaded as `application/octet-stream` (even HTML) and base64-encoded to prevent PDS from misinterpreting content
+
+ 2. **Max URL Query Size**: DNS verification worker queries in batch; may need pagination for users with many custom domains
+
+ 3. **File Count Limits**: Max 500 entries per directory (Lexicon constraint); large sites split across multiple directories
+
+ 4. **Blob Size Limits**: Individual blobs limited to 100MB by Lexicon; large files handled differently if needed
+
+ 5. **HTML Path Rewriting**: Only in hosting service for `/s.wisp.place/:identifier/:site/*` routes; custom domains handled differently
+
+ ---
+
+ ## 📋 Environment Variables
+
+ - `DOMAIN` - Base domain with protocol (default: `https://wisp.place`)
+ - `CLIENT_NAME` - OAuth client name (default: `PDS-View`)
+ - `DATABASE_URL` - PostgreSQL connection (default: `postgres://postgres:postgres@localhost:5432/wisp`)
+ - `NODE_ENV` - production/development
+ - `HOSTING_PORT` - Hosting service port (default: 3001)
+ - `BASE_DOMAIN` - Domain for URLs (default: wisp.place)
+
+ ---
+
+ ## 🧑‍💻 Development Notes
+
+ ### Adding New Features
+ 1. **New routes**: Add to `/src/routes/*.ts`, import in index.ts
+ 2. **DB changes**: Add migration in db.ts
+ 3. **New lexicons**: Update `/lexicons/*.json`, regenerate types
+ 4. **Admin features**: Add to /api/admin endpoints
+
+ ### Testing
+ - Run with `bun test`
+ - CSRF tests in lib/csrf.test.ts
+ - Utility tests in lib/wisp-utils.test.ts
+
+ ### Debugging
+ - Check logs via `/api/admin/logs` (requires admin auth)
+ - DNS verification manual trigger: POST /api/admin/verify-dns
+ - Health check: GET /api/health (includes DNS verifier status)
+
+ ---
+
+ ## 🚀 Deployment Considerations
+
+ 1. **Secrets**: Admin password, OAuth keys, database credentials
+ 2. **HTTPS**: Required (HSTS header enforces it)
+ 3. **CDN**: Custom domains require DNS configuration
+ 4. **Scaling**:
+    - Main server: Horizontal scaling with session DB
+    - Hosting service: Independent scaling, disk cache per instance
+ 5. **Backups**: PostgreSQL database critical; firehose provides recovery
+
+ ---
+
+ ## 📚 Related Technologies
+
+ - **AT Protocol**: Decentralized identity, OAuth 2.0
+ - **Jacquard**: Rust library for AT Protocol interactions
+ - **Elysia**: Bun web framework (similar to Express/Hono)
+ - **Lexicon**: AT Protocol's schema definition language
+ - **Firehose**: Real-time event stream of repo changes
+ - **PDS**: Personal Data Server (where users' data stored)
+
+ ---
+
+ ## 🎯 Project Goals
+
+ ✅ Decentralized site hosting (data owned by users)
+ ✅ Custom domain support with DNS verification
+ ✅ Fast CDN distribution via hosting service
+ ✅ Developer tools (CLI + API)
+ ✅ Admin dashboard for monitoring
+ ✅ Zero user data retention (sites in PDS, sessions in DB only)
+
+ ---
+
+ **Last Updated**: November 2025
+ **Status**: Active development
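
Note on the "File Compression Strategy" section of the new claude.md: the gzip (level 9) then base64 steps, and the matching decode on serve, can be illustrated with a small round trip. This is only a sketch under assumptions. It uses the Node-compatible `node:zlib` and `node:buffer` modules, and `EncodedFile`, `encodeForUpload`, and `decodeForServing` are hypothetical names chosen for illustration, not the contents of `lib/wisp-utils.ts`.

```typescript
import { Buffer } from "node:buffer";
import { gzipSync, gunzipSync } from "node:zlib";

// Hypothetical shape for illustration; the real field names live in the
// place.wisp.fs lexicon and may differ.
interface EncodedFile {
  payload: Buffer;   // bytes that would be uploaded as application/octet-stream
  mimeType: string;  // original MIME type, kept as metadata
  encoding: "gzip";
  base64: true;
}

// Steps 1-4 of the documented process: gzip at level 9, base64-encode the
// compressed bytes, and remember the original MIME type.
function encodeForUpload(content: Buffer, mimeType: string): EncodedFile {
  const compressed = gzipSync(content, { level: 9 });
  const payload = Buffer.from(compressed.toString("base64"), "utf8");
  return { payload, mimeType, encoding: "gzip", base64: true };
}

// Step 5: the serving side reverses both layers to recover the original bytes.
function decodeForServing(file: EncodedFile): Buffer {
  const compressed = Buffer.from(file.payload.toString("utf8"), "base64");
  return gunzipSync(compressed);
}
```

Base64 inflates the compressed payload by roughly a third, so any size check applied to the uploaded blob sees a larger number than the gzip output alone.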
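
Note on the custom domain flow: the document gives the hash as SHA256(domain + secret) and the records as a TXT at `_wisp.<domain>` plus a CNAME to `{hash}.dns.wisp.place`, but not how the digest is encoded. The sketch below assumes hex encoding, lowercase normalization of the domain, and truncation to 32 characters. Some truncation or shorter encoding is needed in any case, because a full hex SHA-256 digest is 64 characters and a single DNS label is limited to 63 octets. `domainHash` and `expectedDnsRecords` are hypothetical helper names.

```typescript
import { createHash } from "node:crypto";

// A single DNS label may be at most 63 octets long.
const MAX_DNS_LABEL_LENGTH = 63;

// SHA256(domain + secret), hex-encoded and truncated. The hex encoding, the
// lowercase normalization, and the default length of 32 are all assumptions.
function domainHash(domain: string, secret: string, length = 32): string {
  if (!Number.isInteger(length) || length < 1 || length > MAX_DNS_LABEL_LENGTH) {
    throw new RangeError(`length must be an integer between 1 and ${MAX_DNS_LABEL_LENGTH}`);
  }
  const normalized = domain.trim().toLowerCase();
  return createHash("sha256")
    .update(normalized + secret)
    .digest("hex")
    .slice(0, length);
}

// The two records a user would be asked to create, per the documented process.
function expectedDnsRecords(domain: string, did: string, secret: string) {
  const name = domain.trim().toLowerCase();
  return {
    txt: { name: `_wisp.${name}`, value: did },
    cname: { name, value: `${domainHash(name, secret)}.dns.wisp.place` },
  };
}
```

Because the secret is mixed into the digest, the CNAME target cannot be computed by someone who only knows the domain name.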
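
Note on the "Limits" list under Site Storage: the three limits (300MB per site, 100MB per file, 2000 files) translate into a pre-upload check along these lines. This is a minimal sketch, assuming "MB" means 1024 * 1024 bytes and that sizes are measured before compression. The document specifies neither, and `UploadCandidate` and `validateUpload` are hypothetical names.

```typescript
// Limits as listed in the document; the byte interpretation of "MB" is an assumption.
const MAX_SITE_SIZE = 300 * 1024 * 1024;
const MAX_FILE_SIZE = 100 * 1024 * 1024;
const MAX_FILE_COUNT = 2000;

// Hypothetical input shape: a relative path and a size in bytes.
interface UploadCandidate {
  path: string;
  size: number;
}

// Returns a list of human-readable problems; an empty list means the upload
// passes all three limits.
function validateUpload(files: UploadCandidate[]): string[] {
  const errors: string[] = [];

  if (files.length > MAX_FILE_COUNT) {
    errors.push(`too many files: ${files.length} > ${MAX_FILE_COUNT}`);
  }

  let total = 0;
  for (const file of files) {
    total += file.size;
    if (file.size > MAX_FILE_SIZE) {
      errors.push(`${file.path} exceeds the per-file limit (${file.size} bytes)`);
    }
  }

  if (total > MAX_SITE_SIZE) {
    errors.push(`site exceeds the total size limit (${total} bytes)`);
  }
  return errors;
}
```

Collecting every violation, instead of stopping at the first, lets an upload UI report all problems in one pass.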