# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
tinsnip provides opinionated container management for homelab environments that standardizes how services are configured, deployed, and managed.
Architecture Philosophy:
- Standardized UIDs: Every service gets a predictable UID following the SMEP scheme for proper NFS permissions
- NFS-backed persistence: Config, data, and state stored on NFS exports for continuous delivery
- XDG integration: Service data exposed via symbolic links in XDG Base Directory locations
- Rootless Docker: All containers run rootless with standardized host-to-container data mapping
- Configuration standards: Services configured via standardized environment variables
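The SMEP UID arithmetic itself lives in `machine/scripts/lib.sh`. As a loudly hypothetical sketch (the digit layout below is an assumption for illustration, not confirmed by this document), a UID might be composed from sheet, machine, environment, and service digits like this:

```shell
# HYPOTHETICAL digit layout - the real calculation is in machine/scripts/lib.sh.
# Assumed: UID = <sheet digit><machine digit><environment digit><2-digit service slot>
smep_uid() {
  local sheet="$1" machine="$2" env="$3" slot="$4"
  printf '%d%d%d%02d\n' "$sheet" "$machine" "$env" "$slot"
}

smep_uid 5 0 0 0   # prints 50000 - consistent with the topsheet station UID
```

Under a layout like this, two services on different sheets, machines, or environments can never collide on a UID, which is the property the philosophy above relies on.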
Service Categories:
- Home-grown services: Easy to write using standardized environment variables
- Third-party services: Require custom docker-compose.yml and setup.sh to fit tinsnip standards (e.g., LLDAP, proxies)
## SDLC
- We write code on the dev's machine.
- A terminal there runs `python3 -m http.server &` in `./`.
- A test machine installs this codebase by running `curl -fsSL "http://192.168.0.218:8000/install.sh?$(date +%s)" | REPO_URL="http://192.168.0.218:8000" bash`.
- From there we run tinsnip commands like `tin sheet create mysheet ds412plus.local`.
- We do not edit code on the test machine: all edits happen in dev, then get curled over.
- We do not patch individual files: curl the lot every time.
- We do not patch sheets, machines, or services: remove and recreate every time.
## Test Iteration Workflow
When iterating on changes (e.g., fixing marshal deployment):
```shell
# 1. Make changes on dev machine (this machine)

# 2. Reinstall tinsnip on test machine (geneticalgorithm)
ssh geneticalgorithm 'curl -fsSL "http://192.168.0.218:8000/install.sh?$(date +%s)" | REPO_URL="http://192.168.0.218:8000" bash'

# 3. Remove the machine being tested (use --delete-data during iteration)
ssh geneticalgorithm 'yes | TIN_SHEET=dynamicalsystem.com ~/.local/opt/dynamicalsystem.tinsnip/bin/tin machine rm marshal-prod --delete-data 2>&1 | head -100'

# 4. Recreate machine (INTERACTIVE - requires NAS password)
#    Cannot be automated via ssh - user must run manually on geneticalgorithm:
#    TIN_SHEET=dynamicalsystem.com tin machine create marshal prod ds412plus.local

# 5. Deploy service (INTERACTIVE - requires confirmation)
#    Cannot be automated via ssh - user must run manually on geneticalgorithm:
#    TIN_SHEET=dynamicalsystem.com tin service deploy marshal-prod marshal
```
Note: Steps 4 and 5 require interactive input (NAS sudo password and deployment confirmation), so they cannot be run via ssh. The user must execute them manually on the test machine.
## Sheet Concept
Sheets are tinsnip's organizational unit for isolating different service deployments. Think of them like separate "layers" or "environments" that don't interfere with each other.
Key Properties:
- Isolated UID Ranges: Each sheet gets its own sheet number (1-4 for user sheets, 5 reserved for topsheet), which determines its UID range
- Separate Storage: Each sheet can have its own NAS server and storage paths
- Independent Services: Services in different sheets never conflict with ports or UIDs
- Shared Registry: All sheets use topsheet's station-prod for registry management
Common Use Cases:
- Personal vs Work: `personal` sheet (UIDs 1xxxx) and `work` sheet (UIDs 2xxxx)
- Multi-tenant: `company-a` sheet (UIDs 3xxxx) and `company-b` sheet (UIDs 4xxxx)
- Environment Separation: `development` sheet and `production` sheet
The topsheet (Default Sheet):
- The default sheet (UID 5xxxx) where most homelab users deploy all their services
- Contains `station-prod`, which provides registry infrastructure for ALL sheets
- All machines mount `/mnt/station-prod` to access registries regardless of their sheet
- Most users will only ever use topsheet; additional sheets (1-4) are for advanced multi-tenant scenarios
Station Infrastructure:
- Only topsheet has a station (UID 50000)
- Station provides three registries accessible to all sheets:
  - Sheet Registry: `/mnt/station-prod/data/sheets` - maps sheet names to numbers
  - Machine Registries: `/mnt/station-prod/data/machines/<sheet>` - per-sheet machine number assignments
  - NAS Registry: `/mnt/station-prod/data/nas-credentials/nas-servers` - NAS server mappings
Sheet Management:
```shell
tin sheet create infrastructure   # Register new sheet (gets next available number 1-4)
tin sheet list                    # Show all registered sheets
tin sheet show infrastructure     # Show sheet number and details
tin sheet rm infrastructure      # Remove sheet and all its machines
```
## Project Architecture
tinsnip has two distinct but complementary systems:
### 1. Machine Infrastructure (`./machine/`)
Purpose: Creates the foundational infrastructure prerequisites for service deployment
Responsibilities:
- Creates service-specific users with SMEP UIDs
- Sets up NFS mounts with correct permissions
- Installs rootless Docker
- Creates XDG symbolic links
- Establishes machine registry for UID coordination
Workflow:
```shell
# Create infrastructure for a service
./machine/setup.sh gazette prod DS412plus

# Result: Machine ready to deploy 'gazette' service in 'prod' environment
```
### 2. Service Deployment (`./service/{service-name}/`)
Purpose: Deploys specific services onto prepared machine infrastructure
Responsibilities:
- Service-specific docker-compose.yml configuration
- Custom setup.sh for third-party service adaptation
- Environment variable configuration
- Service orchestration and management
Workflow:
```shell
# Deploy service to prepared machine (as service user)
sudo -u gazette-prod -i
cd /mnt/gazette-prod/service/gazette
docker compose up -d
```
## Project Structure
```
tinsnip/
├── install.sh                  # Downloads and installs tinsnip
├── setup.sh                    # Main setup orchestrator
├── machine/                    # Machine infrastructure setup
│   ├── setup.sh                # Machine setup entry point
│   ├── validate.sh             # Infrastructure validation suite
│   └── scripts/                # Implementation scripts
│       ├── lib.sh              # SMEP UID calculations
│       ├── setup_service.sh    # Complete service environment setup
│       ├── setup_station.sh    # Tinsnip station (machine registry)
│       ├── mount_nas.sh        # NFS mounting with XDG integration
│       └── install_docker.sh   # Rootless Docker installation
└── scripts/                    # Legacy deployment scripts
    ├── create_tinsnip_user.sh  # Creates tinsnip user and system users
    └── deploy_service.sh       # Generic service deployment script
```
**Note**: Service definitions are maintained in a separate repository:
- Default: `git@tangled.sh:dynamicalsystem.com/service`
- Custom: Set via `SERVICE_REPO_URL` environment variable
## Deployment Workflow
### 1. Install tinsnip
```shell
# Platform only
curl -fsSL "https://tangled.sh/dynamicalsystem.com/tinsnip/raw/main/install.sh?$(date +%s)" | bash

# Platform + service catalog
curl -fsSL "https://tangled.sh/dynamicalsystem.com/tinsnip/raw/main/install.sh?$(date +%s)" | INSTALL_SERVICES=true bash

cd ~/.local/opt/dynamicalsystem.tinsnip
```
### 2. Create Machine Infrastructure
```shell
# Prepare machine for a specific service environment
./machine/setup.sh <service> <environment> <nas-server>

# Examples:
./machine/setup.sh gazette prod DS412plus    # Production gazette
./machine/setup.sh lldap test 192.168.1.100  # Test LLDAP
```
### 3. Deploy Service
```shell
# Switch to service user (created by machine setup)
sudo -u <service>-<environment> -i

# Copy service from catalog (if using official services)
cp -r ~/.local/opt/dynamicalsystem.service/<service> /mnt/<service>-<environment>/service/
cd /mnt/<service>-<environment>/service/<service>

# Or create your own service following tinsnip conventions
mkdir -p /mnt/<service>-<environment>/service/<service>
cd /mnt/<service>-<environment>/service/<service>
# Add docker-compose.yml and any setup scripts

# Deploy using docker-compose
docker compose up -d

# Check status
docker compose ps
docker compose logs -f
```
### 4. Manage Services
```shell
# Service management (as service user)
sudo -u gazette-prod -i
cd /mnt/gazette-prod/service/gazette
docker compose restart
docker compose logs -f

# Access service data via XDG paths
ls ~/.local/share/dynamicalsystem/@gazette  # Data
ls ~/.config/dynamicalsystem/@gazette       # Config
ls ~/.local/state/dynamicalsystem/@gazette  # State/logs
```
### 5. Add New Service
Option A: Add to Service Catalog
- Fork or clone the service repository
- Create a `<new-service>/` directory
- Add a `docker-compose.yml` following tinsnip conventions
- Optional: Add a `setup.sh` for service-specific configuration
- Push to your service repository
- Deploy using the standard workflow

Option B: Local Service Only
- Run `./machine/setup.sh <new-service> <env> <nas>` to create infrastructure
- As the service user, create the service directly in `/mnt/<new-service>-<environment>/service/<new-service>/`
- Add a `docker-compose.yml` following tinsnip conventions
- Deploy with `docker compose up -d`
tinsnip Conventions for Services:
- Map data volumes to `/mnt/<service>-<environment>/{state,data,config}`
- Use environment variables for configuration
- Expose ports based on the service UID (e.g., 11300 for redis-prod)
- Run as a non-root user inside containers
See the service repository for examples.
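As a minimal sketch of these conventions, a hypothetical `docker-compose.yml` for a gazette prod deployment might look like the following (the image name, UID 51100, and port mapping are illustrative placeholders, not values from this document — real values come from the SMEP scheme):

```yaml
# Illustrative only: UID/port placeholders, not real SMEP assignments.
services:
  gazette:
    image: example/gazette:latest      # placeholder image
    user: "51100:51100"                # run as the service UID, not root
    ports:
      - "51100:8080"                   # host port derived from the service UID
    environment:
      GAZETTE_DATA_DIR: /data          # configuration via environment variables
    volumes:
      - /mnt/gazette-prod/config:/config
      - /mnt/gazette-prod/data:/data
      - /mnt/gazette-prod/state:/state
```

Because config, data, and state live on the NFS mount rather than in named Docker volumes, the container can be destroyed and recreated without data loss.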
## Architecture Benefits
- Infrastructure Standardization:
  - `./machine/` creates consistent, predictable infrastructure for any service
  - The S-M-E-P UID scheme ensures no conflicts across machines/environments/sheets
  - NFS permissions are automatically correct for each service
- Service Portability:
  - Config, data, and state preserved on NFS exports enable continuous delivery
  - Services can be stopped, machines rebuilt, and services redeployed without data loss
  - XDG integration makes service data accessible to user applications
- Development Flexibility:
  - Home-grown services: write to standardized environment variables for automatic integration
  - Third-party services: a custom docker-compose.yml adapts them to tinsnip standards
  - Rootless Docker ensures security without complexity
- Multi-tenancy Support:
  - Configurable sheets via the `TIN_SHEET` environment variable or `/etc/tinsnip-sheet`
  - Each sheet gets isolated UIDs, ports, and storage paths
  - See SHEET_CONFIGURATION.md for multi-tenant setups
- Operational Simplicity:
  - The service registry coordinates UID assignments automatically
  - Predictable port allocation based on UIDs
  - Standard Docker network and storage patterns
  - See DEPLOYMENT_STRATEGY.md for complete operational details
## Security Considerations
- Never run setup.sh as root or as service users
- Service users have no sudo access
- Each service runs as its dedicated user inside containers
- Secrets are generated per-service and stored with restrictive permissions
## External Access Architecture
tinsnip uses a file-based route sync approach for external access through a gateway VPS.
### Components
- Marshal Service (`marshal-prod`) - Route coordinator on the homelab
  - Validates service route configs against a JSON schema
  - Writes route files to `/mnt/marshal-prod/data/{sheet}/marshal/caddy-routes/`
  - Maintains a route registry at `/mnt/marshal-prod/data/{sheet}/marshal/route_registry.json`
  - Exposes an HTTP API on port 10200 for route registration
- Gateway VPS - Public-facing reverse proxy
  - Runs Caddy with a WireGuard tunnel to the homelab (10.100.0.0/24)
  - A dedicated `caddy-sync` user (not root) syncs routes via rsync
  - Converts JSON routes to Caddyfile format
  - Hot-reloads Caddy when routes change
  - Fully automated setup via `setup_route_sync.sh`
- Service Route Configs - Per-service routing
  - Services write route config to `/mnt/{service}-{env}/state/{sheet}/{service}/external_routes/caddy.json`
  - Services register with Marshal via `POST /external_route` with headers:
    - `X-Service-Name: service-name`
    - `X-Service-Environment: prod|dev`
  - Marshal validates, writes to its own mount, and tracks the route in its registry
### Flow
```
Service Setup → Creates route config → Registers with Marshal API
        ↓
Marshal validates → Writes to marshal-prod/caddy-routes/{service-env}.json
        ↓
Gateway caddy-sync → Syncs via rsync → Converts to Caddyfile → Reloads Caddy
  (every 5 min)       (SSH keys)        (jq parsing)           (zero downtime)
        ↓
External HTTPS → Gateway Caddy → WireGuard (10.100.0.2) → Service
```
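The five-minute pull in the flow above could be driven by a crontab entry for the `caddy-sync` user along these lines (a sketch; the actual schedule is installed by `setup_route_sync.sh`):

```
# Hypothetical crontab for caddy-sync: pull and apply routes every 5 minutes
*/5 * * * * /usr/local/bin/caddy-route-sync
```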
### Security Design
- Dedicated user: the gateway runs the sync as `caddy-sync` (not root)
- Minimal permissions: it can only reload Caddy and read/write route configs
- Pull-based: the gateway pulls configs; the homelab cannot push
- SSH keys: key-based authentication over the WireGuard tunnel
- Sudo limits: only specific commands are allowed via `/etc/sudoers.d/caddy-sync`
- No API exposure: the gateway has no inbound API; it only reads files
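The sudo limits could be expressed in `/etc/sudoers.d/caddy-sync` roughly as follows (a sketch only; the exact command list is written by `setup_route_sync.sh`, and the `systemctl reload caddy` line is an assumption about how the reload is performed):

```
# Sketch of /etc/sudoers.d/caddy-sync: allow only a Caddy reload, nothing else
caddy-sync ALL=(root) NOPASSWD: /usr/bin/systemctl reload caddy
```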
### Why File-Based Sync?
- Security: No API exposed from gateway, homelab can't push
- Simplicity: Standard Unix tools (rsync, cron, jq)
- Reliability: Works even if gateway is temporarily unreachable
- Auditability: All configs in files, easy to inspect
- Standard: Same pattern as GitOps/Kubernetes ConfigMaps
- Reproducible: Complete automated setup and teardown scripts
### Setup
Gateway setup is fully automated:
```shell
# Copy scripts to gateway
scp gateway/setup_route_sync.sh gateway.dynamicalsystem.com:/tmp/

# Run automated setup
ssh gateway.dynamicalsystem.com
sudo bash /tmp/setup_route_sync.sh

# Script pauses to display SSH public key
# Add key to marshal-prod@homelab, then press Enter
```
See gateway/ROUTE_SYNC_SETUP.md for detailed instructions and troubleshooting.
### Registering External Routes
```shell
# On homelab, register service with Marshal
curl -X POST http://localhost:10200/external_route \
  -H "X-Service-Name: bsky-pds" \
  -H "X-Service-Environment: dev"

# Gateway syncs within 5 minutes (or trigger manually)
sudo -u caddy-sync /usr/local/bin/caddy-route-sync

# Test external access
curl https://bsky-pds-dev.dynamicalsystem.com/xrpc/_health
```
## Validation (Optional)
### Machine Setup Validation
The project includes optional validation scripts for the machine setup infrastructure. These help verify the setup logic without making system changes.
For detailed implementation flow including NFS mounting, user creation, and XDG integration sequence, see machine/scripts/mount_nas.sh.
```shell
# Run all validations (optional)
cd machine
./validate.sh

# Run individual validation suites
./scripts/validate_functions.sh        # Core functions (UID, ports, paths)
./scripts/validate_port_edge_cases.sh  # Port conflict detection
./scripts/validate_dry_run.sh          # Setup validation without system changes
```
Validation Structure:
- Entrypoint: `machine/validate.sh` - runs all validation suites
- Individual tests: `machine/scripts/validate_*.sh` - specific test suites
- Test orchestration: the entrypoint discovers and runs all validation scripts automatically
Validation Coverage:
- UID convention validation (S-M-E-P scheme)
- Automatic port allocation and conflict prevention
- NFS path generation for different services/environments
- XDG Base Directory compliance
Running validations before deployment is recommended but not required.
## Adding New Services
When adding new services:
- Follow the existing LLDAP pattern
- Create appropriate system users if needed
- Use the shared tinsnip_network
- Document the ports and configuration