
Service Repository Integration Design#

Overview#

tinsnip and the service catalog are separate repositories that work together: tinsnip provides the platform infrastructure, while the service repository provides deployable applications.

Installation Structure#

Clean Install#

~/.local/opt/
├── dynamicalsystem.tinsnip/      # Platform installation
│   ├── machine/                  # Infrastructure setup
│   ├── scripts/                  # Platform utilities
│   └── setup.sh                  # Main orchestrator
└── dynamicalsystem.service/      # Service catalog (optional)
    ├── lldap/                    # Example service
    ├── gazette/                  # Future services
    └── README.md                 # Service documentation

Sheet-aware Installation#

For custom sheets, the structure follows the pattern:

~/.local/opt/
├── {sheet}.tinsnip/          # Platform for sheet
└── {sheet}.service/          # Services for sheet

Service Repository URL Strategy#

Default Behavior: Infer from tinsnip repo#

# If tinsnip is from: https://github.com/dynamicalsystem/tinsnip
# Then service defaults to: https://github.com/dynamicalsystem/service

# If tinsnip is from: git@gitlab.acmecorp.com:infra/tinsnip  
# Then service defaults to: git@gitlab.acmecorp.com:infra/service

# Logic: Replace "tinsnip" with "service" in the repo URL

Override: SERVICE_REPO_URL#

# Organizations can override with their own service catalog
SERVICE_REPO_URL="git@github.com:mycompany/our-services" ./install.sh

# This allows using official tinsnip with custom services

Implementation in install.sh#

# Get tinsnip repo URL (from git remote or hardcoded)
get_tinsnip_repo_url() {
    if [[ -d .git ]]; then
        git remote get-url origin 2>/dev/null || echo "$DEFAULT_REPO_URL"
    else
        echo "$DEFAULT_REPO_URL"
    fi
}

# Infer service repo from tinsnip repo
infer_service_repo() {
    local tinsnip_url="$1"
    # Replace "tinsnip" with "service" in the URL
    echo "${tinsnip_url/tinsnip/service}"
}

# Use SERVICE_REPO_URL if set, otherwise infer
SERVICE_REPO="${SERVICE_REPO_URL:-$(infer_service_repo "$(get_tinsnip_repo_url)")}"

Service Discovery#

1. Installation-time Discovery#

During setup, tinsnip checks for the service catalog in the following order:

# Priority order for service discovery
1. $SERVICE_PATH (environment variable - for local development)
2. ~/.local/opt/{sheet}.service (standard installation)
3. ./service (backward compatibility if exists)
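
The discovery order above could be sketched as a small shell helper. This is a hedged sketch — the function name `discover_service_catalog` and the `sheet` argument are assumptions, not part of the current setup script:

```shell
# Hypothetical helper implementing the discovery priority above.
# discover_service_catalog <sheet> prints the first catalog path found.
discover_service_catalog() {
    local sheet="$1"
    if [[ -n "${SERVICE_PATH:-}" && -d "$SERVICE_PATH" ]]; then
        echo "$SERVICE_PATH"                          # 1. local development override
    elif [[ -d "$HOME/.local/opt/${sheet}.service" ]]; then
        echo "$HOME/.local/opt/${sheet}.service"      # 2. standard installation
    elif [[ -d ./service ]]; then
        echo "./service"                              # 3. backward compatibility
    else
        return 1                                      # no catalog found
    fi
}
```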

2. Installation Options#

# Platform only
curl -fsSL "https://url/install.sh" | bash

# Platform + services (using inferred service repo)
curl -fsSL "https://url/install.sh" | INSTALL_SERVICES=true bash

# Platform + custom service repo
curl -fsSL "https://url/install.sh" | SERVICE_REPO_URL="git@example.com:custom/services" bash

Integration Points#

1. Machine Setup Integration#

Machine setup can optionally validate that a service definition exists:

tin machine create gazette prod nas-server
# Checks if ~/.local/opt/{sheet}.service/gazette exists
# Warns if service definition not found but continues
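
A minimal sketch of that warn-but-continue check (the function name is hypothetical; `tin machine create` is assumed to call something equivalent):

```shell
# Hypothetical validation used by machine setup: warn if the service
# definition is missing from the catalog, but do not abort.
check_service_definition() {
    local sheet="$1" service="$2"
    local catalog="$HOME/.local/opt/${sheet}.service"
    if [[ ! -d "$catalog/$service" ]]; then
        echo "warning: no service definition for '$service' in $catalog" >&2
    fi
    return 0   # always continue; the warning is informational
}
```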

2. Service Deployment Integration#

Services are deployed from their catalog location:

# As service user
sudo -u gazette-prod -i

# Service files are available via symlink
cd ~/service/gazette  # -> /mnt/gazette-prod/service/gazette

# Or copy from catalog during deployment
cp ~/.local/opt/dynamicalsystem.service/gazette/* /mnt/gazette-prod/service/gazette/

3. Machine Registry Integration#

The station tracks deployed services, not available services:

  • Machine Registry (/mnt/station-prod/data/machines): Simple text file tracking machine name → number mappings
  • Service Catalog (~/.local/opt/{sheet}.service/): Git repository containing actual service implementations
  • No tight coupling between tinsnip platform and specific services

Note: The YAML-based service catalog metadata concept was removed as it was unused. Service metadata lives in the service repository itself (docker-compose.yml, README.md, etc.).
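
As an illustration, reading the registry could be as simple as the following sketch. The one-mapping-per-line "name number" format is an assumption based on the description above:

```shell
# Hypothetical lookup against the machine registry text file.
# machine_number <name> [registry-file] prints the number mapped to a machine name.
machine_number() {
    local registry="${2:-/mnt/station-prod/data/machines}"
    # Print the second field of the first line whose first field matches.
    awk -v name="$1" '$1 == name { print $2; exit }' "$registry"
}
```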

Platform Updates to Support External Services#

1. Update install.sh#

# Add service catalog installation option
INSTALL_SERVICES="${INSTALL_SERVICES:-false}"
SERVICE_REPO_URL="${SERVICE_REPO_URL:-}"  # Optional override

# Infer service repo from tinsnip repo
infer_service_repo() {
    local tinsnip_url="$1"
    echo "${tinsnip_url/tinsnip/service}"
}

# Optional service installation
if [[ "$INSTALL_SERVICES" == "true" ]]; then
    install_service_catalog
fi
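
The referenced `install_service_catalog` is not defined above; one plausible sketch follows. The destination path, the `SHEET` default, and the `--ff-only` update policy are assumptions:

```shell
# Replace "tinsnip" with "service" in the repo URL (as described earlier).
infer_service_repo() {
    echo "${1/tinsnip/service}"
}

# Hypothetical install_service_catalog: clone the catalog on first install,
# fast-forward it on subsequent runs.
install_service_catalog() {
    local tinsnip_url="$1"
    local url="${SERVICE_REPO_URL:-$(infer_service_repo "$tinsnip_url")}"
    local dest="$HOME/.local/opt/${SHEET:-dynamicalsystem}.service"
    if [[ -d "$dest/.git" ]]; then
        git -C "$dest" pull --ff-only
    else
        git clone "$url" "$dest"
    fi
}
```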

2. Update documentation#

  • Remove service-specific content from tinsnip docs
  • Add "Service Catalog" section explaining the separation
  • Document SERVICE_REPO_URL override option

3. Create minimal service example#

Keep one example in tinsnip to show the interface:

examples/
└── service-template/
    ├── docker-compose.yml    # Template showing tinsnip conventions
    └── README.md            # How to create tinsnip services
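
A skeleton for that template might look like the following. This is hypothetical: the image name, port, and mount path are placeholders illustrating the conventions described in this document, not a confirmed interface:

```yaml
# examples/service-template/docker-compose.yml — hypothetical skeleton
services:
  app:
    image: example/app:latest     # placeholder image
    ports:
      - "8080:8080"               # placeholder; real services follow the UID/port scheme
    volumes:
      - /mnt/app-prod/data:/data  # NFS-backed persistence per tinsnip conventions
```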

Benefits of This Design#

  1. Clean Separation: Platform doesn't need to know about specific services
  2. Smart Defaults: Service repo intelligently inferred from tinsnip source
  3. Easy Override: Organizations can point to their own catalogs
  4. No Forking Required: Use official tinsnip with custom services via SERVICE_REPO_URL
  5. Predictable Locations: Services always in ~/.local/opt/{sheet}.service

Example Workflows#

Deploy from Inferred Catalog#

# Install tinsnip + services (will infer service repo)
curl -fsSL "https://..." | INSTALL_SERVICES=true bash

# Setup machine
cd ~/.local/opt/dynamicalsystem.tinsnip
tin machine create lldap prod nas-server

# Deploy service
sudo -u lldap-prod -i
cp -r ~/.local/opt/dynamicalsystem.service/lldap /mnt/lldap-prod/service/
cd /mnt/lldap-prod/service/lldap
docker compose up -d

Deploy with Custom Service Catalog#

# Install tinsnip + custom services
curl -fsSL "https://..." | SERVICE_REPO_URL="git@github.com:acmecorp/services" INSTALL_SERVICES=true bash

# Setup machine for custom service
cd ~/.local/opt/dynamicalsystem.tinsnip
tin machine create myapp prod nas-server

# Deploy custom service
sudo -u myapp-prod -i
cp -r ~/.local/opt/dynamicalsystem.service/myapp /mnt/myapp-prod/service/
cd /mnt/myapp-prod/service/myapp
docker compose up -d

Implementation Experiences#

During the first real deployment of the separated service repository architecture, several pain points and successes were discovered:

Pain Points Encountered#

  1. NFS Export Detection Issues

    • The check_nfs_exists function repeatedly asks to set up exports that already exist
    • Even after verifying exports with exportfs -v, the setup script doesn't detect them properly
    • Impact: Confusing user experience, requiring manual confirmation multiple times
    • Root Cause: The NFS detection logic times out too quickly or has permission issues
  2. Script Path Resolution Problems

    • setup_service.sh looked for scripts in scripts/scripts/ instead of scripts/
    • Impact: Setup failed with "file not found" errors
    • Fix Applied: Updated paths in setup.sh and setup_service.sh; remaining legacy path issues were resolved in the CLI refactor
  3. Rootless Docker Systemd Issues

    • Service users get "Failed to connect to bus: No medium found" when using systemctl
    • Docker containers fail with cgroup errors: "Interactive authentication required"
    • Impact: Cannot run containers as service users
    • Root Cause: Service users don't have proper systemd user sessions or cgroup permissions
  4. Service Catalog Access

    • Service users (e.g., lldap-test) cannot access ~simonhorrobin/.local/opt/dynamicalsystem.service/
    • No sudo access for service users to copy files
    • Impact: Manual intervention required to copy service definitions
    • Workaround: Admin user must copy files and chown them
  5. Installation Script Completeness

    • The install.sh script doesn't properly include the machine directory
    • Git repository must be cloned manually instead of using the curl installer
    • Impact: Confusing installation experience

What Worked Well#

  1. SMEP UID Scheme

    • UIDs calculated correctly: station-prod (11000), lldap-test (11210)
    • Clear separation between services and environments
    • Port allocation follows UID scheme as designed
  2. NFS Mounting

    • Once exports are created, mounting works reliably
    • Correct ownership preserved through all_squash
    • XDG symlinks created successfully
  3. Directory Structure

    • Clean separation of /mnt/<service>-<environment>/{data,config,state,service}
    • Service definitions isolated in their own directories
    • Proper ownership maintained throughout
  4. Service Repository Separation

    • External service repository clones successfully
    • Service definitions are clean and self-contained
    • Easy to understand docker-compose.yml files

Recommended Improvements#

  1. Fix NFS Detection

    • Increase timeout in check_nfs_exists function
    • Add better error handling and logging
    • Consider using showmount or rpcinfo for detection
  2. Resolve Docker Issues

    • Investigate systemd-logind configuration for service users
    • Consider using lingering sessions: loginctl enable-linger
    • Document manual Docker startup procedure as fallback
  3. Improve Service Catalog Access

    • Consider mounting service catalog via NFS
    • Or copy to a shared location like /opt/tinsnip/services
    • Or include in machine setup to copy to service user homes
  4. Enhance Installation Process

    • Update install.sh to properly fetch all components
    • Add verification step to ensure complete installation
    • Consider providing pre-built archives as alternative
  5. Add Validation Steps

    • Verify all paths exist before executing scripts
    • Add dry-run mode for testing
    • Better error messages with suggested fixes
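
The showmount-based detection suggested above could be sketched as follows. The function names are hypothetical; `showmount -e` prints an "Export list for <host>:" header followed by one export per line:

```shell
# Parse `showmount -e` output (fed on stdin) for an exact export-path match.
parse_exports_for() {
    # Skip the header line; exit 0 if the path appears as the first field.
    awk -v p="$1" 'NR > 1 && $1 == p { found = 1 } END { exit !found }'
}

# Hypothetical replacement for check_nfs_exists: no timeout-sensitive probing,
# just ask the server for its export list.
nfs_export_exists() {
    local server="$1" path="$2"
    showmount -e "$server" 2>/dev/null | parse_exports_for "$path"
}
```

A caller could then run `nfs_export_exists nas-server /mnt/lldap-prod` and only offer to create the export when it returns non-zero.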

Lessons Learned#

  • The separation of tinsnip and services is conceptually sound
  • Real-world deployment reveals integration challenges
  • Service users need special consideration for Docker and file access
  • NFS setup verification needs to be more robust
  • Documentation should include troubleshooting guide for common issues

Open Discussion Items#

1. Docker Volume Persistence Architecture#

Problem: Services currently use Docker named volumes (e.g., lldap_data:/data) which store data in Docker's internal storage (~/.local/share/docker/volumes/). This defeats tinsnip's core value proposition of NFS-backed persistence for continuous delivery.

Current Impact:

  • Service data is lost when machines are rebuilt
  • Cannot achieve true continuous delivery model
  • Inconsistent with tinsnip's architecture philosophy

Potential Solutions:

  • Automatically generate docker-compose.override.yml during service setup to map volumes to /mnt/<service>-<environment>/* paths
  • Modify service repository docker-compose.yml files to use bind mounts by default
  • Create service setup script convention that handles volume mapping
  • Use environment variable substitution in docker-compose.yml for volume paths
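
The override-file approach could look like this sketch. The lldap service name and mount path follow the examples earlier in this document; whether such a file is generated automatically remains an open question:

```yaml
# docker-compose.override.yml — hypothetical, generated during service setup.
# Maps the service's data onto tinsnip's NFS-backed mount instead of a
# Docker-internal named volume.
services:
  lldap:
    volumes:
      - /mnt/lldap-prod/data:/data
```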

Questions:

  • Should this be handled in the tinsnip repository or the service repository?
  • How do we maintain service portability while ensuring tinsnip integration?

2. Service Debugging and Change Control Workflow#

Problem: During service deployment debugging, errors span multiple repositories:

  • Platform issues (Docker setup, NFS mounting) are in the tinsnip repository
  • Service configuration issues (docker-compose.yml, setup scripts) are in the service repository
  • Debugging often happens directly on the deployment box with ad-hoc fixes

Current Impact:

  • Manual fixes applied to boxes are lost when machines are rebuilt
  • Changes made during debugging don't get captured in version control
  • Difficult to maintain discipline around continuous delivery principles
  • Knowledge transfer issues when fixes aren't documented

Workflow Challenges:

  • How to ensure all debugging fixes get committed to appropriate repositories?
  • How to maintain CD discipline when rapid iteration is needed for debugging?
  • How to handle cross-repository dependencies during troubleshooting?
  • How to preserve institutional knowledge from debugging sessions?

Questions:

  • Should we require all changes to go through proper git workflow even during debugging?
  • How do we balance speed of debugging with change control discipline?
  • What tooling could help capture and replay manual fixes in automation?
  • How do we ensure debugging insights get captured in documentation?