docs: fix inaccuracies in database migrations documentation (#487)

- Fix local dev section: uses Neon dev database, not localhost
- Correct Fly.io app name from plyr-api to relay-api throughout
- Add "current capabilities" section documenting that migration
  isolation (via release_command) and multi-environment pipeline
  (dev/staging/prod) are already implemented
- Update database diagram to show all four databases (dev, staging,
  prod on Neon + local test)
- Remove misleading "future considerations" for features that exist

Co-authored-by: Claude <noreply@anthropic.com>

authored by zzstoatzz.io and Claude, committed by GitHub (6a372817 8e2ad219)

Changed files: +48 -36
docs/deployment/database-migrations.md
@@ -77,12 +77,12 @@
 │ .env file:               │    │ fly secrets:             │
 │ DATABASE_URL=            │    │ DATABASE_URL=            │
 │ postgresql+asyncpg://    │    │ postgresql://            │
-│ localhost:5432/plyr      │    │ [neon connection]        │
+│ [neon dev connection]    │    │ [neon prod connection]   │
 │                          │    │                          │
 │ when you run:            │    │ when fly.io runs:        │
 │ uv run alembic upgrade   │    │ release_command:         │
 │                          │    │ uv run alembic upgrade   │
-│ → migrates LOCAL db      │    │ → migrates PROD db       │
+│ → migrates DEV db        │    │ → migrates PROD db       │
 └──────────────────────────┘    └──────────────────────────┘
 ```
 
@@ -90,26 +90,26 @@
 
 1. **no shared configuration**: local and production environments have completely separate `DATABASE_URL` values
 2. **environment-specific secrets**: production database URL is stored in fly.io secrets, never in code
-3. **explicit context**: you cannot accidentally run a production migration locally because your local `DATABASE_URL` points to localhost
-4. **explicit context**: fly.io cannot run migrations against your local database because it only knows about the production `DATABASE_URL`
+3. **explicit context**: you cannot accidentally run a production migration locally because your local `DATABASE_URL` points to the Neon dev database
+4. **explicit context**: fly.io cannot run migrations against your dev database because it only knows about the production `DATABASE_URL`
 
 **concrete example:**
 
 ```bash
 # local development
 $ cat .env
-DATABASE_URL=postgresql+asyncpg://localhost:5432/plyr
+DATABASE_URL=postgresql+asyncpg://neon_user:***@ep-muddy-flower-98795112.us-east-2.aws.neon.tech/plyr-dev
 
 $ uv run alembic upgrade head
-# connects to localhost:5432/plyr
-# migrates your local dev database
+# connects to neon dev database (plyr-dev)
+# migrates your development database
 
 # production (inside fly.io release machine)
 $ echo $DATABASE_URL
-postgresql://neon_user:***@ep-cool-moon-123.us-east-2.aws.neon.tech/plyr-prod
+postgresql://neon_user:***@ep-cold-butterfly-11920742.us-east-1.aws.neon.tech/plyr-prd
 
 $ uv run alembic upgrade head
-# connects to neon production database
+# connects to neon production database (plyr-prd)
 # migrates your production database
 ```
 
@@ -119,12 +119,12 @@
 ```
 1. developer edits model in src/backend/models/
 2. runs: uv run alembic revision --autogenerate -m "description"
-3. alembic reads DATABASE_URL from .env (localhost)
+3. alembic reads DATABASE_URL from .env (neon dev)
 4. generates migration by comparing:
    - current model state (code)
-   - current database state (local postgres)
+   - current database state (neon dev database)
 5. runs: uv run alembic upgrade head
-6. migration applies to local database
+6. migration applies to dev database
 ```
 
 production deployment:
@@ -161,24 +161,28 @@
 
 ```
 ┌──────────────────────────────────────────────────────────────┐
-│ three completely separate databases:                         │
+│ four separate databases (three neon instances + local test): │
 │                                                              │
-│ 1. dev (localhost:5432/plyr)                                 │
+│ 1. dev (neon: plyr-dev / muddy-flower-98795112)              │
 │    - for local development                                   │
 │    - set via .env: DATABASE_URL=postgresql+asyncpg://...     │
 │    - migrations run manually: uv run alembic upgrade head    │
 │                                                              │
-│ 2. test (localhost:5433/plyr_test)                           │
-│    - for automated tests                                     │
-│    - set via conftest.py fixture                             │
-│    - schema created by tests/conftest.py                     │
-│    - no migrations (schema created from models directly)     │
+│ 2. staging (neon: plyr-stg / frosty-math-37367092)           │
+│    - for staging environment                                 │
+│    - set via fly secrets on relay-api-staging                │
+│    - migrations run automatically via release_command        │
 │                                                              │
-│ 3. prod (neon.tech cloud)                                    │
+│ 3. prod (neon: plyr-prd / cold-butterfly-11920742)           │
 │    - for production traffic                                  │
 │    - set via fly secrets: DATABASE_URL=postgresql://...      │
 │    - migrations run automatically via release_command        │
 │                                                              │
+│ 4. test (localhost:5433/plyr_test)                           │
+│    - for automated tests only                                │
+│    - set via conftest.py fixture                             │
+│    - schema created from models directly (no migrations)     │
+│                                                              │
 │ these databases never interact or share configuration        │
 └──────────────────────────────────────────────────────────────┘
 ```
@@ -189,8 +193,8 @@
 
 1. **dockerfile didn't include migration files** - had to create PR #14 to add `COPY alembic.ini` and `COPY alembic ./alembic`
 2. **alembic version tracking out of sync** - production database had `user_preferences` table but alembic thought version was older, causing "relation already exists" errors
-3. **manual stamp needed** - had to run `flyctl ssh console -a plyr-api -C "uv run alembic stamp 9e8c7aa5b945"` to fix version tracking
-4. **manual migration execution** - had to run `flyctl ssh console -a plyr-api -C "uv run alembic upgrade head"` after deployment
+3. **manual stamp needed** - had to run `flyctl ssh console -a relay-api -C "uv run alembic stamp 9e8c7aa5b945"` to fix version tracking
+4. **manual migration execution** - had to run `flyctl ssh console -a relay-api -C "uv run alembic upgrade head"` after deployment
 5. **blocked deployment** - couldn't deploy until all manual steps completed
 
 this took ~30 minutes of manual intervention for what should be automatic.
@@ -384,26 +388,34 @@
 - our migrations are simple and fast (~3 seconds)
 - can revisit when we have complex, long-running migrations
 
+## current capabilities
+
+**migration isolation via release_command** (already implemented):
+- fly.io's `release_command` runs migrations in a separate temporary machine before deployment
+- migrations complete before app serves traffic (no inconsistent state)
+- deployment automatically aborts if migration fails
+- this provides similar benefits to kubernetes init containers
+
+**multi-environment pipeline** (already implemented):
+- dev → staging → production progression via three neon databases
+- migrations tested locally against neon dev first
+- staging deployment validates migrations before production
+- automated via GitHub Actions
+
 ## future considerations
 
 as plyr.fm scales, we may want to explore:
 
-**migration init containers** (if we move to kubernetes/docker compose):
-- separate container for migrations before app starts
-- matches reference project N's pattern
-- better isolation and observability
-
 **neon branch-based migrations** (for complex changes):
 - test migrations on database branch first
 - promote branch to production (instant swap)
 - zero downtime, instant rollback
 - useful for high-risk schema changes
 
-**multi-environment pipeline**:
-- dev → staging → production progression
-- test migrations in lower environments first
-- automated smoke tests after migration
-- canary deployments for schema changes
+**automated smoke tests**:
+- run basic API health checks after migration completes
+- verify critical queries still work
+- alert if performance degrades significantly
 
 ## migration best practices
 
@@ -415,7 +427,7 @@
 uv run alembic current
 
 # production
-flyctl ssh console -a plyr-api -C "uv run alembic current"
+flyctl ssh console -a relay-api -C "uv run alembic current"
 ```
 
 2. **ensure schemas are in sync**
@@ -522,7 +534,7 @@
 
 a. **downgrade and fix**:
 ```bash
-flyctl ssh console -a plyr-api -C "uv run alembic downgrade -1"
+flyctl ssh console -a relay-api -C "uv run alembic downgrade -1"
 # fix migration file locally
 # commit and redeploy
 ```
@@ -537,7 +549,7 @@
 
 c. **manual SQL fix**:
 ```bash
-flyctl ssh console -a plyr-api
+flyctl ssh console -a relay-api
 # connect to database
 # run manual SQL to fix state
 # stamp to correct revision
@@ -612,6 +624,6 @@
 
 ---
 
-**last updated**: 2025-11-02
+**last updated**: 2025-12-05
 **status**: fully automated via fly.io release_command ✓
 **owner**: @zzstoatzz
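
The isolation this diff documents depends on each environment seeing a different `DATABASE_URL`. As a rough illustration only (this is not code from the repository), the sketch below maps a URL to an environment using the database names shown in the updated diagram and refuses to continue on a mismatch. The function names, the `ENV_BY_DATABASE` table, and the idea of calling such a check before `alembic upgrade` are all assumptions made for the example.

```python
from urllib.parse import urlsplit

# Hypothetical lookup table; the database names are the ones shown in the
# diagram (plyr-dev, plyr-stg, plyr-prd on Neon, plyr_test on localhost).
ENV_BY_DATABASE = {
    "plyr-dev": "dev",
    "plyr-stg": "staging",
    "plyr-prd": "prod",
    "plyr_test": "test",
}


def classify_database_url(url: str) -> str:
    """Return the environment a DATABASE_URL points at, or "unknown"."""
    # The database name is the URL path without its leading slash;
    # any query string (e.g. SSL options) is parsed separately and ignored.
    database = urlsplit(url).path.lstrip("/")
    return ENV_BY_DATABASE.get(database, "unknown")


def assert_expected_environment(url: str, expected: str) -> None:
    """Raise if the URL targets a different environment than expected."""
    actual = classify_database_url(url)
    if actual != expected:
        raise RuntimeError(
            f"refusing to migrate: expected {expected!r}, "
            f"but DATABASE_URL points at {actual!r}"
        )
```

A local wrapper script could call `assert_expected_environment(os.environ["DATABASE_URL"], "dev")` before invoking alembic. Whether such a guard is worth adding on top of the separate `.env` and fly secrets is a judgment call for the maintainers.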