feat: parallel test execution with xdist template databases (#540)

* feat: migrate ATProto sync and teal scrobbling to docket

- add sync_atproto and scrobble_to_teal as docket background tasks
- remove all fallback/bifurcation code - Redis is always required
- simplify background.py (remove is_docket_enabled)
- update auth.py and playback.py to use new schedulers
- add Redis service to CI workflow
- update tests for simplified docket-only flow

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: update test for schedule_atproto_sync, fix cookies deprecation

- test_list_record_sync: patch schedule_atproto_sync instead of removed sync_atproto_records
- test_hidden_tags_filter: move cookies to client constructor to fix httpx deprecation warning

* feat: template database pattern for fast parallel test execution

- use template db + clone for xdist workers (instant file copy vs migrations)
- advisory locks coordinate template creation between workers
- patch settings.database.url per-worker for production code compatibility

* ci: enable parallel test execution with xdist

* fix: ensure test_app fixture depends on db_session for xdist

fixes race condition where tests using test_app without db_session
would run before database URL was patched for the worker

* docs: add testing philosophy and xdist parallel execution guide

- template database pattern for fast parallel tests
- behavior vs implementation testing philosophy
- common pitfalls with fixture dependencies
- guidance on when private function tests are acceptable

---------

Co-authored-by: Claude <noreply@anthropic.com>

authored by zzstoatzz.io Claude and committed by GitHub 1cc36aa7 1a4e897c

Changed files

docs/testing/README.md (+161)
# testing

testing philosophy and infrastructure for plyr.fm.

## philosophy

### test behavior, not implementation

tests should verify *what* the code does, not *how* it does it. this makes tests resilient to refactoring and keeps them focused on user-facing behavior.

**good**: "when a user likes a track, the like count increases"
**bad**: "when `_increment_like_counter` is called, it executes `UPDATE tracks SET...`"

signs you're testing implementation:
- mocking internal functions that aren't boundaries
- asserting on SQL queries or ORM calls
- testing private methods directly
- tests break when you refactor without changing behavior

### test at the right level

- **unit tests**: pure functions, utilities, data transformations
- **integration tests**: API endpoints with real database
- **skip mocks when possible**: prefer real dependencies (postgres, redis) over mocks

### keep tests fast

slow tests don't get run. we use parallel execution (xdist) and template databases to keep the full suite under 30 seconds.

## parallel execution with xdist

we run tests in parallel using pytest-xdist. each worker gets its own isolated database.

### how it works

1. **template database**: first worker creates a template with all migrations applied
2. **clone per worker**: each xdist worker clones from template (`CREATE DATABASE ... WITH TEMPLATE`)
3. **instant setup**: cloning is a file copy - no migrations needed per worker
4. **advisory locks**: coordinate template creation between workers

this is a common pattern for fast parallel test execution in large codebases.
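
the template-and-clone steps above can be sketched roughly as follows. this is a minimal illustration, not the project's actual conftest: `TEMPLATE_DB`, `worker_db_name`, `setup_worker_database`, the `conn.execute(sql)` interface, and the two callbacks are all hypothetical names chosen for the example. only `pg_advisory_lock`, `pg_advisory_unlock`, and `CREATE DATABASE ... WITH TEMPLATE` are real postgres features.

```python
import zlib
from typing import Callable, Protocol


class Connection(Protocol):
    """anything with a blocking execute(sql) method (hypothetical interface)."""

    def execute(self, sql: str) -> object: ...


TEMPLATE_DB = "test_template"  # hypothetical template database name


def advisory_lock_key(name: str) -> int:
    # pg_advisory_lock takes a bigint; crc32 gives every worker the same key
    return zlib.crc32(name.encode())


def worker_db_name(worker_id: str) -> str:
    # xdist worker ids look like "gw0", "gw1" ("master" when xdist is off)
    return f"test_{worker_id}"


def setup_worker_database(
    conn: Connection,
    worker_id: str,
    template_exists: Callable[[], bool],
    run_migrations: Callable[[str], None],
) -> str:
    """create the template once, then clone it for this worker."""
    key = advisory_lock_key(TEMPLATE_DB)
    # only one worker at a time gets past this line
    conn.execute(f"SELECT pg_advisory_lock({key})")
    try:
        if not template_exists():
            conn.execute(f'CREATE DATABASE "{TEMPLATE_DB}"')
            run_migrations(TEMPLATE_DB)
    finally:
        conn.execute(f"SELECT pg_advisory_unlock({key})")

    name = worker_db_name(worker_id)
    conn.execute(f'DROP DATABASE IF EXISTS "{name}"')
    # cloning copies the template's files, so no migrations run here
    conn.execute(f'CREATE DATABASE "{name}" WITH TEMPLATE "{TEMPLATE_DB}"')
    return name
```

in a real setup the connection would need to run in autocommit mode, since postgres refuses `CREATE DATABASE` inside a transaction block, and the template must have no open connections while it is being cloned.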

### the fixture chain

```
test_database_url (session)
  └── creates template db (once, with advisory lock)
  └── clones worker db from template
  └── patches settings.database.url for this worker

_database_setup (session)
  └── marker that db is ready

_engine (function)
  └── creates engine for test_database_url
  └── clears ENGINES cache

_clear_db (function)
  └── calls clear_database() procedure after each test

db_session (function)
  └── provides AsyncSession for test
```

### common pitfall: missing db_session dependency

if a test uses the FastAPI app but doesn't depend on `db_session`, the database URL won't be patched for the worker. the test will connect to the wrong database.

**wrong**:
```python
@pytest.fixture
def test_app() -> FastAPI:
    return app

async def test_something(test_app: FastAPI):
    # may connect to wrong database in xdist!
    ...
```

**right**:
```python
@pytest.fixture
def test_app(db_session: AsyncSession) -> FastAPI:
    _ = db_session  # ensures db fixtures run first
    return app

async def test_something(test_app: FastAPI):
    # database URL is correctly patched
    ...
```

## running tests

```bash
# from repo root
just backend test

# run specific test
just backend test tests/api/test_tracks.py

# run with coverage
just backend test --cov

# run single-threaded (debugging)
just backend test -n 0
```

## writing good tests

### do

- use descriptive test names that describe behavior
- one assertion per concept (multiple asserts ok if testing one behavior)
- use fixtures for setup, not test body
- test edge cases and error conditions
- add regression tests when fixing bugs

### don't

- use `@pytest.mark.asyncio` - we use `asyncio_mode = "auto"`
- mock database calls - use real postgres
- test ORM internals or SQL structure
- leave tests that depend on execution order
- skip tests instead of fixing them (unless truly environment-specific)

## when private function tests are acceptable

generally avoid testing private functions (`_foo`), but there are pragmatic exceptions:

**acceptable**:
- pure utility functions with complex logic (string parsing, data transformation)
- functions that are difficult to exercise through public API alone
- when the private function is a clear unit with stable interface

**not acceptable**:
- implementation details that might change (crypto internals, caching strategy)
- internal orchestration functions
- anything that's already exercised by integration tests

the key question: "if i refactor, will this test break even though behavior didn't change?"

## database fixtures

### clear_database procedure

instead of truncating tables between tests (slow), we use a stored procedure that deletes only rows created during the test:

```sql
CALL clear_database(:test_start_time)
```

this deletes rows where `created_at > test_start_time`, preserving any seed data.

### why not transactions?

rolling back transactions is faster, but:
- can't test commit behavior
- can't test constraints properly
- some ORMs behave differently in uncommitted transactions

delete-by-timestamp gives us real commits while staying fast.
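
the delete-by-timestamp idea can be tried out without postgres. the sketch below uses sqlite from the standard library as a stand-in, so it is not the real `clear_database` procedure: the `tracks` table, the numeric timestamps, and `clear_rows_created_after` are assumptions made for the example.

```python
import sqlite3


def clear_rows_created_after(
    conn: sqlite3.Connection, tables: list[str], test_start_time: float
) -> None:
    """delete rows newer than test_start_time from each table, then commit."""
    for table in tables:
        # table names come from a trusted list, only the timestamp is bound
        conn.execute(
            f"DELETE FROM {table} WHERE created_at > ?", (test_start_time,)
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (title TEXT, created_at REAL)")

# seed data exists before the test starts
conn.execute("INSERT INTO tracks VALUES ('seed', 100.0)")
conn.commit()

test_start_time = 200.0

# the "test" writes and really commits a row
conn.execute("INSERT INTO tracks VALUES ('made in test', 250.0)")
conn.commit()

# cleanup removes only what the test created
clear_rows_created_after(conn, ["tracks"], test_start_time)
remaining = [row[0] for row in conn.execute("SELECT title FROM tracks")]
```

with foreign keys in play, a real implementation would also have to delete from child tables before their parents, or rely on cascading deletes.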