testing#
testing philosophy and infrastructure for plyr.fm.
philosophy#
test behavior, not implementation#
tests should verify what the code does, not how it does it. this makes tests resilient to refactoring and keeps them focused on user-facing behavior.
good: "when a user likes a track, the like count increases"
bad: "when _increment_like_counter is called, it executes UPDATE tracks SET..."
signs you're testing implementation:
- mocking internal functions that aren't boundaries
- asserting on SQL queries or ORM calls
- testing private methods directly
- tests break when you refactor without changing behavior
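a minimal sketch of the "good" example above. `TrackStore` is a hypothetical in-memory stand-in, not plyr.fm code; the real tests would go through the API and postgres. the point is only that the test touches the public surface (`like`, `like_count`) and never the private storage.

```python
from dataclasses import dataclass, field


@dataclass
class TrackStore:
    """hypothetical stand-in for whatever stores likes (assumption, not plyr.fm code)."""

    _likes: dict[int, set[str]] = field(default_factory=dict)

    def like(self, track_id: int, user: str) -> None:
        # record that `user` liked `track_id`
        self._likes.setdefault(track_id, set()).add(user)

    def like_count(self, track_id: int) -> int:
        # number of distinct users who liked `track_id`
        return len(self._likes.get(track_id, set()))


def test_liking_a_track_increases_like_count() -> None:
    store = TrackStore()
    before = store.like_count(1)

    store.like(1, "alice")

    # asserts on observable behavior only; `_likes` could become a SQL table
    # or a counter column without this test changing
    assert store.like_count(1) == before + 1
```

a test that reached into `store._likes` would pass today and break on the first refactor of the storage.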
test at the right level#
- unit tests: pure functions, utilities, data transformations
- integration tests: API endpoints with real database
- skip mocks when possible: prefer real dependencies (postgres, redis) over mocks
keep tests fast#
slow tests don't get run. we use parallel execution (xdist) and template databases to keep the full suite under 30 seconds.
parallel execution with xdist#
we run tests in parallel using pytest-xdist. each worker gets its own isolated database.
how it works#
- template database: first worker creates a template with all migrations applied
- clone per worker: each xdist worker clones from template (CREATE DATABASE ... WITH TEMPLATE)
- instant setup: cloning is a file copy, so no migrations are needed per worker
- advisory locks: coordinate template creation between workers
this is a common pattern for fast parallel test execution in large codebases.
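the steps above can be sketched as an ordered list of SQL statements. this is an illustration under assumptions, not the actual conftest: the database names, the lock key, and the `setup_statements` helper are all hypothetical, and running migrations is left out.

```python
TEMPLATE_DB = "plyr_test_template"  # hypothetical template name
TEMPLATE_LOCK_KEY = 727001  # hypothetical advisory lock key shared by all workers


def worker_db_name(worker_id: str) -> str:
    """hypothetical naming scheme: one database per xdist worker."""
    return f"plyr_test_{worker_id}"


def setup_statements(worker_id: str, template_exists: bool) -> list[str]:
    """return the statements one worker would run, in order."""
    # session-level advisory lock: only one worker builds the template
    statements = [f"SELECT pg_advisory_lock({TEMPLATE_LOCK_KEY})"]
    if not template_exists:
        statements.append(f'CREATE DATABASE "{TEMPLATE_DB}"')
        # migrations would be applied to the template here (not shown)
    statements.append(f"SELECT pg_advisory_unlock({TEMPLATE_LOCK_KEY})")
    # clone the migrated template into this worker's own database
    statements.append(
        f'CREATE DATABASE "{worker_db_name(worker_id)}" WITH TEMPLATE "{TEMPLATE_DB}"'
    )
    return statements


for statement in setup_statements("gw1", template_exists=True):
    print(statement)
```

note that postgres does not allow CREATE DATABASE inside a transaction block, so these statements need an autocommit connection.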
the fixture chain#
test_database_url (session)
└── creates template db (once, with advisory lock)
└── clones worker db from template
└── patches settings.database.url for this worker
_database_setup (session)
└── marker that db is ready
_engine (function)
└── creates engine for test_database_url
└── clears ENGINES cache
_clear_db (function)
└── calls clear_database() procedure after each test
db_session (function)
└── provides AsyncSession for test
common pitfall: missing db_session dependency#
if a test uses the FastAPI app but doesn't depend on db_session, nothing guarantees the database fixtures have run, so the database URL may not be patched for the worker yet. the test can then connect to the wrong database.
wrong:
@pytest.fixture
def test_app() -> FastAPI:
    return app

async def test_something(test_app: FastAPI):
    # may connect to wrong database in xdist!
    ...
right:
@pytest.fixture
def test_app(db_session: AsyncSession) -> FastAPI:
    _ = db_session  # ensures db fixtures run first
    return app

async def test_something(test_app: FastAPI):
    # database URL is correctly patched
    ...
running tests#
# from repo root
just backend test
# run specific test
just backend test tests/api/test_tracks.py
# run with coverage
just backend test --cov
# run single-threaded (debugging)
just backend test -n 0
writing good tests#
do#
- use descriptive test names that describe behavior
- one assertion per concept (multiple asserts ok if testing one behavior)
- use fixtures for setup, not test body
- test edge cases and error conditions
- add regression tests when fixing bugs
don't#
- use @pytest.mark.asyncio - we use asyncio_mode = "auto"
- mock database calls - use real postgres
- test ORM internals or SQL structure
- leave tests that depend on execution order
- skip tests instead of fixing them (unless truly environment-specific)
when private function tests are acceptable#
generally avoid testing private functions (_foo), but there are pragmatic exceptions:
acceptable:
- pure utility functions with complex logic (string parsing, data transformation)
- functions that are difficult to exercise through public API alone
- when the private function is a clear unit with stable interface
not acceptable:
- implementation details that might change (crypto internals, caching strategy)
- internal orchestration functions
- anything that's already exercised by integration tests
the key question: "if i refactor, will this test break even though behavior didn't change?"
database fixtures#
clear_database procedure#
instead of truncating tables between tests (slow), we use a stored procedure that deletes only rows created during the test:
CALL clear_database(:test_start_time)
this deletes rows where created_at > test_start_time, preserving any seed data.
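to make the delete-by-timestamp idea concrete, here is a small sketch using sqlite3 from the standard library as a stand-in for postgres. the `tracks` table, its columns, and the python `clear_database` function are assumptions for illustration; the real procedure lives in the database and its body is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tracks (id INTEGER PRIMARY KEY, title TEXT, created_at REAL)"
)


def clear_database(conn: sqlite3.Connection, test_start_time: float) -> None:
    """stand-in for the stored procedure: delete rows created after the test began."""
    conn.execute("DELETE FROM tracks WHERE created_at > ?", (test_start_time,))
    conn.commit()


# seed data exists before the test starts (fixed timestamps keep this deterministic)
conn.execute("INSERT INTO tracks (title, created_at) VALUES ('seed', 100.0)")
conn.commit()

test_start_time = 200.0

# the "test" inserts a row and really commits it
conn.execute("INSERT INTO tracks (title, created_at) VALUES ('made in test', 250.0)")
conn.commit()

clear_database(conn, test_start_time)
remaining = [row[0] for row in conn.execute("SELECT title FROM tracks")]
print(remaining)
```

only the seed row survives, even though the test's insert was committed.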
why not transactions?#
rolling back transactions is faster, but:
- can't test commit behavior
- can't test constraints properly
- some ORMs behave differently in uncommitted transactions
delete-by-timestamp gives us real commits while staying fast.
redis isolation for parallel tests#
tests that use redis (caching, background tasks) need isolation between xdist workers. without isolation, one worker's cache entries pollute another's tests.
how it works#
each xdist worker uses a different redis database number:
| worker | redis db |
|---|---|
| master/gw0 | 1 |
| gw1 | 2 |
| gw2 | 3 |
| ... | ... |
db 0 is reserved for local development.
the redis_database fixture#
@pytest.fixture(scope="session", autouse=True)
def redis_database(worker_id: str) -> Generator[None, None, None]:
    """use isolated redis databases for parallel test execution."""
    db = _redis_db_for_worker(worker_id)
    new_url = _redis_url_with_db(settings.docket.url, db)

    # patch settings for this worker process
    settings.docket.url = new_url
    os.environ["DOCKET_URL"] = new_url
    clear_client_cache()

    # flush db before tests
    sync_redis = redis.Redis.from_url(new_url)
    sync_redis.flushdb()
    sync_redis.close()

    yield

    # flush after tests
    ...
this fixture is autouse=True so it applies to all tests automatically.
common pitfall: reused URIs in cache tests#
even with per-worker database isolation, tests within the same worker share redis state. if multiple tests use the same cache keys, they can interfere with each other.
wrong:
async def test_caching_first():
    uris = ["at://did:plc:test/fm.plyr.track/1"]  # generic URI
    result = await get_active_copyright_labels(uris)
    # caches the result

async def test_caching_second():
    uris = ["at://did:plc:test/fm.plyr.track/1"]  # same URI!
    result = await get_active_copyright_labels(uris)
    # gets cached value from first test - may fail unexpectedly
right:
async def test_caching_first():
    uris = ["at://did:plc:first/fm.plyr.track/1"]  # unique to this test
    ...

async def test_caching_second():
    uris = ["at://did:plc:second/fm.plyr.track/1"]  # different URI
    ...
use unique identifiers (test name, uuid, etc.) in cache keys to avoid cross-test pollution.
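if hand-picking distinct names gets tedious, a small helper can generate them. `unique_track_uri` below is a hypothetical helper, not something in the codebase; it only shows the uuid approach mentioned above.

```python
import uuid


def unique_track_uri(label: str = "test") -> str:
    """build an at:// track URI that other tests won't reuse (hypothetical helper)."""
    # a random suffix keeps cache keys distinct even when tests share a label
    suffix = uuid.uuid4().hex[:12]
    return f"at://did:plc:{label}-{suffix}/fm.plyr.track/1"


first = unique_track_uri("first")
second = unique_track_uri("first")
print(first != second)
```

a test would then pass `[unique_track_uri()]` wherever it previously hard-coded a URI.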