# background tasks

plyr.fm uses [pydocket](https://github.com/PrefectHQ/pydocket) for durable background task execution, backed by Redis.

## overview

background tasks handle operations that shouldn't block the request/response cycle:

- **copyright scanning** - analyzes uploaded tracks for potential copyright matches
- **media export** - downloads all tracks, zips them, and uploads to R2
- **ATProto sync** - syncs records to the user's PDS on login
- **teal scrobbling** - scrobbles plays to the user's PDS
- **album list sync** - updates ATProto list records when album metadata changes
- **PDS like/unlike** - syncs like records to the user's PDS asynchronously
- **PDS comment create/update/delete** - syncs comment records to the user's PDS asynchronously

## architecture

```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   FastAPI   │────▶│    Redis    │◀────│   Worker    │
│  (add task) │     │   (queue)   │     │  (process)  │
└─────────────┘     └─────────────┘     └─────────────┘
```

- **docket** schedules tasks to Redis
- **worker** runs in-process alongside FastAPI, processing tasks from the queue
- tasks are durable - if the worker crashes, tasks are retried on restart

## configuration

### environment variables

```bash
# redis URL (required to enable docket)
DOCKET_URL=redis://localhost:6379

# optional settings (have sensible defaults)
DOCKET_NAME=plyr                  # queue namespace
DOCKET_WORKER_CONCURRENCY=10      # concurrent task limit
```

### ⚠️ worker settings - do not modify

the worker is initialized in `backend/_internal/background.py` with pydocket's defaults.
**do not change these settings without extensive testing:**

| setting | default | why it matters |
|---------|---------|----------------|
| `heartbeat_interval` | 2s | changing this broke all task execution (2025-12-30 incident) |
| `minimum_check_interval` | 1s | affects how quickly tasks are picked up |
| `scheduling_resolution` | 1s | affects scheduled task precision |

**2025-12-30 incident**: setting `heartbeat_interval=30s` caused all scheduled tasks (likes, comments, exports) to silently fail while perpetual tasks continued running. the root cause is unclear - the correlation was definitive, but the mechanism wasn't found in the pydocket source. reverted in PR #669.

if you need to tune worker settings:

1. test extensively in staging with real task volume
2. verify ALL task types execute (not just perpetual tasks)
3. check logfire for task execution spans

when `DOCKET_URL` is not set, docket is disabled and tasks fall back to `asyncio.create_task()` (fire-and-forget).

### local development

```bash
# start redis + backend + frontend
just dev

# or manually:
docker compose up -d                  # starts redis on localhost:6379
DOCKET_URL=redis://localhost:6379 just backend run
```

### production/staging

Redis instances are self-hosted on Fly.io (`redis:7-alpine`):

| environment | fly app | region |
|-------------|---------|--------|
| production | `plyr-redis` | iad |
| staging | `plyr-redis-stg` | iad |

set `DOCKET_URL` in fly.io secrets:

```bash
flyctl secrets set DOCKET_URL=redis://plyr-redis.internal:6379 -a relay-api
flyctl secrets set DOCKET_URL=redis://plyr-redis-stg.internal:6379 -a relay-api-staging
```

note: this uses Fly internal networking (the `.internal` domain); no TLS is needed within the private network.
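the `DOCKET_URL` toggle described above can be sketched as a plain env-var check. this is a hypothetical implementation of the `is_docket_enabled()` helper used by the scheduler helpers, assuming it does nothing more than look for `DOCKET_URL`:

```python
import os

def is_docket_enabled() -> bool:
    """docket is enabled iff DOCKET_URL is set (assumed behavior)."""
    return bool(os.environ.get("DOCKET_URL"))

# without the variable, docket is disabled and tasks fall back to asyncio
os.environ.pop("DOCKET_URL", None)
enabled_without = is_docket_enabled()

# with the variable set, docket handles scheduling
os.environ["DOCKET_URL"] = "redis://localhost:6379"
enabled_with = is_docket_enabled()
```

the actual helper lives in the backend internals; the point is that enabling docket in any environment is purely a matter of setting the secret.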
## usage

### scheduling a task

```python
from backend._internal.background_tasks import schedule_copyright_scan, schedule_export

# automatically uses docket if enabled, else asyncio.create_task
await schedule_copyright_scan(track_id, audio_url)
await schedule_export(export_id, artist_did)
```

### adding new tasks

1. define the task function in `backend/_internal/background_tasks.py`:

```python
async def my_new_task(arg1: str, arg2: int) -> None:
    """task functions must be async and take only JSON-serializable args."""
    # do work here
    pass
```

2. register it in `backend/_internal/background.py`:

```python
def _register_tasks(docket: Docket) -> None:
    from backend._internal.background_tasks import my_new_task, scan_copyright

    docket.register(scan_copyright)
    docket.register(my_new_task)  # add here
```

3. create a scheduler helper if needed:

```python
async def schedule_my_task(arg1: str, arg2: int) -> None:
    """schedule with docket if enabled, else asyncio."""
    if is_docket_enabled():
        try:
            docket = get_docket()
            await docket.add(my_new_task)(arg1, arg2)
            return
        except Exception:
            pass  # fall through to asyncio

    asyncio.create_task(my_new_task(arg1, arg2))
```

## costs

**self-hosted Redis on Fly.io** (fixed monthly):

- ~$2/month per instance (256MB shared-cpu VM)
- ~$4/month total for prod + staging

this replaced Upstash's pay-per-command pricing, which was costing ~$75/month at scale (37M commands/month).
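the cost comparison above can be sanity-checked with quick arithmetic. the per-command rate here is implied from the quoted numbers (~$75 for 37M commands), not taken from Upstash's price sheet:

```python
# back-of-envelope check of the fixed-price vs pay-per-command comparison
commands_per_month = 37_000_000
upstash_monthly = 75  # quoted bill at that volume

# implied rate: roughly $2 per million commands
implied_rate_per_million = upstash_monthly / (commands_per_month / 1_000_000)

fly_monthly = 2 * 2  # two 256MB instances (prod + staging) at ~$2 each
savings = upstash_monthly - fly_monthly  # ~$71/month
```

the takeaway: at this command volume, per-command pricing costs roughly 19x the flat-rate VMs, and the gap widens as traffic grows.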
## fallback behavior

when docket is disabled (`DOCKET_URL` not set):

- `schedule_copyright_scan()` uses `asyncio.create_task()` instead
- tasks are fire-and-forget (no retries, no durability)
- suitable for local dev without Redis

## monitoring

background task execution is traced in Logfire:

- span: `scheduled copyright scan via docket`
- span: `docket scheduling failed, falling back to asyncio`

query recent background task activity:

```sql
SELECT start_timestamp, message, span_name, duration
FROM records
WHERE span_name LIKE '%copyright%'
  AND start_timestamp > NOW() - INTERVAL '1 hour'
ORDER BY start_timestamp DESC
```