# background tasks

plyr.fm uses [pydocket](https://github.com/PrefectHQ/pydocket) for durable background task execution, backed by Redis.

## overview

background tasks handle operations that shouldn't block the request/response cycle:
- **copyright scanning** - analyzes uploaded tracks for potential copyright matches
- **media export** - downloads all tracks, zips them, and uploads to R2
- **ATProto sync** - syncs records to user's PDS on login
- **teal scrobbling** - scrobbles plays to user's PDS
- **album list sync** - updates ATProto list records when album metadata changes
- **PDS like/unlike** - syncs like records to user's PDS asynchronously
- **PDS comment create/update/delete** - syncs comment records to user's PDS asynchronously

## architecture

```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   FastAPI   │────▶│    Redis    │◀────│   Worker    │
│ (add task)  │     │   (queue)   │     │  (process)  │
└─────────────┘     └─────────────┘     └─────────────┘
```

- **docket** schedules tasks to Redis
- **worker** runs in-process alongside FastAPI, processing tasks from the queue
- tasks are durable - if the worker crashes, tasks are retried on restart

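the queue decoupling above can be sketched with a toy in-memory analogue (a deque standing in for Redis, plain names standing in for the real task types - none of this is pydocket's actual implementation):

```python
import asyncio
from collections import deque

# toy analogue of the diagram above: the API side enqueues and returns
# immediately; the worker side drains the queue independently. a task is
# only removed after it completes, which is the durability property the
# real Redis-backed queue provides (crash mid-task = task still queued).
queue: deque[str] = deque()
done: list[str] = []

def add_task(name: str) -> None:
    """FastAPI side: enqueue and return without blocking."""
    queue.append(name)

async def run_worker() -> None:
    """worker side: 'ack' (popleft) only after the work succeeds."""
    while queue:
        name = queue[0]          # peek, don't remove yet
        await asyncio.sleep(0)   # stand-in for the real work
        done.append(name)
        queue.popleft()          # remove only after success

add_task("scan_copyright")
add_task("export_media")
asyncio.run(run_worker())
```

the peek-then-pop order is the point: popping before the work runs would make a crash lose the task, which is exactly the fire-and-forget behavior of the asyncio fallback described below.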
## configuration

### environment variables

```bash
# redis URL (required to enable docket)
DOCKET_URL=redis://localhost:6379

# optional settings (have sensible defaults)
DOCKET_NAME=plyr                 # queue namespace
DOCKET_WORKER_CONCURRENCY=10     # concurrent task limit
```

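at startup these variables might be read along these lines (a sketch - the variable names and defaults come from the block above, but the parsing code itself is illustrative, not the app's actual config module):

```python
import os

# DOCKET_URL absent → docket disabled entirely (tasks fall back to
# asyncio.create_task, see fallback behavior below)
DOCKET_URL = os.environ.get("DOCKET_URL")

# the optional settings fall back to the defaults noted above
DOCKET_NAME = os.environ.get("DOCKET_NAME", "plyr")
DOCKET_WORKER_CONCURRENCY = int(os.environ.get("DOCKET_WORKER_CONCURRENCY", "10"))

docket_enabled = DOCKET_URL is not None
```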
### ⚠️ worker settings - do not modify

the worker is initialized in `backend/_internal/background.py` with pydocket's defaults. **do not change these settings without extensive testing:**

| setting | default | why it matters |
|---------|---------|----------------|
| `heartbeat_interval` | 2s | changing this broke all task execution (2025-12-30 incident) |
| `minimum_check_interval` | 1s | affects how quickly tasks are picked up |
| `scheduling_resolution` | 1s | affects scheduled task precision |

**2025-12-30 incident**: setting `heartbeat_interval=30s` caused all scheduled tasks (likes, comments, exports) to silently fail while perpetual tasks continued running. root cause unclear - the correlation was definitive, but the mechanism wasn't found in the pydocket source. reverted in PR #669.

if you need to tune worker settings:
1. test extensively in staging with real task volume
2. verify ALL task types execute (not just perpetual tasks)
3. check logfire for task execution spans

when `DOCKET_URL` is not set, docket is disabled and tasks fall back to `asyncio.create_task()` (fire-and-forget).

### local development

```bash
# start redis + backend + frontend
just dev

# or manually:
docker compose up -d   # starts redis on localhost:6379
DOCKET_URL=redis://localhost:6379 just backend run
```

### production/staging

Redis instances are self-hosted on Fly.io (`redis:7-alpine`):

| environment | fly app | region |
|-------------|---------|--------|
| production | `plyr-redis` | iad |
| staging | `plyr-redis-stg` | iad |

set `DOCKET_URL` in fly.io secrets:
```bash
flyctl secrets set DOCKET_URL=redis://plyr-redis.internal:6379 -a relay-api
flyctl secrets set DOCKET_URL=redis://plyr-redis-stg.internal:6379 -a relay-api-staging
```

note: this uses Fly internal networking (`.internal` domain), so no TLS is needed within the private network.

## usage

### scheduling a task

```python
from backend._internal.background_tasks import schedule_copyright_scan, schedule_export

# automatically uses docket if enabled, else asyncio.create_task
await schedule_copyright_scan(track_id, audio_url)
await schedule_export(export_id, artist_did)
```

### adding new tasks

1. define the task function in `backend/_internal/background_tasks.py`:
```python
async def my_new_task(arg1: str, arg2: int) -> None:
    """task functions must be async and take only JSON-serializable args."""
    # do work here
    pass
```

2. register it in `backend/_internal/background.py`:
```python
def _register_tasks(docket: Docket) -> None:
    from backend._internal.background_tasks import my_new_task, scan_copyright

    docket.register(scan_copyright)
    docket.register(my_new_task)  # add here
```

3. create a scheduler helper if needed:
```python
async def schedule_my_task(arg1: str, arg2: int) -> None:
    """schedule with docket if enabled, else asyncio."""
    if is_docket_enabled():
        try:
            docket = get_docket()
            await docket.add(my_new_task)(arg1, arg2)
            return
        except Exception:
            pass  # fall through to asyncio

    asyncio.create_task(my_new_task(arg1, arg2))
```

## costs

**self-hosted Redis on Fly.io** (fixed monthly):
- ~$2/month per instance (256MB shared-cpu VM)
- ~$4/month total for prod + staging

this replaced Upstash's pay-per-command pricing, which was costing ~$75/month at scale (37M commands/month).

## fallback behavior

when docket is disabled (`DOCKET_URL` not set):
- scheduler helpers like `schedule_copyright_scan()` use `asyncio.create_task()` instead
- tasks are fire-and-forget (no retries, no durability)
- suitable for local dev without Redis

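the enable check and fallback split can be sketched like this (a standalone sketch: `is_docket_enabled` is reimplemented here from the description above, and the docket branch is stubbed out since it needs a live Redis):

```python
import asyncio
import os
from collections.abc import Awaitable, Callable

def is_docket_enabled() -> bool:
    # mirrors the rule above: docket is on only when DOCKET_URL is set
    # (sketch of the helper, not the app's actual implementation)
    return bool(os.environ.get("DOCKET_URL"))

async def schedule(task: Callable[..., Awaitable[None]], *args: object) -> str:
    """returns which path was taken, purely for illustration."""
    if is_docket_enabled():
        # real code: await get_docket().add(task)(*args) - durable, retried
        return "docket"
    # fallback path: fire-and-forget, lost if the process dies
    asyncio.create_task(task(*args))
    return "asyncio"

async def demo() -> str:
    async def noop() -> None:
        pass
    path = await schedule(noop)
    await asyncio.sleep(0)  # let the fire-and-forget task actually run
    return path
```

note the `await asyncio.sleep(0)` in the demo: a task created with `asyncio.create_task()` only runs when the event loop gets control back, and nothing retries it afterwards - which is exactly why the docket path is preferred outside local dev.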
## monitoring

background task execution is traced in Logfire:
- span: `scheduled copyright scan via docket`
- span: `docket scheduling failed, falling back to asyncio`

query recent background task activity:
```sql
SELECT start_timestamp, message, span_name, duration
FROM records
WHERE span_name LIKE '%copyright%'
  AND start_timestamp > NOW() - INTERVAL '1 hour'
ORDER BY start_timestamp DESC
```