# audio transcoder service

## overview

the transcoder is a standalone rust-based HTTP service that handles audio format conversion using ffmpeg. it runs as a separate fly.io app to isolate CPU-intensive transcoding operations from the main backend API.

## architecture

### why separate service?

ffmpeg operations are CPU-intensive and can block the event loop in async python applications. separating the transcoder provides:

- **isolation**: transcoding doesn't affect API latency
- **performance**: rust + tokio provides better concurrency for blocking operations
- **scalability**: can scale transcoder independently from main backend
- **resource allocation**: dedicated CPU/memory for transcoding work
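
the event-loop concern can be illustrated with a small, hedged demo. it is not taken from the backend: `blocking_work` and `heartbeat_ticks` are hypothetical names, and `time.sleep` stands in for a synchronous ffmpeg call. it counts how often a 10ms heartbeat task runs while the blocking work executes inline versus off the event loop thread.

```python
import asyncio
import time


def blocking_work(seconds: float) -> None:
    """stand-in for a synchronous ffmpeg call that holds the calling thread."""
    time.sleep(seconds)


async def heartbeat_ticks(offload: bool, work_seconds: float = 0.2) -> int:
    """count how often a 10ms heartbeat runs while the blocking work executes."""
    ticks = 0

    async def heartbeat() -> None:
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    task = asyncio.create_task(heartbeat())
    await asyncio.sleep(0)  # let the heartbeat start before the work begins
    if offload:
        # work runs in a worker thread, so the event loop keeps serving tasks
        await asyncio.to_thread(blocking_work, work_seconds)
    else:
        # work runs on the event loop thread, so nothing else can run
        blocking_work(work_seconds)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return ticks


if __name__ == "__main__":
    print("inline ticks:", asyncio.run(heartbeat_ticks(offload=False)))
    print("offloaded ticks:", asyncio.run(heartbeat_ticks(offload=True)))
```

the inline run starves the heartbeat for the whole duration of the work. the transcoder goes one step further than a worker thread by moving the work into a separate service with its own CPU allocation.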

### technology stack

- **rust**: high-performance systems language
- **axum**: async web framework built on tokio
- **ffmpeg**: industry-standard media processing
- **fly.io**: deployment platform with auto-scaling

## API

### POST /transcode

convert audio file to target format.

**authentication**: shared secret token via `X-Transcoder-Key` header

**request**: multipart/form-data
- `file`: audio file to transcode
- `target` (optional query param): target format (default: "mp3")

**example**:
```bash
# quote the URL so the shell does not treat `?` as a glob character
curl -X POST "https://plyr-transcoder.fly.dev/transcode?target=mp3" \
  -H "X-Transcoder-Key: $TRANSCODER_AUTH_TOKEN" \
  -F "file=@input.wav" \
  --output output.mp3
```

**response**: transcoded audio file (binary)

**headers**:
- `Content-Type`: appropriate media type for target format
- `Content-Disposition`: attachment with original filename + new extension
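
a minimal sketch of how these two headers could be derived from the upload's filename and the target format. the `MEDIA_TYPES` values and the `response_headers` helper are assumptions for illustration; the values the service actually sends may differ.

```python
from pathlib import PurePath

# assumed media types per target format; not confirmed against the service
MEDIA_TYPES = {
    "mp3": "audio/mpeg",
    "m4a": "audio/mp4",
    "wav": "audio/wav",
    "flac": "audio/flac",
    "ogg": "audio/ogg",
}


def response_headers(original_filename: str, target: str) -> dict[str, str]:
    """build Content-Type and Content-Disposition for a transcoded file."""
    # keep the original name, swap the extension; fall back to "audio" if empty
    stem = PurePath(original_filename).stem or "audio"
    return {
        "Content-Type": MEDIA_TYPES[target],
        "Content-Disposition": f'attachment; filename="{stem}.{target}"',
    }


if __name__ == "__main__":
    print(response_headers("input.wav", "mp3"))
```

the sketch does not escape quotes or non-ASCII characters in filenames, which a production implementation would need to handle.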

**supported formats**:
- mp3 (MPEG Layer 3)
- m4a (AAC in MP4 container)
- wav (PCM audio)
- flac (lossless compression)
- ogg (Vorbis codec)

**status codes**:
- 200: transcoding successful, returns audio file
- 400: invalid input (unsupported format, missing file, etc.)
- 401: missing or invalid authentication token
- 413: file too large (>1GB)
- 500: transcoding failed (ffmpeg error, I/O error, etc.)

### GET /health

health check endpoint (no authentication required).

**response**:
```json
{
  "status": "ok"
}
```
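
because machines can be stopped when idle, a caller may want to poll this endpoint before sending a large upload. the following is a hedged sketch using only the standard library; `fetch_health` and `wait_until_healthy` are hypothetical helpers, and the retry count and delay are arbitrary.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_health(base_url: str) -> dict:
    """GET /health and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=10) as resp:
        return json.load(resp)


def wait_until_healthy(
    base_url: str,
    attempts: int = 5,
    delay: float = 1.0,
    fetch: Callable[[str], dict] = fetch_health,
) -> bool:
    """poll /health until it reports {"status": "ok"} or attempts run out."""
    for attempt in range(attempts):
        try:
            if fetch(base_url).get("status") == "ok":
                return True
        except (OSError, ValueError):
            pass  # connection error, timeout, or invalid JSON: try again
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

usage would look like `wait_until_healthy("https://plyr-transcoder.fly.dev")`. the `fetch` parameter exists so the polling logic can be exercised without a network.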

## authentication

### token authentication

the transcoder authenticates requests with a shared secret token sent in the `X-Transcoder-Key` header. this is a custom header, not the standard `Authorization: Bearer` scheme.

**configuration**:
```bash
# set via fly secrets
fly secrets set TRANSCODER_AUTH_TOKEN="your-secret-token-here" -a plyr-transcoder
```

**local development**:
```bash
# .env file
TRANSCODER_AUTH_TOKEN=dev-token-change-me

# or run without auth (dev mode):
# start the transcoder without setting the token
```

**security notes**:
- token should be a random, high-entropy string (use `openssl rand -base64 32`)
- main backend should store token in environment variables
- health endpoint bypasses authentication
- invalid/missing tokens return 401 unauthorized
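
the rules above can be sketched as a small check. this is an illustration in python, not the service's rust implementation: `generate_token` and `is_authorized` are hypothetical names, and the constant-time comparison is an assumption about good practice rather than a confirmed detail of the transcoder.

```python
import hmac
import secrets
from typing import Optional


def generate_token() -> str:
    """random url-safe token built from 32 bytes of entropy."""
    return secrets.token_urlsafe(32)


def is_authorized(
    path: str,
    provided_key: Optional[str],
    expected_token: Optional[str],
) -> bool:
    """decide whether a request may proceed (False maps to a 401)."""
    if path == "/health":
        return True  # health endpoint bypasses authentication
    if not expected_token:
        return True  # no token configured: dev mode, auth disabled
    if provided_key is None:
        return False  # header missing
    # compare in constant time to avoid leaking how many characters matched
    return hmac.compare_digest(provided_key.encode(), expected_token.encode())
```

`generate_token()` is a python alternative to `openssl rand -base64 32`; both yield 32 random bytes encoded as text.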

## transcoding process

### workflow

1. **receive upload**: client sends audio file via multipart form
2. **create temp directory**: isolated workspace for this request
3. **save input file**: write uploaded bytes to temp file
4. **determine format**: sanitize and validate target format
5. **run ffmpeg**: spawn ffmpeg process with appropriate codec settings
6. **stream output**: return transcoded file directly to client
7. **cleanup**: delete temp directory (automatic)
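
the steps above can be sketched in python as follows. this is an illustration of the flow, not the rust implementation: `sanitize_target` and `transcode_bytes` are hypothetical names, the ffmpeg invocation omits codec flags, and the sketch returns the whole output as bytes instead of streaming it.

```python
import subprocess
import tempfile
from pathlib import Path

SUPPORTED = {"mp3", "m4a", "wav", "flac", "ogg"}


def sanitize_target(raw: str) -> str:
    """normalize the requested format and reject anything unsupported."""
    target = raw.strip().lower()
    if target not in SUPPORTED:
        raise ValueError(f"unsupported target format: {raw!r}")
    return target


def transcode_bytes(data: bytes, input_ext: str, raw_target: str,
                    run=subprocess.run) -> bytes:
    """run one transcode inside a per-request temp directory."""
    # step 2: isolated workspace, removed automatically on exit (step 7)
    with tempfile.TemporaryDirectory() as workdir:
        # step 3: save the uploaded bytes
        src = Path(workdir) / f"input.{input_ext}"
        src.write_bytes(data)
        # step 4: sanitize and validate the target format
        target = sanitize_target(raw_target)
        dst = Path(workdir) / f"output.{target}"
        # step 5: spawn ffmpeg; check=True raises if it exits non-zero
        run(["ffmpeg", "-i", str(src), str(dst)], check=True)
        # step 6: hand the result back to the caller
        return dst.read_bytes()
```

the `run` parameter defaults to `subprocess.run` and can be replaced with a stub, which keeps the flow testable on machines without ffmpeg.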

### ffmpeg command

the service constructs ffmpeg commands based on target format:

```bash
# example: convert to MP3
ffmpeg -i input.wav -codec:a libmp3lame -qscale:a 2 -map_metadata 0 output.mp3

# example: convert to M4A (AAC)
ffmpeg -i input.wav -codec:a aac -b:a 192k -map_metadata 0 output.m4a

# example: convert to FLAC (lossless)
ffmpeg -i input.wav -codec:a flac -compression_level 8 -map_metadata 0 output.flac
```

**flags explained**:
- `-i input.wav`: input file
- `-codec:a <codec>`: audio codec to use
- `-qscale:a 2`: variable bitrate quality for MP3 (0-9, lower = better)
- `-b:a 192k`: target audio bitrate (used for AAC)
- `-map_metadata 0`: preserve metadata (artist, title, etc.)
- `-compression_level 8`: FLAC compression (0-12, higher = smaller file)

### codec selection

| format | codec | container | typical use case |
|--------|-------|-----------|------------------|
| mp3 | libmp3lame | MPEG | universal compatibility |
| m4a | aac | MP4 | modern devices, good compression |
| wav | pcm_s16le | WAV | lossless, uncompressed |
| flac | flac | FLAC | lossless, compressed |
| ogg | libvorbis | OGG | open format, good compression |
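
the table and the example commands can be combined into a lookup that builds the argument list. this is a sketch, assuming the quality flags shown in the ffmpeg examples for mp3, m4a, and flac; for wav and ogg only the codec comes from the table, and any extra flags the service passes are unknown. `CODEC_ARGS` and `build_ffmpeg_args` are hypothetical names.

```python
# codec flags per target format; wav and ogg carry only the codec from the table
CODEC_ARGS = {
    "mp3": ["-codec:a", "libmp3lame", "-qscale:a", "2"],
    "m4a": ["-codec:a", "aac", "-b:a", "192k"],
    "wav": ["-codec:a", "pcm_s16le"],
    "flac": ["-codec:a", "flac", "-compression_level", "8"],
    "ogg": ["-codec:a", "libvorbis"],
}


def build_ffmpeg_args(input_path: str, output_path: str, target: str) -> list[str]:
    """assemble the ffmpeg argument list for one target format."""
    try:
        codec_args = CODEC_ARGS[target]
    except KeyError:
        raise ValueError(f"unsupported target format: {target!r}") from None
    # -map_metadata 0 copies tags (artist, title, ...) from the first input
    return ["ffmpeg", "-i", input_path, *codec_args, "-map_metadata", "0", output_path]


if __name__ == "__main__":
    print(" ".join(build_ffmpeg_args("input.wav", "output.mp3", "mp3")))
```

passing the arguments as a list (rather than a shell string) avoids quoting problems with unusual filenames.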

## deployment

### fly.io configuration

**app name**: `plyr-transcoder`
**region**: iad (us-east, washington DC)

**fly.toml**:
```toml
app = "plyr-transcoder"
primary_region = "iad"

[http_service]
  internal_port = 8080
  force_https = true
  auto_stop_machines = "stop"
  auto_start_machines = true
  min_machines_running = 0

[[vm]]
  cpu_kind = "shared"
  cpus = 1
  memory = "1gb"

[env]
  TRANSCODER_HOST = "0.0.0.0"
  TRANSCODER_PORT = "8080"
  TRANSCODER_MAX_UPLOAD_BYTES = "1073741824" # 1GB
```

**key settings**:
- **auto_stop_machines**: stops VM when idle (cost optimization)
- **auto_start_machines**: starts VM on first request (cold start from zero completes within seconds)
- **min_machines_running**: 0 (no always-on instances, purely on-demand)
- **memory**: 1GB (sufficient for transcoding typical audio files)

### deployment commands

```bash
# deploy from transcoder directory
cd transcoder && fly deploy

# check status
fly status -a plyr-transcoder

# view logs (blocking - use ctrl+c to exit)
fly logs -a plyr-transcoder

# scale up (for high traffic)
fly scale count 2 -a plyr-transcoder

# scale down (back to auto-scale)
fly scale count 1 -a plyr-transcoder
```

**note**: deployment is done manually from the transcoder directory, not via main backend CI/CD.

### secrets management

```bash
# set authentication token
fly secrets set TRANSCODER_AUTH_TOKEN="$(openssl rand -base64 32)" -a plyr-transcoder

# list secrets (values hidden)
fly secrets list -a plyr-transcoder

# unset secret
fly secrets unset TRANSCODER_AUTH_TOKEN -a plyr-transcoder
```

## integration with main backend

### backend configuration

**note**: the main backend does not currently use the transcoder service. this is available for future use when transcoding features are needed (e.g., format conversion for browser compatibility).

if needed in the future, add to `src/backend/config.py`:

```python
from pydantic import Field

# AppSettingsSection: settings base class assumed to already exist in config.py
class TranscoderSettings(AppSettingsSection):
    url: str = Field(
        default="https://plyr-transcoder.fly.dev",
        validation_alias="TRANSCODER_URL"
    )
    auth_token: str = Field(
        default="",
        validation_alias="TRANSCODER_AUTH_TOKEN"
    )
```

### calling from backend

```python
from typing import BinaryIO

import httpx

# `settings` is assumed to be the backend's loaded settings object

async def transcode_audio(
    file: BinaryIO,
    target_format: str = "mp3"
) -> bytes:
    """transcode audio file using transcoder service."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            f"{settings.transcoder.url}/transcode",
            params={"target": target_format},
            files={"file": file},
            headers={"X-Transcoder-Key": settings.transcoder.auth_token},
            timeout=300.0  # 5 minutes for large files
        )
        response.raise_for_status()
        return response.content
```

### error handling

```python
try:
    transcoded = await transcode_audio(file, "mp3")
except httpx.HTTPStatusError as e:
    if e.response.status_code == 401:
        logger.error("transcoder authentication failed")
        raise HTTPException(500, "transcoding service unavailable")
    elif e.response.status_code == 413:
        raise HTTPException(413, "file too large for transcoding")
    else:
        logger.error(f"transcoding failed: {e}")
        raise HTTPException(500, "transcoding failed")
except httpx.TimeoutException:
    logger.error("transcoding timed out")
    raise HTTPException(504, "transcoding took too long")
```

## local development

### prerequisites

- rust toolchain (install via `rustup`)
- ffmpeg (install via `brew install ffmpeg` on macOS)

### running locally

```bash
# from transcoder directory
cd transcoder && cargo run

# with custom port
TRANSCODER_PORT=9000 cargo run

# with debug logging
RUST_LOG=debug cargo run
```

**note**: the deployed transcoder listens on port 8080 (`TRANSCODER_PORT` in fly.toml). the local test commands in this document call port 8082, so make sure `TRANSCODER_PORT` matches the port you send requests to.

### testing locally

```bash
# start transcoder
just transcoder run

# test health endpoint
curl http://localhost:8082/health

# test transcoding (no auth required in dev mode)
curl -X POST "http://localhost:8082/transcode?target=mp3" \
  -F "file=@test.wav" \
  --output transcoded.mp3

# test with authentication (stop the instance above first;
# the port is set explicitly to match the URL below)
export TRANSCODER_AUTH_TOKEN="dev-token"
TRANSCODER_PORT=8082 cargo run &

curl -X POST "http://localhost:8082/transcode?target=mp3" \
  -H "X-Transcoder-Key: dev-token" \
  -F "file=@test.wav" \
  --output transcoded.mp3
```

## performance characteristics

### typical transcoding times

transcoding performance depends on:
- input file size and duration
- source codec complexity
- target codec and quality settings
- available CPU

**benchmarks** (shared-cpu-1x on fly.io):
- 3-minute MP3 (5MB) → MP3: ~2-3 seconds
- 3-minute WAV (30MB) → MP3: ~4-5 seconds
- 10-minute FLAC (50MB) → MP3: ~10-15 seconds

### resource usage

**memory**:
- base process: ~20MB
- active transcoding: +100-200MB per request
- 1GB VM supports 4-5 concurrent transcodes
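
the 4-5 figure is consistent with simple division at the upper end of the per-request range. the sketch below works through that arithmetic; `max_concurrent_transcodes` is a hypothetical helper, and the 150MB reserved for the OS is an assumed value chosen only to show how headroom lowers the result.

```python
def max_concurrent_transcodes(
    vm_mb: int,
    base_mb: int,
    per_request_mb: int,
    reserved_mb: int = 0,
) -> int:
    """how many transcodes fit in memory after the base process and any reserve."""
    available = vm_mb - base_mb - reserved_mb
    return max(0, available // per_request_mb)


if __name__ == "__main__":
    # (1024 - 20) // 200 = 5
    print(max_concurrent_transcodes(1024, 20, 200))
    # (1024 - 20 - 150) // 200 = 4, with an assumed 150MB of OS headroom
    print(max_concurrent_transcodes(1024, 20, 200, reserved_mb=150))
```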

**CPU**:
- ffmpeg uses 100% of allocated CPU
- single-core sufficient for typical workload
- multi-core would enable parallel processing

### scaling considerations

**when to scale up**:
- average response time >30 seconds
- frequent 503 errors (all VMs busy)
- queue depth increasing

**scaling options**:
1. **horizontal**: increase machine count (`fly scale count 2`)
2. **vertical**: increase memory/CPU (`fly scale vm shared-cpu-2x`)
3. **regional**: deploy to multiple regions for geo-distribution

## monitoring

### metrics to track

1. **transcoding success rate**
   - total requests
   - successful transcodes
   - failed transcodes (by error type)

2. **performance**
   - average transcoding time
   - p50, p95, p99 latency
   - throughput (transcodes/minute)

3. **resource usage**
   - CPU utilization
   - memory usage
   - disk I/O (temp files)

4. **errors**
   - authentication failures
   - ffmpeg errors
   - timeout errors
   - 413 file too large
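
as a hedged illustration of the performance metrics, the average and latency percentiles can be computed from recorded transcoding durations with the standard library. `latency_summary` is a hypothetical helper; how durations are collected is outside this sketch.

```python
import statistics


def latency_summary(durations_s: list[float]) -> dict[str, float]:
    """average plus p50/p95/p99 from transcoding durations in seconds."""
    # n=100 yields 99 cut points; index k-1 is the k-th percentile
    cuts = statistics.quantiles(durations_s, n=100, method="inclusive")
    return {
        "avg": statistics.fmean(durations_s),
        "p50": cuts[49],
        "p95": cuts[94],
        "p99": cuts[98],
    }
```

`statistics.quantiles` needs at least two samples on older python versions, so a caller should skip the summary until enough requests have been recorded.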

### fly.io metrics

```bash
# view metrics dashboard
fly dashboard -a plyr-transcoder

# check recent requests
fly logs -a plyr-transcoder | grep "POST /transcode"

# list machines and their current state
fly machine list -a plyr-transcoder
```

## troubleshooting

### common issues

**ffmpeg not found**:
```
error: ffmpeg command failed: No such file or directory
```
solution: ensure ffmpeg is installed in docker image (check Dockerfile)

**authentication fails in production**:
```
error: 401 unauthorized
```
solution: verify `TRANSCODER_AUTH_TOKEN` is set on both transcoder and backend

**timeouts on large files**:
```
error: request timeout after 120s
```
solution: increase timeout in backend client (`timeout=300.0`)

**413 entity too large**:
```
error: 413 payload too large
```
solution: increase `TRANSCODER_MAX_UPLOAD_BYTES` or reject large files earlier
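
rejecting large files earlier could look like the following sketch, run on the backend before the upload is sent. `check_upload_size` and `FileTooLargeError` are hypothetical names; the limit mirrors the `TRANSCODER_MAX_UPLOAD_BYTES` value from fly.toml.

```python
import os

# mirrors TRANSCODER_MAX_UPLOAD_BYTES in fly.toml (1GB)
MAX_UPLOAD_BYTES = 1_073_741_824


class FileTooLargeError(ValueError):
    """raised when a file exceeds the transcoder's upload limit."""


def check_upload_size(path: str, max_bytes: int = MAX_UPLOAD_BYTES) -> int:
    """return the file size in bytes, or raise before any upload starts."""
    size = os.path.getsize(path)
    if size > max_bytes:
        raise FileTooLargeError(
            f"{path} is {size} bytes, limit is {max_bytes} bytes"
        )
    return size
```

checking locally saves the time and bandwidth of uploading a file that the transcoder would reject with a 413 anyway.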

**VM not starting automatically**:
```
error: no instances available
```
solution: check `auto_start_machines = true` in fly.toml

## future enhancements

### potential improvements

1. **progress tracking**
   - stream ffmpeg progress updates
   - return progress via server-sent events
   - enable client-side progress bar

2. **format detection**
   - auto-detect input format via ffprobe
   - validate format before transcoding
   - reject unsupported formats early

3. **quality presets**
   - high quality (320kbps MP3, 256kbps AAC)
   - standard quality (192kbps)
   - low quality (128kbps for previews)

4. **metadata preservation**
   - extract metadata from input
   - apply metadata to output
   - handle artwork/cover images

5. **batch processing**
   - accept multiple files
   - process in parallel
   - return as zip archive

6. **caching**
   - cache transcoded files by content hash
   - serve cached versions instantly
   - implement LRU eviction
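
to make the caching idea concrete, here is one possible shape for it: an in-memory cache keyed by content hash and target format, with LRU eviction. nothing like this exists in the service today; `cache_key` and `TranscodeCache` are hypothetical names, and a real version would likely store files on disk or in object storage rather than in memory.

```python
import hashlib
from collections import OrderedDict
from typing import Optional


def cache_key(data: bytes, target: str) -> str:
    """same input bytes and same target format map to the same key."""
    return f"{hashlib.sha256(data).hexdigest()}:{target}"


class TranscodeCache:
    """small LRU cache mapping cache keys to transcoded bytes."""

    def __init__(self, max_entries: int = 128) -> None:
        self._entries = OrderedDict()  # key -> transcoded bytes, oldest first
        self._max_entries = max_entries

    def get(self, key: str) -> Optional[bytes]:
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key: str, value: bytes) -> None:
        self._entries[key] = value
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used
```

including the target format in the key keeps an mp3 and a flac rendition of the same upload from colliding.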

## references

- source code: `transcoder/src/main.rs`
- justfile: `transcoder/Justfile`
- fly config: `transcoder/fly.toml`
- dockerfile: `transcoder/Dockerfile`
- ffmpeg docs: https://ffmpeg.org/documentation.html
- fly.io docs: https://fly.io/docs/
476- ffmpeg docs: https://ffmpeg.org/documentation.html
477- fly.io docs: https://fly.io/docs/