# audio transcoder service

## overview

the transcoder is a standalone rust-based HTTP service that handles audio format conversion using ffmpeg. it runs as a separate fly.io app to isolate CPU-intensive transcoding operations from the main backend API.

## architecture

### why separate service?

ffmpeg operations are CPU-intensive and can block the event loop in async python applications. separating the transcoder provides:

- **isolation**: transcoding doesn't affect API latency
- **performance**: rust + tokio provides better concurrency for blocking operations
- **scalability**: can scale transcoder independently from main backend
- **resource allocation**: dedicated CPU/memory for transcoding work

### technology stack

- **rust**: high-performance systems language
- **axum**: async web framework built on tokio
- **ffmpeg**: industry-standard media processing
- **fly.io**: deployment platform with auto-scaling

## API

### POST /transcode

convert audio file to target format.

**authentication**: bearer token via `X-Transcoder-Key` header

**request**: multipart/form-data
- `file`: audio file to transcode
- `target` (optional query param): target format (default: "mp3")

**example**:
```bash
# quote the URL so the shell does not treat "?" as a glob character
curl -X POST "https://plyr-transcoder.fly.dev/transcode?target=mp3" \
  -H "X-Transcoder-Key: $TRANSCODER_AUTH_TOKEN" \
  -F "file=@input.wav" \
  --output output.mp3
```

**response**: transcoded audio file (binary)

**headers**:
- `Content-Type`: appropriate media type for target format
- `Content-Disposition`: attachment with original filename + new extension

**supported formats**:
- mp3 (MPEG Layer 3)
- m4a (AAC in MP4 container)
- wav (PCM audio)
- flac (lossless compression)
- ogg (Vorbis codec)

**status codes**:
- 200: transcoding successful, returns audio file
- 400: invalid input (unsupported format, missing file, etc.)
- 401: missing or invalid authentication token
- 413: file too large (>1GB)
- 500: transcoding failed (ffmpeg error, I/O error, etc.)

### GET /health

health check endpoint (no authentication required).

**response**:
```json
{
  "status": "ok"
}
```

## authentication

### bearer token authentication

the transcoder uses a simple bearer token authentication scheme via the `X-Transcoder-Key` header.

**configuration**:
```bash
# set via fly secrets
fly secrets set TRANSCODER_AUTH_TOKEN="your-secret-token-here" -a plyr-transcoder
```

**local development**:
```bash
# .env file
TRANSCODER_AUTH_TOKEN=dev-token-change-me

# or run without auth (dev mode)
# just run transcoder without setting token
```

**security notes**:
- token should be a random, high-entropy string (use `openssl rand -base64 32`)
- main backend should store token in environment variables
- health endpoint bypasses authentication
- invalid/missing tokens return 401 unauthorized

## transcoding process

### workflow

1. **receive upload**: client sends audio file via multipart form
2. **create temp directory**: isolated workspace for this request
3. **save input file**: write uploaded bytes to temp file
4. **determine format**: sanitize and validate target format
5. **run ffmpeg**: spawn ffmpeg process with appropriate codec settings
6. **stream output**: return transcoded file directly to client
7. **cleanup**: delete temp directory (automatic)

### ffmpeg command

the service constructs ffmpeg commands based on target format:

```bash
# example: convert to MP3
ffmpeg -i input.wav -codec:a libmp3lame -qscale:a 2 -map_metadata 0 output.mp3

# example: convert to M4A (AAC)
ffmpeg -i input.wav -codec:a aac -b:a 192k -map_metadata 0 output.m4a

# example: convert to FLAC (lossless)
ffmpeg -i input.wav -codec:a flac -compression_level 8 -map_metadata 0 output.flac
```

**flags explained**:
- `-i input.wav`: input file
- `-codec:a <codec>`: audio codec to use
- `-qscale:a 2`: variable bitrate quality (0-9, lower = better)
- `-b:a 192k`: constant bitrate (for AAC)
- `-map_metadata 0`: preserve metadata (artist, title, etc.)
- `-compression_level 8`: FLAC compression (0-12, higher = smaller file)

### codec selection

| format | codec | container | typical use case |
|--------|-------|-----------|------------------|
| mp3 | libmp3lame | MPEG | universal compatibility |
| m4a | aac | MP4 | modern devices, good compression |
| wav | pcm_s16le | WAV | lossless, uncompressed |
| flac | flac | FLAC | lossless, compressed |
| ogg | libvorbis | OGG | open format, good compression |

## deployment

### fly.io configuration

**app name**: `plyr-transcoder`
**region**: iad (us-east, washington DC)

**fly.toml**:
```toml
app = "plyr-transcoder"
primary_region = "iad"

[http_service]
  internal_port = 8080
  force_https = true
  auto_stop_machines = "stop"
  auto_start_machines = true
  min_machines_running = 0

[[vm]]
  cpu_kind = "shared"
  cpus = 1
  memory = "1gb"

[env]
  TRANSCODER_HOST = "0.0.0.0"
  TRANSCODER_PORT = "8080"
  TRANSCODER_MAX_UPLOAD_BYTES = "1073741824" # 1GB
```

**key settings**:
- **auto_stop_machines**: stops VM when idle (cost optimization)
- **auto_start_machines**: starts VM on first request (cold start completes within seconds)
- **min_machines_running**: 0 (no always-on instances, purely on-demand)
- **memory**: 1GB (sufficient for transcoding typical audio files)

### deployment commands

```bash
# deploy from transcoder directory
cd transcoder && fly deploy

# check status
fly status -a plyr-transcoder

# view logs (blocking - use ctrl+c to exit)
fly logs -a plyr-transcoder

# scale up (for high traffic)
fly scale count 2 -a plyr-transcoder

# scale down (back to auto-scale)
fly scale count 1 -a plyr-transcoder
```

**note**: deployment is done manually from the transcoder directory, not via main backend CI/CD.

### secrets management

```bash
# set authentication token
fly secrets set TRANSCODER_AUTH_TOKEN="$(openssl rand -base64 32)" -a plyr-transcoder

# list secrets (values hidden)
fly secrets list -a plyr-transcoder

# unset secret
fly secrets unset TRANSCODER_AUTH_TOKEN -a plyr-transcoder
```

## integration with main backend

### backend configuration

**note**: the main backend does not currently use the transcoder service. this is available for future use when transcoding features are needed (e.g., format conversion for browser compatibility).
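
because machines stop when idle (`min_machines_running = 0`), the first request after a quiet period has to wait for a machine to start. before wiring the transcoder into the backend, it can help to confirm that `GET /health` answers with `{"status": "ok"}`. the sketch below is one possible way to do that with only the python standard library. `is_healthy`, `fetch_health`, `wait_until_healthy`, and the retry parameters are hypothetical names and values chosen for illustration; they are not part of the transcoder or backend code.

```python
import json
import time
import urllib.request
from typing import Callable

# base URL taken from the curl example in this document
TRANSCODER_URL = "https://plyr-transcoder.fly.dev"


def is_healthy(body: bytes) -> bool:
    """return True if the body matches the documented /health response."""
    try:
        payload = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def fetch_health(base_url: str = TRANSCODER_URL, timeout: float = 10.0) -> bytes:
    """GET /health and return the raw body (the endpoint needs no auth header)."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return resp.read()


def wait_until_healthy(
    fetch: Callable[[], bytes] = fetch_health,
    attempts: int = 5,          # assumption: arbitrary retry budget
    delay_seconds: float = 2.0,  # assumption: arbitrary pause between tries
) -> bool:
    """poll the health endpoint, retrying while a stopped machine boots."""
    for attempt in range(attempts):
        try:
            if is_healthy(fetch()):
                return True
        except OSError:
            # URLError, HTTPError and socket timeouts are all OSError subclasses;
            # the machine may still be starting, so try again
            pass
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False
```

`fetch` is injectable so the retry logic can be exercised without network access; calling `wait_until_healthy()` with the defaults probes the deployed app directly.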

if needed in the future, add to `src/backend/config.py`:

```python
from pydantic import Field

# AppSettingsSection is assumed to be the backend's existing settings base class
class TranscoderSettings(AppSettingsSection):
    url: str = Field(
        default="https://plyr-transcoder.fly.dev",
        validation_alias="TRANSCODER_URL",
    )
    auth_token: str = Field(
        default="",
        validation_alias="TRANSCODER_AUTH_TOKEN",
    )
```

### calling from backend

```python
from typing import BinaryIO

import httpx

# `settings` is assumed to be the backend settings object that exposes
# TranscoderSettings as `settings.transcoder`
async def transcode_audio(
    file: BinaryIO,
    target_format: str = "mp3",
) -> bytes:
    """transcode audio file using transcoder service."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            f"{settings.transcoder.url}/transcode",
            params={"target": target_format},
            files={"file": file},
            headers={"X-Transcoder-Key": settings.transcoder.auth_token},
            timeout=300.0,  # 5 minutes for large files
        )
        response.raise_for_status()
        return response.content
```

### error handling

```python
# assumes a module-level `logger` and a FastAPI-style `HTTPException`
# are already imported in the calling module
try:
    transcoded = await transcode_audio(file, "mp3")
except httpx.HTTPStatusError as e:
    if e.response.status_code == 401:
        # a bad token is a server-side misconfiguration, so hide it from the client
        logger.error("transcoder authentication failed")
        raise HTTPException(500, "transcoding service unavailable")
    elif e.response.status_code == 413:
        raise HTTPException(413, "file too large for transcoding")
    else:
        logger.error(f"transcoding failed: {e}")
        raise HTTPException(500, "transcoding failed")
except httpx.TimeoutException:
    logger.error("transcoding timed out")
    raise HTTPException(504, "transcoding took too long")
```

## local development

### prerequisites

- rust toolchain (install via `rustup`)
- ffmpeg (install via `brew install ffmpeg` on macOS)

### running locally

```bash
# from transcoder directory
cd transcoder && cargo run

# with custom port
TRANSCODER_PORT=9000 cargo run

# with debug logging
RUST_LOG=debug cargo run
```

**note**: the
transcoder runs on port 8080 by default (configured in fly.toml).

### testing locally

```bash
# the commands below assume the local instance listens on port 8082;
# adjust the port if yours listens elsewhere (e.g. 8080)

# start transcoder
just transcoder run

# test health endpoint
curl http://localhost:8082/health

# test transcoding (no auth required in dev mode)
curl -X POST "http://localhost:8082/transcode?target=mp3" \
  -F "file=@test.wav" \
  --output transcoded.mp3

# test with authentication
export TRANSCODER_AUTH_TOKEN="dev-token"
cargo run &

curl -X POST "http://localhost:8082/transcode?target=mp3" \
  -H "X-Transcoder-Key: dev-token" \
  -F "file=@test.wav" \
  --output transcoded.mp3
```

## performance characteristics

### typical transcoding times

transcoding performance depends on:
- input file size and duration
- source codec complexity
- target codec and quality settings
- available CPU

**benchmarks** (shared-cpu-1x on fly.io):
- 3-minute MP3 (5MB) → MP3: ~2-3 seconds
- 3-minute WAV (30MB) → MP3: ~4-5 seconds
- 10-minute FLAC (50MB) → MP3: ~10-15 seconds

### resource usage

**memory**:
- base process: ~20MB
- active transcoding: +100-200MB per request
- 1GB VM supports 4-5 concurrent transcodes

**CPU**:
- ffmpeg uses 100% of allocated CPU
- single-core sufficient for typical workload
- multi-core would enable parallel processing

### scaling considerations

**when to scale up**:
- average response time >30 seconds
- frequent 503 errors (all VMs busy)
- queue depth increasing

**scaling options**:
1. **horizontal**: increase machine count (`fly scale count 2`)
2. **vertical**: increase memory/CPU (`fly scale vm shared-cpu-2x`)
3. **regional**: deploy to multiple regions for geo-distribution

## monitoring

### metrics to track

1. **transcoding success rate**
   - total requests
   - successful transcodes
   - failed transcodes (by error type)

2. **performance**
   - average transcoding time
   - p50, p95, p99 latency
   - throughput (transcodes/minute)

3. **resource usage**
   - CPU utilization
   - memory usage
   - disk I/O (temp files)

4. **errors**
   - authentication failures
   - ffmpeg errors
   - timeout errors
   - 413 file too large

### fly.io metrics

```bash
# view metrics dashboard
fly dashboard -a plyr-transcoder

# check recent requests
fly logs -a plyr-transcoder | grep "POST /transcode"

# monitor resource usage
fly vm status -a plyr-transcoder
```

## troubleshooting

### common issues

**ffmpeg not found**:
```
error: ffmpeg command failed: No such file or directory
```
solution: ensure ffmpeg is installed in docker image (check Dockerfile)

**authentication fails in production**:
```
error: 401 unauthorized
```
solution: verify `TRANSCODER_AUTH_TOKEN` is set to the same value on both transcoder and backend

**timeouts on large files**:
```
error: request timeout after 120s
```
solution: increase timeout in backend client (`timeout=300.0`)

**413 entity too large**:
```
error: 413 payload too large
```
solution: increase `TRANSCODER_MAX_UPLOAD_BYTES` or reject large files earlier

**VM not starting automatically**:
```
error: no instances available
```
solution: check `auto_start_machines = true` in fly.toml

## future enhancements

### potential improvements

1. **progress tracking**
   - stream ffmpeg progress updates
   - return progress via server-sent events
   - enable client-side progress bar

2. **format detection**
   - auto-detect input format via ffprobe
   - validate format before transcoding
   - reject unsupported formats early

3. **quality presets**
   - high quality (320kbps MP3, 256kbps AAC)
   - standard quality (192kbps)
   - low quality (128kbps for previews)

4. **metadata preservation**
   - extract metadata from input
   - apply metadata to output
   - handle artwork/cover images

5. **batch processing**
   - accept multiple files
   - process in parallel
   - return as zip archive

6. **caching**
   - cache transcoded files by content hash
   - serve cached versions instantly
   - implement LRU eviction

## references

- source code: `transcoder/src/main.rs`
- justfile: `transcoder/Justfile`
- fly config: `transcoder/fly.toml`
- dockerfile: `transcoder/Dockerfile`
- ffmpeg docs: https://ffmpeg.org/documentation.html
- fly.io docs: https://fly.io/docs/