audio transcoder service#

overview#

the transcoder is a standalone rust-based HTTP service that handles audio format conversion using ffmpeg. it runs as a separate fly.io app to isolate CPU-intensive transcoding operations from the main backend API.

architecture#

why separate service?#

ffmpeg operations are CPU-intensive and can block the event loop in async python applications. separating the transcoder provides:

isolation: transcoding doesn't affect API latency
performance: rust + tokio provides better concurrency for blocking operations
scalability: can scale transcoder independently from main backend
resource allocation: dedicated CPU/memory for transcoding work

technology stack#

rust: high-performance systems language
axum: async web framework built on tokio
ffmpeg: industry-standard media processing
fly.io: deployment platform with auto-scaling

API#

POST /transcode#

convert audio file to target format.

authentication: bearer token via X-Transcoder-Key header

request: multipart/form-data

file: audio file to transcode
target (optional query param): target format (default: "mp3")

example:

curl -X POST https://plyr-transcoder.fly.dev/transcode?target=mp3 \
  -H "X-Transcoder-Key: $TRANSCODER_AUTH_TOKEN" \
  -F "file=@input.wav" \
  --output output.mp3

response: transcoded audio file (binary)

headers:

Content-Type: appropriate media type for target format
Content-Disposition: attachment with original filename + new extension

supported formats:

mp3 (MPEG Layer 3)
m4a (AAC in MP4 container)
wav (PCM audio)
flac (lossless compression)
ogg (Vorbis codec)

status codes:

200: transcoding successful, returns audio file
400: invalid input (unsupported format, missing file, etc.)
401: missing or invalid authentication token
413: file too large (>1GB)
500: transcoding failed (ffmpeg error, I/O error, etc.)

GET /health#

health check endpoint (no authentication required).

response:

{
  "status": "ok"
}

authentication#

bearer token authentication#

the transcoder uses a simple bearer token authentication scheme via the X-Transcoder-Key header.

configuration:

# set via fly secrets
fly secrets set TRANSCODER_AUTH_TOKEN="your-secret-token-here" -a plyr-transcoder

local development:

# .env file
TRANSCODER_AUTH_TOKEN=dev-token-change-me

# or run without auth (dev mode)
# just run transcoder without setting token

security notes:

token should be a random, high-entropy string (use openssl rand -base64 32)
main backend should store token in environment variables
health endpoint bypasses authentication
invalid/missing tokens return 401 unauthorized

transcoding process#

workflow#

receive upload: client sends audio file via multipart form
create temp directory: isolated workspace for this request
save input file: write uploaded bytes to temp file
determine format: sanitize and validate target format
run ffmpeg: spawn ffmpeg process with appropriate codec settings
stream output: return transcoded file directly to client
cleanup: delete temp directory (automatic)

ffmpeg command#

the service constructs ffmpeg commands based on target format:

# example: convert to MP3
ffmpeg -i input.wav -codec:a libmp3lame -qscale:a 2 -map_metadata 0 output.mp3

# example: convert to M4A (AAC)
ffmpeg -i input.wav -codec:a aac -b:a 192k -map_metadata 0 output.m4a

# example: convert to FLAC (lossless)
ffmpeg -i input.flac -codec:a flac -compression_level 8 -map_metadata 0 output.flac

flags explained:

-i input.wav: input file
-codec:a <codec>: audio codec to use
-qscale:a 2: variable bitrate quality (0-9, lower = better)
-b:a 192k: constant bitrate (for AAC)
-map_metadata 0: preserve metadata (artist, title, etc.)
-compression_level 8: FLAC compression (0-12, higher = smaller file)

codec selection#

format	codec	container	typical use case
mp3	libmp3lame	MPEG	universal compatibility
m4a	aac	MP4	modern devices, good compression
wav	pcm_s16le	WAV	lossless, uncompressed
flac	flac	FLAC	lossless, compressed
ogg	libvorbis	OGG	open format, good compression

deployment#

fly.io configuration#

app name: plyr-transcoder region: iad (us-east, washington DC)

fly.toml:

app = "plyr-transcoder"
primary_region = "iad"

[http_service]
  internal_port = 8080
  force_https = true
  auto_stop_machines = "stop"
  auto_start_machines = true
  min_machines_running = 0

[[vm]]
  cpu_kind = "shared"
  cpus = 1
  memory = "1gb"

[env]
  TRANSCODER_HOST = "0.0.0.0"
  TRANSCODER_PORT = "8080"
  TRANSCODER_MAX_UPLOAD_BYTES = "1073741824"  # 1GB

key settings:

auto_stop_machines: stops VM when idle (cost optimization)
auto_start_machines: starts VM on first request (zero cold-start within seconds)
min_machines_running: 0 (no always-on instances, purely on-demand)
memory: 1GB (sufficient for transcoding typical audio files)

deployment commands#

# deploy from transcoder directory
cd transcoder && fly deploy

# check status
fly status -a plyr-transcoder

# view logs (blocking - use ctrl+c to exit)
fly logs -a plyr-transcoder

# scale up (for high traffic)
fly scale count 2 -a plyr-transcoder

# scale down (back to auto-scale)
fly scale count 1 -a plyr-transcoder

note: deployment is done manually from the transcoder directory, not via main backend CI/CD.

secrets management#

# set authentication token
fly secrets set TRANSCODER_AUTH_TOKEN="$(openssl rand -base64 32)" -a plyr-transcoder

# list secrets (values hidden)
fly secrets list -a plyr-transcoder

# unset secret
fly secrets unset TRANSCODER_AUTH_TOKEN -a plyr-transcoder

integration with main backend#

backend configuration#

note: the main backend does not currently use the transcoder service. this is available for future use when transcoding features are needed (e.g., format conversion for browser compatibility).

if needed in the future, add to src/backend/config.py:

class TranscoderSettings(AppSettingsSection):
    url: str = Field(
        default="https://plyr-transcoder.fly.dev",
        validation_alias="TRANSCODER_URL"
    )
    auth_token: str = Field(
        default="",
        validation_alias="TRANSCODER_AUTH_TOKEN"
    )

calling from backend#

import httpx

async def transcode_audio(
    file: BinaryIO,
    target_format: str = "mp3"
) -> bytes:
    """transcode audio file using transcoder service."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            f"{settings.transcoder.url}/transcode",
            params={"target": target_format},
            files={"file": file},
            headers={"X-Transcoder-Key": settings.transcoder.auth_token},
            timeout=300.0  # 5 minutes for large files
        )
        response.raise_for_status()
        return response.content

error handling#

try:
    transcoded = await transcode_audio(file, "mp3")
except httpx.HTTPStatusError as e:
    if e.response.status_code == 401:
        logger.error("transcoder authentication failed")
        raise HTTPException(500, "transcoding service unavailable")
    elif e.response.status_code == 413:
        raise HTTPException(413, "file too large for transcoding")
    else:
        logger.error(f"transcoding failed: {e}")
        raise HTTPException(500, "transcoding failed")
except httpx.TimeoutException:
    logger.error("transcoding timed out")
    raise HTTPException(504, "transcoding took too long")

local development#

prerequisites#

rust toolchain (install via rustup)
ffmpeg (install via brew install ffmpeg on macOS)

running locally#

# from transcoder directory
cd transcoder && cargo run

# with custom port
TRANSCODER_PORT=9000 cargo run

# with debug logging
RUST_LOG=debug cargo run

note: the transcoder runs on port 8080 by default (configured in fly.toml).

testing locally#

# start transcoder
just transcoder run

# test health endpoint
curl http://localhost:8082/health

# test transcoding (no auth required in dev mode)
curl -X POST http://localhost:8082/transcode?target=mp3 \
  -F "file=@test.wav" \
  --output transcoded.mp3

# test with authentication
export TRANSCODER_AUTH_TOKEN="dev-token"
cargo run &

curl -X POST http://localhost:8082/transcode?target=mp3 \
  -H "X-Transcoder-Key: dev-token" \
  -F "file=@test.wav" \
  --output transcoded.mp3

performance characteristics#

typical transcoding times#

transcoding performance depends on:

input file size and duration
source codec complexity
target codec and quality settings
available CPU

benchmarks (shared-cpu-1x on fly.io):

3-minute MP3 (5MB) → MP3: ~2-3 seconds
3-minute WAV (30MB) → MP3: ~4-5 seconds
10-minute FLAC (50MB) → MP3: ~10-15 seconds

resource usage#

memory:

base process: ~20MB
active transcoding: +100-200MB per request
1GB VM supports 4-5 concurrent transcodes

CPU:

ffmpeg uses 100% of allocated CPU
single-core sufficient for typical workload
multi-core would enable parallel processing

scaling considerations#

when to scale up:

average response time >30 seconds
frequent 503 errors (all VMs busy)
queue depth increasing

scaling options:

horizontal: increase machine count (fly scale count 2)
vertical: increase memory/CPU (fly scale vm shared-cpu-2x)
regional: deploy to multiple regions for geo-distribution

monitoring#

metrics to track#

transcoding success rate
- total requests
- successful transcodes
- failed transcodes (by error type)
performance
- average transcoding time
- p50, p95, p99 latency
- throughput (transcodes/minute)
resource usage
- CPU utilization
- memory usage
- disk I/O (temp files)
errors
- authentication failures
- ffmpeg errors
- timeout errors
- 413 file too large

fly.io metrics#

# view metrics dashboard
fly dashboard -a plyr-transcoder

# check recent requests
fly logs -a plyr-transcoder | grep "POST /transcode"

# monitor resource usage
fly vm status -a plyr-transcoder

troubleshooting#

common issues#

ffmpeg not found:

error: ffmpeg command failed: No such file or directory

solution: ensure ffmpeg is installed in docker image (check Dockerfile)

authentication fails in production:

error: 401 unauthorized

solution: verify TRANSCODER_AUTH_TOKEN is set on both transcoder and backend

timeouts on large files:

error: request timeout after 120s

solution: increase timeout in backend client (timeout=300.0)

413 entity too large:

error: 413 payload too large

solution: increase TRANSCODER_MAX_UPLOAD_BYTES or reject large files earlier

VM not starting automatically:

error: no instances available

solution: check auto_start_machines = true in fly.toml

future enhancements#

potential improvements#

progress tracking
- stream ffmpeg progress updates
- return progress via server-sent events
- enable client-side progress bar
format detection
- auto-detect input format via ffprobe
- validate format before transcoding
- reject unsupported formats early
quality presets
- high quality (320kbps MP3, 256kbps AAC)
- standard quality (192kbps)
- low quality (128kbps for previews)
metadata preservation
- extract metadata from input
- apply metadata to output
- handle artwork/cover images
batch processing
- accept multiple files
- process in parallel
- return as zip archive
caching
- cache transcoded files by content hash
- serve cached versions instantly
- implement LRU eviction

references#

source code: transcoder/src/main.rs
justfile: transcoder/Justfile
fly config: transcoder/fly.toml
dockerfile: transcoder/Dockerfile
ffmpeg docs: https://ffmpeg.org/documentation.html
fly.io docs: https://fly.io/docs/