audit status (#421)

authored by zzstoatzz.io and committed by GitHub 124a1fc6 88cb471d

Changed files
+657 -473
+12 -19
.github/workflows/status-maintenance.yml
···
  determine:
  - what is today's date?
  - what shipped in the last week vs earlier?
- - does .status_history/ exist?
+ - does .status_history/ exist? (this implies whether or not this is the first episode)
  - how many lines is STATUS.md currently?
- - is this the FIRST episode (no .status_history/ exists)?

  ## task 2: archive old sections (MANDATORY if over 250 lines)

···
  5. preserve the document structure (keep "## recent work" header, "## immediate priorities", etc)
  6. do NOT summarize archived content - move it verbatim

- example: if STATUS.md has sections from Nov 10, Nov 12, Nov 24, Nov 27, Dec 1:
+ example (which happens to be from November 2025): if STATUS.md has sections from Nov 10, Nov 12, Nov 24, Nov 27, Dec 1:
  - move Nov 10-24 content to .status_history/2025-11.md
  - keep Nov 27 and Dec 1 content in STATUS.md

  VERIFY: run `wc -l STATUS.md` after archiving. it MUST be under 250 lines.

- ## task 3: generate audio overview
+ ## task 3: generate audio overview (if skip_audio is false)

  skip_audio input: ${{ inputs.skip_audio }}

···
  - Fetch https://atproto.com/guides/overview to understand ATProto primitives
  - Fetch https://atproto.com/guides/lexicon to understand NSIDs and lexicons

- this context helps you explain things accurately without over-simplifying.
+ this context helps you explain things accurately, and accessibly without over-simplifying.

  ### determine episode type

···
  ### tone requirements (CRITICAL)

  the hosts should sound like two engineers who:
- - are mildly amused by the absurdity of building things
+ - are skeptical, amused and somewhat intruiged by the absurdity of building things
  - acknowledge problems and limitations honestly
- - don't use superlatives ("amazing", "incredible", "exciting")
- - don't congratulate the project or its creator
- - explain technical concepts through analogy, not hype
- - are slightly sardonic but not mean
+ - don't over-use superlatives ("amazing", "incredible", "exciting")
+ - explain technical concepts through analogy, not hypey jargon

  BAD example:
  "Host: Wow, they've done an incredible job building this decentralized music platform!"
  "Cohost: Absolutely! The ATProto integration is amazing!"

- GOOD example:
- "Host: So someone built a music streaming thing on ATProto."
- "Cohost: Right, the protocol Bluesky uses. The idea being your playlists and play history live in your personal data server instead of Spotify's database."
- "Host: Which means if plyr.fm disappears tomorrow, you still have your data."
- "Cohost: In theory. Whether anyone else builds a client that reads it is another question."
-
- forbidden phrases:
+ avoid excessive phrasing:
  - "exciting", "amazing", "incredible", "impressive", "great job"
  - "the team has done", "they've really", "fantastic work"
- - any variation of congratulating or praising the project
+ - any variation of over-congratulating or over-sensationalizing the project

- pronunciation: "plyr.fm" = "player FM" (not "plir" or spelled out)
+ pronunciation: "plyr.fm" is pronounced "player FM" (not "plir" or spelled out)

  target length: 2-3 minutes spoken (~300-400 words)

···
  4. git push -u origin status-maintenance-$(date +%Y%m%d)
  5. gh pr create --title "chore: weekly status maintenance" --body "automated status archival and audio overview"

+ add detail as desired to the PR body and title.
+ add a label like "ai-generated" to the PR (create the label if it doesn't exist)
  if nothing changed, report that no maintenance was needed.

  env:
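The archiving rule in the workflow prompt above (archive when STATUS.md is over 250 lines, move old sections verbatim into a per-month file such as .status_history/2025-11.md, then verify the result is under 250 lines) can be sketched in Python. This is an illustration only, not code from the repository: `needs_archiving`, `is_within_limit`, and `archive_path` are hypothetical names, and the workflow appears to hand these steps to an automated agent rather than a script.

```python
from datetime import date
from pathlib import PurePosixPath

# From the workflow prompt: archiving is mandatory over 250 lines,
# and STATUS.md must end up under 250 lines afterwards.
LINE_LIMIT = 250


def needs_archiving(status_text: str, limit: int = LINE_LIMIT) -> bool:
    """Return True when the document is over the limit (task 2 trigger).

    Counts lines with splitlines(), which approximates `wc -l` for files
    that end with a trailing newline.
    """
    return len(status_text.splitlines()) > limit


def is_within_limit(status_text: str, limit: int = LINE_LIMIT) -> bool:
    """Post-archive check: the prompt requires strictly fewer than `limit` lines."""
    return len(status_text.splitlines()) < limit


def archive_path(section_date: date) -> PurePosixPath:
    """Map a section's date to its per-month archive file (assumed YYYY-MM.md naming,
    matching the .status_history/2025-11.md example in the prompt)."""
    return PurePosixPath(".status_history") / f"{section_date:%Y-%m}.md"
```

For the example in the prompt, `archive_path(date(2025, 11, 10))` and `archive_path(date(2025, 11, 24))` both resolve to `.status_history/2025-11.md`, which is why the Nov 10-24 sections land in the same archive file.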
-441
.status_history/2025-11.md
··· 1 - # plyr.fm status archive - november 2025 2 - 3 - ### Queue hydration + ATProto token hardening (Nov 12, 2025) 4 - 5 - **Why:** queue endpoints were occasionally taking 2s+ and restore operations could 401 6 - when multiple requests refreshed an expired ATProto token simultaneously. 7 - 8 - **What shipped:** 9 - - Added persistent `image_url` on `Track` rows so queue hydration no longer probes R2 10 - for every track. Queue payloads now pull art directly from Postgres, with a one-time 11 - fallback for legacy rows. 12 - - Updated `_internal/queue.py` to backfill any missing URLs once (with caching) instead 13 - of per-request GETs. 14 - - Introduced per-session locks in `_refresh_session_tokens` so only one coroutine hits 15 - `oauth_client.refresh_session` at a time; others reuse the refreshed tokens. This 16 - removes the race that caused the batch restore flow to intermittently 500/401. 17 - 18 - **Impact:** queue tail latency dropped back under 500 ms in staging tests, ATProto restore flows are now reliable under concurrent use, and Logfire no longer shows 500s 19 - from the PDS. 20 - 21 - ### Liked tracks feature (PR #157, Nov 11, 2025) 22 - 23 - - ✅ server-side persistent collections 24 - - ✅ ATProto record publication for cross-platform visibility 25 - - ✅ UI for adding/removing tracks from liked collection 26 - - ✅ like counts displayed in track responses and analytics (#170) 27 - - ✅ analytics cards now clickable links to track detail pages (#171) 28 - - ✅ liked state shown on artist page tracks (#163) 29 - 30 - ### Upload streaming + progress UX (PR #182, Nov 11, 2025) 31 - 32 - - Frontend switched from `fetch` to `XMLHttpRequest` so we can display upload progress 33 - toasts (critical for >50 MB mixes on mobile). 34 - - Upload form now clears only after the request succeeds; failed attempts leave the 35 - form intact so users don't lose metadata. 
36 - - Backend writes uploads/images to temp files in 8 MB chunks before handing them to the 37 - storage layer, eliminating whole-file buffering and iOS crashes for hour-long mixes. 38 - - Deployment verified locally and by rerunning the exact repro Stella hit (85 minute 39 - mix from mobile). 40 - 41 - ### transcoder API deployment (PR #156, Nov 11, 2025) 42 - 43 - **standalone Rust transcoding service** 🎉 44 - - **deployed**: https://plyr-transcoder.fly.dev/ 45 - - **purpose**: convert AIFF/FLAC/etc. to MP3 for browser compatibility 46 - - **technology**: Axum + ffmpeg + Docker 47 - - **security**: `X-Transcoder-Key` header authentication (shared secret) 48 - - **capacity**: handles 1GB uploads, tested with 85-minute AIFF files (~858MB → 195MB MP3 in 32 seconds) 49 - - **architecture**: 50 - - 2 Fly machines for high availability 51 - - auto-stop/start for cost efficiency 52 - - stateless design (no R2 integration yet) 53 - - 320kbps MP3 output with proper ID3 tags 54 - - **status**: deployed and tested, ready for integration into plyr.fm upload pipeline 55 - - **next steps**: wire into backend with R2 integration and job queue (see issue #153) 56 - 57 - ### AIFF/AIF browser compatibility fix (PR #152, Nov 11, 2025) 58 - 59 - **format validation improvements** 60 - - **problem discovered**: AIFF/AIF files only work in Safari, not Chrome/Firefox 61 - - browsers throw `MediaError code 4: MEDIA_ERR_SRC_NOT_SUPPORTED` 62 - - users could upload files but they wouldn't play in most browsers 63 - - **immediate solution**: reject AIFF/AIF uploads at both backend and frontend 64 - - removed AIFF/AIF from AudioFormat enum 65 - - added format hints to upload UI: "supported: mp3, wav, m4a" 66 - - client-side validation with helpful error messages 67 - - **long-term solution**: deployed standalone transcoder service (see above) 68 - - separate Rust/Axum service with ffmpeg 69 - - accepts all formats, converts to browser-compatible MP3 70 - - integration into upload pipeline 
pending (issue #153) 71 - 72 - **observability improvements**: 73 - - added logfire instrumentation to upload background tasks 74 - - added logfire spans to R2 storage operations 75 - - documented logfire querying patterns in `docs/logfire-querying.md` 76 - 77 - ### async I/O performance fixes (PRs #149-151, Nov 10-11, 2025) 78 - 79 - Eliminated event loop blocking across backend with three critical PRs: 80 - 81 - 1. **PR #149: async R2 reads** - converted R2 `head_object` operations from sync boto3 to async aioboto3 82 - - portal page load time: 2+ seconds → ~200ms 83 - - root cause: `track.image_url` was blocking on serial R2 HEAD requests 84 - 85 - 2. **PR #150: concurrent PDS resolution** - parallelized ATProto PDS URL lookups 86 - - homepage load time: 2-6 seconds → 200-400ms 87 - - root cause: serial `resolve_atproto_data()` calls (8 artists × 200-300ms each) 88 - - fix: `asyncio.gather()` for batch resolution, database caching for subsequent loads 89 - 90 - 3. **PR #151: async storage writes/deletes** - made save/delete operations non-blocking 91 - - R2: switched to `aioboto3` for uploads/deletes (async S3 operations) 92 - - filesystem: used `anyio.Path` and `anyio.open_file()` for chunked async I/O (64KB chunks) 93 - - impact: multi-MB uploads no longer monopolize worker thread, constant memory usage 94 - 95 - ### cover art support (PRs #123-126, #132-139) 96 - - ✅ track cover image upload and storage (separate R2 bucket) 97 - - ✅ image display on track pages and player 98 - - ✅ Open Graph meta tags for track sharing 99 - - ✅ mobile-optimized layouts with cover art 100 - - ✅ sticky bottom player on mobile with cover 101 - 102 - ### track detail pages (PR #164, Nov 12, 2025) 103 - 104 - - ✅ dedicated track detail pages with large cover art 105 - - ✅ play button updates queue state correctly (#169) 106 - - ✅ liked state loaded efficiently via server-side fetch 107 - - ✅ mobile-optimized layouts with proper scrolling constraints 108 - - ✅ origin validation for 
image URLs (#168) 109 - 110 - ### mobile UI improvements (PRs #159-185, Nov 11-12, 2025) 111 - 112 - - ✅ compact action menus and better navigation (#161) 113 - - ✅ improved mobile responsiveness (#159) 114 - - ✅ consistent button layouts across mobile/desktop (#176-181, #185) 115 - - ✅ always show play count and like count on mobile (#177) 116 - - ✅ login page UX improvements (#174-175) 117 - - ✅ liked page UX improvements (#173) 118 - - ✅ accent color for liked tracks (#160) 119 - 120 - ### queue management improvements (PRs #110-113, #115) 121 - - ✅ visual feedback on queue add/remove 122 - - ✅ toast notifications for queue actions 123 - - ✅ better error handling for queue operations 124 - - ✅ improved shuffle and auto-advance UX 125 - 126 - ### infrastructure and tooling 127 - - ✅ R2 bucket separation: audio-prod and images-prod (PR #124) 128 - - ✅ admin script for content moderation (`scripts/delete_track.py`) 129 - - ✅ bluesky attribution link in header 130 - - ✅ changelog target added (#183) 131 - - ✅ documentation updates (#158) 132 - - ✅ track metadata edits now persist correctly (#162) 133 - 134 - --- 135 - 136 - ## performance optimization session (Nov 12, 2025) 137 - 138 - ### issue: slow /tracks/liked endpoint 139 - 140 - **symptoms**: 141 - - `/tracks/liked` taking 600-900ms consistently 142 - - only ~25ms spent in database queries 143 - - mysterious 575ms gap with no spans in Logfire traces 144 - - endpoint felt sluggish compared to other pages 145 - 146 - **investigation**: 147 - - examined Logfire traces for `/tracks/liked` requests 148 - - found 5-6 liked tracks being returned per request 149 - - DB queries completing fast (track data, artist info, like counts all under 10ms each) 150 - - noticed R2 storage calls weren't appearing in traces despite taking majority of request time 151 - 152 - **root cause**: 153 - - PR #184 added `image_url` column to tracks table to eliminate N+1 R2 API calls 154 - - new tracks (uploaded after PR) have `image_url` 
populated at upload time ✅ 155 - - legacy tracks (15 tracks uploaded before PR) had `image_url = NULL` ❌ 156 - - fallback code called `track.get_image_url()` for NULL values 157 - - `get_image_url()` makes uninstrumented R2 `head_object` API calls to find image extensions 158 - - each track with NULL `image_url` = ~100-120ms of R2 API calls per request 159 - - 5 tracks × 120ms = ~600ms of uninstrumented latency 160 - 161 - **why R2 calls weren't visible**: 162 - - `storage.get_url()` method had no Logfire instrumentation 163 - - R2 API calls happening but not creating spans 164 - - appeared as mysterious gap in trace timeline 165 - 166 - **solution implemented**: 167 - 1. created `scripts/backfill_image_urls.py` to populate missing `image_url` values 168 - 2. ran script against production database with production R2 credentials 169 - 3. backfilled 11 tracks successfully (4 already done in previous partial run) 170 - 4. 3 tracks "failed" but actually have non-existent images (optional, expected) 171 - 5. 
script uses concurrent `asyncio.gather()` for performance 172 - 173 - **key learning: environment configuration matters**: 174 - - initial script runs failed silently because: 175 - - script used local `.env` credentials (dev R2 bucket) 176 - - production images stored in different R2 bucket (`images-prod`) 177 - - `get_url()` returned `None` when images not found in dev bucket 178 - - fix: passed production R2 credentials via environment variables: 179 - - `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` 180 - - `R2_IMAGE_BUCKET=images-prod` 181 - - `R2_PUBLIC_IMAGE_BUCKET_URL=https://pub-7ea7ea9a6f224f4f8c0321a2bb008c5a.r2.dev` 182 - 183 - **results**: 184 - - before: 15 tracks needed backfill, causing ~600-900ms latency on `/tracks/liked` 185 - - after: 13 tracks populated with `image_url`, 3 legitimately have no images 186 - - `/tracks/liked` now loads with 0 R2 API calls instead of 5-11 187 - - endpoint feels "really, really snappy" (user feedback) 188 - - performance improvement visible immediately after backfill 189 - 190 - **database cleanup: queue_state table bloat**: 191 - - discovered `queue_state` had 265% bloat (53 dead rows, 20 live rows) 192 - - ran `VACUUM (FULL, ANALYZE) queue_state` against production 193 - - result: 0 dead rows, table clean 194 - - configured autovacuum for queue_state to prevent future bloat: 195 - - frequent updates to this table make it prone to bloat 196 - - should tune `autovacuum_vacuum_scale_factor` to 0.05 (5% vs default 20%) 197 - 198 - **endpoint performance snapshot** (post-fix, last 10 minutes): 199 - - `GET /tracks/`: 410ms (down from 2+ seconds) 200 - - `GET /queue/`: 399ms (down from 2+ seconds) 201 - - `GET /tracks/liked`: now sub-200ms (down from 600-900ms) 202 - - `GET /preferences/`: 200ms median 203 - - `GET /auth/me`: 114ms median 204 - - `POST /tracks/{track_id}/play`: 34ms 205 - 206 - **PR #184 context**: 207 - - PR claimed "opportunistic backfill: legacy records update on first access" 208 - - but actual 
implementation never saved computed `image_url` back to database 209 - - fallback code only computed URLs on-demand, didn't persist them 210 - - this is why repeated visits kept hitting R2 API for same tracks 211 - - one-time backfill script was correct solution vs adding write logic to read endpoints 212 - 213 - **graceful ATProto recovery (PR #180)**: 214 - - reviewed recent work on handling tracks with missing `atproto_record_uri` 215 - - 4 tracks in production have NULL ATProto records (expected from upload failures) 216 - - system already handles this gracefully: 217 - - like buttons disabled with helpful tooltips 218 - - track owners can self-service restore via portal 219 - - `restore-record` endpoint recreates with correct TID timestamps 220 - - no action needed - existing recovery system working as designed 221 - 222 - **performance metrics pre/post all recent PRs**: 223 - - PR #184 (image_url storage): eliminated hundreds of R2 API calls per request 224 - - today's backfill: eliminated remaining R2 calls for legacy tracks 225 - - combined impact: queue/tracks endpoints now 5-10x faster than before PR #184 226 - - all endpoints now consistently sub-second response times 227 - 228 - **documentation created**: 229 - - `docs/neon-mcp-guide.md`: comprehensive guide for using Neon MCP 230 - - project/branch management 231 - - database schema inspection 232 - - SQL query patterns for plyr.fm 233 - - connection string generation 234 - - environment mapping (dev/staging/prod) 235 - - debugging workflows 236 - - `scripts/backfill_image_urls.py`: reusable for any future image_url gaps 237 - - dry-run mode for safety 238 - - concurrent R2 API calls 239 - - detailed error logging 240 - - production-tested 241 - 242 - **tools and patterns established**: 243 - - Neon MCP for database inspection and queries 244 - - Logfire arbitrary queries for performance analysis 245 - - production secret management via Fly.io 246 - - `flyctl ssh console` for environment inspection 247 
- - backfill scripts with dry-run mode 248 - - environment variable overrides for production operations 249 - 250 - **system health indicators**: 251 - - ✅ no 5xx errors in recent spans 252 - - ✅ database queries all under 70ms p95 253 - - ✅ SSL connection pool issues resolved (no errors in recent traces) 254 - - ✅ queue_state table bloat eliminated 255 - - ✅ all track images either in DB or legitimately NULL 256 - - ✅ application feels fast and responsive 257 - 258 - **next steps**: 259 - 1. configure autovacuum for `queue_state` table (prevent future bloat) 260 - 2. add Logfire instrumentation to `storage.get_url()` for visibility 261 - 3. monitor `/tracks/liked` performance over next few days 262 - 4. consider adding similar backfill pattern for any future column additions 263 - 264 - --- 265 - 266 - ### copyright moderation system (PRs #382, #384, Nov 29-30, 2025) 267 - 268 - **motivation**: detect potential copyright violations in uploaded tracks to avoid DMCA issues and protect the platform. 
269 - 270 - **what shipped**: 271 - - **moderation service** (Rust/Axum on Fly.io): 272 - - standalone service at `plyr-moderation.fly.dev` 273 - - integrates with AuDD enterprise API for audio fingerprinting 274 - - scans audio URLs and returns matches with metadata (artist, title, album, ISRC, timecode) 275 - - auth via `X-Moderation-Key` header 276 - - **backend integration** (PR #382): 277 - - `ModerationSettings` in config (service URL, auth token, timeout) 278 - - moderation client module (`backend/_internal/moderation.py`) 279 - - fire-and-forget background task on track upload 280 - - stores results in `copyright_scans` table 281 - - scan errors stored as "clear" so tracks aren't stuck unscanned 282 - - **flagging fix** (PR #384): 283 - - AuDD enterprise API returns no confidence scores (all 0) 284 - - changed from score threshold to presence-based flagging: `is_flagged = !matches.is_empty()` 285 - - removed unused `score_threshold` config 286 - - **backfill script** (`scripts/scan_tracks_copyright.py`): 287 - - scans existing tracks that haven't been checked 288 - - `--max-duration` flag to skip long DJ sets (estimated from file size) 289 - - `--dry-run` mode to preview what would be scanned 290 - - supports dev/staging/prod environments 291 - - **review workflow**: 292 - - `copyright_scans` table has `resolution`, `reviewed_at`, `reviewed_by`, `review_notes` columns 293 - - resolution values: `violation`, `false_positive`, `original_artist` 294 - - SQL queries for dashboard: flagged tracks, unreviewed flags, violations list 295 - 296 - **initial review results** (25 flagged tracks): 297 - - 8 violations (actual copyright issues) 298 - - 11 false positives (fingerprint noise) 299 - - 6 original artists (people uploading their own distributed music) 300 - 301 - **impact**: 302 - - automated copyright detection on upload 303 - - manual review workflow for flagged content 304 - - protection against DMCA takedown requests 305 - - clear audit trail with 
resolution status 306 - 307 - --- 308 - 309 - ### platform stats and media session integration (PRs #359-379, Nov 27-29, 2025) 310 - 311 - **motivation**: show platform activity at a glance, improve playback experience across devices, and give users control over their data. 312 - 313 - **what shipped**: 314 - - **platform stats endpoint and UI** (PRs #376, #378, #379): 315 - - `GET /stats` returns total plays, tracks, and artists 316 - - stats bar displays in homepage header (e.g., "1,691 plays • 55 tracks • 8 artists") 317 - - skeleton loading animation while fetching 318 - - responsive layout: visible in header on wide screens, collapses to menu on narrow 319 - - end-of-list animation on homepage 320 - - **Media Session API** (PR #371): 321 - - provides track metadata to CarPlay, lock screens, Bluetooth devices, macOS control center 322 - - artwork display with fallback to artist avatar 323 - - play/pause, prev/next, seek controls all work from system UI 324 - - position state syncs scrubbers on external interfaces 325 - - **browser tab title** (PR #374): 326 - - shows "track - artist • plyr.fm" while playing 327 - - persists across page navigation 328 - - reverts to page title when playback stops 329 - - **timed comments** (PR #359): 330 - - comments capture timestamp when added during playback 331 - - clickable timestamp buttons seek to that moment 332 - - compact scrollable comments section on track pages 333 - - **constellation integration** (PR #360): 334 - - queries constellation.microcosm.blue backlink index 335 - - enables network-wide like counts (not just plyr.fm internal) 336 - - environment-aware namespace handling 337 - - **account deletion** (PR #363): 338 - - explicit confirmation flow (type handle to confirm) 339 - - deletes all plyr.fm data (tracks, albums, likes, comments, preferences) 340 - - optional ATProto record cleanup with clear warnings about orphaned references 341 - 342 - **impact**: 343 - - platform stats give visitors immediate sense 
of activity 344 - - media session makes plyr.fm tracks controllable from car/lock screen/control center 345 - - timed comments enable discussion at specific moments in tracks 346 - - account deletion gives users full control over their data 347 - 348 - --- 349 - 350 - ### developer tokens with independent OAuth grants (PR #367, Nov 28, 2025) 351 - 352 - **motivation**: programmatic API access (scripts, CLIs, automation) needed tokens that survive browser logout and don't become stale when browser sessions refresh. 353 - 354 - **what shipped**: 355 - - **OAuth-based dev tokens**: each developer token gets its own OAuth authorization flow 356 - - user clicks "create token" → redirected to PDS for authorization → token created with independent credentials 357 - - tokens have their own DPoP keypair, access/refresh tokens - completely separate from browser session 358 - - **cookie isolation**: dev token exchange doesn't set browser cookie 359 - - added `is_dev_token` flag to ExchangeToken model 360 - - /auth/exchange skips Set-Cookie for dev token flows 361 - - prevents logout from deleting dev tokens (critical bug fixed during implementation) 362 - - **token management UI**: portal → "your data" → "developer tokens" 363 - - create with optional name and expiration (30/90/180/365 days or never) 364 - - list active tokens with creation/expiration dates 365 - - revoke individual tokens 366 - - **API endpoints**: 367 - - `POST /auth/developer-token/start` - initiates OAuth flow, returns auth_url 368 - - `GET /auth/developer-tokens` - list user's tokens 369 - - `DELETE /auth/developer-tokens/{prefix}` - revoke by 8-char prefix 370 - 371 - **security properties**: 372 - - tokens are full sessions with encrypted OAuth credentials (Fernet) 373 - - each token refreshes independently (no staleness from browser session refresh) 374 - - revokable individually without affecting browser or other tokens 375 - - explicit OAuth consent required at PDS for each token created 376 - 377 - 
**testing verified**: 378 - - created token → uploaded track → logged out → deleted track with token ✓ 379 - - browser logout doesn't affect dev tokens ✓ 380 - - token works across browser sessions ✓ 381 - - staging deployment tested end-to-end ✓ 382 - 383 - **documentation**: see `docs/authentication.md` "developer tokens" section 384 - 385 - --- 386 - 387 - ### oEmbed endpoint for Leaflet.pub embeds (PRs #355-358, Nov 25, 2025) 388 - 389 - **motivation**: plyr.fm tracks embedded in Leaflet.pub (via iframely) showed a black HTML5 audio box instead of our custom embed player. 390 - 391 - **what shipped**: 392 - - **oEmbed endpoint** (PR #355): `/oembed` returns proper embed HTML with iframe 393 - - follows oEmbed spec with `type: "rich"` and iframe in `html` field 394 - - discovery link in track page `<head>` for automatic detection 395 - - **iframely domain registration**: registered plyr.fm on iframely.com (free tier) 396 - - this was the key fix - iframely now returns our embed iframe as `links.player[0]` 397 - - API key: stored in 1password (iframely account) 398 - 399 - **debugging journey** (PRs #356-358): 400 - - initially tried `og:video` meta tags to hint iframe embed - didn't work 401 - - tried removing `og:audio` to force oEmbed fallback - resulted in no player link 402 - - discovered iframely requires domain registration to trust oEmbed providers 403 - - after registration, iframely correctly returns embed iframe URL 404 - 405 - **current state**: 406 - - oEmbed endpoint working: `curl https://api.plyr.fm/oembed?url=https://plyr.fm/track/92` 407 - - iframely returns `links.player[0].href = "https://plyr.fm/embed/track/92"` (our embed) 408 - - Leaflet.pub should show proper embeds (pending their cache expiry) 409 - 410 - **impact**: 411 - - plyr.fm tracks can be embedded in Leaflet.pub and other iframely-powered services 412 - - proper embed player with cover art instead of raw HTML5 audio 413 - 414 - --- 415 - 416 - ### export & upload reliability (PRs 
#337-344, Nov 24, 2025) 417 - 418 - **motivation**: exports were failing silently on large files (OOM), uploads showed incorrect progress, and SSE connections triggered false error toasts. 419 - 420 - **what shipped**: 421 - - **database-backed jobs** (PR #337): moved upload/export tracking from in-memory to postgres 422 - - jobs table persists state across server restarts 423 - - enables reliable progress tracking via SSE polling 424 - - **streaming exports** (PR #343): fixed OOM on large file exports 425 - - previously loaded entire files into memory via `response["Body"].read()` 426 - - now streams to temp files, adds to zip from disk (constant memory) 427 - - 90-minute WAV files now export successfully on 1GB VM 428 - - **progress tracking fix** (PR #340): upload progress was receiving bytes but treating as percentage 429 - - `UploadProgressTracker` now properly converts bytes to percentage 430 - - upload progress bar works correctly again 431 - - **UX improvements** (PRs #338-339, #341-342, #344): 432 - - export filename now includes date (`plyr-tracks-2025-11-24.zip`) 433 - - toast notification on track deletion 434 - - fixed false "lost connection" error when SSE completes normally 435 - - progress now shows "downloading track X of Y" instead of confusing count 436 - 437 - **impact**: 438 - - exports work for arbitrarily large files (limited by disk, not RAM) 439 - - upload progress displays correctly 440 - - job state survives server restarts 441 - - clearer progress messaging during exports
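The progress tracking fix in the archived entry above (PR #340) comes down to converting a byte count into a percentage before reporting it. A minimal sketch, assuming the total upload size is known up front; `bytes_to_percent` is a hypothetical helper and does not reproduce the actual `UploadProgressTracker` implementation:

```python
def bytes_to_percent(bytes_sent: int, total_bytes: int) -> float:
    """Convert a running byte count into a 0-100 percentage.

    Hypothetical helper for illustration. Returns 0.0 when the total is
    unknown or zero, and clamps the result to [0, 100] so a late or
    oversized callback cannot push the progress bar past the end.
    """
    if total_bytes <= 0:
        return 0.0
    percent = bytes_sent / total_bytes * 100.0
    return min(100.0, max(0.0, percent))
```

Passing a raw byte count where a percentage is expected is the failure mode the entry describes: a value like 8388608 would be read as "over 100%" on the first callback, whereas the converted value grows smoothly with the upload.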
+645 -13
STATUS.md
··· 39 39 40 40 --- 41 41 42 - ## recent work 42 + **started**: October 28, 2025 (first commit: `454e9bc` - relay MVP with ATProto authentication) 43 + 44 + --- 43 45 44 - ### now-playing API (PR #416, Dec 1, 2025) 46 + ## development timeline 47 + 48 + ### December 2025 49 + 50 + #### now-playing API (PR #416, Dec 1) 45 51 46 52 **motivation**: expose what users are currently listening to via public API 47 53 ··· 63 69 64 70 --- 65 71 66 - ### admin UI improvements for moderation (PRs #408-414, Dec 1, 2025) 72 + #### admin UI improvements for moderation (PRs #408-414, Dec 1) 67 73 68 74 **motivation**: improve usability of copyright moderation admin UI based on real-world usage 69 75 ··· 94 100 95 101 --- 96 102 97 - ### ATProto labeler and admin UI improvements (PRs #385-395, Nov 29-Dec 1, 2025) 103 + ### November 2025 104 + 105 + #### ATProto labeler and admin UI (PRs #385-395, Nov 29-Dec 1) 98 106 99 107 **motivation**: integrate with ATProto labeling protocol for proper copyright violation signaling, and improve admin tooling for reviewing flagged content. 100 108 ··· 133 141 134 142 --- 135 143 144 + #### copyright moderation system (PRs #382, #384, Nov 29-30) 145 + 146 + **motivation**: detect potential copyright violations in uploaded tracks to avoid DMCA issues and protect the platform. 
147 + 148 + **what shipped**: 149 + - **moderation service** (Rust/Axum on Fly.io): 150 + - standalone service at `plyr-moderation.fly.dev` 151 + - integrates with AuDD enterprise API for audio fingerprinting 152 + - scans audio URLs and returns matches with metadata (artist, title, album, ISRC, timecode) 153 + - auth via `X-Moderation-Key` header 154 + - **backend integration** (PR #382): 155 + - `ModerationSettings` in config (service URL, auth token, timeout) 156 + - moderation client module (`backend/_internal/moderation.py`) 157 + - fire-and-forget background task on track upload 158 + - stores results in `copyright_scans` table 159 + - scan errors stored as "clear" so tracks aren't stuck unscanned 160 + - **flagging fix** (PR #384): 161 + - AuDD enterprise API returns no confidence scores (all 0) 162 + - changed from score threshold to presence-based flagging: `is_flagged = !matches.is_empty()` 163 + - removed unused `score_threshold` config 164 + - **backfill script** (`scripts/scan_tracks_copyright.py`): 165 + - scans existing tracks that haven't been checked 166 + - `--max-duration` flag to skip long DJ sets (estimated from file size) 167 + - `--dry-run` mode to preview what would be scanned 168 + - supports dev/staging/prod environments 169 + - **review workflow**: 170 + - `copyright_scans` table has `resolution`, `reviewed_at`, `reviewed_by`, `review_notes` columns 171 + - resolution values: `violation`, `false_positive`, `original_artist` 172 + 173 + **initial review results** (25 flagged tracks): 174 + - 8 violations (actual copyright issues) 175 + - 11 false positives (fingerprint noise) 176 + - 6 original artists (people uploading their own distributed music) 177 + 178 + --- 179 + 180 + #### developer tokens with independent OAuth grants (PR #367, Nov 28) 181 + 182 + **motivation**: programmatic API access (scripts, CLIs, automation) needed tokens that survive browser logout and don't become stale when browser sessions refresh. 
183 + 184 + **what shipped**: 185 + - **OAuth-based dev tokens**: each developer token gets its own OAuth authorization flow 186 + - user clicks "create token" → redirected to PDS for authorization → token created with independent credentials 187 + - tokens have their own DPoP keypair, access/refresh tokens - completely separate from browser session 188 + - **cookie isolation**: dev token exchange doesn't set browser cookie 189 + - added `is_dev_token` flag to ExchangeToken model 190 + - /auth/exchange skips Set-Cookie for dev token flows 191 + - prevents logout from deleting dev tokens (critical bug fixed during implementation) 192 + - **token management UI**: portal → "your data" → "developer tokens" 193 + - create with optional name and expiration (30/90/180/365 days or never) 194 + - list active tokens with creation/expiration dates 195 + - revoke individual tokens 196 + - **API endpoints**: 197 + - `POST /auth/developer-token/start` - initiates OAuth flow, returns auth_url 198 + - `GET /auth/developer-tokens` - list user's tokens 199 + - `DELETE /auth/developer-tokens/{prefix}` - revoke by 8-char prefix 200 + 201 + **security properties**: 202 + - tokens are full sessions with encrypted OAuth credentials (Fernet) 203 + - each token refreshes independently (no staleness from browser session refresh) 204 + - revokable individually without affecting browser or other tokens 205 + - explicit OAuth consent required at PDS for each token created 206 + 207 + **documentation**: see `docs/authentication.md` "developer tokens" section 208 + 209 + --- 210 + 211 + #### platform stats and media session integration (PRs #359-379, Nov 27-29) 212 + 213 + **motivation**: show platform activity at a glance, improve playback experience across devices, and give users control over their data. 
214 + 215 + **what shipped**: 216 + - **platform stats endpoint and UI** (PRs #376, #378, #379): 217 + - `GET /stats` returns total plays, tracks, and artists 218 + - stats bar displays in homepage header (e.g., "1,691 plays • 55 tracks • 8 artists") 219 + - skeleton loading animation while fetching 220 + - responsive layout: visible in header on wide screens, collapses to menu on narrow 221 + - end-of-list animation on homepage 222 + - **Media Session API** (PR #371): 223 + - provides track metadata to CarPlay, lock screens, Bluetooth devices, macOS control center 224 + - artwork display with fallback to artist avatar 225 + - play/pause, prev/next, seek controls all work from system UI 226 + - position state syncs scrubbers on external interfaces 227 + - **browser tab title** (PR #374): 228 + - shows "track - artist • plyr.fm" while playing 229 + - persists across page navigation 230 + - reverts to page title when playback stops 231 + - **timed comments** (PR #359): 232 + - comments capture timestamp when added during playback 233 + - clickable timestamp buttons seek to that moment 234 + - compact scrollable comments section on track pages 235 + - **constellation integration** (PR #360): 236 + - queries constellation.microcosm.blue backlink index 237 + - enables network-wide like counts (not just plyr.fm internal) 238 + - environment-aware namespace handling 239 + - **account deletion** (PR #363): 240 + - explicit confirmation flow (type handle to confirm) 241 + - deletes all plyr.fm data (tracks, albums, likes, comments, preferences) 242 + - optional ATProto record cleanup with clear warnings about orphaned references 243 + 244 + --- 245 + 246 + #### oEmbed endpoint for Leaflet.pub embeds (PRs #355-358, Nov 25) 247 + 248 + **motivation**: plyr.fm tracks embedded in Leaflet.pub (via iframely) showed a black HTML5 audio box instead of our custom embed player. 
249 + 250 + **what shipped**: 251 + - **oEmbed endpoint** (PR #355): `/oembed` returns proper embed HTML with iframe 252 + - follows oEmbed spec with `type: "rich"` and iframe in `html` field 253 + - discovery link in track page `<head>` for automatic detection 254 + - **iframely domain registration**: registered plyr.fm on iframely.com (free tier) 255 + - this was the key fix - iframely now returns our embed iframe as `links.player[0]` 256 + 257 + **debugging journey** (PRs #356-358): 258 + - initially tried `og:video` meta tags to hint iframe embed - didn't work 259 + - tried removing `og:audio` to force oEmbed fallback - resulted in no player link 260 + - discovered iframely requires domain registration to trust oEmbed providers 261 + - after registration, iframely correctly returns embed iframe URL 262 + 263 + --- 264 + 265 + #### export & upload reliability (PRs #337-344, Nov 24) 266 + 267 + **motivation**: exports were failing silently on large files (OOM), uploads showed incorrect progress, and SSE connections triggered false error toasts. 
268 + 269 + **what shipped**: 270 + - **database-backed jobs** (PR #337): moved upload/export tracking from in-memory to postgres 271 + - jobs table persists state across server restarts 272 + - enables reliable progress tracking via SSE polling 273 + - **streaming exports** (PR #343): fixed OOM on large file exports 274 + - previously loaded entire files into memory via `response["Body"].read()` 275 + - now streams to temp files, adds to zip from disk (constant memory) 276 + - 90-minute WAV files now export successfully on 1GB VM 277 + - **progress tracking fix** (PR #340): upload progress was receiving bytes but treating as percentage 278 + - `UploadProgressTracker` now properly converts bytes to percentage 279 + - upload progress bar works correctly again 280 + - **UX improvements** (PRs #338-339, #341-342, #344): 281 + - export filename now includes date (`plyr-tracks-2025-11-24.zip`) 282 + - toast notification on track deletion 283 + - fixed false "lost connection" error when SSE completes normally 284 + - progress now shows "downloading track X of Y" instead of confusing count 285 + 286 + --- 287 + 288 + #### queue hydration + ATProto token hardening (Nov 12) 289 + 290 + **why**: queue endpoints were occasionally taking 2s+ and restore operations could 401 291 + when multiple requests refreshed an expired ATProto token simultaneously. 292 + 293 + **what shipped**: 294 + - added persistent `image_url` on `Track` rows so queue hydration no longer probes R2 295 + for every track. Queue payloads now pull art directly from Postgres, with a one-time 296 + fallback for legacy rows. 297 + - updated `_internal/queue.py` to backfill any missing URLs once (with caching) instead 298 + of per-request GETs. 299 + - introduced per-session locks in `_refresh_session_tokens` so only one coroutine hits 300 + `oauth_client.refresh_session` at a time; others reuse the refreshed tokens. This 301 + removes the race that caused the batch restore flow to intermittently 500/401. 
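The per-session locking described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: `refresh_session_tokens`, `fake_refresh_session`, `_refresh_locks`, and `_tokens` are hypothetical stand-ins, since the real signatures of `_refresh_session_tokens` and `oauth_client.refresh_session` are not shown here. The idea is that the first coroutine to take the lock performs the refresh, and every later waiter re-checks the stored token and reuses it instead of refreshing again.

```python
import asyncio
from collections import defaultdict
from typing import Optional

# hypothetical module state: one lock per session id, plus a stand-in token store
_refresh_locks: defaultdict = defaultdict(asyncio.Lock)
_tokens: dict = {}
refresh_calls = 0  # counts how often the simulated PDS refresh actually runs


async def fake_refresh_session(session_id: str) -> str:
    """Stand-in for the real network call that refreshes tokens at the PDS."""
    global refresh_calls
    refresh_calls += 1
    await asyncio.sleep(0.01)  # simulate network latency
    return f"token-for-{session_id}-v{refresh_calls}"


async def refresh_session_tokens(session_id: str, stale_token: Optional[str]) -> str:
    """Refresh at most once per session, even with many concurrent callers."""
    async with _refresh_locks[session_id]:
        current = _tokens.get(session_id)
        # another coroutine may have refreshed while we waited for the lock;
        # if the stored token differs from the one we saw expire, reuse it
        if current is not None and current != stale_token:
            return current
        fresh = await fake_refresh_session(session_id)
        _tokens[session_id] = fresh
        return fresh


async def main() -> list:
    # ten concurrent requests all notice the same expired (here: missing) token
    return await asyncio.gather(
        *(refresh_session_tokens("session-a", None) for _ in range(10))
    )


results = asyncio.run(main())
```

Without the re-check inside the lock, the ten callers would still refresh one after another, which is the kind of redundant refresh the change was meant to remove. Keying the lock by session id keeps unrelated sessions from blocking each other.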
302 + 303 + **impact**: queue tail latency dropped back under 500 ms in staging tests, ATProto restore flows are now reliable under concurrent use, and Logfire no longer shows 500s from the PDS. 304 + 305 + --- 306 + 307 + #### performance optimization session (Nov 12) 308 + 309 + **issue: slow /tracks/liked endpoint** 310 + 311 + **symptoms**: 312 + - `/tracks/liked` taking 600-900ms consistently 313 + - only ~25ms spent in database queries 314 + - mysterious 575ms gap with no spans in Logfire traces 315 + 316 + **root cause**: 317 + - PR #184 added `image_url` column to tracks table to eliminate N+1 R2 API calls 318 + - legacy tracks (15 tracks uploaded before PR) had `image_url = NULL` 319 + - fallback code called `track.get_image_url()` which makes uninstrumented R2 `head_object` API calls 320 + - 5 tracks × 120ms = ~600ms of uninstrumented latency 321 + 322 + **solution**: created `scripts/backfill_image_urls.py` to populate missing `image_url` values 323 + 324 + **results**: 325 + - `/tracks/liked` now sub-200ms (down from 600-900ms) 326 + - all endpoints now consistently sub-second response times 327 + 328 + **database cleanup**: 329 + - discovered `queue_state` had 265% bloat (53 dead rows, 20 live rows) 330 + - ran `VACUUM (FULL, ANALYZE) queue_state` against production 331 + 332 + --- 333 + 334 + #### track detail pages (PR #164, Nov 12) 335 + 336 + - ✅ dedicated track detail pages with large cover art 337 + - ✅ play button updates queue state correctly (#169) 338 + - ✅ liked state loaded efficiently via server-side fetch 339 + - ✅ mobile-optimized layouts with proper scrolling constraints 340 + - ✅ origin validation for image URLs (#168) 341 + 342 + --- 343 + 344 + #### liked tracks feature (PR #157, Nov 11) 345 + 346 + - ✅ server-side persistent collections 347 + - ✅ ATProto record publication for cross-platform visibility 348 + - ✅ UI for adding/removing tracks from liked collection 349 + - ✅ like counts displayed in track responses and analytics 
(#170) 350 + - ✅ analytics cards now clickable links to track detail pages (#171) 351 + - ✅ liked state shown on artist page tracks (#163) 352 + 353 + **status**: COMPLETE (issue #144 closed) 354 + 355 + --- 356 + 357 + #### upload streaming + progress UX (PR #182, Nov 11) 358 + 359 + - Frontend switched from `fetch` to `XMLHttpRequest` so we can display upload progress 360 + toasts (critical for >50 MB mixes on mobile). 361 + - Upload form now clears only after the request succeeds; failed attempts leave the 362 + form intact so users don't lose metadata. 363 + - Backend writes uploads/images to temp files in 8 MB chunks before handing them to the 364 + storage layer, eliminating whole-file buffering and iOS crashes for hour-long mixes. 365 + - Deployment verified locally and by rerunning the exact repro Stella hit (85 minute 366 + mix from mobile). 367 + 368 + --- 369 + 370 + #### transcoder API deployment (PR #156, Nov 11) 371 + 372 + **standalone Rust transcoding service** 🎉 373 + - **deployed**: https://plyr-transcoder.fly.dev/ 374 + - **purpose**: convert AIFF/FLAC/etc. 
to MP3 for browser compatibility 375 + - **technology**: Axum + ffmpeg + Docker 376 + - **security**: `X-Transcoder-Key` header authentication (shared secret) 377 + - **capacity**: handles 1GB uploads, tested with 85-minute AIFF files (~858MB → 195MB MP3 in 32 seconds) 378 + - **architecture**: 379 + - 2 Fly machines for high availability 380 + - auto-stop/start for cost efficiency 381 + - stateless design (no R2 integration yet) 382 + - 320kbps MP3 output with proper ID3 tags 383 + - **status**: deployed and tested, ready for integration into plyr.fm upload pipeline 384 + - **next steps**: wire into backend with R2 integration and job queue (see issue #153) 385 + 386 + --- 387 + 388 + #### AIFF/AIF browser compatibility fix (PR #152, Nov 11) 389 + 390 + **format validation improvements** 391 + - **problem discovered**: AIFF/AIF files only work in Safari, not Chrome/Firefox 392 + - browsers throw `MediaError code 4: MEDIA_ERR_SRC_NOT_SUPPORTED` 393 + - users could upload files but they wouldn't play in most browsers 394 + - **immediate solution**: reject AIFF/AIF uploads at both backend and frontend 395 + - removed AIFF/AIF from AudioFormat enum 396 + - added format hints to upload UI: "supported: mp3, wav, m4a" 397 + - client-side validation with helpful error messages 398 + - **long-term solution**: deployed standalone transcoder service (see above) 399 + - separate Rust/Axum service with ffmpeg 400 + - accepts all formats, converts to browser-compatible MP3 401 + - integration into upload pipeline pending (issue #153) 402 + 403 + **observability improvements**: 404 + - added logfire instrumentation to upload background tasks 405 + - added logfire spans to R2 storage operations 406 + - documented logfire querying patterns in `docs/logfire-querying.md` 407 + 408 + --- 409 + 410 + #### async I/O performance fixes (PRs #149-151, Nov 10-11) 411 + 412 + Eliminated event loop blocking across backend with three critical PRs: 413 + 414 + 1. 
**PR #149: async R2 reads** - converted R2 `head_object` operations from sync boto3 to async aioboto3 415 + - portal page load time: 2+ seconds → ~200ms 416 + - root cause: `track.image_url` was blocking on serial R2 HEAD requests 417 + 418 + 2. **PR #150: concurrent PDS resolution** - parallelized ATProto PDS URL lookups 419 + - homepage load time: 2-6 seconds → 200-400ms 420 + - root cause: serial `resolve_atproto_data()` calls (8 artists × 200-300ms each) 421 + - fix: `asyncio.gather()` for batch resolution, database caching for subsequent loads 422 + 423 + 3. **PR #151: async storage writes/deletes** - made save/delete operations non-blocking 424 + - R2: switched to `aioboto3` for uploads/deletes (async S3 operations) 425 + - filesystem: used `anyio.Path` and `anyio.open_file()` for chunked async I/O (64KB chunks) 426 + - impact: multi-MB uploads no longer monopolize worker thread, constant memory usage 427 + 428 + --- 429 + 430 + #### mobile UI improvements (PRs #159-185, Nov 11-12) 431 + 432 + - ✅ compact action menus and better navigation (#161) 433 + - ✅ improved mobile responsiveness (#159) 434 + - ✅ consistent button layouts across mobile/desktop (#176-181, #185) 435 + - ✅ always show play count and like count on mobile (#177) 436 + - ✅ login page UX improvements (#174-175) 437 + - ✅ liked page UX improvements (#173) 438 + - ✅ accent color for liked tracks (#160) 439 + 440 + --- 441 + 442 + ### October-November 2025 (early development) 443 + 444 + #### cover art support (PRs #123-126, #132-139, early Nov) 445 + - ✅ track cover image upload and storage (separate R2 bucket) 446 + - ✅ image display on track pages and player 447 + - ✅ Open Graph meta tags for track sharing 448 + - ✅ mobile-optimized layouts with cover art 449 + - ✅ sticky bottom player on mobile with cover 450 + 451 + --- 452 + 453 + #### queue management improvements (PRs #110-113, #115, late Oct-early Nov) 454 + - ✅ visual feedback on queue add/remove 455 + - ✅ toast notifications for queue 
actions 456 + - ✅ better error handling for queue operations 457 + - ✅ improved shuffle and auto-advance UX 458 + 459 + --- 460 + 461 + #### infrastructure and tooling (Oct-Nov) 462 + - ✅ R2 bucket separation: audio-prod and images-prod (PR #124) 463 + - ✅ admin script for content moderation (`scripts/delete_track.py`) 464 + - ✅ bluesky attribution link in header 465 + - ✅ changelog target added (#183) 466 + - ✅ documentation updates (#158) 467 + - ✅ track metadata edits now persist correctly (#162) 468 + 136 469 ## immediate priorities 137 470 138 471 ### high priority features ··· 147 480 - maintain original file hash for deduplication 148 481 - handle transcoding failures gracefully 149 482 150 - ### critical bugs 151 - 1. **upload reliability** (issue #147): upload returns 200 but file missing from R2, no error logged 152 - - priority: high (data loss risk) 153 - - need better error handling and retry logic in background upload task 483 + ### resolved bugs 484 + 1. ~~**upload reliability** (issue #147): upload returns 200 but file missing from R2, no error logged~~ 485 + - **status**: FIXED (issue #147 closed) 486 + - improved error handling and retry logic in background upload task 154 487 155 488 2. **database connection pool SSL errors**: intermittent failures on first request 156 489 - symptom: `/tracks/` returns 500 on first request, succeeds after 157 490 - fix: set `pool_pre_ping=True`, adjust `pool_recycle` for Neon timeouts 158 491 - documented in `docs/logfire-querying.md` 159 492 160 - --- 493 + ### performance optimizations 494 + 3. **persist concrete file extensions in database**: currently brute-force probing all supported formats on read 495 + - already know `Track.file_type` and image format during upload 496 + - eliminating repeated `exists()` checks reduces filesystem/R2 HEAD spam 497 + - improves audio streaming latency (`/audio/{file_id}` endpoint walks extensions sequentially) 498 + 499 + 4. 
**stream large uploads directly to storage**: current implementation reads entire file into memory before background task 500 + - multi-GB uploads risk OOM 501 + - stream from `UploadFile.file` → storage backend for constant memory usage 502 + 503 + ### new features 504 + 5. **content-addressable storage** (issue #146) 505 + - hash-based file storage for automatic deduplication 506 + - reduces storage costs when multiple artists upload same file 507 + - enables content verification 508 + 509 + ## open issues by timeline 510 + 511 + ### immediate 512 + - issue #153: audio transcoding pipeline (ffmpeg worker for AIFF/FLAC→MP3) 513 + 514 + ### short-term 515 + - issue #146: content-addressable storage (hash-based deduplication) 516 + - issue #24: implement play count abuse prevention 517 + - database connection pool tuning (SSL errors) 518 + - file extension persistence in database 519 + 520 + ### medium-term 521 + - issue #208: security - medium priority hardening tasks 522 + - issue #207: security - add comprehensive input validation 523 + - issue #46: consider removing init_db() from lifespan in favor of migration-only approach 524 + - issue #56: design public developer API and versioning 525 + - **note**: SDK (`plyrfm`) and MCP server (`plyrfm-mcp`) now available at https://github.com/zzstoatzz/plyr-python-client 526 + - `plyrfm` on PyPI - Python SDK + CLI for plyr.fm API 527 + - `plyrfm-mcp` on PyPI - MCP server, hosted at https://plyrfm.fastmcp.app/mcp 528 + - issue still open for formal API versioning and public documentation 529 + - issue #57: support multiple audio item types (voice memos/snippets) 530 + - issue #122: fullscreen player for immersive playback 531 + - issue #155: add track metadata (genres, tags, descriptions) 532 + - issue #166: content moderation for user-uploaded images 533 + - issue #167: DMCA safe harbor compliance 534 + - issue #186: liquid glass effects as user-configurable setting 535 + - issue #221: first-class albums (ATProto records) 
536 + - issue #334: add 'share to bluesky' option for tracks 537 + - issue #373: lyrics field and Genius-style annotations 538 + - issue #393: moderation - represent confirmed takedown state in labeler 539 + 540 + ### long-term 541 + - migrate to plyr-owned lexicon (custom ATProto namespace with richer metadata) 542 + - publish to multiple ATProto AppViews for cross-platform visibility 543 + - explore ATProto-native notifications (replace Bluesky DM bot) 544 + - realtime queue syncing across devices via SSE/WebSocket 545 + - artist analytics dashboard improvements 546 + - issue #44: modern music streaming feature parity 161 547 162 548 ## technical state 163 549 164 - ### what's working 550 + ### architecture 551 + 552 + **backend** 553 + - language: Python 3.11+ 554 + - framework: FastAPI with uvicorn 555 + - database: Neon PostgreSQL (serverless, fully managed) 556 + - storage: Cloudflare R2 (S3-compatible object storage) 557 + - hosting: Fly.io (2x shared-cpu VMs, auto-scaling) 558 + - observability: Pydantic Logfire (traces, metrics, logs) 559 + - auth: ATProto OAuth 2.1 (forked SDK: github.com/zzstoatzz/atproto) 560 + 561 + **frontend** 562 + - framework: SvelteKit (latest v2.43.2) 563 + - runtime: Bun (fast JS runtime) 564 + - hosting: Cloudflare Pages (edge network) 565 + - styling: vanilla CSS with lowercase aesthetic 566 + - state management: Svelte 5 runes ($state, $derived, $effect) 567 + 568 + **deployment** 569 + - ci/cd: GitHub Actions 570 + - backend: automatic on main branch merge (fly.io deploy) 571 + - frontend: automatic on every push to main (cloudflare pages) 572 + - migrations: automated via fly.io release_command 573 + - environments: dev → staging → production (full separation) 574 + - versioning: nebula timestamp format (YYYY.MMDD.HHMMSS) 575 + 576 + **key dependencies** 577 + - atproto: forked SDK for OAuth and record management 578 + - sqlalchemy: async ORM for postgres 579 + - alembic: database migrations 580 + - boto3/aioboto3: R2 
storage client 581 + - logfire: observability (FastAPI + SQLAlchemy instrumentation) 582 + - httpx: async HTTP client 583 + 584 + **what's working** 165 585 166 586 **core functionality** 167 587 - ✅ ATProto OAuth 2.1 authentication with encrypted state ··· 187 607 - ✅ image URL caching in database (eliminates N+1 R2 calls) 188 608 - ✅ format validation (rejects AIFF/AIF, accepts MP3/WAV/M4A with helpful error messages) 189 609 - ✅ standalone audio transcoding service deployed and verified (see issue #153) 610 + - ✅ Bluesky embed player UI changes implemented (pending upstream social-app PR) 190 611 - ✅ admin content moderation script for removing inappropriate uploads 191 612 - ✅ copyright moderation system (AuDD fingerprinting, review workflow, violation tracking) 192 613 - ✅ ATProto labeler for copyright violations (queryLabels, subscribeLabels XRPC endpoints) ··· 202 623 - ✅ long album title handling (100-char slugs, CSS truncation) 203 624 - ⏸ ATProto records for albums (deferred, see issue #221) 204 625 626 + **frontend architecture** 627 + - ✅ server-side data loading (`+page.server.ts`) for artist and album pages 628 + - ✅ client-side data loading (`+page.ts`) for auth-dependent pages 629 + - ✅ centralized auth manager (`lib/auth.svelte.ts`) 630 + - ✅ layout-level auth state (`+layout.ts`) shared across all pages 631 + - ✅ eliminated "flash of loading" via proper load functions 632 + - ✅ consistent auth patterns (no scattered localStorage calls) 633 + 205 634 **deployment (fully automated)** 206 635 - **production**: 207 636 - frontend: https://plyr.fm (cloudflare pages) ··· 217 646 - storage: cloudflare R2 (audio-stg bucket) 218 647 - deploy: push to main → automatic 219 648 649 + - **development**: 650 + - backend: localhost:8000 651 + - frontend: localhost:5173 652 + - database: neon postgresql (relay-dev) 653 + - storage: cloudflare R2 (audio-dev and images-dev buckets) 654 + 655 + - **developer tooling**: 656 + - `just serve` - run backend locally 657 
+ - `just dev` - run frontend locally 658 + - `just test` - run test suite 659 + - `just release` - create production release (backend + frontend) 660 + - `just release-frontend-only` - deploy only frontend changes (added Nov 13) 661 + 662 + ### what's in progress 663 + 664 + **immediate work** 665 + - investigating playback auto-start behavior (#225) 666 + - page refresh sometimes starts playing immediately 667 + - may be related to queue state restoration or localStorage caching 668 + - `autoplay_next` preference not being respected in all cases 669 + - liquid glass effects as user-configurable setting (#186) 670 + 671 + **active research** 672 + - transcoding pipeline architecture (see sandbox/transcoding-pipeline-plan.md) 673 + - content moderation systems (#166, #167, #393 - takedown state representation) 674 + - PWA capabilities and offline support (#165) 675 + 220 676 ### known issues 221 677 222 678 **player behavior** ··· 231 687 - no AIFF/AIF transcoding support (#153) 232 688 - no PWA installation prompts (#165) 233 689 - no fullscreen player view (#122) 234 - - no public API for third-party integrations (#56) 690 + 691 + **technical debt** 692 + - multi-tab playback synchronization could be more robust 693 + - queue state conflicts can occur with rapid operations 694 + 695 + ### technical decisions 696 + 697 + **why Python/FastAPI instead of Rust?** 698 + - rapid prototyping velocity during MVP phase 699 + - rich ecosystem for web APIs (fastapi, sqlalchemy, pydantic) 700 + - excellent async support with asyncio 701 + - lower barrier to contribution 702 + - trade-off: accepting higher latency for faster development 703 + - future: can migrate hot paths to Rust if needed (transcoding service already planned) 704 + 705 + **why Fly.io instead of AWS/GCP?** 706 + - simple deployment model (dockerfile → production) 707 + - automatic SSL/TLS certificates 708 + - built-in global load balancing 709 + - reasonable pricing for MVP ($5/month) 710 + - easy migration 
path to larger providers later 711 + - trade-off: vendor-specific features, less control 712 + 713 + **why Cloudflare R2 instead of S3?** 714 + - zero egress fees (critical for audio streaming) 715 + - S3-compatible API (easy migration if needed) 716 + - integrated CDN for fast delivery 717 + - significantly cheaper than S3 for bandwidth-heavy workloads 718 + 719 + **why forked atproto SDK?** 720 + - upstream SDK lacked OAuth 2.1 support 721 + - needed custom record management patterns 722 + - maintains compatibility with ATProto spec 723 + - contributes improvements back when possible 724 + 725 + **why SvelteKit instead of React/Next.js?** 726 + - Svelte 5 runes provide excellent reactivity model 727 + - smaller bundle sizes (critical for mobile) 728 + - less boilerplate than React 729 + - SSR + static generation flexibility 730 + - modern DX with TypeScript 235 731 236 - --- 732 + **why Neon instead of self-hosted Postgres?** 733 + - serverless autoscaling (no capacity planning) 734 + - branch-per-PR workflow (preview databases) 735 + - automatic backups and point-in-time recovery 736 + - generous free tier for MVP 737 + - trade-off: higher latency than co-located DB, but acceptable 738 + 739 + **why reject AIFF instead of transcoding immediately?** 740 + - MVP speed: transcoding requires queue infrastructure, ffmpeg setup, error handling 741 + - user communication: better to be upfront about limitations than silent failures 742 + - resource management: transcoding is CPU-intensive, needs proper worker architecture 743 + - future flexibility: can add transcoding as optional feature (high-quality uploads → MP3 delivery) 744 + - trade-off: some users can't upload AIFF now, but those who can upload MP3 have working experience 745 + 746 + **why async everywhere?** 747 + - event loop performance: single-threaded async handles high concurrency 748 + - I/O-bound workload: most time spent waiting on network/disk 749 + - recent work (PRs #149-151) eliminated all blocking 
operations 750 + - alternative: thread pools for blocking I/O, but increases complexity 751 + - trade-off: debugging async code harder than sync, but worth throughput gains 752 + 753 + **why anyio.Path over thread pools?** 754 + - true async I/O: `anyio` uses OS-level async file operations where available 755 + - constant memory: chunked reads/writes (64KB) prevent OOM on large files 756 + - thread pools: would work but less efficient, more context switching 757 + - trade-off: anyio API slightly different from stdlib `pathlib`, but cleaner async semantics 237 758 238 759 ## cost structure 239 760 ··· 273 794 - storage used: <1GB R2 274 795 - database size: <10MB postgres 275 796 797 + ## next session prep 798 + 799 + **context for new agent:** 800 + 1. Fixed R2 image upload path mismatch, ensuring images save with the correct prefix. 801 + 2. Implemented UI changes for the embed player: removed the Queue button and matched fonts to the main app. 802 + 3. Opened a draft PR to the upstream social-app repository for native Plyr.fm embed support. 803 + 4. Updated issue #153 (transcoding pipeline) with a clear roadmap for integration into the backend. 804 + 5. Developed a local verification script for the transcoder service for faster local iteration. 
805 + 806 + **useful commands:** 807 + - `just backend run` - run backend locally 808 + - `just frontend dev` - run frontend locally 809 + - `just test` - run test suite (from `backend/` directory) 810 + - `gh issue list` - check open issues 811 + ## admin tooling 812 + 813 + ### content moderation 814 + script: `scripts/delete_track.py` 815 + - requires `ADMIN_*` prefixed environment variables 816 + - deletes audio file from R2 817 + - deletes cover image from R2 (if exists) 818 + - deletes database record (cascades to likes and queue entries) 819 + - notes ATProto records for manual cleanup (can't delete from other users' PDS) 820 + 821 + usage: 822 + ```bash 823 + # dry run 824 + uv run scripts/delete_track.py <track_id> --dry-run 825 + 826 + # delete with confirmation 827 + uv run scripts/delete_track.py <track_id> 828 + 829 + # delete without confirmation 830 + uv run scripts/delete_track.py <track_id> --yes 831 + 832 + # by URL 833 + uv run scripts/delete_track.py --url https://plyr.fm/track/34 834 + ``` 835 + 836 + required environment variables: 837 + - `ADMIN_DATABASE_URL` - production database connection 838 + - `ADMIN_AWS_ACCESS_KEY_ID` - R2 access key 839 + - `ADMIN_AWS_SECRET_ACCESS_KEY` - R2 secret 840 + - `ADMIN_R2_ENDPOINT_URL` - R2 endpoint 841 + - `ADMIN_R2_BUCKET` - R2 bucket name 842 + 843 + ## known issues 844 + 845 + ### non-blocking 846 + - cloudflare pages preview URLs return 404 (production works fine) 847 + - some "relay" references remain in docs and comments 848 + - ATProto like records can't be deleted when removing tracks (orphaned on users' PDS) 849 + 850 + ## for new contributors 851 + 852 + ### getting started 853 + 1. clone: `gh repo clone zzstoatzz/plyr.fm` 854 + 2. install dependencies: `uv sync && cd frontend && bun install` 855 + 3. run backend: `uv run uvicorn backend.main:app --reload` 856 + 4. run frontend: `cd frontend && bun run dev` 857 + 5. visit http://localhost:5173 858 + 859 + ### development workflow 860 + 1. 
create issue on github 861 + 2. create PR from feature branch 862 + 3. ensure pre-commit hooks pass 863 + 4. test locally 864 + 5. merge to main → deploys to staging automatically 865 + 6. verify on staging 866 + 7. create github release → deploys to production automatically 867 + 868 + ### key principles 869 + - type hints everywhere 870 + - lowercase aesthetic 871 + - generic terminology (use "items" not "tracks" where appropriate) 872 + - ATProto first 873 + - mobile matters 874 + - cost conscious 875 + - async everywhere (no blocking I/O) 876 + 877 + ### project structure 878 + ``` 879 + plyr.fm/ 880 + ├── backend/ # FastAPI app & Python tooling 881 + │ ├── src/backend/ # application code 882 + │ │ ├── api/ # public endpoints 883 + │ │ ├── _internal/ # internal services 884 + │ │ ├── models/ # database schemas 885 + │ │ └── storage/ # storage adapters 886 + │ ├── tests/ # pytest suite 887 + │ └── alembic/ # database migrations 888 + ├── frontend/ # SvelteKit app 889 + │ ├── src/lib/ # components & state 890 + │ └── src/routes/ # pages 891 + ├── moderation/ # Rust moderation service (ATProto labeler) 892 + │ ├── src/ # Axum handlers, AuDD client, label signing 893 + │ └── static/ # admin UI (html/css/js) 894 + ├── transcoder/ # Rust audio transcoding service 895 + ├── docs/ # documentation 896 + └── justfile # task runner (mods: backend, frontend, moderation, transcoder) 897 + ``` 898 + 899 + ## documentation 900 + 901 + - [deployment overview](docs/deployment/overview.md) 902 + - [configuration guide](docs/configuration.md) 903 + - [queue design](docs/queue-design.md) 904 + - [logfire querying](docs/logfire-querying.md) 905 + - [pdsx guide](docs/pdsx-guide.md) 906 + - [neon mcp guide](docs/neon-mcp-guide.md) 907 + 276 908 --- 277 909 278 - this is a living document. last updated 2025-12-02 after status maintenance. 910 + this is a living document. last updated 2025-12-01.
update.wav

This is a binary file and will not be displayed.