docs: update moderation docs with ATProto labeler architecture (#386)

- Update overview.md with current architecture diagram and admin UI options
- Update copyright-detection.md with actual schemas and label querying
- Add atproto-labeler.md with full labeler service documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>

authored by zzstoatzz.io Claude and committed by GitHub e3b5a5a2 04900332

Changed files
+515 -71
docs
+276
docs/moderation/atproto-labeler.md
··· 1 + # ATProto labeler service 2 + 3 + technical documentation for the moderation service's ATProto labeling capabilities. 4 + 5 + ## overview 6 + 7 + the moderation service (`moderation.plyr.fm`) acts as an ATProto labeler - a service that produces signed labels about content. labels are metadata objects that follow the `com.atproto.label.defs#label` schema and can be queried by any ATProto-compatible app. 8 + 9 + key distinction: **labels are signed data objects, not repository records**. they don't live in a user's repo - they're served directly by the labeler via XRPC endpoints. 10 + 11 + ## why labels? 12 + 13 + from [Bluesky's labeling architecture](https://docs.bsky.app/docs/advanced-guides/moderation): 14 + 15 + > "Labels are assertions made about content or accounts. They don't enforce anything on their own - clients decide how to interpret them." 16 + 17 + this enables **stackable moderation**: multiple labelers can label the same content, and clients can choose which labelers to trust and how to handle different label values. 
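a minimal sketch of what "stackable moderation" could look like on the consuming side, assuming a client keeps its own allowlist of labeler DIDs and treats a later `neg: true` label as revoking an earlier one. the `Label` class and `effective_labels` helper are hypothetical names for illustration, not part of plyr.fm or any ATProto SDK, and ordering by `cts` assumes all timestamps share one ISO 8601 format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    src: str           # DID of the labeler that issued the label
    uri: str           # AT URI of the labeled content
    val: str           # label value, e.g. "copyright-violation"
    cts: str           # creation timestamp (ISO 8601, same format assumed throughout)
    neg: bool = False  # True when this label revokes an earlier one


def effective_labels(labels, trusted_sources):
    """Return the (src, uri, val) triples still in force from trusted labelers."""
    active = set()
    # process oldest first so a later negation can revoke an earlier label
    for label in sorted(labels, key=lambda item: item.cts):
        if label.src not in trusted_sources:
            continue  # the client ignores labelers it has not chosen to trust
        key = (label.src, label.uri, label.val)
        if label.neg:
            active.discard(key)
        else:
            active.add(key)
    return active
```

what to do with the surviving labels (hide, warn, ignore) stays a client-side policy decision, which is the point of the quote above.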
18 +
19 + for plyr.fm, this means:
20 + - we produce `copyright-violation` labels when tracks are flagged
21 + - other ATProto apps can query our labels and apply their own policies
22 + - users/apps can choose to subscribe to our labeler or ignore it
23 + - we can revoke labels by emitting negations (`neg: true`)
24 +
25 + ## architecture
26 +
27 + ```
28 + ┌─────────────────────────────────────────────────────────────────┐
29 + │                       moderation service                        │
30 + │                      (moderation.plyr.fm)                       │
31 + ├─────────────────────────────────────────────────────────────────┤
32 + │                                                                 │
33 + │  ┌─────────────┐    ┌─────────────┐    ┌─────────────────────┐  │
34 + │  │   /scan     │    │ /emit-label │    │ /xrpc/com.atproto.  │  │
35 + │  │  endpoint   │    │  endpoint   │    │ label.queryLabels   │  │
36 + │  └──────┬──────┘    └──────┬──────┘    └──────────┬──────────┘  │
37 + │         │                  │                      │             │
38 + │         ▼                  ▼                      ▼             │
39 + │  ┌─────────────┐    ┌─────────────┐    ┌─────────────────────┐  │
40 + │  │    AuDD     │    │    sign     │    │   query labels      │  │
41 + │  │   client    │    │    label    │    │   from postgres     │  │
42 + │  └─────────────┘    └─────────────┘    └─────────────────────┘  │
43 + │                            │                                    │
44 + │                            ▼                                    │
45 + │                     ┌─────────────┐                             │
46 + │                     │   labels    │                             │
47 + │                     │    table    │                             │
48 + │                     └─────────────┘                             │
49 + │                                                                 │
50 + └─────────────────────────────────────────────────────────────────┘
51 + ```
52 +
53 + ## endpoints
54 +
55 + ### POST /scan
56 +
57 + scans audio for copyright matches via AuDD.
58 +
59 + ```bash
60 + curl -X POST https://moderation.plyr.fm/scan \
61 +   -H "X-Moderation-Key: $MODERATION_AUTH_TOKEN" \
62 +   -H "Content-Type: application/json" \
63 +   -d '{"audio_url": "https://r2.plyr.fm/audio/abc123.mp3"}'
64 + ```
65 +
66 + response:
67 +
68 + ```json
69 + {
70 +   "matches": [
71 +     {
72 +       "artist": "Taylor Swift",
73 +       "title": "Love Story",
74 +       "score": 95,
75 +       "isrc": "USRC10701234"
76 +     }
77 +   ],
78 +   "is_flagged": true,
79 +   "highest_score": 95,
80 +   "raw_response": { ... }
81 + }
82 + ```
83 +
84 + ### POST /emit-label
85 +
86 + creates a signed ATProto label.
87 + 88 + ```bash 89 + curl -X POST https://moderation.plyr.fm/emit-label \ 90 + -H "X-Moderation-Key: $MODERATION_AUTH_TOKEN" \ 91 + -H "Content-Type: application/json" \ 92 + -d '{ 93 + "uri": "at://did:plc:abc123/fm.plyr.track/xyz789", 94 + "val": "copyright-violation", 95 + "cid": "bafyreiabc123" 96 + }' 97 + ``` 98 + 99 + the service: 100 + 1. creates label with current timestamp 101 + 2. signs with labeler's secp256k1 private key (DAG-CBOR encoded) 102 + 3. stores in `labels` table with monotonic sequence number 103 + 104 + ### GET /xrpc/com.atproto.label.queryLabels 105 + 106 + standard ATProto XRPC endpoint for querying labels. 107 + 108 + ```bash 109 + # query by URI pattern 110 + curl "https://moderation.plyr.fm/xrpc/com.atproto.label.queryLabels?uriPatterns=at://did:plc:*" 111 + 112 + # query by source (labeler DID) 113 + curl "https://moderation.plyr.fm/xrpc/com.atproto.label.queryLabels?sources=did:plc:plyr-labeler" 114 + 115 + # query by cursor (pagination) 116 + curl "https://moderation.plyr.fm/xrpc/com.atproto.label.queryLabels?cursor=123&limit=50" 117 + ``` 118 + 119 + response: 120 + 121 + ```json 122 + { 123 + "cursor": "456", 124 + "labels": [ 125 + { 126 + "ver": 1, 127 + "src": "did:plc:plyr-labeler", 128 + "uri": "at://did:plc:abc123/fm.plyr.track/xyz789", 129 + "cid": "bafyreiabc123", 130 + "val": "copyright-violation", 131 + "neg": false, 132 + "cts": "2025-11-30T12:00:00.000Z", 133 + "sig": "base64-encoded-secp256k1-signature" 134 + } 135 + ] 136 + } 137 + ``` 138 + 139 + ## label signing 140 + 141 + labels are signed using DAG-CBOR serialization with secp256k1 keys (same as ATProto repo commits). 142 + 143 + signing process: 144 + 1. construct label object without `sig` field 145 + 2. encode as DAG-CBOR (deterministic CBOR) 146 + 3. compute SHA-256 hash of encoded bytes 147 + 4. sign hash with labeler's secp256k1 private key 148 + 5. 
attach signature as `sig` field 149 + 150 + this allows any client to verify labels came from our labeler by checking the signature against our public key (in our DID document). 151 + 152 + ## label values 153 + 154 + current supported values: 155 + 156 + | val | meaning | when emitted | 157 + |-----|---------|--------------| 158 + | `copyright-violation` | track flagged for potential copyright infringement | scan returns matches | 159 + 160 + future values could include: 161 + - `explicit` - explicit content marker 162 + - `spam` - suspected spam upload 163 + - `dmca-takedown` - formal DMCA notice received 164 + 165 + ## negation labels 166 + 167 + to revoke a label, emit the same label with `neg: true`: 168 + 169 + ```json 170 + { 171 + "uri": "at://did:plc:abc123/fm.plyr.track/xyz789", 172 + "val": "copyright-violation", 173 + "neg": true 174 + } 175 + ``` 176 + 177 + use cases: 178 + - false positive resolved after manual review 179 + - artist provided proof of licensing 180 + - DMCA counter-notice accepted 181 + 182 + ## database schema 183 + 184 + ```sql 185 + CREATE TABLE labels ( 186 + id BIGSERIAL PRIMARY KEY, 187 + seq BIGSERIAL UNIQUE NOT NULL, -- monotonic for subscribeLabels cursor 188 + src TEXT NOT NULL, -- labeler DID 189 + uri TEXT NOT NULL, -- target AT URI 190 + cid TEXT, -- optional target CID 191 + val TEXT NOT NULL, -- label value 192 + neg BOOLEAN NOT NULL DEFAULT FALSE, -- negation flag 193 + cts TIMESTAMPTZ NOT NULL, -- creation timestamp 194 + exp TIMESTAMPTZ, -- optional expiration 195 + sig BYTEA NOT NULL, -- signature bytes 196 + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW() 197 + ); 198 + 199 + CREATE INDEX idx_labels_uri ON labels(uri); 200 + CREATE INDEX idx_labels_src ON labels(src); 201 + CREATE INDEX idx_labels_seq ON labels(seq); 202 + CREATE INDEX idx_labels_val ON labels(val); 203 + ``` 204 + 205 + ## deployment 206 + 207 + the moderation service runs on Fly.io: 208 + 209 + ```bash 210 + # deploy 211 + cd moderation && fly 
deploy 212 + 213 + # check logs 214 + fly logs -a plyr-moderation 215 + 216 + # secrets 217 + fly secrets set -a plyr-moderation \ 218 + LABELER_DID=did:plc:xxx \ 219 + LABELER_SIGNING_KEY=hex-private-key \ 220 + DATABASE_URL=postgres://... \ 221 + AUDD_API_KEY=xxx \ 222 + MODERATION_AUTH_TOKEN=xxx 223 + ``` 224 + 225 + ## integration with backend 226 + 227 + the backend calls the moderation service in two places: 228 + 229 + 1. **scan on upload** (`_internal/moderation.py:scan_track_for_copyright`) 230 + - POST to `/scan` with R2 URL 231 + - store result in `copyright_scans` table 232 + 233 + 2. **emit label on flag** (`_internal/moderation.py:_store_scan_result`) 234 + - if `is_flagged` and track has `atproto_record_uri` 235 + - POST to `/emit-label` with track's AT URI and CID 236 + 237 + ```python 238 + async def _emit_copyright_label(uri: str, cid: str | None) -> None: 239 + async with httpx.AsyncClient(timeout=10.0) as client: 240 + await client.post( 241 + f"{settings.moderation.labeler_url}/emit-label", 242 + json={"uri": uri, "val": "copyright-violation", "cid": cid}, 243 + headers={"X-Moderation-Key": settings.moderation.auth_token}, 244 + ) 245 + ``` 246 + 247 + ## troubleshooting 248 + 249 + ### label not appearing in queries 250 + 251 + 1. check moderation service logs for emit errors 252 + 2. verify track has `atproto_record_uri` set 253 + 3. query labels table directly: 254 + ```sql 255 + SELECT * FROM labels WHERE uri LIKE '%track_rkey%'; 256 + ``` 257 + 258 + ### signature verification failing 259 + 260 + 1. ensure `LABELER_SIGNING_KEY` matches DID document's public key 261 + 2. check DAG-CBOR encoding is deterministic 262 + 3. verify hash algorithm is SHA-256 263 + 264 + ### scan returning empty matches 265 + 266 + AuDD requires actual audio fingerprints. 
common issues: 267 + - audio too short (< 3 seconds usable) 268 + - microphone recordings don't match source audio 269 + - very low bitrate or corrupted files 270 + 271 + ## references 272 + 273 + - [ATProto Labeling Spec](https://atproto.com/specs/label) 274 + - [Bluesky Moderation Guide](https://docs.bsky.app/docs/advanced-guides/moderation) 275 + - [DAG-CBOR Spec](https://ipld.io/specs/codecs/dag-cbor/spec/) 276 + - [AuDD API Docs](https://docs.audd.io/)
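the first three steps of the signing process described in atproto-labeler.md (drop `sig`, encode deterministically, hash with SHA-256) can be sketched with only the standard library. this is an illustration under stated assumptions, not the Rust service's actual code: `encode` is a hand-rolled encoder covering just the value types a label uses (not a full DAG-CBOR implementation), `signing_digest` is a hypothetical helper name, and dropping `None` fields is an assumption about how optional fields such as `cid` are handled. step 4, the secp256k1 signature over the digest, is left out because it needs a third-party crypto library.

```python
import hashlib


def _head(major: int, n: int) -> bytes:
    """Encode a CBOR head (major type + length/value) in its shortest form."""
    if n < 24:
        return bytes([(major << 5) | n])
    for extra, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([(major << 5) | extra]) + n.to_bytes(size, "big")
    raise ValueError("integer too large for CBOR")


def encode(value) -> bytes:
    """Deterministically encode bools, non-negative ints, strings and string-keyed maps."""
    if isinstance(value, bool):  # checked before int because bool subclasses int
        return b"\xf5" if value else b"\xf4"
    if isinstance(value, int) and value >= 0:
        return _head(0, value)
    if isinstance(value, str):
        raw = value.encode("utf-8")
        return _head(3, len(raw)) + raw
    if isinstance(value, dict):
        # DAG-CBOR orders map keys by encoded length first, then bytewise
        items = sorted(
            value.items(),
            key=lambda kv: (len(kv[0].encode("utf-8")), kv[0].encode("utf-8")),
        )
        return _head(5, len(items)) + b"".join(encode(k) + encode(v) for k, v in items)
    raise TypeError(f"unsupported type: {type(value).__name__}")


def signing_digest(label: dict) -> bytes:
    """Return the SHA-256 digest of the label encoded without its `sig` field."""
    unsigned = {k: v for k, v in label.items() if k != "sig" and v is not None}
    return hashlib.sha256(encode(unsigned)).digest()
```

because the encoding sorts keys, the digest does not depend on the order in which the label's fields were assembled, which is what the "check DAG-CBOR encoding is deterministic" troubleshooting step is getting at.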
+115 -63
docs/moderation/copyright-detection.md
···
8 8 upload completes
9 9        │
10 10        ▼
11 - ┌──────────────┐     ┌─────────────┐     ┌──────────────┐
12 - │   backend    │────▶│  AuDD API   │────▶│   database   │
13 - │ (background) │     │             │     │ (copyright_  │
14 - │              │◀────│             │     │   flags)     │
15 - └──────────────┘     └─────────────┘     └──────────────┘
16 -                             │
17 -                             ▼
18 -                     music recognition
19 -                     against licensed
20 -                         database
11 + ┌──────────────┐     ┌─────────────────┐     ┌─────────────┐
12 + │   backend    │────▶│   moderation    │────▶│  AuDD API   │
13 + │ (background) │     │     service     │     │             │
14 + │              │◀────│     (Rust)      │◀────│             │
15 + └──────────────┘     └─────────────────┘     └─────────────┘
16 +        │                      │
17 +        │                      │ if flagged
18 +        ▼                      ▼
19 + ┌──────────────┐     ┌─────────────────┐
20 + │  copyright_  │     │  ATProto label  │
21 + │ scans table  │     │    emission     │
22 + └──────────────┘     └─────────────────┘
21 23 ```
22 24
23 25 1. track upload completes, file stored in R2
24 - 2. background job sends R2 URL to AuDD API
25 - 3. AuDD scans file against their music database
26 - 4. results stored in `copyright_flags` table
27 - 5. admin can query flagged tracks
26 + 2. backend calls moderation service `/scan` endpoint with R2 URL
27 + 3. moderation service calls AuDD API for music recognition
28 + 4. results returned to backend, stored in `copyright_scans` table
29 + 5.
if flagged, backend calls `/emit-label` to create ATProto label 30 + 6. label stored in moderation service's `labels` table 28 31 29 32 ## AuDD API 30 33 ··· 81 84 82 85 ## database schema 83 86 87 + ### backend: copyright_scans table 88 + 84 89 ```sql 85 - CREATE TABLE copyright_flags ( 90 + CREATE TABLE copyright_scans ( 86 91 id SERIAL PRIMARY KEY, 87 92 track_id INTEGER NOT NULL REFERENCES tracks(id) ON DELETE CASCADE, 88 93 89 - -- status: pending | scanning | clear | flagged | error 90 - status VARCHAR(20) NOT NULL DEFAULT 'pending', 94 + is_flagged BOOLEAN NOT NULL DEFAULT FALSE, 95 + highest_score INTEGER NOT NULL DEFAULT 0, 96 + matches JSONB NOT NULL DEFAULT '[]', -- [{artist, title, score, isrc}] 97 + raw_response JSONB NOT NULL DEFAULT '{}', -- full API response 91 98 92 - -- AuDD data 93 - audd_response JSONB, -- full API response 94 - matched_tracks JSONB, -- [{artist, title, score, isrc}] 95 - confidence_score INTEGER, -- highest match score (0-100) 99 + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), 96 100 97 - -- timestamps 98 - created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), 99 - scanned_at TIMESTAMPTZ, 100 - resolved_at TIMESTAMPTZ, 101 + UNIQUE(track_id) 102 + ); 103 + ``` 101 104 102 - -- metadata 103 - scanned_by VARCHAR(50), -- 'audd', 'manual' 104 - error_message TEXT, 105 + ### moderation service: labels table 105 106 106 - UNIQUE(track_id) 107 + ```sql 108 + CREATE TABLE labels ( 109 + id BIGSERIAL PRIMARY KEY, 110 + seq BIGSERIAL UNIQUE NOT NULL, -- monotonic sequence for subscriptions 111 + src TEXT NOT NULL, -- labeler DID 112 + uri TEXT NOT NULL, -- target AT URI 113 + cid TEXT, -- optional target CID 114 + val TEXT NOT NULL, -- label value (e.g., "copyright-violation") 115 + neg BOOLEAN NOT NULL DEFAULT FALSE, -- negation (for revoking labels) 116 + cts TIMESTAMPTZ NOT NULL, -- creation timestamp 117 + exp TIMESTAMPTZ, -- optional expiration 118 + sig BYTEA NOT NULL, -- secp256k1 signature 119 + created_at TIMESTAMPTZ NOT NULL DEFAULT 
NOW() 107 120 ); 108 121 ``` 109 122 110 - ### status meanings 123 + ### scan result states 111 124 112 - | status | description | 113 - |--------|-------------| 114 - | `pending` | awaiting scan | 115 - | `scanning` | scan in progress | 116 - | `clear` | no matches above threshold | 117 - | `flagged` | matches found above threshold | 118 - | `error` | scan failed | 125 + | is_flagged | highest_score | meaning | 126 + |------------|---------------|---------| 127 + | `false` | 0 | no matches found | 128 + | `false` | 0 | scan failed (error in raw_response) | 129 + | `true` | > 0 | matches found, label emitted | 119 130 120 131 ## configuration 132 + 133 + ### backend environment variables 121 134 122 135 ```bash 123 - # required 124 - AUDD_API_TOKEN=your_token_here 136 + # moderation service connection 137 + MODERATION_SERVICE_URL=https://moderation.plyr.fm 138 + MODERATION_AUTH_TOKEN=shared_secret_token 139 + MODERATION_TIMEOUT_SECONDS=300 140 + MODERATION_ENABLED=true 125 141 126 - # optional (have defaults) 127 - AUDD_API_URL=https://api.audd.io/ 128 - AUDD_TIMEOUT_SECONDS=300 142 + # labeler URL (for emitting labels after scan) 143 + MODERATION_LABELER_URL=https://moderation.plyr.fm 144 + ``` 129 145 130 - # scan behavior 131 - MODERATION_SCORE_THRESHOLD=70 # flag if score >= this 132 - MODERATION_AUTO_SCAN=true # scan on upload 133 - MODERATION_ENABLED=true # master switch 146 + ### moderation service environment variables 147 + 148 + ```bash 149 + # AuDD API 150 + AUDD_API_KEY=your_audd_token 151 + 152 + # database 153 + DATABASE_URL=postgres://... 
154 + 155 + # labeler identity 156 + LABELER_DID=did:plc:your-labeler-did 157 + LABELER_SIGNING_KEY=hex-encoded-secp256k1-private-key 158 + 159 + # auth 160 + MODERATION_AUTH_TOKEN=shared_secret_token 134 161 ``` 135 162 136 163 ## interpreting results ··· 196 223 ORDER BY t.created_at DESC; 197 224 ``` 198 225 226 + ## querying labels 227 + 228 + labels can be queried via standard ATProto XRPC endpoints: 229 + 230 + ```bash 231 + # query labels for a specific track 232 + curl "https://moderation.plyr.fm/xrpc/com.atproto.label.queryLabels?uriPatterns=at://did:plc:artist/fm.plyr.track/*" 233 + 234 + # query all labels from our labeler 235 + curl "https://moderation.plyr.fm/xrpc/com.atproto.label.queryLabels?sources=did:plc:plyr-labeler" 236 + ``` 237 + 238 + response: 239 + 240 + ```json 241 + { 242 + "labels": [ 243 + { 244 + "src": "did:plc:plyr-labeler", 245 + "uri": "at://did:plc:artist/fm.plyr.track/abc123", 246 + "val": "copyright-violation", 247 + "cts": "2025-11-30T12:00:00.000Z", 248 + "sig": "base64-encoded-signature" 249 + } 250 + ] 251 + } 252 + ``` 253 + 199 254 ## future considerations 200 255 201 256 ### batch scanning existing tracks ··· 206 261 async with get_session() as session: 207 262 unscanned = await session.execute( 208 263 select(Track) 209 - .outerjoin(CopyrightFlag) 210 - .where(CopyrightFlag.id.is_(None)) 264 + .outerjoin(CopyrightScan) 265 + .where(CopyrightScan.id.is_(None)) 211 266 ) 212 267 for track in unscanned.scalars(): 213 268 await scan_track_for_copyright(track.id, track.r2_url) 214 269 ``` 215 270 216 - ### ATProto labels 217 - 218 - future integration could publish copyright status as ATProto labels: 219 - 220 - ```json 221 - { 222 - "$type": "com.atproto.label.defs#label", 223 - "src": "did:plc:plyr-moderation", 224 - "uri": "at://did:plc:artist/fm.plyr.track/abc123", 225 - "val": "copyright-flagged", 226 - "cts": "2025-11-24T12:00:00Z" 227 - } 228 - ``` 271 + ### label subscriptions 229 272 230 - this would allow other apps 
in the ATProto ecosystem to see and act on our moderation signals. 273 + the moderation service exposes `com.atproto.label.subscribeLabels` for real-time label streaming. apps can subscribe to receive new labels as they're created. 231 274 232 275 ### user-facing appeals 233 276 ··· 235 278 1. artist sees flag on their track 236 279 2. artist submits dispute with evidence (license, original work proof) 237 280 3. admin reviews dispute 238 - 4. flag status updated to `resolved` or `confirmed` 281 + 4. if resolved: emit negation label (`neg: true`) to revoke the original 282 + 283 + ### admin dashboard 284 + 285 + considerations for where to build the admin UI: 286 + - **option A**: add to main frontend (plyr.fm/admin) - simpler, reuse existing auth 287 + - **option B**: separate UI on moderation service - isolated, but needs its own auth 288 + - **option C**: use Ozone - Bluesky's open-source moderation tool, already built for ATProto labels 289 + 290 + see [overview.md](./overview.md) for architecture discussion.
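the `queryLabels` curl calls in the "querying labels" section of copyright-detection.md can be sketched in Python as a small URL builder. `query_labels_url` is a hypothetical helper, not something in the plyr.fm codebase; it assumes list-valued XRPC parameters (`uriPatterns`, `sources`) are sent as repeated query keys, and it percent-encodes values, which the raw curl examples do not.

```python
from urllib.parse import urlencode


def query_labels_url(base_url, uri_patterns, sources=(), cursor=None, limit=None):
    """Build a com.atproto.label.queryLabels URL from the given filters."""
    # list-valued parameters are repeated, one key=value pair per entry
    params = [("uriPatterns", pattern) for pattern in uri_patterns]
    params += [("sources", source) for source in sources]
    if cursor is not None:
        params.append(("cursor", str(cursor)))
    if limit is not None:
        params.append(("limit", str(limit)))
    endpoint = f"{base_url.rstrip('/')}/xrpc/com.atproto.label.queryLabels"
    return f"{endpoint}?{urlencode(params)}"
```

for example, `query_labels_url("https://moderation.plyr.fm", ["at://did:plc:artist/fm.plyr.track/*"], limit=50)` produces a URL equivalent to the first curl example with a page size added.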
+124 -8
docs/moderation/overview.md
···
91 91 - configurable thresholds per user/context
92 92 - integration with ATProto labeling
93 93
94 + ## architecture
95 +
96 + ### current implementation
97 +
98 + ```
99 + ┌────────────────────────────────────────────────────────────────────┐
100 + │                            upload flow                             │
101 + └────────────────────────────────────────────────────────────────────┘
102 +
103 +    track upload
104 +          │
105 +          ▼
106 + ┌─────────────────┐      ┌──────────────────┐      ┌─────────────────┐
107 + │  plyr backend   │─────▶│   moderation     │─────▶│      AuDD       │
108 + │   (FastAPI)     │      │  service (Rust)  │      │  (recognition)  │
109 + └─────────────────┘      └──────────────────┘      └─────────────────┘
110 +          │                        │
111 +          │                        ▼
112 +          │               ┌──────────────────┐
113 +          │               │   if flagged:    │
114 +          │               │  emit ATProto    │
115 +          │               │      label       │
116 +          │               └──────────────────┘
117 +          │                        │
118 +          ▼                        ▼
119 + ┌─────────────────┐      ┌──────────────────┐
120 + │  copyright_scans│      │   labels table   │
121 + │   (postgres)    │      │   (postgres)     │
122 + └─────────────────┘      └──────────────────┘
123 + ```
124 +
125 + ### components
126 +
127 + 1. **plyr backend** - triggers scans on upload, stores results in `copyright_scans`
128 + 2. **moderation service** - Rust service that wraps AuDD and emits ATProto labels
129 + 3. **ATProto labeler** - signed labels queryable via `com.atproto.label.queryLabels`
130 +
131 + ### ATProto label integration
132 +
133 + labels are **signed data objects** (not repository records) that follow the AT Protocol labeling spec. when a track is flagged:
134 +
135 + 1. backend stores scan result in `copyright_scans` table
136 + 2. backend calls moderation service `/emit-label` endpoint
137 + 3. moderation service creates signed label with DID key
138 + 4. label stored in moderation service's `labels` table
139 + 5. label queryable via standard ATProto XRPC endpoints
140 +
141 + this means other apps in the ATProto ecosystem can query our labels and apply their own enforcement policies.
142 +
143 + ```json
144 + {
145 +   "$type": "com.atproto.label.defs#label",
146 +   "src": "did:plc:plyr-labeler",
147 +   "uri": "at://did:plc:artist/fm.plyr.track/abc123",
148 +   "val": "copyright-violation",
149 +   "cts": "2025-11-30T12:00:00Z",
150 +   "sig": "<secp256k1 signature>"
151 + }
152 + ```
153 +
94 154 ## what we're building
95 155
96 - ### phase 1: detection infrastructure
156 + ### phase 1: detection infrastructure ✅
97 157
98 - - `copyright_flags` table storing scan results
99 - - AuDD integration for music recognition
158 + - `copyright_scans` table storing scan results
159 + - AuDD integration via moderation service
100 160 - background job triggered on upload
101 - - admin endpoints to query flagged tracks
161 + - ATProto label emission for flagged tracks
102 162
103 - ### phase 2: visibility
163 + ### phase 2: visibility (in progress)
104 164
105 165 - admin dashboard for reviewing flags
106 - - stats and trends
107 - - manual rescan capability
166 + - stats and trends via Logfire
167 + - label query endpoints
108 168
109 169 ### phase 3: user-facing (future)
110 170
111 171 - artists
see flags on their own tracks 112 172 - dispute/appeal workflow 113 173 - notification on flag status change 174 + - label negation for resolved disputes 175 + 176 + ## admin UI considerations 177 + 178 + the admin interface for managing moderation needs to live somewhere. three options: 179 + 180 + ### option A: main frontend (plyr.fm/admin) 181 + 182 + **pros:** 183 + - reuse existing auth (session cookies, artist roles) 184 + - shared component library 185 + - single deployment 186 + - direct database access to both `tracks` and `copyright_scans` 187 + 188 + **cons:** 189 + - admin code bundled with user-facing app 190 + - moderation logic spread across frontend + backend 191 + - harder to open-source separately 192 + 193 + ### option B: separate UI on moderation service 194 + 195 + **pros:** 196 + - isolated deployment 197 + - moderation service becomes self-contained 198 + - could expose admin API alongside XRPC endpoints 199 + 200 + **cons:** 201 + - needs its own auth system 202 + - Rust service now needs to serve HTML/JS (or add another service) 203 + - queries `labels` table but needs to call backend API for track details 204 + 205 + ### option C: use Ozone 206 + 207 + [Ozone](https://github.com/bluesky-social/ozone) is Bluesky's open-source moderation tool, designed for ATProto labelers. 
208 + 209 + **pros:** 210 + - battle-tested, feature-complete 211 + - team review workflows built-in 212 + - ATProto-native (speaks labeler protocol) 213 + - would work with our existing label endpoints 214 + 215 + **cons:** 216 + - designed for Bluesky's needs, not music-specific 217 + - may need customization for copyright review workflow 218 + - another service to deploy 219 + 220 + ### recommendation 221 + 222 + **option A (main frontend)** is simplest for MVP: 223 + - add `/admin` routes protected by role check 224 + - query `copyright_scans` + `tracks` for review UI 225 + - admin can emit negation labels via backend API 226 + - later: extract to separate service if needed 227 + 228 + the moderation service stays focused on scanning + labeling. the backend + frontend handle the human review workflow. 114 229 115 230 ## references 116 231 ··· 121 236 122 237 ## related documentation 123 238 124 - - [copyright-detection.md](./copyright-detection.md) - technical implementation details 239 + - [copyright-detection.md](./copyright-detection.md) - scan flow and database schema 240 + - [atproto-labeler.md](./atproto-labeler.md) - labeler service endpoints and signing