skywatch-phash#
Perceptual hash-based image moderation service for Bluesky/ATProto. Detects known harassment images by perceptual hash (phash) fingerprinting, then automatically applies labels and files reports.
How it works#
- Subscribes to Bluesky firehose via Jetstream
- Extracts images from posts and computes perceptual hashes
- Compares against known harassment image hashes using Hamming distance
- On match, executes configured moderation actions (label/report post and/or account)
- Caches phashes in Redis to avoid re-fetching viral images
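The comparison step above can be sketched as follows. This is a minimal illustration, not the service's actual code: it assumes phashes are equal-length hex strings (as in the rule examples in this README), and the function name `hammingDistance` is hypothetical.

```typescript
// Number of set bits in each 4-bit value 0..15.
const NIBBLE_BITS = [0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4];

// Hamming distance between two hex-encoded hashes: the count of bit
// positions at which they differ. (Hypothetical helper for illustration.)
function hammingDistance(a: string, b: string): number {
  if (a.length !== b.length) {
    throw new Error("hashes must have the same length");
  }
  const hex = /^[0-9a-f]+$/i;
  if (!hex.test(a) || !hex.test(b)) {
    throw new Error("hashes must be non-empty hex strings");
  }
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    // XOR leaves a 1 bit wherever the two hex digits disagree.
    const diff = parseInt(a.charAt(i), 16) ^ parseInt(b.charAt(i), 16);
    distance += NIBBLE_BITS[diff] ?? 0;
  }
  return distance;
}
```

Working one hex digit at a time avoids 64-bit integer handling, at the cost of a small lookup table.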
Features#
- Fast matching - Hamming distance threshold for fuzzy matching (handles crops, filters, etc.)
- Caching - Redis-backed phash cache (24hr TTL by default)
- Deduplication - Prevents duplicate labels/reports via Redis claims (7-day TTL)
- Allowlisting - Skip checks for trusted accounts via the ignoreDID field
- Rate limiting - Configurable delay between moderation API calls
- Metrics - Tracks cache hits, matches, labels applied, etc.
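The deduplication claims listed above can be illustrated with an in-memory stand-in. The service itself uses Redis, where a comparable effect is commonly achieved with `SET key value NX EX ttl`. The class name, method name, and key format below are hypothetical, and the service's actual key scheme is not documented here.

```typescript
// In-memory sketch of "claim once within a TTL" semantics.
// A Redis-backed version would rely on the server for expiry instead.
class ClaimStore {
  private expiries = new Map<string, number>();

  // Returns true only for the first caller within the TTL window,
  // so the same label or report is not emitted twice.
  tryClaim(key: string, ttlSeconds: number, nowMs: number = Date.now()): boolean {
    const expiresAt = this.expiries.get(key);
    if (expiresAt !== undefined && expiresAt > nowMs) {
      return false; // someone already holds an unexpired claim
    }
    this.expiries.set(key, nowMs + ttlSeconds * 1000);
    return true;
  }
}
```

With a 7-day TTL (604800 seconds), a second attempt to claim the same key inside that window returns `false`, and the caller skips the moderation action.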
Setup#
Prerequisites#
- Bun runtime
- Redis server
- Bluesky labeler account with app password
Installation#
bun install
Configuration#
Copy .env.example to .env and configure:
# Required
LABELER_DID=did:plc:your-labeler-did
LABELER_HANDLE=your-labeler.bsky.social
LABELER_PASSWORD=your-app-password
# Optional (defaults shown)
JETSTREAM_URL=wss://jetstream1.us-east.fire.hose.cam/subscribe
REDIS_URL=redis://localhost:6379
PROCESSING_CONCURRENCY=10
CACHE_ENABLED=true
CACHE_TTL_SECONDS=86400
OZONE_URL=https://ozone.skywatch.blue
OZONE_PDS=https://blewit.us-west.host.bsky.network
MOD_DID=did:plc:e4elbtctnfqocyfcml6h2lf7
RATE_LIMIT_MS=100
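As a rough sketch of how these variables might be read at startup, the loader below applies the documented defaults and fails fast when a required value is missing. The `ServiceConfig` shape and `loadConfig` name are assumptions for illustration, and the Ozone and moderation DID settings are omitted for brevity.

```typescript
// Hypothetical config shape; field names are illustrative only.
interface ServiceConfig {
  labelerDid: string;
  labelerHandle: string;
  labelerPassword: string;
  jetstreamUrl: string;
  redisUrl: string;
  processingConcurrency: number;
  cacheEnabled: boolean;
  cacheTtlSeconds: number;
  rateLimitMs: number;
}

type Env = Record<string, string | undefined>;

function loadConfig(env: Env): ServiceConfig {
  // Required values have no default, so a missing one is an error.
  const required = (name: string): string => {
    const value = env[name];
    if (!value) throw new Error(`Missing required environment variable: ${name}`);
    return value;
  };
  // Optional integers fall back to the default shown in the README.
  const int = (name: string, fallback: number): number => {
    const raw = env[name];
    if (raw === undefined || raw === "") return fallback;
    const parsed = Number.parseInt(raw, 10);
    if (Number.isNaN(parsed)) throw new Error(`${name} must be an integer, got "${raw}"`);
    return parsed;
  };
  return {
    labelerDid: required("LABELER_DID"),
    labelerHandle: required("LABELER_HANDLE"),
    labelerPassword: required("LABELER_PASSWORD"),
    jetstreamUrl: env["JETSTREAM_URL"] ?? "wss://jetstream1.us-east.fire.hose.cam/subscribe",
    redisUrl: env["REDIS_URL"] ?? "redis://localhost:6379",
    processingConcurrency: int("PROCESSING_CONCURRENCY", 10),
    cacheEnabled: (env["CACHE_ENABLED"] ?? "true") === "true",
    cacheTtlSeconds: int("CACHE_TTL_SECONDS", 86400),
    rateLimitMs: int("RATE_LIMIT_MS", 100),
  };
}
```

Under Bun, which loads `.env` files automatically, this could be called as `loadConfig(process.env)`.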
Adding phash rules#
Edit rules/blobs.ts:
export const BLOB_CHECKS: BlobCheck[] = [
  {
    phashes: ["0f1e2d3c4b5a6978", "1a2b3c4d5e6f7890"],
    label: "harassment-image",
    comment: "Known harassment meme detected",
    reportAcct: false,
    labelAcct: false,
    reportPost: true,
    toLabel: true,
    hammingThreshold: 5,
    ignoreDID: ["did:plc:trusted-account"],
  },
];
Hamming threshold guide:
- 0 = Exact match only (very strict)
- 1-2 = Nearly identical images (minor compression artifacts)
- 3-4 = Very similar images (slight edits, crops)
- 5-8 = Similar images (moderate edits)
- 10+ = Loosely similar images (too permissive)
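To make the threshold concrete, the sketch below shows one way a rule could be evaluated against an incoming phash. It assumes a match means the distance is less than or equal to `hammingThreshold`, which is consistent with 0 meaning exact match only. `PhashRule` is a hypothetical subset of the rule fields shown above, and the distance function is passed in rather than implemented here.

```typescript
// Hypothetical subset of the rule fields used for matching.
interface PhashRule {
  phashes: string[];
  label: string;
  hammingThreshold: number;
  ignoreDID?: string[];
}

// Any function returning the bit distance between two hashes.
type Distance = (a: string, b: string) => number;

// Returns the first rule that matches the phash, or undefined.
function findMatchingRule(
  rules: PhashRule[],
  phash: string,
  authorDid: string,
  distance: Distance,
): PhashRule | undefined {
  return rules.find((rule) => {
    // Allowlisted accounts are skipped before any comparison.
    if (rule.ignoreDID !== undefined && rule.ignoreDID.indexOf(authorDid) !== -1) {
      return false;
    }
    // Assumed semantics: match when distance is within the threshold.
    return rule.phashes.some((known) => distance(known, phash) <= rule.hammingThreshold);
  });
}
```

For 16-hex-digit hashes like those in the example rule, each hash has 64 bits, so a threshold of 5 tolerates roughly 8% of bits differing.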
To generate a phash from an image:
bun run phash /path/to/image.png
Running#
Development#
bun run dev
Production#
bun run start
Docker#
docker compose up -d
Testing#
bun test # run all tests
bun run typecheck # type checking
bun run lint # linting
VM Requirements#
Minimal:
- 2GB RAM
- 2 vCPUs
- 10GB disk
Recommended:
- 4GB RAM
- 2-4 vCPUs
- 20GB disk
Scale PROCESSING_CONCURRENCY based on available RAM (each concurrent image process uses ~50-200MB).
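A back-of-the-envelope way to pick a value, assuming the upper estimate of 200MB per image and a memory budget you set aside for image processing (the function name and the budgeting approach are illustrative, not part of the service):

```typescript
// Suggests a PROCESSING_CONCURRENCY value from a memory budget in MB.
// perImageMb defaults to the pessimistic end of the ~50-200MB range.
function suggestConcurrency(availableMb: number, perImageMb: number = 200): number {
  if (availableMb <= 0 || perImageMb <= 0) {
    throw new Error("memory values must be positive");
  }
  // Always allow at least one worker, even on very small budgets.
  return Math.max(1, Math.floor(availableMb / perImageMb));
}
```

For example, a 2048MB budget at 200MB per image gives 10, which happens to equal the default. Remember to leave headroom for Redis and the operating system when choosing the budget.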
License#
MIT