# leaflet-search

search for leaflet.

live: leaflet-search.pages.dev
## how it works

- tap syncs leaflet content from the network
- backend indexes content into SQLite FTS5 via Turso and serves the search API
- static frontend on Cloudflare Pages
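the FTS5 indexing step can be sketched in miniature with Python's stdlib sqlite3 module (the table name, columns, and sample rows here are illustrative, not the real schema):

```python
import sqlite3

# in-memory stand-in for the Turso database; schema is illustrative
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(title, content)")
conn.executemany(
    "INSERT INTO docs (title, content) VALUES (?, ?)",
    [
        ("hello leaflet", "a first post about publishing on atproto"),
        ("search notes", "full-text search with sqlite fts5"),
    ],
)

# MATCH runs a full-text query against the index
rows = conn.execute(
    "SELECT title FROM docs WHERE docs MATCH ?", ("fts5",)
).fetchall()
print(rows)  # [('search notes',)]
```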
## MCP server

search is also exposed as an MCP server for AI agents like Claude Code:

```
claude mcp add-json leaflet '{"type": "http", "url": "https://leaflet-search-by-zzstoatzz.fastmcp.app/mcp"}'
```

see mcp/README.md for local setup and usage details.
## api

```
GET /search?q=<query>&tag=<tag>   # full-text search with query, tag, or both
GET /similar?uri=<at-uri>         # find similar documents via vector embeddings
GET /tags                         # list all tags with counts
GET /popular                      # popular search queries
GET /stats                        # document/publication counts
GET /health                       # health check
```
search returns three entity types: article (a document in a publication), looseleaf (a standalone document), and publication (the newsletter itself). tag filtering applies to documents only.
/similar uses Voyage AI embeddings with brute-force cosine similarity (~0.15s for 3500 docs).
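brute-force similarity is just "score every stored vector against the query, then sort" — a pure-Python sketch (vector values and URIs are made up; the real vectors are 512-dimensional voyage-3-lite embeddings):

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)


def most_similar(query: list[float], docs: dict[str, list[float]], k: int = 3):
    # score every stored vector against the query and sort: O(n * d)
    scored = sorted(docs.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
    return scored[:k]


vecs = {
    "at://a": [1.0, 0.0, 0.0],
    "at://b": [0.9, 0.1, 0.0],
    "at://c": [0.0, 1.0, 0.0],
}
print([uri for uri, _ in most_similar([1.0, 0.0, 0.0], vecs, k=2)])
# ['at://a', 'at://b']
```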
## stack

- Fly.io hosts the backend + tap
- Turso: cloud SQLite with vector support
- Voyage AI embeddings (voyage-3-lite)
- tap syncs leaflet content from the ATProto firehose
- Zig backend: HTTP server, search API, content indexing
- Cloudflare Pages static frontend
## embeddings

documents are embedded using Voyage AI's voyage-3-lite model (512 dimensions). new documents from the firehose don't automatically get embeddings; they need to be backfilled periodically.
### backfill embeddings

requires TURSO_URL, TURSO_TOKEN, and VOYAGE_API_KEY in .env:

```
# check how many docs need embeddings
./scripts/backfill-embeddings --dry-run

# run the backfill (uses batching + concurrency)
./scripts/backfill-embeddings --batch-size 50
```
the script:

- fetches docs where `embedding IS NULL`
- batches them to the Voyage API (50 docs/batch by default)
- writes embeddings to Turso in batched transactions
- runs 8 concurrent workers
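the batching + worker pattern above can be sketched like this (the `embed_batch` function is a stub; the real script calls the Voyage API and writes results to Turso):

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 50  # docs per Voyage API request
WORKERS = 8      # concurrent workers


def chunk(items, size):
    # split the work queue into fixed-size batches
    for i in range(0, len(items), size):
        yield items[i : i + size]


def embed_batch(batch):
    # stub standing in for one Voyage API call + one batched Turso write
    return [(doc_id, [0.0] * 512) for doc_id in batch]


doc_ids = [f"doc-{i}" for i in range(120)]  # docs where embedding IS NULL

with ThreadPoolExecutor(max_workers=WORKERS) as pool:
    results = list(pool.map(embed_batch, chunk(doc_ids, BATCH_SIZE)))

embedded = [row for batch in results for row in batch]
print(len(embedded))  # 120
```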
note: we use brute-force cosine similarity instead of a vector index. Turso's DiskANN index has ~60s write latency per row, making it impractical for incremental updates. brute-force on 3500 vectors runs in ~0.15s which is fine for this scale.