- semantic search via multimodal embeddings (understands meaning and visual content)
- keyword search via BM25 full-text search (finds exact filename matches)

architecture

setup

install dependencies:
- rust toolchain
- python 3.11+ with uv

copy environment variables:
```
cp .env.example .env
```
set your api keys in .env:
- VOYAGE_API_TOKEN - for generating embeddings
- TURBOPUFFER_API_KEY - for vector storage

to populate the vector store with bufos:

```
just re-index
```

this will:

run the server locally:

```
cargo run
```

the app will be available at http://localhost:8080

deploy to fly.io:

```
fly launch  # first time
fly secrets set VOYAGE_API_TOKEN=your_token
fly secrets set TURBOPUFFER_API_KEY=your_key
just deploy
```

the search API supports these parameters:

- query: search text (required)
- top_k: number of results (default: 10)
- alpha: fusion weight between semantic and keyword scores (default: 0.7)
  - 1.0 = pure semantic (best for conceptual queries like "happy", "apocalyptic")
  - 0.7 = default (favors semantic understanding while still rewarding exact matches)
  - 0.5 = balanced (equal weight to both signals)
  - 0.0 = pure keyword (best for exact filename searches)

example: /api/search?query=jumping&top_k=5&alpha=0.5
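
a minimal sketch of building that request URL from python. `build_search_url` is a hypothetical helper, not part of this repo; it assumes the server is running locally on port 8080 and that alpha stays within the 0.0 to 1.0 range listed above:

```python
from urllib.parse import urlencode

# assumption: the server is running locally on the default port
BASE_URL = "http://localhost:8080"


def build_search_url(query, top_k=10, alpha=0.7, base_url=BASE_URL):
    """Build a /api/search URL (hypothetical helper for illustration)."""
    if not query:
        raise ValueError("query is required")
    # assumption: alpha is only meaningful between the documented endpoints
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    # urlencode escapes spaces and other special characters in the query
    params = urlencode({"query": query, "top_k": top_k, "alpha": alpha})
    return f"{base_url}/api/search?{params}"


if __name__ == "__main__":
    print(build_search_url("jumping", top_k=5, alpha=0.5))
```

the resulting URL can be passed to any HTTP client (curl, a browser, `urllib.request.urlopen`) once the server is up.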

all bufo images are processed through early fusion multimodal embeddings:

- filename text is extracted (e.g., "bufo-jumping-on-bed" → "bufo jumping on bed")
- the text is combined with the image content in a single embedding request
- voyage-multimodal-3 produces 1024-dim vectors capturing both text and visual features
- the vectors are uploaded to turbopuffer with a BM25-enabled name field for keyword search
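
the filename step can be sketched roughly as below. both function names are hypothetical, treating underscores as separators is an assumption, and the `[text, image]` pair is only a placeholder for "text and image travel in one request", not the actual voyage request format:

```python
from pathlib import Path


def filename_to_text(filename: str) -> str:
    """Turn 'bufo-jumping-on-bed.png' into 'bufo jumping on bed'."""
    stem = Path(filename).stem  # drop any directory and file extension
    # hyphens separate words in bufo names; underscores are an assumption
    words = stem.replace("-", " ").replace("_", " ").split()
    return " ".join(words)


def build_embedding_input(filename: str, image_bytes: bytes) -> list:
    """Pair the filename text with the image so both are embedded together.

    Placeholder shape only: the real client may expect a different structure.
    """
    return [filename_to_text(filename), image_bytes]


if __name__ == "__main__":
    print(filename_to_text("bufo-jumping-on-bed"))
```

embedding the text and image together (early fusion) means a single vector per bufo carries both signals, instead of two vectors that have to be merged later.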

- semantic branch: the query is embedded using voyage-multimodal-3 with input_type="query"
- keyword branch: BM25 full-text search against bufo names
- fusion: weighted combination using the alpha parameter
  - score = α * semantic + (1 - α) * keyword
  - both scores are normalized to the 0-1 range before fusion
- ranking: results are sorted by fused score and the top_k are returned
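
the fusion and ranking steps can be sketched in python as follows. this is an illustration under stated assumptions, not the server's actual code: min-max scaling is one possible way to normalize to 0-1, a result missing from one branch is assumed to score 0 there, and the function names and sample scores are made up:

```python
def min_max_normalize(scores):
    """Rescale a {name: score} dict to the 0-1 range (assumed method)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # all scores equal: treat every hit as a full match (an assumption)
        return {name: 1.0 for name in scores}
    return {name: (s - lo) / (hi - lo) for name, s in scores.items()}


def fuse(semantic, keyword, alpha=0.7, top_k=10):
    """score = alpha * semantic + (1 - alpha) * keyword, sorted descending."""
    sem = min_max_normalize(semantic)
    kw = min_max_normalize(keyword)
    names = set(sem) | set(kw)
    # a name missing from one branch contributes 0 for that branch
    fused = {
        n: alpha * sem.get(n, 0.0) + (1 - alpha) * kw.get(n, 0.0)
        for n in names
    }
    # sort by score descending, then by name for a stable order on ties
    ranked = sorted(fused.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


if __name__ == "__main__":
    # made-up raw scores for the query "happy"
    semantic = {"bufo-excited": 0.9, "bufo-is-happy": 0.7, "bufo-sad": 0.1}
    keyword = {"bufo-is-happy": 4.2}
    for name, score in fuse(semantic, keyword, alpha=0.5, top_k=2):
        print(name, round(score, 3))
```

with these sample scores, alpha=1.0 ranks bufo-excited first, while alpha=0.5 lets the exact name match lift bufo-is-happy to the top.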

- semantic alone: misses exact filename matches (e.g., "happy" might not find "bufo-is-happy")
- keyword alone: no semantic understanding (e.g., "happy" won't find "excited" or "smiling")
- hybrid: combines both signals, so "happy" can match "bufo-is-happy" by name and also surface "excited" or "smiling" bufos by meaning