- [`reference/api.md`](reference/api.md): Go search API service
- [`reference/app.md`](reference/app.md): Ionic Vue mobile app
- [`reference/lexicons.md`](reference/lexicons.md): Tangled AT Protocol record types
+- [`reference/resync.md`](reference/resync.md): Backfill and repo-resync recovery playbook

## Specs

+185
docs/reference/resync.md
+---
+title: Backfill & Resync Playbook
+updated: 2026-03-25
+---
+
+Twister's search index has three recovery commands, plus a manual cursor reset.
+Choose based on what broke.
+
+| Situation                                            | Recovery path                                |
+| ---------------------------------------------------- | -------------------------------------------- |
+| FTS index corrupted or drifted from stored documents | `twister reindex`                            |
+| Documents missing: never received via Tap            | `twister backfill` + let the indexer consume |
+| Documents missing: received but fields empty/wrong   | `twister enrich`                             |
+| Full index loss: DB dropped or migrated              | backfill, then enrich, then reindex          |
+| Tap cursor too far ahead: events skipped after a gap | cursor reset via `sync_state` table          |
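
The table above amounts to a lookup from symptom to action. A minimal sketch of that lookup as a shell function, assuming hypothetical symptom keywords (`fts-drift`, `never-received`, and so on) that are illustrative labels only and not arguments of the `twister` CLI:

```sh
# recovery_path SYMPTOM: print the recovery action for a symptom keyword.
# The keywords are illustrative labels, not twister CLI arguments.
recovery_path() {
  case "$1" in
    fts-drift)      echo "twister reindex" ;;
    never-received) echo "twister backfill, then let the indexer consume" ;;
    fields-empty)   echo "twister enrich" ;;
    full-loss)      echo "full index recovery sequence" ;;
    cursor-ahead)   echo "reset the cursor in sync_state" ;;
    *)              echo "unknown symptom: $1" >&2; return 1 ;;
  esac
}
```

An unknown keyword returns a non-zero status, so a wrapper script can stop instead of guessing a recovery path.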
+
+---
+
+## Paths Overview
+
+**Tap** is the authoritative ingest and backfill path. Documents reach the index
+when the `indexer` consumes events from Tap. Completeness depends on which DIDs
+Tap is tracking.
+
+**Read-through indexing** closes gaps on demand: when the API fetches a record
+not yet in the index, it enqueues a background job. This supplements Tap but is
+not a substitute for it.
+
+**JetStream** feeds only the activity cache (`/activity`). It does not contribute
+to the search index.
+
+---
+
+## Commands
+
+### `twister indexer`
+
+Runs the Tap consumer. Must be running continuously for real-time indexing.
+Persists its cursor to the `sync_state` table under the consumer name
+`indexer-tap-v1`.
+
+### `twister backfill`
+
+Discovers users via the follow graph from seed DIDs/handles, checks Tap status
+for each, and registers untracked repos with Tap via `/repos/add`.
+
+```sh
+# dry-run first
+twister backfill --seeds seeds.txt --max-hops 2 --dry-run
+
+# real run
+twister backfill --seeds seeds.txt --max-hops 2 \
+  --concurrency 5 --batch-size 10 --batch-delay 1s
+```
+
+Safe to re-run. Discovery deduplicates and `/repos/add` is idempotent.
+
+### `twister reindex`
+
+Re-upserts stored documents into the FTS table and runs `optimize`. Does not
+re-fetch from upstream; it only re-processes what is already in the DB.
+
+```sh
+twister reindex  # all documents
+twister reindex --collection sh.tangled.repo
+twister reindex --did did:plc:abc123
+twister reindex --dry-run  # preview without writing
+```
+
+Run this when: FTS results are stale after a schema migration, after a bulk
+document import, or whenever search quality seems inconsistent with stored data.
+
+### `twister enrich`
+
+Resolves missing `author_handle`, `repo_name`, and `web_url` via XRPC for
+documents already in the DB.
+
+```sh
+twister enrich  # all documents
+twister enrich --collection sh.tangled.repo.issue
+twister enrich --did did:plc:abc123
+twister enrich --dry-run
+```
+
+Run this when: search results show documents with empty author handles, or
+after deploying enrichment logic changes.
+
+---
+
+## Scenario Playbooks
+
+### FTS index out of sync
+
+Documents exist in the DB but search returns wrong/stale results.
+
+```sh
+twister reindex --dry-run  # confirm scope
+twister reindex  # re-upsert + FTS optimize
+```
+
+Verify with `GET /search?q=<known-term>`.
+
+### Documents missing from search
+
+Fetch a known record directly. If it returns from `/actors/{handle}/repos/{repo}`
+but does not appear in `/search`, the document was never indexed.
+
+1. Check if the DID is tracked by Tap. If not, run `backfill`:
+
+   ```sh
+   twister backfill --seeds <handle-or-did> --max-hops 0
+   ```
+
+2. Once Tap is tracking the DID, the `indexer` will deliver historical events.
+   Monitor progress via `GET /admin/status` (requires `ENABLE_ADMIN_ENDPOINTS=true`).
+
+3. If you need the record indexed immediately, fetch it through the API; the
+   read-through indexer will enqueue it automatically.
+
+### Enrichment gaps
+
+Documents appear in search but `author_handle` or `repo_name` is empty.
+
+```sh
+twister enrich --dry-run  # preview what would be resolved
+twister enrich  # apply
+twister reindex  # re-sync FTS after field updates
+```
+
+### Full index recovery
+
+Use this sequence after a DB drop, migration to a new Turso database, or other
+full-loss event.
+
+1. Confirm migrations ran: `twister api --local` performs `store.Migrate` on startup.
+2. Register repos with Tap:
+
+   ```sh
+   twister backfill --seeds seeds.txt --max-hops 2 --dry-run
+   twister backfill --seeds seeds.txt --max-hops 2
+   ```
+
+3. Start the indexer and let it consume: `twister indexer`
+4. Once backfill is complete, enrich fields and re-sync FTS:
+
+   ```sh
+   twister enrich
+   twister reindex
+   ```
+
+5. Verify: `GET /admin/status` for cursor progress, `GET /readyz` for DB health.
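
Steps 2 and 4 of the sequence above could be wrapped in two small shell functions so that each stage stops at the first failing command. This is a sketch only: the function names and the `SEEDS` and `HOPS` variables are hypothetical, and only the `twister` invocations themselves come from the playbook.

```sh
# Hypothetical wrappers around the documented commands; the function names
# and the SEEDS/HOPS variables are illustrative, not part of twister.
SEEDS="${SEEDS:-seeds.txt}"
HOPS="${HOPS:-2}"

# Step 2: preview, then register untracked repos with Tap.
# The real run only happens if the dry run exits successfully.
register_repos() {
  twister backfill --seeds "$SEEDS" --max-hops "$HOPS" --dry-run &&
    twister backfill --seeds "$SEEDS" --max-hops "$HOPS"
}

# Step 4: run only after the indexer has consumed the backfilled events.
# Enrichment goes first so the reindex picks up the resolved fields.
finalize_index() {
  twister enrich && twister reindex
}
```

The two stages are kept separate on purpose: step 3 (letting `twister indexer` consume the backlog) sits between them and has no fixed duration, so `finalize_index` is meant to be called by hand once the backlog has drained.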
+
+### Tap cursor reset
+
+If the indexer cursor is ahead of what Tap will deliver (e.g., after a Tap
+instance reset), events are skipped until Tap's event IDs catch up to the cursor.
+
+To reset the cursor and reprocess from the beginning of Tap's retention window:
+
+```sql
+DELETE FROM sync_state WHERE consumer_name = 'indexer-tap-v1';
+```
+
+Then restart the `indexer`. With no stored cursor, it starts from the beginning
+of Tap's retention window and processes all events Tap delivers.
+
+> **Note:** This does not cause duplicate documents, because `UpsertDocument` is
+> idempotent. It may reprocess a large backlog depending on Tap retention.
+
+---
+
+## Checking Status
+
+With `ENABLE_ADMIN_ENDPOINTS=true`:
+
+```sh
+curl -H "Authorization: Bearer $ADMIN_AUTH_TOKEN" \
+  http://localhost:8080/admin/status
+```
+
+Response includes:
+
+- `tap.cursor`: last Tap event ID processed by the indexer
+- `tap.updated_at`: when the cursor was last advanced
+- `jetstream.cursor`: JetStream timestamp cursor (activity cache only)
+- `documents`: total searchable document count
+- `pending_jobs`: read-through indexing jobs not yet processed
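
The `pending_jobs` field lends itself to polling. The sketch below assumes the response is JSON with `pending_jobs` as a top-level number (the exact response shape is not shown in this playbook) and that `jq` is installed; `BASE_URL`, `pending_jobs()`, and `wait_until_zero()` are hypothetical names introduced here for illustration.

```sh
# pending_jobs: read the job-queue depth from /admin/status.
# Assumes a JSON response with a top-level `pending_jobs` number and that
# `jq` is available; BASE_URL is a hypothetical override for the host.
pending_jobs() {
  curl -fsS -H "Authorization: Bearer $ADMIN_AUTH_TOKEN" \
    "${BASE_URL:-http://localhost:8080}/admin/status" | jq -r '.pending_jobs'
}

# wait_until_zero CMD [TRIES] [DELAY]: poll CMD until it prints 0.
# Returns 1 if CMD fails or the value is still non-zero after TRIES polls.
wait_until_zero() {
  cmd="$1"; tries="${2:-30}"; delay="${3:-2}"
  while [ "$tries" -gt 0 ]; do
    n="$("$cmd")" || return 1
    [ "$n" = "0" ] && return 0
    tries=$((tries - 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_until_zero pending_jobs 60 5` would poll up to 60 times, five seconds apart, until the read-through queue reports empty. Passing the getter as an argument keeps the polling loop independent of `curl` and `jq`.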
+2-2
docs/roadmap.md
- [x] Add a JetStream cache consumer with a persisted timestamp cursor
- [x] Seed the JetStream cursor to `now - 24h` on first boot and rewind slightly on reconnect
- [x] Store and serve bounded recent activity from the local cache
-- [ ] Keep Tap as the authoritative indexing and bulk backfill path
-- [ ] Define a controlled backfill and repo-resync playbook for recovery (`docs/references/resync.md`)
+- [x] Keep Tap as the authoritative indexing and bulk backfill path
+- [x] Define a controlled backfill and repo-resync playbook for recovery (`docs/reference/resync.md`)

## API: Constellation Integration

+10
packages/api/README.md
| `GET /identity/did/{did}` | `https://plc.directory/{did}` or `/.well-known/did.json` |
| `GET /backlinks/count` | Constellation `getBacklinksCount` (cached) |
| `WS /activity/stream` | `wss://jetstream2.us-east.bsky.network/subscribe` |
+
+## Admin endpoints
+
+Available when `ENABLE_ADMIN_ENDPOINTS=true`. Require `Authorization: Bearer <ADMIN_AUTH_TOKEN>` when
+`ADMIN_AUTH_TOKEN` is set.
+
+| Route                 | Description                                             |
+| --------------------- | ------------------------------------------------------- |
+| `GET /admin/status`   | Tap cursor, JetStream cursor, document count, job queue |
+| `POST /admin/reindex` | Re-sync all (or filtered) documents into the FTS index  |