a love letter to tangled (android, iOS, and a search API)
---
title: Data Sources & Integration
updated: 2026-03-25
---

Twisted pulls data from five external sources and authenticates users via Bluesky OAuth. Each source has a distinct role — no single source is authoritative for everything.

## Source Overview

| Source | What it provides | Access pattern |
| --- | --- | --- |
| **Tangled XRPC (Knots)** | Git data — file trees, blobs, commits, branches, diffs, tags | Direct XRPC calls to the knot hosting each repo |
| **AT Protocol (PDS)** | User records — profiles, repos, issues, PRs, comments, stars, follows | `com.atproto.repo.getRecord` / `listRecords` on user's PDS |
| **Constellation** | Social signals — star counts, follower counts, reaction counts, backlink lists | Public JSON API at `constellation.microcosm.blue` |
| **Tap** | Real-time firehose of AT Protocol record events for authoritative indexing | WebSocket consumer, feeds the search index |
| **JetStream** | Recent JSON activity stream for cached feed data | WebSocket consumer, feeds a bounded recent-activity cache |

## Constellation

[Constellation](https://constellation.microcosm.blue) is a public, self-hosted index of AT Protocol backlinks. It answers "who linked to this?" across the entire network — making it the right source for aggregated social signals instead of maintaining our own counters.

### Key Endpoints

**`GET /xrpc/blue.microcosm.links.getBacklinks`** — Get records linking to a target.

- `subject` (required) — The target (AT-URI, DID, or URL)
- `source` (required) — Collection and path, e.g. `sh.tangled.feed.star:subject.uri`
- `did` — Filter to specific users (repeatable)
- `limit` — Default 16, max 100
- `reverse` — Reverse ordering

**`GET /xrpc/blue.microcosm.links.getBacklinksCount`** — Count of links to a target.

- `subject`, `source` — Same as above

**`GET /xrpc/blue.microcosm.links.getManyToManyCounts`** — Secondary link counts in many-to-many relationships.

- `subject`, `source`, `pathToOther` (required)
- `did`, `otherSubject`, `limit` (optional)

### Usage in Twisted

| Need                      | Constellation call                                                                       |
| ------------------------- | ---------------------------------------------------------------------------------------- |
| Star count for a repo     | `getBacklinksCount(subject=repo_at_uri, source=sh.tangled.feed.star:subject.uri)`        |
| Who starred a repo        | `getBacklinks(subject=repo_at_uri, source=sh.tangled.feed.star:subject.uri)`             |
| Follower count for a user | `getBacklinksCount(subject=user_did, source=sh.tangled.graph.follow:subject)`            |
| Who follows a user        | `getBacklinks(subject=user_did, source=sh.tangled.graph.follow:subject)`                 |
| Reaction count on content | `getBacklinksCount(subject=content_at_uri, source=sh.tangled.feed.reaction:subject.uri)` |

This replaces the need to index and count interaction records ourselves. Our Tap pipeline still indexes interaction records for search and graph discovery, but Constellation is the source of truth for counts and lists.

### Integration Notes

- No authentication required. Constellation asks for a user-agent header with project name and contact.
- Responses are paginated via cursor. Plan for multiple pages when listing (e.g., all followers).
- The API is read-only — social actions (star, follow, react) are still AT Protocol record writes to the user's PDS.

## Tangled XRPC (Knots)

Knots are Tangled's git hosting servers. Each repo lives on a specific knot, identified by the knot DID in the repo's AT Protocol record.
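
Because knots are addressed per repo, every knot call starts from a service endpoint that is only known at runtime. The sketch below shows one way to assemble an XRPC query URL (an HTTP GET to `/xrpc/<nsid>`) against such an endpoint. The helper name `buildXrpcUrl`, the host `knot.example.com`, and the `repo` parameter name are assumptions made for illustration; they are not taken from the Twisted codebase or the Tangled lexicons.

```typescript
// Values an XRPC query parameter may take; arrays become repeated parameters.
type XrpcParamValue = string | number | boolean | string[] | undefined;

// Build the URL for an XRPC query (an HTTP GET to /xrpc/<nsid>).
function buildXrpcUrl(
  serviceEndpoint: string,
  nsid: string,
  params: Record<string, XrpcParamValue> = {},
): string {
  const url = new URL(`/xrpc/${nsid}`, serviceEndpoint);
  for (const [key, value] of Object.entries(params)) {
    if (value === undefined) continue; // skip unset optional parameters
    const values = Array.isArray(value) ? value : [value];
    for (const v of values) {
      url.searchParams.append(key, String(v));
    }
  }
  return url.toString();
}

// Hypothetical knot endpoint and parameter name, for illustration only.
const exampleUrl = buildXrpcUrl(
  "https://knot.example.com",
  "sh.tangled.repo.getDefaultBranch",
  { repo: "did:plc:example/my-repo" },
);
console.log(exampleUrl);
```

Keeping URL construction in one place means the resolved knot endpoint is the only per-repo input, and repeatable parameters are encoded the same way everywhere.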

### Endpoints Used

- `sh.tangled.repo.tree` — File tree for a ref
- `sh.tangled.repo.blob` — File content
- `sh.tangled.repo.log` — Commit history
- `sh.tangled.repo.branches` / `sh.tangled.repo.tags` — Refs
- `sh.tangled.repo.getDefaultBranch` — Default branch name
- `sh.tangled.repo.diff` / `sh.tangled.repo.compare` — Diffs
- `sh.tangled.repo.languages` — Language breakdown
- `sh.tangled.knot.version` — Knot software version

### Routing

The app resolves which knot hosts a repo by reading the repo's AT Protocol record (which contains the knot DID), then resolving the knot DID to its service endpoint. XRPC calls go directly to that knot.

The Tangled appview at `tangled.org` serves HTML only — there is no JSON API at the appview level.

## AT Protocol (PDS)

Standard AT Protocol record access for reading and writing user data.

### Read Operations

- `com.atproto.repo.getRecord` — Fetch a single record by collection + rkey
- `com.atproto.repo.listRecords` — List records in a collection with pagination

Used for: profiles, repo metadata, issues, PRs, comments, stars, follows, reactions.

### Write Operations (Authenticated)

- `com.atproto.repo.createRecord` — Create a new record (star, follow, react, issue, comment)
- `com.atproto.repo.deleteRecord` — Delete a record (unstar, unfollow)

All writes go to the authenticated user's PDS using their OAuth session.

### Identity Resolution

- Handle → DID via `com.atproto.identity.resolveHandle`
- DID → DID document via PLC Directory (`plc.directory`) or `.well-known/did.json`
- DID document → PDS endpoint (from `#atprotoPersonalDataServer` service)

## Tap (Firehose)

Tap provides a filtered firehose of AT Protocol events. Our indexer consumes Tap via WebSocket, indexing records into the search database.

### What We Index via Tap

- Repos, issues, PRs, comments, strings, profiles — for full-text search
- Follows — for graph discovery during backfill
- Issue state and PR status changes — for state filtering in search

### What We Don't Need to Count via Tap

Stars, followers, reactions — Constellation handles counts and lists. We still process these events for graph discovery but don't need to maintain our own counters.

### Role In The Search Plan

Tap remains the authoritative ingestion and backfill path for searchable documents. If search correctness depends on complete historical coverage, Tap or a repo resync path is the right source.

### Tap Protocol

- WebSocket connection with cursor-based resume
- Events contain: operation (create/update/delete), DID, collection, rkey, CID, record payload
- Acks required after processing each event
- Backfill via `/repos/add` endpoint to request historical data for specific users

## JetStream

JetStream is a lighter JSON stream derived from the firehose. It is useful for recent activity and developer ergonomics, but it is not the authoritative source for search indexing.
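
Since JetStream only backs a recent-activity cache, a consumer mainly has to decide where to resume the stream. JetStream cursors are Unix timestamps in microseconds. The sketch below shows one way to pick a cursor, assuming a 24-hour lookback on first boot and a 5-second rewind on reconnect. The function names, the rewind value, and the host `jetstream.example` are illustrative assumptions, not Twisted's actual implementation.

```typescript
const MICROS_PER_MS = 1000;
const DAY_MS = 24 * 60 * 60 * 1000;

// Choose the cursor (Unix microseconds) to resume from.
// lastProcessedUs is null on first boot, when nothing has been persisted yet.
function resumeCursor(
  nowMs: number,
  lastProcessedUs: number | null,
  lookbackMs: number = DAY_MS, // first-boot lookback window
  rewindMs: number = 5000, // assumed overlap on reconnect
): number {
  if (lastProcessedUs === null) {
    return Math.max(0, (nowMs - lookbackMs) * MICROS_PER_MS);
  }
  // Rewind slightly so events around the disconnect are replayed, not lost.
  return Math.max(0, lastProcessedUs - rewindMs * MICROS_PER_MS);
}

// Attach the cursor to a subscribe URL as a query parameter.
function withCursor(subscribeUrl: string, cursorUs: number): string {
  const url = new URL(subscribeUrl);
  url.searchParams.set("cursor", String(cursorUs));
  return url.toString();
}

// Hypothetical host, for illustration only.
console.log(withCursor("wss://jetstream.example/subscribe", resumeCursor(Date.now(), null)));
```

Rewinding replays a few events that were already handled, so the cache writer has to treat duplicates as no-ops.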

### Usage In Twisted

- Recent activity cache for the Activity tab
- Collection-filtered stream for `sh.tangled.*` events
- Cursor-based resume using event timestamps

### Constraints

- Use JetStream for recent, cached activity only
- Do not rely on it as the only historical backfill mechanism
- Keep retention bounded and reconnect idempotent

### Role In The Search Plan

- Seed the cursor to roughly 24 hours ago on first boot
- Persist the last processed timestamp and rewind slightly on reconnect
- Cache normalized activity locally so clients do not each need a raw upstream stream
- Keep Tap as the source of truth for search indexing and bulk backfill

## Bluesky OAuth

Authentication uses AT Protocol OAuth via `@atcute/oauth-browser-client`.

### Flow

1. User enters their handle
2. App resolves handle → DID → PDS → authorization server metadata
3. App initiates OAuth with requested scopes
4. User authorizes in browser, redirected back to app
5. App exchanges code for tokens
6. Session provides `dpopFetch` for authenticated XRPC calls

### Scopes

The app requests scopes for:

- `sh.tangled.feed.star` — Star/unstar repos
- `sh.tangled.graph.follow` — Follow/unfollow users
- `sh.tangled.feed.reaction` — Add reactions
- `sh.tangled.actor.profile` — Edit profile
- `sh.tangled.repo.issue` / `sh.tangled.repo.issue.comment` — Create issues and comments
- `sh.tangled.repo.pull.comment` — Comment on PRs

### Capacitor Integration

On native platforms, OAuth callback uses a deep link URL scheme registered with Capacitor. The app listens via `App.addListener('appUrlOpen', ...)` to catch the redirect.

### Session Management

Tokens are stored in secure storage (encrypted localStorage on web, Capacitor Secure Storage on native). Sessions auto-refresh. The app supports multiple accounts with an account switcher.
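
For the native redirect described under Capacitor Integration, the URL delivered to the `appUrlOpen` listener still has to be turned into an authorization response before the code can be exchanged for tokens. The sketch below is one way to do that. It assumes the standard OAuth response parameters (`code`, `state`, `iss`, `error`) and accepts them from either the fragment or the query string. The function name `parseOAuthCallback` and the `twisted://` scheme are hypothetical, and the sketch does not call into `@atcute/oauth-browser-client`.

```typescript
interface OAuthCallback {
  code: string;
  state: string;
  iss: string | null; // issuer, when the authorization server includes it
}

// Extract the authorization response from a redirect or deep-link URL.
function parseOAuthCallback(callbackUrl: string): OAuthCallback {
  const url = new URL(callbackUrl);
  // Parameters may arrive in the fragment or in the query string.
  const fragment = url.hash.startsWith("#") ? url.hash.slice(1) : url.hash;
  const params = new URLSearchParams(fragment !== "" ? fragment : url.search);

  const error = params.get("error");
  if (error !== null) {
    const description = params.get("error_description") ?? "";
    throw new Error(`OAuth error: ${error} ${description}`.trim());
  }

  const code = params.get("code");
  const state = params.get("state");
  if (code === null || state === null) {
    throw new Error("Callback is missing code or state");
  }
  return { code, state, iss: params.get("iss") };
}

// Hypothetical deep-link scheme, for illustration only.
const callback = parseOAuthCallback(
  "twisted://oauth/callback?code=abc&state=xyz&iss=https%3A%2F%2Fbsky.social",
);
console.log(callback.code, callback.state, callback.iss);
```

Parsing in a pure function keeps the listener thin and lets the same logic serve the web redirect and the native deep link.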