title: Data Sources & Integration
updated: 2026-03-25
Twisted pulls data from five external sources and authenticates users via Bluesky OAuth. Each source has a distinct role — no single source is authoritative for everything.
Source Overview
| Source | What it provides | Access pattern |
|---|---|---|
| Tangled XRPC (Knots) | Git data — file trees, blobs, commits, branches, diffs, tags | Direct XRPC calls to the knot hosting each repo |
| AT Protocol (PDS) | User records — profiles, repos, issues, PRs, comments, stars, follows | com.atproto.repo.getRecord / listRecords on user's PDS |
| Constellation | Social signals — star counts, follower counts, reaction counts, backlink lists | Public JSON API at constellation.microcosm.blue |
| Tap | Real-time firehose of AT Protocol record events for authoritative indexing | WebSocket consumer, feeds the search index |
| JetStream | Recent JSON activity stream for cached feed data | WebSocket consumer, feeds a bounded recent-activity cache |
Constellation
Constellation is a public, self-hosted index of AT Protocol backlinks. It answers "who linked to this?" across the entire network — making it the right source for aggregated social signals instead of maintaining our own counters.
Key Endpoints
GET /xrpc/blue.microcosm.links.getBacklinks — Get records linking to a target.
- `subject` (required): The target (AT-URI, DID, or URL)
- `source` (required): Collection and path, e.g. `sh.tangled.feed.star:subject.uri`
- `did`: Filter to specific users (repeatable)
- `limit`: Default 16, max 100
- `reverse`: Reverse ordering
GET /xrpc/blue.microcosm.links.getBacklinksCount — Count of links to a target.
- `subject`, `source`: Same as above
GET /xrpc/blue.microcosm.links.getManyToManyCounts — Secondary link counts in many-to-many relationships.
- `subject`, `source`, `pathToOther` (required)
- `did`, `otherSubject`, `limit` (optional)
Usage in Twisted
| Need | Constellation call |
|---|---|
| Star count for a repo | getBacklinksCount(subject=repo_at_uri, source=sh.tangled.feed.star:subject.uri) |
| Who starred a repo | getBacklinks(subject=repo_at_uri, source=sh.tangled.feed.star:subject.uri) |
| Follower count for a user | getBacklinksCount(subject=user_did, source=sh.tangled.graph.follow:subject) |
| Who follows a user | getBacklinks(subject=user_did, source=sh.tangled.graph.follow:subject) |
| Reaction count on content | getBacklinksCount(subject=content_at_uri, source=sh.tangled.feed.reaction:subject.uri) |
This replaces the need to index and count interaction records ourselves. Our Tap pipeline still indexes interaction records for search and graph discovery, but Constellation is the source of truth for counts and lists.
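As a rough sketch of the first row in the table, the star-count query could be assembled like this. The base URL is the one listed in the Source Overview; the helper name `backlinksCountUrl` and the example repo AT-URI are hypothetical and only serve the illustration.

```typescript
// Hypothetical helper: builds a getBacklinksCount URL for Constellation.
const CONSTELLATION_BASE = "https://constellation.microcosm.blue";

function backlinksCountUrl(subject: string, source: string): string {
  // Both parameters are required by the endpoint. They are encoded because
  // AT-URIs and DIDs contain ':' and '/' characters.
  const query =
    `subject=${encodeURIComponent(subject)}&source=${encodeURIComponent(source)}`;
  return `${CONSTELLATION_BASE}/xrpc/blue.microcosm.links.getBacklinksCount?${query}`;
}

// Example repo AT-URI (made up for illustration).
const starCountUrl = backlinksCountUrl(
  "at://did:plc:example/sh.tangled.repo/3kabc",
  "sh.tangled.feed.star:subject.uri",
);
```

The same helper covers follower and reaction counts by swapping the `subject` and `source` arguments for the other rows of the table.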
Integration Notes
- No authentication required. Constellation asks for a user-agent header with project name and contact.
- Responses are paginated via cursor. Plan for multiple pages when listing (e.g., all followers).
- The API is read-only — social actions (star, follow, react) are still AT Protocol record writes to the user's PDS.
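The notes above might translate into a paging loop along these lines. This is a sketch under assumptions: the response shape (`records` plus an optional `cursor`), the `cursor` query parameter name, and the contact string in the user-agent are not confirmed by this page. The HTTP call is injected as `getJson` (a hypothetical signature) so the loop can be exercised without network access.

```typescript
// Assumed response shape: a page of records plus an optional cursor.
interface BacklinksPage {
  records: unknown[];
  cursor?: string | null;
}

// Minimal injected HTTP function so the sketch can run against a stub.
type GetJson = (url: string, headers: Record<string, string>) => Promise<BacklinksPage>;

async function listAllBacklinks(
  getJson: GetJson,
  subject: string,
  source: string,
  maxPages = 50, // safety bound so a misbehaving cursor cannot loop forever
): Promise<unknown[]> {
  const base =
    "https://constellation.microcosm.blue/xrpc/blue.microcosm.links.getBacklinks";
  // Placeholder project name and contact; replace with real values.
  const headers = { "User-Agent": "twisted (contact: ops@example.com)" };
  const all: unknown[] = [];
  let cursor: string | undefined;

  for (let page = 0; page < maxPages; page++) {
    let url =
      `${base}?subject=${encodeURIComponent(subject)}` +
      `&source=${encodeURIComponent(source)}&limit=100`;
    if (cursor) url += `&cursor=${encodeURIComponent(cursor)}`;
    const body = await getJson(url, headers);
    all.push(...body.records);
    if (!body.cursor) break; // no cursor means this was the last page
    cursor = body.cursor;
  }
  return all;
}
```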
Tangled XRPC (Knots)
Knots are Tangled's git hosting servers. Each repo lives on a specific knot, identified by the knot DID in the repo's AT Protocol record.
Endpoints Used
- `sh.tangled.repo.tree`: File tree for a ref
- `sh.tangled.repo.blob`: File content
- `sh.tangled.repo.log`: Commit history
- `sh.tangled.repo.branches` / `sh.tangled.repo.tags`: Refs
- `sh.tangled.repo.getDefaultBranch`: Default branch name
- `sh.tangled.repo.diff` / `sh.tangled.repo.compare`: Diffs
- `sh.tangled.repo.languages`: Language breakdown
- `sh.tangled.knot.version`: Knot software version
Routing
The app resolves which knot hosts a repo by reading the repo's AT Protocol record (which contains the knot DID), then resolving the knot DID to its service endpoint. XRPC calls go directly to that knot.
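Once the knot's service endpoint is known, the direct call is just an XRPC query against that host. The sketch below covers only that last step; the helper name `knotXrpcUrl`, the example host, and the `repo` / `ref` parameter names are assumptions for illustration rather than a confirmed description of the lexicon.

```typescript
// Builds an XRPC query URL against a knot's service endpoint.
function knotXrpcUrl(
  endpoint: string,
  nsid: string,
  params: Record<string, string>,
): string {
  const query = Object.keys(params)
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(params[key])}`)
    .join("&");
  const base = endpoint.replace(/\/+$/, ""); // tolerate a trailing slash
  return query ? `${base}/xrpc/${nsid}?${query}` : `${base}/xrpc/${nsid}`;
}

// Step 1 (not shown): read the repo record and resolve the knot DID to an endpoint.
// Step 2: call the knot directly. Parameter names below are assumptions.
const treeUrl = knotXrpcUrl("https://knot.example.com/", "sh.tangled.repo.tree", {
  repo: "did:plc:example/my-repo",
  ref: "main",
});
```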
The Tangled appview at tangled.org serves HTML only — there is no JSON API at the appview level.
AT Protocol (PDS)
Standard AT Protocol record access for reading and writing user data.
Read Operations
- `com.atproto.repo.getRecord`: Fetch a single record by collection + rkey
- `com.atproto.repo.listRecords`: List records in a collection with pagination
Used for: profiles, repo metadata, issues, PRs, comments, stars, follows, reactions.
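For illustration, the two read calls reduce to query URLs against the user's PDS. The query parameter names (`repo`, `collection`, `rkey`, `limit`, `cursor`) follow the `com.atproto.repo` lexicons; the helper names, the example PDS host, and the `self` record key for the profile are assumptions.

```typescript
// Builds a getRecord URL for a single record on the given PDS.
function getRecordUrl(
  pdsUrl: string,
  repo: string,
  collection: string,
  rkey: string,
): string {
  const query =
    `repo=${encodeURIComponent(repo)}` +
    `&collection=${encodeURIComponent(collection)}` +
    `&rkey=${encodeURIComponent(rkey)}`;
  return `${pdsUrl}/xrpc/com.atproto.repo.getRecord?${query}`;
}

// Builds a listRecords URL; pass the cursor from the previous page to continue.
function listRecordsUrl(
  pdsUrl: string,
  repo: string,
  collection: string,
  limit = 50,
  cursor?: string,
): string {
  let query =
    `repo=${encodeURIComponent(repo)}` +
    `&collection=${encodeURIComponent(collection)}&limit=${limit}`;
  if (cursor) query += `&cursor=${encodeURIComponent(cursor)}`;
  return `${pdsUrl}/xrpc/com.atproto.repo.listRecords?${query}`;
}

// Example values are placeholders.
const profileUrl = getRecordUrl(
  "https://pds.example.com", "did:plc:example", "sh.tangled.actor.profile", "self");
const starsUrl = listRecordsUrl(
  "https://pds.example.com", "did:plc:example", "sh.tangled.feed.star");
```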
Write Operations (Authenticated)
- `com.atproto.repo.createRecord`: Create a new record (star, follow, react, issue, comment)
- `com.atproto.repo.deleteRecord`: Delete a record (unstar, unfollow)
All writes go to the authenticated user's PDS using their OAuth session.
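A follow, for example, might be expressed as the following `createRecord` input. The top-level fields (`repo`, `collection`, `record`) are the standard `com.atproto.repo.createRecord` input; the record's own fields are inferred from the `sh.tangled.graph.follow:subject` path used with Constellation, so treat them as an assumption, and `followRecordInput` is a hypothetical helper name.

```typescript
interface CreateRecordInput {
  repo: string; // DID of the authenticated user
  collection: string; // NSID of the record type
  record: Record<string, unknown>;
}

// Hypothetical helper: builds the request body for following another user.
function followRecordInput(
  actorDid: string,
  targetDid: string,
  now: Date = new Date(),
): CreateRecordInput {
  return {
    repo: actorDid,
    collection: "sh.tangled.graph.follow",
    record: {
      $type: "sh.tangled.graph.follow",
      subject: targetDid, // assumed field, matching the Constellation path
      createdAt: now.toISOString(),
    },
  };
}

// The body would be POSTed as JSON to /xrpc/com.atproto.repo.createRecord
// on the user's PDS using the session's authenticated fetch.
const followBody = JSON.stringify(
  followRecordInput("did:plc:alice", "did:plc:bob", new Date(0)),
);
```

Undoing the action is a `deleteRecord` call with the same `repo` and `collection` plus the record's rkey.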
Identity Resolution
- Handle → DID via `com.atproto.identity.resolveHandle`
- DID → DID document via PLC Directory (`plc.directory`) or `.well-known/did.json`
- DID document → PDS endpoint (from the `#atproto_pds` service entry, type `AtprotoPersonalDataServer`)
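The last two steps of the chain can be sketched as pure functions. This assumes only the two DID methods AT Protocol supports (`did:plc` and host-level `did:web`) and picks the PDS by matching the service type `AtprotoPersonalDataServer`; the interface and function names are hypothetical.

```typescript
interface DidService {
  id: string;
  type: string;
  serviceEndpoint: string;
}

interface DidDocument {
  id: string;
  service?: DidService[];
}

// Where to fetch the DID document for a given DID.
function didDocumentUrl(did: string): string {
  if (did.startsWith("did:plc:")) {
    return `https://plc.directory/${did}`;
  }
  if (did.startsWith("did:web:")) {
    // did:web percent-encodes ':' in the host (for ports), so decode it.
    const host = decodeURIComponent(did.slice("did:web:".length));
    return `https://${host}/.well-known/did.json`;
  }
  throw new Error(`Unsupported DID method: ${did}`);
}

// Picks the PDS base URL out of a DID document, if one is declared.
function pdsEndpoint(doc: DidDocument): string | undefined {
  const entry = (doc.service ?? []).find(
    (service) => service.type === "AtprotoPersonalDataServer",
  );
  return entry?.serviceEndpoint;
}
```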
Tap (Firehose)
Tap provides a filtered firehose of AT Protocol events. Our indexer consumes Tap via WebSocket, indexing records into the search database.
What We Index via Tap
- Repos, issues, PRs, comments, strings, profiles — for full-text search
- Follows — for graph discovery during backfill
- Issue state and PR status changes — for state filtering in search
What We Don't Need to Count via Tap
Stars, followers, reactions — Constellation handles counts and lists. We still process these events for graph discovery but don't need to maintain our own counters.
Role In The Search Plan
Tap remains the authoritative ingestion and backfill path for searchable documents. If search correctness depends on complete historical coverage, Tap or a repo resync path is the right source.
Tap Protocol
- WebSocket connection with cursor-based resume
- Events contain: operation (create/update/delete), DID, collection, rkey, CID, record payload
- Acks required after processing each event
- Backfill via `/repos/add` endpoint to request historical data for specific users
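The process-then-ack ordering could look roughly like this. The event fields mirror the list above, but the exact wire format is an assumption: the numeric `id` and the `{ type: "ack", id }` message are hypothetical, and an in-memory map stands in for the search database.

```typescript
// Assumed event shape: the fields listed above plus a numeric id for acks.
interface TapEvent {
  id: number;
  operation: "create" | "update" | "delete";
  did: string;
  collection: string;
  rkey: string;
  cid?: string;
  record?: Record<string, unknown>;
}

// In-memory stand-in for the search index, keyed by AT-URI.
const searchIndex = new Map<string, Record<string, unknown>>();

// Applies one event, then returns the ack message to send on the socket.
function handleTapEvent(event: TapEvent): string {
  const uri = `at://${event.did}/${event.collection}/${event.rkey}`;
  if (event.operation === "delete") {
    searchIndex.delete(uri);
  } else if (event.record) {
    searchIndex.set(uri, event.record); // create and update are both upserts
  }
  // The ack is produced only after the write, so an event that was never
  // acked can be delivered again without corrupting the index.
  return JSON.stringify({ type: "ack", id: event.id });
}
```

Because create and update are both upserts keyed by AT-URI, replaying an event after a cursor-based resume leaves the index unchanged.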
JetStream
JetStream is a lighter JSON stream derived from the firehose. It is useful for recent activity and developer ergonomics, but it is not the authoritative source for search indexing.
Usage In Twisted
- Recent activity cache for the Activity tab
- Collection-filtered stream for `sh.tangled.*` events
- Cursor-based resume using event timestamps
Constraints
- Use JetStream for recent, cached activity only
- Do not rely on it as the only historical backfill mechanism
- Keep retention bounded and reconnect idempotent
Role In The Search Plan
- Seed the cursor to roughly 24 hours ago on first boot
- Persist the last processed timestamp and rewind slightly on reconnect
- Cache normalized activity locally so clients do not each need a raw upstream stream
- Keep Tap as the source of truth for search indexing and bulk backfill
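The cursor rules in this list come down to a small amount of arithmetic, sketched below. Jetstream cursors are Unix timestamps in microseconds; the five-second rewind window, the function names, and the host passed to `jetstreamUrl` are assumptions for illustration.

```typescript
const MICROS_PER_MS = 1000;
const DAY_MS = 24 * 60 * 60 * 1000;
const REWIND_MS = 5000; // hypothetical rewind window; tune as needed

// Picks the cursor (in microseconds) to connect with.
function startCursor(lastProcessedUs: number | null, nowMs: number): number {
  if (lastProcessedUs === null) {
    // First boot: seed to roughly 24 hours ago.
    return (nowMs - DAY_MS) * MICROS_PER_MS;
  }
  // Reconnect: rewind slightly so events near the disconnect are not missed.
  return lastProcessedUs - REWIND_MS * MICROS_PER_MS;
}

// Builds a subscribe URL filtered to Tangled collections.
function jetstreamUrl(host: string, cursorUs: number): string {
  const collections = encodeURIComponent("sh.tangled.*");
  return `wss://${host}/subscribe?wantedCollections=${collections}&cursor=${cursorUs}`;
}
```

Rewinding means some events arrive twice, which is why the constraint on idempotent reconnects matters: cache writes should be keyed so that a replayed event overwrites rather than duplicates.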
Bluesky OAuth
Authentication uses AT Protocol OAuth via @atcute/oauth-browser-client.
Flow
- User enters their handle
- App resolves handle → DID → PDS → authorization server metadata
- App initiates OAuth with requested scopes
- User authorizes in browser, redirected back to app
- App exchanges code for tokens
- Session provides `dpopFetch` for authenticated XRPC calls
Scopes
The app requests scopes for:
- `sh.tangled.feed.star`: Star/unstar repos
- `sh.tangled.graph.follow`: Follow/unfollow users
- `sh.tangled.feed.reaction`: Add reactions
- `sh.tangled.actor.profile`: Edit profile
- `sh.tangled.repo.issue` / `sh.tangled.repo.issue.comment`: Create issues and comments
- `sh.tangled.repo.pull.comment`: Comment on PRs
Capacitor Integration
On native platforms, OAuth callback uses a deep link URL scheme registered with Capacitor. The app listens via App.addListener('appUrlOpen', ...) to catch the redirect.
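One way to handle the redirect is to pull the OAuth parameters out of the deep link before handing them to the OAuth client. The sketch below is an assumption about how that parsing might be done: the URL scheme in the usage comment is a placeholder, and since the parameters may arrive in either the query string or the fragment, both are checked.

```typescript
interface OAuthCallbackParams {
  code: string;
  state: string;
  iss: string | null;
}

// Extracts OAuth callback parameters from a deep link URL.
// Returns null when the URL is not an OAuth redirect.
function parseOAuthCallback(url: string): OAuthCallbackParams | null {
  const parsed = new URL(url);
  // Prefer the query string; fall back to the fragment.
  const raw = parsed.search.length > 1 ? parsed.search.slice(1) : parsed.hash.slice(1);
  const params = new URLSearchParams(raw);
  const code = params.get("code");
  const state = params.get("state");
  if (!code || !state) return null;
  return { code, state, iss: params.get("iss") };
}

// Wiring on native (needs @capacitor/app, so it is shown as a comment to keep
// this sketch self-contained):
//
//   App.addListener('appUrlOpen', (event) => {
//     const params = parseOAuthCallback(event.url);
//     if (params) { /* hand off to the OAuth client to finish the flow */ }
//   });
```

Returning `null` for unrelated URLs lets the same listener ignore deep links that have nothing to do with sign-in.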
Session Management
Tokens are stored in secure storage (encrypted localStorage on web, Capacitor Secure Storage on native). Sessions auto-refresh. The app supports multiple accounts with an account switcher.