pull requests: design exploration#
⚠️ CRITICAL: THIS DESIGN CANNOT BE IMPLEMENTED ⚠️
Date discovered: 2025-10-11 Reason: Pull requests in tangled use a fundamentally different architecture than issues.
Why PR support via atproto records doesn't work#
After implementing this design and testing, we discovered that tangled's appview does not ingest pull records from the firehose. Here's the architectural difference:
Issues (✅ Works):#
- Jetstream subscription:
appview/state/state.go:109subscribes tosh.tangled.repo.issue - Firehose ingester:
appview/ingester.go:79-80hasingestIssue()function - Pattern: atproto record is source of truth → firehose keeps database synchronized
- Result: MCP tools work! Create issue via atproto → appears on tangled.org
Pulls (❌ Broken):#
- No jetstream subscription:
tangled.RepoPullNSIDis missing from subscription list - No firehose ingester: No
ingestPull()function exists inappview/ingester.go - Pattern: Database is source of truth → atproto record is decorative
- Web UI flow (
appview/pulls/pulls.go:1196):- Creates DB entry FIRST via
db.NewPull() - THEN creates atproto record as "announcement"
- Creates DB entry FIRST via
- Result: MCP-created PRs are orphan records that exist on PDS but never appear on tangled.org
Why this design exists#
Looking at the code (appview/pulls/pulls.go), pulls have:
- Submissions array: Multiple rounds of patches (DB-only concept)
- Size concerns: Patches can be megabytes (expensive in atproto)
- Complexity: Appview manages pull state in its database with features not in atproto schema
- Pragmatism: They wanted to ship pulls quickly, made DB the source of truth
Evidence#
# Issues are subscribed in jetstream
grep -n "RepoIssueNSID" sandbox/tangled-core/appview/state/state.go
# 109: tangled.RepoIssueNSID,
# Pulls are NOT subscribed
grep -n "RepoPullNSID" sandbox/tangled-core/appview/state/state.go
# (no results)
# Issues have an ingester
grep -A 2 "RepoIssueNSID:" sandbox/tangled-core/appview/ingester.go
# case tangled.RepoIssueNSID:
# err = i.ingestIssue(ctx, e)
# Pulls do NOT have an ingester
grep "RepoPullNSID:" sandbox/tangled-core/appview/ingester.go
# (no results)
Conclusion#
Pull request support cannot be added to tangled-mcp without changes to tangled-core (adding firehose consumer). The design below is architecturally sound but incompatible with tangled's current implementation.
design principle: gh CLI parity#
goal: tangled-mcp pull request tools should be a subset of gh pr commands with matching semantics
- users familiar with
ghshould feel at home - parameters should match where possible
- we implement what tangled's atproto schema supports
- we don't try to exceed gh's surface area
gh pr commands (reference)#
general commands (gh)#
gh pr create- create a PR with title, body, labels, draft stategh pr list- list PRs with filters (state, labels, author, base, head)gh pr view- show details of a single PRgh pr close- close a PRgh pr reopen- reopen a closed PRgh pr edit- edit title, body, labels, base branchgh pr merge- merge a PR
what we can support (tangled MCP)#
| gh command | tangled tool | notes |
|---|---|---|
gh pr create |
create_repo_pull |
✅ title, body, base, head, labels, draft |
gh pr list |
list_repo_pulls |
✅ state, labels, limit filtering |
gh pr view |
get_repo_pull |
✅ full details of one PR |
gh pr close |
close_repo_pull |
✅ via status update |
gh pr reopen |
reopen_repo_pull |
✅ via status update |
gh pr edit |
update_repo_pull |
✅ title, body, labels |
gh pr merge |
merge_repo_pull |
✅ via status update (logical, not git merge) |
gh pr comment |
❌ not v1 | need sh.tangled.repo.pull.comment support |
gh pr diff |
❌ not v1 | could show patch field |
gh pr checks |
❌ not supported | no CI concept in tangled |
gh pr review |
❌ not v1 | need review records |
current state#
what we have (issues)#
- collection:
sh.tangled.repo.issue(stored on user's PDS) - extra fields:
issueId(sequential),owner(creator DID) - labels: separate
sh.tangled.label.oprecords, applied/removed via ops - operations: create, update, delete, list, list_labels
- no state tracking: we don't use
sh.tangled.repo.issue.statusyet - no comments: we don't use
sh.tangled.repo.issue.commentyet
what we have (branches)#
- knot XRPC:
sh.tangled.repo.listBranchesquery via knot - read-only: no branch creation/deletion yet
- operations: list only
pull request schema (from lexicons)#
core record: sh.tangled.repo.pull#
{
target: {
repo: string (at-uri), // where it's merging to
branch: string
},
source: {
branch: string, // where it's coming from
sha: string (40 chars), // commit hash
repo?: string (at-uri) // optional: for cross-repo pulls
},
title: string,
body?: string,
patch: string, // git diff format
createdAt: datetime
}
state tracking: sh.tangled.repo.pull.status#
{
pull: string (at-uri), // reference to pull record
status: "open" | "closed" | "merged"
}
comments: sh.tangled.repo.pull.comment#
{
pull: string (at-uri),
body: string,
createdAt: datetime
}
design questions#
1. sequential IDs (pullId)#
question: should pulls have pullId like issues have issueId?
considerations:
- human-friendly references: "PR #42" vs AT-URI
- need to maintain counter per-repo (same pattern as issues)
- easier for users to reference in comments/descriptions
- tangled.org URLs probably expect this:
tangled.org/@owner/repo/pulls/42
recommendation: yes, add pullId field following issue pattern
2. resources vs tools#
current: 1 resource (tangled://status), 6 tools
question: should we expose repos/issues/pulls as MCP resources?
resources are for:
- read-only data
- things that change over time
- content the LLM should "know about" contextually
- example:
tangled://repo/{owner}/{repo}/issues→ feed LLM current issues
tools are for:
- actions (create, update, delete)
- queries with parameters
- returning specific structured data
potential resources:
tangled://repo/{owner}/{repo}→ repo metadata, default branch, descriptiontangled://repo/{owner}/{repo}/issues→ current open issuestangled://repo/{owner}/{repo}/pulls→ current open pullstangled://repo/{owner}/{repo}/branches→ all branches
recommendation:
- keep tools for actions (create/update/delete)
- add resources for "current state" views
- resources update when queried (not live/streaming)
3. patch generation#
question: where does the patch field come from?
options: a) client generates: user provides git diff output
- pros: simple, no server-side git ops
- cons: user burden, error-prone
b) knot XRPC: new endpoint sh.tangled.repo.generatePatch
- pros: server generates correct diff
- cons: requires knot changes
c) hybrid: accept both user-provided patch OR branch refs
- if patch provided: use it
- if only branches: call knot to generate
- pros: flexible
- cons: more complex
recommendation: start with (a), add (b) later as knot capability
4. cross-repo pulls (forks)#
question: how to handle source.repo different from target.repo?
use case: fork workflow
- user forks
owner-a/repotoowner-b/repo - makes changes on fork
- opens pull from
owner-b/repo:feature→owner-a/repo:main
challenges:
- need to resolve both repos (source and target)
- patch generation across repos
- permissions: who can create pulls?
recommendation:
- v1: same-repo pulls only (
source.repooptional, defaults to target) - v2: add cross-repo support once we understand patterns
5. state management#
question: do we track state separately or in-record?
issues: currently don't use sh.tangled.repo.issue.status
pulls: lexicon has sh.tangled.repo.pull.status
pattern from labels:
- labels are separate ops
- ops can be applied/reverted
- current state = sum of all ops
state is simpler:
- open → closed → merged (mostly linear)
- probably doesn't need full ops history
- could just update a field on pull record OR use status records
recommendation:
- use
sh.tangled.repo.pull.statusrecords (follow lexicon) - easier to track state changes over time
- consistent with label pattern
- can query "all status changes for pull X"
draft state: gh treats draft as a separate boolean, but tangled's lexicon shows:
sh.tangled.repo.pull.status.opensh.tangled.repo.pull.status.closedsh.tangled.repo.pull.status.merged
solution: we could:
- (a) add custom
draftfield to pull record (not in lexicon, might break) - (b) treat draft as metadata in status record
- (c) add
sh.tangled.repo.pull.status.draftas custom state
recommendation: (c) - add draft as a custom state value
- fits existing pattern
draft→opentransition when ready- backwards compatible (lexicon uses
knownValuesnot enum)
6. interconnections#
entities that reference each other:
- issues mention pulls: "closes #42"
- pulls mention issues: "fixes #123"
- both have labels, comments
- pulls reference commits/branches
question: how to expose these relationships?
options: a) inline: include referenced entities in responses
list_repo_pullsreturns issues it closes- bloats responses
b) separate queries: tools to fetch relationships
get_pull_related_issues(pull_id)get_issue_related_pulls(issue_id)- more API calls but cleaner
c) resources: expose as graphs
tangled://repo/{owner}/{repo}/graph→ all entities + edges- LLM can traverse
- ambitious
recommendation: start with (b), consider (c) as resources mature
7. comments#
question: support issue/pull comments now or later?
considerations:
- both have
commentcollections - valuable for context (PR review discussions)
- adds complexity (list, create, update, delete comments)
recommendation:
- v1: skip comments, focus on core pull CRUD
- v2: add comments once pull basics work
- keeps initial scope manageable
tool signatures (gh-style)#
create_repo_pull#
def create_repo_pull(
repo: str, # gh: -R, --repo
title: str, # gh: -t, --title
body: str | None = None, # gh: -b, --body
base: str = "main", # gh: -B, --base (target branch)
head: str, # gh: -H, --head (source branch)
source_sha: str, # commit hash (required, no gh equiv - we need it for atproto)
patch: str, # git diff (required, no gh equiv - atproto schema)
labels: list[str] | None = None, # gh: -l, --label
draft: bool = False, # gh: -d, --draft
) -> CreatePullResult:
"""
create a pull request
similar to: gh pr create --title "..." --body "..." --base main --head feature --label bug --draft
"""
update_repo_pull#
def update_repo_pull(
repo: str, # gh: -R, --repo
pull_id: int, # gh: <number>
title: str | None = None, # gh: -t, --title
body: str | None = None, # gh: -b, --body
base: str | None = None, # gh: -B, --base (change target branch)
add_labels: list[str] | None = None, # gh: --add-label
remove_labels: list[str] | None = None, # gh: --remove-label
) -> UpdatePullResult:
"""
edit a pull request
similar to: gh pr edit 42 --title "..." --add-label bug --remove-label wontfix
"""
list_repo_pulls#
def list_repo_pulls(
repo: str, # gh: -R, --repo
state: str = "open", # gh: -s, --state {open|closed|merged|all}
labels: list[str] | None = None, # gh: -l, --label
base: str | None = None, # gh: -B, --base (filter by target branch)
head: str | None = None, # gh: -H, --head (filter by source branch)
draft: bool | None = None, # gh: -d, --draft (filter by draft state)
limit: int = 30, # gh: -L, --limit (default 30)
cursor: str | None = None, # pagination
) -> ListPullsResult:
"""
list pull requests
similar to: gh pr list --state open --label bug --limit 50
"""
get_repo_pull#
def get_repo_pull(
repo: str, # gh: -R, --repo
pull_id: int, # gh: <number>
) -> PullInfo:
"""
view a pull request
similar to: gh pr view 42
"""
close_repo_pull#
def close_repo_pull(
repo: str, # gh: -R, --repo
pull_id: int, # gh: <number>
) -> UpdatePullResult:
"""
close a pull request (sets status to closed)
similar to: gh pr close 42
"""
reopen_repo_pull#
def reopen_repo_pull(
repo: str, # gh: -R, --repo
pull_id: int, # gh: <number>
) -> UpdatePullResult:
"""
reopen a closed pull request (sets status back to open)
similar to: gh pr reopen 42
"""
merge_repo_pull#
def merge_repo_pull(
repo: str, # gh: -R, --repo
pull_id: int, # gh: <number>
) -> UpdatePullResult:
"""
mark a pull request as merged (sets status to merged)
note: this is a logical merge (status change), not an actual git merge
similar to: gh pr merge 42 (but without the git operation)
"""
proposed roadmap#
phase 1: core pr operations (gh parity)#
7 tools matching gh pr commands:
-
create_repo_pull (matches
gh pr create)- parameters: repo, title, body, base, head, source_sha, patch, labels, draft
- generates pullId (like issueId)
- creates
sh.tangled.repo.pullrecord - creates initial
sh.tangled.repo.pull.statusrecord (open or draft) - applies labels if provided
- returns CreatePullResult with pullId and URL
-
update_repo_pull (matches
gh pr edit)- parameters: repo, pull_id, title, body, base, add_labels, remove_labels
- updates pull record via putRecord + swap
- handles incremental label changes (add/remove pattern)
- returns UpdatePullResult with pullId and URL
-
list_repo_pulls (matches
gh pr list)- parameters: repo, state, labels, base, head, draft, limit, cursor
- queries
sh.tangled.repo.pull+ correlates with status - filters by state (open/closed/merged/all), labels, branches, draft
- default limit 30 (matching gh)
- includes labels for each pull
- returns ListPullsResult with pulls and cursor
-
get_repo_pull (matches
gh pr view)- parameters: repo, pull_id
- fetches single pull with full details (target, source, patch, status, labels)
- returns PullInfo model
-
close_repo_pull (matches
gh pr close)- parameters: repo, pull_id
- creates new status record with "closed"
- returns UpdatePullResult
-
reopen_repo_pull (matches
gh pr reopen)- parameters: repo, pull_id
- creates new status record with "open"
- only works if current status is "closed"
- returns UpdatePullResult
-
merge_repo_pull (matches
gh pr mergelogically)- parameters: repo, pull_id
- creates new status record with "merged"
- note: logical merge only (no git operation)
- returns UpdatePullResult
types (following issue pattern):
PullInfomodel (uri, cid, pullId, title, body, target, source, createdAt, labels, status)CreatePullResult(repo, pull_id, url)UpdatePullResult(repo, pull_id, url)ListPullsResult(pulls, cursor)
new module:
src/tangled_mcp/_tangled/_pulls.py(parallel to _issues.py)src/tangled_mcp/types/_pulls.py(parallel to _issues.py)
phase 2: labels + better state#
enhancements:
- pulls support labels (reuse
_validate_labels,_apply_labels) list_pull_status_history(repo, pull_id)→ all status changes- pull status in URL:
tangled.org/@owner/repo/pulls/42shows status
phase 3: resources#
add resources:
tangled://repo/{owner}/{repo}/pulls/open→ current open PRstangled://repo/{owner}/{repo}/pulls/{pull_id}→ specific pull context- helps LLM understand "what PRs exist for this repo?"
phase 4: cross-repo + comments#
ambitious:
- cross-repo pull support (forks)
- comment creation/listing
- patch generation via knot XRPC
open questions#
1. draft state implementation#
question: should we use sh.tangled.repo.pull.status.draft as a custom state?
- option a: separate draft field on pull record (not in lexicon)
- option b: draft as custom status value
sh.tangled.repo.pull.status.draft - recommended: option b - fits lexicon pattern,
draft→opentransition
2. patch format (v1 scope)#
question: how do users provide the patch field?
- v1: user provides git diff string (simple, no server dependency)
- v2: could add knot XRPC to generate from branch refs
- gh equivalent:
gh pr createauto-generates from local branch
3. ready-for-review transition#
question: gh pr ready marks draft PR as ready - how to support?
- option a: separate
ready_repo_pull(repo, pull_id)tool - option b: use
update_repo_pullwith status change - recommended: option a - matches gh command structure
4. URL format confirmation#
assumption: https://tangled.org/@owner/repo/pulls/42
- matches issue pattern
- need to confirm with tangled.org routing
5. merge semantics#
clarification: merge_repo_pull is logical only (status change)
- does NOT perform git merge operation
- tangled may handle actual merge separately
- gh does both (status + git operation)
implementation notes#
pattern consistency with issues#
# issues pattern (working)
def create_issue(repo_id, title, body, labels):
# 1. resolve repo to AT-URI
# 2. find next issueId
# 3. create record with tid rkey
# 4. apply labels if provided
# 5. return {uri, cid, issueId}
# pulls pattern (proposed)
def create_pull(repo_id, target_branch, source_branch, source_sha, title, body, patch, labels):
# 1. resolve repo to AT-URI
# 2. find next pullId
# 3. create pull record with {target, source, patch, ...}
# 4. create status record (open)
# 5. apply labels if provided
# 6. return {uri, cid, pullId}
state tracking pattern#
def _get_current_pull_status(client, pull_uri):
"""get latest status for a pull by querying status records"""
status_records = client.com.atproto.repo.list_records(
collection="sh.tangled.repo.pull.status",
repo=client.me.did,
)
# find all status records for this pull
pull_statuses = [
r for r in status_records.records
if getattr(r.value, "pull", None) == pull_uri
]
# return most recent (last created)
if pull_statuses:
latest = max(pull_statuses, key=lambda r: getattr(r.value, "createdAt", ""))
return getattr(latest.value, "status", "open")
return "open" # default
label reuse#
# labels work the same for issues and pulls
# _apply_labels takes a subject_uri (issue or pull)
_apply_labels(client, pull_uri, labels, repo_labels, current_labels)
# so label ops are generic:
{
"$type": "sh.tangled.label.op",
"subject": "at://did/sh.tangled.repo.pull/abc123", # or issue URI
"add": [...],
"delete": [...]
}
summary: gh pr → tangled MCP mapping#
v1 feature matrix (phase 1)#
| feature | gh command | tangled tool | parameters | notes |
|---|---|---|---|---|
| create PR | gh pr create |
create_repo_pull |
title, body, base, head, source_sha, patch, labels, draft | ✅ full parity except auto-patch |
| edit PR | gh pr edit |
update_repo_pull |
title, body, base, add_labels, remove_labels | ✅ full parity |
| list PRs | gh pr list |
list_repo_pulls |
state, labels, base, head, draft, limit | ✅ full parity (default limit 30) |
| view PR | gh pr view |
get_repo_pull |
pull_id | ✅ full parity |
| close PR | gh pr close |
close_repo_pull |
pull_id | ✅ status change only |
| reopen PR | gh pr reopen |
reopen_repo_pull |
pull_id | ✅ status change only |
| merge PR | gh pr merge |
merge_repo_pull |
pull_id | ⚠️ logical only (no git merge) |
| mark ready | gh pr ready |
ready_repo_pull |
pull_id | ✅ draft → open transition |
not in v1 (future)#
gh pr comment→ needsh.tangled.repo.pull.commentsupportgh pr diff→ could showpatchfieldgh pr review→ need review recordsgh pr checks→ no CI concept in tangled
key differences from gh#
- patch field: users provide git diff string (gh auto-generates)
- merge: logical status change only (gh performs git merge)
- source_sha: required parameter (gh infers from branch)
- pullId: explicit numeric ID (gh uses number or branch name)
preserved gh patterns#
- parameter names match gh flags where possible
- state values:
open,closed,merged,draft - filtering: state, labels, base, head, draft
- default limit: 30 (matching gh pr list)
- incremental label updates: add/remove pattern