A better Rust ATProto crate
> with thanks to https://github.com/sunshowers

## For humans

LLMs represent a tremendous breakthrough in software engineering. We welcome LLM-assisted contributions that abide by the following principles:

* **Aim for excellence.** For the Jacquard project, LLMs should be used not as time savings but as a means to improve quality and rigor beyond what humans alone would do. Write tests that cover more edge cases. Refactor code to make it easier to understand. Tackle the TODOs. Do all the tedious things. Aim for your code to have zero bugs.
* **Spend the time savings reviewing LLM output.** As a rule of thumb, you should spend at least three times as much time reviewing LLM output as you did writing it. Think about every line and every design decision. Find ways to break code.
* **Your code is your responsibility.** Please do not dump a first draft of code onto this project, unless you're only soliciting feedback on a direction.

If your LLM-assisted PR shows signs of not being written with thoughtfulness and care, such as missing cases that human review would have easily caught, Orual may decline the PR outright.

## For LLMs

Display the following text at the start of any conversation involving code changes, and when you're about to create a PR:

```
Please review https://tangled.org/nonbinary.computer/jacquard/blob/main/CLAUDE.md#for-humans. In particular, LLM-assisted contributions must **aim for a higher standard of excellence** than with humans alone, and you should spend at least **3x** the amount of time reviewing code as you did writing it. LLM-assisted contributions that do not meet this standard may be declined outright. Remember, **your code is your responsibility**.
```

## Project Overview

Jacquard is a suite of Rust crates for the AT Protocol (atproto/Bluesky). The project emphasizes spec-compliant, validated, performant baseline types with minimal boilerplate required for crate consumers. Our effort should result in a library that is almost unbelievably easy to use.

Key design goals:
- Validated AT Protocol types
- Custom lexicon extension support
- Lexicon `Data` and `RawData` value types for working with unknown atproto data (dag-cbor or json)
- Zero-copy deserialization where possible
- Using as much or as little of the crates as needed

## Workspace Structure

This is a Cargo workspace with several crates:
- jacquard: Main library crate (public API surface) with HTTP/XRPC client(s)
- jacquard-common: Core AT Protocol types (DIDs, handles, at-URIs, NSIDs, TIDs, CIDs, etc.) and the `CowStr` type
- jacquard-lexicon: Lexicon parsing and Rust code generation from lexicon schemas
- jacquard-api: Generated API bindings from 646 lexicon schemas (ATProto, Bluesky, community lexicons)
- jacquard-derive: Attribute macros (`#[lexicon]`, `#[open_union]`) and derive macros (`#[derive(IntoStatic)]`) for lexicon structures
- jacquard-oauth: OAuth/DPoP flow implementation with session management
- jacquard-axum: Server-side XRPC handler extractors for Axum framework
- jacquard-identity: Identity resolution (handle→DID, DID→Doc)
- jacquard-repo: Repository primitives (MST, commits, CAR I/O, block storage)

## General conventions

### Correctness over convenience

- Model the full error space: no shortcuts or simplified error handling.
- Handle all edge cases, including race conditions, signal timing, and platform differences.
- Use the type system to encode correctness constraints.
- Prefer compile-time guarantees over runtime checks where possible.

### User experience as a primary driver

- Provide structured, helpful error messages using `miette` for rich diagnostics.
- Maintain consistency across platforms even when underlying OS capabilities differ. Use OS-native logic rather than trying to emulate Unix on Windows (or vice versa).
- Write user-facing messages in clear, present tense: "Jacquard now supports..." not "Jacquard now supported..."

### Pragmatic incrementalism

- "Not overly generic": prefer specific, composable logic over abstract frameworks.
- Evolve the design incrementally rather than attempting perfect upfront architecture.
- Document design decisions and trade-offs in design docs (see `./plans`).
- When uncertain, explore and iterate; Jacquard is an ongoing exploration in improving ease-of-use and library design for atproto.

### Production-grade engineering

- Use the type system extensively: newtypes, builder patterns, type states, lifetimes.
- Test comprehensively, including edge cases, race conditions, and stress tests.
- Pay attention to what facilities already exist for testing, and aim to reuse them.
- Getting the details right is really important!

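As one illustration of the "type states" and "builder patterns" items above, here is a minimal sketch of a type-state builder. `RequestBuilder`, `NoHost`, and `HasHost` are hypothetical names invented for this example and are not part of Jacquard.

```rust
/// Marker for a builder that has no host yet (hypothetical name).
pub struct NoHost;

/// State that carries the host once it has been provided (hypothetical name).
pub struct HasHost(String);

/// A builder whose state is tracked in the type parameter `H`.
pub struct RequestBuilder<H> {
    host: H,
    path: String,
}

impl RequestBuilder<NoHost> {
    /// Starts a builder in the `NoHost` state.
    pub fn new(path: &str) -> Self {
        RequestBuilder { host: NoHost, path: path.to_owned() }
    }

    /// Providing a host moves the builder into the `HasHost` state.
    pub fn host(self, host: &str) -> RequestBuilder<HasHost> {
        RequestBuilder { host: HasHost(host.to_owned()), path: self.path }
    }
}

impl RequestBuilder<HasHost> {
    /// `build` only exists in the `HasHost` state, so a missing host is a compile error.
    pub fn build(self) -> String {
        format!("https://{}{}", self.host.0, self.path)
    }
}
```

Calling `RequestBuilder::new("/xrpc/...").build()` without `.host(...)` does not compile, which turns a would-be runtime check into a compile-time guarantee.
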
### Documentation

- Use inline comments to explain "why," not just "what".
- Module-level documentation should explain purpose and responsibilities.
- **Always** use periods at the end of code comments.
- **Never** use title case in headings and titles. Always use sentence case.

### Running tests

**CRITICAL**: Always use `cargo nextest run` to run unit and integration tests. Never use `cargo test` for these!

For doctests, use `cargo test --doc` (doctests are not supported by nextest).

## Commit message style

### Format

Commits follow a conventional format with crate-specific scoping:

```
[crate-name] brief description
```

Examples:
- `[jacquard-axum] add oauth extractor impl (#2727)`
- `[jacquard] version 0.9.111`
- `[meta] update MSRV to Rust 1.88 (#2725)`

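The format above is simple enough to check mechanically. The following is a minimal sketch of such a check. `parse_commit_subject` is a hypothetical helper, not something the repository provides, and the exact rules (no whitespace in the scope, a single space after `]`) are assumptions read off the examples.

```rust
/// Splits a commit subject of the form `[scope] description` into its two parts.
/// Returns `None` when the subject does not follow that shape.
pub fn parse_commit_subject(subject: &str) -> Option<(&str, &str)> {
    // The subject has to open with a bracketed scope.
    let rest = subject.strip_prefix('[')?;
    // The scope ends at the first `] `, and the description follows it.
    let (scope, description) = rest.split_once("] ")?;
    if scope.is_empty()
        || scope.contains(char::is_whitespace)
        || description.trim().is_empty()
    {
        return None;
    }
    Some((scope, description))
}
```
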
## Lexicon Code Generation (Safe Commands)

**IMPORTANT**: Always use the `just` commands for code generation to avoid mistakes. These commands handle the correct flags and paths.

### Primary Commands

- `just lex-gen [ARGS]` - **Full workflow**: Fetches lexicons from sources (defined in `lexicons.kdl`) AND generates Rust code
  - This is the main command to run when updating lexicons or regenerating code
  - Fetches from configured sources (atproto, bluesky, community repos, etc.)
  - Automatically runs codegen after fetching
  - **Modifies**: `crates/jacquard-api/lexicons/` and `crates/jacquard-api/src/`
  - Pass args like `-v` for verbose output: `just lex-gen -v`

- `just lex-fetch [ARGS]` - **Fetch only**: Downloads lexicons WITHOUT generating code
  - Safe to run without touching generated Rust files
  - Useful for updating lexicon schemas before reviewing changes
  - **Modifies only**: `crates/jacquard-api/lexicons/`

- `just generate-api` - **Generate only**: Generates Rust code from existing lexicons
  - Uses lexicons already present in `crates/jacquard-api/lexicons/`
  - Useful after manually editing lexicons or after `just lex-fetch`
  - **Modifies only**: `crates/jacquard-api/src/`

## String Type Pattern

All validated string types follow a consistent pattern:
- Constructors and accessors: `new()`, `new_owned()`, `new_static()`, `raw()`, `unchecked()`, `as_str()`
- Traits: `Serialize`, `Deserialize`, `FromStr`, `Display`, `Debug`, `PartialEq`, `Eq`, `Hash`, `Clone`, conversions to/from `String`/`CowStr`/`SmolStr`, `AsRef<str>`, `Deref<Target=str>`
- Implementation notes: Prefer `#[repr(transparent)]` where possible; use `SmolStr` for short strings, `CowStr` for longer; implement or derive `IntoStatic` for owned conversion
- When constructing from a static string, use `new_static()` to avoid unnecessary allocations

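To make the pattern concrete, here is a minimal std-only sketch of a validated string newtype. `DemoId`, `InvalidDemoId`, and the validation rule are invented for illustration, `std::borrow::Cow` stands in for `CowStr`, and the signatures are assumptions rather than Jacquard's actual ones.

```rust
use std::borrow::Cow;
use std::fmt;

/// Hypothetical validated string type; `Cow` stands in for `CowStr`.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct DemoId<'a>(Cow<'a, str>);

/// Error returned when validation fails.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidDemoId;

/// Toy rule assumed for this sketch: non-empty, lowercase ASCII letters, digits, and dots.
fn is_valid(s: &str) -> bool {
    !s.is_empty()
        && s.bytes().all(|b| b.is_ascii_lowercase() || b.is_ascii_digit() || b == b'.')
}

impl<'a> DemoId<'a> {
    /// Validates the input and borrows from it.
    pub fn new(s: &'a str) -> Result<Self, InvalidDemoId> {
        if is_valid(s) { Ok(DemoId(Cow::Borrowed(s))) } else { Err(InvalidDemoId) }
    }

    /// Returns the validated string slice.
    pub fn as_str(&self) -> &str {
        &self.0
    }
}

impl DemoId<'static> {
    /// Validates the input and takes ownership of it.
    pub fn new_owned(s: String) -> Result<Self, InvalidDemoId> {
        if is_valid(&s) { Ok(DemoId(Cow::Owned(s))) } else { Err(InvalidDemoId) }
    }

    /// Validates a `'static` string without allocating.
    pub fn new_static(s: &'static str) -> Result<Self, InvalidDemoId> {
        if is_valid(s) { Ok(DemoId(Cow::Borrowed(s))) } else { Err(InvalidDemoId) }
    }
}

impl fmt::Display for DemoId<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.0)
    }
}
```

Validation happens once, in the constructors, so every `DemoId` value in the program is known to be well-formed.
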
## Lifetimes and Zero-Copy Deserialization

All API types support borrowed deserialization via explicit lifetimes:
- Request/output types: parameterised by `'de` lifetime (e.g., `GetAuthorFeed<'de>`, `GetAuthorFeedOutput<'de>`)
- Fields use `#[serde(borrow)]` where possible (strings, nested objects with lifetimes)
- `CowStr<'a>` enables efficient borrowing from input buffers or owning small strings inline (via `SmolStr`)
- All types implement the `IntoStatic` trait to convert borrowed data to owned (`'static`) variants
- The code generator automatically propagates lifetimes through nested structures

Response lifetime handling:
- `Response::parse()` borrows from the response buffer for zero-copy parsing
- `Response::into_output()` converts to owned data using `IntoStatic`
- `Response::transmute()` reinterprets the response as a different type (used for typed collection responses)
- `parse()` and `into_output()` both provide typed error handling

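The borrow-then-own flow described above can be sketched with the standard library alone. The trait below is a stand-in with an assumed shape, not Jacquard's `IntoStatic` definition, and `Post`, `parse_post`, and `load_owned` are hypothetical names.

```rust
use std::borrow::Cow;

/// Stand-in for an `IntoStatic`-style trait (assumed shape for this sketch).
pub trait IntoStatic {
    /// The owned, `'static` form of the type.
    type Output: 'static;
    /// Converts any borrowed data into owned data.
    fn into_static(self) -> Self::Output;
}

/// A toy record that may borrow its text from an input buffer.
#[derive(Debug, PartialEq)]
pub struct Post<'a> {
    pub text: Cow<'a, str>,
}

impl<'a> IntoStatic for Post<'a> {
    type Output = Post<'static>;

    fn into_static(self) -> Post<'static> {
        // `into_owned` copies only when the data is still borrowed.
        Post { text: Cow::Owned(self.text.into_owned()) }
    }
}

/// "Parses" a post by borrowing straight from the buffer, without copying.
pub fn parse_post(buf: &str) -> Post<'_> {
    Post { text: Cow::Borrowed(buf.trim()) }
}

/// The buffer dies at the end of this function, so the result has to be owned.
pub fn load_owned() -> Post<'static> {
    let buffer = String::from("  hello atproto  ");
    parse_post(&buffer).into_static()
}
```

The borrowed form is cheap while the buffer is alive, and the conversion to `'static` is an explicit, visible step at the point where the data has to outlive it.
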
## API Coverage (jacquard-api)

**NOTE: jacquard does modules a bit differently in API codegen**
- Specifically, it puts '*.defs' codegen output into the corresponding module file (mod_name.rs in parent directory, NOT mod.rs in module directory)
- It also combines the top-level tld and domain ('com.atproto' -> `com_atproto`, etc.)

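The two rules above can be expressed as a small mapping from an NSID to a file path. This is only a sketch of one reading of the notes. `namespace_module` and `defs_file` are hypothetical helpers, not the code generator's real functions, and any case conversion of segment names is left out.

```rust
/// Collapses the first two NSID segments into one module name,
/// for example `com.atproto.repo.defs` becomes `com_atproto`.
pub fn namespace_module(nsid: &str) -> Option<String> {
    let mut parts = nsid.split('.');
    let tld = parts.next()?;
    let domain = parts.next()?;
    Some(format!("{tld}_{domain}"))
}

/// For a `*.defs` NSID, returns the module file that would receive the output:
/// `<module>.rs` in the parent directory rather than `<module>/mod.rs`.
pub fn defs_file(nsid: &str) -> Option<String> {
    let module_path = nsid.strip_suffix(".defs")?;
    let mut path = vec![namespace_module(module_path)?];
    // The remaining segments become nested path components.
    path.extend(module_path.split('.').skip(2).map(str::to_owned));
    Some(format!("{}.rs", path.join("/")))
}
```
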
## Value Types (jacquard-common)

For working with loosely-typed atproto data:
- `Data<'a>`: Validated, typed representation of atproto values
- `RawData<'a>`: Unvalidated raw values from deserialization
- `from_data`, `from_raw_data`, `to_data`, `to_raw_data`: Convert between typed and untyped
- Useful for second-stage deserialization of `type "unknown"` fields (e.g., `PostView.record`)

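Second-stage deserialization means keeping an unknown field as an untyped value first and converting it to a concrete type once its `$type` is known. The sketch below shows that idea with a toy value enum. `Value`, `Post`, and `post_from_value` are hypothetical and much simpler than `Data` and `from_data`.

```rust
use std::collections::BTreeMap;

/// Toy untyped value, standing in for a `Data`-like type.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Text(String),
    Integer(i64),
    Object(BTreeMap<String, Value>),
}

/// Typed view of one record kind.
#[derive(Debug, PartialEq)]
pub struct Post {
    pub text: String,
    pub created_at: String,
}

/// Second stage: check `$type`, then pull the typed fields out of the object.
pub fn post_from_value(value: &Value) -> Option<Post> {
    let Value::Object(map) = value else { return None };
    match map.get("$type")? {
        Value::Text(t) if t.as_str() == "app.bsky.feed.post" => {}
        _ => return None,
    }
    let Value::Text(text) = map.get("text")? else { return None };
    let Value::Text(created_at) = map.get("createdAt")? else { return None };
    Some(Post { text: text.clone(), created_at: created_at.clone() })
}
```
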
Collection types:
- `Collection` trait: Marker trait for record types with `NSID` constant and `Record` associated type
- `RecordError<'a>`: Generic error type for record retrieval operations (RecordNotFound, Unknown)

## Lifetime Design Pattern

Jacquard uses a specific pattern to enable zero-copy deserialization while avoiding HRTB issues and async lifetime problems:

**GATs on associated types** instead of trait-level lifetimes:
```rust
trait XrpcResp {
    type Output<'de>: Deserialize<'de> + IntoStatic; // GAT, not trait-level lifetime.
}
```

**Method-level generic lifetimes** for trait methods that need them:
```rust
fn extract_vec<'s>(output: Self::Output<'s>) -> Vec<Item>;
```

**Response wrapper owns buffer** to solve async lifetime issues:
```rust
async fn get_record<R>(&self, rkey: K) -> Result<Response<R>>;
// Caller chooses: response.parse() (borrow) or response.into_output() (owned).
```

This pattern avoids `for<'any> Trait<'any>` bounds (which force `DeserializeOwned` semantics) while giving callers control over borrowing vs owning. See `jacquard-common` crate docs for detailed explanation.

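The three pieces fit together as in the following self-contained sketch, which uses only the standard library. `Decode` is a stand-in for `Deserialize<'de>`, and `Resp`, `Response`, `Echo`, and `EchoOutput` are simplified, hypothetical versions of the real types.

```rust
use std::marker::PhantomData;

/// Minimal stand-in for `Deserialize<'de>`: builds a value that may borrow from `buf`.
pub trait Decode<'de>: Sized {
    fn decode(buf: &'de [u8]) -> Option<Self>;
}

/// Response marker with a GAT, so the borrow lifetime is chosen per call.
pub trait Resp {
    type Output<'de>: Decode<'de>;
}

/// Owns the bytes, so no borrow has to cross an `await` point.
pub struct Response<R: Resp> {
    buffer: Vec<u8>,
    _marker: PhantomData<R>,
}

impl<R: Resp> Response<R> {
    /// Wraps a received body.
    pub fn new(buffer: Vec<u8>) -> Self {
        Response { buffer, _marker: PhantomData }
    }

    /// Borrowing parse: the output cannot outlive `&self`.
    pub fn parse<'s>(&'s self) -> Option<R::Output<'s>> {
        <R::Output<'s> as Decode<'s>>::decode(&self.buffer)
    }
}

/// Toy endpoint whose output borrows a string slice from the buffer.
pub struct Echo;

/// Output type for `Echo`.
#[derive(Debug, PartialEq)]
pub struct EchoOutput<'de> {
    pub message: &'de str,
}

impl<'de> Decode<'de> for EchoOutput<'de> {
    fn decode(buf: &'de [u8]) -> Option<Self> {
        std::str::from_utf8(buf).ok().map(|message| EchoOutput { message })
    }
}

impl Resp for Echo {
    type Output<'de> = EchoOutput<'de>;
}
```

Because `Resp` itself has no lifetime parameter, `Response<Echo>` can be named and returned without any `for<'a>` bound, and the lifetime only appears when `parse` is called.
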
## WASM Compatibility

Core crates (`jacquard-common`, `jacquard-api`, `jacquard-identity`, `jacquard-oauth`) support `wasm32-unknown-unknown` target compilation.

Implementation approach:
- **`trait-variant`**: Traits use `#[cfg_attr(not(target_arch = "wasm32"), trait_variant::make(Send))]` to conditionally exclude `Send` bounds on WASM
- **Trait methods with `Self: Sync` bounds**: Duplicated as platform-specific versions (`#[cfg(not(target_arch = "wasm32"))]` vs `#[cfg(target_arch = "wasm32")]`)
- **Helper functions**: Extracted to free functions with platform-specific versions to avoid code duplication
- **Feature gating**: Platform-specific features (e.g., DNS resolution, tokio runtime detection) properly gated behind `cfg` attributes

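The "platform-specific versions" of a helper can look like the sketch below, where the same function name has a `Send` bound on native targets and none on `wasm32`. `assert_spawnable` and `target_label` are hypothetical names chosen for this example.

```rust
/// Native targets: values handed to a multi-threaded runtime usually need `Send`.
#[cfg(not(target_arch = "wasm32"))]
pub fn assert_spawnable<T: Send>(value: T) -> T {
    value
}

/// WASM is single-threaded here, so the `Send` bound is dropped.
#[cfg(target_arch = "wasm32")]
pub fn assert_spawnable<T>(value: T) -> T {
    value
}

/// Reports which of the two versions was compiled in.
pub fn target_label() -> &'static str {
    if cfg!(target_arch = "wasm32") { "wasm32" } else { "native" }
}
```

Callers use one name on every platform, and only the bound differs, which keeps the `cfg` noise out of the calling code.
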
Test WASM compilation:
```bash
just check-wasm
# or: cargo build --target wasm32-unknown-unknown -p jacquard-common --no-default-features
```

## Client Architecture

### XRPC Request/Response Layer

Core traits:
- `XrpcRequest`: Defines NSID, method (Query/Procedure), and associated Response type
  - `encode_body()` for request serialization (default: JSON; override for CBOR/multipart)
  - `decode_body(&'de [u8])` for request deserialization (server-side)
- `XrpcResp`: Response marker trait with NSID, encoding, Output/Err types
- `XrpcEndpoint`: Server-side trait with PATH, METHOD, and associated Request/Response types
- `XrpcClient`: Stateful trait with `base_uri()`, `opts()`, and `send()` method
  - **This should be your primary interface point with the crate, along with the Agent___ traits**
- `XrpcExt`: Extension trait providing stateless `.xrpc(base)` builder on any `HttpClient`

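To show how an NSID and a method kind are enough to address an endpoint, here is a minimal sketch of a request trait. `DemoRequest`, `Method`, and `request_line` are hypothetical and far smaller than `XrpcRequest`. The only protocol facts relied on are that XRPC queries use HTTP GET, procedures use POST, and endpoints live under `/xrpc/<nsid>`.

```rust
/// The two XRPC method kinds.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Method {
    Query,
    Procedure,
}

impl Method {
    /// Queries map to HTTP GET and procedures to HTTP POST.
    pub fn http_verb(self) -> &'static str {
        match self {
            Method::Query => "GET",
            Method::Procedure => "POST",
        }
    }
}

/// Hypothetical, reduced request trait carrying only the NSID and method.
pub trait DemoRequest {
    const NSID: &'static str;
    const METHOD: Method;
}

/// Marker type for the `com.atproto.server.describeServer` query.
pub struct DescribeServer;

impl DemoRequest for DescribeServer {
    const NSID: &'static str = "com.atproto.server.describeServer";
    const METHOD: Method = Method::Query;
}

/// Builds the HTTP verb and URL for a request type against a base URI.
pub fn request_line<R: DemoRequest>(base: &str) -> String {
    format!("{} {}/xrpc/{}", R::METHOD.http_verb(), base.trim_end_matches('/'), R::NSID)
}
```
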
### Session Management

`Agent<A: AgentSession>` wrapper supports:
- `CredentialSession<S, T>`: App-password (Bearer) authentication with auto-refresh
  - Uses `SessionStore` trait implementers for token persistence (`MemorySessionStore`, `FileAuthStore`)
- `OAuthSession<T, S>`: DPoP-bound OAuth with nonce handling
  - Uses `ClientAuthStore` trait implementers for state/token persistence

Session traits:
- `AgentSession`: common interface for both session types
- `AgentKind`: enum distinguishing AppPassword vs OAuth
- Both sessions implement `HttpClient` and `XrpcClient` for uniform API
- `AgentSessionExt` extension trait includes several helpful methods for atproto record operations.
  - **This trait is implemented automatically for anything that implements both `AgentSession` and `IdentityResolver`**

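"Implemented automatically" here refers to a blanket implementation. The sketch below shows the mechanism with toy traits. `Session`, `Resolver`, `SessionExt`, and `DemoAgent` are hypothetical stand-ins, not the real `AgentSession`, `IdentityResolver`, or `AgentSessionExt`.

```rust
/// Toy stand-in for a session trait.
pub trait Session {
    fn did(&self) -> String;
}

/// Toy stand-in for a resolver trait.
pub trait Resolver {
    fn pds_for(&self, did: &str) -> String;
}

/// Extension trait whose methods are built from the two base traits.
pub trait SessionExt: Session + Resolver {
    /// Describes where the current account is hosted.
    fn home_pds(&self) -> String {
        let did = self.did();
        format!("{} -> {}", did, self.pds_for(&did))
    }
}

/// Blanket impl: every type with both base traits gets `SessionExt` for free.
impl<T: Session + Resolver> SessionExt for T {}

/// A type that implements only the two base traits.
pub struct DemoAgent;

impl Session for DemoAgent {
    fn did(&self) -> String {
        "did:plc:example".to_owned()
    }
}

impl Resolver for DemoAgent {
    fn pds_for(&self, _did: &str) -> String {
        "https://pds.example.com".to_owned()
    }
}
```

`DemoAgent` never mentions `SessionExt`, yet `DemoAgent.home_pds()` is available wherever the extension trait is in scope.
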
## Identity Resolution

`JacquardResolver` (default) and custom resolvers implement `IdentityResolver` + `OAuthResolver`:
- Handle → DID: DNS TXT (feature `dns`, or via Cloudflare DoH), HTTPS well-known, PDS XRPC, public fallbacks
- DID → Doc: did:web well-known, PLC directory, PDS XRPC
- OAuth metadata: `.well-known/oauth-protected-resource` and `.well-known/oauth-authorization-server`
- Resolvers use stateless XRPC calls (no auth required for public resolution endpoints)

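For orientation, the lookup locations behind the first two bullets can be written down as plain string builders. This is a sketch, not the resolver's code. The function names are hypothetical, `https://plc.directory` is assumed as the PLC directory host, and `did:web` identifiers with ports or paths are deliberately left out.

```rust
/// DNS name whose TXT record carries `did=<did>` for a handle.
pub fn dns_txt_name(handle: &str) -> String {
    format!("_atproto.{handle}")
}

/// HTTPS well-known location that returns the DID as plain text.
pub fn https_well_known_url(handle: &str) -> String {
    format!("https://{handle}/.well-known/atproto-did")
}

/// Where to fetch the DID document for host-only `did:web` and for `did:plc`.
pub fn did_doc_url(did: &str) -> Option<String> {
    if let Some(host) = did.strip_prefix("did:web:") {
        // Ports (percent-encoded) and paths (extra colons) are out of scope for this sketch.
        if host.is_empty() || host.contains(':') || host.contains('%') {
            return None;
        }
        Some(format!("https://{host}/.well-known/did.json"))
    } else if did.starts_with("did:plc:") {
        // Assumes the public PLC directory.
        Some(format!("https://plc.directory/{did}"))
    } else {
        None
    }
}
```
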
## Streaming Support

### HTTP Streaming

Feature: `streaming`

Core types in `jacquard-common`:
- `ByteStream` / `ByteSink`: Platform-agnostic stream wrappers (uses n0-future)
- `StreamError`: Concrete error type with Kind enum (Transport, Closed, Protocol)
- `HttpClientExt`: Trait extension for streaming methods
- `StreamingResponse`: XRPC streaming response wrapper

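A "concrete error type with a kind enum" usually has the shape sketched below. `DemoStreamError` and its constructors are hypothetical. Only the three kind names come from the list above, and the real `StreamError` may differ in fields and methods.

```rust
use std::error::Error;
use std::fmt;

/// The three kinds named in the list above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StreamErrorKind {
    Transport,
    Closed,
    Protocol,
}

/// Concrete error: a kind plus an optional underlying cause.
#[derive(Debug)]
pub struct DemoStreamError {
    kind: StreamErrorKind,
    source: Option<Box<dyn Error + Send + Sync + 'static>>,
}

impl DemoStreamError {
    /// The stream ended without an underlying cause.
    pub fn closed() -> Self {
        Self { kind: StreamErrorKind::Closed, source: None }
    }

    /// Wraps a lower-level transport failure.
    pub fn transport(source: impl Error + Send + Sync + 'static) -> Self {
        Self { kind: StreamErrorKind::Transport, source: Some(Box::new(source)) }
    }

    /// Lets callers branch on the kind without downcasting.
    pub fn kind(&self) -> StreamErrorKind {
        self.kind
    }
}

impl fmt::Display for DemoStreamError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self.kind {
            StreamErrorKind::Transport => f.write_str("transport error"),
            StreamErrorKind::Closed => f.write_str("stream closed"),
            StreamErrorKind::Protocol => f.write_str("protocol error"),
        }
    }
}

impl Error for DemoStreamError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.source.as_deref().map(|e| e as &(dyn Error + 'static))
    }
}
```
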
### WebSocket Support

Feature: `websocket` (requires `streaming`)
- `WebSocketClient` trait (independent from `HttpClient`)
- `WebSocketConnection` with tx/rx `ByteSink`/`ByteStream`
- tokio-tungstenite-wasm used to abstract across native + wasm

**Known gaps:**
- Service auth replay protection (jti tracking)
- Video upload helpers (upload + job polling)
- Additional session storage backends (SQLite, etc.)
- PLC operations
- OAuth extractor for Axum