# How to make atproto actually easy

Jacquard is a Rust library, or rather a suite of libraries, intended to make it much simpler to get started with atproto development, without sacrificing flexibility or performance. How it does that is relatively clever, and I think it benefits from some explaining, because it doesn't really come across in descriptions like "a better Rust atproto library, with much less boilerplate". Descriptions like those especially don't communicate that Jacquard is not simpler because someone wrote all the code for you, or had Claude do it. Jacquard is simpler because it is designed in a way which makes simple the things that almost every other atproto library seems to make difficult.

> The [Jacquard machine](https://en.wikipedia.org/wiki/Jacquard_machine) was one of the earliest devices you might call "programmable" in the sense we normally mean, allowing a series of punched cards to automatically control a mechanical weaving loom.

First, let's talk boilerplate. An extremely common thing for people writing code for atproto to have to do is to write friendly helper methods over API bindings generated from lexicons. In the official Bluesky TypeScript library you get a couple of layers of `**Agent` wrapper classes which provide convenient helpers for common methods, mostly hand-written, because the autogenerated API bindings are verbose to call and don't necessarily handle all the eventualities. There is a *lot* of code dedicated to handling updates to Bluesky preferences. Among the worst for required boilerplate is ATrium, the most widely-used set of Rust libraries for atproto, which mirrors the TypeScript SDK in many ways, not all good. This results in pretty much anyone using ATrium needing to implement their own more ergonomic helpers, and often reimplementing chunks of the library for things like session management (particularly if they want to use their own lexicons), because certain important internal types aren't exported. This is boilerplate, and while LLMs are often pretty good at writing it for you these days, it still clutters your codebase.

The problem with needing handwritten helpers to do things conveniently is that when you venture off the beaten path you end up needing to reinvent the wheel a lot. This is a big barrier for people looking to "just do things" on atproto. You need to figure out OAuth, you need to write all those convenience functions, etc., especially if you're working with your own lexicons rather than just using Bluesky's.

>There are other libraries which handle some of these things better, but nothing (especially not in Rust) which got all the way there in a way that fit how I like to work, and how I think a lot of other Rust developers would like to work. Jacquard is the answer to the question a lot of my Rust atproto developer friends were asking.

Here's the canonical example. Compare to the ATrium [Bluesky SDK](https://docs.rs/bsky-sdk/latest/bsky_sdk/index.html#moderation) example, which doesn't handle OAuth. There are some convenient helpers used here to elide OAuth setup stuff (helpers which [ATrium's OAuth implementation](https://github.com/atrium-rs/atrium/blob/main/atrium-oauth/README.md) lacks) but even [without those](https://tangled.org/@nonbinary.computer/jacquard/blob/main/examples/oauth_timeline.rs), [it's](https://tangled.org/@nonbinary.computer/jacquard/blob/main/crates/jacquard-oauth/src/loopback.rs) [not](https://docs.rs/jacquard-oauth/0.4.0/jacquard_oauth/atproto/struct.AtprotoClientMetadata.html) [that](https://docs.rs/jacquard-oauth/0.4.0/jacquard_oauth/client/struct.OAuthClient.html) [verbose](https://docs.rs/jacquard-oauth/0.4.0/jacquard_oauth/client/struct.OAuthSession.html), and the actual main action, fetching the timeline, is simply calling a single function with a generated API struct, then handling the result. Nothing here is Bluesky-specific that wasn't generated in seconds by Jacquard's lexicon API [code generation](https://tangled.org/@nonbinary.computer/jacquard/tree/main/crates/jacquard-lexicon).

```rust
#[tokio::main]
async fn main() -> miette::Result<()> {
    let args = Args::parse();
    // Build an OAuth client with file-backed auth store and default localhost config
    let oauth = OAuthClient::with_default_config(FileAuthStore::new(&args.store));
    // Authenticate with a PDS, using a loopback server to handle the callback flow
    let session = oauth
        .login_with_local_server(
            args.input.clone(),
            Default::default(),
            LoopbackConfig::default(),
        )
        .await?;
    // Wrap in Agent and fetch the timeline
    let agent: Agent<_> = Agent::from(session);
    let timeline = agent
        .send(GetTimeline::new().limit(5).build())
        .await?
        .into_output()?;
    for (i, post) in timeline.feed.iter().enumerate() {
        println!("\n{}. by @{}", i + 1, post.post.author.handle);
        println!(
            " {}",
            serde_json::to_string_pretty(&post.post.record).into_diagnostic()?
        );
    }
    Ok(())
}
```
## Just `.send()` it

Jacquard has a couple of `.send()` methods. One is stateless: it's the output of a method that creates a request builder, implemented as an extension trait, `XrpcExt`, on any HTTP client which implements a very simple `HttpClient` trait. You can use a bare `reqwest::Client` to make XRPC requests. You call `.xrpc(base_url)` and get an `XrpcCall` struct. `XrpcCall` is a builder, which allows you to pass authentication, atproto proxy settings, and labeler headers, and to set other options for the final request. There's also a similar trait, `DpopExt`, in the `jacquard-oauth` crate, which handles that form of authenticated request in a similar way. For basic stuff, this works great, and it's a useful building block for more complex logic, or when one size does **not** in fact fit all.

```rust
use jacquard_common::xrpc::XrpcExt;
use jacquard_common::http_client::HttpClient;
// ...
let http = reqwest::Client::new();
let base = url::Url::parse("https://public.api.bsky.app")?;
let resp = http.xrpc(base).send(&request).await?;
```
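To make the extension-trait-plus-builder shape concrete, here is a rough, std-only sketch of the pattern. It is not Jacquard's code: the `HttpClient`, `XrpcCall`, and `XrpcExt` below only borrow the names, are synchronous rather than async, and pass strings around instead of real HTTP types. `EchoClient` and the `.auth()` method are made up so the sketch can run offline. The point is just the blanket impl at the bottom, which is what lets any client that satisfies the small trait pick up `.xrpc()` for free.

```rust
/// Minimal stand-in for an HTTP client: takes a URL and headers, returns a body.
/// (Hypothetical and synchronous; only the name is borrowed from the post.)
trait HttpClient {
    fn send_http(&self, url: &str, headers: &[(String, String)]) -> String;
}

/// Per-call request builder, created by `XrpcExt::xrpc`.
struct XrpcCall<'a, C: HttpClient> {
    client: &'a C,
    base: String,
    headers: Vec<(String, String)>,
}

impl<'a, C: HttpClient> XrpcCall<'a, C> {
    /// Attach a bearer token to this call only (hypothetical option).
    fn auth(mut self, token: &str) -> Self {
        self.headers
            .push(("authorization".to_string(), format!("Bearer {token}")));
        self
    }

    /// Consume the builder and perform the request for the given NSID.
    fn send(self, nsid: &str) -> String {
        let url = format!("{}/xrpc/{}", self.base.trim_end_matches('/'), nsid);
        self.client.send_http(&url, &self.headers)
    }
}

/// Extension trait: adds `.xrpc(base)` to every `HttpClient`.
trait XrpcExt: HttpClient + Sized {
    fn xrpc(&self, base: &str) -> XrpcCall<'_, Self> {
        XrpcCall {
            client: self,
            base: base.to_string(),
            headers: Vec::new(),
        }
    }
}

// Blanket impl: any HTTP client gets the extension method without opting in.
impl<C: HttpClient> XrpcExt for C {}

/// A fake client that echoes what it was asked to do, so nothing touches the network.
struct EchoClient;

impl HttpClient for EchoClient {
    fn send_http(&self, url: &str, headers: &[(String, String)]) -> String {
        format!("GET {url} with {} header(s)", headers.len())
    }
}
```
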
The other, `XrpcClient`, is stateful, and can be implemented on anything with a bit of internal state to store the base URI (the URL of the PDS being contacted) and the default options. It's the one you're most likely to interact with doing normal atproto API client stuff. The `Agent` struct in the initial example implements that trait, as does the session struct it wraps, and the `.send()` method used is that trait method.

>`XrpcClient` implementers don't *have* to implement token auto-refresh and so on, but realistically they *should* implement at least a basic version. There is an `AgentSession` trait which does require full session/state management.

Here is the entire text of `XrpcCall::send()`. [`build_http_request()`](https://tangled.org/@nonbinary.computer/jacquard/blob/main/crates/jacquard-common/src/xrpc.rs#L400) and [`process_response()`](https://tangled.org/@nonbinary.computer/jacquard/blob/main/crates/jacquard-common/src/xrpc.rs#L344) are public functions and can be used in other crates. The first does more or less what it says on the tin. The second does less than you might think. It mostly surfaces authentication errors at an earlier level so you don't have to fully parse the response to know if there was an error or not.

```rust
pub async fn send<R>(
    self,
    request: &R,
) -> XrpcResult<Response<<R as XrpcRequest>::Response>>
where
    R: XrpcRequest,
{
    let http_request = build_http_request(&self.base, request, &self.opts)
        .map_err(TransportError::from)?;
    let http_response = self
        .client
        .send_http(http_request)
        .await
        .map_err(|e| TransportError::Other(Box::new(e)))?;
    process_response(http_response)
}
```
>A core goal of Jacquard is not only to provide an easy interface to atproto, but also to make it very easy to build something that fits your needs. Making "helper" functions like those part of the API surface is a big part of that, as are "stateless" implementations like `XrpcExt` and `XrpcCall`.

`.send()` works for any endpoint and any type that implements the required traits, regardless of what crate it's defined in. There's no `KnownRecords` enum which defines a complete set of known records, no restriction of service endpoints in the agent/client, and, as much as possible, nothing that privileges any set of lexicons or way of working with the library. There's one primary method and you can put pretty much anything relevant into it. Whatever atproto API you need to call, just `.send()` it. Okay, there are a couple of additional helpers, but we're focusing on the core one, because pretty much everything else is just wrapping the above `send()` in one way or another, and they use the same pattern.

## Punchcard Instructions

So how does this work? How do `send()` and its helper functions know what to do? The answer shouldn't be surprising to anyone familiar with Rust. It's traits! Specifically, the following traits, which have generated implementations for every lexicon type ingested by Jacquard's API code generation, but which honestly aren't hard to just implement yourself (more tedious than anything). `XrpcResp` is always implemented on a unit/marker struct with no fields. Together, these traits provide all the request-specific instructions to the functions.

```rust
pub trait XrpcRequest: Serialize {
    const NSID: &'static str;
    /// XRPC method (query/GET or procedure/POST)
    const METHOD: XrpcMethod;
    type Response: XrpcResp;
    /// Encode the request body for procedures.
    fn encode_body(&self) -> Result<Vec<u8>, EncodeError> {
        Ok(serde_json::to_vec(self)?)
    }
    /// Decode the request body for procedures. (Used server-side)
    fn decode_body<'de>(body: &'de [u8]) -> Result<Box<Self>, DecodeError>
    where
        Self: Deserialize<'de>,
    {
        let body: Self = serde_json::from_slice(body).map_err(|e| DecodeError::Json(e))?;
        Ok(Box::new(body))
    }
}

pub trait XrpcResp {
    const NSID: &'static str;
    /// Output encoding (MIME type)
    const ENCODING: &'static str;
    type Output<'de>: Deserialize<'de> + IntoStatic;
    type Err<'de>: Error + Deserialize<'de> + IntoStatic;
}
```
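As a feel for what "implement it yourself" involves, here is a stripped-down, std-only sketch for a made-up lexicon. The two traits below are simplified stand-ins that reuse the names above but drop serde, the lifetimes, and the body encoding. `com.example.note.listNotes`, `ListNotes`, the `params()` method, and `describe()` are all hypothetical. What it tries to show is that a generic function needs nothing beyond the trait's constants and associated type to know how to talk to an endpoint it has never heard of.

```rust
/// Simplified stand-ins for the traits above (no serde, no lifetimes).
enum XrpcMethod {
    Query,
    Procedure,
}

trait XrpcResp {
    const NSID: &'static str;
    const ENCODING: &'static str;
}

trait XrpcRequest {
    const NSID: &'static str;
    const METHOD: XrpcMethod;
    type Response: XrpcResp;
    /// Query-string parameters for this request (hypothetical helper).
    fn params(&self) -> Vec<(&'static str, String)>;
}

/// A made-up endpoint from a made-up lexicon.
struct ListNotes {
    limit: u32,
}

/// Marker struct: carries no data, only the response-side "instructions".
struct ListNotesResponse;

impl XrpcResp for ListNotesResponse {
    const NSID: &'static str = "com.example.note.listNotes";
    const ENCODING: &'static str = "application/json";
}

impl XrpcRequest for ListNotes {
    const NSID: &'static str = "com.example.note.listNotes";
    const METHOD: XrpcMethod = XrpcMethod::Query;
    type Response = ListNotesResponse;

    fn params(&self) -> Vec<(&'static str, String)> {
        vec![("limit", self.limit.to_string())]
    }
}

/// Generic over *any* request type: everything it needs comes from the traits.
fn describe<R: XrpcRequest>(base: &str, request: &R) -> String {
    let verb = match R::METHOD {
        XrpcMethod::Query => "GET",
        XrpcMethod::Procedure => "POST",
    };
    let query: Vec<String> = request
        .params()
        .iter()
        .map(|(k, v)| format!("{k}={v}"))
        .collect();
    format!(
        "{verb} {base}/xrpc/{}?{} (accept: {})",
        R::NSID,
        query.join("&"),
        <R::Response as XrpcResp>::ENCODING
    )
}
```
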
Here are the implementations for [`GetTimeline`](https://tangled.org/@nonbinary.computer/jacquard/blob/main/crates/jacquard-api/src/app_bsky/feed/get_timeline.rs). You'll also note that `send()` doesn't return the fully decoded response on success. It returns a `Response` struct which has a generic parameter that must implement the `XrpcResp` trait above. Here's its definition. It's essentially just a cheaply cloneable byte buffer and a type marker.

```rust
pub struct Response<R: XrpcResp> {
    buffer: Bytes,
    status: StatusCode,
    _marker: PhantomData<R>,
}

impl<R: XrpcResp> Response<R> {
    pub fn parse<'s>(
        &'s self,
    ) -> Result<<R as XrpcResp>::Output<'s>, XrpcError<<R as XrpcResp>::Err<'s>>> {
        // Borrowed parsing into Output or Err
    }
    pub fn into_output(
        self,
    ) -> Result<<R as XrpcResp>::Output<'static>, XrpcError<<R as XrpcResp>::Err<'static>>>
    where ...
    { /* Owned parsing into Output or Err */ }
}
```
You decode the response (or the endpoint-specific error) out of this, borrowing from the buffer or taking ownership so you can drop the buffer. There are two reasons for this. One is separation of concerns. By two-staging the parsing, it's easier to distinguish network and authentication problems from application-level errors. The second is lifetimes and borrowed deserialization. This is a bit of a long, technical aside, so if you want to jump over it, skip down to "**So what?**"

---

### Working with Lifetimes and Zero-Copy Deserialization

Jacquard is designed around zero-copy/borrowed deserialization: types like [`Post<'a>`](https://tangled.org/@nonbinary.computer/jacquard/blob/main/crates/jacquard-api/src/app_bsky/feed/post.rs) can borrow strings and other data directly from the response buffer instead of allocating owned copies. This is great for performance, but it creates some interesting challenges, especially in async contexts. So how do you specify the lifetime of the borrow?

The naive approach would be to put a lifetime parameter on the trait itself:

```rust
trait NaiveXrpcRequest<'de> {
    type Output: Deserialize<'de>;
    // ...
}
```

This looks reasonable until you try to use it in a generic context. If you have a function that works with *any* lifetime, you need a higher-ranked trait bound (HRTB):

```rust
fn parse<R>(response: &[u8]) ... // return type
where
    R: for<'any> NaiveXrpcRequest<'any>
{ /* deserialize from response... */ }
```

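To see what a bound like that rules out in practice, here is a small std-only sketch. `Decode<'de>` is a made-up stand-in for serde's `Deserialize<'de>`, and `OwnedName`, `BorrowedName`, and `decode_hrtb` are hypothetical. The owned type satisfies the higher-ranked bound; the borrowing type does not, and uncommenting the last call should be rejected with an "implementation of `Decode` is not general enough" error.

```rust
/// Stand-in for serde's `Deserialize<'de>`: build a value from a buffer that lives for `'de`.
trait Decode<'de>: Sized {
    fn decode(buf: &'de str) -> Self;
}

/// Owns its data, so it can be decoded from a buffer of any lifetime.
#[derive(Debug, PartialEq)]
struct OwnedName(String);

impl<'de> Decode<'de> for OwnedName {
    fn decode(buf: &'de str) -> Self {
        OwnedName(buf.trim().to_string())
    }
}

/// Borrows from the buffer, so it only implements `Decode` for its own lifetime.
#[derive(Debug, PartialEq)]
struct BorrowedName<'a>(&'a str);

impl<'a> Decode<'a> for BorrowedName<'a> {
    fn decode(buf: &'a str) -> Self {
        BorrowedName(buf.trim())
    }
}

/// HRTB version: `T` must be decodable from a buffer of *every* lifetime.
fn decode_hrtb<T>(buf: &str) -> T
where
    T: for<'any> Decode<'any>,
{
    T::decode(buf)
}

/// Only the owned type can go through the HRTB function.
fn demo_owned() -> OwnedName {
    decode_hrtb::<OwnedName>("  alice.example.com  ")
    // decode_hrtb::<BorrowedName>("  alice.example.com  ") // does not compile
}
```
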
The `for<'any>` bound says "this type must implement the trait for *every possible lifetime*", which, for `Deserialize`, is effectively the same as requiring `DeserializeOwned`. You've probably just thrown away your zero-copy optimization, and furthermore that trait bound just straight-up won't work on most of the types in Jacquard. The vast majority of them have either a custom `Deserialize` implementation which will borrow if it can, a `#[serde(borrow)]` attribute on one or more fields, or an equivalent lifetime bound attribute, associated with the `Deserialize` derive macro. You will get "Deserialize implementation not general enough" if you try. And no, you cannot have an additional deserialize implementation for the `'static` lifetime due to how serde works.

If you instead try something like the below function signature and specify a specific lifetime, it will compile in isolation, but when you go to use it, the Rust compiler will not generally be able to figure out the lifetimes at the call site, and will complain about things being dropped while still borrowed, even if you convert the response to an owned/`'static` lifetime version of the type.

```rust
fn parse<'s, R: NaiveXrpcRequest<'s>>(response: &'s [u8]) ... // return type with the same lifetime
{ /* deserialize from response... */ }
```

It gets worse with async. If you want to return borrowed data from an async method, where does the lifetime come from? The response buffer needs to outlive the borrow, but the buffer is either consumed or potentially has to have an unbounded lifetime. You end up with confusing and frustrating errors because the compiler can't prove the buffer will stay alive or that you have taken ownership of the parts of it you care about. And even if you don't return borrowed data, holding anything across an await point makes determining bounds for things like the `Send` auto trait (important if you're working with crates like Axum) impossible for the compiler. You *could* do some lifetime laundering with `unsafe`, but that road leads to potential soundness issues. Besides, you don't actually *need* to tell `rustc` to "trust me, bro": you can, with some cleverness, explain this to the compiler in a way that it can reason about perfectly well.

#### Explaining where the buffer goes to `rustc`

The fix is to use Generic Associated Types (GATs) for the trait's associated types, while keeping the trait itself lifetime-free:

```rust
pub trait XrpcResp {
    const NSID: &'static str;
    /// Output encoding (MIME type)
    const ENCODING: &'static str;
    type Output<'de>: Deserialize<'de> + IntoStatic;
    type Err<'de>: Error + Deserialize<'de> + IntoStatic;
}
```

Now you can write trait bounds without HRTBs, and with lifetime bounds that are actually possible for Jacquard's borrowed deserializing types to meet:

```rust
fn parse<'s, R: XrpcResp>(response: &'s [u8]) /* return type with same lifetime */ {
    // Compiler can pick a concrete lifetime for R::Output<'_> or have it specified easily
}
```

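For a version of that signature that actually compiles and runs, here is a minimal std-only sketch of the same shape. `Decode<'de>` again stands in for `Deserialize<'de>`, `Resp` is a cut-down imitation of `XrpcResp`, and `Profile`, `GetProfileResponse`, and the `handle=<value>` wire format are invented for the example. Note that the bound on `parse` is just `R: Resp`, with no `for<'a>` anywhere, and the output still borrows from the input buffer.

```rust
/// Stand-in for `Deserialize<'de>` (hypothetical, std-only).
trait Decode<'de>: Sized {
    fn decode(buf: &'de str) -> Option<Self>;
}

/// Lifetime-free marker trait; only the associated type is generic over `'de`.
trait Resp {
    type Output<'de>: Decode<'de>;
}

/// A borrowed output type: `handle` points into the response buffer.
#[derive(Debug, PartialEq)]
struct Profile<'a> {
    handle: &'a str,
}

impl<'a> Decode<'a> for Profile<'a> {
    fn decode(buf: &'a str) -> Option<Self> {
        // Toy wire format: "handle=<value>"
        buf.strip_prefix("handle=").map(|handle| Profile { handle })
    }
}

/// Marker struct tying the endpoint to its output type.
struct GetProfileResponse;

impl Resp for GetProfileResponse {
    type Output<'de> = Profile<'de>;
}

/// No HRTB needed: the output borrows for exactly as long as `response` lives.
fn parse<'s, R: Resp>(response: &'s str) -> Option<R::Output<'s>> {
    <R::Output<'s> as Decode<'s>>::decode(response)
}
```
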
Methods that need lifetimes use method-level generic parameters:

```rust
// This is part of a trait from jacquard itself, used to genericize updates to things like the Bluesky
// preferences union, so that if you implement a similar lexicon type in your app, you don't have
// to special-case it. Instead you can do a relatively simple trait implementation and then call
// .update_vec() with a modifier function or .update_vec_item() with a single item you want to set.

pub trait VecUpdate {
    type GetRequest: XrpcRequest;
    type PutRequest: XrpcRequest;
    // ... more stuff

    // Method-level lifetime, not trait-level
    fn extract_vec<'s>(
        output: <<Self::GetRequest as XrpcRequest>::Response as XrpcResp>::Output<'s>,
    ) -> Vec<Self::Item>;
    // ... more stuff
}
```

The compiler can monomorphize for concrete lifetimes instead of trying to prove bounds hold for *all* lifetimes at once, or struggling to figure out when you're done with a buffer. `XrpcResp` being separate and lifetime-free lets async methods like `.send()` return a `Response` that owns the response buffer, and then the *caller* decides the lifetime strategy:

```rust
// Zero-copy: borrow from the owned buffer
let output: R::Output<'_> = response.parse()?;

// Owned: convert to 'static via IntoStatic
let output: R::Output<'static> = response.into_output()?;
```

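Putting the pieces together, here is a hedged, std-only sketch of a response wrapper that owns its buffer and lets the caller pick borrowed or owned output. It is a toy, not Jacquard's implementation: the buffer is a `String` rather than `Bytes`, there is no error type, `Post` and `GetPostResponse` are invented, and the `IntoStatic` trait here is an assumed shape. To keep the bounds simple, the toy `Resp` trait carries its own `detach` method instead of the `IntoStatic` bound on the associated type that the real trait uses.

```rust
use std::borrow::Cow;
use std::marker::PhantomData;

/// Convert a possibly-borrowed value into one that owns all its data (assumed shape).
trait IntoStatic {
    type Output: 'static;
    fn into_static(self) -> Self::Output;
}

/// Toy output type that can either borrow from a buffer or own its text.
#[derive(Debug, PartialEq)]
struct Post<'a> {
    text: Cow<'a, str>,
}

impl<'a> IntoStatic for Post<'a> {
    type Output = Post<'static>;
    fn into_static(self) -> Post<'static> {
        Post { text: Cow::Owned(self.text.into_owned()) }
    }
}

/// Lifetime-free marker trait with a GAT output and toy decode/detach steps.
trait Resp {
    type Output<'de>;
    fn decode<'de>(buf: &'de str) -> Self::Output<'de>;
    fn detach<'de>(out: Self::Output<'de>) -> Self::Output<'static>;
}

struct GetPostResponse;

impl Resp for GetPostResponse {
    type Output<'de> = Post<'de>;
    fn decode<'de>(buf: &'de str) -> Post<'de> {
        Post { text: Cow::Borrowed(buf.trim()) }
    }
    fn detach<'de>(out: Post<'de>) -> Post<'static> {
        out.into_static()
    }
}

/// Owns the buffer; `R` only records which endpoint the bytes came from.
struct Response<R: Resp> {
    buffer: String,
    _marker: PhantomData<R>,
}

impl<R: Resp> Response<R> {
    fn new(buffer: String) -> Self {
        Response { buffer, _marker: PhantomData }
    }

    /// Zero-copy: the output borrows from `self.buffer`.
    fn parse<'s>(&'s self) -> R::Output<'s> {
        R::decode(self.buffer.as_str())
    }

    /// Owned: decode, then detach from the buffer so the buffer can be dropped.
    fn into_output(self) -> R::Output<'static> {
        let borrowed = R::decode(self.buffer.as_str());
        R::detach(borrowed)
    }
}
```
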
The async method doesn't need to know or care about lifetimes for the most part - it just returns the `Response`. The caller gets full control over whether to use borrowed or owned data. It can even decide after the fact that it doesn't want to parse out the API response type that it asked for. Instead it can call `.parse_data()` or `.parse_raw()` on the response to get loosely typed, validated data or minimally typed, maximally accepting data values out.

## So what?

Well, most importantly, what this means is that people using Jacquard have to write a lot less code, and I, developing Jacquard, also have to write a lot less code to support a wide variety of use cases. Jacquard's code generation handles all the trait implementation housekeeping and marker structs for `jacquard-api` and for the most part you can just use the generated stuff as is. It also means that even if you don't care about zero-copy deserialization or strong typing and just want things to be easy, things are in fact easy. Just put `'static` for your lifetime bounds on potentially borrowed Jacquard types, derive `IntoStatic` and call `.into_static()` to take ownership if needed, and forget about it. Use atproto string types like they're strings. Use loosely typed data values that actually know about atproto primitives like `at://` URIs or DIDs, handles, CIDs or blobs rather than just `serde_json::Value` or `ipld_core::ipld::Ipld`. And if you're working with posts from, for example, [Bridgy Fed](https://fed.brid.gy/), which injects extra fields (not in the official Bluesky lexicon) carrying the original ActivityPub data into federated Mastodon posts, you can access those fields easily via the `extra_data` field that the `#[lexicon]` attribute macro adds to record types.

So yeah. If you're writing atproto stuff in Rust, and you don't need stuff that's not implemented yet (like moderation filtering and easy service auth), consider using Jacquard. It's pretty cool. I just released version 0.5.0, which has a number of nice additions and improves the documentation a fair bit. There are a number of [examples](https://tangled.org/@nonbinary.computer/jacquard/tree/main/examples) in the Tangled repository.

And if you got this far and like the library, I do accept [sponsorships](https://github.com/sponsors/orual) on GitHub.