+12
.claude/settings.local.json
+2
-1
.gitignore
+115
CLAUDE.md
···
1
+
# CLAUDE.md
2
+
3
+
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
4
+
5
+
## Project Overview
6
+
7
+
Jacquard is a suite of Rust crates for the AT Protocol (atproto/Bluesky). The project emphasizes spec-compliant, validated, performant baseline types with minimal boilerplate. Key design goals:
8
+
9
+
- Validated AT Protocol types including typed at:// URIs
10
+
- Custom lexicon extension support
11
+
- Lexicon `Value` type for working with unknown atproto data (dag-cbor or json)
12
+
- Using as much or as little of the crates as needed
13
+
14
+
## Workspace Structure
15
+
16
+
This is a Cargo workspace with several crates:
17
+
18
+
- **jacquard-common**: Core AT Protocol types (DIDs, handles, at-URIs, NSIDs, TIDs, CIDs, etc.) and the `CowStr` type for efficient string handling
19
+
- **jacquard-lexicon**: Lexicon parsing and Rust code generation from lexicon schemas
20
+
- **jacquard-api**: Generated API bindings (currently empty/in development)
21
+
- **jacquard-derive**: Derive macros for lexicon structures
22
+
- **jacquard**: Main binary (currently minimal)
23
+
24
+
## Development Commands
25
+
26
+
### Using Nix (preferred)
27
+
```bash
28
+
# Enter dev shell
29
+
nix develop
30
+
31
+
# Build
32
+
nix build
33
+
34
+
# Run
35
+
nix develop -c cargo run
36
+
```
37
+
38
+
### Using Cargo/Just
39
+
```bash
40
+
# Build
41
+
cargo build
42
+
43
+
# Run tests
44
+
cargo test
45
+
46
+
# Run specific test
47
+
cargo test <test_name>
48
+
49
+
# Run specific package tests
50
+
cargo test -p <package_name>
51
+
52
+
# Run
53
+
cargo run
54
+
55
+
# Auto-recompile and run
56
+
just watch [ARGS]
57
+
58
+
# Format and lint all
59
+
just pre-commit-all
60
+
```
61
+
62
+
## String Type Pattern
63
+
64
+
The codebase uses a consistent pattern for validated string types. Each type should have:
65
+
66
+
### Constructors
67
+
- `new()`: Construct from a string slice with appropriate lifetime (borrows)
68
+
- `new_owned()`: Construct from `impl AsRef<str>`, taking ownership
69
+
- `new_static()`: Construct from `&'static str` using `SmolStr`/`CowStr`'s static constructor (no allocation)
70
+
- `raw()`: Same as `new()` but panics instead of returning `Result`
71
+
- `unchecked()`: Same as `new()` but doesn't validate (marked `unsafe`)
72
+
- `as_str()`: Return string slice
73
+
74
+
### Traits
75
+
All string types should implement:
76
+
- `Serialize` + `Deserialize` (custom impl for the latter, sometimes for the former)
77
+
- `FromStr`, `Display`
78
+
- `Debug`, `PartialEq`, `Eq`, `Hash`, `Clone`
79
+
- `From<T> for String`, `CowStr`, `SmolStr`
80
+
- `From<String>`, `From<CowStr>`, `From<SmolStr>`, or `TryFrom` if likely to fail
81
+
- `AsRef<str>`
82
+
- `Deref` with `Target = str` (usually)
83
+
84
+
### Implementation Details
85
+
- Use `#[repr(transparent)]` when possible (exception: at-uri type and components)
86
+
- Use `SmolStr` directly as inner type if most instances will be under 24 bytes
87
+
- Use `CowStr` for longer strings to allow borrowing from input
88
+
- Implement `IntoStatic` trait to take ownership of string types
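
A minimal sketch of the pattern (using `std::borrow::Cow` and a stub `validate()` so the example stands alone; the real types in jacquard-common use `CowStr`/`SmolStr`, spec-derived validation, `AtStrError`-style errors, and custom serde impls):

```rust
use std::{borrow::Cow, fmt, str::FromStr};

/// Stub validation; real types check against the AT Protocol spec.
fn validate(s: &str) -> Result<(), String> {
    if s.is_empty() { Err("empty".into()) } else { Ok(()) }
}

/// Hypothetical validated string type following the pattern above.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
#[repr(transparent)]
pub struct Example<'e>(Cow<'e, str>);

impl<'e> Example<'e> {
    /// Fallible constructor; validates and borrows from the input.
    pub fn new(s: &'e str) -> Result<Self, String> {
        validate(s)?;
        Ok(Self(Cow::Borrowed(s)))
    }

    /// Fallible constructor; validates and takes ownership.
    pub fn new_owned(s: impl AsRef<str>) -> Result<Example<'static>, String> {
        validate(s.as_ref())?;
        Ok(Example(Cow::Owned(s.as_ref().to_owned())))
    }

    /// Like `new()`, but panics instead of returning `Result`.
    pub fn raw(s: &'e str) -> Self {
        Self::new(s).expect("valid string")
    }

    /// Skips validation; the caller is responsible for the invariant.
    pub unsafe fn unchecked(s: &'e str) -> Self {
        Self(Cow::Borrowed(s))
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

impl FromStr for Example<'static> {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Example::new_owned(s)
    }
}

impl fmt::Display for Example<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.as_str())
    }
}

impl AsRef<str> for Example<'_> {
    fn as_ref(&self) -> &str {
        self.as_str()
    }
}
```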
89
+
90
+
## Code Style
91
+
92
+
- Avoid comments for self-documenting code
93
+
- Comments should not detail fixes when refactoring
94
+
- Keep writing within source code and comments professional
95
+
- Prioritize long-term maintainability over implementation speed
96
+
97
+
## Testing
98
+
99
+
- Write test cases for all critical code
100
+
- Tests can be run per-package or workspace-wide
101
+
- Use `cargo test <name>` to run specific tests
102
+
- Current test coverage: 89 tests in jacquard-common
103
+
104
+
## Current State & Next Steps
105
+
106
+
### Completed
107
+
- ✅ Comprehensive validation tests for all core string types (handle, DID, NSID, TID, record key, AT-URI, datetime, language, identifier)
108
+
- ✅ Validated implementations against AT Protocol specs and TypeScript reference implementation
109
+
- ✅ String type interface standardization (Language now has `new_static()`, Datetime has full conversion traits)
110
+
- ✅ Data serialization: Full serialize/deserialize for `Data<'_>`, `Array`, `Object` with format-specific handling (JSON vs CBOR)
111
+
- ✅ CidLink wrapper type with automatic `{"$link": "cid"}` serialization in JSON
112
+
- ✅ Integration test with real Bluesky thread data validates round-trip correctness
113
+
114
+
### Next Steps
115
+
1. **Lexicon Code Generation**: Begin work on lexicon-to-Rust code generation now that core types are stable
+80
-8
Cargo.lock
···
438
438
checksum = "d0881ea181b1df73ff77ffaaf9c7544ecc11e82fba9b5f27b262a3c73a332555"
439
439
440
440
[[package]]
441
+
name = "either"
442
+
version = "1.15.0"
443
+
source = "registry+https://github.com/rust-lang/crates.io-index"
444
+
checksum = "48c757948c5ede0e46177b7add2e67155f70e33c07fea8284df6576da70b3719"
445
+
446
+
[[package]]
441
447
name = "enum_dispatch"
442
448
version = "0.3.13"
443
449
source = "registry+https://github.com/rust-lang/crates.io-index"
···
728
734
checksum = "7943c866cc5cd64cbc25b2e01621d07fa8eb2a1a23160ee81ce38704e97b8ecf"
729
735
730
736
[[package]]
737
+
name = "itertools"
738
+
version = "0.14.0"
739
+
source = "registry+https://github.com/rust-lang/crates.io-index"
740
+
checksum = "2b192c782037fadd9cfa75548310488aabdbf3d2da73885b31bd0abd03351285"
741
+
dependencies = [
742
+
"either",
743
+
]
744
+
745
+
[[package]]
731
746
name = "itoa"
732
747
version = "1.0.15"
733
748
source = "registry+https://github.com/rust-lang/crates.io-index"
···
742
757
]
743
758
744
759
[[package]]
760
+
name = "jacquard-api"
761
+
version = "0.1.0"
762
+
763
+
[[package]]
745
764
name = "jacquard-common"
746
765
version = "0.1.0"
747
766
dependencies = [
···
755
774
"miette",
756
775
"multibase",
757
776
"multihash",
777
+
"num-traits",
758
778
"ouroboros",
759
779
"rand",
760
780
"regex",
···
768
788
]
769
789
770
790
[[package]]
791
+
name = "jacquard-derive"
792
+
version = "0.1.0"
793
+
dependencies = [
794
+
"heck 0.5.0",
795
+
"itertools",
796
+
"jacquard-common",
797
+
"jacquard-lexicon",
798
+
"prettyplease",
799
+
"proc-macro2",
800
+
"quote",
801
+
"serde",
802
+
"serde_json",
803
+
"serde_repr",
804
+
"serde_with",
805
+
"syn 2.0.106",
806
+
]
807
+
808
+
[[package]]
771
809
name = "jacquard-lexicon"
772
810
version = "0.1.0"
811
+
dependencies = [
812
+
"heck 0.5.0",
813
+
"itertools",
814
+
"jacquard-common",
815
+
"prettyplease",
816
+
"proc-macro2",
817
+
"quote",
818
+
"serde",
819
+
"serde_json",
820
+
"serde_repr",
821
+
"serde_with",
822
+
"syn 2.0.106",
823
+
]
773
824
774
825
[[package]]
775
826
name = "js-sys"
···
958
1009
]
959
1010
960
1011
[[package]]
1012
+
name = "prettyplease"
1013
+
version = "0.2.37"
1014
+
source = "registry+https://github.com/rust-lang/crates.io-index"
1015
+
checksum = "479ca8adacdd7ce8f1fb39ce9ecccbfe93a3f1344b3d0d97f20bc0196208f62b"
1016
+
dependencies = [
1017
+
"proc-macro2",
1018
+
"syn 2.0.106",
1019
+
]
1020
+
1021
+
[[package]]
961
1022
name = "proc-macro-error"
962
1023
version = "1.0.4"
963
1024
source = "registry+https://github.com/rust-lang/crates.io-index"
···
1005
1066
1006
1067
[[package]]
1007
1068
name = "quote"
1008
-
version = "1.0.40"
1069
+
version = "1.0.41"
1009
1070
source = "registry+https://github.com/rust-lang/crates.io-index"
1010
-
checksum = "1885c039570dc00dcb4ff087a89e185fd56bae234ddc7f056a945bf36467248d"
1071
+
checksum = "ce25767e7b499d1b604768e7cde645d14cc8584231ea6b295e9c9eb22c02e1d1"
1011
1072
dependencies = [
1012
1073
"proc-macro2",
1013
1074
]
···
1140
1201
1141
1202
[[package]]
1142
1203
name = "serde"
1143
-
version = "1.0.227"
1204
+
version = "1.0.228"
1144
1205
source = "registry+https://github.com/rust-lang/crates.io-index"
1145
-
checksum = "80ece43fc6fbed4eb5392ab50c07334d3e577cbf40997ee896fe7af40bba4245"
1206
+
checksum = "9a8e94ea7f378bd32cbbd37198a4a91436180c5bb472411e48b5ec2e2124ae9e"
1146
1207
dependencies = [
1147
1208
"serde_core",
1148
1209
"serde_derive",
···
1159
1220
1160
1221
[[package]]
1161
1222
name = "serde_core"
1162
-
version = "1.0.227"
1223
+
version = "1.0.228"
1163
1224
source = "registry+https://github.com/rust-lang/crates.io-index"
1164
-
checksum = "7a576275b607a2c86ea29e410193df32bc680303c82f31e275bbfcafe8b33be5"
1225
+
checksum = "41d385c7d4ca58e59fc732af25c3983b67ac852c1a25000afe1175de458b67ad"
1165
1226
dependencies = [
1166
1227
"serde_derive",
1167
1228
]
1168
1229
1169
1230
[[package]]
1170
1231
name = "serde_derive"
1171
-
version = "1.0.227"
1232
+
version = "1.0.228"
1172
1233
source = "registry+https://github.com/rust-lang/crates.io-index"
1173
-
checksum = "51e694923b8824cf0e9b382adf0f60d4e05f348f357b38833a3fa5ed7c2ede04"
1234
+
checksum = "d540f220d3187173da220f885ab66608367b6574e925011a9353e4badda91d79"
1174
1235
dependencies = [
1175
1236
"proc-macro2",
1176
1237
"quote",
···
1201
1262
"ryu",
1202
1263
"serde",
1203
1264
"serde_core",
1265
+
]
1266
+
1267
+
[[package]]
1268
+
name = "serde_repr"
1269
+
version = "0.1.20"
1270
+
source = "registry+https://github.com/rust-lang/crates.io-index"
1271
+
checksum = "175ee3e80ae9982737ca543e96133087cbd9a485eecc3bc4de9c1a37b47ea59c"
1272
+
dependencies = [
1273
+
"proc-macro2",
1274
+
"quote",
1275
+
"syn 2.0.106",
1204
1276
]
1205
1277
1206
1278
[[package]]
+356
codegen_plan.md
···
1
+
# Lexicon Codegen Plan
2
+
3
+
## Goal
4
+
Generate idiomatic Rust types from AT Protocol lexicon schemas with minimal nesting/indirection.
5
+
6
+
## Existing Infrastructure
7
+
8
+
### Already Implemented
9
+
- **lexicon.rs**: Complete lexicon parsing types (`LexiconDoc`, `LexUserType`, `LexObject`, etc)
10
+
- **fs.rs**: Directory walking for finding `.json` lexicon files
11
+
- **schema.rs**: `find_ref_unions()` - collects union fields from a single lexicon
12
+
- **output.rs**: Partial - has string type mapping and doc comment generation
13
+
14
+
### Attribute Macros
15
+
- `#[lexicon]` - adds `extra_data` field to structs
16
+
- `#[open_union]` - adds `Unknown(Data<'s>)` variant to enums
17
+
18
+
## Design Decisions
19
+
20
+
### Module/File Structure
21
+
- NSID `app.bsky.feed.post` → `app_bsky/feed/post.rs`
22
+
- Flat module names (no `app::bsky`, just `app_bsky`)
23
+
- Parent modules: `app_bsky/feed.rs` with `pub mod post;`
24
+
25
+
### Type Naming
26
+
- **Main def**: Use last segment of NSID
27
+
- `app.bsky.feed.post#main` → `Post`
28
+
- **Other defs**: Pascal-case the def name
29
+
- `replyRef` → `ReplyRef`
30
+
- **Union variants**: Use last segment of ref NSID
31
+
- `app.bsky.embed.images` → `Images`
32
+
- Collisions resolved by module path, not type name
33
+
- **No proliferation of `Main` types** like atrium has
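
A sketch of these rules as a helper (`ref_to_path` and `pascal_case` are hypothetical names, plain string handling only; the real generator would emit `syn`/`quote` paths, and whether leaf modules are re-exported, e.g. `embed::Images` vs `embed::images::Images`, is still open):

```rust
/// Map an NSID ref (optionally `nsid#def`) to a flat module path and a type
/// name following the rules above. Assumes a syntactically valid NSID
/// (at least three segments).
fn ref_to_path(r: &str) -> (String, String) {
    let (nsid, frag) = match r.split_once('#') {
        Some((n, f)) => (n, Some(f)),
        None => (r, None),
    };
    let segments: Vec<&str> = nsid.split('.').collect();
    // `app.bsky.feed.post` -> `app_bsky::feed::post`
    let module = format!("{}_{}::{}", segments[0], segments[1], segments[2..].join("::"));
    // Main def uses the last NSID segment; named defs Pascal-case the def name.
    let name = pascal_case(frag.filter(|f| *f != "main").unwrap_or(segments[segments.len() - 1]));
    (module, name)
}

fn pascal_case(s: &str) -> String {
    let mut out = String::new();
    let mut upper = true;
    for c in s.chars() {
        match c {
            '_' | '-' => upper = true,
            c if upper => {
                out.extend(c.to_uppercase());
                upper = false;
            }
            c => out.push(c),
        }
    }
    out
}

// ref_to_path("app.bsky.feed.post")       == ("app_bsky::feed::post".into(), "Post".into())
// ref_to_path("com.example.foo#replyRef") == ("com_example::foo".into(), "ReplyRef".into())
```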
34
+
35
+
### Type Generation
36
+
37
+
#### Records (lexRecord)
38
+
```rust
39
+
#[lexicon]
40
+
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Eq)]
41
+
#[serde(rename_all = "camelCase")]
42
+
pub struct Post<'s> {
43
+
/// Client-declared timestamp...
44
+
pub created_at: Datetime,
45
+
#[serde(skip_serializing_if = "Option::is_none")]
46
+
pub embed: Option<RecordEmbed<'s>>,
47
+
pub text: CowStr<'s>,
48
+
}
49
+
```
50
+
51
+
#### Objects (lexObject)
52
+
Same as records but without `#[lexicon]` if inline/not a top-level def.
53
+
54
+
#### Unions (lexRefUnion)
55
+
```rust
56
+
#[open_union]
57
+
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Eq)]
58
+
#[serde(tag = "$type")]
59
+
pub enum RecordEmbed<'s> {
60
+
#[serde(rename = "app.bsky.embed.images")]
61
+
Images(Box<jacquard_api::app_bsky::embed::Images<'s>>),
62
+
#[serde(rename = "app.bsky.embed.video")]
63
+
Video(Box<jacquard_api::app_bsky::embed::Video<'s>>),
64
+
}
65
+
```
66
+
67
+
- Use `Box<T>` for all variants (handles circular refs)
68
+
- `#[open_union]` adds `Unknown(Data<'s>)` catch-all
69
+
70
+
#### Queries (lexXrpcQuery)
71
+
```rust
72
+
pub struct GetAuthorFeedParams<'s> {
73
+
pub actor: AtIdentifier<'s>,
74
+
pub limit: Option<i64>,
75
+
pub cursor: Option<CowStr<'s>>,
76
+
}
77
+
78
+
pub struct GetAuthorFeedOutput<'s> {
79
+
pub cursor: Option<CowStr<'s>>,
80
+
pub feed: Vec<FeedViewPost<'s>>,
81
+
}
82
+
```
83
+
84
+
- Flat params/output structs
85
+
- No nesting like `Input { params: {...} }`
86
+
87
+
#### Procedures (lexXrpcProcedure)
88
+
Same as queries but with both `Input` and `Output` structs.
89
+
90
+
### Field Handling
91
+
92
+
#### Optional Fields
93
+
- Fields not in `required: []` → `Option<T>`
94
+
- Add `#[serde(skip_serializing_if = "Option::is_none")]`
95
+
96
+
#### Lifetimes
97
+
- All types have `'a` lifetime for borrowing from input
98
+
- `#[serde(borrow)]` where needed for zero-copy
99
+
100
+
#### Type Mapping
101
+
- `LexString` with format → specific types (`Datetime`, `Did`, etc)
102
+
- `LexString` without format → `CowStr<'a>`
103
+
- `LexInteger` → `i64`
104
+
- `LexBoolean` → `bool`
105
+
- `LexBytes` → `Bytes`
106
+
- `LexCidLink` → `CidLink<'a>`
107
+
- `LexBlob` → `Blob<'a>`
108
+
- `LexRef` → resolve to actual type path
109
+
- `LexRefUnion` → generate enum
110
+
- `LexArray` → `Vec<T>`
111
+
- `LexUnknown` → `Data<'a>`
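
A sketch of that table as a `generate_type()`-style helper (the `FieldTy` enum is a simplified stand-in for the parsed types in lexicon.rs; refs and unions resolve through the corpus and are omitted here):

```rust
use proc_macro2::TokenStream;
use quote::quote;

/// Simplified stand-in for the parsed lexicon field types.
enum FieldTy {
    String { format: Option<&'static str> },
    Integer,
    Boolean,
    Bytes,
    CidLink,
    Blob,
    Unknown,
    Array(Box<FieldTy>),
}

fn generate_type(ty: &FieldTy) -> TokenStream {
    match ty {
        FieldTy::String { format: Some("datetime") } => quote!(Datetime),
        FieldTy::String { format: Some("did") } => quote!(Did<'a>),
        FieldTy::String { format: Some("handle") } => quote!(Handle<'a>),
        // ...remaining string formats elided; unformatted strings fall through...
        FieldTy::String { format: _ } => quote!(CowStr<'a>),
        FieldTy::Integer => quote!(i64),
        FieldTy::Boolean => quote!(bool),
        FieldTy::Bytes => quote!(Bytes),
        FieldTy::CidLink => quote!(CidLink<'a>),
        FieldTy::Blob => quote!(Blob<'a>),
        FieldTy::Unknown => quote!(Data<'a>),
        FieldTy::Array(item) => {
            let inner = generate_type(item);
            quote!(Vec<#inner>)
        }
    }
}
```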
112
+
113
+
### Reference Resolution
114
+
115
+
#### Known Refs
116
+
- Check corpus for ref existence
117
+
- `#ref: "app.bsky.embed.images"` → `jacquard_api::app_bsky::embed::Images<'a>`
118
+
- Handle fragments: `#ref: "com.example.foo#bar"` → `jacquard_api::com_example::foo::Bar<'a>`
119
+
120
+
#### Unknown Refs
121
+
- **In struct fields**: use `Data<'a>` as fallback type
122
+
- **In union variants**: handled by `Unknown(Data<'a>)` variant from `#[open_union]`
123
+
- Optional: log warnings for missing refs
124
+
125
+
## Implementation Phases
126
+
127
+
### Phase 1: Corpus Loading & Registry
128
+
**Goal**: Load all lexicons into memory for ref resolution
129
+
130
+
**Tasks**:
131
+
1. Create `LexiconCorpus` struct
132
+
- `HashMap<SmolStr, LexiconDoc<'static>>` - NSID → doc
133
+
- Methods: `load_from_dir()`, `get()`, `resolve_ref()`
134
+
2. Load all `.json` files from lexicon directory
135
+
3. Parse into `LexiconDoc` and insert into registry
136
+
4. Handle fragments in refs (`nsid#def`)
137
+
138
+
**Output**: Corpus registry that can resolve any ref
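
A self-contained sketch of what Phase 1 might look like (it stores raw `serde_json::Value`s where the real registry would hold `LexiconDoc<'static>`, and walks one directory level where fs.rs already has a recursive walker):

```rust
use std::{collections::HashMap, fs, path::Path};

use smol_str::SmolStr;

pub struct LexiconCorpus {
    docs: HashMap<SmolStr, serde_json::Value>,
}

impl LexiconCorpus {
    /// Load every `.json` lexicon under `dir`, keyed by its `id` NSID.
    pub fn load_from_dir(dir: &Path) -> Result<Self, Box<dyn std::error::Error>> {
        let mut docs = HashMap::new();
        for entry in fs::read_dir(dir)? {
            let path = entry?.path();
            if path.extension().and_then(|e| e.to_str()) != Some("json") {
                continue;
            }
            let doc: serde_json::Value = serde_json::from_str(&fs::read_to_string(&path)?)?;
            if let Some(id) = doc.get("id").and_then(|v| v.as_str()) {
                docs.insert(SmolStr::new(id), doc);
            }
        }
        Ok(Self { docs })
    }

    pub fn get(&self, nsid: &str) -> Option<&serde_json::Value> {
        self.docs.get(nsid)
    }

    /// Resolve `nsid` or `nsid#def` to the containing doc plus the def name
    /// (`main` when no fragment is given).
    pub fn resolve_ref<'c>(&'c self, r: &'c str) -> Option<(&'c serde_json::Value, &'c str)> {
        let (nsid, def) = r.split_once('#').unwrap_or((r, "main"));
        self.get(nsid).map(|doc| (doc, def))
    }
}
```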
139
+
140
+
### Phase 2: Ref Analysis & Union Collection
141
+
**Goal**: Build complete picture of what refs exist and what unions need
142
+
143
+
**Tasks**:
144
+
1. Extend `find_ref_unions()` to work across entire corpus
145
+
2. For each union, collect all refs and check existence
146
+
3. Build `UnionRegistry`:
147
+
- Union name → list of (known refs, unknown refs)
148
+
4. Detect circular refs (optional - or just Box everything)
149
+
150
+
**Output**: Complete list of unions to generate with their variants
151
+
152
+
### Phase 3: Code Generation - Core Types
153
+
**Goal**: Generate Rust code for individual types
154
+
155
+
**Tasks**:
156
+
1. Implement type generators:
157
+
- `generate_struct()` for records/objects
158
+
- `generate_enum()` for unions
159
+
- `generate_field()` for object properties
160
+
- `generate_type()` for primitives/refs
161
+
2. Handle optional fields (`required` list)
162
+
3. Add doc comments from `description`
163
+
4. Apply `#[lexicon]` / `#[open_union]` macros
164
+
5. Add serde attributes
165
+
166
+
**Output**: `TokenStream` for each type
167
+
168
+
### Phase 4: Module Organization
169
+
**Goal**: Organize generated types into module hierarchy
170
+
171
+
**Tasks**:
172
+
1. Parse NSID into components: `["app", "bsky", "feed", "post"]`
173
+
2. Determine file paths: `app_bsky/feed/post.rs`
174
+
3. Generate module files: `app_bsky/feed.rs` with `pub mod post;`
175
+
4. Generate root module: `app_bsky.rs`
176
+
5. Handle re-exports if needed
177
+
178
+
**Output**: File path → generated code mapping
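
One way Phase 4 could compute the parent-module declarations (a sketch; `module_decls` is a hypothetical helper, and real codegen would also emit the per-type leaf files):

```rust
use std::collections::{BTreeMap, BTreeSet};
use std::path::PathBuf;

/// For each parent module file, collect the `pub mod ...;` lines it needs.
/// Paths follow the layout above: `app.bsky.feed.post` -> app_bsky/feed/post.rs,
/// with app_bsky.rs declaring `pub mod feed;` and app_bsky/feed.rs declaring
/// `pub mod post;`.
fn module_decls(nsids: &[&str]) -> BTreeMap<PathBuf, BTreeSet<String>> {
    let mut decls: BTreeMap<PathBuf, BTreeSet<String>> = BTreeMap::new();
    for nsid in nsids {
        let segs: Vec<&str> = nsid.split('.').collect();
        let root = format!("{}_{}", segs[0], segs[1]);
        let mut dir = PathBuf::from(&root);
        let mut parent = PathBuf::from(format!("{root}.rs"));
        for seg in segs[2..].iter().copied() {
            decls.entry(parent.clone()).or_default().insert(format!("pub mod {seg};"));
            parent = dir.join(format!("{seg}.rs"));
            dir = dir.join(seg);
        }
    }
    decls
}
```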
179
+
180
+
### Phase 5: File Writing
181
+
**Goal**: Write generated code to filesystem
182
+
183
+
**Tasks**:
184
+
1. Format code with `prettyplease`
185
+
2. Create directory structure
186
+
3. Write module files
187
+
4. Write type files
188
+
5. Optional: run `rustfmt`
189
+
190
+
**Output**: Generated code on disk
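
For the formatting step, `prettyplease` works on a parsed `syn::File`; a minimal sketch:

```rust
/// Format a generated TokenStream before writing it to disk.
fn format_tokens(tokens: proc_macro2::TokenStream) -> syn::Result<String> {
    let file: syn::File = syn::parse2(tokens)?;
    Ok(prettyplease::unparse(&file))
}
```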
191
+
192
+
### Phase 6: Testing & Validation
193
+
**Goal**: Ensure generated code compiles and works
194
+
195
+
**Tasks**:
196
+
1. Generate code for test lexicons
197
+
2. Compile generated code
198
+
3. Test serialization/deserialization
199
+
4. Test union variant matching
200
+
5. Test extra_data capture
201
+
202
+
## Edge Cases & Considerations
203
+
204
+
### Circular References
205
+
- **Simple approach**: Union variants always use `Box<T>` → handles all circular refs
206
+
- **Alternative**: DFS cycle detection to only Box when needed
207
+
- Track visited refs and recursion stack
208
+
- If ref appears in rec_stack → cycle detected
209
+
- Algorithm:
210
+
```rust
211
+
fn has_cycle(corpus, start_ref, visited, rec_stack) -> bool {
212
+
visited.insert(start_ref);
213
+
rec_stack.insert(start_ref);
214
+
215
+
for child_ref in collect_refs_from_def(resolve(start_ref)) {
216
+
if !visited.contains(child_ref) {
217
+
if has_cycle(corpus, child_ref, visited, rec_stack) {
218
+
return true;
219
+
}
220
+
} else if rec_stack.contains(child_ref) {
221
+
return true; // back edge = cycle
222
+
}
223
+
}
224
+
225
+
rec_stack.remove(start_ref);
226
+
false
227
+
}
228
+
```
229
+
- Only box variants that participate in cycles
230
+
- **Recommendation**: Start with simple (always Box), optimize later if needed
231
+
232
+
### Name Collisions
233
+
- Multiple types with same name in different lexicons
234
+
- Module path disambiguates: `app_bsky::feed::Post` vs `com_example::feed::Post`
235
+
236
+
### Unknown Refs
237
+
- Fallback to `Data<'s>` in struct fields
238
+
- Caught by `Unknown` variant in unions
239
+
- Warn during generation
240
+
241
+
### Inline Defs
242
+
- Nested objects/unions in same lexicon
243
+
- Generate as separate types in same file
244
+
- Keep names scoped to parent (e.g., `PostReplyRef`)
245
+
246
+
### Arrays
247
+
- `Vec<T>` for arrays
248
+
- Handle nested unions in arrays
249
+
250
+
### Tokens
251
+
- Simple marker types
252
+
- Generate as unit structs or type aliases?
253
+
254
+
## Traits for Generated Types
255
+
256
+
### Collection Trait (Records)
257
+
Records implement the existing `Collection` trait from jacquard-common:
258
+
259
+
```rust
260
+
pub struct Post<'a> {
261
+
// ... fields
262
+
}
263
+
264
+
impl Collection for Post<'_> {
265
+
const NSID: &'static str = "app.bsky.feed.post";
266
+
type Record = Post<'static>;
267
+
}
268
+
```
269
+
270
+
### XrpcRequest Trait (Queries/Procedures)
271
+
New trait for XRPC endpoints:
272
+
273
+
```rust
274
+
pub trait XrpcRequest<'x> {
275
+
/// The NSID for this XRPC method
276
+
const NSID: &'static str;
277
+
278
+
/// HTTP method (GET for queries, POST for procedures)
279
+
const METHOD: XrpcMethod;
280
+
281
+
/// Input encoding (MIME type, e.g., "application/json")
282
+
/// None for queries (no body)
283
+
const INPUT_ENCODING: Option<&'static str>;
284
+
285
+
/// Output encoding (MIME type)
286
+
const OUTPUT_ENCODING: &'static str;
287
+
288
+
/// Request parameters type (query params or body)
289
+
type Params: Serialize;
290
+
291
+
/// Response output type
292
+
type Output: Deserialize<'x>;
293
+
}
294
+
295
+
pub enum XrpcMethod {
296
+
Query, // GET
297
+
Procedure, // POST
298
+
}
299
+
```
300
+
301
+
**Generated implementation:**
302
+
```rust
303
+
pub struct GetAuthorFeedParams<'a> {
304
+
pub actor: AtIdentifier<'a>,
305
+
pub limit: Option<i64>,
306
+
pub cursor: Option<CowStr<'a>>,
307
+
}
308
+
309
+
pub struct GetAuthorFeedOutput<'a> {
310
+
pub cursor: Option<CowStr<'a>>,
311
+
pub feed: Vec<FeedViewPost<'a>>,
312
+
}
313
+
314
+
impl XrpcRequest for GetAuthorFeedParams<'_> {
315
+
const NSID: &'static str = "app.bsky.feed.getAuthorFeed";
316
+
const METHOD: XrpcMethod = XrpcMethod::Query;
317
+
const INPUT_ENCODING: Option<&'static str> = None; // queries have no body
318
+
const OUTPUT_ENCODING: &'static str = "application/json";
319
+
320
+
type Params = Self;
321
+
type Output = GetAuthorFeedOutput<'static>;
322
+
}
323
+
```
324
+
325
+
**Encoding variations:**
326
+
- Most procedures: `"application/json"` for input/output
327
+
- Blob uploads: `"*/*"` or specific MIME type for input
328
+
- CAR files: `"application/vnd.ipld.car"` for repo operations
329
+
- Read from lexicon's `input.encoding` and `output.encoding` fields
330
+
331
+
**Trait benefits:**
332
+
- Allows monomorphization (static dispatch) for performance
333
+
- Also supports `dyn XrpcRequest` for dynamic dispatch if needed
334
+
- Client code can be generic over `impl XrpcRequest`
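
For example, a client helper can stay generic over the trait (a sketch against the proposed trait above; `request_target` is a hypothetical name):

```rust
/// Compute the HTTP method and URL for any XRPC request type via static dispatch.
fn request_target<'x, R: XrpcRequest<'x>>(service_base: &str) -> (XrpcMethod, String) {
    (R::METHOD, format!("{}/xrpc/{}", service_base, R::NSID))
}

// request_target::<GetAuthorFeedParams<'_>>("https://bsky.social")
//   == (XrpcMethod::Query, "https://bsky.social/xrpc/app.bsky.feed.getAuthorFeed".into())
```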
335
+
336
+
### Subscriptions
337
+
WebSocket streams - defer for now. Will need separate trait with message types.
338
+
339
+
## Open Questions
340
+
341
+
1. **Validation**: Generate runtime validation (min/max length, regex, etc)?
342
+
2. **Tokens**: How to represent token types?
343
+
3. **Errors**: How to handle codegen errors (missing refs, invalid schemas)?
344
+
4. **Incremental**: Support incremental codegen (only changed lexicons)?
345
+
5. **Formatting**: Always run rustfmt or rely on prettyplease?
346
+
6. **XrpcRequest location**: Should trait live in jacquard-common or separate jacquard-xrpc crate?
347
+
348
+
## Success Criteria
349
+
350
+
- [ ] Generates code for all official AT Protocol lexicons
351
+
- [ ] Generated code compiles without errors
352
+
- [ ] No `Main` proliferation
353
+
- [ ] Union variants have readable names
354
+
- [ ] Unknown refs handled gracefully
355
+
- [ ] `#[lexicon]` and `#[open_union]` applied correctly
356
+
- [ ] Serialization round-trips correctly
+14
crates/jacquard-api/Cargo.toml
···
1
+
[package]
2
+
name = "jacquard-api"
3
+
edition.workspace = true
4
+
version.workspace = true
5
+
authors.workspace = true
6
+
repository.workspace = true
7
+
keywords.workspace = true
8
+
categories.workspace = true
9
+
readme.workspace = true
10
+
documentation.workspace = true
11
+
exclude.workspace = true
12
+
description.workspace = true
13
+
14
+
[dependencies]
+16
crates/jacquard-api/src/lib.rs
+1
crates/jacquard-common/Cargo.toml
+22
crates/jacquard-common/src/cowstr.rs
···
177
177
}
178
178
}
179
179
180
+
impl PartialOrd for CowStr<'_> {
181
+
fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
182
+
Some(self.cmp(other))
188
+
}
189
+
}
190
+
191
+
impl Ord for CowStr<'_> {
192
+
fn cmp(&self, other: &Self) -> std::cmp::Ordering {
193
+
match (self, other) {
194
+
(CowStr::Borrowed(s1), CowStr::Borrowed(s2)) => s1.cmp(s2),
195
+
(CowStr::Borrowed(s1), CowStr::Owned(s2)) => s1.cmp(&s2.as_ref()),
196
+
(CowStr::Owned(s1), CowStr::Borrowed(s2)) => s1.as_str().cmp(s2),
197
+
(CowStr::Owned(s1), CowStr::Owned(s2)) => s1.cmp(s2),
198
+
}
199
+
}
200
+
}
201
+
180
202
impl Eq for CowStr<'_> {}
181
203
182
204
impl Hash for CowStr<'_> {
-2
crates/jacquard-common/src/types.rs
+77
-1
crates/jacquard-common/src/types/aturi.rs
···
102
102
pub type UriPathBuf = UriPath<'static>;
103
103
104
104
pub static ATURI_REGEX: LazyLock<Regex> = LazyLock::new(|| {
105
-
Regex::new(r##"^at://(?<authority>[a-zA-Z0-9._:%-]+)(/(?<collection>[a-zA-Z0-9-.]+)(/(?<rkey>[a-zA-Z0-9._~:@!$&%')(*+,;=-]+))?)?(#(?<fragment>/[a-zA-Z0-9._~:@!$&%')(*+,;=-[]/\]*))?$"##).unwrap()
105
+
// Fragment allows '/', '\', and other special chars. The raw string passes backslashes straight to the regex, so \-, \[, \], and \\ are regex-level escapes.
106
+
Regex::new(r##"^at://(?<authority>[a-zA-Z0-9._:%-]+)(/(?<collection>[a-zA-Z0-9-.]+)(/(?<rkey>[a-zA-Z0-9._~:@!$&%')(*+,;=-]+))?)?(#(?<fragment>/[a-zA-Z0-9._~:@!$&%')(*+,;=\-\[\]/\\]*))?$"##).unwrap()
106
107
});
107
108
108
109
impl<'u> AtUri<'u> {
···
712
713
self.inner.borrow_uri().as_ref()
713
714
}
714
715
}
716
+
717
+
#[cfg(test)]
718
+
mod tests {
719
+
use super::*;
720
+
721
+
#[test]
722
+
fn valid_at_uris() {
723
+
assert!(AtUri::new("at://did:plc:foo").is_ok());
724
+
assert!(AtUri::new("at://alice.bsky.social").is_ok());
725
+
assert!(AtUri::new("at://did:plc:foo/com.example.post").is_ok());
726
+
assert!(AtUri::new("at://did:plc:foo/com.example.post/123").is_ok());
727
+
}
728
+
729
+
#[test]
730
+
fn authority_only() {
731
+
let uri = AtUri::new("at://alice.test").unwrap();
732
+
assert_eq!(uri.authority().as_str(), "alice.test");
733
+
assert!(uri.collection().is_none());
734
+
assert!(uri.rkey().is_none());
735
+
}
736
+
737
+
#[test]
738
+
fn authority_and_collection() {
739
+
let uri = AtUri::new("at://alice.test/com.example.foo").unwrap();
740
+
assert_eq!(uri.authority().as_str(), "alice.test");
741
+
assert_eq!(uri.collection().unwrap().as_str(), "com.example.foo");
742
+
assert!(uri.rkey().is_none());
743
+
}
744
+
745
+
#[test]
746
+
fn full_uri() {
747
+
let uri = AtUri::new("at://alice.test/com.example.foo/123").unwrap();
748
+
assert_eq!(uri.authority().as_str(), "alice.test");
749
+
assert_eq!(uri.collection().unwrap().as_str(), "com.example.foo");
750
+
assert_eq!(uri.rkey().unwrap().as_ref(), "123");
751
+
}
752
+
753
+
#[test]
754
+
fn with_fragment() {
755
+
let uri = AtUri::new("at://alice.test/com.example.foo/123#/path").unwrap();
756
+
assert_eq!(uri.fragment().as_ref().unwrap().as_ref(), "/path");
757
+
758
+
// Fragment must start with /
759
+
assert!(AtUri::new("at://alice.test#path").is_err());
760
+
assert!(AtUri::new("at://alice.test#/foo/bar").is_ok());
761
+
}
762
+
763
+
#[test]
764
+
fn no_trailing_slash() {
765
+
assert!(AtUri::new("at://alice.test/").is_err());
766
+
assert!(AtUri::new("at://alice.test/com.example.foo/").is_err());
767
+
}
768
+
769
+
#[test]
770
+
fn must_have_authority() {
771
+
assert!(AtUri::new("at://").is_err());
772
+
assert!(AtUri::new("at:///com.example.foo").is_err());
773
+
}
774
+
775
+
#[test]
776
+
fn must_start_with_at_scheme() {
777
+
assert!(AtUri::new("alice.test").is_err());
778
+
assert!(AtUri::new("https://alice.test").is_err());
779
+
}
780
+
781
+
#[test]
782
+
fn max_length() {
783
+
// Spec says 8KB max
784
+
let long_did = format!("did:plc:{}", "a".repeat(8000));
785
+
let uri = format!("at://{}", long_did);
786
+
assert!(uri.len() < 8192);
787
+
// Should work if components are valid
788
+
// (our DID will fail at 2048 chars, but this tests the URI doesn't impose extra limits)
789
+
}
790
+
}
+33
-1
crates/jacquard-common/src/types/blob.rs
···
12
12
str::FromStr,
13
13
};
14
14
15
-
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Eq, Hash)]
15
+
#[derive(Deserialize, Debug, Clone, PartialEq, Eq, Hash)]
16
16
#[serde(rename_all = "camelCase")]
17
17
pub struct Blob<'b> {
18
18
pub r#ref: Cid<'b>,
19
19
#[serde(borrow)]
20
20
pub mime_type: MimeType<'b>,
21
21
pub size: usize,
22
+
}
23
+
24
+
impl Serialize for Blob<'_> {
25
+
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
26
+
where
27
+
S: Serializer,
28
+
{
29
+
use serde::ser::SerializeMap;
30
+
31
+
if serializer.is_human_readable() {
32
+
// JSON: ref needs to be {"$link": "cid"}
33
+
let mut map = serializer.serialize_map(Some(4))?;
34
+
map.serialize_entry("$type", "blob")?;
35
+
36
+
// Serialize ref as {"$link": "cid_string"}
37
+
let mut ref_map = std::collections::BTreeMap::new();
38
+
ref_map.insert("$link", self.r#ref.as_str());
39
+
map.serialize_entry("ref", &ref_map)?;
40
+
41
+
map.serialize_entry("mimeType", &self.mime_type)?;
42
+
map.serialize_entry("size", &self.size)?;
43
+
map.end()
44
+
} else {
45
+
// CBOR: ref is just the CID directly
46
+
let mut map = serializer.serialize_map(Some(4))?;
47
+
map.serialize_entry("$type", "blob")?;
48
+
map.serialize_entry("ref", &self.r#ref)?;
49
+
map.serialize_entry("mimeType", &self.mime_type)?;
50
+
map.serialize_entry("size", &self.size)?;
51
+
map.end()
52
+
}
53
+
}
22
54
}
23
55
24
56
impl IntoStatic for Blob<'_> {
+266
crates/jacquard-common/src/types/cid.rs
···
214
214
self.as_str()
215
215
}
216
216
}
217
+
218
+
/// CID link wrapper that serializes as {"$link": "cid"} in JSON
219
+
/// and as raw CID in CBOR
220
+
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
221
+
#[repr(transparent)]
222
+
pub struct CidLink<'c>(pub Cid<'c>);
223
+
224
+
impl<'c> CidLink<'c> {
225
+
pub fn new(cid: &'c [u8]) -> Result<Self, Error> {
226
+
Ok(Self(Cid::new(cid)?))
227
+
}
228
+
229
+
pub fn new_owned(cid: &[u8]) -> Result<CidLink<'static>, Error> {
230
+
Ok(CidLink(Cid::new_owned(cid)?))
231
+
}
232
+
233
+
pub fn new_static(cid: &'static str) -> Self {
234
+
Self(Cid::str(cid))
235
+
}
236
+
237
+
pub fn ipld(cid: IpldCid) -> CidLink<'static> {
238
+
CidLink(Cid::ipld(cid))
239
+
}
240
+
241
+
pub fn str(cid: &'c str) -> Self {
242
+
Self(Cid::str(cid))
243
+
}
244
+
245
+
pub fn cow_str(cid: CowStr<'c>) -> Self {
246
+
Self(Cid::cow_str(cid))
247
+
}
248
+
249
+
pub fn as_str(&self) -> &str {
250
+
self.0.as_str()
251
+
}
252
+
253
+
pub fn to_ipld(&self) -> Result<IpldCid, cid::Error> {
254
+
self.0.to_ipld()
255
+
}
256
+
257
+
pub fn into_inner(self) -> Cid<'c> {
258
+
self.0
259
+
}
260
+
}
261
+
262
+
impl fmt::Display for CidLink<'_> {
263
+
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
264
+
self.0.fmt(f)
265
+
}
266
+
}
267
+
268
+
impl FromStr for CidLink<'_> {
269
+
type Err = Infallible;
270
+
271
+
fn from_str(s: &str) -> Result<Self, Self::Err> {
272
+
Ok(CidLink(Cid::from_str(s)?))
273
+
}
274
+
}
275
+
276
+
impl IntoStatic for CidLink<'_> {
277
+
type Output = CidLink<'static>;
278
+
279
+
fn into_static(self) -> Self::Output {
280
+
CidLink(self.0.into_static())
281
+
}
282
+
}
283
+
284
+
impl Serialize for CidLink<'_> {
285
+
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
286
+
where
287
+
S: Serializer,
288
+
{
289
+
if serializer.is_human_readable() {
290
+
// JSON: {"$link": "cid_string"}
291
+
use serde::ser::SerializeMap;
292
+
let mut map = serializer.serialize_map(Some(1))?;
293
+
map.serialize_entry("$link", self.0.as_str())?;
294
+
map.end()
295
+
} else {
296
+
// CBOR: raw CID
297
+
self.0.serialize(serializer)
298
+
}
299
+
}
300
+
}
301
+
302
+
impl<'de> Deserialize<'de> for CidLink<'_> {
303
+
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
304
+
where
305
+
D: Deserializer<'de>,
306
+
{
307
+
if deserializer.is_human_readable() {
308
+
// JSON: expect {"$link": "cid_string"}
309
+
struct LinkVisitor;
310
+
311
+
impl<'de> Visitor<'de> for LinkVisitor {
312
+
type Value = CidLink<'static>;
313
+
314
+
fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
315
+
formatter.write_str("a CID link object with $link field")
316
+
}
317
+
318
+
fn visit_map<A>(self, mut map: A) -> Result<Self::Value, A::Error>
319
+
where
320
+
A: serde::de::MapAccess<'de>,
321
+
{
322
+
use serde::de::Error;
323
+
324
+
let mut link: Option<String> = None;
325
+
326
+
while let Some(key) = map.next_key::<String>()? {
327
+
if key == "$link" {
328
+
link = Some(map.next_value()?);
329
+
} else {
330
+
// Skip unknown fields
331
+
let _: serde::de::IgnoredAny = map.next_value()?;
332
+
}
333
+
}
334
+
335
+
if let Some(cid_str) = link {
336
+
Ok(CidLink(Cid::from(cid_str)))
337
+
} else {
338
+
Err(A::Error::missing_field("$link"))
339
+
}
340
+
}
341
+
}
342
+
343
+
deserializer.deserialize_map(LinkVisitor)
344
+
} else {
345
+
// CBOR: raw CID
346
+
Ok(CidLink(Cid::deserialize(deserializer)?))
347
+
}
348
+
}
349
+
}
350
+
351
+
impl From<CidLink<'_>> for String {
352
+
fn from(value: CidLink) -> Self {
353
+
value.0.into()
354
+
}
355
+
}
356
+
357
+
impl<'c> From<CidLink<'c>> for CowStr<'c> {
358
+
fn from(value: CidLink<'c>) -> Self {
359
+
value.0.into()
360
+
}
361
+
}
362
+
363
+
impl From<String> for CidLink<'_> {
364
+
fn from(value: String) -> Self {
365
+
CidLink(Cid::from(value))
366
+
}
367
+
}
368
+
369
+
impl<'c> From<CowStr<'c>> for CidLink<'c> {
370
+
fn from(value: CowStr<'c>) -> Self {
371
+
CidLink(Cid::from(value))
372
+
}
373
+
}
374
+
375
+
impl From<IpldCid> for CidLink<'_> {
376
+
fn from(value: IpldCid) -> Self {
377
+
CidLink(Cid::from(value))
378
+
}
379
+
}
380
+
381
+
impl<'c> From<Cid<'c>> for CidLink<'c> {
382
+
fn from(value: Cid<'c>) -> Self {
383
+
CidLink(value)
384
+
}
385
+
}
386
+
387
+
impl<'c> From<CidLink<'c>> for Cid<'c> {
388
+
fn from(value: CidLink<'c>) -> Self {
389
+
value.0
390
+
}
391
+
}
392
+
393
+
impl AsRef<str> for CidLink<'_> {
394
+
fn as_ref(&self) -> &str {
395
+
self.0.as_ref()
396
+
}
397
+
}
398
+
399
+
impl Deref for CidLink<'_> {
400
+
type Target = str;
401
+
402
+
fn deref(&self) -> &Self::Target {
403
+
self.0.deref()
404
+
}
405
+
}
406
+
407
+
#[cfg(test)]
408
+
mod tests {
409
+
use super::*;
410
+
411
+
const TEST_CID: &str = "bafyreih4g7bvo6hdq2juolev5bfzpbo4ewkxh5mzxwgvkjp3kitc6hqkha";
412
+
413
+
#[test]
414
+
fn cidlink_serialize_json() {
415
+
let link = CidLink::str(TEST_CID);
416
+
let json = serde_json::to_string(&link).unwrap();
417
+
assert_eq!(json, r#"{"$link":"bafyreih4g7bvo6hdq2juolev5bfzpbo4ewkxh5mzxwgvkjp3kitc6hqkha"}"#);
418
+
}
419
+
420
+
#[test]
421
+
fn cidlink_deserialize_json() {
422
+
let json = r#"{"$link":"bafyreih4g7bvo6hdq2juolev5bfzpbo4ewkxh5mzxwgvkjp3kitc6hqkha"}"#;
423
+
let link: CidLink = serde_json::from_str(json).unwrap();
424
+
assert_eq!(link.as_str(), TEST_CID);
425
+
}
426
+
427
+
#[test]
428
+
fn cidlink_roundtrip_json() {
429
+
let link = CidLink::str(TEST_CID);
430
+
let json = serde_json::to_string(&link).unwrap();
431
+
let parsed: CidLink = serde_json::from_str(&json).unwrap();
432
+
assert_eq!(link, parsed);
433
+
assert_eq!(link.as_str(), TEST_CID);
434
+
}
435
+
436
+
#[test]
437
+
fn cidlink_constructors() {
438
+
let link1 = CidLink::str(TEST_CID);
439
+
let link2 = CidLink::cow_str(CowStr::Borrowed(TEST_CID));
440
+
let link3 = CidLink::from(TEST_CID.to_string());
441
+
let link4 = CidLink::new_static(TEST_CID);
442
+
443
+
assert_eq!(link1.as_str(), TEST_CID);
444
+
assert_eq!(link2.as_str(), TEST_CID);
445
+
assert_eq!(link3.as_str(), TEST_CID);
446
+
assert_eq!(link4.as_str(), TEST_CID);
447
+
}
448
+
449
+
#[test]
450
+
fn cidlink_conversions() {
451
+
let link = CidLink::str(TEST_CID);
452
+
453
+
// CidLink -> Cid
454
+
let cid: Cid = link.clone().into();
455
+
assert_eq!(cid.as_str(), TEST_CID);
456
+
457
+
// Cid -> CidLink
458
+
let link2: CidLink = cid.into();
459
+
assert_eq!(link2.as_str(), TEST_CID);
460
+
461
+
// CidLink -> String
462
+
let s: String = link.clone().into();
463
+
assert_eq!(s, TEST_CID);
464
+
465
+
// CidLink -> CowStr
466
+
let cow: CowStr = link.into();
467
+
assert_eq!(cow.as_ref(), TEST_CID);
468
+
}
469
+
470
+
#[test]
471
+
fn cidlink_display() {
472
+
let link = CidLink::str(TEST_CID);
473
+
assert_eq!(format!("{}", link), TEST_CID);
474
+
}
475
+
476
+
#[test]
477
+
fn cidlink_deref() {
478
+
let link = CidLink::str(TEST_CID);
479
+
assert_eq!(&*link, TEST_CID);
480
+
assert_eq!(link.as_ref(), TEST_CID);
481
+
}
482
+
}
+73
-1
crates/jacquard-common/src/types/datetime.rs
···
1
1
use chrono::DurationRound;
2
2
use serde::Serializer;
3
3
use serde::{Deserialize, Deserializer, Serialize, de::Error};
4
-
use smol_str::ToSmolStr;
4
+
use smol_str::{SmolStr, ToSmolStr};
5
+
use std::fmt;
5
6
use std::sync::LazyLock;
6
7
use std::{cmp, str::FromStr};
7
8
···
163
164
}
164
165
}
165
166
}
167
+
168
+
impl From<chrono::DateTime<chrono::FixedOffset>> for Datetime {
169
+
fn from(dt: chrono::DateTime<chrono::FixedOffset>) -> Self {
170
+
Self::new(dt)
171
+
}
172
+
}
173
+
174
+
impl From<Datetime> for String {
175
+
fn from(value: Datetime) -> Self {
176
+
value.serialized.to_string()
177
+
}
178
+
}
179
+
180
+
impl From<Datetime> for SmolStr {
181
+
fn from(value: Datetime) -> Self {
182
+
match value.serialized {
183
+
CowStr::Borrowed(s) => SmolStr::new(s),
184
+
CowStr::Owned(s) => s,
185
+
}
186
+
}
187
+
}
188
+
189
+
impl From<Datetime> for CowStr<'static> {
190
+
fn from(value: Datetime) -> Self {
191
+
value.serialized
192
+
}
193
+
}
194
+
195
+
impl AsRef<str> for Datetime {
196
+
fn as_ref(&self) -> &str {
197
+
self.as_str()
198
+
}
199
+
}
200
+
201
+
impl fmt::Display for Datetime {
202
+
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
203
+
f.write_str(self.as_str())
204
+
}
205
+
}
206
+
207
+
#[cfg(test)]
208
+
mod tests {
209
+
use super::*;
210
+
211
+
#[test]
212
+
fn valid_datetimes() {
213
+
assert!(Datetime::from_str("2023-01-15T12:30:45.123456Z").is_ok());
214
+
assert!(Datetime::from_str("2023-01-15T12:30:45Z").is_ok());
215
+
assert!(Datetime::from_str("2023-01-15T12:30:45+00:00").is_ok());
216
+
assert!(Datetime::from_str("2023-01-15T12:30:45-05:00").is_ok());
217
+
}
218
+
219
+
#[test]
220
+
fn microsecond_precision() {
221
+
let dt = Datetime::from_str("2023-01-15T12:30:45.123456Z").unwrap();
222
+
assert!(dt.as_str().contains(".123456"));
223
+
}
224
+
225
+
#[test]
226
+
fn requires_timezone() {
227
+
// Missing timezone should fail
228
+
assert!(Datetime::from_str("2023-01-15T12:30:45").is_err());
229
+
}
230
+
231
+
#[test]
232
+
fn round_trip() {
233
+
let original = "2023-01-15T12:30:45.123456Z";
234
+
let dt = Datetime::from_str(original).unwrap();
235
+
assert_eq!(dt.as_str(), original);
236
+
}
237
+
}
+97
crates/jacquard-common/src/types/did.rs
···
12
12
#[repr(transparent)]
13
13
pub struct Did<'d>(CowStr<'d>);
14
14
15
+
/// Regex for DID validation per AT Protocol spec.
16
+
///
17
+
/// Note: This regex allows `%` in the identifier but prevents DIDs from ending with `:` or `%`.
18
+
/// It does NOT validate that percent-encoding is well-formed (i.e., `%XX` where XX are hex digits).
19
+
/// This matches the behavior of the official TypeScript implementation, which also does not
20
+
/// enforce percent-encoding validity at validation time. While the spec states "percent sign
21
+
/// must be followed by two hex characters," this is treated as a best practice rather than
22
+
/// a hard validation requirement.
15
23
pub static DID_REGEX: LazyLock<Regex> =
16
24
LazyLock::new(|| Regex::new(r"^did:[a-z]+:[a-zA-Z0-9._:%-]*[a-zA-Z0-9._-]$").unwrap());
17
25
···
193
201
self.as_str()
194
202
}
195
203
}
204
+
205
+
#[cfg(test)]
206
+
mod tests {
207
+
use super::*;
208
+
209
+
#[test]
210
+
fn valid_dids() {
211
+
assert!(Did::new("did:plc:abc123").is_ok());
212
+
assert!(Did::new("did:web:example.com").is_ok());
213
+
assert!(Did::new("did:method:val_ue").is_ok());
214
+
assert!(Did::new("did:method:val-ue").is_ok());
215
+
assert!(Did::new("did:method:val.ue").is_ok());
216
+
assert!(Did::new("did:method:val%20ue").is_ok());
217
+
}
218
+
219
+
#[test]
220
+
fn prefix_stripping() {
221
+
assert_eq!(Did::new("at://did:plc:foo").unwrap().as_str(), "did:plc:foo");
222
+
assert_eq!(Did::new("did:plc:foo").unwrap().as_str(), "did:plc:foo");
223
+
}
224
+
225
+
#[test]
226
+
fn must_start_with_did() {
227
+
assert!(Did::new("DID:plc:foo").is_err());
228
+
assert!(Did::new("plc:foo").is_err());
229
+
assert!(Did::new("foo").is_err());
230
+
}
231
+
232
+
#[test]
233
+
fn method_must_be_lowercase() {
234
+
assert!(Did::new("did:plc:foo").is_ok());
235
+
assert!(Did::new("did:PLC:foo").is_err());
236
+
assert!(Did::new("did:Plc:foo").is_err());
237
+
}
238
+
239
+
#[test]
240
+
fn cannot_end_with_colon_or_percent() {
241
+
assert!(Did::new("did:plc:foo:").is_err());
242
+
assert!(Did::new("did:plc:foo%").is_err());
243
+
assert!(Did::new("did:plc:foo:bar").is_ok());
244
+
}
245
+
246
+
#[test]
247
+
fn max_length() {
248
+
let valid_2048 = format!("did:plc:{}", "a".repeat(2048 - 8));
249
+
assert_eq!(valid_2048.len(), 2048);
250
+
assert!(Did::new(&valid_2048).is_ok());
251
+
252
+
let too_long_2049 = format!("did:plc:{}", "a".repeat(2049 - 8));
253
+
assert_eq!(too_long_2049.len(), 2049);
254
+
assert!(Did::new(&too_long_2049).is_err());
255
+
}
256
+
257
+
#[test]
258
+
fn allowed_characters() {
259
+
assert!(Did::new("did:method:abc123").is_ok());
260
+
assert!(Did::new("did:method:ABC123").is_ok());
261
+
assert!(Did::new("did:method:a_b_c").is_ok());
262
+
assert!(Did::new("did:method:a-b-c").is_ok());
263
+
assert!(Did::new("did:method:a.b.c").is_ok());
264
+
assert!(Did::new("did:method:a:b:c").is_ok());
265
+
}
266
+
267
+
#[test]
268
+
fn disallowed_characters() {
269
+
assert!(Did::new("did:method:a b").is_err());
270
+
assert!(Did::new("did:method:a@b").is_err());
271
+
assert!(Did::new("did:method:a#b").is_err());
272
+
assert!(Did::new("did:method:a?b").is_err());
273
+
}
274
+
275
+
#[test]
276
+
fn percent_encoding() {
277
+
// Valid percent encoding
278
+
assert!(Did::new("did:method:foo%20bar").is_ok());
279
+
assert!(Did::new("did:method:foo%2Fbar").is_ok());
280
+
281
+
// DIDs cannot end with %
282
+
assert!(Did::new("did:method:foo%").is_err());
283
+
284
+
// IMPORTANT: The regex does NOT validate that percent-encoding is well-formed.
285
+
// This matches the TypeScript reference implementation's behavior.
286
+
// While the spec says "percent sign must be followed by two hex characters",
287
+
// implementations treat this as a best practice, not a hard validation requirement.
288
+
// Thus, malformed percent encoding like %2x is accepted by the regex.
289
+
assert!(Did::new("did:method:foo%2x").is_ok());
290
+
assert!(Did::new("did:method:foo%ZZ").is_ok());
291
+
}
292
+
}
+120
-26
crates/jacquard-common/src/types/handle.rs
···
21
21
impl<'h> Handle<'h> {
22
22
/// Fallible constructor, validates, borrows from input
23
23
///
24
-
/// Accepts (and strips) preceding '@' if present
24
+
/// Accepts (and strips) preceding '@' or 'at://' if present
25
25
pub fn new(handle: &'h str) -> Result<Self, AtStrError> {
26
-
let handle = handle
26
+
let stripped = handle
27
27
.strip_prefix("at://")
28
-
.unwrap_or(handle)
29
-
.strip_prefix('@')
28
+
.or_else(|| handle.strip_prefix('@'))
30
29
.unwrap_or(handle);
31
-
if handle.len() > 253 {
32
-
Err(AtStrError::too_long("handle", handle, 253, handle.len()))
33
-
} else if !HANDLE_REGEX.is_match(handle) {
30
+
31
+
if stripped.len() > 253 {
32
+
Err(AtStrError::too_long("handle", stripped, 253, stripped.len()))
33
+
} else if !HANDLE_REGEX.is_match(stripped) {
34
34
Err(AtStrError::regex(
35
35
"handle",
36
-
handle,
36
+
stripped,
37
37
SmolStr::new_static("invalid"),
38
38
))
39
-
} else if ends_with(handle, DISALLOWED_TLDS) {
40
-
Err(AtStrError::disallowed("handle", handle, DISALLOWED_TLDS))
39
+
} else if ends_with(stripped, DISALLOWED_TLDS) {
40
+
Err(AtStrError::disallowed("handle", stripped, DISALLOWED_TLDS))
41
41
} else {
42
-
Ok(Self(CowStr::Borrowed(handle)))
42
+
Ok(Self(CowStr::Borrowed(stripped)))
43
43
}
44
44
}
45
45
46
46
/// Fallible constructor, validates, takes ownership
47
47
pub fn new_owned(handle: impl AsRef<str>) -> Result<Self, AtStrError> {
48
48
let handle = handle.as_ref();
49
-
let handle = handle
49
+
let stripped = handle
50
50
.strip_prefix("at://")
51
-
.unwrap_or(handle)
52
-
.strip_prefix('@')
51
+
.or_else(|| handle.strip_prefix('@'))
53
52
.unwrap_or(handle);
53
+
let handle = stripped;
54
54
if handle.len() > 253 {
55
55
Err(AtStrError::too_long("handle", handle, 253, handle.len()))
56
56
} else if !HANDLE_REGEX.is_match(handle) {
···
68
68
69
69
/// Fallible constructor, validates, doesn't allocate
70
70
pub fn new_static(handle: &'static str) -> Result<Self, AtStrError> {
71
-
let handle = handle
71
+
let stripped = handle
72
72
.strip_prefix("at://")
73
-
.unwrap_or(handle)
74
-
.strip_prefix('@')
73
+
.or_else(|| handle.strip_prefix('@'))
75
74
.unwrap_or(handle);
75
+
let handle = stripped;
76
76
if handle.len() > 253 {
77
77
Err(AtStrError::too_long("handle", handle, 253, handle.len()))
78
78
} else if !HANDLE_REGEX.is_match(handle) {
···
92
92
/// or API values you know are valid (rather than using serde), this is the one to use.
93
93
/// The From<String> and From<CowStr> impls use the same logic.
94
94
///
95
-
/// Accepts (and strips) preceding '@' if present
95
+
/// Accepts (and strips) preceding '@' or 'at://' if present
96
96
pub fn raw(handle: &'h str) -> Self {
97
-
let handle = handle
97
+
let stripped = handle
98
98
.strip_prefix("at://")
99
-
.unwrap_or(handle)
100
-
.strip_prefix('@')
99
+
.or_else(|| handle.strip_prefix('@'))
101
100
.unwrap_or(handle);
101
+
let handle = stripped;
102
102
if handle.len() > 253 {
103
103
panic!("handle too long")
104
104
} else if !HANDLE_REGEX.is_match(handle) {
···
113
113
/// Infallible constructor for when you *know* the string is a valid handle.
114
114
/// Marked unsafe because responsibility for upholding the invariant is on the developer.
115
115
///
116
-
/// Accepts (and strips) preceding '@' if present
116
+
/// Accepts (and strips) preceding '@' or 'at://' if present
117
117
pub unsafe fn unchecked(handle: &'h str) -> Self {
118
-
let handle = handle
118
+
let stripped = handle
119
119
.strip_prefix("at://")
120
-
.unwrap_or(handle)
121
-
.strip_prefix('@')
120
+
.or_else(|| handle.strip_prefix('@'))
122
121
.unwrap_or(handle);
123
-
Self(CowStr::Borrowed(handle))
122
+
Self(CowStr::Borrowed(stripped))
124
123
}
125
124
126
125
pub fn as_str(&self) -> &str {
···
208
207
self.as_str()
209
208
}
210
209
}
210
+
211
+
#[cfg(test)]
212
+
mod tests {
213
+
use super::*;
214
+
215
+
#[test]
216
+
fn valid_handles() {
217
+
assert!(Handle::new("alice.test").is_ok());
218
+
assert!(Handle::new("foo.bsky.social").is_ok());
219
+
assert!(Handle::new("a.b.c.d.e").is_ok());
220
+
assert!(Handle::new("a1.b2.c3").is_ok());
221
+
assert!(Handle::new("name-with-dash.com").is_ok());
222
+
}
223
+
224
+
#[test]
225
+
fn prefix_stripping() {
226
+
assert_eq!(Handle::new("@alice.test").unwrap().as_str(), "alice.test");
227
+
assert_eq!(Handle::new("at://alice.test").unwrap().as_str(), "alice.test");
228
+
assert_eq!(Handle::new("alice.test").unwrap().as_str(), "alice.test");
229
+
}
230
+
231
+
#[test]
232
+
fn max_length() {
233
+
// 253 chars: three 63-char segments + one 61-char segment + 3 dots = 253
234
+
let s1 = format!("a{}a", "b".repeat(61)); // 63
235
+
let s2 = format!("c{}c", "d".repeat(61)); // 63
236
+
let s3 = format!("e{}e", "f".repeat(61)); // 63
237
+
let s4 = format!("g{}g", "h".repeat(59)); // 61
238
+
let valid_253 = format!("{}.{}.{}.{}", s1, s2, s3, s4);
239
+
assert_eq!(valid_253.len(), 253);
240
+
assert!(Handle::new(&valid_253).is_ok());
241
+
242
+
// 254 chars: make last segment 62 chars
243
+
let s4_long = format!("g{}g", "h".repeat(60)); // 62
244
+
let too_long_254 = format!("{}.{}.{}.{}", s1, s2, s3, s4_long);
245
+
assert_eq!(too_long_254.len(), 254);
246
+
assert!(Handle::new(&too_long_254).is_err());
247
+
}
248
+
249
+
#[test]
250
+
fn segment_length_constraints() {
251
+
let valid_63_char_segment = format!("{}.com", "a".repeat(63));
252
+
assert!(Handle::new(&valid_63_char_segment).is_ok());
253
+
254
+
let too_long_64_char_segment = format!("{}.com", "a".repeat(64));
255
+
assert!(Handle::new(&too_long_64_char_segment).is_err());
256
+
}
257
+
258
+
#[test]
259
+
fn hyphen_placement() {
260
+
assert!(Handle::new("valid-label.com").is_ok());
261
+
assert!(Handle::new("-nope.com").is_err());
262
+
assert!(Handle::new("nope-.com").is_err());
263
+
}
264
+
265
+
#[test]
266
+
fn tld_must_start_with_letter() {
267
+
assert!(Handle::new("foo.bar").is_ok());
268
+
assert!(Handle::new("foo.9bar").is_err());
269
+
}
270
+
271
+
#[test]
272
+
fn disallowed_tlds() {
273
+
assert!(Handle::new("foo.local").is_err());
274
+
assert!(Handle::new("foo.localhost").is_err());
275
+
assert!(Handle::new("foo.arpa").is_err());
276
+
assert!(Handle::new("foo.invalid").is_err());
277
+
assert!(Handle::new("foo.internal").is_err());
278
+
assert!(Handle::new("foo.example").is_err());
279
+
assert!(Handle::new("foo.alt").is_err());
280
+
assert!(Handle::new("foo.onion").is_err());
281
+
}
282
+
283
+
#[test]
284
+
fn minimum_segments() {
285
+
assert!(Handle::new("a.b").is_ok());
286
+
assert!(Handle::new("a").is_err());
287
+
assert!(Handle::new("com").is_err());
288
+
}
289
+
290
+
#[test]
291
+
fn invalid_characters() {
292
+
assert!(Handle::new("foo!bar.com").is_err());
293
+
assert!(Handle::new("foo_bar.com").is_err());
294
+
assert!(Handle::new("foo bar.com").is_err());
295
+
assert!(Handle::new("foo@bar.com").is_err());
296
+
}
297
+
298
+
#[test]
299
+
fn empty_segments() {
300
+
assert!(Handle::new("foo..com").is_err());
301
+
assert!(Handle::new(".foo.com").is_err());
302
+
assert!(Handle::new("foo.com.").is_err());
303
+
}
304
+
}
+37
crates/jacquard-common/src/types/ident.rs
···
168
168
}
169
169
}
170
170
}
171
+
172
+
#[cfg(test)]
173
+
mod tests {
174
+
use super::*;
175
+
176
+
#[test]
177
+
fn parses_did() {
178
+
let ident = AtIdentifier::new("did:plc:foo").unwrap();
179
+
assert!(matches!(ident, AtIdentifier::Did(_)));
180
+
assert_eq!(ident.as_str(), "did:plc:foo");
181
+
}
182
+
183
+
#[test]
184
+
fn parses_handle() {
185
+
let ident = AtIdentifier::new("alice.test").unwrap();
186
+
assert!(matches!(ident, AtIdentifier::Handle(_)));
187
+
assert_eq!(ident.as_str(), "alice.test");
188
+
}
189
+
190
+
#[test]
191
+
fn did_takes_precedence() {
192
+
// DID is tried first, so valid DIDs are parsed as DIDs
193
+
let ident = AtIdentifier::new("did:web:alice.test").unwrap();
194
+
assert!(matches!(ident, AtIdentifier::Did(_)));
195
+
}
196
+
197
+
#[test]
198
+
fn from_types() {
199
+
let did = Did::new("did:plc:foo").unwrap();
200
+
let ident: AtIdentifier = did.into();
201
+
assert!(matches!(ident, AtIdentifier::Did(_)));
202
+
203
+
let handle = Handle::new("alice.test").unwrap();
204
+
let ident: AtIdentifier = handle.into();
205
+
assert!(matches!(ident, AtIdentifier::Handle(_)));
206
+
}
207
+
}
+35
-3
crates/jacquard-common/src/types/language.rs
···
22
22
T: AsRef<str> + ?Sized,
23
23
{
24
24
let tag = langtag::LangTag::new(lang)?;
25
-
Ok(Language(SmolStr::new_inline(tag.as_str())))
25
+
Ok(Language(SmolStr::new(tag.as_str())))
26
+
}
27
+
28
+
/// Parses an IETF language tag from a static string.
29
+
pub fn new_static(lang: &'static str) -> Result<Self, langtag::InvalidLangTag<&'static str>> {
30
+
let tag = langtag::LangTag::new(lang)?;
31
+
Ok(Language(SmolStr::new_static(tag.as_str())))
26
32
}
27
33
28
34
/// Infallible constructor for when you *know* the string is a valid IETF language tag.
···
32
38
pub fn raw(lang: impl AsRef<str>) -> Self {
33
39
let lang = lang.as_ref();
34
40
let tag = langtag::LangTag::new(lang).expect("valid IETF language tag");
35
-
Language(SmolStr::new_inline(tag.as_str()))
41
+
Language(SmolStr::new(tag.as_str()))
36
42
}
37
43
38
44
/// Infallible constructor for when you *know* the string is a valid IETF language tag.
39
45
/// Marked unsafe because responsibility for upholding the invariant is on the developer.
40
46
pub unsafe fn unchecked(lang: impl AsRef<str>) -> Self {
41
47
let lang = lang.as_ref();
42
-
Self(SmolStr::new_inline(lang))
48
+
Self(SmolStr::new(lang))
43
49
}
44
50
45
51
/// Returns the LANG as a string slice.
···
112
118
self.as_str()
113
119
}
114
120
}
121
+
122
+
#[cfg(test)]
123
+
mod tests {
124
+
use super::*;
125
+
126
+
#[test]
127
+
fn valid_language_tags() {
128
+
assert!(Language::new("en").is_ok());
129
+
assert!(Language::new("en-US").is_ok());
130
+
assert!(Language::new("zh-Hans").is_ok());
131
+
assert!(Language::new("es-419").is_ok());
132
+
}
133
+
134
+
#[test]
135
+
fn case_insensitive_but_preserves() {
136
+
let lang = Language::new("en-US").unwrap();
137
+
assert_eq!(lang.as_str(), "en-US");
138
+
}
139
+
140
+
#[test]
141
+
fn invalid_tags() {
142
+
assert!(Language::new("").is_err());
143
+
assert!(Language::new("not_a_tag").is_err());
144
+
assert!(Language::new("123").is_err());
145
+
}
146
+
}
+94
crates/jacquard-common/src/types/nsid.rs
···
221
221
self.as_str()
222
222
}
223
223
}
224
+
225
+
#[cfg(test)]
226
+
mod tests {
227
+
use super::*;
228
+
229
+
#[test]
230
+
fn valid_nsids() {
231
+
assert!(Nsid::new("com.example.foo").is_ok());
232
+
assert!(Nsid::new("com.example.fooBar").is_ok());
233
+
assert!(Nsid::new("com.long-domain.foo").is_ok());
234
+
assert!(Nsid::new("a.b.c").is_ok());
235
+
assert!(Nsid::new("a1.b2.c3").is_ok());
236
+
}
237
+
238
+
#[test]
239
+
fn minimum_segments() {
240
+
assert!(Nsid::new("a.b.c").is_ok()); // 3 segments minimum
241
+
assert!(Nsid::new("a.b").is_err());
242
+
assert!(Nsid::new("a").is_err());
243
+
}
244
+
245
+
#[test]
246
+
fn domain_and_name_parsing() {
247
+
let nsid = Nsid::new("com.example.fooBar").unwrap();
248
+
assert_eq!(nsid.domain_authority(), "com.example");
249
+
assert_eq!(nsid.name(), "fooBar");
250
+
}
251
+
252
+
#[test]
253
+
fn max_length() {
254
+
// 317 chars: 63 + 63 + 63 + 63 + 63 = 315 + 4 dots = 319, too much
255
+
// try: 63 + 63 + 63 + 63 + 62 = 314 + 4 dots = 318, still too much
256
+
// try: 63 + 63 + 63 + 63 + 61 = 313 + 4 dots = 317
257
+
let s1 = format!("a{}a", "b".repeat(61));
258
+
let s2 = format!("c{}c", "d".repeat(61));
259
+
let s3 = format!("e{}e", "f".repeat(61));
260
+
let s4 = format!("g{}g", "h".repeat(61));
261
+
let s5 = format!("i{}i", "j".repeat(59));
262
+
let valid_317 = format!("{}.{}.{}.{}.{}", s1, s2, s3, s4, s5);
263
+
assert_eq!(valid_317.len(), 317);
264
+
assert!(Nsid::new(&valid_317).is_ok());
265
+
266
+
let s5_long = format!("i{}i", "j".repeat(60));
267
+
let too_long_318 = format!("{}.{}.{}.{}.{}", s1, s2, s3, s4, s5_long);
268
+
assert_eq!(too_long_318.len(), 318);
269
+
assert!(Nsid::new(&too_long_318).is_err());
270
+
}
271
+
272
+
#[test]
273
+
fn segment_length() {
274
+
let valid_63 = format!("{}.{}.foo", "a".repeat(63), "b".repeat(63));
275
+
assert!(Nsid::new(&valid_63).is_ok());
276
+
277
+
let too_long_64 = format!("{}.b.foo", "a".repeat(64));
278
+
assert!(Nsid::new(&too_long_64).is_err());
279
+
}
280
+
281
+
#[test]
282
+
fn first_segment_cannot_start_with_digit() {
283
+
assert!(Nsid::new("com.example.foo").is_ok());
284
+
assert!(Nsid::new("9com.example.foo").is_err());
285
+
}
286
+
287
+
#[test]
288
+
fn name_segment_rules() {
289
+
assert!(Nsid::new("com.example.foo").is_ok());
290
+
assert!(Nsid::new("com.example.fooBar123").is_ok());
291
+
assert!(Nsid::new("com.example.9foo").is_err()); // can't start with digit
292
+
assert!(Nsid::new("com.example.foo-bar").is_err()); // no hyphens in name
293
+
}
294
+
295
+
#[test]
296
+
fn domain_segment_rules() {
297
+
assert!(Nsid::new("foo-bar.example.baz").is_ok());
298
+
assert!(Nsid::new("foo.bar-baz.qux").is_ok());
299
+
assert!(Nsid::new("-foo.bar.baz").is_err()); // can't start with hyphen
300
+
assert!(Nsid::new("foo-.bar.baz").is_err()); // can't end with hyphen
301
+
}
302
+
303
+
#[test]
304
+
fn case_sensitivity() {
305
+
// Domain should be case-insensitive per spec (but not enforced in validation)
306
+
// Name is case-sensitive
307
+
assert!(Nsid::new("com.example.fooBar").is_ok());
308
+
assert!(Nsid::new("com.example.FooBar").is_ok());
309
+
}
310
+
311
+
#[test]
312
+
fn no_hyphens_in_name() {
313
+
assert!(Nsid::new("com.example.foo").is_ok());
314
+
assert!(Nsid::new("com.example.foo-bar").is_err());
315
+
assert!(Nsid::new("com.example.fooBar").is_ok());
316
+
}
317
+
}
+69
crates/jacquard-common/src/types/recordkey.rs
···
417
417
self.as_str()
418
418
}
419
419
}
420
+
421
+
#[cfg(test)]
422
+
mod tests {
423
+
use super::*;
424
+
425
+
#[test]
426
+
fn valid_rkeys() {
427
+
assert!(Rkey::new("3jzfcijpj2z2a").is_ok()); // TID format
428
+
assert!(Rkey::new("self").is_ok()); // literal
429
+
assert!(Rkey::new("com.example.foo").is_ok()); // NSID format
430
+
assert!(Rkey::new("foo-bar_baz").is_ok());
431
+
assert!(Rkey::new("foo:bar").is_ok());
432
+
assert!(Rkey::new("foo~bar").is_ok());
433
+
}
434
+
435
+
#[test]
436
+
fn length_constraints() {
437
+
assert!(Rkey::new("a").is_ok()); // min 1
438
+
let valid_512 = "a".repeat(512);
439
+
assert_eq!(valid_512.len(), 512);
440
+
assert!(Rkey::new(&valid_512).is_ok());
441
+
442
+
let too_long_513 = "a".repeat(513);
443
+
assert_eq!(too_long_513.len(), 513);
444
+
assert!(Rkey::new(&too_long_513).is_err());
445
+
}
446
+
447
+
#[test]
448
+
fn disallowed_literals() {
449
+
assert!(Rkey::new(".").is_err());
450
+
assert!(Rkey::new("..").is_err());
451
+
assert!(Rkey::new("...").is_ok()); // 3+ dots is fine
452
+
}
453
+
454
+
#[test]
455
+
fn allowed_characters() {
456
+
assert!(Rkey::new("abc123").is_ok());
457
+
assert!(Rkey::new("ABC123").is_ok());
458
+
assert!(Rkey::new("foo-bar").is_ok());
459
+
assert!(Rkey::new("foo_bar").is_ok());
460
+
assert!(Rkey::new("foo.bar").is_ok());
461
+
assert!(Rkey::new("foo:bar").is_ok());
462
+
assert!(Rkey::new("foo~bar").is_ok());
463
+
}
464
+
465
+
#[test]
466
+
fn disallowed_characters() {
467
+
assert!(Rkey::new("foo bar").is_err());
468
+
assert!(Rkey::new("foo@bar").is_err());
469
+
assert!(Rkey::new("foo#bar").is_err());
470
+
assert!(Rkey::new("foo/bar").is_err());
471
+
assert!(Rkey::new("foo\\bar").is_err());
472
+
}
473
+
474
+
#[test]
475
+
fn literal_key_self() {
476
+
let key = LiteralKey::<SelfRecord>::new("self").unwrap();
477
+
assert_eq!(key.as_str(), "self");
478
+
479
+
assert!(LiteralKey::<SelfRecord>::new("Self").is_ok()); // case insensitive
480
+
assert!(LiteralKey::<SelfRecord>::new("other").is_err());
481
+
}
482
+
483
+
#[test]
484
+
fn literal_key_disallowed() {
485
+
assert!(LiteralKey::<SelfRecord>::new(".").is_err());
486
+
assert!(LiteralKey::<SelfRecord>::new("..").is_err());
487
+
}
488
+
}
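A quick usage sketch for the record-key types these tests pin down. The constructors and the `SelfRecord` marker are taken from the tests above; the module path is an assumption:

use jacquard_common::types::recordkey::{LiteralKey, Rkey, SelfRecord}; // path assumed

fn rkey_example() {
    // Generic rkeys: 1..=512 chars from [A-Za-z0-9._:~-], excluding the literals "." and ".."
    assert!(Rkey::new("3jzfcijpj2z2a").is_ok());
    assert!(Rkey::new("..").is_err());

    // Literal keys constrain the rkey to a fixed value such as "self"
    let key = LiteralKey::<SelfRecord>::new("self").expect("literal rkey");
    assert_eq!(key.as_str(), "self");
}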
+1
-1
crates/jacquard-common/src/types/string.rs
+70
crates/jacquard-common/src/types/tid.rs
···
307
307
Self::new()
308
308
}
309
309
}
310
+
311
+
#[cfg(test)]
312
+
mod tests {
313
+
use super::*;
314
+
315
+
#[test]
316
+
fn valid_tids() {
317
+
assert!(Tid::new("3jzfcijpj2z2a").is_ok());
318
+
assert!(Tid::new("2222222222222").is_ok());
319
+
assert!(Tid::new("j7777777777777").is_err()); // j is valid for first char but makes high bit set
320
+
}
321
+
322
+
#[test]
323
+
fn exact_length() {
324
+
assert!(Tid::new("3jzfcijpj2z2a").is_ok());
325
+
assert!(Tid::new("3jzfcijpj2z2").is_err()); // 12 chars
326
+
assert!(Tid::new("3jzfcijpj2z2aa").is_err()); // 14 chars
327
+
}
328
+
329
+
#[test]
330
+
fn first_char_constraint() {
331
+
// First char must be 2-7 or a-j (not k-z)
332
+
assert!(Tid::new("2222222222222").is_ok());
333
+
assert!(Tid::new("7777777777777").is_ok());
334
+
assert!(Tid::new("a222222222222").is_ok());
335
+
assert!(Tid::new("j222222222222").is_ok());
336
+
assert!(Tid::new("k222222222222").is_err());
337
+
assert!(Tid::new("z222222222222").is_err());
338
+
}
339
+
340
+
#[test]
341
+
fn remaining_chars_constraint() {
342
+
// Remaining 12 chars must be 2-7 or a-z
343
+
assert!(Tid::new("3abcdefghijkl").is_ok());
344
+
assert!(Tid::new("3zzzzzzzzzzzz").is_ok());
345
+
assert!(Tid::new("3222222222222").is_ok());
346
+
assert!(Tid::new("3777777777777").is_ok());
347
+
}
348
+
349
+
#[test]
350
+
fn disallowed_characters() {
351
+
assert!(Tid::new("3jzfcijpj2z2A").is_err()); // uppercase
352
+
assert!(Tid::new("3jzfcijpj2z21").is_err()); // 1 not allowed
353
+
assert!(Tid::new("3jzfcijpj2z28").is_err()); // 8 not allowed
354
+
assert!(Tid::new("3jzfcijpj2z2-").is_err()); // special char
355
+
}
356
+
357
+
#[test]
358
+
fn generation_and_comparison() {
359
+
let tid1 = Tid::now_0();
360
+
std::thread::sleep(std::time::Duration::from_micros(10));
361
+
let tid2 = Tid::now_0();
362
+
363
+
assert!(tid1.as_str().len() == 13);
364
+
assert!(tid2.as_str().len() == 13);
365
+
assert!(tid2.newer_than(&tid1));
366
+
assert!(tid1.older_than(&tid2));
367
+
}
368
+
369
+
#[test]
370
+
fn ticker_monotonic() {
371
+
let mut ticker = Ticker::new();
372
+
let tid1 = ticker.next(None);
373
+
let tid2 = ticker.next(Some(tid1.clone()));
374
+
let tid3 = ticker.next(Some(tid2.clone()));
375
+
376
+
assert!(tid2.newer_than(&tid1));
377
+
assert!(tid3.newer_than(&tid2));
378
+
}
379
+
}
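The generation API exercised by the last two tests, in one place. A rough usage sketch only; `now_0`, `Ticker` and `newer_than` come straight from the tests above, while the module path is an assumption:

use jacquard_common::types::tid::{Ticker, Tid}; // path assumed

fn tid_example() {
    // One-off TID for "now": 13 chars of base32-sortable text
    let one_off = Tid::now_0();
    assert_eq!(one_off.as_str().len(), 13);

    // A Ticker hands out strictly increasing TIDs, even within the same instant
    let mut ticker = Ticker::new();
    let first = ticker.next(None);
    let second = ticker.next(Some(first.clone()));
    assert!(second.newer_than(&first));
}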
+21
-324
crates/jacquard-common/src/types/value.rs
···
1
-
use base64::{
2
-
Engine,
3
-
prelude::{BASE64_STANDARD, BASE64_STANDARD_NO_PAD, BASE64_URL_SAFE, BASE64_URL_SAFE_NO_PAD},
4
-
};
1
+
use crate::types::{DataModelType, blob::Blob, string::*};
5
2
use bytes::Bytes;
6
3
use ipld_core::ipld::Ipld;
7
-
use serde::{Deserialize, Deserializer, Serialize, Serializer};
8
4
use smol_str::{SmolStr, ToSmolStr};
9
-
use std::{collections::BTreeMap, str::FromStr};
10
-
use url::Url;
5
+
use std::collections::BTreeMap;
11
6
12
-
use crate::types::{
13
-
DataModelType, LexiconStringType,
14
-
blob::{Blob, MimeType},
15
-
string::*,
16
-
};
7
+
pub mod parsing;
8
+
pub mod serde_impl;
9
+
10
+
#[cfg(test)]
11
+
mod tests;
17
12
18
13
#[derive(Debug, Clone, PartialEq, Eq)]
19
14
pub enum Data<'s> {
···
21
16
Boolean(bool),
22
17
Integer(i64),
23
18
String(AtprotoStr<'s>),
24
-
Bytes(Bytes), // maybe need custom type for serialization
25
-
CidLink(Cid<'s>), // maybe need custom type for serialization
19
+
Bytes(Bytes),
20
+
CidLink(Cid<'s>),
26
21
Array(Array<'s>),
27
22
Object(Object<'s>),
28
23
Blob(Blob<'s>),
···
98
93
json: &'s serde_json::Map<String, serde_json::Value>,
99
94
) -> Result<Data<'s>, AtDataError> {
100
95
if let Some(type_field) = json.get("$type").and_then(|v| v.as_str()) {
101
-
if infer_from_type(type_field) == DataModelType::Blob {
102
-
if let Some(blob) = json_to_blob(json) {
96
+
if parsing::infer_from_type(type_field) == DataModelType::Blob {
97
+
if let Some(blob) = parsing::json_to_blob(json) {
103
98
return Ok(Data::Blob(blob));
104
99
}
105
100
}
···
110
105
if key == "$type" {
111
106
map.insert(key.to_smolstr(), Data::from_json(value)?);
112
107
}
113
-
match string_key_type_guess(key) {
108
+
match parsing::string_key_type_guess(key) {
114
109
DataModelType::Null if value.is_null() => {
115
110
map.insert(key.to_smolstr(), Data::Null);
116
111
}
···
121
116
map.insert(key.to_smolstr(), Data::Integer(value.as_i64().unwrap()));
122
117
}
123
118
DataModelType::Bytes if value.is_string() => {
124
-
map.insert(key.to_smolstr(), decode_bytes(value.as_str().unwrap()));
119
+
map.insert(
120
+
key.to_smolstr(),
121
+
parsing::decode_bytes(value.as_str().unwrap()),
122
+
);
125
123
}
126
124
DataModelType::CidLink => {
127
125
if let Some(value) = value.as_object() {
···
153
151
);
154
152
}
155
153
DataModelType::String(string_type) if value.is_string() => {
156
-
insert_string(&mut map, key, value.as_str().unwrap(), string_type);
154
+
parsing::insert_string(&mut map, key, value.as_str().unwrap(), string_type)?;
157
155
}
158
156
_ => {
159
157
map.insert(key.to_smolstr(), Data::from_json(value)?);
···
166
164
167
165
pub fn from_cbor(cbor: &'s BTreeMap<String, Ipld>) -> Result<Data<'s>, AtDataError> {
168
166
if let Some(Ipld::String(type_field)) = cbor.get("$type") {
169
-
if infer_from_type(type_field) == DataModelType::Blob {
170
-
if let Some(blob) = cbor_to_blob(cbor) {
167
+
if parsing::infer_from_type(type_field) == DataModelType::Blob {
168
+
if let Some(blob) = parsing::cbor_to_blob(cbor) {
171
169
return Ok(Data::Blob(blob));
172
170
}
173
171
}
···
178
176
if key == "$type" {
179
177
map.insert(key.to_smolstr(), Data::from_cbor(value)?);
180
178
}
181
-
match (string_key_type_guess(key), value) {
179
+
match (parsing::string_key_type_guess(key), value) {
182
180
(DataModelType::Null, Ipld::Null) => {
183
181
map.insert(key.to_smolstr(), Data::Null);
184
182
}
···
201
199
map.insert(key.to_smolstr(), Object::from_cbor(value)?);
202
200
}
203
201
(DataModelType::String(string_type), Ipld::String(value)) => {
204
-
insert_string(&mut map, key, value, string_type);
202
+
parsing::insert_string(&mut map, key, value, string_type)?;
205
203
}
206
204
_ => {
207
205
map.insert(key.to_smolstr(), Data::from_cbor(value)?);
···
212
210
Ok(Data::Object(Object(map)))
213
211
}
214
212
}
215
-
216
-
pub fn insert_string<'s>(
217
-
map: &mut BTreeMap<SmolStr, Data<'s>>,
218
-
key: &'s str,
219
-
value: &'s str,
220
-
string_type: LexiconStringType,
221
-
) -> Result<(), AtDataError> {
222
-
match string_type {
223
-
LexiconStringType::Datetime => {
224
-
if let Ok(datetime) = Datetime::from_str(value) {
225
-
map.insert(
226
-
key.to_smolstr(),
227
-
Data::String(AtprotoStr::Datetime(datetime)),
228
-
);
229
-
} else {
230
-
map.insert(
231
-
key.to_smolstr(),
232
-
Data::String(AtprotoStr::String(value.into())),
233
-
);
234
-
}
235
-
}
236
-
LexiconStringType::AtUri => {
237
-
if let Ok(value) = AtUri::new(value) {
238
-
map.insert(key.to_smolstr(), Data::String(AtprotoStr::AtUri(value)));
239
-
} else {
240
-
map.insert(
241
-
key.to_smolstr(),
242
-
Data::String(AtprotoStr::String(value.into())),
243
-
);
244
-
}
245
-
}
246
-
LexiconStringType::Did => {
247
-
if let Ok(value) = Did::new(value) {
248
-
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Did(value)));
249
-
} else {
250
-
map.insert(
251
-
key.to_smolstr(),
252
-
Data::String(AtprotoStr::String(value.into())),
253
-
);
254
-
}
255
-
}
256
-
LexiconStringType::Handle => {
257
-
if let Ok(value) = Handle::new(value) {
258
-
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Handle(value)));
259
-
} else {
260
-
map.insert(
261
-
key.to_smolstr(),
262
-
Data::String(AtprotoStr::String(value.into())),
263
-
);
264
-
}
265
-
}
266
-
LexiconStringType::AtIdentifier => {
267
-
if let Ok(value) = AtIdentifier::new(value) {
268
-
map.insert(
269
-
key.to_smolstr(),
270
-
Data::String(AtprotoStr::AtIdentifier(value)),
271
-
);
272
-
} else {
273
-
map.insert(
274
-
key.to_smolstr(),
275
-
Data::String(AtprotoStr::String(value.into())),
276
-
);
277
-
}
278
-
}
279
-
LexiconStringType::Nsid => {
280
-
if let Ok(value) = Nsid::new(value) {
281
-
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Nsid(value)));
282
-
} else {
283
-
map.insert(
284
-
key.to_smolstr(),
285
-
Data::String(AtprotoStr::String(value.into())),
286
-
);
287
-
}
288
-
}
289
-
LexiconStringType::Cid => {
290
-
if let Ok(value) = Cid::new(value.as_bytes()) {
291
-
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Cid(value)));
292
-
} else {
293
-
map.insert(
294
-
key.to_smolstr(),
295
-
Data::String(AtprotoStr::String(value.into())),
296
-
);
297
-
}
298
-
}
299
-
LexiconStringType::Language => {
300
-
if let Ok(value) = Language::new(value) {
301
-
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Language(value)));
302
-
} else {
303
-
map.insert(
304
-
key.to_smolstr(),
305
-
Data::String(AtprotoStr::String(value.into())),
306
-
);
307
-
}
308
-
}
309
-
LexiconStringType::Tid => {
310
-
if let Ok(value) = Tid::new(value) {
311
-
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Tid(value)));
312
-
} else {
313
-
map.insert(
314
-
key.to_smolstr(),
315
-
Data::String(AtprotoStr::String(value.into())),
316
-
);
317
-
}
318
-
}
319
-
LexiconStringType::RecordKey => {
320
-
if let Ok(value) = Rkey::new(value) {
321
-
map.insert(
322
-
key.to_smolstr(),
323
-
Data::String(AtprotoStr::RecordKey(RecordKey::from(value))),
324
-
);
325
-
} else {
326
-
map.insert(
327
-
key.to_smolstr(),
328
-
Data::String(AtprotoStr::String(value.into())),
329
-
);
330
-
}
331
-
}
332
-
LexiconStringType::Uri(_) => {
333
-
if let Ok(uri) = Uri::new(value) {
334
-
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Uri(uri)));
335
-
} else {
336
-
map.insert(
337
-
key.to_smolstr(),
338
-
Data::String(AtprotoStr::String(value.into())),
339
-
);
340
-
}
341
-
}
342
-
LexiconStringType::String => {
343
-
map.insert(key.to_smolstr(), Data::String(parse_string(value)));
344
-
}
345
-
}
346
-
Ok(())
347
-
}
348
-
349
-
/// smarter parsing to avoid trying as many posibilities.
350
-
pub fn parse_string<'s>(string: &'s str) -> AtprotoStr<'s> {
351
-
if string.len() < 2048 && string.starts_with("did:") {
352
-
if let Ok(did) = Did::new(string) {
353
-
return AtprotoStr::Did(did);
354
-
}
355
-
} else if string.starts_with("20") && string.ends_with("Z") {
356
-
// probably a date (for the next 75 years)
357
-
if let Ok(datetime) = Datetime::from_str(string) {
358
-
return AtprotoStr::Datetime(datetime);
359
-
}
360
-
} else if string.starts_with("at://") {
361
-
if let Ok(uri) = AtUri::new(string) {
362
-
return AtprotoStr::AtUri(uri);
363
-
}
364
-
} else if string.starts_with("https://") {
365
-
if let Ok(uri) = Url::parse(string) {
366
-
return AtprotoStr::Uri(Uri::Https(uri));
367
-
}
368
-
} else if string.starts_with("wss://") {
369
-
if let Ok(uri) = Url::parse(string) {
370
-
return AtprotoStr::Uri(Uri::Https(uri));
371
-
}
372
-
} else if string.starts_with("ipfs://") {
373
-
return AtprotoStr::Uri(Uri::Cid(Cid::str(string)));
374
-
} else if string.contains('.') && !string.contains([' ', '\n']) {
375
-
if string.len() < 253 && Url::parse(string).is_ok() {
376
-
// probably a handle
377
-
if let Ok(handle) = AtIdentifier::new(string) {
378
-
return AtprotoStr::AtIdentifier(handle);
379
-
} else {
380
-
return AtprotoStr::Uri(Uri::Any(string.into()));
381
-
}
382
-
} else if let Ok(nsid) = Nsid::new(string) {
383
-
return AtprotoStr::Nsid(nsid);
384
-
}
385
-
} else if string.len() == 13 {
386
-
if let Ok(tid) = Tid::new(string) {
387
-
return AtprotoStr::Tid(tid);
388
-
}
389
-
} else if !string.contains([' ', '\n']) {
390
-
// cid?
391
-
if let Ok(cid) = Cid::new(string.as_bytes()) {
392
-
return AtprotoStr::Cid(cid);
393
-
}
394
-
}
395
-
396
-
AtprotoStr::String(string.into())
397
-
}
398
-
399
-
/// First-level guess at what we should parse the corresponding value as
400
-
/// Helps speed up parsing, avoids some ambiguities.
401
-
pub fn string_key_type_guess(key: &str) -> DataModelType {
402
-
match key {
403
-
"cid" => DataModelType::String(LexiconStringType::Cid),
404
-
"uri" => DataModelType::String(LexiconStringType::Uri(super::UriType::Any)),
405
-
"did" => DataModelType::String(LexiconStringType::Did),
406
-
"handle" => DataModelType::String(LexiconStringType::AtIdentifier),
407
-
"ref" => DataModelType::CidLink,
408
-
"list" => DataModelType::String(LexiconStringType::AtUri),
409
-
"blobref" => DataModelType::Blob,
410
-
"createdAt" | "created" | "indexedAt" | "issuedAt" | "updatedAt" | "playedTime" => {
411
-
DataModelType::String(LexiconStringType::Datetime)
412
-
}
413
-
"size" | "width" | "height" => DataModelType::Integer,
414
-
"value" | "record" | "embed" => DataModelType::Object,
415
-
"text" | "displayName" | "alt" | "name" | "description" => {
416
-
DataModelType::String(LexiconStringType::String)
417
-
}
418
-
"langs" | "blobs" | "images" | "labels" => DataModelType::Array,
419
-
"$bytes" => DataModelType::Bytes,
420
-
"$link" => DataModelType::String(LexiconStringType::Cid),
421
-
"$type" => DataModelType::String(LexiconStringType::String),
422
-
423
-
// we assume others are strings speficially because it's easy to check if a serde_json::Value
424
-
// or Ipld value is at least a string, and then we fall back to Object/Map.
425
-
_ => DataModelType::String(LexiconStringType::String),
426
-
}
427
-
}
428
-
429
-
pub fn cbor_to_blob<'b>(blob: &'b BTreeMap<String, Ipld>) -> Option<Blob<'b>> {
430
-
let mime_type = blob.get("mimeType").and_then(|o| {
431
-
if let Ipld::String(string) = o {
432
-
Some(string)
433
-
} else {
434
-
None
435
-
}
436
-
});
437
-
if let Some(Ipld::Link(value)) = blob.get("ref") {
438
-
let size = blob.get("size").and_then(|o| {
439
-
if let Ipld::Integer(i) = o {
440
-
Some(*i as i64)
441
-
} else {
442
-
None
443
-
}
444
-
});
445
-
if let (Some(mime_type), Some(size)) = (mime_type, size) {
446
-
return Some(Blob {
447
-
r#ref: Cid::ipld(*value),
448
-
mime_type: MimeType::raw(mime_type),
449
-
size: size as usize,
450
-
});
451
-
}
452
-
} else if let Some(Ipld::String(value)) = blob.get("cid") {
453
-
if let Some(mime_type) = mime_type {
454
-
return Some(Blob {
455
-
r#ref: Cid::str(value),
456
-
mime_type: MimeType::raw(mime_type),
457
-
size: 0,
458
-
});
459
-
}
460
-
}
461
-
462
-
None
463
-
}
464
-
465
-
pub fn json_to_blob<'b>(blob: &'b serde_json::Map<String, serde_json::Value>) -> Option<Blob<'b>> {
466
-
let mime_type = blob.get("mimeType").and_then(|v| v.as_str());
467
-
if let Some(value) = blob.get("ref") {
468
-
if let Some(value) = value
469
-
.as_object()
470
-
.and_then(|o| o.get("$link"))
471
-
.and_then(|v| v.as_str())
472
-
{
473
-
let size = blob.get("size").and_then(|v| v.as_u64());
474
-
if let (Some(mime_type), Some(size)) = (mime_type, size) {
475
-
return Some(Blob {
476
-
r#ref: Cid::str(value),
477
-
mime_type: MimeType::raw(mime_type),
478
-
size: size as usize,
479
-
});
480
-
}
481
-
}
482
-
} else if let Some(value) = blob.get("cid").and_then(|v| v.as_str()) {
483
-
if let Some(mime_type) = mime_type {
484
-
return Some(Blob {
485
-
r#ref: Cid::str(value),
486
-
mime_type: MimeType::raw(mime_type),
487
-
size: 0,
488
-
});
489
-
}
490
-
}
491
-
492
-
None
493
-
}
494
-
495
-
pub fn infer_from_type(type_field: &str) -> DataModelType {
496
-
match type_field {
497
-
"blob" => DataModelType::Blob,
498
-
_ => DataModelType::Object,
499
-
}
500
-
}
501
-
502
-
pub fn decode_bytes<'s>(bytes: &'s str) -> Data<'s> {
503
-
// First one should just work. rest are insurance.
504
-
if let Ok(bytes) = BASE64_STANDARD.decode(bytes) {
505
-
Data::Bytes(Bytes::from_owner(bytes))
506
-
} else if let Ok(bytes) = BASE64_STANDARD_NO_PAD.decode(bytes) {
507
-
Data::Bytes(Bytes::from_owner(bytes))
508
-
} else if let Ok(bytes) = BASE64_URL_SAFE.decode(bytes) {
509
-
Data::Bytes(Bytes::from_owner(bytes))
510
-
} else if let Ok(bytes) = BASE64_URL_SAFE_NO_PAD.decode(bytes) {
511
-
Data::Bytes(Bytes::from_owner(bytes))
512
-
} else {
513
-
Data::String(AtprotoStr::String(bytes.into()))
514
-
}
515
-
}
+320
crates/jacquard-common/src/types/value/parsing.rs
···
1
+
use crate::{
2
+
IntoStatic,
3
+
types::{
4
+
DataModelType, LexiconStringType, UriType,
5
+
blob::{Blob, MimeType},
6
+
string::*,
7
+
value::{AtDataError, Data},
8
+
},
9
+
};
10
+
use base64::{
11
+
Engine,
12
+
prelude::{BASE64_STANDARD, BASE64_STANDARD_NO_PAD, BASE64_URL_SAFE, BASE64_URL_SAFE_NO_PAD},
13
+
};
14
+
use bytes::Bytes;
15
+
use ipld_core::ipld::Ipld;
16
+
use smol_str::{SmolStr, ToSmolStr};
17
+
use std::{collections::BTreeMap, str::FromStr};
18
+
use url::Url;
19
+
20
+
pub fn insert_string<'s>(
21
+
map: &mut BTreeMap<SmolStr, Data<'s>>,
22
+
key: &'s str,
23
+
value: &'s str,
24
+
string_type: LexiconStringType,
25
+
) -> Result<(), AtDataError> {
26
+
match string_type {
27
+
LexiconStringType::Datetime => {
28
+
if let Ok(datetime) = Datetime::from_str(value) {
29
+
map.insert(
30
+
key.to_smolstr(),
31
+
Data::String(AtprotoStr::Datetime(datetime)),
32
+
);
33
+
} else {
34
+
map.insert(
35
+
key.to_smolstr(),
36
+
Data::String(AtprotoStr::String(value.into())),
37
+
);
38
+
}
39
+
}
40
+
LexiconStringType::AtUri => {
41
+
if let Ok(value) = AtUri::new(value) {
42
+
map.insert(key.to_smolstr(), Data::String(AtprotoStr::AtUri(value)));
43
+
} else {
44
+
map.insert(
45
+
key.to_smolstr(),
46
+
Data::String(AtprotoStr::String(value.into())),
47
+
);
48
+
}
49
+
}
50
+
LexiconStringType::Did => {
51
+
if let Ok(value) = Did::new(value) {
52
+
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Did(value)));
53
+
} else {
54
+
map.insert(
55
+
key.to_smolstr(),
56
+
Data::String(AtprotoStr::String(value.into())),
57
+
);
58
+
}
59
+
}
60
+
LexiconStringType::Handle => {
61
+
if let Ok(value) = Handle::new(value) {
62
+
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Handle(value)));
63
+
} else {
64
+
map.insert(
65
+
key.to_smolstr(),
66
+
Data::String(AtprotoStr::String(value.into())),
67
+
);
68
+
}
69
+
}
70
+
LexiconStringType::AtIdentifier => {
71
+
if let Ok(value) = AtIdentifier::new(value) {
72
+
map.insert(
73
+
key.to_smolstr(),
74
+
Data::String(AtprotoStr::AtIdentifier(value)),
75
+
);
76
+
} else {
77
+
map.insert(
78
+
key.to_smolstr(),
79
+
Data::String(AtprotoStr::String(value.into())),
80
+
);
81
+
}
82
+
}
83
+
LexiconStringType::Nsid => {
84
+
if let Ok(value) = Nsid::new(value) {
85
+
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Nsid(value)));
86
+
} else {
87
+
map.insert(
88
+
key.to_smolstr(),
89
+
Data::String(AtprotoStr::String(value.into())),
90
+
);
91
+
}
92
+
}
93
+
LexiconStringType::Cid => {
94
+
if let Ok(value) = Cid::new(value.as_bytes()) {
95
+
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Cid(value)));
96
+
} else {
97
+
map.insert(
98
+
key.to_smolstr(),
99
+
Data::String(AtprotoStr::String(value.into())),
100
+
);
101
+
}
102
+
}
103
+
LexiconStringType::Language => {
104
+
if let Ok(value) = Language::new(value) {
105
+
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Language(value)));
106
+
} else {
107
+
map.insert(
108
+
key.to_smolstr(),
109
+
Data::String(AtprotoStr::String(value.into())),
110
+
);
111
+
}
112
+
}
113
+
LexiconStringType::Tid => {
114
+
if let Ok(value) = Tid::new(value) {
115
+
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Tid(value)));
116
+
} else {
117
+
map.insert(
118
+
key.to_smolstr(),
119
+
Data::String(AtprotoStr::String(value.into())),
120
+
);
121
+
}
122
+
}
123
+
LexiconStringType::RecordKey => {
124
+
if let Ok(value) = Rkey::new(value) {
125
+
map.insert(
126
+
key.to_smolstr(),
127
+
Data::String(AtprotoStr::RecordKey(RecordKey::from(value))),
128
+
);
129
+
} else {
130
+
map.insert(
131
+
key.to_smolstr(),
132
+
Data::String(AtprotoStr::String(value.into())),
133
+
);
134
+
}
135
+
}
136
+
LexiconStringType::Uri(_) => {
137
+
if let Ok(uri) = Uri::new(value) {
138
+
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Uri(uri)));
139
+
} else {
140
+
map.insert(
141
+
key.to_smolstr(),
142
+
Data::String(AtprotoStr::String(value.into())),
143
+
);
144
+
}
145
+
}
146
+
LexiconStringType::String => {
147
+
map.insert(key.to_smolstr(), Data::String(parse_string(value)));
148
+
}
149
+
}
150
+
Ok(())
151
+
}
152
+
153
+
/// Smarter parsing to avoid trying as many possibilities.
154
+
pub fn parse_string<'s>(string: &'s str) -> AtprotoStr<'s> {
155
+
if string.len() < 2048 && string.starts_with("did:") {
156
+
if let Ok(did) = Did::new(string) {
157
+
return AtprotoStr::Did(did);
158
+
}
159
+
} else if string.starts_with("20") && string.ends_with("Z") {
160
+
// probably a date (for the next 75 years)
161
+
if let Ok(datetime) = Datetime::from_str(string) {
162
+
return AtprotoStr::Datetime(datetime);
163
+
}
164
+
} else if string.starts_with("at://") {
165
+
if let Ok(uri) = AtUri::new(string) {
166
+
return AtprotoStr::AtUri(uri);
167
+
}
168
+
} else if string.starts_with("https://") {
169
+
if let Ok(uri) = Url::parse(string) {
170
+
return AtprotoStr::Uri(Uri::Https(uri));
171
+
}
172
+
} else if string.starts_with("wss://") {
173
+
if let Ok(uri) = Url::parse(string) {
174
+
return AtprotoStr::Uri(Uri::Https(uri));
175
+
}
176
+
} else if string.starts_with("ipfs://") {
177
+
return AtprotoStr::Uri(Uri::Cid(Cid::str(string)));
178
+
} else if string.contains('.') && !string.contains([' ', '\n']) {
179
+
if string.len() < 253 && Url::parse(string).is_ok() {
180
+
// probably a handle
181
+
if let Ok(handle) = AtIdentifier::new(string) {
182
+
return AtprotoStr::AtIdentifier(handle);
183
+
} else {
184
+
return AtprotoStr::Uri(Uri::Any(string.into()));
185
+
}
186
+
} else if let Ok(nsid) = Nsid::new(string) {
187
+
return AtprotoStr::Nsid(nsid);
188
+
}
189
+
} else if string.len() == 13 {
190
+
if let Ok(tid) = Tid::new(string) {
191
+
return AtprotoStr::Tid(tid);
192
+
}
193
+
} else if !string.contains([' ', '\n']) && string.len() > 20 {
194
+
// CID: must be longer than typical short strings to avoid false positives
195
+
// Most CIDs are 46+ chars (base32 encoded), minimum realistic is around 30
196
+
if let Ok(cid) = Cid::new(string.as_bytes()) {
197
+
return AtprotoStr::Cid(cid);
198
+
}
199
+
}
200
+
201
+
AtprotoStr::String(string.into())
202
+
}
203
+
204
+
/// First-level guess at what we should parse the corresponding value as
205
+
/// Helps speed up parsing, avoids some ambiguities.
206
+
pub fn string_key_type_guess(key: &str) -> DataModelType {
207
+
match key {
208
+
"cid" => DataModelType::String(LexiconStringType::Cid),
209
+
"uri" => DataModelType::String(LexiconStringType::Uri(UriType::Any)),
210
+
"did" => DataModelType::String(LexiconStringType::Did),
211
+
"handle" => DataModelType::String(LexiconStringType::AtIdentifier),
212
+
"ref" => DataModelType::CidLink,
213
+
"list" => DataModelType::String(LexiconStringType::AtUri),
214
+
"blobref" => DataModelType::Blob,
215
+
"createdAt" | "created" | "indexedAt" | "issuedAt" | "updatedAt" | "playedTime" => {
216
+
DataModelType::String(LexiconStringType::Datetime)
217
+
}
218
+
"size" | "width" | "height" => DataModelType::Integer,
219
+
"value" | "record" | "embed" => DataModelType::Object,
220
+
"text" | "displayName" | "alt" | "name" | "description" => {
221
+
DataModelType::String(LexiconStringType::String)
222
+
}
223
+
"langs" | "blobs" | "images" | "labels" => DataModelType::Array,
224
+
"$bytes" => DataModelType::Bytes,
225
+
"$link" => DataModelType::String(LexiconStringType::Cid),
226
+
"$type" => DataModelType::String(LexiconStringType::String),
227
+
228
+
// we assume others are strings specifically because it's easy to check if a serde_json::Value
229
+
// or Ipld value is at least a string, and then we fall back to Object/Map.
230
+
_ => DataModelType::String(LexiconStringType::String),
231
+
}
232
+
}
233
+
234
+
pub fn cbor_to_blob<'b>(blob: &'b BTreeMap<String, Ipld>) -> Option<Blob<'b>> {
235
+
let mime_type = blob.get("mimeType").and_then(|o| {
236
+
if let Ipld::String(string) = o {
237
+
Some(string)
238
+
} else {
239
+
None
240
+
}
241
+
});
242
+
if let Some(Ipld::Link(value)) = blob.get("ref") {
243
+
let size = blob.get("size").and_then(|o| {
244
+
if let Ipld::Integer(i) = o {
245
+
Some(*i as i64)
246
+
} else {
247
+
None
248
+
}
249
+
});
250
+
if let (Some(mime_type), Some(size)) = (mime_type, size) {
251
+
return Some(Blob {
252
+
r#ref: Cid::ipld(*value),
253
+
mime_type: MimeType::raw(mime_type),
254
+
size: size as usize,
255
+
});
256
+
}
257
+
} else if let Some(Ipld::String(value)) = blob.get("cid") {
258
+
if let Some(mime_type) = mime_type {
259
+
return Some(Blob {
260
+
r#ref: Cid::str(value),
261
+
mime_type: MimeType::raw(mime_type),
262
+
size: 0,
263
+
});
264
+
}
265
+
}
266
+
267
+
None
268
+
}
269
+
270
+
pub fn json_to_blob<'b>(blob: &'b serde_json::Map<String, serde_json::Value>) -> Option<Blob<'b>> {
271
+
let mime_type = blob.get("mimeType").and_then(|v| v.as_str());
272
+
if let Some(value) = blob.get("ref") {
273
+
if let Some(value) = value
274
+
.as_object()
275
+
.and_then(|o| o.get("$link"))
276
+
.and_then(|v| v.as_str())
277
+
{
278
+
let size = blob.get("size").and_then(|v| v.as_u64());
279
+
if let (Some(mime_type), Some(size)) = (mime_type, size) {
280
+
return Some(Blob {
281
+
r#ref: Cid::str(value),
282
+
mime_type: MimeType::raw(mime_type),
283
+
size: size as usize,
284
+
});
285
+
}
286
+
}
287
+
} else if let Some(value) = blob.get("cid").and_then(|v| v.as_str()) {
288
+
if let Some(mime_type) = mime_type {
289
+
return Some(Blob {
290
+
r#ref: Cid::str(value),
291
+
mime_type: MimeType::raw(mime_type),
292
+
size: 0,
293
+
});
294
+
}
295
+
}
296
+
297
+
None
298
+
}
299
+
300
+
pub fn infer_from_type(type_field: &str) -> DataModelType {
301
+
match type_field {
302
+
"blob" => DataModelType::Blob,
303
+
_ => DataModelType::Object,
304
+
}
305
+
}
306
+
307
+
pub fn decode_bytes<'s>(bytes: &str) -> Data<'s> {
308
+
// The standard alphabet should just work; the rest are fallbacks.
309
+
if let Ok(bytes) = BASE64_STANDARD.decode(bytes) {
310
+
Data::Bytes(Bytes::from_owner(bytes))
311
+
} else if let Ok(bytes) = BASE64_STANDARD_NO_PAD.decode(bytes) {
312
+
Data::Bytes(Bytes::from_owner(bytes))
313
+
} else if let Ok(bytes) = BASE64_URL_SAFE.decode(bytes) {
314
+
Data::Bytes(Bytes::from_owner(bytes))
315
+
} else if let Ok(bytes) = BASE64_URL_SAFE_NO_PAD.decode(bytes) {
316
+
Data::Bytes(Bytes::from_owner(bytes))
317
+
} else {
318
+
Data::String(AtprotoStr::String(CowStr::Borrowed(bytes).into_static()))
319
+
}
320
+
}
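To make the heuristics in this new `parsing` module concrete, a small sketch of how untyped strings get classified. Behaviour follows the match arms of `parse_string` and `string_key_type_guess` shown in this hunk; the import paths are assumptions:

use jacquard_common::types::value::parsing; // paths assumed
use jacquard_common::types::string::AtprotoStr;
use jacquard_common::types::{DataModelType, LexiconStringType};

fn parsing_example() {
    // Prefix-based guesses: "did:" strings become Did, "at://" strings become AtUri
    assert!(matches!(parsing::parse_string("did:plc:abc123"), AtprotoStr::Did(_)));
    assert!(matches!(parsing::parse_string("at://did:plc:abc123/app.bsky.feed.post/3jzfcijpj2z2a"), AtprotoStr::AtUri(_)));
    // Anything unrecognised falls back to a plain string
    assert!(matches!(parsing::parse_string("hello world"), AtprotoStr::String(_)));

    // Key-based guesses steer the value parser before it ever looks at the value
    assert!(matches!(parsing::string_key_type_guess("createdAt"), DataModelType::String(LexiconStringType::Datetime)));
    assert!(matches!(parsing::string_key_type_guess("size"), DataModelType::Integer));
}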
+390
crates/jacquard-common/src/types/value/serde_impl.rs
···
1
+
use core::fmt;
2
+
use std::{collections::BTreeMap, str::FromStr};
3
+
4
+
use base64::{Engine, prelude::BASE64_STANDARD};
5
+
use bytes::Bytes;
6
+
use serde::{Deserialize, Deserializer, Serialize, Serializer};
7
+
use smol_str::SmolStr;
8
+
9
+
use crate::{
10
+
IntoStatic,
11
+
types::{
12
+
DataModelType, LexiconStringType,
13
+
blob::{Blob, MimeType},
14
+
string::*,
15
+
value::{
16
+
Array, AtDataError, Data, Object,
17
+
parsing::{decode_bytes, infer_from_type, parse_string, string_key_type_guess},
18
+
},
19
+
},
20
+
};
21
+
22
+
impl Serialize for Data<'_> {
23
+
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
24
+
where
25
+
S: Serializer,
26
+
{
27
+
match self {
28
+
Data::Null => serializer.serialize_none(),
29
+
Data::Boolean(b) => serializer.serialize_bool(*b),
30
+
Data::Integer(i) => serializer.serialize_i64(*i),
31
+
Data::String(s) => s.serialize(serializer),
32
+
Data::Bytes(bytes) => {
33
+
if serializer.is_human_readable() {
34
+
// JSON: {"$bytes": "base64 string"}
35
+
use serde::ser::SerializeMap;
36
+
let mut map = serializer.serialize_map(Some(1))?;
37
+
map.serialize_entry("$bytes", &BASE64_STANDARD.encode(bytes))?;
38
+
map.end()
39
+
} else {
40
+
// CBOR: raw bytes
41
+
serializer.serialize_bytes(bytes)
42
+
}
43
+
}
44
+
Data::CidLink(cid) => {
45
+
if serializer.is_human_readable() {
46
+
// JSON: {"$link": "cid_string"}
47
+
use serde::ser::SerializeMap;
48
+
let mut map = serializer.serialize_map(Some(1))?;
49
+
map.serialize_entry("$link", cid.as_str())?;
50
+
map.end()
51
+
} else {
52
+
// CBOR: raw cid (Cid's serialize handles this)
53
+
cid.serialize(serializer)
54
+
}
55
+
}
56
+
Data::Array(arr) => arr.serialize(serializer),
57
+
Data::Object(obj) => obj.serialize(serializer),
58
+
Data::Blob(blob) => blob.serialize(serializer),
59
+
}
60
+
}
61
+
}
62
+
63
+
impl<'de> Deserialize<'de> for Data<'de> {
64
+
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
65
+
where
66
+
D: Deserializer<'de>,
67
+
{
68
+
deserializer.deserialize_any(DataVisitor)
69
+
}
70
+
}
71
+
72
+
struct DataVisitor;
73
+
74
+
impl<'de: 'v, 'v> serde::de::Visitor<'v> for DataVisitor {
75
+
type Value = Data<'v>;
76
+
77
+
fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
78
+
formatter.write_str("any valid AT Protocol data value")
79
+
}
80
+
81
+
fn visit_none<E>(self) -> Result<Self::Value, E>
82
+
where
83
+
E: serde::de::Error,
84
+
{
85
+
Ok(Data::Null)
86
+
}
87
+
88
+
fn visit_unit<E>(self) -> Result<Self::Value, E>
89
+
where
90
+
E: serde::de::Error,
91
+
{
92
+
Ok(Data::Null)
93
+
}
94
+
95
+
fn visit_bool<E>(self, v: bool) -> Result<Self::Value, E>
96
+
where
97
+
E: serde::de::Error,
98
+
{
99
+
Ok(Data::Boolean(v))
100
+
}
101
+
102
+
fn visit_i64<E>(self, v: i64) -> Result<Self::Value, E>
103
+
where
104
+
E: serde::de::Error,
105
+
{
106
+
Ok(Data::Integer(v))
107
+
}
108
+
109
+
fn visit_u64<E>(self, v: u64) -> Result<Self::Value, E>
110
+
where
111
+
E: serde::de::Error,
112
+
{
113
+
Ok(Data::Integer(v as i64))
114
+
}
115
+
116
+
fn visit_f64<E>(self, _v: f64) -> Result<Self::Value, E>
117
+
where
118
+
E: serde::de::Error,
119
+
{
120
+
Err(E::custom(
121
+
"floating point numbers not allowed in AT protocol data",
122
+
))
123
+
}
124
+
125
+
fn visit_str<E>(self, v: &str) -> Result<Self::Value, E>
126
+
where
127
+
E: serde::de::Error,
128
+
{
129
+
Ok(Data::String(AtprotoStr::String(
130
+
CowStr::Borrowed(v).into_static(),
131
+
)))
132
+
}
133
+
134
+
fn visit_borrowed_str<E>(self, v: &'v str) -> Result<Self::Value, E>
135
+
where
136
+
E: serde::de::Error,
137
+
{
138
+
// Don't infer type here - just store as plain string
139
+
// Type inference happens in apply_type_inference based on field names
140
+
Ok(Data::String(AtprotoStr::String(v.into())))
141
+
}
142
+
143
+
fn visit_string<E>(self, v: String) -> Result<Self::Value, E>
144
+
where
145
+
E: serde::de::Error,
146
+
{
147
+
Ok(Data::String(AtprotoStr::String(v.into())))
148
+
}
149
+
150
+
fn visit_bytes<E>(self, v: &[u8]) -> Result<Self::Value, E>
151
+
where
152
+
E: serde::de::Error,
153
+
{
154
+
Ok(Data::Bytes(Bytes::copy_from_slice(v)))
155
+
}
156
+
157
+
fn visit_seq<A>(self, mut seq: A) -> Result<Self::Value, A::Error>
158
+
where
159
+
A: serde::de::SeqAccess<'v>,
160
+
{
161
+
let mut array = Vec::new();
162
+
while let Some(elem) = seq.next_element()? {
163
+
array.push(elem);
164
+
}
165
+
Ok(Data::Array(Array(array)))
166
+
}
167
+
168
+
fn visit_map<A>(self, mut map: A) -> Result<Self::Value, A::Error>
169
+
where
170
+
A: serde::de::MapAccess<'v>,
171
+
{
172
+
use serde::de::Error;
173
+
174
+
// Peek at first key to check for special single-key patterns
175
+
let mut temp_map: BTreeMap<SmolStr, Data<'v>> = BTreeMap::new();
176
+
177
+
while let Some(key) = map.next_key::<SmolStr>()? {
178
+
// Check for special patterns on single-key maps
179
+
if temp_map.is_empty() {
180
+
if key.as_str() == "$link" {
181
+
// {"$link": "cid_string"} pattern
182
+
let cid_str: String = map.next_value()?;
183
+
// Check if there are more keys
184
+
if let Some(next_key) = map.next_key::<SmolStr>()? {
185
+
// More keys, treat as regular object
186
+
temp_map.insert(key, Data::String(AtprotoStr::String(cid_str.into())));
187
+
let next_value: Data = map.next_value()?;
188
+
temp_map.insert(next_key, next_value);
189
+
continue;
190
+
} else {
191
+
// Only key, return CidLink
192
+
return Ok(Data::CidLink(Cid::from(cid_str)));
193
+
}
194
+
} else if key.as_str() == "$bytes" {
195
+
// {"$bytes": "base64_string"} pattern
196
+
let bytes_str: String = map.next_value()?;
197
+
// Check if there are more keys
198
+
if map.next_key::<SmolStr>()?.is_some() {
199
+
// More keys, treat as regular object - shouldn't happen but handle it
200
+
temp_map.insert(key, Data::String(AtprotoStr::String(bytes_str.into())));
201
+
continue;
202
+
} else {
203
+
// Only key, decode and return bytes
204
+
return Ok(decode_bytes(&bytes_str));
205
+
}
206
+
}
207
+
}
208
+
209
+
let value: Data = map.next_value()?;
210
+
temp_map.insert(key, value);
211
+
}
212
+
213
+
// Second pass: apply type inference and check for special patterns
214
+
apply_type_inference(temp_map).map_err(A::Error::custom)
215
+
}
216
+
}
217
+
218
+
fn apply_type_inference<'s>(mut map: BTreeMap<SmolStr, Data<'s>>) -> Result<Data<'s>, AtDataError> {
219
+
// Check for CID link pattern first: {"$link": "cid_string"}
220
+
if map.len() == 1 {
221
+
if let Some(Data::String(AtprotoStr::String(link))) = map.get("$link") {
222
+
// Need to extract ownership, can't borrow from map we're about to consume
223
+
let link_owned = link.clone();
224
+
return Ok(Data::CidLink(Cid::cow_str(link_owned)));
225
+
}
226
+
}
227
+
228
+
// Check for $type field to detect special structures
229
+
let type_field = map.get("$type").and_then(|v| {
230
+
if let Data::String(AtprotoStr::String(s)) = v {
231
+
Some(s.as_ref())
232
+
} else {
233
+
None
234
+
}
235
+
});
236
+
237
+
// Check for blob
238
+
if let Some(type_str) = type_field {
239
+
if type_str == "blob" && infer_from_type(type_str) == DataModelType::Blob {
240
+
// Try to construct blob from the collected data
241
+
let ref_cid = map.get("ref").and_then(|v| {
242
+
if let Data::CidLink(cid) = v {
243
+
Some(cid.clone())
244
+
} else {
245
+
None
246
+
}
247
+
});
248
+
249
+
let mime_type = map.get("mimeType").and_then(|v| {
250
+
if let Data::String(AtprotoStr::String(s)) = v {
251
+
Some(s.clone())
252
+
} else {
253
+
None
254
+
}
255
+
});
256
+
257
+
let size = map.get("size").and_then(|v| {
258
+
if let Data::Integer(i) = v {
259
+
Some(*i as usize)
260
+
} else {
261
+
None
262
+
}
263
+
});
264
+
265
+
if let (Some(ref_cid), Some(mime_cowstr), Some(size)) = (ref_cid, mime_type, size) {
266
+
return Ok(Data::Blob(Blob {
267
+
r#ref: ref_cid,
268
+
mime_type: MimeType::from(mime_cowstr),
269
+
size,
270
+
}));
271
+
}
272
+
}
273
+
}
274
+
275
+
// Apply type inference for string fields based on key names (mutate in place)
276
+
for (key, value) in map.iter_mut() {
277
+
if let Data::String(AtprotoStr::String(s)) = value.to_owned() {
278
+
let type_hint = string_key_type_guess(key.as_str());
279
+
let refined = match type_hint {
280
+
DataModelType::String(string_type) => refine_string_by_type(s, string_type),
281
+
DataModelType::Bytes => {
282
+
// Decode base64
283
+
decode_bytes(&s)
284
+
}
285
+
DataModelType::CidLink if key.as_str() == "$link" => {
286
+
Data::CidLink(Cid::from_str(&s).unwrap())
287
+
}
288
+
_ => continue, // no refinement needed
289
+
};
290
+
*value = refined;
291
+
}
292
+
}
293
+
294
+
Ok(Data::Object(Object(map)))
295
+
}
296
+
297
+
fn refine_string_by_type<'s>(s: CowStr<'s>, string_type: LexiconStringType) -> Data<'s> {
298
+
match string_type {
299
+
LexiconStringType::Datetime => Datetime::from_str(&s)
300
+
.map(|dt| Data::String(AtprotoStr::Datetime(dt)))
301
+
.unwrap_or_else(|_| Data::String(AtprotoStr::String(s.clone()))),
302
+
LexiconStringType::AtUri => AtUri::new_owned(s.clone())
303
+
.map(|uri| Data::String(AtprotoStr::AtUri(uri)))
304
+
.unwrap_or_else(|_| Data::String(AtprotoStr::String(s.clone()))),
305
+
LexiconStringType::Did => Did::new_owned(s.clone())
306
+
.map(|did| Data::String(AtprotoStr::Did(did)))
307
+
.unwrap_or_else(|_| Data::String(AtprotoStr::String(s.clone()))),
308
+
LexiconStringType::Handle => Handle::new_owned(s.clone())
309
+
.map(|handle| Data::String(AtprotoStr::Handle(handle)))
310
+
.unwrap_or_else(|_| Data::String(AtprotoStr::String(s.clone()))),
311
+
LexiconStringType::AtIdentifier => AtIdentifier::new_owned(s.clone())
312
+
.map(|ident| Data::String(AtprotoStr::AtIdentifier(ident)))
313
+
.unwrap_or_else(|_| Data::String(AtprotoStr::String(s.clone()))),
314
+
LexiconStringType::Nsid => Nsid::new_owned(s.clone())
315
+
.map(|nsid| Data::String(AtprotoStr::Nsid(nsid)))
316
+
.unwrap_or_else(|_| Data::String(AtprotoStr::String(s.clone()))),
317
+
LexiconStringType::Cid => Cid::new_owned(s.as_bytes())
318
+
.map(|cid| Data::String(AtprotoStr::Cid(cid)))
319
+
.unwrap_or_else(|_| Data::String(AtprotoStr::String(s.into()))),
320
+
LexiconStringType::Language => Language::new(&s)
321
+
.map(|lang| Data::String(AtprotoStr::Language(lang)))
322
+
.unwrap_or_else(|_| Data::String(AtprotoStr::String(s.clone()))),
323
+
LexiconStringType::Tid => Tid::new(s.clone())
324
+
.map(|tid| Data::String(AtprotoStr::Tid(tid)))
325
+
.unwrap_or_else(|_| Data::String(AtprotoStr::String(s.clone()))),
326
+
LexiconStringType::RecordKey => Rkey::new_owned(s.clone())
327
+
.map(|rkey| Data::String(AtprotoStr::RecordKey(RecordKey::from(rkey))))
328
+
.unwrap_or_else(|_| Data::String(AtprotoStr::String(s.clone()))),
329
+
LexiconStringType::Uri(_) => Uri::new_owned(s.clone())
330
+
.map(|uri| Data::String(AtprotoStr::Uri(uri)))
331
+
.unwrap_or_else(|_| Data::String(AtprotoStr::String(s.clone()))),
332
+
LexiconStringType::String => Data::String(parse_string(&s).into_static()),
333
+
}
334
+
}
335
+
336
+
impl Serialize for Array<'_> {
337
+
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
338
+
where
339
+
S: Serializer,
340
+
{
341
+
use serde::ser::SerializeSeq;
342
+
let mut seq = serializer.serialize_seq(Some(self.0.len()))?;
343
+
for item in &self.0 {
344
+
seq.serialize_element(item)?;
345
+
}
346
+
seq.end()
347
+
}
348
+
}
349
+
350
+
impl<'de> Deserialize<'de> for Array<'de> {
351
+
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
352
+
where
353
+
D: Deserializer<'de>,
354
+
{
355
+
// Just deserialize as Vec<Data> directly - the Data visitor handles everything
356
+
let vec: Vec<Data<'de>> = Deserialize::deserialize(deserializer)?;
357
+
Ok(Array(vec))
358
+
}
359
+
}
360
+
361
+
impl Serialize for Object<'_> {
362
+
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
363
+
where
364
+
S: Serializer,
365
+
{
366
+
use serde::ser::SerializeMap;
367
+
let mut map = serializer.serialize_map(Some(self.0.len()))?;
368
+
for (key, value) in &self.0 {
369
+
map.serialize_entry(key.as_str(), value)?;
370
+
}
371
+
map.end()
372
+
}
373
+
}
374
+
375
+
impl<'de> Deserialize<'de> for Object<'de> {
376
+
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
377
+
where
378
+
D: Deserializer<'de>,
379
+
{
380
+
use serde::de::Error;
381
+
382
+
// Deserialize via Data, then extract the Object
383
+
// The Data visitor handles all the type inference and special cases
384
+
let data: Data<'de> = Data::deserialize(deserializer)?;
385
+
match data {
386
+
Data::Object(obj) => Ok(obj),
387
+
_ => Err(D::Error::custom("expected object, got something else")),
388
+
}
389
+
}
390
+
}
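With the Serialize/Deserialize impls above in place, a `Data` tree round-trips through serde_json, with field-name driven inference applied on the way in. A minimal sketch mirroring the unit tests that follow; the import path is assumed:

use jacquard_common::types::value::Data; // path assumed

fn roundtrip_example() {
    // "createdAt" comes back as a typed Datetime, unknown integer fields stay Data::Integer
    let json = r#"{"createdAt": "2023-01-15T12:30:45.123456Z", "likeCount": 3}"#;
    let data: Data = serde_json::from_str(json).expect("valid atproto data");
    let back = serde_json::to_string(&data).expect("serializes");
    assert!(back.contains("createdAt"));
}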
+364
crates/jacquard-common/src/types/value/test_thread.json
···
1
+
{
2
+
"hasOtherReplies": false,
3
+
"thread": [
4
+
{
5
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k3p7lek2k",
6
+
"depth": 0,
7
+
"value": {
8
+
"$type": "app.bsky.unspecced.defs#threadItemPost",
9
+
"post": {
10
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k3p7lek2k",
11
+
"cid": "bafyreicvplbzmlrbwdxv2zpbhibziexxnwmiskvzbapjtnidofzlh4yk64",
12
+
"author": {
13
+
"did": "did:plc:hbpefio3f5csc44msmbgioxz",
14
+
"handle": "sharonk.bsky.social",
15
+
"displayName": "Sharon",
16
+
"avatar": "https://cdn.bsky.app/img/avatar/plain/did:plc:hbpefio3f5csc44msmbgioxz/bafkreia7dcruptjvvv7t46322zqsuqukkwblihzrm3f45r246o5zjulyn4@jpeg",
17
+
"associated": {
18
+
"chat": { "allowIncoming": "following" },
19
+
"activitySubscription": { "allowSubscriptions": "mutuals" }
20
+
},
21
+
"viewer": {
22
+
"muted": false,
23
+
"blockedBy": false,
24
+
"following": "at://did:plc:yfvwmnlztr4dwkb7hwz55r2g/app.bsky.graph.follow/3l6aft4mu2324",
25
+
"followedBy": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.graph.follow/3kygkag25zo2v"
26
+
},
27
+
"labels": [
28
+
{
29
+
"cts": "2024-05-11T03:48:55.341Z",
30
+
"src": "did:plc:e4elbtctnfqocyfcml6h2lf7",
31
+
"uri": "did:plc:hbpefio3f5csc44msmbgioxz",
32
+
"val": "bluesky-elder",
33
+
"ver": 1
34
+
},
35
+
{
36
+
"src": "did:plc:hbpefio3f5csc44msmbgioxz",
37
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.actor.profile/self",
38
+
"cid": "bafyreihtzylnytd2224tatvvpktt5rwvdwppkyhlsrweshlrmnqcq32vsy",
39
+
"val": "!no-unauthenticated",
40
+
"cts": "1970-01-01T00:00:00.000Z"
41
+
}
42
+
],
43
+
"createdAt": "2023-04-13T22:29:27.076Z"
44
+
},
45
+
"record": {
46
+
"$type": "app.bsky.feed.post",
47
+
"createdAt": "2025-10-01T17:15:19.282Z",
48
+
"embed": {
49
+
"$type": "app.bsky.embed.record",
50
+
"record": {
51
+
"cid": "bafyreidmo5ot3qoctmgw2vcckrqzy5hexocp5vto554a5kprwxhbsy3oqi",
52
+
"uri": "at://did:plc:2whlowi5jjjqrdrrj4lxh2lx/app.bsky.feed.post/3m25ixj2fec2a"
53
+
}
54
+
},
55
+
"langs": ["en"],
56
+
"text": "Sora 2 going to hit boomer epistemology like a hurricane"
57
+
},
58
+
"embed": {
59
+
"$type": "app.bsky.embed.record#view",
60
+
"record": {
61
+
"$type": "app.bsky.embed.record#viewRecord",
62
+
"uri": "at://did:plc:2whlowi5jjjqrdrrj4lxh2lx/app.bsky.feed.post/3m25ixj2fec2a",
63
+
"cid": "bafyreidmo5ot3qoctmgw2vcckrqzy5hexocp5vto554a5kprwxhbsy3oqi",
64
+
"author": {
65
+
"did": "did:plc:2whlowi5jjjqrdrrj4lxh2lx",
66
+
"handle": "eliothiggins.bsky.social",
67
+
"displayName": "Eliot Higgins",
68
+
"avatar": "https://cdn.bsky.app/img/avatar/plain/did:plc:2whlowi5jjjqrdrrj4lxh2lx/bafkreiarcjakxx7hkgtfocqilj22vxgmyskl43blurk6vwu2nfvd5ihueu@jpeg",
69
+
"associated": { "activitySubscription": { "allowSubscriptions": "followers" } },
70
+
"viewer": { "muted": false, "blockedBy": false },
71
+
"labels": [],
72
+
"createdAt": "2024-11-06T08:20:47.084Z",
73
+
"verification": {
74
+
"verifications": [
75
+
{
76
+
"issuer": "did:plc:z72i7hdynmk6r22z27h6tvur",
77
+
"uri": "at://did:plc:z72i7hdynmk6r22z27h6tvur/app.bsky.graph.verification/3lv44jce72j2q",
78
+
"isValid": true,
79
+
"createdAt": "2025-07-29T12:33:45.033Z"
80
+
}
81
+
],
82
+
"verifiedStatus": "valid",
83
+
"trustedVerifierStatus": "none"
84
+
}
85
+
},
86
+
"value": {
87
+
"$type": "app.bsky.feed.post",
88
+
"createdAt": "2025-10-01T16:55:04.861Z",
89
+
"embed": {
90
+
"$type": "app.bsky.embed.video",
91
+
"aspectRatio": { "height": 720, "width": 1280 },
92
+
"video": {
93
+
"$type": "blob",
94
+
"ref": { "$link": "bafkreid7ybejd5s2vv2j7d4aajjlmdgazguemcnuliiyfn6coxpwp2mi6y" },
95
+
"mimeType": "video/mp4",
96
+
"size": 2244592
97
+
}
98
+
},
99
+
"langs": ["en"],
100
+
"text": "Really good news for fans of garbage in their timelines, Sora 2 allows anyone to use copyrighted characters to sell you cryptocurrency."
101
+
},
102
+
"labels": [],
103
+
"likeCount": 248,
104
+
"replyCount": 18,
105
+
"repostCount": 67,
106
+
"quoteCount": 52,
107
+
"indexedAt": "2025-10-01T16:55:06.653Z",
108
+
"embeds": [
109
+
{
110
+
"$type": "app.bsky.embed.video#view",
111
+
"cid": "bafkreid7ybejd5s2vv2j7d4aajjlmdgazguemcnuliiyfn6coxpwp2mi6y",
112
+
"playlist": "https://video.bsky.app/watch/did%3Aplc%3A2whlowi5jjjqrdrrj4lxh2lx/bafkreid7ybejd5s2vv2j7d4aajjlmdgazguemcnuliiyfn6coxpwp2mi6y/playlist.m3u8",
113
+
"thumbnail": "https://video.bsky.app/watch/did%3Aplc%3A2whlowi5jjjqrdrrj4lxh2lx/bafkreid7ybejd5s2vv2j7d4aajjlmdgazguemcnuliiyfn6coxpwp2mi6y/thumbnail.jpg",
114
+
"aspectRatio": { "height": 720, "width": 1280 }
115
+
}
116
+
]
117
+
}
118
+
},
119
+
"bookmarkCount": 3,
120
+
"replyCount": 1,
121
+
"repostCount": 7,
122
+
"likeCount": 76,
123
+
"quoteCount": 1,
124
+
"indexedAt": "2025-10-01T17:15:19.523Z",
125
+
"viewer": { "bookmarked": false, "threadMuted": false, "replyDisabled": false, "embeddingDisabled": false },
126
+
"labels": [],
127
+
"threadgate": {
128
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.threadgate/3m25k3p7lek2k",
129
+
"cid": "bafyreiclfmvpqfsfhpl4cffaw6elfdusx5wosinugljpwy2u3ckjse3rse",
130
+
"record": {
131
+
"$type": "app.bsky.feed.threadgate",
132
+
"allow": [
133
+
{ "$type": "app.bsky.feed.threadgate#followingRule" },
134
+
{ "$type": "app.bsky.feed.threadgate#mentionRule" },
135
+
{ "$type": "app.bsky.feed.threadgate#followerRule" }
136
+
],
137
+
"createdAt": "2025-10-01T17:15:19.285Z",
138
+
"hiddenReplies": [],
139
+
"post": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k3p7lek2k"
140
+
},
141
+
"lists": []
142
+
}
143
+
},
144
+
"moreParents": false,
145
+
"moreReplies": 0,
146
+
"opThread": true,
147
+
"hiddenByThreadgate": false,
148
+
"mutedByViewer": false
149
+
}
150
+
},
151
+
{
152
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k6zjhps2k",
153
+
"depth": 1,
154
+
"value": {
155
+
"$type": "app.bsky.unspecced.defs#threadItemPost",
156
+
"post": {
157
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k6zjhps2k",
158
+
"cid": "bafyreieqxxi7nwep5nuhogkv3tgub4rk4pv5tbh3m6yyf66nqdmmrknwsa",
159
+
"author": {
160
+
"did": "did:plc:hbpefio3f5csc44msmbgioxz",
161
+
"handle": "sharonk.bsky.social",
162
+
"displayName": "Sharon",
163
+
"avatar": "https://cdn.bsky.app/img/avatar/plain/did:plc:hbpefio3f5csc44msmbgioxz/bafkreia7dcruptjvvv7t46322zqsuqukkwblihzrm3f45r246o5zjulyn4@jpeg",
164
+
"associated": {
165
+
"chat": { "allowIncoming": "following" },
166
+
"activitySubscription": { "allowSubscriptions": "mutuals" }
167
+
},
168
+
"viewer": {
169
+
"muted": false,
170
+
"blockedBy": false,
171
+
"following": "at://did:plc:yfvwmnlztr4dwkb7hwz55r2g/app.bsky.graph.follow/3l6aft4mu2324",
172
+
"followedBy": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.graph.follow/3kygkag25zo2v"
173
+
},
174
+
"labels": [
175
+
{
176
+
"cts": "2024-05-11T03:48:55.341Z",
177
+
"src": "did:plc:e4elbtctnfqocyfcml6h2lf7",
178
+
"uri": "did:plc:hbpefio3f5csc44msmbgioxz",
179
+
"val": "bluesky-elder",
180
+
"ver": 1
181
+
},
182
+
{
183
+
"src": "did:plc:hbpefio3f5csc44msmbgioxz",
184
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.actor.profile/self",
185
+
"cid": "bafyreihtzylnytd2224tatvvpktt5rwvdwppkyhlsrweshlrmnqcq32vsy",
186
+
"val": "!no-unauthenticated",
187
+
"cts": "1970-01-01T00:00:00.000Z"
188
+
}
189
+
],
190
+
"createdAt": "2023-04-13T22:29:27.076Z"
191
+
},
192
+
"record": {
193
+
"$type": "app.bsky.feed.post",
194
+
"createdAt": "2025-10-01T17:17:10.755Z",
195
+
"langs": ["en"],
196
+
"reply": {
197
+
"parent": {
198
+
"cid": "bafyreicvplbzmlrbwdxv2zpbhibziexxnwmiskvzbapjtnidofzlh4yk64",
199
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k3p7lek2k"
200
+
},
201
+
"root": {
202
+
"cid": "bafyreicvplbzmlrbwdxv2zpbhibziexxnwmiskvzbapjtnidofzlh4yk64",
203
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k3p7lek2k"
204
+
}
205
+
},
206
+
"text": "it's funny how these all feel like stuff that would be playing on the Cyberpunk 2077 mediafeeds"
207
+
},
208
+
"bookmarkCount": 0,
209
+
"replyCount": 2,
210
+
"repostCount": 1,
211
+
"likeCount": 28,
212
+
"quoteCount": 0,
213
+
"indexedAt": "2025-10-01T17:17:11.164Z",
214
+
"viewer": { "bookmarked": false, "threadMuted": false, "replyDisabled": false, "embeddingDisabled": false },
215
+
"labels": []
216
+
},
217
+
"moreParents": false,
218
+
"moreReplies": 0,
219
+
"opThread": true,
220
+
"hiddenByThreadgate": false,
221
+
"mutedByViewer": false
222
+
}
223
+
},
224
+
{
225
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25l6doydc26",
226
+
"depth": 2,
227
+
"value": {
228
+
"$type": "app.bsky.unspecced.defs#threadItemPost",
229
+
"post": {
230
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25l6doydc26",
231
+
"cid": "bafyreia7yd5gbxdw4djkmocr6sbg2bgczup6d6pdmctsspdrne3vhcll6q",
232
+
"author": {
233
+
"did": "did:plc:hbpefio3f5csc44msmbgioxz",
234
+
"handle": "sharonk.bsky.social",
235
+
"displayName": "Sharon",
236
+
"avatar": "https://cdn.bsky.app/img/avatar/plain/did:plc:hbpefio3f5csc44msmbgioxz/bafkreia7dcruptjvvv7t46322zqsuqukkwblihzrm3f45r246o5zjulyn4@jpeg",
237
+
"associated": {
238
+
"chat": { "allowIncoming": "following" },
239
+
"activitySubscription": { "allowSubscriptions": "mutuals" }
240
+
},
241
+
"viewer": {
242
+
"muted": false,
243
+
"blockedBy": false,
244
+
"following": "at://did:plc:yfvwmnlztr4dwkb7hwz55r2g/app.bsky.graph.follow/3l6aft4mu2324",
245
+
"followedBy": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.graph.follow/3kygkag25zo2v"
246
+
},
247
+
"labels": [
248
+
{
249
+
"cts": "2024-05-11T03:48:55.341Z",
250
+
"src": "did:plc:e4elbtctnfqocyfcml6h2lf7",
251
+
"uri": "did:plc:hbpefio3f5csc44msmbgioxz",
252
+
"val": "bluesky-elder",
253
+
"ver": 1
254
+
},
255
+
{
256
+
"src": "did:plc:hbpefio3f5csc44msmbgioxz",
257
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.actor.profile/self",
258
+
"cid": "bafyreihtzylnytd2224tatvvpktt5rwvdwppkyhlsrweshlrmnqcq32vsy",
259
+
"val": "!no-unauthenticated",
260
+
"cts": "1970-01-01T00:00:00.000Z"
261
+
}
262
+
],
263
+
"createdAt": "2023-04-13T22:29:27.076Z"
264
+
},
265
+
"record": {
266
+
"$type": "app.bsky.feed.post",
267
+
"createdAt": "2025-10-01T17:34:41.609Z",
268
+
"langs": ["en"],
269
+
"reply": {
270
+
"parent": {
271
+
"cid": "bafyreieqxxi7nwep5nuhogkv3tgub4rk4pv5tbh3m6yyf66nqdmmrknwsa",
272
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k6zjhps2k"
273
+
},
274
+
"root": {
275
+
"cid": "bafyreicvplbzmlrbwdxv2zpbhibziexxnwmiskvzbapjtnidofzlh4yk64",
276
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k3p7lek2k"
277
+
}
278
+
},
279
+
"text": "\"Morty, I've turned myself into a rugpull, Morty!\""
280
+
},
281
+
"bookmarkCount": 0,
282
+
"replyCount": 0,
283
+
"repostCount": 0,
284
+
"likeCount": 3,
285
+
"quoteCount": 0,
286
+
"indexedAt": "2025-10-01T17:34:41.858Z",
287
+
"viewer": { "bookmarked": false, "threadMuted": false, "replyDisabled": false, "embeddingDisabled": false },
288
+
"labels": []
289
+
},
290
+
"moreParents": false,
291
+
"moreReplies": 0,
292
+
"opThread": true,
293
+
"hiddenByThreadgate": false,
294
+
"mutedByViewer": false
295
+
}
296
+
},
297
+
{
298
+
"uri": "at://did:plc:duaatzbzy7qm4ppl2hluilpg/app.bsky.feed.post/3m25knxro3s2t",
299
+
"depth": 2,
300
+
"value": {
301
+
"$type": "app.bsky.unspecced.defs#threadItemPost",
302
+
"post": {
303
+
"uri": "at://did:plc:duaatzbzy7qm4ppl2hluilpg/app.bsky.feed.post/3m25knxro3s2t",
304
+
"cid": "bafyreidppmo5vaepnafjlwc5doxmk7n3hrctve6cmcz67hhkcuxf5cr6tm",
305
+
"author": {
306
+
"did": "did:plc:duaatzbzy7qm4ppl2hluilpg",
307
+
"handle": "gentlemanengineer.bsky.social",
308
+
"displayName": "Chris Magerkurth",
309
+
"avatar": "https://cdn.bsky.app/img/avatar/plain/did:plc:duaatzbzy7qm4ppl2hluilpg/bafkreifqxgriyccwjt5pwahsx6sju2bdmtoytqyubxoxlfwk4rv3eedkia@jpeg",
310
+
"associated": { "activitySubscription": { "allowSubscriptions": "followers" } },
311
+
"viewer": { "muted": false, "blockedBy": false },
312
+
"labels": [],
313
+
"createdAt": "2025-02-22T02:03:41.644Z"
314
+
},
315
+
"record": {
316
+
"$type": "app.bsky.feed.post",
317
+
"createdAt": "2025-10-01T17:25:32.243Z",
318
+
"langs": ["en"],
319
+
"reply": {
320
+
"parent": {
321
+
"cid": "bafyreieqxxi7nwep5nuhogkv3tgub4rk4pv5tbh3m6yyf66nqdmmrknwsa",
322
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k6zjhps2k"
323
+
},
324
+
"root": {
325
+
"cid": "bafyreicvplbzmlrbwdxv2zpbhibziexxnwmiskvzbapjtnidofzlh4yk64",
326
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k3p7lek2k"
327
+
}
328
+
},
329
+
"text": "Or on Interdimensional Cable from Rick and Morty."
330
+
},
331
+
"bookmarkCount": 0,
332
+
"replyCount": 0,
333
+
"repostCount": 0,
334
+
"likeCount": 1,
335
+
"quoteCount": 0,
336
+
"indexedAt": "2025-10-01T17:25:25.858Z",
337
+
"viewer": { "bookmarked": false, "threadMuted": false, "replyDisabled": false, "embeddingDisabled": false },
338
+
"labels": []
339
+
},
340
+
"moreParents": false,
341
+
"moreReplies": 0,
342
+
"opThread": false,
343
+
"hiddenByThreadgate": false,
344
+
"mutedByViewer": false
345
+
}
346
+
}
347
+
],
348
+
"threadgate": {
349
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.threadgate/3m25k3p7lek2k",
350
+
"cid": "bafyreiclfmvpqfsfhpl4cffaw6elfdusx5wosinugljpwy2u3ckjse3rse",
351
+
"record": {
352
+
"$type": "app.bsky.feed.threadgate",
353
+
"allow": [
354
+
{ "$type": "app.bsky.feed.threadgate#followingRule" },
355
+
{ "$type": "app.bsky.feed.threadgate#mentionRule" },
356
+
{ "$type": "app.bsky.feed.threadgate#followerRule" }
357
+
],
358
+
"createdAt": "2025-10-01T17:15:19.285Z",
359
+
"hiddenReplies": [],
360
+
"post": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k3p7lek2k"
361
+
},
362
+
"lists": []
363
+
}
364
+
}
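The fixture above is getPostThread-style output, presumably loaded by the value tests to exercise deserialization of deeply nested, untyped data. A sketch of what that could look like; the `include_str!` wiring is an assumption, not something shown in this diff:

use jacquard_common::types::value::Data; // path assumed

#[test]
fn parse_thread_fixture() {
    // Hypothetical: load the JSON fixture that sits next to tests.rs
    let raw = include_str!("test_thread.json");
    let data: Data = serde_json::from_str(raw).expect("thread fixture parses");
    assert!(matches!(data, Data::Object(_)));
}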
+305
crates/jacquard-common/src/types/value/tests.rs
···
1
+
use super::*;
2
+
3
+
/// Canonicalize JSON by sorting object keys recursively
4
+
fn canonicalize_json(value: &serde_json::Value) -> serde_json::Value {
5
+
match value {
6
+
serde_json::Value::Object(map) => {
7
+
let mut sorted_map = serde_json::Map::new();
8
+
let mut keys: Vec<_> = map.keys().collect();
9
+
keys.sort();
10
+
for key in keys {
11
+
sorted_map.insert(key.clone(), canonicalize_json(&map[key]));
12
+
}
13
+
serde_json::Value::Object(sorted_map)
14
+
}
15
+
serde_json::Value::Array(arr) => {
16
+
serde_json::Value::Array(arr.iter().map(canonicalize_json).collect())
17
+
}
18
+
other => other.clone(),
19
+
}
20
+
}
21
+
22
+
#[test]
23
+
fn serialize_deserialize_null() {
24
+
let data = Data::Null;
25
+
26
+
// JSON roundtrip
27
+
let json = serde_json::to_string(&data).unwrap();
28
+
assert_eq!(json, "null");
29
+
let parsed: Data = serde_json::from_str(&json).unwrap();
30
+
assert_eq!(data, parsed);
31
+
assert!(matches!(parsed, Data::Null));
32
+
}
33
+
34
+
#[test]
35
+
fn serialize_deserialize_boolean() {
36
+
let data = Data::Boolean(true);
37
+
38
+
let json = serde_json::to_string(&data).unwrap();
39
+
assert_eq!(json, "true");
40
+
let parsed: Data = serde_json::from_str(&json).unwrap();
41
+
assert_eq!(data, parsed);
42
+
}
43
+
44
+
#[test]
45
+
fn serialize_deserialize_integer() {
46
+
let data = Data::Integer(42);
47
+
48
+
let json = serde_json::to_string(&data).unwrap();
49
+
assert_eq!(json, "42");
50
+
let parsed: Data = serde_json::from_str(&json).unwrap();
51
+
assert_eq!(data, parsed);
52
+
}
53
+
54
+
#[test]
55
+
fn serialize_deserialize_string() {
56
+
let data = Data::String(AtprotoStr::String("hello world".into()));
57
+
58
+
let json = serde_json::to_string(&data).unwrap();
59
+
assert_eq!(json, r#""hello world""#);
60
+
let parsed: Data = serde_json::from_str(&json).unwrap();
61
+
assert_eq!(data, parsed);
62
+
}
63
+
64
+
#[test]
65
+
fn serialize_deserialize_bytes_json() {
66
+
let data = Data::Bytes(Bytes::from_static(b"hello"));
67
+
68
+
// JSON: should be {"$bytes": "base64"}
69
+
let json = serde_json::to_string(&data).unwrap();
70
+
assert!(json.contains("$bytes"));
71
+
assert!(json.contains("aGVsbG8=")); // base64("hello")
72
+
73
+
let parsed: Data = serde_json::from_str(&json).unwrap();
74
+
assert_eq!(data, parsed);
75
+
}
76
+
77
+
#[test]
78
+
fn serialize_deserialize_cid_link_json() {
79
+
let data = Data::CidLink(Cid::str("bafyreih4g7bvo6hdq2juolev5bfzpbo4ewkxh5mzxwgvkjp3kitc6hqkha"));
80
+
81
+
// JSON: should be {"$link": "cid_string"}
82
+
let json = serde_json::to_string(&data).unwrap();
83
+
assert!(json.contains("$link"));
84
+
assert!(json.contains("bafyreih4g7bvo6hdq2juolev5bfzpbo4ewkxh5mzxwgvkjp3kitc6hqkha"));
85
+
86
+
let parsed: Data = serde_json::from_str(&json).unwrap();
87
+
match parsed {
88
+
Data::CidLink(cid) => assert_eq!(cid.as_str(), "bafyreih4g7bvo6hdq2juolev5bfzpbo4ewkxh5mzxwgvkjp3kitc6hqkha"),
89
+
_ => panic!("expected CidLink"),
90
+
}
91
+
}
92
+
93
+
#[test]
94
+
fn serialize_deserialize_array() {
95
+
let data = Data::Array(Array(vec![
96
+
Data::Null,
97
+
Data::Boolean(true),
98
+
Data::Integer(42),
99
+
Data::String(AtprotoStr::String("test".into())),
100
+
]));
101
+
102
+
let json = serde_json::to_string(&data).unwrap();
103
+
let parsed: Data = serde_json::from_str(&json).unwrap();
104
+
assert_eq!(data, parsed);
105
+
106
+
// Verify structure
107
+
if let Data::Array(Array(items)) = parsed {
108
+
assert_eq!(items.len(), 4);
109
+
assert!(matches!(items[0], Data::Null));
110
+
assert!(matches!(items[1], Data::Boolean(true)));
111
+
assert!(matches!(items[2], Data::Integer(42)));
112
+
if let Data::String(AtprotoStr::String(s)) = &items[3] {
113
+
assert_eq!(s.as_ref(), "test");
114
+
} else {
115
+
panic!("expected plain string");
116
+
}
117
+
} else {
118
+
panic!("expected array");
119
+
}
120
+
}
121
+
122
+
#[test]
123
+
fn serialize_deserialize_object() {
124
+
let mut map = BTreeMap::new();
125
+
map.insert("name".to_smolstr(), Data::String(AtprotoStr::String("alice".into())));
126
+
map.insert("age".to_smolstr(), Data::Integer(30));
127
+
map.insert("active".to_smolstr(), Data::Boolean(true));
128
+
129
+
let data = Data::Object(Object(map));
130
+
131
+
let json = serde_json::to_string(&data).unwrap();
132
+
let parsed: Data = serde_json::from_str(&json).unwrap();
133
+
assert_eq!(data, parsed);
134
+
}
135
+
136
+
#[test]
137
+
fn type_inference_datetime() {
138
+
// Field name "createdAt" should infer datetime type
139
+
let json = r#"{"createdAt": "2023-01-15T12:30:45.123456Z"}"#;
140
+
let data: Data = serde_json::from_str(json).unwrap();
141
+
142
+
if let Data::Object(obj) = data {
143
+
if let Some(Data::String(AtprotoStr::Datetime(dt))) = obj.0.get("createdAt") {
144
+
// Verify it's actually parsed correctly
145
+
assert_eq!(dt.as_str(), "2023-01-15T12:30:45.123456Z");
146
+
} else {
147
+
panic!("createdAt should be parsed as Datetime");
148
+
}
149
+
} else {
150
+
panic!("expected object");
151
+
}
152
+
}
153
+
154
+
#[test]
155
+
fn type_inference_did() {
156
+
let json = r#"{"did": "did:plc:abc123"}"#;
157
+
let data: Data = serde_json::from_str(json).unwrap();
158
+
159
+
if let Data::Object(obj) = data {
160
+
if let Some(Data::String(AtprotoStr::Did(did))) = obj.0.get("did") {
161
+
assert_eq!(did.as_str(), "did:plc:abc123");
162
+
} else {
163
+
panic!("did should be parsed as Did");
164
+
}
165
+
} else {
166
+
panic!("expected object");
167
+
}
168
+
}
169
+
170
+
#[test]
171
+
fn type_inference_uri() {
172
+
let json = r#"{"uri": "at://alice.test/com.example.foo/123"}"#;
173
+
let data: Data = serde_json::from_str(json).unwrap();
174
+
175
+
if let Data::Object(obj) = data {
176
+
// "uri" field gets inferred as Uri type, but at:// should parse to AtUri
177
+
match obj.0.get("uri") {
178
+
Some(Data::String(AtprotoStr::AtUri(_))) | Some(Data::String(AtprotoStr::Uri(_))) => {
179
+
// Success
180
+
}
181
+
_ => panic!("uri should be parsed as Uri or AtUri"),
182
+
}
183
+
} else {
184
+
panic!("expected object");
185
+
}
186
+
}
187
+
188
+
#[test]
189
+
fn blob_deserialization() {
190
+
let json = r#"{
191
+
"$type": "blob",
192
+
"ref": {"$link": "bafyreih4g7bvo6hdq2juolev5bfzpbo4ewkxh5mzxwgvkjp3kitc6hqkha"},
193
+
"mimeType": "image/png",
194
+
"size": 12345
195
+
}"#;
196
+
197
+
let data: Data = serde_json::from_str(json).unwrap();
198
+
199
+
if let Data::Blob(blob) = data {
200
+
assert_eq!(blob.mime_type.as_str(), "image/png");
201
+
assert_eq!(blob.size, 12345);
202
+
} else {
203
+
panic!("expected blob");
204
+
}
205
+
}
206
+
207
+
#[test]
208
+
fn reject_floats() {
209
+
let json = "42.5"; // float literal
210
+
211
+
let result: Result<Data, _> = serde_json::from_str(json);
212
+
assert!(result.is_err());
213
+
}
214
+
215
+
#[test]
216
+
fn nested_objects() {
217
+
let json = r#"{
218
+
"user": {
219
+
"name": "alice",
220
+
"profile": {
221
+
"bio": "test bio",
222
+
"createdAt": "2023-01-15T12:30:45Z"
223
+
}
224
+
}
225
+
}"#;
226
+
227
+
let data: Data = serde_json::from_str(json).unwrap();
228
+
229
+
// Should successfully parse with nested type inference
230
+
if let Data::Object(obj) = data {
231
+
assert!(obj.0.contains_key("user"));
232
+
} else {
233
+
panic!("expected object");
234
+
}
235
+
}
236
+
237
+
#[test]
238
+
fn integration_bluesky_thread() {
239
+
// Real bluesky thread data with complex nested structures
240
+
let json = include_str!("test_thread.json");
241
+
let data: Data = serde_json::from_str(json).unwrap();
242
+
243
+
// Verify top-level structure
244
+
if let Data::Object(obj) = data {
245
+
// Should have "thread" array
246
+
assert!(obj.0.contains_key("thread"));
247
+
248
+
// Verify thread is an array
249
+
if let Some(Data::Array(thread)) = obj.0.get("thread") {
250
+
assert!(!thread.0.is_empty());
251
+
252
+
// Check first thread item
253
+
if let Some(Data::Object(item)) = thread.0.first() {
254
+
// Should have "uri" field parsed as AtUri
255
+
if let Some(Data::String(AtprotoStr::AtUri(uri))) = item.0.get("uri") {
256
+
assert!(uri.as_str().starts_with("at://did:plc:"));
257
+
}
258
+
259
+
// Should have "value" object
260
+
if let Some(Data::Object(value)) = item.0.get("value") {
261
+
// Should have post object
262
+
if let Some(Data::Object(post)) = value.0.get("post") {
263
+
// CID should be parsed as Cid
264
+
if let Some(Data::String(AtprotoStr::Cid(cid))) = post.0.get("cid") {
265
+
assert!(cid.as_str().starts_with("bafy"));
266
+
}
267
+
268
+
// Author should have DID
269
+
if let Some(Data::Object(author)) = post.0.get("author") {
270
+
if let Some(Data::String(AtprotoStr::Did(did))) = author.0.get("did") {
271
+
assert!(did.as_str().starts_with("did:plc:"));
272
+
}
273
+
274
+
// createdAt should be parsed as Datetime
275
+
if let Some(Data::String(AtprotoStr::Datetime(_))) =
276
+
author.0.get("createdAt")
277
+
{
278
+
// Success
279
+
} else {
280
+
panic!("author.createdAt should be Datetime");
281
+
}
282
+
}
283
+
}
284
+
}
285
+
}
286
+
} else {
287
+
panic!("thread should be an array");
288
+
}
289
+
290
+
// Verify serialization produces same JSON structure
291
+
let serialized = serde_json::to_string(&obj).unwrap();
292
+
293
+
// Parse both as generic serde_json::Value to compare structure
294
+
let original_value: serde_json::Value = serde_json::from_str(json).unwrap();
295
+
let serialized_value: serde_json::Value = serde_json::from_str(&serialized).unwrap();
296
+
297
+
// Canonicalize by sorting keys
298
+
let original_canonical = canonicalize_json(&original_value);
299
+
let serialized_canonical = canonicalize_json(&serialized_value);
300
+
301
+
assert_eq!(original_canonical, serialized_canonical, "Serialized JSON should match original structure");
302
+
} else {
303
+
panic!("expected top-level object");
304
+
}
305
+
}
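The tests above all reduce to the same consumer-facing pattern: parse unknown JSON into `Data`, then match on the `AtprotoStr` variants that field-name inference produced. A minimal sketch of that pattern outside the test module, using only paths exercised elsewhere in this diff (`jacquard_common::types::value::Data`, `jacquard_common::types::string::AtprotoStr`); the `summarize` helper itself is illustrative:

```rust
use jacquard_common::types::string::AtprotoStr;
use jacquard_common::types::value::Data;

// Illustrative helper: pull a couple of inferred fields out of an unknown record.
fn summarize(json: &str) -> Result<(), serde_json::Error> {
    let data: Data = serde_json::from_str(json)?;
    if let Data::Object(obj) = &data {
        // "did" fields come back as validated DIDs rather than plain strings
        if let Some(Data::String(AtprotoStr::Did(did))) = obj.0.get("did") {
            println!("did: {}", did.as_str());
        }
        // "createdAt" fields are inferred as datetimes
        if let Some(Data::String(AtprotoStr::Datetime(dt))) = obj.0.get("createdAt") {
            println!("createdAt: {}", dt.as_str());
        }
    }
    Ok(())
}
```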
+29
crates/jacquard-derive/Cargo.toml
+29
crates/jacquard-derive/Cargo.toml
···
1
+
[package]
2
+
name = "jacquard-derive"
3
+
edition.workspace = true
4
+
version.workspace = true
5
+
authors.workspace = true
6
+
repository.workspace = true
7
+
keywords.workspace = true
8
+
categories.workspace = true
9
+
readme.workspace = true
10
+
documentation.workspace = true
11
+
exclude.workspace = true
12
+
description.workspace = true
13
+
14
+
[lib]
15
+
proc-macro = true
16
+
17
+
[dependencies]
18
+
heck = "0.5.0"
19
+
itertools = "0.14.0"
20
+
jacquard-common = { version = "0.1.0", path = "../jacquard-common" }
21
+
jacquard-lexicon = { version = "0.1.0", path = "../jacquard-lexicon" }
22
+
prettyplease = "0.2.37"
23
+
proc-macro2 = "1.0.101"
24
+
quote = "1.0.41"
25
+
serde = { version = "1.0.228", features = ["derive"] }
26
+
serde_json = "1.0.145"
27
+
serde_repr = "0.1.20"
28
+
serde_with = "3.14.1"
29
+
syn = "2.0.106"
+117
crates/jacquard-derive/src/lib.rs
+117
crates/jacquard-derive/src/lib.rs
···
1
+
use proc_macro::TokenStream;
2
+
use quote::quote;
3
+
use syn::{Data, DeriveInput, Fields, parse_macro_input};
4
+
5
+
/// Attribute macro that adds an `extra_data` field to structs to capture unknown fields
6
+
/// during deserialization.
7
+
///
8
+
/// # Example
9
+
/// ```ignore
10
+
/// #[lexicon]
11
+
/// struct Post<'s> {
12
+
/// text: &'s str,
13
+
/// }
14
+
/// // Expands to:
15
+
/// // struct Post<'s> {
16
+
/// // text: &'s str,
17
+
/// // #[serde(flatten)]
18
+
/// // pub extra_data: BTreeMap<SmolStr, Data<'s>>,
19
+
/// // }
20
+
/// ```
21
+
#[proc_macro_attribute]
22
+
pub fn lexicon(_attr: TokenStream, item: TokenStream) -> TokenStream {
23
+
let mut input = parse_macro_input!(item as DeriveInput);
24
+
25
+
match &mut input.data {
26
+
Data::Struct(data_struct) => {
27
+
if let Fields::Named(fields) = &mut data_struct.fields {
28
+
// Check if extra_data field already exists
29
+
let has_extra_data = fields
30
+
.named
31
+
.iter()
32
+
.any(|f| f.ident.as_ref().map(|i| i == "extra_data").unwrap_or(false));
33
+
34
+
if !has_extra_data {
35
+
// Determine the lifetime parameter to use
36
+
let lifetime = if let Some(lt) = input.generics.lifetimes().next() {
37
+
quote! { #lt }
38
+
} else {
39
+
quote! { 'static }
40
+
};
41
+
42
+
// Add the extra_data field
43
+
let new_field: syn::Field = syn::parse_quote! {
44
+
#[serde(flatten)]
45
+
pub extra_data: ::std::collections::BTreeMap<
46
+
::jacquard_common::smol_str::SmolStr,
47
+
::jacquard_common::types::value::Data<#lifetime>
48
+
>
49
+
};
50
+
fields.named.push(new_field);
51
+
}
52
+
} else {
53
+
return syn::Error::new_spanned(
54
+
input,
55
+
"lexicon attribute can only be used on structs with named fields",
56
+
)
57
+
.to_compile_error()
58
+
.into();
59
+
}
60
+
61
+
quote! { #input }.into()
62
+
}
63
+
_ => syn::Error::new_spanned(input, "lexicon attribute can only be used on structs")
64
+
.to_compile_error()
65
+
.into(),
66
+
}
67
+
}
68
+
69
+
/// Attribute macro that adds an `Unknown(Data)` variant to enums to make them open unions.
70
+
///
71
+
/// # Example
72
+
/// ```ignore
73
+
/// #[open_union]
74
+
/// enum RecordEmbed<'s> {
75
+
/// #[serde(rename = "app.bsky.embed.images")]
76
+
/// Images(Images),
77
+
/// }
78
+
/// // Expands to:
79
+
/// // enum RecordEmbed<'s> {
80
+
/// // #[serde(rename = "app.bsky.embed.images")]
81
+
/// // Images(Images),
82
+
/// // #[serde(untagged)]
83
+
/// // Unknown(Data<'s>),
84
+
/// // }
85
+
/// ```
86
+
#[proc_macro_attribute]
87
+
pub fn open_union(_attr: TokenStream, item: TokenStream) -> TokenStream {
88
+
let mut input = parse_macro_input!(item as DeriveInput);
89
+
90
+
match &mut input.data {
91
+
Data::Enum(data_enum) => {
92
+
// Check if an Unknown variant already exists
93
+
let has_unknown = data_enum.variants.iter().any(|v| v.ident == "Unknown");
94
+
95
+
if !has_unknown {
96
+
// Determine the lifetime parameter to use
97
+
let lifetime = if let Some(lt) = input.generics.lifetimes().next() {
98
+
quote! { #lt }
99
+
} else {
100
+
quote! { 'static }
101
+
};
102
+
103
+
// Add the Unknown variant
104
+
let new_variant: syn::Variant = syn::parse_quote! {
105
+
#[serde(untagged)]
106
+
Unknown(::jacquard_common::types::value::Data<#lifetime>)
107
+
};
108
+
data_enum.variants.push(new_variant);
109
+
}
110
+
111
+
quote! { #input }.into()
112
+
}
113
+
_ => syn::Error::new_spanned(input, "open_union attribute can only be used on enums")
114
+
.to_compile_error()
115
+
.into(),
116
+
}
117
+
}
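Both attribute macros splice the item's first declared lifetime into the injected `Data` payload and fall back to `'static` when the item declares none. Roughly, for a lifetime-free struct (a hypothetical example, not from the repository):

```rust
use jacquard_derive::lexicon;

#[lexicon]
#[derive(serde::Serialize, serde::Deserialize)]
struct Tally {
    count: i64,
}

// Expands (roughly) to:
// struct Tally {
//     count: i64,
//     #[serde(flatten)]
//     pub extra_data: ::std::collections::BTreeMap<
//         ::jacquard_common::smol_str::SmolStr,
//         ::jacquard_common::types::value::Data<'static>,
//     >,
// }
```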
+89
crates/jacquard-derive/tests/lexicon.rs
+89
crates/jacquard-derive/tests/lexicon.rs
···
1
+
use jacquard_derive::lexicon;
2
+
use serde::{Deserialize, Serialize};
3
+
4
+
#[lexicon]
5
+
#[derive(Serialize, Deserialize, Debug, PartialEq)]
6
+
#[serde(rename_all = "camelCase")]
7
+
struct TestRecord<'s> {
8
+
text: &'s str,
9
+
count: i64,
10
+
}
11
+
12
+
#[test]
13
+
fn test_lexicon_adds_extra_data_field() {
14
+
let json = r#"{"text":"hello","count":42,"unknown":"field","another":123}"#;
15
+
16
+
let record: TestRecord = serde_json::from_str(json).unwrap();
17
+
18
+
assert_eq!(record.text, "hello");
19
+
assert_eq!(record.count, 42);
20
+
assert_eq!(record.extra_data.len(), 2);
21
+
assert!(record.extra_data.contains_key("unknown"));
22
+
assert!(record.extra_data.contains_key("another"));
23
+
}
24
+
25
+
#[test]
26
+
fn test_lexicon_roundtrip() {
27
+
use jacquard_common::CowStr;
28
+
use jacquard_common::types::value::Data;
29
+
use std::collections::BTreeMap;
30
+
31
+
let mut extra = BTreeMap::new();
32
+
extra.insert(
33
+
"custom".into(),
34
+
Data::String(jacquard_common::types::string::AtprotoStr::String(
35
+
CowStr::Borrowed("value"),
36
+
)),
37
+
);
38
+
extra.insert(
39
+
"number".into(),
40
+
Data::Integer(42),
41
+
);
42
+
extra.insert(
43
+
"nested".into(),
44
+
Data::Object(jacquard_common::types::value::Object({
45
+
let mut nested_map = BTreeMap::new();
46
+
nested_map.insert(
47
+
"inner".into(),
48
+
Data::Boolean(true),
49
+
);
50
+
nested_map
51
+
})),
52
+
);
53
+
54
+
let record = TestRecord {
55
+
text: "test",
56
+
count: 100,
57
+
extra_data: extra,
58
+
};
59
+
60
+
let json = serde_json::to_string(&record).unwrap();
61
+
let parsed: TestRecord = serde_json::from_str(&json).unwrap();
62
+
63
+
assert_eq!(record, parsed);
64
+
assert_eq!(parsed.extra_data.len(), 3);
65
+
66
+
// Verify the extra fields were preserved
67
+
assert!(parsed.extra_data.contains_key("custom"));
68
+
assert!(parsed.extra_data.contains_key("number"));
69
+
assert!(parsed.extra_data.contains_key("nested"));
70
+
71
+
// Verify the values
72
+
if let Some(Data::String(s)) = parsed.extra_data.get("custom") {
73
+
assert_eq!(s.as_str(), "value");
74
+
} else {
75
+
panic!("expected custom field to be a string");
76
+
}
77
+
78
+
if let Some(Data::Integer(n)) = parsed.extra_data.get("number") {
79
+
assert_eq!(*n, 42);
80
+
} else {
81
+
panic!("expected number field to be an integer");
82
+
}
83
+
84
+
if let Some(Data::Object(obj)) = parsed.extra_data.get("nested") {
85
+
assert!(obj.0.contains_key("inner"));
86
+
} else {
87
+
panic!("expected nested field to be an object");
88
+
}
89
+
}
+117
crates/jacquard-derive/tests/open_union.rs
+117
crates/jacquard-derive/tests/open_union.rs
···
1
+
use jacquard_derive::open_union;
2
+
use serde::{Deserialize, Serialize};
3
+
4
+
#[open_union]
5
+
#[derive(Serialize, Deserialize, Debug, PartialEq)]
6
+
#[serde(tag = "$type")]
7
+
enum TestUnion<'s> {
8
+
#[serde(rename = "com.example.typeA")]
9
+
TypeA { value: &'s str },
10
+
#[serde(rename = "com.example.typeB")]
11
+
TypeB { count: i64 },
12
+
}
13
+
14
+
#[test]
15
+
fn test_open_union_known_variant() {
16
+
let json = r#"{"$type":"com.example.typeA","value":"hello"}"#;
17
+
let union: TestUnion = serde_json::from_str(json).unwrap();
18
+
19
+
match union {
20
+
TestUnion::TypeA { value } => assert_eq!(value, "hello"),
21
+
_ => panic!("expected TypeA"),
22
+
}
23
+
}
24
+
25
+
#[test]
26
+
fn test_open_union_unknown_variant() {
27
+
use jacquard_common::types::value::Data;
28
+
29
+
let json = r#"{"$type":"com.example.unknown","data":"something"}"#;
30
+
let union: TestUnion = serde_json::from_str(json).unwrap();
31
+
32
+
match union {
33
+
TestUnion::Unknown(Data::Object(obj)) => {
34
+
// Verify the captured data contains the expected fields
35
+
assert!(obj.0.contains_key("$type"));
36
+
assert!(obj.0.contains_key("data"));
37
+
38
+
// Check the actual values
39
+
if let Some(Data::String(type_str)) = obj.0.get("$type") {
40
+
assert_eq!(type_str.as_str(), "com.example.unknown");
41
+
} else {
42
+
panic!("expected $type field to be a string");
43
+
}
44
+
45
+
if let Some(Data::String(data_str)) = obj.0.get("data") {
46
+
assert_eq!(data_str.as_str(), "something");
47
+
} else {
48
+
panic!("expected data field to be a string");
49
+
}
50
+
}
51
+
_ => panic!("expected Unknown variant with Object data"),
52
+
}
53
+
}
54
+
55
+
#[test]
56
+
fn test_open_union_roundtrip() {
57
+
let union = TestUnion::TypeB { count: 42 };
58
+
let json = serde_json::to_string(&union).unwrap();
59
+
let parsed: TestUnion = serde_json::from_str(&json).unwrap();
60
+
61
+
assert_eq!(union, parsed);
62
+
63
+
// Verify the $type field is present
64
+
assert!(json.contains(r#""$type":"com.example.typeB""#));
65
+
}
66
+
67
+
#[test]
68
+
fn test_open_union_unknown_roundtrip() {
69
+
use jacquard_common::types::value::{Data, Object};
70
+
use std::collections::BTreeMap;
71
+
72
+
// Create an Unknown variant with complex data
73
+
let mut map = BTreeMap::new();
74
+
map.insert(
75
+
"$type".into(),
76
+
Data::String(jacquard_common::types::string::AtprotoStr::String(
77
+
"com.example.custom".into(),
78
+
)),
79
+
);
80
+
map.insert("field1".into(), Data::Integer(123));
81
+
map.insert("field2".into(), Data::Boolean(false));
82
+
83
+
let union = TestUnion::Unknown(Data::Object(Object(map)));
84
+
85
+
let json = serde_json::to_string(&union).unwrap();
86
+
let parsed: TestUnion = serde_json::from_str(&json).unwrap();
87
+
88
+
// Should deserialize back as Unknown since the type is not recognized
89
+
match parsed {
90
+
TestUnion::Unknown(Data::Object(obj)) => {
91
+
assert_eq!(obj.0.len(), 3);
92
+
assert!(obj.0.contains_key("$type"));
93
+
assert!(obj.0.contains_key("field1"));
94
+
assert!(obj.0.contains_key("field2"));
95
+
96
+
// Verify values
97
+
if let Some(Data::String(s)) = obj.0.get("$type") {
98
+
assert_eq!(s.as_str(), "com.example.custom");
99
+
} else {
100
+
panic!("expected $type to be a string");
101
+
}
102
+
103
+
if let Some(Data::Integer(n)) = obj.0.get("field1") {
104
+
assert_eq!(*n, 123);
105
+
} else {
106
+
panic!("expected field1 to be an integer");
107
+
}
108
+
109
+
if let Some(Data::Boolean(b)) = obj.0.get("field2") {
110
+
assert_eq!(*b, false);
111
+
} else {
112
+
panic!("expected field2 to be a boolean");
113
+
}
114
+
}
115
+
_ => panic!("expected Unknown variant"),
116
+
}
117
+
}
+11
crates/jacquard-lexicon/Cargo.toml
+11
crates/jacquard-lexicon/Cargo.toml
···
12
12
description.workspace = true
13
13
14
14
[dependencies]
15
+
heck = "0.5.0"
16
+
itertools = "0.14.0"
17
+
jacquard-common = { version = "0.1.0", path = "../jacquard-common" }
18
+
prettyplease = "0.2.37"
19
+
proc-macro2 = "1.0.101"
20
+
quote = "1.0.41"
21
+
serde = { version = "1.0.228", features = ["derive"] }
22
+
serde_json = "1.0.145"
23
+
serde_repr = "0.1.20"
24
+
serde_with = "3.14.1"
25
+
syn = "2.0.106"
+36
crates/jacquard-lexicon/src/fs.rs
+36
crates/jacquard-lexicon/src/fs.rs
···
1
+
// Forked from atrium-codegen
2
+
// https://github.com/sugyan/atrium/blob/main/lexicon/atrium-codegen/src/fs.rs
3
+
4
+
use std::ffi::OsStr;
5
+
use std::fs::read_dir;
6
+
use std::io::Result;
7
+
use std::path::{Path, PathBuf};
8
+
9
+
fn walk<F>(path: &Path, results: &mut Vec<PathBuf>, f: &mut F) -> Result<()>
10
+
where
11
+
F: FnMut(&Path) -> bool,
12
+
{
13
+
if f(path) {
14
+
results.push(path.into());
15
+
}
16
+
if path.is_dir() {
17
+
for entry in read_dir(path)? {
18
+
walk(&entry?.path(), results, f)?;
19
+
}
20
+
}
21
+
Ok(())
22
+
}
23
+
24
+
pub(crate) fn find_schemas(path: &Path) -> Result<Vec<impl AsRef<Path>>> {
25
+
let mut results = Vec::new();
26
+
walk(path, &mut results, &mut |path| {
27
+
path.extension().and_then(OsStr::to_str) == Some("json")
28
+
})?;
29
+
Ok(results)
30
+
}
31
+
32
+
pub(crate) fn find_dirs(path: &Path) -> Result<Vec<impl AsRef<Path>>> {
33
+
let mut results = Vec::new();
34
+
walk(path, &mut results, &mut |path| path.is_dir())?;
35
+
Ok(results)
36
+
}
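A small usage sketch for the walker; both helpers are `pub(crate)`, so this would live inside the crate, and the `root` argument is whatever lexicon directory the caller passes in:

```rust
use std::io::Result;
use std::path::Path;

// Collect every .json schema under a lexicon tree and print its path.
fn print_schema_paths(root: &Path) -> Result<()> {
    for schema in crate::fs::find_schemas(root)? {
        println!("{}", schema.as_ref().display());
    }
    Ok(())
}
```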
+433
crates/jacquard-lexicon/src/lexicon.rs
+433
crates/jacquard-lexicon/src/lexicon.rs
···
1
+
// Forked from atrium-lexicon
2
+
// https://github.com/atrium-rs/atrium/blob/main/lexicon/atrium-lex/src/lexicon.rs
3
+
// https://github.com/atrium-rs/atrium/blob/main/lexicon/atrium-lex/src/lib.rs
4
+
5
+
use jacquard_common::{CowStr, smol_str::SmolStr, types::blob::MimeType};
6
+
use serde::{Deserialize, Serialize};
7
+
use serde_repr::{Deserialize_repr, Serialize_repr};
8
+
use serde_with::skip_serializing_none;
9
+
use std::collections::BTreeMap;
10
+
11
+
#[derive(Debug, Serialize_repr, Deserialize_repr, PartialEq, Eq, Clone, Copy)]
12
+
#[repr(u8)]
13
+
pub enum Lexicon {
14
+
Lexicon1 = 1,
15
+
}
16
+
#[skip_serializing_none]
17
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
18
+
pub struct LexiconDoc<'s> {
19
+
pub lexicon: Lexicon,
20
+
#[serde(borrow)]
21
+
pub id: CowStr<'s>,
22
+
pub revision: Option<u32>,
23
+
pub description: Option<CowStr<'s>>,
24
+
pub defs: BTreeMap<SmolStr, LexUserType<'s>>,
25
+
}
26
+
27
+
// primitives
28
+
29
+
#[skip_serializing_none]
30
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
31
+
pub struct LexBoolean<'s> {
32
+
#[serde(borrow)]
33
+
pub description: Option<CowStr<'s>>,
34
+
pub default: Option<bool>,
35
+
pub r#const: Option<bool>,
36
+
}
37
+
38
+
/// The Lexicon type `integer`.
39
+
///
40
+
/// Lexicon integers are [specified] as signed and 64-bit, which means that values will
41
+
/// always fit in an `i64`.
42
+
///
43
+
/// [specified]: https://atproto.com/specs/data-model#data-types
44
+
#[skip_serializing_none]
45
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
46
+
pub struct LexInteger<'s> {
47
+
#[serde(borrow)]
48
+
pub description: Option<CowStr<'s>>,
49
+
pub default: Option<i64>,
50
+
pub minimum: Option<i64>,
51
+
pub maximum: Option<i64>,
52
+
pub r#enum: Option<Vec<i64>>,
53
+
pub r#const: Option<i64>,
54
+
}
55
+
56
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone, Copy)]
57
+
#[serde(rename_all = "kebab-case")]
58
+
pub enum LexStringFormat {
59
+
Datetime,
60
+
Uri,
61
+
AtUri,
62
+
Did,
63
+
Handle,
64
+
AtIdentifier,
65
+
Nsid,
66
+
Cid,
67
+
Language,
68
+
Tid,
69
+
RecordKey,
70
+
}
71
+
#[skip_serializing_none]
72
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
73
+
#[serde(rename_all = "camelCase")]
74
+
pub struct LexString<'s> {
75
+
#[serde(borrow)]
76
+
pub description: Option<CowStr<'s>>,
77
+
pub format: Option<LexStringFormat>,
78
+
pub default: Option<CowStr<'s>>,
79
+
pub min_length: Option<usize>,
80
+
pub max_length: Option<usize>,
81
+
pub min_graphemes: Option<usize>,
82
+
pub max_graphemes: Option<usize>,
83
+
pub r#enum: Option<Vec<CowStr<'s>>>,
84
+
pub r#const: Option<CowStr<'s>>,
85
+
pub known_values: Option<Vec<CowStr<'s>>>,
86
+
}
87
+
88
+
#[skip_serializing_none]
89
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
90
+
pub struct LexUnknown<'s> {
91
+
#[serde(borrow)]
92
+
pub description: Option<CowStr<'s>>,
93
+
}
94
+
// ipld types
95
+
96
+
#[skip_serializing_none]
97
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
98
+
#[serde(rename_all = "camelCase")]
99
+
pub struct LexBytes<'s> {
100
+
#[serde(borrow)]
101
+
pub description: Option<CowStr<'s>>,
102
+
pub max_length: Option<usize>,
103
+
pub min_length: Option<usize>,
104
+
}
105
+
106
+
#[skip_serializing_none]
107
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
108
+
pub struct LexCidLink<'s> {
109
+
#[serde(borrow)]
110
+
pub description: Option<CowStr<'s>>,
111
+
}
112
+
113
+
// references
114
+
115
+
#[skip_serializing_none]
116
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
117
+
pub struct LexRef<'s> {
118
+
#[serde(borrow)]
119
+
pub description: Option<CowStr<'s>>,
120
+
pub r#ref: CowStr<'s>,
121
+
}
122
+
123
+
#[skip_serializing_none]
124
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
125
+
pub struct LexRefUnion<'s> {
126
+
#[serde(borrow)]
127
+
pub description: Option<CowStr<'s>>,
128
+
pub refs: Vec<CowStr<'s>>,
129
+
pub closed: Option<bool>,
130
+
}
131
+
132
+
// blobs
133
+
134
+
#[skip_serializing_none]
135
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
136
+
#[serde(rename_all = "camelCase")]
137
+
pub struct LexBlob<'s> {
138
+
#[serde(borrow)]
139
+
pub description: Option<CowStr<'s>>,
140
+
pub accept: Option<Vec<MimeType<'s>>>,
141
+
pub max_size: Option<usize>,
142
+
}
143
+
144
+
// complex types
145
+
146
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
147
+
#[serde(tag = "type", rename_all = "kebab-case")]
148
+
pub enum LexArrayItem<'s> {
149
+
// lexPrimitive
150
+
Boolean(LexBoolean<'s>),
151
+
Integer(LexInteger<'s>),
152
+
String(LexString<'s>),
153
+
Unknown(LexUnknown<'s>),
154
+
// lexIpldType
155
+
Bytes(LexBytes<'s>),
156
+
CidLink(LexCidLink<'s>),
157
+
// lexBlob
158
+
#[serde(borrow)]
159
+
Blob(LexBlob<'s>),
160
+
// lexRefVariant
161
+
Ref(LexRef<'s>),
162
+
Union(LexRefUnion<'s>),
163
+
}
164
+
#[skip_serializing_none]
165
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
166
+
#[serde(rename_all = "camelCase")]
167
+
pub struct LexArray<'s> {
168
+
#[serde(borrow)]
169
+
pub description: Option<CowStr<'s>>,
170
+
pub items: LexArrayItem<'s>,
171
+
pub min_length: Option<usize>,
172
+
pub max_length: Option<usize>,
173
+
}
174
+
175
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
176
+
#[serde(tag = "type", rename_all = "lowercase")]
177
+
pub enum LexPrimitiveArrayItem<'s> {
178
+
// lexPrimitive
179
+
#[serde(borrow)]
180
+
Boolean(LexBoolean<'s>),
181
+
Integer(LexInteger<'s>),
182
+
String(LexString<'s>),
183
+
Unknown(LexUnknown<'s>),
184
+
}
185
+
#[skip_serializing_none]
186
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
187
+
#[serde(rename_all = "camelCase")]
188
+
pub struct LexPrimitiveArray<'s> {
189
+
#[serde(borrow)]
190
+
pub description: Option<CowStr<'s>>,
191
+
pub items: LexPrimitiveArrayItem<'s>,
192
+
pub min_length: Option<usize>,
193
+
pub max_length: Option<usize>,
194
+
}
195
+
196
+
#[skip_serializing_none]
197
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
198
+
pub struct LexToken<'s> {
199
+
#[serde(borrow)]
200
+
pub description: Option<CowStr<'s>>,
201
+
}
202
+
203
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
204
+
#[serde(tag = "type", rename_all = "kebab-case")]
205
+
pub enum LexObjectProperty<'s> {
206
+
// lexRefVariant
207
+
#[serde(borrow)]
208
+
Ref(LexRef<'s>),
209
+
Union(LexRefUnion<'s>),
210
+
// lexIpldType
211
+
Bytes(LexBytes<'s>),
212
+
CidLink(LexCidLink<'s>),
213
+
// lexArray
214
+
Array(LexArray<'s>),
215
+
// lexBlob
216
+
Blob(LexBlob<'s>),
217
+
// lexPrimitive
218
+
Boolean(LexBoolean<'s>),
219
+
Integer(LexInteger<'s>),
220
+
String(LexString<'s>),
221
+
Unknown(LexUnknown<'s>),
222
+
}
223
+
#[skip_serializing_none]
224
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
225
+
pub struct LexObject<'s> {
226
+
#[serde(borrow)]
227
+
pub description: Option<CowStr<'s>>,
228
+
pub required: Option<Vec<SmolStr>>,
229
+
pub nullable: Option<Vec<SmolStr>>,
230
+
pub properties: BTreeMap<SmolStr, LexObjectProperty<'s>>,
231
+
}
232
+
233
+
// xrpc
234
+
235
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
236
+
#[serde(tag = "type", rename_all = "lowercase")]
237
+
pub enum LexXrpcParametersProperty<'s> {
238
+
// lexPrimitive
239
+
#[serde(borrow)]
240
+
Boolean(LexBoolean<'s>),
241
+
Integer(LexInteger<'s>),
242
+
String(LexString<'s>),
243
+
Unknown(LexUnknown<'s>),
244
+
// lexPrimitiveArray
245
+
Array(LexPrimitiveArray<'s>),
246
+
}
247
+
#[skip_serializing_none]
248
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
249
+
pub struct LexXrpcParameters<'s> {
250
+
#[serde(borrow)]
251
+
pub description: Option<CowStr<'s>>,
252
+
pub required: Option<Vec<SmolStr>>,
253
+
pub properties: BTreeMap<SmolStr, LexXrpcParametersProperty<'s>>,
254
+
}
255
+
256
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
257
+
#[serde(tag = "type", rename_all = "lowercase")]
258
+
pub enum LexXrpcBodySchema<'s> {
259
+
// lexRefVariant
260
+
#[serde(borrow)]
261
+
Ref(LexRef<'s>),
262
+
Union(LexRefUnion<'s>),
263
+
// lexObject
264
+
Object(LexObject<'s>),
265
+
}
266
+
#[skip_serializing_none]
267
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
268
+
pub struct LexXrpcBody<'s> {
269
+
#[serde(borrow)]
270
+
pub description: Option<CowStr<'s>>,
271
+
pub encoding: CowStr<'s>,
272
+
pub schema: Option<LexXrpcBodySchema<'s>>,
273
+
}
274
+
275
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
276
+
#[serde(tag = "type", rename_all = "lowercase")]
277
+
pub enum LexXrpcSubscriptionMessageSchema<'s> {
278
+
// lexRefVariant
279
+
#[serde(borrow)]
280
+
Ref(LexRef<'s>),
281
+
Union(LexRefUnion<'s>),
282
+
// lexObject
283
+
Object(LexObject<'s>),
284
+
}
285
+
#[skip_serializing_none]
286
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
287
+
pub struct LexXrpcSubscriptionMessage<'s> {
288
+
#[serde(borrow)]
289
+
pub description: Option<CowStr<'s>>,
290
+
pub schema: Option<LexXrpcSubscriptionMessageSchema<'s>>,
291
+
}
292
+
293
+
#[skip_serializing_none]
294
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
295
+
pub struct LexXrpcError<'s> {
296
+
#[serde(borrow)]
297
+
pub description: Option<CowStr<'s>>,
298
+
pub name: CowStr<'s>,
299
+
}
300
+
301
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
302
+
#[serde(tag = "type", rename_all = "lowercase")]
303
+
pub enum LexXrpcQueryParameter<'s> {
304
+
#[serde(borrow)]
305
+
Params(LexXrpcParameters<'s>),
306
+
}
307
+
#[skip_serializing_none]
308
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
309
+
pub struct LexXrpcQuery<'s> {
310
+
#[serde(borrow)]
311
+
pub description: Option<CowStr<'s>>,
312
+
pub parameters: Option<LexXrpcQueryParameter<'s>>,
313
+
pub output: Option<LexXrpcBody<'s>>,
314
+
pub errors: Option<Vec<LexXrpcError<'s>>>,
315
+
}
316
+
317
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
318
+
#[serde(tag = "type", rename_all = "lowercase")]
319
+
pub enum LexXrpcProcedureParameter<'s> {
320
+
#[serde(borrow)]
321
+
Params(LexXrpcParameters<'s>),
322
+
}
323
+
#[skip_serializing_none]
324
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
325
+
pub struct LexXrpcProcedure<'s> {
326
+
#[serde(borrow)]
327
+
pub description: Option<CowStr<'s>>,
328
+
pub parameters: Option<LexXrpcProcedureParameter<'s>>,
329
+
pub input: Option<LexXrpcBody<'s>>,
330
+
pub output: Option<LexXrpcBody<'s>>,
331
+
pub errors: Option<Vec<LexXrpcError<'s>>>,
332
+
}
333
+
334
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
335
+
#[serde(tag = "type", rename_all = "lowercase")]
336
+
pub enum LexXrpcSubscriptionParameter<'s> {
337
+
#[serde(borrow)]
338
+
Params(LexXrpcParameters<'s>),
339
+
}
340
+
#[skip_serializing_none]
341
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
342
+
pub struct LexXrpcSubscription<'s> {
343
+
#[serde(borrow)]
344
+
pub description: Option<CowStr<'s>>,
345
+
pub parameters: Option<LexXrpcSubscriptionParameter<'s>>,
346
+
pub message: Option<LexXrpcSubscriptionMessage<'s>>,
347
+
pub infos: Option<Vec<LexXrpcError<'s>>>,
348
+
pub errors: Option<Vec<LexXrpcError<'s>>>,
349
+
}
350
+
351
+
// database
352
+
353
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
354
+
#[serde(tag = "type", rename_all = "lowercase")]
355
+
pub enum LexRecordRecord<'s> {
356
+
#[serde(borrow)]
357
+
Object(LexObject<'s>),
358
+
}
359
+
#[skip_serializing_none]
360
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
361
+
pub struct LexRecord<'s> {
362
+
#[serde(borrow)]
363
+
pub description: Option<CowStr<'s>>,
364
+
pub key: Option<CowStr<'s>>,
365
+
pub record: LexRecordRecord<'s>,
366
+
}
367
+
368
+
// core
369
+
370
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
371
+
#[serde(tag = "type", rename_all = "kebab-case")]
372
+
pub enum LexUserType<'s> {
373
+
// lexRecord
374
+
#[serde(borrow)]
375
+
Record(LexRecord<'s>),
376
+
// lexXrpcQuery
377
+
#[serde(rename = "query")]
378
+
XrpcQuery(LexXrpcQuery<'s>),
379
+
// lexXrpcProcedure
380
+
#[serde(rename = "procedure")]
381
+
XrpcProcedure(LexXrpcProcedure<'s>),
382
+
// lexXrpcSubscription
383
+
#[serde(rename = "subscription")]
384
+
XrpcSubscription(LexXrpcSubscription<'s>),
385
+
// lexBlob
386
+
Blob(LexBlob<'s>),
387
+
// lexArray
388
+
Array(LexArray<'s>),
389
+
// lexToken
390
+
Token(LexToken<'s>),
391
+
// lexObject
392
+
Object(LexObject<'s>),
393
+
// lexBoolean,
394
+
Boolean(LexBoolean<'s>),
395
+
// lexInteger,
396
+
Integer(LexInteger<'s>),
397
+
// lexString,
398
+
String(LexString<'s>),
399
+
// lexBytes
400
+
Bytes(LexBytes<'s>),
401
+
// lexCidLink
402
+
CidLink(LexCidLink<'s>),
403
+
// lexUnknown
404
+
Unknown(LexUnknown<'s>),
405
+
}
406
+
407
+
#[cfg(test)]
408
+
mod tests {
409
+
use super::*;
410
+
411
+
const LEXICON_EXAMPLE_TOKEN: &str = r#"
412
+
{
413
+
"lexicon": 1,
414
+
"id": "com.socialapp.actorUser",
415
+
"defs": {
416
+
"main": {
417
+
"type": "token",
418
+
"description": "Actor type of 'User'"
419
+
}
420
+
}
421
+
}"#;
422
+
423
+
#[test]
424
+
fn parse() {
425
+
let doc = serde_json::from_str::<LexiconDoc>(LEXICON_EXAMPLE_TOKEN)
426
+
.expect("failed to deserialize");
427
+
assert_eq!(doc.lexicon, Lexicon::Lexicon1);
428
+
assert_eq!(doc.id, "com.socialapp.actorUser");
429
+
assert_eq!(doc.revision, None);
430
+
assert_eq!(doc.description, None);
431
+
assert_eq!(doc.defs.len(), 1);
432
+
}
433
+
}
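The in-tree test only covers a `token` definition. For orientation, a sketch of the same round-trip for a record-shaped document, slotted into the same `tests` module; the NSID and properties are invented, and the shape exercises `LexRecord`, `LexRecordRecord`, `LexObject`, and `LexObjectProperty`:

```rust
// Sketch only: parsing a hypothetical record lexicon into LexiconDoc.
const LEXICON_EXAMPLE_RECORD: &str = r#"
{
    "lexicon": 1,
    "id": "com.example.post",
    "defs": {
        "main": {
            "type": "record",
            "key": "tid",
            "record": {
                "type": "object",
                "required": ["text"],
                "properties": {
                    "text": { "type": "string", "maxLength": 300 },
                    "createdAt": { "type": "string", "format": "datetime" }
                }
            }
        }
    }
}"#;

#[test]
fn parse_record() {
    let doc = serde_json::from_str::<LexiconDoc>(LEXICON_EXAMPLE_RECORD)
        .expect("failed to deserialize");
    let LexUserType::Record(record) = &doc.defs["main"] else {
        panic!("expected a record definition");
    };
    assert!(record.key.is_some());
    let LexRecordRecord::Object(object) = &record.record;
    assert_eq!(object.properties.len(), 2);
}
```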
+49
-12
crates/jacquard-lexicon/src/lib.rs
+49
-12
crates/jacquard-lexicon/src/lib.rs
···
1
-
pub fn add(left: u64, right: u64) -> u64 {
2
-
left + right
3
-
}
1
+
pub mod fs;
2
+
pub mod lexicon;
3
+
pub mod output;
4
+
pub mod schema;
4
5
5
-
#[cfg(test)]
6
-
mod tests {
7
-
use super::*;
6
+
// #[lexicon]
7
+
// #[derive(serde::Serialize, serde::Deserialize, Debug, Clone, PartialEq, Eq)]
8
+
// #[serde(rename_all = "camelCase")]
9
+
// pub struct Post<'s> {
10
+
// ///Client-declared timestamp when this post was originally created.
11
+
// pub created_at: jacquard_common::types::string::Datetime,
12
+
// #[serde(skip_serializing_if = "core::option::Option::is_none")]
13
+
// pub embed: core::option::Option<RecordEmbed<'s>>,
14
+
// ///DEPRECATED: replaced by app.bsky.richtext.facet.
15
+
// #[serde(skip_serializing_if = "core::option::Option::is_none")]
16
+
// pub entities: core::option::Option<Vec<Entity<'s>>>,
17
+
// ///Annotations of text (mentions, URLs, hashtags, etc)
18
+
// #[serde(skip_serializing_if = "core::option::Option::is_none")]
19
+
// pub facets: core::option::Option<Vec<jacquard_api::app_bsky::richtext::Facet<'s>>>,
20
+
// ///Self-label values for this post. Effectively content warnings.
21
+
// #[serde(skip_serializing_if = "core::option::Option::is_none")]
22
+
// pub labels: core::option::Option<RecordLabels<'s>>,
23
+
// ///Indicates human language of post primary text content.
24
+
// #[serde(skip_serializing_if = "core::option::Option::is_none")]
25
+
// pub langs: core::option::Option<Vec<jacquard_common::types::string::Language>>,
26
+
// #[serde(skip_serializing_if = "core::option::Option::is_none")]
27
+
// pub reply: core::option::Option<ReplyRef<'s>>,
28
+
// ///Additional hashtags, in addition to any included in post text and facets.
29
+
// #[serde(skip_serializing_if = "core::option::Option::is_none")]
30
+
// pub tags: core::option::Option<Vec<jacquard_common::CowStr<'s>>>,
31
+
// ///The primary post content. May be an empty string, if there are embeds.
32
+
// #[serde(borrow)]
33
+
// pub text: jacquard_common::CowStr<'s>,
34
+
// }
8
35
9
-
#[test]
10
-
fn it_works() {
11
-
let result = add(2, 2);
12
-
assert_eq!(result, 4);
13
-
}
14
-
}
36
+
// #[open_union]
37
+
// #[derive(serde::Serialize, serde::Deserialize, Debug, Clone, PartialEq, Eq)]
38
+
// #[serde(tag = "$type")]
39
+
// pub enum RecordEmbed<'s> {
40
+
// #[serde(borrow)]
41
+
// #[serde(rename = "app.bsky.embed.images")]
42
+
// EmbedImages(Box<jacquard_api::app_bsky::embed::Images<'s>>),
43
+
// #[serde(rename = "app.bsky.embed.video")]
44
+
// EmbedVideo(Box<jacquard_api::app_bsky::embed::Video<'s>>),
45
+
// #[serde(rename = "app.bsky.embed.external")]
46
+
// EmbedExternal(Box<jacquard_api::app_bsky::embed::External<'s>>),
47
+
// #[serde(rename = "app.bsky.embed.record")]
48
+
// EmbedRecord(Box<jacquard_api::app_bsky::embed::Record<'s>>),
49
+
// #[serde(rename = "app.bsky.embed.recordWithMedia")]
50
+
// EmbedRecordWithMedia(Box<jacquard_api::app_bsky::embed::RecordWithMedia<'s>>),
51
+
// }
+41
crates/jacquard-lexicon/src/output.rs
+41
crates/jacquard-lexicon/src/output.rs
···
1
+
use crate::lexicon::*;
2
+
use heck::{ToPascalCase, ToShoutySnakeCase, ToSnakeCase};
3
+
use itertools::Itertools;
4
+
use jacquard_common::CowStr;
5
+
use proc_macro2::TokenStream;
6
+
use quote::{format_ident, quote};
7
+
use std::collections::{HashMap, HashSet};
8
+
use syn::{Path, Result};
9
+
10
+
fn string_type<'s>(string: &'s LexString<'s>) -> Result<(TokenStream, TokenStream)> {
11
+
let description = description(&string.description);
12
+
let typ = match string.format {
13
+
Some(LexStringFormat::AtIdentifier) => {
14
+
quote!(jacquard_common::types::string::AtIdentifier<'s>)
15
+
}
16
+
Some(LexStringFormat::Cid) => quote!(jacquard_common::types::string::Cid<'s>),
17
+
Some(LexStringFormat::Datetime) => quote!(jacquard_common::types::string::Datetime),
18
+
Some(LexStringFormat::Did) => quote!(jacquard_common::types::string::Did<'s>),
19
+
Some(LexStringFormat::Handle) => quote!(jacquard_common::types::string::Handle<'s>),
20
+
Some(LexStringFormat::Nsid) => quote!(jacquard_common::types::string::Nsid<'s>),
21
+
Some(LexStringFormat::Language) => quote!(jacquard_common::types::string::Language),
22
+
Some(LexStringFormat::Tid) => quote!(jacquard_common::types::string::Tid),
23
+
Some(LexStringFormat::RecordKey) => quote!(
24
+
jacquard_common::types::string::RecordKey<jacquard_common::types::string::Rkey<'s>>
25
+
),
26
+
Some(LexStringFormat::Uri) => quote!(jacquard_common::types::string::Uri<'s>),
27
+
Some(LexStringFormat::AtUri) => quote!(jacquard_common::types::string::AtUri<'s>),
28
+
// No format specified: fall back to a plain CowStr
29
+
_ => quote!(CowStr<'s>),
30
+
};
31
+
Ok((description, typ))
32
+
}
33
+
34
+
fn description<'s>(description: &Option<CowStr<'s>>) -> TokenStream {
35
+
if let Some(description) = description {
36
+
let description = description.as_ref();
37
+
quote!(#[doc = #description])
38
+
} else {
39
+
quote!()
40
+
}
41
+
}
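For orientation, a sketch of what `string_type` resolves for a `did`-formatted string; `string_type` is private, so this would sit in a test module next to it, and the expected tokens simply mirror the `Did` arm above:

```rust
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn did_format_maps_to_did_type() {
        let lex = LexString {
            description: Some(CowStr::Borrowed("Account DID")),
            format: Some(LexStringFormat::Did),
            default: None,
            min_length: None,
            max_length: None,
            min_graphemes: None,
            max_graphemes: None,
            r#enum: None,
            r#const: None,
            known_values: None,
        };
        let (_doc, typ) = string_type(&lex).unwrap();
        assert_eq!(
            typ.to_string(),
            quote!(jacquard_common::types::string::Did<'s>).to_string()
        );
    }
}
```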
+142
crates/jacquard-lexicon/src/schema.rs
+142
crates/jacquard-lexicon/src/schema.rs
···
1
+
// Forked from atrium-codegen
2
+
// https://github.com/sugyan/atrium/blob/main/lexicon/atrium-codegen/src/schema.rs
3
+
4
+
use crate::lexicon::*;
5
+
use heck::ToPascalCase;
6
+
use jacquard_common::{
7
+
CowStr, IntoStatic,
8
+
smol_str::{self, SmolStr, ToSmolStr},
9
+
};
10
+
use std::collections::BTreeMap;
11
+
12
+
pub(crate) fn find_ref_unions<'s>(
13
+
defs: &'s BTreeMap<SmolStr, LexUserType<'s>>,
14
+
) -> Vec<(SmolStr, LexRefUnion<'s>)> {
15
+
let mut unions = Vec::new();
16
+
for (key, def) in defs {
17
+
match def {
18
+
LexUserType::Record(record) => {
19
+
let LexRecordRecord::Object(object) = &record.record;
20
+
find_ref_unions_in_object(object, SmolStr::new_static("Record"), &mut unions);
21
+
}
22
+
LexUserType::XrpcQuery(query) => {
23
+
if let Some(output) = &query.output {
24
+
if let Some(schema) = &output.schema {
25
+
find_ref_unions_in_body_schema(
26
+
schema,
27
+
SmolStr::new_static("Output"),
28
+
&mut unions,
29
+
);
30
+
}
31
+
}
32
+
}
33
+
LexUserType::XrpcProcedure(procedure) => {
34
+
if let Some(input) = &procedure.input {
35
+
if let Some(schema) = &input.schema {
36
+
find_ref_unions_in_body_schema(
37
+
schema,
38
+
SmolStr::new_static("Input"),
39
+
&mut unions,
40
+
);
41
+
}
42
+
}
43
+
if let Some(output) = &procedure.output {
44
+
if let Some(schema) = &output.schema {
45
+
find_ref_unions_in_body_schema(
46
+
schema,
47
+
SmolStr::new_static("Output"),
48
+
&mut unions,
49
+
);
50
+
}
51
+
}
52
+
}
53
+
LexUserType::XrpcSubscription(subscription) => {
54
+
if let Some(message) = &subscription.message {
55
+
if let Some(schema) = &message.schema {
56
+
find_ref_unions_in_subscription_message_schema(
57
+
schema,
58
+
SmolStr::new_static("Message"),
59
+
&mut unions,
60
+
);
61
+
}
62
+
}
63
+
}
64
+
LexUserType::Array(array) => {
65
+
find_ref_unions_in_array(
66
+
array,
67
+
CowStr::Borrowed(&key.to_pascal_case()).into_static(),
68
+
&mut unions,
69
+
);
70
+
}
71
+
LexUserType::Object(object) => {
72
+
find_ref_unions_in_object(object, key.to_pascal_case().to_smolstr(), &mut unions);
73
+
}
74
+
_ => {}
75
+
}
76
+
}
77
+
unions.sort_by_cached_key(|(name, _)| name.clone());
78
+
unions
79
+
}
80
+
81
+
fn find_ref_unions_in_body_schema<'s>(
82
+
schema: &'s LexXrpcBodySchema,
83
+
name: SmolStr,
84
+
unions: &mut Vec<(SmolStr, LexRefUnion<'s>)>,
85
+
) {
86
+
match schema {
87
+
LexXrpcBodySchema::Union(_) => unimplemented!(),
88
+
LexXrpcBodySchema::Object(object) => find_ref_unions_in_object(object, name, unions),
89
+
_ => {}
90
+
}
91
+
}
92
+
93
+
fn find_ref_unions_in_subscription_message_schema<'s>(
94
+
schema: &'s LexXrpcSubscriptionMessageSchema,
95
+
name: SmolStr,
96
+
unions: &mut Vec<(SmolStr, LexRefUnion<'s>)>,
97
+
) {
98
+
match schema {
99
+
LexXrpcSubscriptionMessageSchema::Union(union) => {
100
+
unions.push((name.into(), union.clone()));
101
+
}
102
+
LexXrpcSubscriptionMessageSchema::Object(object) => {
103
+
find_ref_unions_in_object(object, name, unions)
104
+
}
105
+
_ => {}
106
+
}
107
+
}
108
+
109
+
fn find_ref_unions_in_array<'s>(
110
+
array: &'s LexArray,
111
+
name: CowStr<'s>,
112
+
unions: &mut Vec<(SmolStr, LexRefUnion<'s>)>,
113
+
) {
114
+
if let LexArrayItem::Union(union) = &array.items {
115
+
unions.push((smol_str::format_smolstr!("{}", name), union.clone()));
116
+
}
117
+
}
118
+
119
+
fn find_ref_unions_in_object<'s>(
120
+
object: &'s LexObject,
121
+
name: SmolStr,
122
+
unions: &mut Vec<(SmolStr, LexRefUnion<'s>)>,
123
+
) {
124
+
for (k, property) in &object.properties {
125
+
match property {
126
+
LexObjectProperty::Union(union) => {
127
+
unions.push((
128
+
smol_str::format_smolstr!("{name}{}", k.to_pascal_case()),
129
+
union.clone(),
130
+
));
131
+
}
132
+
LexObjectProperty::Array(array) => {
133
+
find_ref_unions_in_array(
134
+
array,
135
+
CowStr::Borrowed(&(name.to_string() + &k.to_pascal_case())).into_static(),
136
+
unions,
137
+
);
138
+
}
139
+
_ => {}
140
+
}
141
+
}
142
+
}
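To make the naming scheme concrete: a union sitting on the `embed` property of a record's main object is collected under the key `RecordEmbed`, which is the name the commented-out `RecordEmbed` enum in `lib.rs` above corresponds to. A sketch as an in-crate test (`find_ref_unions` is `pub(crate)`); the lexicon JSON is invented:

```rust
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn union_on_record_property_is_named_after_the_property() {
        let doc = serde_json::from_str::<LexiconDoc>(
            r#"{
                "lexicon": 1,
                "id": "com.example.post",
                "defs": {
                    "main": {
                        "type": "record",
                        "record": {
                            "type": "object",
                            "properties": {
                                "embed": {
                                    "type": "union",
                                    "refs": ["com.example.images", "com.example.video"]
                                }
                            }
                        }
                    }
                }
            }"#,
        )
        .expect("failed to deserialize");

        let unions = find_ref_unions(&doc.defs);
        // Record objects are rooted at "Record" and the property key is appended
        // in PascalCase, so the union on "embed" is collected as "RecordEmbed".
        assert_eq!(unions.len(), 1);
        assert_eq!(unions[0].0, "RecordEmbed");
    }
}
```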