commits
The websocket-lwt-unix library delivers individual WebSocket frames,
not complete messages. Large messages can be split across multiple
frames (Binary/Text followed by Continuation frames until final=true).
This fix:
- Adds frag_buf buffer to accumulate fragmented message data
- Tracks message type (Binary/Text) via frag_opcode
- Returns a binary message only once its final frame has been received
- Skips fragmented text messages instead of returning them
Without this fix, partial binary data was passed to the CBOR decoder,
causing 'invalid payload CBOR' errors after ~1-2 seconds of operation.
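The reassembly described above can be sketched roughly as follows. This is
a minimal illustration, not the code from the commit: the frame record and
the feed function are hypothetical stand-ins (the real websocket-lwt-unix
frame type differs), and only frag_buf and frag_opcode are names taken from
the message.

```ocaml
(* Hypothetical frame model; the real websocket-lwt-unix types differ. *)
type opcode = Text | Binary | Continuation

type frame = { opcode : opcode; final : bool; content : string }

type reassembler = {
  frag_buf : Buffer.t;                 (* payload accumulated so far *)
  mutable frag_opcode : opcode option; (* type of the message in progress *)
}

let create () = { frag_buf = Buffer.create 4096; frag_opcode = None }

(* Feed one frame. Returns [Some payload] only when a complete binary
   message is available; text messages and partial data yield [None]. *)
let feed r frame =
  (match frame.opcode with
   | Text | Binary ->
     (* A data frame starts a new message: drop any stale state. *)
     Buffer.clear r.frag_buf;
     r.frag_opcode <- Some frame.opcode;
     Buffer.add_string r.frag_buf frame.content
   | Continuation ->
     (* Ignore continuation frames that arrive with no message in progress. *)
     if r.frag_opcode <> None then Buffer.add_string r.frag_buf frame.content);
  if not frame.final then None
  else begin
    let kind = r.frag_opcode in
    let payload = Buffer.contents r.frag_buf in
    Buffer.clear r.frag_buf;
    r.frag_opcode <- None;
    match kind with
    | Some Binary -> Some payload
    | Some Text | Some Continuation | None -> None
  end
```

With this shape, the CBOR decoder only ever sees the value returned on the
final frame, never an intermediate fragment.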
- Run indefinitely until Ctrl+C, then print stats and exit cleanly
- Use Lwt_unix.on_signal for proper signal handling integration
- Suppress error messages on graceful exit (decode errors are expected)
- Check interrupted flag before and after WebSocket reads
- Add 'Press Ctrl+C to stop' message at startup
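A rough sketch of the interrupted-flag pattern, for illustration only. It
uses the standard library's Sys.set_signal rather than Lwt_unix.on_signal
so that it stays self-contained; the run function and its recv and
on_message arguments are hypothetical stand-ins for the demo's read loop.

```ocaml
(* Set by the SIGINT handler, polled by the read loop. *)
let interrupted = ref false

(* Swapped-in stdlib equivalent of the Lwt_unix.on_signal registration. *)
let install_sigint_handler () =
  Sys.set_signal Sys.sigint (Sys.Signal_handle (fun _ -> interrupted := true))

(* [recv] stands in for one WebSocket read (hypothetical). The flag is
   checked before the read and again after it returns, so a message that
   arrives during shutdown is dropped instead of processed.
   Returns the number of messages handled. *)
let run ~recv ~on_message =
  let count = ref 0 in
  let rec loop () =
    if !interrupted then ()
    else
      match recv () with
      | None -> ()
      | Some msg ->
        if not !interrupted then begin
          on_message msg;
          incr count;
          loop ()
        end
  in
  loop ();
  !count
```

The returned count is the kind of value a stats printout on exit could
report.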
- Use the websocket-lwt-unix library for WebSocket handling
- Implement Firehose effects with Lwt_main.run in handlers
- 380 lines vs 528 for Eio version (28% reduction)
- Same CLI options: --filter, --limit, --json, --content, --cursor
- Use Dag_json.encode_string for record serialization instead of custom code
- Add Climate for CLI parsing with auto-generated help
- Add compact stats with SIGINT handling (prints on Ctrl+C)
- Remove custom record_content type and JSON escaping
- Simplify filter names to single form only
Line count: 805 → 547 (32% reduction)
Major architectural change:
- Use Firehose.subscribe from the library instead of manual event loop
- Move WebSocket implementation into Ws_handler module as effect handler
- Demo now focuses on filtering/formatting, not protocol details
The demo provides effect handlers for:
- Firehose.Ws_connect: TLS connection with WebSocket upgrade
- Firehose.Ws_recv: Frame parsing with fragmentation support
- Firehose.Ws_close: Connection cleanup
This separates concerns:
- Library: protocol logic (subscribe, decode_frame, event types)
- Demo: presentation (filtering, formatting, statistics)
- Ws_handler: I/O implementation (TLS, WebSocket framing)
Line count: 795 (down from 1204, -34%)
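The split between protocol logic and I/O handlers might look roughly like
the sketch below. The effect declarations, the conn type, and with_fake_ws
are assumptions made for illustration; the library's actual Ws_connect,
Ws_recv, and Ws_close signatures are not shown in this log and may differ.
The handler here is an in-memory fake, not the TLS implementation.

```ocaml
open Effect
open Effect.Deep

(* Hypothetical connection state for the in-memory fake. *)
type conn = { mutable inbox : string list }

(* Assumed effect signatures; the real library may declare them differently. *)
type _ Effect.t +=
  | Ws_connect : string -> conn Effect.t
  | Ws_recv : conn -> string option Effect.t
  | Ws_close : conn -> unit Effect.t

(* Protocol side: performs effects and knows nothing about sockets. *)
let subscribe ~url ~on_frame =
  let c = perform (Ws_connect url) in
  let rec loop () =
    match perform (Ws_recv c) with
    | Some frame -> on_frame frame; loop ()
    | None -> perform (Ws_close c)
  in
  loop ()

(* Handler side: replays a fixed list of frames in place of real I/O. *)
let with_fake_ws frames f =
  try_with f ()
    { effc = (fun (type a) (eff : a Effect.t) ->
        match eff with
        | Ws_connect _ ->
          Some (fun (k : (a, _) continuation) ->
            continue k { inbox = frames })
        | Ws_recv c ->
          Some (fun (k : (a, _) continuation) ->
            match c.inbox with
            | [] -> continue k None
            | x :: rest -> c.inbox <- rest; continue k (Some x))
        | Ws_close c ->
          Some (fun (k : (a, _) continuation) ->
            c.inbox <- []; continue k ())
        | _ -> None) }
```

Swapping with_fake_ws for a handler backed by a real socket would leave
subscribe unchanged, which is the separation the commit describes.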
Major changes:
- Remove demo mode and sample data (~143 lines)
- Consolidate JSON formatting with shared helpers (json_opt, json_str)
- Merge duplicate formatting functions (format_event, format_event_json now take ~rich param)
- Simplify filter matching with prefix_match helper
- Compact statistics printing
- Remove verbose section comment banners
- Use more concise OCaml idioms throughout
All features preserved:
- --filter for event type filtering
- --json for JSON lines output
- --content for record content extraction
- --cursor for resuming from sequence
- --limit for stopping after N events
Remove duplicate CBOR helper functions and use Firehose module's
get_string, get_array, etc. instead of local reimplementations.
Changes:
- Remove get_cbor_string (use Firehose.get_string)
- Simplify get_string_array to use Firehose.get_array
- Simplify get_nested_string using library helpers
- Update decode_record to use Firehose.get_string throughout
The custom WebSocket implementation is kept in the demo as it serves
as the effect handler for the library's WebSocket effects, and is
needed due to server compatibility requirements (ALPN negotiation).
Add three major features to the firehose demo:
1. Event filtering (--filter): Filter by record types (posts, likes, follows,
   reposts, blocks, lists, profiles, feeds) or event types (commits,
   identities, accounts, handles, tombstones). Supports comma-separated
   multiple filters.
2. JSON output (--json): Output events as JSON lines (JSONL) format for
   piping to jq or other JSON processing tools. Suppresses human-readable
   headers and statistics in this mode.
3. Content extraction (--content): Decode and display actual record content
   from CAR blocks embedded in commit events:
   - Posts: text, language tags, reply indicators
   - Likes/Reposts: subject URI
   - Follows/Blocks: target DID
   - Profiles: display name and description
The --json and --content flags can be combined for structured output with
full record data.
Example usage:
  --filter posts --limit 10          # Show only posts
  --json --filter likes              # Likes as JSON
  --content --filter posts           # Posts with text content
  --json --content --filter posts    # Full JSON with record data
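As an illustration of how comma-separated record-type filters could be
applied, here is a small sketch. The function names and the mapping from
filter names to collection NSIDs are assumptions (the NSIDs themselves are
the standard Bluesky ones, but how the demo maps them is not shown here).
Only a subset of the filter names is covered.

```ocaml
(* Assumed mapping from filter names to Bluesky collection NSIDs. *)
let collection_of_filter = function
  | "posts" -> Some "app.bsky.feed.post"
  | "likes" -> Some "app.bsky.feed.like"
  | "reposts" -> Some "app.bsky.feed.repost"
  | "follows" -> Some "app.bsky.graph.follow"
  | "blocks" -> Some "app.bsky.graph.block"
  | _ -> None

(* Split "posts, likes" into trimmed, non-empty filter names. *)
let parse_filters s =
  String.split_on_char ',' s
  |> List.map String.trim
  |> List.filter (fun name -> name <> "")

(* A commit op path has the form "<collection>/<rkey>". An empty filter
   list matches everything; unknown filter names match nothing. *)
let matches filters path =
  filters = []
  || List.exists
       (fun name ->
          match collection_of_filter name with
          | Some coll -> String.starts_with ~prefix:(coll ^ "/") path
          | None -> false)
       filters
```

For example, parse_filters "posts,likes" yields two names, and a path under
app.bsky.feed.post then matches because of the "posts" entry.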
- Show sample firehose events with colored output
- Demonstrate frame encoding/decoding roundtrip
- Add statistics tracking example
- Include integration pattern for WebSocket handlers
- Remove non-functional WebSocket code (requires TLS setup)
- Add Float variant to DAG-CBOR value type with decode_mode (Ipld/Atproto)
- Floats encoded as 64-bit IEEE 754 doubles, NaN/Infinity rejected
- Add CIDv0 support: parsing, encoding, version detection
- Add CBOR strictness checking: reject indefinite-length encoding
- Implement DAG-JSON codec with {"/": cid} links and {"/": {"bytes": b64}} bytes
- Add 22 new IPLD tests (66 total), all 494 compliance tests pass
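The float and strictness rules can be illustrated with a short sketch. The
function names encode_float and is_indefinite_length are hypothetical and
not taken from the library; the byte layout follows the CBOR encoding rules
(initial byte 0xfb followed by eight big-endian bytes for a double, and
additional-info value 31 for indefinite-length items).

```ocaml
(* Encode a float as a CBOR 64-bit double: initial byte 0xfb followed by
   the IEEE 754 bits in big-endian order. NaN and infinities are rejected. *)
let encode_float (f : float) : (string, string) result =
  if not (Float.is_finite f) then Error "NaN and Infinity are not allowed"
  else begin
    let b = Bytes.create 9 in
    Bytes.set b 0 '\xfb';
    Bytes.set_int64_be b 1 (Int64.bits_of_float f);
    Ok (Bytes.to_string b)
  end

(* The low five bits of a CBOR initial byte hold the additional info;
   the value 31 marks an indefinite-length item (or the 0xff break),
   which a strict decoder would reject. *)
let is_indefinite_length initial_byte = initial_byte land 0x1f = 31
```

A strict decoder could call is_indefinite_length on each initial byte and
fail early, before attempting to read a length.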
- Add README.md with project overview and package descriptions
- Add CONTRIBUTING.md with development guidelines
- Add doc/index.mld with odoc documentation entry point
- Add odoc dependency to dune-project for documentation generation
- Enhance module-level docs for effects, ipld, lexicon, mst, and repo packages
Compliance Report Generator:
- Add test/compliance/ with compliance_report.ml and run_compliance.ml
- Generate JSON, Markdown, and HTML reports from atproto-interop-tests
- Achieve 100% pass rate (494/494 tests) across all test suites:
  - Syntax Validation: 448/448 (handle, DID, NSID, TID, record key, AT-URI, datetime, language)
  - Cryptography: 12/12 (signature verification, P-256/K-256 did:key)
  - Data Model (IPLD): 21/21 (DAG-CBOR/CID computation, CID syntax)
  - Merkle Search Tree: 13/13 (key heights, common prefix)
Generated reports:
- compliance-report.json (machine-readable)
- COMPLIANCE.md (human-readable Markdown)
- compliance-report.html (interactive HTML with styling)
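The per-suite counts add up to the reported total (448 + 12 + 21 + 13 =
494). A minimal sketch of that tally, with a Markdown row formatter of the
sort a report generator might use; the suite record, totals, and
markdown_row are hypothetical names and not taken from compliance_report.ml.

```ocaml
type suite = { name : string; passed : int; total : int }

(* Counts copied from the commit message above. *)
let suites = [
  { name = "Syntax Validation"; passed = 448; total = 448 };
  { name = "Cryptography"; passed = 12; total = 12 };
  { name = "Data Model (IPLD)"; passed = 21; total = 21 };
  { name = "Merkle Search Tree"; passed = 13; total = 13 };
]

(* Sum passed and total counts across all suites. *)
let totals suites =
  List.fold_left (fun (p, t) s -> (p + s.passed, t + s.total)) (0, 0) suites

(* One Markdown table row per suite. *)
let markdown_row s = Printf.sprintf "| %s | %d/%d |" s.name s.passed s.total
```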
All core functionality implemented:
- 11 packages (syntax, crypto, multibase, ipld, mst, repo, identity, xrpc, sync, lexicon, api)
- 272 passing tests covering all 42 interop fixtures
- Effects-based I/O for runtime-agnostic code
- Firehose demo example
Documentation task remains open for future work.
Shows how to use the firehose module with OCaml 5 effects for
subscribing to AT Protocol real-time events (commits, identity
changes, etc.).
Implement comprehensive AT Protocol support in OCaml with 11 packages:
- atproto-syntax: Identifier parsing (handle, DID, NSID, TID, AT-URI, etc.)
- atproto-crypto: P-256/K-256 cryptography, did:key, JWT
- atproto-multibase: Base32, Base58btc encoding
- atproto-ipld: DAG-CBOR, CIDs, CAR files, blobs
- atproto-mst: Merkle Search Tree implementation
- atproto-repo: Repository operations and commits
- atproto-identity: DID and handle resolution
- atproto-xrpc: HTTP API client/server, OAuth
- atproto-sync: Firehose and repository synchronization
- atproto-lexicon: Schema language parser, validator, codegen
- atproto-api: High-level client API with rich text
All 272 tests pass, covering all 42 fixture files from atproto-interop-tests.
Uses OCaml 5.4 effects for I/O abstraction.