plcbundle-rs High-Level API Design#
Overview#
The plcbundle-rs library provides a unified, high-level API through the BundleManager type. This API is designed to be:
- Consistent: Same patterns across all operations
- Efficient: Minimal allocations, streaming where possible
- FFI-friendly: Clean C bindings that map naturally to Go
- CLI-first: The Rust CLI uses only public high-level APIs
Quick Reference: All Methods#
// === Initialization ===
BundleManager::new<O: IntoManagerOptions>(directory: PathBuf, options: O) -> Result<Self>
// === Repository Setup ===
init_repository(directory: PathBuf, origin: String) -> Result<()> // Creates plc_bundles.json
// === Bundle Loading ===
load_bundle(num: u32, options: LoadOptions) -> Result<LoadResult>
get_bundle_info(num: u32, flags: InfoFlags) -> Result<BundleInfo>
// === Single Operation Access ===
get_operation_raw(bundle_num: u32, position: usize) -> Result<String>
get_operation(bundle_num: u32, position: usize) -> Result<Operation>
get_operation_with_stats(bundle_num: u32, position: usize) -> Result<OperationResult>
// === Batch Operations ===
get_operations_batch(requests: Vec<OperationRequest>) -> Result<Vec<Operation>>
get_operations_range(start: u32, end: u32, filter: Option<OperationFilter>) -> RangeIterator
// === Query & Export ===
query(spec: QuerySpec) -> QueryIterator
export(spec: ExportSpec) -> ExportIterator
export_to_writer<W: Write>(spec: ExportSpec, writer: W) -> Result<ExportStats>
// === DID Operations ===
get_did_operations(did: &str) -> Result<Vec<Operation>>
resolve_did(did: &str) -> Result<DIDDocument>
batch_resolve_dids(dids: Vec<String>) -> Result<HashMap<String, Vec<Operation>>>
sample_random_dids(count: usize, seed: Option<u64>) -> Result<Vec<String>>
// === Verification ===
verify_bundle(num: u32, spec: VerifySpec) -> Result<VerifyResult>
verify_chain(spec: ChainVerifySpec) -> Result<ChainVerifyResult>
// === Rollback ===
rollback_plan(spec: RollbackSpec) -> Result<RollbackPlan>
rollback(spec: RollbackSpec) -> Result<RollbackResult>
// === Repository Cleanup ===
clean() -> Result<CleanResult>
// === Performance & Caching ===
prefetch_bundles(nums: Vec<u32>) -> Result<()>
warm_up(spec: WarmUpSpec) -> Result<()>
clear_caches()
// === DID Index ===
build_did_index(flush_interval: u32, progress_cb: Option<F>, num_threads: Option<usize>, interrupted: Option<Arc<AtomicBool>>) -> Result<RebuildStats>
verify_did_index(verbose: bool, flush_interval: u32, full: bool, progress_callback: Option<F>) -> Result<did_index::VerifyResult>
repair_did_index(num_threads: usize, flush_interval: u32, progress_callback: Option<F>) -> Result<did_index::RepairResult>
get_did_index_stats() -> HashMap<String, serde_json::Value>
// === Sync Operations (Async) ===
sync_next_bundle(client: &PLCClient) -> Result<u32>
sync_once(client: &PLCClient) -> Result<usize>
// === Remote Access (Async) ===
fetch_remote_index(target: &str) -> Result<Index>
fetch_remote_bundle(base_url: &str, bundle_num: u32) -> Result<Vec<Operation>>
fetch_remote_operation(base_url: &str, bundle_num: u32, position: usize) -> Result<String>
// === Server API Methods ===
get_plc_origin() -> String
stream_bundle_raw(bundle_num: u32) -> Result<File>
stream_bundle_decompressed(bundle_num: u32) -> Result<Box<dyn Read + Send>>
get_current_cursor() -> u64
resolve_handle_or_did(input: &str) -> Result<(String, u64)>
get_resolver_stats() -> HashMap<String, serde_json::Value>
get_handle_resolver_base_url() -> Option<String>
get_did_index() -> Arc<RwLock<did_index::Manager>>
// === Mempool Operations ===
get_mempool_stats() -> Result<MempoolStats>
get_mempool_operations() -> Result<Vec<Operation>>
add_to_mempool(ops: Vec<Operation>) -> Result<usize>
clear_mempool() -> Result<()>
// === Observability ===
get_stats() -> ManagerStats
get_last_bundle() -> u32
directory() -> &PathBuf
11. Sync & Repository Management#
Repository Initialization#
// Standalone function (used by CLI init command)
pub fn init_repository(directory: PathBuf, origin: String) -> Result<()>
Purpose: Initialize a new PLC bundle repository (like git init).
What it does:
- Creates directory if it doesn't exist
- Creates empty
plc_bundles.jsonindex file - Sets origin identifier
CLI Usage:
plcbundle init
plcbundle init /path/to/bundles --origin my-node
Sync from PLC Directory#
pub async fn sync_next_bundle(&mut self, client: &PLCClient) -> Result<u32>
Purpose: Fetch operations from PLC directory and create next bundle.
What it does:
- Gets boundary CIDs from last bundle (prevents duplicates)
- Fetches operations from PLC until mempool has 10,000
- Deduplicates using boundary CID logic
- Creates and saves bundle
- Returns bundle number
pub async fn sync_once(&mut self, client: &PLCClient) -> Result<usize>
Purpose: Sync until caught up with PLC directory (like git fetch).
Returns: Number of bundles synced
CLI Usage:
# Sync once
plcbundle sync
# Continuous daemon
plcbundle sync --continuous --interval 30s
# Max bundles
plcbundle sync --max-bundles 10
PLC Client#
pub struct PLCClient {
// Private fields
}
impl PLCClient {
pub fn new(base_url: impl Into<String>) -> Result<Self>
pub async fn fetch_operations(&self, after: &str, count: usize) -> Result<Vec<PLCOperation>>
}
Features:
- Rate limiting (90 requests/minute)
- Automatic retries with exponential backoff
- 429 (rate limit) handling
Mempool Operations#
pub fn get_mempool_stats(&self) -> Result<MempoolStats>
Purpose: Get current mempool statistics.
pub struct MempoolStats {
pub count: usize,
pub can_create_bundle: bool, // count >= 10,000
pub target_bundle: u32,
pub min_timestamp: DateTime<Utc>,
pub first_time: Option<DateTime<Utc>>,
pub last_time: Option<DateTime<Utc>>,
pub size_bytes: Option<usize>,
pub did_count: Option<usize>,
}
pub fn get_mempool_operations(&self) -> Result<Vec<Operation>>
pub fn add_to_mempool(&self, ops: Vec<Operation>) -> Result<usize>
pub fn clear_mempool(&self) -> Result<()>
CLI Usage:
plcbundle mempool status
plcbundle mempool dump
plcbundle mempool clear
Boundary CID Deduplication#
Critical helper functions for preventing duplicates across bundle boundaries:
pub fn get_boundary_cids(operations: &[Operation]) -> HashSet<String>
Purpose: Extract CIDs that share the same timestamp as the last operation.
pub fn strip_boundary_duplicates(
operations: Vec<Operation>,
prev_boundary: &HashSet<String>
) -> Vec<Operation>
Purpose: Remove operations that were in the previous bundle's boundary.
Why this matters: Operations can have identical timestamps. When creating bundles, the last N operations might share a timestamp. The next fetch might return these same operations again. Must deduplicate!
Design Principles#
- Single Entry Point: All operations go through
BundleManager - Options Pattern: Complex operations use dedicated option structs
- Result Types: Operations return structured result types, not raw tuples
- Streaming by Default: Use iterators for large datasets
- No Direct File Access: CLI never opens bundle files directly
API Structure#
Core Manager#
pub struct BundleManager {
// Private fields
}
impl BundleManager {
/// Create a new BundleManager with options
///
/// # Examples
///
/// ```rust
/// use plcbundle::{BundleManager, ManagerOptions};
/// use std::path::PathBuf;
///
/// // With default options
/// let manager = BundleManager::new(PathBuf::from("."), ())?;
///
/// // With custom options
/// let options = ManagerOptions {
/// handle_resolver_url: Some("https://example.com".to_string()),
/// preload_mempool: true,
/// verbose: true,
/// };
/// let manager = BundleManager::new(PathBuf::from("."), options)?;
/// ```
pub fn new<O: IntoManagerOptions>(directory: PathBuf, options: O) -> Result<Self>
}
ManagerOptions#
/// Options for configuring BundleManager initialization
pub struct ManagerOptions {
/// Optional handle resolver URL for resolving @handle.did identifiers
pub handle_resolver_url: Option<String>,
/// Whether to preload the mempool at initialization (for server use)
pub preload_mempool: bool,
/// Whether to enable verbose logging
pub verbose: bool,
}
impl Default for ManagerOptions {
fn default() -> Self {
Self {
handle_resolver_url: None,
preload_mempool: false,
verbose: false,
}
}
}
Usage:
- Pass
()to use default options:BundleManager::new(dir, ()) - Pass
ManagerOptionsfor custom configuration (including verbose mode)
1. Bundle Loading#
Individual Bundle Loading#
pub fn load_bundle(&self, bundle_num: u32, options: LoadOptions) -> Result<LoadResult>
Purpose: Load a single bundle with control over parsing, verification, and caching.
Options:
pub struct LoadOptions {
pub verify_hash: bool, // Verify bundle integrity
pub decompress: bool, // Decompress bundle data
pub cache: bool, // Cache in memory
pub parse_operations: bool, // Parse JSON into Operation structs
}
Result:
pub struct LoadResult {
pub bundle_number: u32,
pub operations: Vec<Operation>,
pub metadata: BundleMetadata,
pub hash: Option<String>,
}
Use Cases:
- CLI:
plcbundle info --bundle 42 - CLI:
plcbundle verify --bundle 42
2. Operation Access#
Single Operation#
pub fn get_operation(&self, bundle_num: u32, position: usize, options: OperationOptions) -> Result<Operation>
Purpose: Efficiently retrieve a single operation without loading entire bundle.
Options:
pub struct OperationOptions {
pub raw_json: bool, // Return raw JSON string (faster)
pub parse: bool, // Parse into Operation struct
}
Use Cases:
- CLI:
plcbundle op get 42 1337 - CLI:
plcbundle op show 420000
Batch Operations#
pub fn get_operations_batch(&self, requests: Vec<OperationRequest>) -> Result<Vec<Operation>>
Purpose: Retrieve multiple operations in one call (optimizes file I/O).
pub struct OperationRequest {
pub bundle_num: u32,
pub position: usize,
}
Range Operations#
pub fn get_operations_range(&self, start: u64, end: u64, filter: Option<OperationFilter>) -> Result<RangeIterator>
Purpose: Stream operations across bundle boundaries by global position.
3. DID Operations#
DID Lookup#
pub fn get_did_operations(&self, did: &str) -> Result<Vec<Operation>>
Purpose: Get all operations for a specific DID (requires DID index).
Batch DID Lookup#
pub fn batch_resolve_dids(&self, dids: Vec<String>) -> Result<HashMap<String, Vec<Operation>>>
Purpose: Efficiently resolve multiple DIDs in one call.
Use Cases:
- Bulk DID resolution
- Identity verification workflows
Random DID Sampling#
pub fn sample_random_dids(&self, count: usize, seed: Option<u64>) -> Result<Vec<String>>
Purpose: Retrieve pseudo-random DIDs directly from the DID index without touching bundle files.
Details:
- Reads identifiers from memory-mapped shard data (no decompression).
- Deterministic when
seedis provided; otherwise uses current timestamp. - Useful for benchmarks, sampling, or quick spot checks.
4. Query & Export (Streaming)#
Query Operations#
pub fn query(&self, spec: QuerySpec) -> Result<QueryIterator>
Purpose: Execute JMESPath or simple path queries across bundles.
pub struct QuerySpec {
pub expression: String, // JMESPath or simple path
pub bundles: BundleRange, // Which bundles to search
pub mode: QueryMode, // Auto, Simple, or JMESPath
pub filter: Option<OperationFilter>,
pub parallel: Option<usize>, // Number of threads (0=auto)
pub batch_size: usize, // Output batch size
}
pub enum QueryMode {
Auto,
Simple, // Fast path: direct field access
JMESPath, // Full JMESPath evaluation
}
Iterator:
pub struct QueryIterator {
// Implements Iterator<Item = Result<String>>
}
Use Cases:
- CLI:
plcbundle query "did" --bundles 1-100 - CLI:
plcbundle query "operation.type" --mode simple
Export Operations#
pub fn export(&self, spec: ExportSpec) -> Result<ExportIterator>
Purpose: Export operations in various formats with filtering.
pub struct ExportSpec {
pub bundles: BundleRange,
pub format: ExportFormat, // JSONL only
pub filter: Option<OperationFilter>,
pub count: Option<usize>, // Limit number of operations
pub after_timestamp: Option<String>,// Filter by timestamp
}
pub enum ExportFormat {
Jsonl,
}
pub enum BundleRange {
Single(u32),
Range(u32, u32),
List(Vec<u32>),
All,
}
Use Cases:
- CLI:
plcbundle export --range 1-100 --format jsonl - CLI:
plcbundle export --all --after 2024-01-01T00:00:00Z
Export with Callback#
pub fn export_to_writer<W: Write>(&self, spec: ExportSpec, writer: W) -> Result<ExportStats>
Purpose: Export directly to a writer (file, stdout, network).
pub struct ExportStats {
pub records_written: u64,
pub bytes_written: u64,
pub bundles_processed: u32,
pub duration: Duration,
}
Use Cases:
- FFI: Stream to Go
io.Writer - Direct file writing without buffering
5. Verification#
Bundle Verification#
pub fn verify_bundle(&self, bundle_num: u32, spec: VerifySpec) -> Result<VerifyResult>
Purpose: Verify bundle integrity with various checks.
pub struct VerifySpec {
pub check_hash: bool,
pub check_compression: bool,
pub check_json: bool,
pub parallel: bool,
}
pub struct VerifyResult {
pub valid: bool,
pub hash_match: Option<bool>,
pub compression_ok: Option<bool>,
pub json_valid: Option<bool>,
pub errors: Vec<String>,
}
Chain Verification#
pub fn verify_chain(&self, spec: ChainVerifySpec) -> Result<ChainVerifyResult>
Purpose: Verify chain continuity and prev pointers.
pub struct ChainVerifySpec {
pub bundles: BundleRange,
pub check_prev_links: bool,
pub check_timestamps: bool,
}
pub struct ChainVerifyResult {
pub valid: bool,
pub bundles_checked: u32,
pub chain_breaks: Vec<ChainBreak>,
}
pub struct ChainBreak {
pub bundle_num: u32,
pub did: String,
pub reason: String,
}
Use Cases:
- CLI:
plcbundle verify --chain --bundles 1-100
6. Bundle Information#
pub fn get_bundle_info(&self, bundle_num: u32, flags: InfoFlags) -> Result<BundleInfo>
Purpose: Get metadata and statistics for a bundle.
pub struct InfoFlags {
pub include_dids: bool,
pub include_types: bool,
pub include_timeline: bool,
}
pub struct BundleInfo {
pub bundle_number: u32,
pub operation_count: u32,
pub did_count: Option<u32>,
pub compressed_size: u64,
pub uncompressed_size: u64,
pub start_time: String,
pub end_time: String,
pub hash: String,
pub operation_types: Option<HashMap<String, u32>>,
}
Use Cases:
- CLI:
plcbundle info --bundle 42 --detailed
7. Rollback Operations#
Plan Rollback#
pub fn rollback_plan(&self, spec: RollbackSpec) -> Result<RollbackPlan>
Purpose: Preview what would be rolled back (dry-run).
pub struct RollbackSpec {
pub target_bundle: u32,
pub keep_index: bool,
}
pub struct RollbackPlan {
pub bundles_to_remove: Vec<u32>,
pub total_size: u64,
pub operations_affected: u64,
}
Execute Rollback#
pub fn rollback(&self, spec: RollbackSpec) -> Result<RollbackResult>
Purpose: Execute the rollback operation.
pub struct RollbackResult {
pub bundles_removed: Vec<u32>,
pub bytes_freed: u64,
pub success: bool,
}
7.5. Repository Cleanup#
Clean Temporary Files#
pub fn clean(&self) -> Result<CleanResult>
Purpose: Remove all temporary files from the repository.
What it does:
- Removes all
.tmpfiles from the repository root directory (e.g.,plc_bundles.json.tmp) - Removes temporary files from the DID index directory
.plcbundle/(e.g.,config.json.tmp) - Removes temporary shard files from
.plcbundle/shards/(e.g.,00.tmp,01.tmp, etc.)
Result:
pub struct CleanResult {
pub files_removed: usize,
pub bytes_freed: u64,
pub errors: Option<Vec<String>>,
}
Use Cases:
- CLI:
plcbundle clean - Cleanup after interrupted operations
- Maintenance tasks
Note: Temporary files are normally cleaned up automatically during atomic write operations. This method is useful for cleaning up leftover files from interrupted operations.
8. Performance & Caching#
Cache Management#
pub fn prefetch_bundles(&self, bundle_nums: Vec<u32>) -> Result<()>
Purpose: Preload bundles into cache for faster access.
pub fn warm_up(&self, spec: WarmUpSpec) -> Result<()>
Purpose: Warm up caches with intelligent prefetching.
pub struct WarmUpSpec {
pub bundles: BundleRange,
pub strategy: WarmUpStrategy,
}
pub enum WarmUpStrategy {
Sequential, // Load in order
Parallel, // Load concurrently
Adaptive, // Based on access patterns
}
pub fn clear_caches(&self)
Purpose: Clear all in-memory caches.
9. DID Index Management#
Build Index#
pub fn build_did_index<F>(
&self,
flush_interval: u32,
progress_cb: Option<F>,
num_threads: Option<usize>,
interrupted: Option<Arc<AtomicBool>>,
) -> Result<RebuildStats>
where
F: Fn(u32, u32, u64, u64) + Send + Sync, // (current, total, bytes_processed, total_bytes)
Purpose: Build or rebuild the DID → operations index from scratch.
Parameters:
flush_interval: Flush to disk every N bundles (0 = only at end)progress_cb: Optional progress callbacknum_threads: Number of threads (None = auto-detect)interrupted: Optional flag to check for cancellation
Result:
pub struct RebuildStats {
pub bundles_processed: u32,
pub operations_indexed: u64,
pub unique_dids: usize,
pub duration: Duration,
}
Verify Index#
pub fn verify_did_index<F>(
&self,
verbose: bool,
flush_interval: u32,
full: bool,
progress_callback: Option<F>,
) -> Result<did_index::VerifyResult>
where
F: Fn(u32, u32, u64, u64) + Send + Sync, // (current, total, bytes_processed, total_bytes)
Purpose: Verify DID index integrity. Performs standard checks by default, or full verification (rebuild and compare) if full is true.
Parameters:
verbose: Enable verbose loggingflush_interval: Flush interval for full rebuild (iffullis true)full: If true, rebuilds index in temp directory and compares with existingprogress_callback: Optional progress callback for full verification
Result:
pub struct VerifyResult {
pub errors: usize,
pub warnings: usize,
pub missing_base_shards: usize,
pub missing_delta_segments: usize,
pub shards_checked: usize,
pub segments_checked: usize,
pub error_categories: Vec<(String, usize)>,
pub index_last_bundle: u32,
pub last_bundle_in_repo: u32,
}
Use Cases:
- CLI:
plcbundle index verify(standard check) - CLI:
plcbundle index verify --full(full rebuild and compare) - Server: Startup integrity check (call with
full=falseand checkmissing_base_shards/missing_delta_segments)
Repair Index#
pub fn repair_did_index<F>(
&self,
num_threads: usize,
flush_interval: u32,
progress_callback: Option<F>,
) -> Result<did_index::RepairResult>
where
F: Fn(u32, u32, u64, u64) + Send + Sync, // (current, total, bytes_processed, total_bytes)
Purpose: Intelligently repair the DID index by:
- Rebuilding missing delta segments (if base shards are intact)
- Performing incremental update (if < 1000 bundles behind)
- Performing full rebuild (if > 1000 bundles behind or base shards corrupted)
- Compacting delta segments (if > 50 segments)
Parameters:
num_threads: Number of threads (0 = auto-detect)flush_interval: Flush to disk every N bundles (0 = only at end)progress_callback: Optional progress callback
Result:
pub struct RepairResult {
pub repaired: bool,
pub compacted: bool,
pub bundles_processed: u32,
pub segments_rebuilt: usize,
}
Use Cases:
- CLI:
plcbundle index repair - Maintenance: Fix corrupted or incomplete index
- Recovery: Restore index after data loss
Index Statistics#
pub fn get_did_index_stats(&self) -> HashMap<String, serde_json::Value>
Purpose: Get comprehensive DID index statistics.
Returns: JSON-compatible map with fields like:
exists: Whether index existstotal_dids: Total number of unique DIDslast_bundle: Last bundle indexeddelta_segments: Number of delta segmentsshard_count: Number of shards (should be 256)
Use Cases:
- CLI:
plcbundle index status - Server: Status endpoint
- Monitoring: Health checks
10. Server API Methods#
The following methods are specifically designed for use by the HTTP server component. They provide streaming capabilities, cursor tracking, and resolver functionality.
Streaming Bundle Data#
/// Stream bundle raw (compressed) data
pub fn stream_bundle_raw(&self, bundle_num: u32) -> Result<std::fs::File>
/// Stream bundle decompressed (JSONL) data
pub fn stream_bundle_decompressed(&self, bundle_num: u32) -> Result<Box<dyn std::io::Read + Send>>
Purpose: Efficient streaming of bundle data for HTTP responses.
Use Cases:
- Server:
/data/:numberendpoint (raw compressed) - Server:
/jsonl/:numberendpoint (decompressed JSONL) - WebSocket: Streaming operations to clients
Cursor & Position Tracking#
/// Get current cursor (global position of last operation)
/// Cursor = (last_bundle * 10000) + mempool_ops_count
pub fn get_current_cursor(&self) -> u64
Purpose: Track the global position in the operation stream for WebSocket streaming.
Use Cases:
- WebSocket:
/ws?cursor=Nto resume from a specific position - Server: Status endpoint showing current position
Global Position Mapping#
- Positions are 0-indexed per bundle (
0..9,999). - Global position formula:
global = ((bundle - 1) × 10,000) + position. - Conversion back:
bundle = floor(global / 10,000) + 1,position = global % 10,000. - Examples:
global 0 → bundle 1, position 0global 10000 → bundle 2, position 0global 88410345 → bundle 8842, position 345
- Shorthand: Small numbers
< 10000are treated asbundle 1positions.
Handle & DID Resolution#
/// Resolve handle to DID or validate DID format
/// Returns (did, handle_resolve_time_ms)
pub fn resolve_handle_or_did(&self, input: &str) -> Result<(String, u64)>
/// Get handle resolver base URL (if configured)
pub fn get_handle_resolver_base_url(&self) -> Option<String>
/// Get resolver statistics
pub fn get_resolver_stats(&self) -> HashMap<String, serde_json::Value>
Purpose: Support handle-to-DID resolution and resolver metrics.
Use Cases:
- Server:
/:didendpoints (accepts both DIDs and handles) - Server:
/statusendpoint (resolver stats) - Server:
/debug/resolverendpoint
DID Index Access#
/// Get DID index manager (for stats and direct access)
pub fn get_did_index(&self) -> Arc<RwLock<did_index::Manager>>
Purpose: Direct access to DID index for server endpoints.
Use Cases:
- Server:
/debug/didindexendpoint - Server:
/statusendpoint (DID index stats)
PLC Origin#
/// Get PLC origin from index
pub fn get_plc_origin(&self) -> String
Purpose: Get the origin identifier for the repository.
Use Cases:
- Server: Root endpoint (display origin)
- Server:
/statusendpoint
11. Observability#
Manager Statistics#
pub fn get_stats(&self) -> ManagerStats
Purpose: Get runtime statistics for monitoring.
pub struct ManagerStats {
pub cache_hits: u64,
pub cache_misses: u64,
pub bundles_loaded: u64,
pub operations_read: u64,
pub bytes_read: u64,
}
Common Types#
Operation Filter#
pub struct OperationFilter {
pub did: Option<String>,
pub operation_type: Option<String>,
pub time_range: Option<(String, String)>,
pub include_nullified: bool,
}
Used across query, export, and range operations.
Operation#
pub struct Operation {
pub did: String,
pub operation: serde_json::Value,
pub cid: Option<String>,
pub nullified: bool,
pub created_at: String,
// Optional: raw_json field for zero-copy access
}
CLI Usage Examples#
All CLI commands use only the high-level API:
# Uses: get_operation()
plcbundle op get 42 1337
# Uses: query()
plcbundle query "did" --bundles 1-100
# Uses: export_to_writer()
plcbundle export --range 1-100 -o output.jsonl
# Uses: get_bundle_info()
plcbundle info --bundle 42
# Uses: verify_bundle()
plcbundle verify --bundle 42 --checksums
# Uses: get_did_operations()
plcbundle did-lookup did:plc:xyz123
FFI Mapping#
The high-level API maps cleanly to C/Go:
C Bindings#
// Direct 1:1 mapping
// Note: C bindings use default options (equivalent to passing () in Rust)
CBundleManager* bundle_manager_new(const char* path);
int bundle_manager_load_bundle(CBundleManager* mgr, uint32_t num, CLoadOptions* opts, CLoadResult* result);
int bundle_manager_get_operation(CBundleManager* mgr, uint32_t bundle, size_t pos, COperation* out);
int bundle_manager_export(CBundleManager* mgr, CExportSpec* spec, ExportCallback cb, void* user_data);
Go Bindings#
type BundleManager struct { /* ... */ }
func (m *BundleManager) LoadBundle(num uint32, opts LoadOptions) (*LoadResult, error)
func (m *BundleManager) GetOperation(bundle uint32, position int, opts OperationOptions) (*Operation, error)
func (m *BundleManager) Query(spec QuerySpec) (*QueryIterator, error)
func (m *BundleManager) Export(spec ExportSpec, opts ExportOptions) (*ExportStats, error)
Implementation Status#
✅ Implemented#
load_bundle()get_bundle_info()query()export()/export_to_writer()verify_bundle()verify_chain()get_stats()build_did_index()verify_did_index()repair_did_index()
🚧 Needs Refactoring#
get_operation()- Currently in CLI, should be inBundleManagerget_operations_batch()- Not yet implementedget_operations_range()- Partially implemented
📋 To Implement#
get_did_operations()- Uses DID indexbatch_resolve_dids()- Batch DID lookupprefetch_bundles()- Cache warmingwarm_up()- Intelligent prefetching
Migration Plan#
Phase 1: Move Operation Access to Manager ✅ PRIORITY#
// Add to BundleManager
impl BundleManager {
/// Get a single operation efficiently (reads only the needed line)
pub fn get_operation(&self, bundle_num: u32, position: usize) -> Result<OperationData>
/// Get operation with raw JSON (zero-copy)
pub fn get_operation_raw(&self, bundle_num: u32, position: usize) -> Result<String>
}
pub struct OperationData {
pub raw_json: String, // Original JSON from file
pub parsed: Option<Operation>, // Lazy parsing
}
Phase 2: Update CLI to Use API#
// Before (direct file access):
let file = std::fs::File::open(bundle_path)?;
let decoder = zstd::Decoder::new(file)?;
// ... manual line reading
// After (high-level API):
let op_data = manager.get_operation_raw(bundle_num, position)?;
println!("{}", op_data);
Phase 3: Add to FFI/Go#
Once in BundleManager, automatically available to FFI:
int bundle_manager_get_operation_raw(
CBundleManager* mgr,
uint32_t bundle_num,
size_t position,
char** out_json,
size_t* out_len
);
func (m *BundleManager) GetOperationRaw(bundle uint32, position int) (string, error)
Design Benefits#
- Single Source of Truth: All functionality in
BundleManager - Testable: High-level API is easy to unit test
- Consistent: Same patterns across Rust, C, and Go
- Maintainable: Changes in one place affect all consumers
- Efficient: API designed for performance (streaming, lazy parsing)
- Safe: No direct file access by consumers
Next Steps#
- ✅ Add
get_operation()/get_operation_raw()toBundleManager - ✅ Update CLI
op getto use new API - ✅ Add FFI bindings for operation access
- ⏭️ Implement
get_operations_batch() - ⏭️ Implement DID lookup operations
- ⏭️ Add cache warming APIs