- Add --skip-memory CLI option
- Add ?inode and ?name parameters to run_all_disk
- Update run_optims.sh to run both disk and memory variants
- Generate separate chart_optims_disk and chart_optims_memory SVGs
- Show disk optimizations before memory in README
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace orange (#f28e2b) and red (#e15759) with brown (#9c755f) and
purple (#b07aa1) so optim variants don't share colors with Irmin-Lwt
and Irmin-Eio in other charts.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Allow regenerating per-optimization benchmarks via run_optims.sh,
which runs 5 variants (baseline, +inline, +cache, +inode, +all) on
the memory backend and merges results into irmini_optims.json.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Recover benchmark data from previous commits (baseline, +inline, +cache,
+inode, +all) into irmini_optims.json. Add "optims" chart category to
gen_chart_all.py. Update README with optimization breakdown showing
inodes as the biggest single optimization (220x commits, 21x reads).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Regenerate all 3 charts (disk, memory, git) with irmini-git data.
Reorder README: disk first, then memory, then git. Add irmini-git
analysis — 3.6x faster than Irmin on git commits, 100% compatible.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add run_all_git with inode:false for full git compatibility
- Expose git_backend in git_interop.mli
- Add --skip-git option to bench runner
- Rewrite gen_chart_all.py: 3 charts (disk, memory, git), ordered
by family (irmin-lwt, irmin-eio, irmini), explicit color map,
dynamic legend layout
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tree.hash and Store.commit now accept ?inode:bool (default true).
When inode:false, large tree nodes are always written as flat nodes
instead of being split into inode tries. Existing inode nodes are
expanded back to flat format. This ensures 100% git-compatible output
without limiting tree size.
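A minimal sketch of the decision this parameter controls (names and types hypothetical, simplified from the actual Tree/Store signatures): with inode:false, the writer always emits a flat node, even above the usual 32-entry threshold.

```ocaml
(* Sketch only: when [inode] is false, always emit a flat,
   git-compatible node instead of splitting into an inode trie. *)
let write_node ?(inode = true) entries =
  if inode && List.length entries > 32
  then `Inode_trie entries   (* split into a 32-way trie *)
  else `Flat entries         (* flat node, readable by native git *)

let () =
  let big = List.init 100 (fun i -> (string_of_int i, "v")) in
  assert (write_node ~inode:false big = `Flat big);
  assert (write_node big = `Inode_trie big)
```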
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix bench-lwt adapters to use Irmin.Generic_key.KV (for irmin-pack
compatibility) and fully Lwt-native scenarios (no nested Lwt_main.run).
Add irmin-lwt benchmark results for memory, pack, fs, and git backends.
Update README with three-chart organization (memory, disk, git) and
comparative analysis across Irmin-Lwt, Irmin-Eio, and Irmini.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add bench-lwt/ adapters for irmin-lwt (main branch) with memory, fs,
git, and pack backends. Update bench-eio/ to include irmin-fs and
irmin-git backends alongside memory and pack. Add JSON output (--json)
to all benchmark runners for machine-readable results.
New orchestration script (bench/run_all.sh) coordinates benchmarks
across all implementations: irmin-lwt, irmin-eio, irmini-thomas, and
irmini with their respective backends.
New chart generator (bench/gen_chart_all.py) produces separate SVG
charts grouped by backend type: memory, disk (fs/pack/lavyek), and git.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- CLI git backend passes inline_threshold:0 so tree nodes use pure
git format, readable by native git tools.
- git_interop rejects internal formats (\x01 inlined, \x02 inode)
with a clear error instead of silently corrupting data.
- Fix git.t cram test: use git init -b main for consistent branch name.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Thread inline_threshold parameter through Store.commit and all
benchmark scenarios, allowing experimentation with different
inlining thresholds from the command line.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three related bugs:
- Proof hash reconstruction used `Contents hash` for all values, but
small values were stored as `Contents_inlined data`, causing hash
mismatch on verify.
- MST codec had inline_threshold=48 but doesn't support inlining
(Contents_inlined is silently converted to a CID hash), causing
data loss. Set to 0.
- Blinded nodes test used 1-byte values that get inlined and thus
cannot be blinded individually. Use 60-byte values instead.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The legend rectangles now use bar_fill() (cross-hatch for +all, dots
for +inode, etc.) so they match the actual bar appearance.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Reads now reach 1.48M ops/s (memory) and 1.30M ops/s (lavyek),
matching Irmin-Eio's 1.3M. Key observations and charts updated.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the assoc list with a Hashtbl for the per-node read cache.
With 1000 entries under a single node, List.assoc_opt was O(n) per
lookup. This change brings reads from 129k to 1.48M ops/s (11.5×),
now faster than Irmin-Eio's 1.3M.
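The cost difference is easy to see in isolation (illustrative sketch, not the actual cache module): List.assoc_opt scans the list linearly, while Hashtbl.find_opt is amortized O(1).

```ocaml
(* 1000 entries, mirroring the benchmark's single-node layout. *)
let n = 1000
let assoc = List.init n (fun i -> (string_of_int i, i))
let tbl = Hashtbl.create n
let () = List.iter (fun (k, v) -> Hashtbl.replace tbl k v) assoc

(* Same answer for the worst-case key, very different cost. *)
let () =
  assert (List.assoc_opt "999" assoc = Some 999);   (* O(n) scan *)
  assert (Hashtbl.find_opt tbl "999" = Some 999)    (* O(1) lookup *)
```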
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Inode reads: 25k → 205k ops/s (8.3×) from caching resolved children.
All optimizations: commits 272k, reads 253k, incremental 15k ops/s.
Irmini now 1.7× faster than Irmin-Eio on commits.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When navigate resolves a child from the backend (inode lookup +
deserialization), store it in a per-node `resolved` cache. Subsequent
reads to the same path segment find the child directly without
re-reading from the backend.
Reads (inodes, no cache/inlining): 25k → 205k ops/s (8.3×).
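The underlying pattern is plain memoization (names hypothetical, simplified from the real navigate): resolve a segment from the backend once, then serve repeat lookups from a per-node table.

```ocaml
(* Per-node cache of resolved children, keyed by path segment. *)
let resolved : (string, string) Hashtbl.t = Hashtbl.create 16

let navigate ~load segment =
  match Hashtbl.find_opt resolved segment with
  | Some child -> child                 (* cache hit: no backend read *)
  | None ->
    let child = load segment in        (* inode lookup + deserialize *)
    Hashtbl.add resolved segment child;
    child

let () =
  let calls = ref 0 in
  let load s = incr calls; "child-of-" ^ s in
  assert (navigate ~load "a" = "child-of-a");
  assert (navigate ~load "a" = "child-of-a");
  assert (!calls = 1)   (* second read never touched the backend *)
```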
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New sections: Irmini+inode (memory, 100-byte) and Irmini+all (memory +
lavyek, 30-byte with inodes + cache + inlining). Key results:
- Inodes alone: 519 → 114k commits/s (220×)
- All optimizations: 244k commits/s, surpassing Irmin-Eio (158k)
- Visual separator between Irmini and Irmin groups in charts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Hashtbl.hash returns 30 usable bits. At depth >= 6, all names map to
bucket 0, causing unbounded recursion in write_entries. Cap trie depth
at 30/5 = 6 levels and fall back to flat nodes beyond that limit.
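The cap follows directly from the hash width: each 32-way level consumes 5 bits, and Hashtbl.hash yields values in [0, 2^30), so after 6 levels every shift produces 0. A small sketch of the bucket computation:

```ocaml
let bits_per_level = 5   (* log2 32 *)
let usable_bits = 30     (* Hashtbl.hash range: [0, 2^30) *)
let max_depth = usable_bits / bits_per_level

let bucket name depth =
  (Hashtbl.hash name lsr (depth * bits_per_level)) land 31

let () =
  assert (max_depth = 6);
  (* past the cap, every name lands in bucket 0 -> fall back to flat *)
  assert (bucket "any-name" 6 = 0)
```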
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Git.Tree.add does not replace entries with the same name — it adds
duplicates. This caused inode buckets to grow without bound after
repeated updates (visible as a hang after exactly max_entries commits).
Fix: call Git.Tree.remove before Git.Tree.add for Node and Contents
entries, matching the existing behavior for Contents_inlined.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tree nodes with more than 32 entries are now stored as inode tries
instead of flat nodes. This enables structural sharing:
- Write path: large nodes are split into 32-way tries. When the base
is already an inode, only the affected buckets are re-written
(incremental update via Inode.update).
- Read path: resolve_entry/resolve_entries detect inode format
transparently. Lookups only load the relevant bucket.
Also extracts node_record as a named record type (required for passing
to helper functions) and stores the backend reference in the record
(preserved across state transitions).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entries are distributed across a 32-way trie (HAMT) based on the hash
of their name. Leaf nodes hold at most 32 entries as regular flat nodes.
Internal routing nodes use a compact serialization with a \x02 marker.
Operations:
- write: build inode trie from entry list
- find: traverse only the relevant bucket (O(log n))
- list_all: collect all entries across buckets
- update: incremental modification of affected buckets only
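The find path above can be sketched with simplified, hypothetical types: only one bucket per level is visited, which is what makes lookups O(log32 n).

```ocaml
type 'a inode =
  | Leaf of (string * 'a) list          (* <= 32 entries, flat node *)
  | Branch of 'a inode option array     (* 32-way routing node *)

(* 5 hash bits per level select the bucket. *)
let bucket name depth = (Hashtbl.hash name lsr (depth * 5)) land 31

let rec find t depth name =
  match t with
  | Leaf entries -> List.assoc_opt name entries
  | Branch buckets ->
    (match buckets.(bucket name depth) with
     | None -> None
     | Some child -> find child (depth + 1) name)

let () =
  let b = Array.make 32 None in
  b.(bucket "k" 0) <- Some (Leaf [ ("k", 1) ]);
  assert (find (Branch b) 0 "k" = Some 1);
  assert (find (Branch b) 0 "missing" = None)
```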
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Needed by the inode module to deserialize hash values from raw bytes
in inode tree node payloads.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cache with 100k entries improves reads significantly:
- disk: 4k → 14k ops/s (3.4×)
- memory: 9.5k → 13.4k (1.4×)
- lavyek: 8.5k → 12.6k (1.5×)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Runners accept an optional ~cache parameter that wraps the backend
with Backend.cached. The main CLI exposes --cache N to set the
capacity. Names include "+cache" suffix when cache is active.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Store.create now accepts ?cache:int to wrap the backend with an LRU
cache of the given capacity. Default: no cache (unchanged behavior).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace naive list-based cache with O(1) hashtable+linked-list LRU
- Increase default capacity from 1000 to 100000 entries
- Populate cache on write and write_batch (not just on read)
- Make capacity configurable via optional parameter
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Hashtable for lookups + doubly-linked list for LRU ordering.
Replaces the naive O(n) list-based implementation.
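An illustrative O(1) LRU with the same shape (not the repo's actual module): a Hashtbl maps keys to nodes of a doubly-linked list that records recency, so get, put, and eviction are all constant time.

```ocaml
type ('k, 'v) node = {
  key : 'k;
  mutable value : 'v;
  mutable prev : ('k, 'v) node option;
  mutable next : ('k, 'v) node option;
}

type ('k, 'v) lru = {
  tbl : ('k, ('k, 'v) node) Hashtbl.t;
  capacity : int;
  mutable head : ('k, 'v) node option;  (* most recently used *)
  mutable tail : ('k, 'v) node option;  (* least recently used *)
}

let create capacity =
  { tbl = Hashtbl.create capacity; capacity; head = None; tail = None }

let unlink t n =
  (match n.prev with Some p -> p.next <- n.next | None -> t.head <- n.next);
  (match n.next with Some s -> s.prev <- n.prev | None -> t.tail <- n.prev);
  n.prev <- None;
  n.next <- None

let push_front t n =
  n.next <- t.head;
  (match t.head with Some h -> h.prev <- Some n | None -> t.tail <- Some n);
  t.head <- Some n

let get t k =
  match Hashtbl.find_opt t.tbl k with
  | None -> None
  | Some n -> unlink t n; push_front t n; Some n.value

let put t k v =
  match Hashtbl.find_opt t.tbl k with
  | Some n -> n.value <- v; unlink t n; push_front t n
  | None ->
    if Hashtbl.length t.tbl >= t.capacity then
      (match t.tail with
       | Some lru -> unlink t lru; Hashtbl.remove t.tbl lru.key
       | None -> ());
    let n = { key = k; value = v; prev = None; next = None } in
    Hashtbl.replace t.tbl k n;
    push_front t n

let () =
  let c = create 2 in
  put c "a" 1; put c "b" 2;
  ignore (get c "a");    (* touch "a"; "b" becomes least recent *)
  put c "c" 3;           (* capacity reached: evicts "b" *)
  assert (get c "b" = None);
  assert (get c "a" = Some 1);
  assert (get c "c" = Some 3)
```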
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
100-byte for base, 30-byte for +inline, 10 KiB for large-values.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Refactor gen_chart.py to generate both linear and log-scale SVG charts.
The log scale makes it easier to compare backends with very different
magnitudes (e.g. concurrent scenario: 263 vs 447k ops/s).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
gen_chart.py now generates bench_chart_<timestamp>.svg, removes old
versions, and updates the README reference automatically.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The previous benchmarks had inline_threshold=0 in the codec, meaning

inlining was never active. Setting it to 48 and using 30-byte values
shows the real impact:
- Memory commits: 539 → 127k ops/s (235× faster)
- Lavyek commits: 480 → 114k ops/s (238× faster)
- Disk commits: 445 → 4.6k ops/s (10× faster)
- Memory reads: 9.5k → 19.5k ops/s (2× faster)
The speedup comes from avoiding separate content-addressable store
writes — inlined contents are stored directly in tree nodes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Each scenario now has its own independent Y axis, making it easier to
compare backends within the same scenario without log distortion.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Small but consistent improvement: memory commits +6%, disk reads +11%,
lavyek commits +6%. The gains are modest because the bottleneck is full
tree re-serialization, not node encoding.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Generated with gen_chart.py, grouped by scenario with log scale.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
irmin-git uses the Git object format (zlib + SHA-1 loose objects) via
the lwt_eio bridge. Results show it is the slowest Irmin backend:
commits at ~2.2k ops/s (17x slower than irmin-fs), reads at ~145k ops/s,
large-values at ~1.6k ops/s. Memory usage is ~552-682 MiB.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
irmin-fs stores one file per object on disk. Results: commits 37k ops/s
(vs irmin-pack 50k), reads 208k ops/s (vs 1.3M for irmin-pack — no LRU),
incremental very slow at 193 ops/s, large-values 2.6k ops/s.
Concurrent scenario skipped (Queue.Empty bug in irmin-fs Eio pool).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The comparison with Lavyek is not apples-to-apples: irmini/Lavyek
does raw backend read/write, while irmin-pack goes through the full
Irmin stack (tree, inodes, pack file, index). Also irmin-pack
serializes all writes behind a single mutex on the append-only pack.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
100 fibers across 12 domains, each writing to its own branch to avoid
CAS contention, reading from main. irmin-pack: ~1.5-1.7k ops/s,
~6× faster than irmini disk but far behind Lavyek (447k ops/s).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Run irmin-pack benchmarks on both `eio` and
`cuihtlauac-inline-small-objects-v2` branches. Inlining gives ~25%
boost on irmin-pack commits; reads equally fast thanks to LRU cache.
Fix bench_irmin_pack.ml syntax (split KV functor application).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add results from official Irmin on both the eio branch and the
cuihtlauac-inline-small-objects-v2 branch (Eio + small object inlining).
All runs use the same configuration (50 commits, 500 adds, depth 10).
Irmin-Eio is ~300× faster on commits and ~160× on reads thanks to
inode-based structural sharing. Inlining has marginal impact on these
benchmarks (100-byte values, in-memory backend).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Document scenarios, CLI options, file layout, and reference results
from a 50-commit / 500-adds run on a 12-core machine. Highlight
Lavyek's 1700× advantage on concurrent workloads.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Test codec round-trip with inlined entries, and tree write/read with
inlining enabled (both flat and nested paths).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extend Codec.S with a `Contents_inlined of string` entry variant that
allows small content values to be stored directly in parent tree nodes
instead of as separate content-addressed blobs. This reduces backend
lookups for small values.
Key changes:
- Codec.S: new `entry` type with `Contents_inlined`, `inline_threshold`
- Codec.Git: wrapped node type supporting both standard Git tree entries
and inlined entries, with versioned serialization (v0 backward compat,
v1 with inlined data)
- Tree: `hash` takes optional `~inline_threshold` parameter; contents
at or below the threshold are embedded in the parent node
- Proof: handles inlined entries in find/list/add/build_proof_tree
The default threshold is 0 (no inlining), preserving backward
compatibility. Callers opt in by passing ~inline_threshold:48 to
Tree.hash. Git interop backends should use the default (0) since
Git's object store cannot represent the extended format.
Inspired by Irmin's cuihtlauac-inline-small-objects-v2 branch.
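The threshold decision can be sketched as follows (the Contents and Contents_inlined variant names come from this commit; the surrounding codec types and the entry_of_value helper are simplified/hypothetical):

```ocaml
type entry =
  | Contents of string          (* hash of a separate blob *)
  | Contents_inlined of string  (* raw bytes embedded in the parent node *)

(* Values at or below the threshold are inlined; threshold 0 (the
   default) disables inlining entirely, keeping old behavior. *)
let entry_of_value ~inline_threshold ~hash value =
  if inline_threshold > 0 && String.length value <= inline_threshold
  then Contents_inlined value
  else Contents (hash value)

let () =
  let hash v = string_of_int (Hashtbl.hash v) in
  (* default threshold 0: never inline, backward compatible *)
  assert (entry_of_value ~inline_threshold:0 ~hash "tiny"
          = Contents (hash "tiny"));
  (* opt-in at 48 bytes: small values are embedded *)
  assert (entry_of_value ~inline_threshold:48 ~hash "tiny"
          = Contents_inlined "tiny")
```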
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
100 fibers distributed round-robin across available domains (12 on
this machine). Shows ~2190x throughput advantage for Lavyek over
the mutex-based disk backend under high concurrency.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Spawns N fibers on separate domains doing parallel reads+writes
directly on the backend. Shows Lavyek's lock-free advantage
(~2000x faster than disk backend under 8-fiber contention).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Irmin workspace ignores directories starting with _. Use bench-irmini/
as the temporary directory name and generate the dune+main files inline.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use Lavyek.create (fresh) instead of open_out, and give each scenario
its own subdirectory to avoid WAL replay issues between runs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comparable multi-scenario benchmarks for official Irmin (eio branch):
- Irmin_mem in-memory backend
- Irmin-pack persistent backend
Same 4 scenarios: commits, reads, incremental, large-values.
Includes run.sh script to run both irmini and Irmin-Eio benchmarks.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Benchmark suite with 4 scenarios (commits, reads, incremental, large-values)
across 3 backends (memory, disk, lavyek). Includes comparison table output.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cmdliner 1.3.0 shadows exit with a deprecated binding.
Replace all bare exit calls with Stdlib.exit.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Convert all packages from:
(source (uri https://tangled.org/handle/repo))
to:
(source (tangled handle/repo))
This uses dune 3.21's native tangled support for cleaner source
declarations. Also removes redundant homepage/bug_reports fields
that are auto-generated from tangled sources.
- Rename directory: ocaml-build-info -> monopam-info
- Rename module: Mono_info -> Monopam_info
- Rename package: ocaml-build-info -> monopam-info
- Update all consumers to use new module name
- Remove "Skipping pull" log noise from push output
Rename ocaml-version to ocaml-build-info with Mono_info module.
All homebrew binaries now use Mono_info.version for consistent
version reporting across the monorepo.
The library wraps dune-build-info and falls back to git hash
in dev mode. Can be extended with SBOM or other metadata later.