commits
The Windows tests have been quite flaky lately,
`test_concurrent_read_write_commit` can run for 5+ hours. See e.g.
https://github.com/martinvonz/jj/actions/runs/6964394504/job/18951562247
This is not the most elegant solution, but fixing the test seems
trickier.
Normally, it takes ~3 minutes, but the timeout is 15 just to be safe.
PR #2627 checks that failed tests are not retried.
Example of this working (Windows tests pass after 2 attempts):
https://github.com/martinvonz/jj/actions/runs/6974955234/job/18981391917#step:5:1510
Additional bonus is that this allows us to avoid creating keep refs for the
intermediate commits in the data race preventing loop.
This allows, for example, creating a merge commit with `jj new a b --no-edit -m Merge`, without
affecting the working copy.
Repeating these is a no-op. This allows:
```shell
jj new -r a -r b # Equivalent to jj new a b
jj new --before a --before b # Equivalent to jj new a b --before
```
I keep typing the latter and getting an annoying error.
While it got faster to build a large BTreeMap<RepoPath, _>, there's still
a measurable cost. Let's eliminate it if watchman is enabled and the working
copy is clean. Perhaps, we should introduce new serialization format that
supports instant loading and lookup, but this hack works for the moment.
I'm not sure if the new tree_state format should be flat (RepoPath, _) list,
or tree like the backend storage btw.
In my "linux" repo (watchman enabled):
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1 \
"target/release-with-debug/{bin} -R ~/mirrors/linux status"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status
Time (mean ± σ): 768.9 ms ± 14.2 ms [User: 630.7 ms, System: 131.2 ms]
Range (min … max): 742.3 ms … 783.1 ms 10 runs
Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status
Time (mean ± σ): 713.0 ms ± 16.8 ms [User: 587.9 ms, System: 116.2 ms]
Range (min … max): 681.5 ms … 731.1 ms 10 runs
Relative speed comparison
1.08 ± 0.03 target/release-with-debug/jj-0 -R ~/mirrors/linux status
1.00 target/release-with-debug/jj-1 -R ~/mirrors/linux status
Just minimizing the changes in the next commit. As we already have
file_states_from_proto(), it makes sense to extract the "to" function.
This ensures that the root fsmonitor_matcher matches nothing if there are no
working-copy changes. The query result can be observed by "jj debug watchman
query-changed-files".
I don't have expertise on watchman query language, but using the watchman API
is probably better than .filter()-ing the result manually.
Follows up on 13c93d52705.
The information I added is explained in that commit's description, but
I feel like it could reduce confusion for future readers of the code.
If I understand it, watchman returns changed files and directories, and a
directory change doesn't mean we need to scan all files under the directory.
We only need &self.working_copy_path here.
Since the caller wants repo-relative paths, it doesn't make sense to convert
them back and forth.
Since this is the error to spawn (or wait) process, command arguments aren't
important. Let's make that clear by not showing full command string.
#2614
Bumps the github-dependencies group with 1 update: [DeterminateSystems/nix-installer-action](https://github.com/determinatesystems/nix-installer-action).
- [Release notes](https://github.com/determinatesystems/nix-installer-action/releases)
- [Commits](https://github.com/determinatesystems/nix-installer-action/compare/5620eb4af6b562c53e4d4628c0b6e4f9d9ae8612...07b8bcba1b22d847db7ee507180c33e115499665)
---
updated-dependencies:
- dependency-name: DeterminateSystems/nix-installer-action
dependency-type: direct:production
update-type: version-update:semver-major
dependency-group: github-dependencies
...
Signed-off-by: dependabot[bot] <support@github.com>
Bumps the cargo-dependencies group with 1 update: [config](https://github.com/mehcode/config-rs).
- [Changelog](https://github.com/mehcode/config-rs/blob/v0.13.4/CHANGELOG.md)
- [Commits](https://github.com/mehcode/config-rs/compare/0.13.3...v0.13.4)
---
updated-dependencies:
- dependency-name: config
dependency-type: direct:production
update-type: version-update:semver-patch
dependency-group: cargo-dependencies
...
Signed-off-by: dependabot[bot] <support@github.com>
Bumps the cargo-dependencies group with 1 update: [serde](https://github.com/serde-rs/serde).
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.192...v1.0.193)
---
updated-dependencies:
- dependency-name: serde
dependency-type: direct:production
update-type: version-update:semver-patch
dependency-group: cargo-dependencies
...
Signed-off-by: dependabot[bot] <support@github.com>
No callers pass in a legacy tree.
Bumps the cargo-dependencies group with 2 updates: [rustix](https://github.com/bytecodealliance/rustix) and [test-case](https://github.com/frondeus/test-case).
Updates `rustix` from 0.38.24 to 0.38.25
- [Release notes](https://github.com/bytecodealliance/rustix/releases)
- [Commits](https://github.com/bytecodealliance/rustix/compare/v0.38.24...v0.38.25)
Updates `test-case` from 3.3.0 to 3.3.1
- [Release notes](https://github.com/frondeus/test-case/releases)
- [Changelog](https://github.com/frondeus/test-case/blob/master/CHANGELOG.md)
- [Commits](https://github.com/frondeus/test-case/compare/v3.3.0...v3.3.1)
---
updated-dependencies:
- dependency-name: rustix
dependency-type: direct:production
update-type: version-update:semver-patch
dependency-group: cargo-dependencies
- dependency-name: test-case
dependency-type: direct:production
update-type: version-update:semver-patch
dependency-group: cargo-dependencies
...
Signed-off-by: dependabot[bot] <support@github.com>
The abstraction is no longer useful since we made the types not
self-referential.
This is similar to the following test:
https://github.com/martinvonz/jj/blob/5186066cf5064564466e58096b79fba6b0c0ecfd/cli/tests/test_rebase_command.rs#L264-L269
from https://github.com/martinvonz/jj/pull/538.
This was fixed in 7c8f66ab0, I believe.
Suppose the input list is presorted, sorting a sorted vec would be cheaper
than .insert()-ing sorted items one by one.
In my "linux" repo (watchman eanbled):
- jj-0: baseline
- jj-1: previous (don't randomize by HashMap)
- jj-2: this
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1,jj-2 \
"target/release-with-debug/{bin} -R ~/mirrors/linux status"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status
Time (mean ± σ): 1.034 s ± 0.020 s [User: 0.881 s, System: 0.212 s]
Range (min … max): 1.011 s … 1.068 s 10 runs
Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status
Time (mean ± σ): 849.3 ms ± 13.8 ms [User: 710.7 ms, System: 199.3 ms]
Range (min … max): 821.7 ms … 870.2 ms 10 runs
Benchmark 3: target/release-with-debug/jj-2 -R ~/mirrors/linux status
Time (mean ± σ): 786.2 ms ± 16.7 ms [User: 650.7 ms, System: 204.1 ms]
Range (min … max): 760.8 ms … 805.2 ms 10 runs
Relative speed comparison
1.32 ± 0.04 target/release-with-debug/jj-0 -R ~/mirrors/linux status
1.08 ± 0.03 target/release-with-debug/jj-1 -R ~/mirrors/linux status
1.00 target/release-with-debug/jj-2 -R ~/mirrors/linux status
According to the doc, this is compatible with the map syntax.
https://protobuf.dev/programming-guides/proto3/#maps
This change means that the serialized file states are sorted by RepoPath,
so BTreeMap<RepoPath, _> can be reconstructed with fewer cache misses.
In my "linux" repo (watchman enabled):
- jj-0: baseline
- jj-1: this
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1,jj-2 \
"target/release-with-debug/{bin} -R ~/mirrors/linux status"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status
Time (mean ± σ): 1.034 s ± 0.020 s [User: 0.881 s, System: 0.212 s]
Range (min … max): 1.011 s … 1.068 s 10 runs
Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status
Time (mean ± σ): 849.3 ms ± 13.8 ms [User: 710.7 ms, System: 199.3 ms]
Range (min … max): 821.7 ms … 870.2 ms 10 runs
Relative speed comparison
1.32 ± 0.04 target/release-with-debug/jj-0 -R ~/mirrors/linux status
1.08 ± 0.03 target/release-with-debug/jj-1 -R ~/mirrors/linux status
Cache-misses got reduced:
% perf stat -e task-clock,cycles,instructions,cache-references,cache-misses \
-- ./target/release-with-debug/jj-0 -R ~/mirrors/linux --no-pager status
1,091.68 msec task-clock # 1.032 CPUs utilized
4,179,596,978 cycles # 3.829 GHz
6,166,231,489 instructions # 1.48 insn per cycle
134,032,047 cache-references # 122.776 M/sec
29,322,707 cache-misses # 21.88% of all cache refs
1.057474164 seconds time elapsed
0.897042000 seconds user
0.194819000 seconds sys
% perf stat -e task-clock,cycles,instructions,cache-references,cache-misses \
-- ./target/release-with-debug/jj-1 -R ~/mirrors/linux --no-pager status
927.05 msec task-clock # 1.083 CPUs utilized
3,451,299,198 cycles # 3.723 GHz
6,222,418,272 instructions # 1.80 insn per cycle
98,499,363 cache-references # 106.251 M/sec
11,998,523 cache-misses # 12.18% of all cache refs
0.855938336 seconds time elapsed
0.720568000 seconds user
0.207924000 seconds sys
Just a code cleanup. This allows us to consume proto fields if needed.
I also removed redundant .clone() and .as_str().
The `scm-record` library comments say that the `file_mode` is:
> The Unix file mode of the file (before any changes), if available.
This reverts commit ffd6884 and fixes #2591 and #2548.
It is now out of place in mod.rs. It is only used in two places, so
I copied it to each of them.
Follows up on @AntoineCezar 's work.
Before, `jj revert qq` would say that `qq` is an extraneous argument.
Now, it says "Error: No such subcommand: revert".
Bumps the cargo-dependencies group with 1 update: [test-case](https://github.com/frondeus/test-case).
- [Release notes](https://github.com/frondeus/test-case/releases)
- [Changelog](https://github.com/frondeus/test-case/blob/master/CHANGELOG.md)
- [Commits](https://github.com/frondeus/test-case/compare/v3.2.1...v3.3.0)
---
updated-dependencies:
- dependency-name: test-case
dependency-type: direct:production
update-type: version-update:semver-minor
dependency-group: cargo-dependencies
...
Signed-off-by: dependabot[bot] <support@github.com>
RevWalkQueue no longer depend on the RevWalkIndex trait, and the index field
can be removed.
The idea is the same as the heads_pos() change in 9832ee205d76. While
IndexEntry::position() should be cheap, saving 20 bytes per entry appears to
improve the performance in mid-size repos.
In my "linux" repo:
revsets/all()
-------------
baseline 1.24 156.0±1.06ms
this 1.00 126.0±0.51ms
I don't see significant difference in small-sized repos like "jj" or "git".
IndexEntryByPosition isn't removed since it's still used by the revset engine.
This removes the last use of `ouroboros`. Since `TreeEntriesDirItem`
is only used in "legacy trees" (before tree-level conflicts), I didn't
bother to check the performance impact. I also didn't bother to check
the matcher before adding the entries to the list, instead leaving
that where it is in `Iterator::next()`.
This removes the last use of `ouroboros` in `merged_tree.rs`. The set
of conflicts to iterate is usually so small that I didn't bother
checking the performance impact.
This removes another dependency on `ouroboros`, for a small
performance hit:
```
❯ hyperfine --warmup 3 --runs 30 \
'/tmp/jj-before --ignore-working-copy diff -s --from v5.0 --to v6.0' \
'/tmp/jj-after --ignore-working-copy diff -s --from v5.0 --to v6.0'
Benchmark 1: /tmp/jj-before --ignore-working-copy diff -s --from v5.0 --to v6.0
Time (mean ± σ): 689.7 ms ± 23.9 ms [User: 400.0 ms, System: 289.8 ms]
Range (min … max): 666.9 ms … 759.2 ms 30 runs
Benchmark 2: /tmp/jj-after --ignore-working-copy diff -s --from v5.0 --to v6.0
Time (mean ± σ): 710.9 ms ± 19.2 ms [User: 420.4 ms, System: 290.6 ms]
Range (min … max): 688.5 ms … 752.0 ms 30 runs
Summary
'/tmp/jj-before --ignore-working-copy diff -s --from v5.0 --to v6.0' ran
1.03 ± 0.05 times faster than '/tmp/jj-after --ignore-working-copy diff -s --from v5.0 --to v6.0'
```
While importing the `ouroboros` crate and the `aliasable` crate it
depends on, the "unsafe Rust reviewer" expressed some concern that
they contain a lot of unsafe code that's hard to review. We can avoid
the unsafe code altogether by making `TreeEntriesIterator` not
self-refential. Instead, we can collect the matching entries in an
individual tree up front. It does have some performance cost:
```
❯ hyperfine --warmup 3 --runs 30 \
'/tmp/jj-before --ignore-working-copy files -r v6.0' \
'/tmp/jj-after --ignore-working-copy files -r v6.0'
Benchmark 1: /tmp/jj-before --ignore-working-copy files -r v6.0
Time (mean ± σ): 461.4 ms ± 14.3 ms [User: 232.1 ms, System: 229.4 ms]
Range (min … max): 443.4 ms … 496.3 ms 30 runs
Benchmark 2: /tmp/jj-after --ignore-working-copy files -r v6.0
Time (mean ± σ): 482.0 ms ± 14.3 ms [User: 257.2 ms, System: 224.9 ms]
Range (min … max): 461.8 ms … 513.3 ms 30 runs
Summary
'/tmp/jj-before --ignore-working-copy files -r v6.0' ran
1.04 ± 0.04 times faster than '/tmp/jj-after --ignore-working-copy files -r v6.0'
```
I think that's acceptable.
For the same reason as the heads_pos() change. We just want to omit duplicated
items.
I'm not sure if this is better, but common_ancestors_pos() would have a
similar property to heads_pos().
I think this is remainder of nightly shims.
This is much faster (maybe because of better cache locality?) Another option
is to use BTreeSet, but the BinaryHeap version is slightly faster.
"bench revset" result in my linux repo:
revsets/heads(tags())
---------------------
baseline 3.28 560.6±4.01ms
1 2.92 500.0±2.99ms
2 1.98 339.6±1.64ms
3 (this) 1.00 171.2±0.30ms
Apparently, IndexEntry::generation_number() isn't cheap probably because it
involves random access to larger memory region, and the u32 value might not
be aligned. Let's instead store the generation numbers in BinaryHeap.
Also, heads_pos() becomes slightly faster by keeping the BinaryHeap entries
small, so I've removed the IndexEntry at all.
This makes the default log and disambiguation revsets fast, which evaluate
'heads(immutable_heads())'.
"bench revset" result in my linux repo:
revsets/heads(tags())
---------------------
baseline 3.28 560.6±4.01ms
1 2.92 500.0±2.99ms
2 (this) 1.98 339.6±1.64ms
All callers just iterate over the parent entries.
"bench revset" result in my linux repo:
revsets/heads(tags())
---------------------
baseline 3.28 560.6±4.01ms
1 (this) 2.92 500.0±2.99ms
Bumps the cargo-dependencies group with 2 updates: [rustix](https://github.com/bytecodealliance/rustix) and [tracing-subscriber](https://github.com/tokio-rs/tracing).
Updates `rustix` from 0.38.21 to 0.38.24
- [Release notes](https://github.com/bytecodealliance/rustix/releases)
- [Commits](https://github.com/bytecodealliance/rustix/compare/v0.38.21...v0.38.24)
Updates `tracing-subscriber` from 0.3.17 to 0.3.18
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-subscriber-0.3.17...tracing-subscriber-0.3.18)
---
updated-dependencies:
- dependency-name: rustix
dependency-type: direct:production
update-type: version-update:semver-patch
dependency-group: cargo-dependencies
- dependency-name: tracing-subscriber
dependency-type: direct:production
update-type: version-update:semver-patch
dependency-group: cargo-dependencies
...
Signed-off-by: dependabot[bot] <support@github.com>
The Windows tests have been quite flaky lately,
`test_concurrent_read_write_commit` can run for 5+ hours. See e.g.
https://github.com/martinvonz/jj/actions/runs/6964394504/job/18951562247
This is not the most elegant solution, but fixing the test seems
trickier.
Normally, it takes ~3 minutes, but the timeout is 15 just to be safe.
PR #2627 checks that failed tests are not retried.
Example of this working (Windows tests pass after 2 attempts):
https://github.com/martinvonz/jj/actions/runs/6974955234/job/18981391917#step:5:1510
While it got faster to build a large BTreeMap<RepoPath, _>, there's still
a measurable cost. Let's eliminate it if watchman is enabled and the working
copy is clean. Perhaps, we should introduce new serialization format that
supports instant loading and lookup, but this hack works for the moment.
I'm not sure if the new tree_state format should be flat (RepoPath, _) list,
or tree like the backend storage btw.
In my "linux" repo (watchman enabled):
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1 \
"target/release-with-debug/{bin} -R ~/mirrors/linux status"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status
Time (mean ± σ): 768.9 ms ± 14.2 ms [User: 630.7 ms, System: 131.2 ms]
Range (min … max): 742.3 ms … 783.1 ms 10 runs
Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status
Time (mean ± σ): 713.0 ms ± 16.8 ms [User: 587.9 ms, System: 116.2 ms]
Range (min … max): 681.5 ms … 731.1 ms 10 runs
Relative speed comparison
1.08 ± 0.03 target/release-with-debug/jj-0 -R ~/mirrors/linux status
1.00 target/release-with-debug/jj-1 -R ~/mirrors/linux status
This ensures that the root fsmonitor_matcher matches nothing if there are no
working-copy changes. The query result can be observed by "jj debug watchman
query-changed-files".
I don't have expertise on watchman query language, but using the watchman API
is probably better than .filter()-ing the result manually.
Bumps the github-dependencies group with 1 update: [DeterminateSystems/nix-installer-action](https://github.com/determinatesystems/nix-installer-action).
- [Release notes](https://github.com/determinatesystems/nix-installer-action/releases)
- [Commits](https://github.com/determinatesystems/nix-installer-action/compare/5620eb4af6b562c53e4d4628c0b6e4f9d9ae8612...07b8bcba1b22d847db7ee507180c33e115499665)
---
updated-dependencies:
- dependency-name: DeterminateSystems/nix-installer-action
dependency-type: direct:production
update-type: version-update:semver-major
dependency-group: github-dependencies
...
Signed-off-by: dependabot[bot] <support@github.com>
Bumps the cargo-dependencies group with 1 update: [config](https://github.com/mehcode/config-rs).
- [Changelog](https://github.com/mehcode/config-rs/blob/v0.13.4/CHANGELOG.md)
- [Commits](https://github.com/mehcode/config-rs/compare/0.13.3...v0.13.4)
---
updated-dependencies:
- dependency-name: config
dependency-type: direct:production
update-type: version-update:semver-patch
dependency-group: cargo-dependencies
...
Signed-off-by: dependabot[bot] <support@github.com>
Bumps the cargo-dependencies group with 1 update: [serde](https://github.com/serde-rs/serde).
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.192...v1.0.193)
---
updated-dependencies:
- dependency-name: serde
dependency-type: direct:production
update-type: version-update:semver-patch
dependency-group: cargo-dependencies
...
Signed-off-by: dependabot[bot] <support@github.com>
Bumps the cargo-dependencies group with 2 updates: [rustix](https://github.com/bytecodealliance/rustix) and [test-case](https://github.com/frondeus/test-case).
Updates `rustix` from 0.38.24 to 0.38.25
- [Release notes](https://github.com/bytecodealliance/rustix/releases)
- [Commits](https://github.com/bytecodealliance/rustix/compare/v0.38.24...v0.38.25)
Updates `test-case` from 3.3.0 to 3.3.1
- [Release notes](https://github.com/frondeus/test-case/releases)
- [Changelog](https://github.com/frondeus/test-case/blob/master/CHANGELOG.md)
- [Commits](https://github.com/frondeus/test-case/compare/v3.3.0...v3.3.1)
---
updated-dependencies:
- dependency-name: rustix
dependency-type: direct:production
update-type: version-update:semver-patch
dependency-group: cargo-dependencies
- dependency-name: test-case
dependency-type: direct:production
update-type: version-update:semver-patch
dependency-group: cargo-dependencies
...
Signed-off-by: dependabot[bot] <support@github.com>
Suppose the input list is presorted, sorting a sorted vec would be cheaper
than .insert()-ing sorted items one by one.
In my "linux" repo (watchman eanbled):
- jj-0: baseline
- jj-1: previous (don't randomize by HashMap)
- jj-2: this
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1,jj-2 \
"target/release-with-debug/{bin} -R ~/mirrors/linux status"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status
Time (mean ± σ): 1.034 s ± 0.020 s [User: 0.881 s, System: 0.212 s]
Range (min … max): 1.011 s … 1.068 s 10 runs
Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status
Time (mean ± σ): 849.3 ms ± 13.8 ms [User: 710.7 ms, System: 199.3 ms]
Range (min … max): 821.7 ms … 870.2 ms 10 runs
Benchmark 3: target/release-with-debug/jj-2 -R ~/mirrors/linux status
Time (mean ± σ): 786.2 ms ± 16.7 ms [User: 650.7 ms, System: 204.1 ms]
Range (min … max): 760.8 ms … 805.2 ms 10 runs
Relative speed comparison
1.32 ± 0.04 target/release-with-debug/jj-0 -R ~/mirrors/linux status
1.08 ± 0.03 target/release-with-debug/jj-1 -R ~/mirrors/linux status
1.00 target/release-with-debug/jj-2 -R ~/mirrors/linux status
According to the doc, this is compatible with the map syntax.
https://protobuf.dev/programming-guides/proto3/#maps
This change means that the serialized file states are sorted by RepoPath,
so BTreeMap<RepoPath, _> can be reconstructed with fewer cache misses.
In my "linux" repo (watchman enabled):
- jj-0: baseline
- jj-1: this
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1,jj-2 \
"target/release-with-debug/{bin} -R ~/mirrors/linux status"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status
Time (mean ± σ): 1.034 s ± 0.020 s [User: 0.881 s, System: 0.212 s]
Range (min … max): 1.011 s … 1.068 s 10 runs
Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status
Time (mean ± σ): 849.3 ms ± 13.8 ms [User: 710.7 ms, System: 199.3 ms]
Range (min … max): 821.7 ms … 870.2 ms 10 runs
Relative speed comparison
1.32 ± 0.04 target/release-with-debug/jj-0 -R ~/mirrors/linux status
1.08 ± 0.03 target/release-with-debug/jj-1 -R ~/mirrors/linux status
Cache-misses got reduced:
% perf stat -e task-clock,cycles,instructions,cache-references,cache-misses \
-- ./target/release-with-debug/jj-0 -R ~/mirrors/linux --no-pager status
1,091.68 msec task-clock # 1.032 CPUs utilized
4,179,596,978 cycles # 3.829 GHz
6,166,231,489 instructions # 1.48 insn per cycle
134,032,047 cache-references # 122.776 M/sec
29,322,707 cache-misses # 21.88% of all cache refs
1.057474164 seconds time elapsed
0.897042000 seconds user
0.194819000 seconds sys
% perf stat -e task-clock,cycles,instructions,cache-references,cache-misses \
-- ./target/release-with-debug/jj-1 -R ~/mirrors/linux --no-pager status
927.05 msec task-clock # 1.083 CPUs utilized
3,451,299,198 cycles # 3.723 GHz
6,222,418,272 instructions # 1.80 insn per cycle
98,499,363 cache-references # 106.251 M/sec
11,998,523 cache-misses # 12.18% of all cache refs
0.855938336 seconds time elapsed
0.720568000 seconds user
0.207924000 seconds sys
Bumps the cargo-dependencies group with 1 update: [test-case](https://github.com/frondeus/test-case).
- [Release notes](https://github.com/frondeus/test-case/releases)
- [Changelog](https://github.com/frondeus/test-case/blob/master/CHANGELOG.md)
- [Commits](https://github.com/frondeus/test-case/compare/v3.2.1...v3.3.0)
---
updated-dependencies:
- dependency-name: test-case
dependency-type: direct:production
update-type: version-update:semver-minor
dependency-group: cargo-dependencies
...
Signed-off-by: dependabot[bot] <support@github.com>
The idea is the same as the heads_pos() change in 9832ee205d76. While
IndexEntry::position() should be cheap, saving 20 bytes per entry appears to
improve the performance in mid-size repos.
In my "linux" repo:
revsets/all()
-------------
baseline 1.24 156.0±1.06ms
this 1.00 126.0±0.51ms
I don't see significant difference in small-sized repos like "jj" or "git".
IndexEntryByPosition isn't removed since it's still used by the revset engine.
This removes the last use of `ouroboros`. Since `TreeEntriesDirItem`
is only used in "legacy trees" (before tree-level conflicts), I didn't
bother to check the performance impact. I also didn't bother to check
the matcher before adding the entries to the list, instead leaving
that where it is in `Iterator::next()`.
This removes another dependency on `ouroboros`, for a small
performance hit:
```
❯ hyperfine --warmup 3 --runs 30 \
'/tmp/jj-before --ignore-working-copy diff -s --from v5.0 --to v6.0' \
'/tmp/jj-after --ignore-working-copy diff -s --from v5.0 --to v6.0'
Benchmark 1: /tmp/jj-before --ignore-working-copy diff -s --from v5.0 --to v6.0
Time (mean ± σ): 689.7 ms ± 23.9 ms [User: 400.0 ms, System: 289.8 ms]
Range (min … max): 666.9 ms … 759.2 ms 30 runs
Benchmark 2: /tmp/jj-after --ignore-working-copy diff -s --from v5.0 --to v6.0
Time (mean ± σ): 710.9 ms ± 19.2 ms [User: 420.4 ms, System: 290.6 ms]
Range (min … max): 688.5 ms … 752.0 ms 30 runs
Summary
'/tmp/jj-before --ignore-working-copy diff -s --from v5.0 --to v6.0' ran
1.03 ± 0.05 times faster than '/tmp/jj-after --ignore-working-copy diff -s --from v5.0 --to v6.0'
```
While importing the `ouroboros` crate and the `aliasable` crate it
depends on, the "unsafe Rust reviewer" expressed some concern that
they contain a lot of unsafe code that's hard to review. We can avoid
the unsafe code altogether by making `TreeEntriesIterator` not
self-refential. Instead, we can collect the matching entries in an
individual tree up front. It does have some performance cost:
```
❯ hyperfine --warmup 3 --runs 30 \
'/tmp/jj-before --ignore-working-copy files -r v6.0' \
'/tmp/jj-after --ignore-working-copy files -r v6.0'
Benchmark 1: /tmp/jj-before --ignore-working-copy files -r v6.0
Time (mean ± σ): 461.4 ms ± 14.3 ms [User: 232.1 ms, System: 229.4 ms]
Range (min … max): 443.4 ms … 496.3 ms 30 runs
Benchmark 2: /tmp/jj-after --ignore-working-copy files -r v6.0
Time (mean ± σ): 482.0 ms ± 14.3 ms [User: 257.2 ms, System: 224.9 ms]
Range (min … max): 461.8 ms … 513.3 ms 30 runs
Summary
'/tmp/jj-before --ignore-working-copy files -r v6.0' ran
1.04 ± 0.04 times faster than '/tmp/jj-after --ignore-working-copy files -r v6.0'
```
I think that's acceptable.
This is much faster (maybe because of better cache locality?) Another option
is to use BTreeSet, but the BinaryHeap version is slightly faster.
"bench revset" result in my linux repo:
revsets/heads(tags())
---------------------
baseline 3.28 560.6±4.01ms
1 2.92 500.0±2.99ms
2 1.98 339.6±1.64ms
3 (this) 1.00 171.2±0.30ms
Apparently, IndexEntry::generation_number() isn't cheap probably because it
involves random access to larger memory region, and the u32 value might not
be aligned. Let's instead store the generation numbers in BinaryHeap.
Also, heads_pos() becomes slightly faster by keeping the BinaryHeap entries
small, so I've removed the IndexEntry at all.
This makes the default log and disambiguation revsets fast, which evaluate
'heads(immutable_heads())'.
"bench revset" result in my linux repo:
revsets/heads(tags())
---------------------
baseline 3.28 560.6±4.01ms
1 2.92 500.0±2.99ms
2 (this) 1.98 339.6±1.64ms
Bumps the cargo-dependencies group with 2 updates: [rustix](https://github.com/bytecodealliance/rustix) and [tracing-subscriber](https://github.com/tokio-rs/tracing).
Updates `rustix` from 0.38.21 to 0.38.24
- [Release notes](https://github.com/bytecodealliance/rustix/releases)
- [Commits](https://github.com/bytecodealliance/rustix/compare/v0.38.21...v0.38.24)
Updates `tracing-subscriber` from 0.3.17 to 0.3.18
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-subscriber-0.3.17...tracing-subscriber-0.3.18)
---
updated-dependencies:
- dependency-name: rustix
dependency-type: direct:production
update-type: version-update:semver-patch
dependency-group: cargo-dependencies
- dependency-name: tracing-subscriber
dependency-type: direct:production
update-type: version-update:semver-patch
dependency-group: cargo-dependencies
...
Signed-off-by: dependabot[bot] <support@github.com>