This is much faster (maybe because of better cache locality?). Another option
is to use BTreeSet, but the BinaryHeap version is slightly faster.
"bench revset" result in my linux repo:
revsets/heads(tags())
---------------------
baseline  3.28     560.6±4.01ms
1         2.92     500.0±2.99ms
2         1.98     339.6±1.64ms
3 (this)  1.00     171.2±0.30ms
Apparently, IndexEntry::generation_number() isn't cheap, probably because it
involves random access to a larger memory region, and the u32 value might not
be aligned. Let's store the generation numbers in the BinaryHeap instead.
Also, heads_pos() becomes slightly faster when the BinaryHeap entries are kept
small, so I've removed the IndexEntry entirely.
This speeds up the default log and disambiguation revsets, which evaluate
'heads(immutable_heads())'.
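The idea above can be sketched roughly as follows. This is a minimal
illustration, not the actual heads_pos() implementation: the Graph type, its
fields, and the heads() function are hypothetical stand-ins for the real
index. The point is that each heap entry carries its own generation number, so
ordering the heap never has to go back to the index.

```rust
use std::collections::{BTreeSet, BinaryHeap, HashSet};

/// Hypothetical stand-in for the commit index (an assumption, not jj's type).
/// `parents[pos]` lists parent positions; `generation[pos]` is 0 for roots
/// and 1 + max(parent generations) otherwise.
pub struct Graph {
    pub parents: Vec<Vec<u32>>,
    pub generation: Vec<u32>,
}

/// Small, Copy heap entry. The derived Ord compares `generation` first, then
/// `pos`, so popping the max-heap yields the highest generation first.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Entry {
    generation: u32,
    pos: u32,
}

/// Returns the candidates that are not ancestors of any other candidate.
pub fn heads(graph: &Graph, candidates: &[u32]) -> BTreeSet<u32> {
    let mut result: BTreeSet<u32> = candidates.iter().copied().collect();
    let Some(min_generation) = result.iter().map(|&p| graph.generation[p as usize]).min() else {
        return result;
    };
    let mut work = BinaryHeap::new();
    let mut seen = HashSet::new();
    // Seed the walk with the parents of every candidate.
    for &pos in candidates {
        for &parent in &graph.parents[pos as usize] {
            if seen.insert(parent) {
                work.push(Entry { generation: graph.generation[parent as usize], pos: parent });
            }
        }
    }
    while let Some(entry) = work.pop() {
        // Anything reached from a candidate is an ancestor, hence not a head.
        result.remove(&entry.pos);
        // Ancestors of this entry have strictly smaller generations, so none
        // of them can be a candidate once we are at or below the minimum.
        if entry.generation <= min_generation {
            continue;
        }
        for &parent in &graph.parents[entry.pos as usize] {
            if seen.insert(parent) {
                work.push(Entry { generation: graph.generation[parent as usize], pos: parent });
            }
        }
    }
    result
}
```

With a diamond-shaped graph (0 <- 1 <- {2, 3} <- 4), heads(&graph, &[1, 2, 3])
yields {2, 3}, since 1 is an ancestor of both.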
"bench revset" result in my linux repo:
revsets/heads(tags())
---------------------
baseline  3.28     560.6±4.01ms
1         2.92     500.0±2.99ms
2 (this)  1.98     339.6±1.64ms
All callers just iterate over the parent entries.
"bench revset" result in my linux repo:
revsets/heads(tags())
---------------------
baseline  3.28     560.6±4.01ms
1 (this)  2.92     500.0±2.99ms
For loose refs, uninteresting directories can be just skipped. For packed refs,
gix will have to do binary search for each prefix to find the starting point.
Still it's better overall if the repository contains tons of refs/jj/keep refs.
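To illustrate the packed-refs case, here is a minimal sketch of a per-prefix
binary search over a sorted list of ref names. It does not use gix; the
refs_with_prefix() helper and the plain &str slice are assumptions made for
illustration only. The cost is one binary search per prefix instead of a scan
over every ref.

```rust
/// Returns the contiguous run of `sorted_refs` whose names start with
/// `prefix`. `sorted_refs` must be sorted in byte order, which is assumed
/// here to model how packed refs are stored.
pub fn refs_with_prefix<'a>(sorted_refs: &'a [&'a str], prefix: &str) -> &'a [&'a str] {
    // Binary search for the first name that is >= prefix.
    let start = sorted_refs.partition_point(|name| *name < prefix);
    // Among names >= prefix, those starting with the prefix come first, so a
    // second binary search finds the end of the run.
    let len = sorted_refs[start..].partition_point(|name| name.starts_with(prefix));
    &sorted_refs[start..start + len]
}
```

For example, looking up "refs/tags/" in a list dominated by refs/jj/keep
entries never visits those entries beyond the O(log n) probes.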
With my linux repo containing ~5k loose jj refs, this saves ~40ms:
% hyperfine --warmup 3 --runs 10 \
"/tmp/jj-gix --ignore-working-copy git import -R ~/mirrors/linux" \
"/tmp/jj-gix-iter --ignore-working-copy git import -R ~/mirrors/linux"
Benchmark 1: /tmp/jj-gix --ignore-working-copy git import -R ~/mirrors/linux
Time (mean ± σ): 151.6 ms ± 11.4 ms [User: 38.8 ms, System: 111.6 ms]
Range (min … max): 129.8 ms … 159.5 ms 10 runs
Benchmark 2: /tmp/jj-gix-iter --ignore-working-copy git import -R ~/mirrors/linux
Time (mean ± σ): 109.9 ms ± 11.6 ms [User: 27.5 ms, System: 82.4 ms]
Range (min … max): 89.4 ms … 117.8 ms 10 runs
Gitoxide errors are boxed since there are various error types and they tend
to exceed the clippy size limit.
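A minimal sketch of the boxing approach, assuming the size limit in question
is something like clippy's result_large_err lint. ImportError, its variants,
and LargeError are hypothetical names, not the actual jj or gitoxide types.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error enum; the backend variant is boxed so that large
/// library error types don't inflate every Result<T, ImportError>.
#[derive(Debug)]
pub enum ImportError {
    MissingHead(String),
    Backend(Box<dyn Error + Send + Sync>),
}

impl ImportError {
    /// Accepts any error type, whatever its size, and stores it on the heap.
    pub fn from_backend(err: impl Into<Box<dyn Error + Send + Sync>>) -> Self {
        ImportError::Backend(err.into())
    }
}

impl fmt::Display for ImportError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ImportError::MissingHead(name) => write!(f, "missing commit for {name}"),
            ImportError::Backend(err) => write!(f, "backend error: {err}"),
        }
    }
}

impl Error for ImportError {}

/// Stand-in for a large third-party error type (an assumption).
#[derive(Debug)]
pub struct LargeError {
    pub context: [u8; 256],
}

impl fmt::Display for LargeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "large error with {} bytes of context", self.context.len())
    }
}

impl Error for LargeError {}
```

The enum stays a few machine words wide regardless of which error types end
up inside the box, at the cost of one allocation on the error path.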
Apparently, gitoxide is faster than git2:
% hyperfine --warmup 3 --runs 10 \
"/tmp/jj-baseline --ignore-working-copy git import -R ~/mirrors/linux" \
"/tmp/jj-gix --ignore-working-copy git import -R ~/mirrors/linux"
Benchmark 1: /tmp/jj-baseline --ignore-working-copy git import -R ~/mirrors/linux
Time (mean ± σ): 205.4 ms ± 15.7 ms [User: 59.6 ms, System: 144.6 ms]
Range (min … max): 189.7 ms … 223.9 ms 10 runs
Benchmark 2: /tmp/jj-gix --ignore-working-copy git import -R ~/mirrors/linux
Time (mean ± σ): 176.2 ms ± 13.7 ms [User: 41.2 ms, System: 134.0 ms]
Range (min … max): 155.4 ms … 186.5 ms 10 runs
If a commit pointed to by HEAD or ref is missing, the ref is considered
invalid and excluded by import_refs(). The current test behavior appears to
depend on some in-memory cache of git2::Repository.
A coworker was having trouble finding a link to the jj Discord community. Add it to the top of the document, alongside the other badges, so it's easier to find.
We need to .collect_vec() the parents iterator into a temporary buffer since
the borrowed iterator can't be returned back to the dag_walk functions. Another
option is to clone op_store and parent ids to remove the &self lifetime from
the iterator, but that also means a temporary Vec is created.
Unlike dfs_ok(), this function short-circuits at an Err as we use non-lazy
topo_order_forward() internally. I think that's good enough. If we implement
GC on the operation log, deleted parents will be excluded (or mapped to a
tombstone) by the caller. An Err shouldn't mean it's GC-ed.
This unblocks the use of Result<T, E> in op.parents().
There are two ways to encode errors:
a. impl IntoIterator<Item = Result<T, E>>
b. Result<V, E> where V: FromIterator<Item = T>
I think (a) is more natural to algorithms like dfs(), which can process error
nodes transparently.
Still, the caller might have to collect the source iterator into a temporary
Vec to conform to the neighbors_fn signature. It's not easy for neighbors_fn to
return an iterator borrowing the input node. Rust already has GATs, but it
doesn't have return-position impl Trait in traits yet.
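A minimal sketch of encoding (a), assuming a simplified signature that is not
necessarily the one in dag_walk. Error items are yielded as they are
encountered and the traversal carries on with the remaining nodes.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Depth-first traversal over fallible nodes. This is a simplified sketch;
/// the parameter shapes are assumptions for illustration.
pub fn dfs_ok<T, E, I>(
    start: impl IntoIterator<Item = Result<T, E>>,
    mut neighbors_fn: impl FnMut(&T) -> I,
) -> impl Iterator<Item = Result<T, E>>
where
    T: Clone + Eq + Hash,
    I: IntoIterator<Item = Result<T, E>>,
{
    let mut work: Vec<Result<T, E>> = start.into_iter().collect();
    let mut visited: HashSet<T> = HashSet::new();
    std::iter::from_fn(move || loop {
        let node = match work.pop()? {
            Ok(node) => node,
            // Error nodes pass through transparently; the walk continues.
            Err(err) => return Some(Err(err)),
        };
        if !visited.insert(node.clone()) {
            continue; // already emitted
        }
        // The neighbors are owned values (typically a collected Vec), since
        // the returned iterator cannot borrow from `node`.
        work.extend(neighbors_fn(&node));
        return Some(Ok(node));
    })
}
```

A caller whose neighbors_fn returns Vec<Result<T, E>> gets each Err exactly
where the failing lookup happened, and can decide whether to stop or skip it.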
Recognize signature metadata from git commit objects, and implement
a basic version of that for the native backend.
Extract the signed data (a commit binary repr without the signature) to
be verified later.
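As a rough illustration of the extraction step, the sketch below splits a raw
git commit into the signed data and the signature by removing the gpgsig
header and its continuation lines. The split_signature() helper is
hypothetical, and it works on &str for brevity, whereas real commit objects
are bytes.

```rust
/// Splits a raw commit into (signed_data, signature). Returns None if the
/// commit has no `gpgsig` header. Hypothetical helper for illustration.
pub fn split_signature(commit: &str) -> Option<(String, String)> {
    let mut signed_data = String::new();
    let mut signature = String::new();
    let mut found = false;
    let mut in_signature = false;
    let mut in_headers = true;
    for line in commit.split_inclusive('\n') {
        if in_headers {
            if in_signature {
                // Continuation lines of a header start with a single space.
                if let Some(rest) = line.strip_prefix(' ') {
                    signature.push_str(rest);
                    continue;
                }
                in_signature = false;
            }
            if let Some(rest) = line.strip_prefix("gpgsig ") {
                signature.push_str(rest);
                in_signature = true;
                found = true;
                continue;
            }
            // The first empty line separates the headers from the message.
            if line == "\n" {
                in_headers = false;
            }
        }
        signed_data.push_str(line);
    }
    if found {
        Some((signed_data, signature))
    } else {
        None
    }
}
```

Text in the commit message that merely looks like a gpgsig header is left
alone, because header parsing stops at the first empty line.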
Otherwise, ref updates would fail if we port git::export_refs() to gitoxide.
This change isn't strictly needed for the backend itself, but we'll reuse the
gix::Repository instance created by the backend when importing and exporting
Git refs.
GitBackend will use it to configure gix::Repository. I think UserSettings
is generally useful for passing store-specific parameters, so I've updated all
factory functions.
I'll make it propagate OpStoreError, but OpStoreError is quite different
from the existing StaleWorkingCopyError. I think this error isn't actually
an "error" but a description of the working copy state.
Per discussion in https://github.com/martinvonz/jj/discussions/2555. I'm
okay with either way, but it would be confusing if we had "branch create" and
"branch set" and both of these could create a new branch.
While the safe implementation is a bit more complex (and probably more branchy),
I don't think the runtime overhead would matter here. Let's remove one more
unsafe for better code maintainability.
Renamed `description_template_for_commit` to
`description_template_for_describe` since it's only used in
`cmd_describe`.
Renamed `description_template_for_cmd_split` to
`description_template_for_commit` and modified it to accommodate an empty
`intro` argument.
Fixes #2439.
We have a few places where we have a `MergedTreeValue` and need to
read the data associated with it so we can write to the working copy
or include it in a diff. Let's extract some of that shared logic to a
function so we can reuse it. I plan to use it for reading file
contents in advance while streaming a diff in `local_working_copy`
soon (and probably in `jj diff` thereafter), but I think it seems like
an improvement on its own.
I'd like to read N files ahead from the backend, to avoid serializing
too many server calls on backends that are backed by a server. Moving
the reads a little earlier is a little step towards that.
The `TreeState::write_*()` functions can now be made into free/static
functions if we prefer.
We no longer have "unsafe" in this function, so let's use the iterator API
instead of recursion. Apparently I didn't push this change before because
unsafe in .find_map() looked scary.
Remove a couple of unnecessary unsafes:
- The NonZeroUsize is a constant where the unwrap will optimize away
anyway and we don't have an unsafe without any good reason there :)
- The other two were simply not needed, lifetimes worked fine, maybe
Rust became better since that code was written? NLL? Anyway, they're
gone now.
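For the NonZeroUsize case, a minimal sketch of two safe alternatives to
new_unchecked(). The constant name and the value 100 are hypothetical, not
taken from the actual code.

```rust
use std::num::NonZeroUsize;

/// Checked at compile time: a zero value here would fail the build, so no
/// `unsafe` is needed and nothing is checked at run time.
pub const CACHE_CAPACITY: NonZeroUsize = match NonZeroUsize::new(100) {
    Some(n) => n,
    None => panic!("capacity must be non-zero"),
};

/// Run-time variant: with a constant argument, the compiler can typically
/// fold the unwrap() away.
pub fn cache_capacity() -> NonZeroUsize {
    NonZeroUsize::new(100).unwrap()
}
```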