This removes uses of `DescendantRebaser::new` or
`MutRepo::create_descendant_rebaser` from most tests. The exceptions are the
tests having to do with abandoning empty commits on rebase, since adjusting
those is a bit more elaborate (see follow-up commits).
A possible use case is when doing some archaeology around a certain operation.
The current implementation is quadratic if + is repeated. Suppose op_id is
usually close to the current op heads, I think it'll practically work better
than building a reverse lookup table.
This removes CommandError dependency from these resolution functions. We might
want to refactor the error types again if we introduce a real "opset" evaluator.
The error message for unresolved op heads now includes "@" instead of the whole
expression.
The resolver callback usually returns wider error type, which I don't think
is a variant of OpHeadResolutionError.
To help type inference, resolver's error type is E, not E1 where E: From<E1>.
If the commit backend has high latency, it can make a big difference
to read files concurrently. This patch updates the working copy code
to do that in the update code (when reading files from the backend to
write to the working copy). Because our backend at Google reads files
from a local daemon process that already does a lot of prefetching,
this patch doesn't actually help us. I think it's still the right
thing to do for backends that don't do the same kind of
prefetching. It speeds up `jj sparse set --add` by >10x when I disable
the prefetching in our daemon (our `Backend::concurrency()` is 100).
My jj repo contains such head commits, and "jj debug reindex" fails. To address
this problem, we'll probably need to implement GC, and the user will discard
operations before the first bad op id.
I'm going to add a public method that rebuilds index, and its return type will
be different. I also added "build_" because "index" could be misinterpreted
as noun.
The method arguments are reordered to follow the public IndexStore interface.
The error types are shared with the commit store backend. We could add per-store
error types, but it's unlikely that the caller needs to discriminate them.
I think the idea behind `handle_ancestor_ops()` was to let our backend
at Google delegate the work to the server, which could then avoid
walking ancestors. However, we're now thinking that we're going to
make our server resolve divergent operations on its own instead, so
the client will never see more than one op head, unless it manually
creates the second op head itself (e.g. because the user ran two
concurrent commands). In those cases it should be fine to do the
walk. So let's simplify the trait by removing the function.
Consider how one would implment the current `OpHeadsStore` interface
for a cloud-based backend. After `OpHeadsStore::add_op_head()` is
called, the set of op heads temporarily contains two heads (typically)
until `OpHeadsStore::remove_op_head()` is called. That's not invalid,
but it's annoying to have to deal with that state more than
necessary. Also, it's unnecessarily inefficient to send the addition
and removal of op heads as separate RPCs. This patch therefore adds a
`update_op_heads()` method that takes a list of old heads to remove
and a single new head to add. Coming patches will start migrating to
that method.
We move `.jj/repo/op_heads/*` into `.jj/repo/op_heads/heads/*` almost
a year ago, in commits 90a66ec262 and 37ba17589d. We said we would
drop support for it in 0.9+. I think we said that before we started
doing monthly releases, but I we're still past the goal of 6 months
(which is what I think we were aiming for).
I'll probably add change id lookup methods to CompositeIndex. The Index trait
won't gain resolve_change_id_prefix(), but I also renamed its resolve_prefix()
for consistency.
Since Readonly/MutableIndexSegment no longer implement Index trait, there's
no ambiguity between segment-local and index-global operations. Let's shorten
the method names.
I'm thinking of changing some IndexSegment methods to return LocalPosition
instead of global IndexPosition, and using u32 there would be a source of bugs.
Both readonly and mutable segments know the commit ids to return, and the
caller only needs the ids. Since segment_commit_id(local_pos) scans the graph
entries, doing that would increase the chance of cache miss.
This requires creating a new public API as a substitute. I took the opportunity
to also add some comments to the
`MutRepo::record_rewritten_commit`/`record_abandoned_commit` functions.
I imade the simplest possible addition to the API; it is not a very elegant
one. Eventually, the entire `record_rewritten_commit` API should probably be
refactored again.
I also added some comments explaining what these functions do.
Spotted while adding error propagation there. This wouldn't likely be a real
problem because "jj debug reindex" removes all of the operation links.
The "} else {" condition is removed because it doesn't make sense to exclude
only the exact parent_op_id operation. This can be optimized to not walk
ancestors of the parent_op_id operation, but I don't see a motivation to add
tests covering such scenarios. It's pretty rare that an intermediate operation
link is missing.
There are many unit tests that call mutable_segment.save_in(), but I don't
think these callers expect that the segment file could be squashed depending
on the size. Let's make it caller's responsibility.
maybe_squash_with_ancestors() should be cheap if segment_num_commits() == 0,
so it's okay to call it before checking the emptiness.
I think "for" loop is easier to follow. Maybe it could be rewritten further to
.find_map() loop, but that would be too clever.
I also made ancestor_index_segments() pub(super) since it doesn't make sense
to only provide ancestor_files_without_local().
If the error is permanent (because the repo predates the no-gc-ref fix for
example), there's no easy way to recover. Still, panicking in this function
seems wrong.
This is the last use of Read/WriteBytesExt. The byteorder crate is great, but
we don't need an abstraction of endianness. Let's simply use the std functions.
If read_exact() or read_u32() reached to EOF, the index file should be
considered corrupted. File not found error is also treated as data corruption
because an invalid file name could be read from the child segment file. It
can't handle special file names like "..", though.
Index file name also applies to io::Error. New error type reuses io::Error to
represent data corruption. We could add an inner Corrupt|Io enum instead, but
we'll need to remap some io::Error variants (e.g. UnexpectedEof) to Corrupt
anyway.