Commit graph

2461 commits

Author SHA1 Message Date
Yuya Nishihara
51691ea22c tests: add lib tests for op id resolution, migrate some from cli
CLI testing is slow and harder to set up crafted environment.
2024-01-02 10:30:08 +09:00
Yuya Nishihara
dad890b960 operation: make parent_ids() return slice instead of Vec reference 2024-01-02 02:47:41 +09:00
Yuya Nishihara
c9b581589c op_walk: simplify arguments passed to high-level "opset" query functions 2024-01-01 10:22:23 +09:00
Yuya Nishihara
26b5f38f45 op_walk: move "opset" query functions from jj_cli 2024-01-01 10:22:23 +09:00
Yuya Nishihara
e4460d5386 op_walk: add error types for fake "opset" expression
This removes CommandError dependency from these resolution functions. We might
want to refactor the error types again if we introduce a real "opset" evaluator.

The error message for unresolved op heads now includes "@" instead of the whole
expression.
2024-01-01 10:22:23 +09:00
Yuya Nishihara
94fc32ab47 op_walk: extract walk_ancestors() to new module
I'm going to extract fake "opset" resolution functions there, and I think
walk_ancestors() belongs to the same category.
2024-01-01 10:22:23 +09:00
Yuya Nishihara
6dd936f72f op_heads: let caller decide resolve_op_heads() error type
The resolver callback usually returns wider error type, which I don't think
is a variant of OpHeadResolutionError.

To help type inference, resolver's error type is E, not E1 where E: From<E1>.
2024-01-01 10:22:23 +09:00
Martin von Zweigbergk
90744fb770 working copy: read files ahead when updating
If the commit backend has high latency, it can make a big difference
to read files concurrently. This patch updates the working copy code
to do that in the update code (when reading files from the backend to
write to the working copy). Because our backend at Google reads files
from a local daemon process that already does a lot of prefetching,
this patch doesn't actually help us. I think it's still the right
thing to do for backends that don't do the same kind of
prefetching. It speeds up `jj sparse set --add` by >10x when I disable
the prefetching in our daemon (our `Backend::concurrency()` is 100).
2023-12-29 13:37:13 -08:00
Yuya Nishihara
f9e9058b9b index: show bad operation id if commit lookup failed during reindexing
My jj repo contains such head commits, and "jj debug reindex" fails. To address
this problem, we'll probably need to implement GC, and the user will discard
operations before the first bad op id.
2023-12-29 13:05:58 +09:00
Yuya Nishihara
43e016a7d1 index: add explicit reindexing method that can propagate error 2023-12-29 13:05:58 +09:00
Yuya Nishihara
ab1c8656a4 index: rename private index_at_operation methods, reorder arguments
I'm going to add a public method that rebuilds index, and its return type will
be different. I also added "build_" because "index" could be misinterpreted
as noun.

The method arguments are reordered to follow the public IndexStore interface.
2023-12-29 13:05:58 +09:00
Yuya Nishihara
3abe6be384 index: propagate DefaultIndexStore::init/reinit() errors 2023-12-29 13:05:58 +09:00
Yuya Nishihara
955f6e356a repo: add error propagation path to IndexStore initialization and loading
The error types are shared with the commit store backend. We could add per-store
error types, but it's unlikely that the caller needs to discriminate them.
2023-12-29 13:05:58 +09:00
Yuya Nishihara
bb73cd491f clenaup: don't use debug format to embed ObjectId in error message
Also fixed typo, s/a/an/.
2023-12-29 13:05:58 +09:00
Martin von Zweigbergk
d06764eb7c op heads: remove now-unused methods for adding/removing op heads 2023-12-28 09:17:42 -08:00
Martin von Zweigbergk
65a6aa61db op heads: replace last use of remove_op_head() by update_op_heads() 2023-12-28 09:17:42 -08:00
Martin von Zweigbergk
76516bb46b op heads: inline handle_ancestor_ops()
This gets us closer to being able to use the new `update_op_heads()`
function here (without calling it multiple times).
2023-12-28 09:17:42 -08:00
Martin von Zweigbergk
4221c7cf5c op heads: remove handle_ancestor_ops() from trait
I think the idea behind `handle_ancestor_ops()` was to let our backend
at Google delegate the work to the server, which could then avoid
walking ancestors. However, we're now thinking that we're going to
make our server resolve divergent operations on its own instead, so
the client will never see more than one op head, unless it manually
creates the second op head itself (e.g. because the user ran two
concurrent commands). In those cases it should be fine to do the
walk. So let's simplify the trait by removing the function.
2023-12-28 09:17:42 -08:00
Martin von Zweigbergk
f969f4b0b0 op heads: remove lifetime from OpHeadsStoreLock 2023-12-28 09:17:42 -08:00
Martin von Zweigbergk
c304777a35 op heads: remove promote_new_op()
`OpHeadsStoreLock::promote_new_op()` doesn't add much over the new
`update_op_heads()`, so let's switch to the latter.
2023-12-28 09:17:42 -08:00
Martin von Zweigbergk
b8e45d196f op heads: add a new trait method combining add and remove of op heads
Consider how one would implment the current `OpHeadsStore` interface
for a cloud-based backend. After `OpHeadsStore::add_op_head()` is
called, the set of op heads temporarily contains two heads (typically)
until `OpHeadsStore::remove_op_head()` is called. That's not invalid,
but it's annoying to have to deal with that state more than
necessary. Also, it's unnecessarily inefficient to send the addition
and removal of op heads as separate RPCs. This patch therefore adds a
`update_op_heads()` method that takes a list of old heads to remove
and a single new head to add. Coming patches will start migrating to
that method.
2023-12-28 09:17:42 -08:00
Martin von Zweigbergk
8137975785 op heads: drop support for old location/format
We move `.jj/repo/op_heads/*` into `.jj/repo/op_heads/heads/*` almost
a year ago, in commits 90a66ec262 and 37ba17589d. We said we would
drop support for it in 0.9+. I think we said that before we started
doing monthly releases, but I we're still past the goal of 6 months
(which is what I think we were aiming for).
2023-12-28 09:17:42 -08:00
Yuya Nishihara
dde42b9c05 index: rename resolve_prefix() to resolve_commit_id_prefix()
I'll probably add change id lookup methods to CompositeIndex. The Index trait
won't gain resolve_change_id_prefix(), but I also renamed its resolve_prefix()
for consistency.
2023-12-26 01:03:10 +09:00
Yuya Nishihara
0f2f566188 index: remove "segment_" prefix from IndexSegment methods
Since Readonly/MutableIndexSegment no longer implement Index trait, there's
no ambiguity between segment-local and index-global operations. Let's shorten
the method names.
2023-12-26 01:03:10 +09:00
Yuya Nishihara
c9b9e2864e index: introduce newtype that represents segment-local position
I'm thinking of changing some IndexSegment methods to return LocalPosition
instead of global IndexPosition, and using u32 there would be a source of bugs.
2023-12-26 01:03:10 +09:00
Yuya Nishihara
ee8d5e279a index: make segment-level lookup return neighbor commit ids instead of positions
Both readonly and mutable segments know the commit ids to return, and the
caller only needs the ids. Since segment_commit_id(local_pos) scans the graph
entries, doing that would increase the chance of cache miss.
2023-12-26 01:03:10 +09:00
Yuya Nishihara
0e7834feb9 index: inline segment_entry_by_pos()
There's no reasonable way to abstract the IndexEntry construction.
2023-12-26 01:03:10 +09:00
Ilya Grigoriev
1fb9df252b split.rs: stop using DescendantRebaser::new
This requires creating a new public API as a substitute. I took the opportunity
to also add some comments to the
`MutRepo::record_rewritten_commit`/`record_abandoned_commit` functions.

I imade the simplest possible addition to the API; it is not a very elegant
one. Eventually, the entire `record_rewritten_commit` API should probably be
refactored again.

I also added some comments explaining what these functions do.
2023-12-24 19:25:16 -08:00
Ilya Grigoriev
6bfd09009f move.rs: remove use of MutRepo::create_descenant_rebaser.
After this, the internal function is only used in tests.
2023-12-24 19:25:16 -08:00
Ilya Grigoriev
cde8ea8985 Make CommitBuilder constructors private to the library crate
The implementation of `CommitBuilder::write` is tightly bound to the MutRepo,
so only MutRepo should construct CommitBuilder-s.
2023-12-24 19:25:16 -08:00
Yuya Nishihara
b954bab0ca index: fix partial reindexing to not lose commits only reachable from one side
Spotted while adding error propagation there. This wouldn't likely be a real
problem because "jj debug reindex" removes all of the operation links.

The "} else {" condition is removed because it doesn't make sense to exclude
only the exact parent_op_id operation. This can be optimized to not walk
ancestors of the parent_op_id operation, but I don't see a motivation to add
tests covering such scenarios. It's pretty rare that an intermediate operation
link is missing.
2023-12-24 23:31:16 +09:00
Yuya Nishihara
320d15412b index: let caller of segment-level save-in() squash segments explicitly
There are many unit tests that call mutable_segment.save_in(), but I don't
think these callers expect that the segment file could be squashed depending
on the size. Let's make it caller's responsibility.

maybe_squash_with_ancestors() should be cheap if segment_num_commits() == 0,
so it's okay to call it before checking the emptiness.
2023-12-24 00:22:47 +09:00
Yuya Nishihara
1d80bbb70a index: leverage ancestor iterator to collect segments to be squashed
I think "for" loop is easier to follow. Maybe it could be rewritten further to
.find_map() loop, but that would be too clever.

I also made ancestor_index_segments() pub(super) since it doesn't make sense
to only provide ancestor_files_without_local().
2023-12-24 00:22:47 +09:00
Yuya Nishihara
55b4f69fb6 repo: propagate store error from add_heads() 2023-12-24 00:22:30 +09:00
Yuya Nishihara
0f6a7418f2 index: propagate store error from reindexing function
If the error is permanent (because the repo predates the no-gc-ref fix for
example), there's no easy way to recover. Still, panicking in this function
seems wrong.
2023-12-24 00:22:30 +09:00
Yuya Nishihara
7a44e590dc lock: remove byteorder dependency from tests, use fs helper functions
This is the last use of Read/WriteBytesExt. The byteorder crate is great, but
we don't need an abstraction of endianness. Let's simply use the std functions.
2023-12-23 00:14:17 +09:00
Yuya Nishihara
9de6273e10 index, stacked_table: inline read_u32::<LittleEndian>()
There aren't many callers of ReadBytesExt::read_u32().
2023-12-23 00:14:17 +09:00
Yuya Nishihara
21c22be96e stacked_table: use u32::from_le_bytes() to reinterpret bytes as integer
Apparently, I forgot to update this in fb06e89649.
2023-12-23 00:14:17 +09:00
Yuya Nishihara
6f5096e266 index, stacked_table: use u32::try_from() instead of numeric cast
These .unwrap()s wouldn't be compiled out, but I don't think they would
have measurable impact. Let's use the safer method.
2023-12-22 09:03:50 +09:00
Yuya Nishihara
9ec89bcf86 index, stacked_table: use u32::to_le_bytes() to reinterpret as bytes 2023-12-22 09:03:50 +09:00
Yuya Nishihara
392539fa29 index, stacked_table: simply extend Vec<u8> to not use .write_all()
I'm going to remove use of .write_u32() there. It's not super important, but
fewer .unwrap()s, the code looks slightly better.
2023-12-22 09:03:50 +09:00
Yuya Nishihara
fb06e89649 index: use u32::from_le_bytes() to reinterpret bytes as integer
It's less abstract than going through io::Read, so is probably easier for
compiler to optimize out. I also feel it's a bit more readable.
2023-12-22 09:03:50 +09:00
Yuya Nishihara
38ce914321 index: reindex on content-related I/O errors
If read_exact() or read_u32() reached to EOF, the index file should be
considered corrupted. File not found error is also treated as data corruption
because an invalid file name could be read from the child segment file. It
can't handle special file names like "..", though.
2023-12-21 08:05:30 +09:00
Yuya Nishihara
e98104d6f0 index: add file name to both io/corrupt errors, combine these variants
Index file name also applies to io::Error. New error type reuses io::Error to
represent data corruption. We could add an inner Corrupt|Io enum instead, but
we'll need to remap some io::Error variants (e.g. UnexpectedEof) to Corrupt
anyway.
2023-12-21 08:05:30 +09:00
Yuya Nishihara
88f3085bb1 index: extract function that opens file and loads index segments 2023-12-21 08:05:30 +09:00
Yuya Nishihara
eccb9b7a44 index: propagate index load errors from DefaultIndexStore 2023-12-19 07:41:57 +09:00
Yuya Nishihara
dd8e686127 index: don't reload parent files after saving new segment file
This should be cheaper, and more importantly, we no longer need to propagate
ReadonlyIndexLoadError to the caller.
2023-12-19 07:41:57 +09:00
Yuya Nishihara
fb07749291 index: split load function into header and local parts as well 2023-12-19 07:41:57 +09:00
Yuya Nishihara
616a8c7f54 index: split serialization function into header and local parts
The idea is that we don't have to reload parent files as we already have the
chain of the parent segments. The resulting readonly index will be constructed
from the loaded parent segments + local entries blob.
2023-12-19 07:41:57 +09:00
Yuya Nishihara
31b6e93c6e index: move IndexLoadError to "readonly" module, rename accordingly
I thought IndexLoadError and DefaultIndexStoreError would represent "load" and
"store" failures respectively, but they aren't. Actually, DefaultIndexStoreError
is the store-level error, and IndexLoadError should be wrapped in it.
2023-12-19 07:41:57 +09:00