Errors that may occur while loading backend would vary per backends, and
it's unlikely that these errors could be mapped to BackendError variants
other than BackendError::Other. So let's extract Other(_) of that kind as
a separate type to clarify there would be no other error variants.
Perhaps, Backend/Error will be renamed to CommitBackend/Error or
CommitStore/Error?, whereas I think BackendInit/LoadError can be shared
among store factories.
This is much simpler and I was slightly surprised that it doesn't have
much impact on performance. I tried `jj --ignore-working-copy diff -s
--from root --to v5.15` in the Linux kernel repo, and there was
perhaps a 1.5% slowdown (508 ms -> 515 ms). In more normal cases (like
diffing a single commit against its parent), I couldn't measure any
difference at all.
This helps to map initialization error to BackendError without too general
From impl. I don't think io::Error (or our PathError) should be automatically
translated to BackendError::Other because BackendError has more specific
variants depending on context. If the error is specific to initialization,
it makes sense to translate it to Other variant.
I don't think we'll want to record a label for each term, because such
labels would get stale, and it seems hard to make them make sense
after transferring a remote to another repo. I think we'll probably
want to infer labels on demand instead (#1176).
Perhaps, this would handle patterns like ["a(b", "c)"] better. It might not
be correct to error out on "(", but should be better than building wrong
regexp pattern "a(b|c)".
Now that we've replaced `MergeHunk` by a `Conflict`, it makes sense to
convert the input `Conflict<FileId>` by mapping each term. Unlike
`Option::map()` I made `Conflict::map()` take a reference `self`,
because it's not uncommon to want to map the same conflict multiple
times. I'm going to use that for producing a
`Conflict<Option<TreeValue>>` from a `Conflict<Tree>` and a set of
paths.
Since `Conflict`s can represent the resolved state, so
`Conflict<ContentHunk>` can represent the states that we use
`MergeHunk` for. `MergeHunk` does force the user to handle the
resolved case, which may be useful. I suppose one could use the same
argument for making `Conflict` an enum, i.e. if we think that
`MergeHunk`'s two variants are beneficial, then we should consider
making `Conflict` an enum with those two variants.
It's useful to have a more readable `Debug` format for `Vec<u8>`
(`"foo"` is better than `[102, 111, 111]`). It might also make types
in function signatures and elsewhere more readable.
This only parses the fields relevant to us, i.e.:
- name: the stable identifier of the submodule
- path: the path to the submodule in the current commit
- url: the remote we can clone the submodule from
The full list of .gitmodules fields can be found at
https://git-scm.com/docs/gitmodules.
This eliminates indirect access through Vec<u8> and improves cache locality
while sorting the index entries. We can achieve a similar result by using
SmallVec<[u8; 24]> in place of Commit/ChangeId(Vec<u8>), but we would have
to determine a reasonable id length across backends. Indexing [u8; 4] performs
better, at the cost of the API and implementation complexity.
For temporary Commit/ChangeId allocation in general, I think a borrowed type
like Path/PathBuf will help.
Testing with my "linux" repo, this saves ~670ms needed to initialize both
change id index and disambiguation indexes.
I'll rewrite resolve_prefix_range() to branch depending on the prefix length,
and the easiest way to do that is passing iterator to continuation function
instead of returning iterator as an either (or boxed) type.