Commit graph

2147 commits

Author SHA1 Message Date
Yuya Nishihara
55b4f69fb6 repo: propagate store error from add_heads() 2023-12-24 00:22:30 +09:00
Yuya Nishihara
0f6a7418f2 index: propagate store error from reindexing function
If the error is permanent (because the repo predates the no-gc-ref fix for
example), there's no easy way to recover. Still, panicking in this function
seems wrong.
2023-12-24 00:22:30 +09:00
Yuya Nishihara
7a44e590dc lock: remove byteorder dependency from tests, use fs helper functions
This is the last use of Read/WriteBytesExt. The byteorder crate is great, but
we don't need an abstraction of endianness. Let's simply use the std functions.
2023-12-23 00:14:17 +09:00
Yuya Nishihara
9de6273e10 index, stacked_table: inline read_u32::<LittleEndian>()
There aren't many callers of ReadBytesExt::read_u32().
2023-12-23 00:14:17 +09:00
Yuya Nishihara
21c22be96e stacked_table: use u32::from_le_bytes() to reinterpret bytes as integer
Apparently, I forgot to update this in fb06e89649.
2023-12-23 00:14:17 +09:00
Yuya Nishihara
6f5096e266 index, stacked_table: use u32::try_from() instead of numeric cast
These .unwrap()s wouldn't be compiled out, but I don't think they would
have measurable impact. Let's use the safer method.
2023-12-22 09:03:50 +09:00
Yuya Nishihara
9ec89bcf86 index, stacked_table: use u32::to_le_bytes() to reinterpret as bytes 2023-12-22 09:03:50 +09:00
Yuya Nishihara
392539fa29 index, stacked_table: simply extend Vec<u8> to not use .write_all()
I'm going to remove use of .write_u32() there. It's not super important, but
fewer .unwrap()s, the code looks slightly better.
2023-12-22 09:03:50 +09:00
Yuya Nishihara
fb06e89649 index: use u32::from_le_bytes() to reinterpret bytes as integer
It's less abstract than going through io::Read, so is probably easier for
compiler to optimize out. I also feel it's a bit more readable.
2023-12-22 09:03:50 +09:00
Yuya Nishihara
38ce914321 index: reindex on content-related I/O errors
If read_exact() or read_u32() reached to EOF, the index file should be
considered corrupted. File not found error is also treated as data corruption
because an invalid file name could be read from the child segment file. It
can't handle special file names like "..", though.
2023-12-21 08:05:30 +09:00
Yuya Nishihara
e98104d6f0 index: add file name to both io/corrupt errors, combine these variants
Index file name also applies to io::Error. New error type reuses io::Error to
represent data corruption. We could add an inner Corrupt|Io enum instead, but
we'll need to remap some io::Error variants (e.g. UnexpectedEof) to Corrupt
anyway.
2023-12-21 08:05:30 +09:00
Yuya Nishihara
88f3085bb1 index: extract function that opens file and loads index segments 2023-12-21 08:05:30 +09:00
Yuya Nishihara
eccb9b7a44 index: propagate index load errors from DefaultIndexStore 2023-12-19 07:41:57 +09:00
Yuya Nishihara
dd8e686127 index: don't reload parent files after saving new segment file
This should be cheaper, and more importantly, we no longer need to propagate
ReadonlyIndexLoadError to the caller.
2023-12-19 07:41:57 +09:00
Yuya Nishihara
fb07749291 index: split load function into header and local parts as well 2023-12-19 07:41:57 +09:00
Yuya Nishihara
616a8c7f54 index: split serialization function into header and local parts
The idea is that we don't have to reload parent files as we already have the
chain of the parent segments. The resulting readonly index will be constructed
from the loaded parent segments + local entries blob.
2023-12-19 07:41:57 +09:00
Yuya Nishihara
31b6e93c6e index: move IndexLoadError to "readonly" module, rename accordingly
I thought IndexLoadError and DefaultIndexStoreError would represent "load" and
"store" failures respectively, but they aren't. Actually, DefaultIndexStoreError
is the store-level error, and IndexLoadError should be wrapped in it.
2023-12-19 07:41:57 +09:00
Yuya Nishihara
b5de16007e index: add stub IndexReadError type
This is needed to remove .unwrap()s from DefaultIndexStore.
2023-12-19 07:41:57 +09:00
Yuya Nishihara
d49b079494 index: update file format comment about ReadonlyIndexSegment
Also made it a doc comment. I think 4-byte alignment is a nice property,
so added note about that.
2023-12-19 07:41:34 +09:00
Yuya Nishihara
8909647d86 index: pass base directory path by reference 2023-12-18 08:49:21 +09:00
Yuya Nishihara
b733d52557 index: split DefaultIndexStoreError::Io variant, extract save helper
Since OpStoreError can also include io::Error, it doesn't make much sense to
have Io variant at this level. Let's split it to context-specific errors, and
extract helper method that maps io::Error.
2023-12-18 08:49:21 +09:00
Yuya Nishihara
bf4a4e70b1 index: use DefaultMutableIndex wrapper when reconstructing missing index
This allows us to extract helper method that writes index file and associates
it with the operation.
2023-12-18 08:49:21 +09:00
Yuya Nishihara
50164bb36f index: have IndexWriteError carry opaque error type instead of string
I'm going to remove some .unwrap()s from DefaultIndexStore, and the inner
error type will be consolidated to DefaultIndexStoreError.
2023-12-18 08:49:21 +09:00
Yuya Nishihara
87a8238bee git: turn git.auto-local-branch off by default
As far as I can see in the chat, there's no objection to changing the default,
and git.auto-local-branch = false is generally preferred.

docs/branches.md isn't updated as it would otherwise conflict with #2625. I
think the "Remotes" section will need a non-trivial rewrite.

#1136, #1862
2023-12-17 08:30:24 +09:00
Yuya Nishihara
ac99145a28 working_copy: drop open file instance from PersistError
For the same reason as the file_util change.
2023-12-17 08:20:07 +09:00
Yuya Nishihara
c6df0ba4c3 file_util: don't try to overwrite existing content-addressed file on Windows
The doc says persist() replaces the destination file as rename() would do
on Unix. persist_noclobber() doesn't, and is probably more reliable on Windows.
I don't know if persist() is completely atomic on Windows, but if it isn't, it
might be the source of the "permission denied" error under highly contended
situation.

https://docs.rs/tempfile/latest/tempfile/struct.NamedTempFile.html#method.persist
https://github.com/Stebalien/tempfile/blob/v3.8.0/src/file/imp/windows.rs#L77

We could use persist_noclobber() on all platforms, but it's more involved on
Unix.

https://github.com/Stebalien/tempfile/blob/v3.8.0/src/file/imp/unix.rs#L107
2023-12-17 08:20:07 +09:00
Yuya Nishihara
dd325c089c file_util: drop open file instance from PersistError
PersistError is basically a pair of io::Error and NamedTempFile instance. It's
unlikely that we would want to propagate the open file instance to the CLI
error handler, leaving the temporary file alive.
2023-12-17 08:20:07 +09:00
Yuya Nishihara
4d91e4c196 revset: simplify type constraints on combination iterators
Just a minor cleanup to remove lifetime parameter from the types. I tried to
reimplement them by using itertools, but I couldn't find a simple way to
encode short-circuiting at the end of either left or right iterator.
2023-12-16 07:50:04 +09:00
Yuya Nishihara
6d59156858 revset: parameterize candidates set of FilterRevset as well 2023-12-16 07:50:04 +09:00
Yuya Nishihara
a36368bb88 revset: make revset combinators generic over set types, merge UnionPredicate
UnionRevset and UnionPredicate are conceptually the same. Let's unify them.
2023-12-16 07:50:04 +09:00
Yuya Nishihara
af6047a655 lib: forbid unsafe_code at all 2023-12-15 16:10:28 +09:00
Yuya Nishihara
9990c41a90 repo: remove unsafe lifetime hack from change_id_index() 2023-12-15 16:10:28 +09:00
Yuya Nishihara
d9e8297059 index: add 'static version of evaluate_revset() to ReadonlyIndex
We'll probably need a better abstraction, but a separate method is good
enough to remove unsafe code from ReadonlyRepo.

I'm not sure if this is feasible for the other backends, but I guess there
would be less lifetimed variables than DefaultReadonlyIndex.
2023-12-15 16:10:28 +09:00
Yuya Nishihara
2ba50c76c7 revset: abstract evaluated RevsetImpl over owned/borrowed index types 2023-12-15 16:10:28 +09:00
Yuya Nishihara
72d9cd019b index: extract as_composite() to trait method
The revset engine will accept abstract AsCompositeIndex type, and the
evaluated revset can be 'static if the index is behind Arc<T>.
2023-12-15 16:10:28 +09:00
Yuya Nishihara
8fdf9db6e0 revset: remove 'index lifetime from InternalRevset 2023-12-15 14:58:12 +09:00
Yuya Nishihara
c426d34c11 revset: pass in index to PurePredicateFn as an argument to make it 'static 2023-12-15 14:58:12 +09:00
Yuya Nishihara
71070e85d7 revset: add helper that coerces closure to PurePredicateFn
Also renamed the boxed version to discriminate it from the cast helper.
2023-12-15 14:58:12 +09:00
Yuya Nishihara
a9a7de4a5e revset: store RevWalk factory function in RevWalkRevset
The returned iterator is boxed by caller due to the limitation of the type
system. There's a workaround, but it's super ugly.

https://users.rust-lang.org/t/hrtb-on-multiple-generics/34255/3
2023-12-15 14:58:12 +09:00
Yuya Nishihara
575d3dc7bf revset: store IndexPosition in EagerRevset to drop 'index lifetime
This adds overhead to re-look up IndexEntry, but I don't think that would
have significant impact on performance.
2023-12-15 14:58:12 +09:00
Yuya Nishihara
261bf848a9 revset: pass in index to InternalRevset as an argument
The idea is that InternalRevset will store a 'static boilerplate function that
borrows an 'index passed by function argument. This way, we can abstract the
index type over Arc<T> and &T without introducing too much generics.
2023-12-15 14:58:12 +09:00
Yuya Nishihara
e332d39375 revset: extract inner method that constructs IndexEntry iterator 2023-12-15 14:58:12 +09:00
Yuya Nishihara
b8f60c4dd6 cargo: bump gix to 0.56.0
I don't know why the dependabot didn't catch this, but there are things to
fix manually. EntryMode was changed to a u16 wrapper, and the enum was renamed
to EntryKind. Other than that, I don't find anything breaking our codebase.
2023-12-15 14:17:02 +09:00
Yuya Nishihara
95a0cceb97 index: use loaded readonly data without splitting into vecs
Since lookup data isn't typically small, .split_off() can take a few
milliseconds to memcpy().
2023-12-14 08:43:50 +09:00
Yuya Nishihara
5121e1f4e9 index: move IndexSegment trait to "composite" module
Perhaps, this is the most controversial part. It could be moved to new
"segment" module (or something like "common"), but I think IndexSegment can be
considered a trait that enables the CompositeIndex abstraction.
2023-12-14 08:43:40 +09:00
Yuya Nishihara
b89ae7c0b5 index: use IndexEntry::position() instead of direct field access 2023-12-14 08:43:40 +09:00
Yuya Nishihara
9fb0f00f2d index: add IndexEntry constructor instead of pub(super)-ing fields 2023-12-14 08:43:40 +09:00
Yuya Nishihara
771f447d99 index: split IndexEntry and related types to "entry" module
Added pub(super) or pub where needed. I won't implement accessor methods on
IndexPositionByGeneration and IndexPosition as they are purely value types,
and protecting the inner values wouldn't make sense.
2023-12-14 08:43:40 +09:00
Martin von Zweigbergk
60fae3114e transaction: take description at end instead of start
It seems better to have the caller pass the transaction description
when we finish the transaction than when we start it. That way we have
all the information we want to include more readily available.
2023-12-13 08:12:49 -08:00
Ilya Grigoriev
316ab8efb8 rewrite.rs: refactor new_parents to depend only on parent_mapping
Previously, the function relied on both the `self.parent_mapping` and
`self.rebased`. If `(A,B)` was in `parent_mapping` and `(B,C)` was in `rebased`,
`new_parents` would map `A` to `C`.

Now, `self.rebased` is ignored by `new_parents`. In the same situation,
DescendantRebaser is changed so that both `(A,B)` and `(B,C)` are in
`parent_mapping` before. `new_parents` now applies `parent_mapping` repeatedly,
and will map `A` to `C` in this situation.

## Cons

- The semantics are changed; `new_parents` now panics if `self.parent_mapping`
  contain cycles. AFAICT, such cycles never happen in `jj` anyway, except for
one test that I had to fix. I think it's a sensible restriction to live with;
if you do want to swap children of two commits, you can call
`rebase_descendants` twice.

## Pros

- I find the new logic much easier to reason about. I plan to extract it into a
function, to be used in refactors for `jj rebase -r` and `jj new --after`. It
will make it much easier to have a correct implementation of `jj rebase -r
--after`, even when rebasing onto a descendant.

- The de-duplication is no longer O(n^2). I tried to keep the common case fast.

## Alternatives

- We could make `jj rebase` and `jj new` use a separate function with the
algorithm shown here, without changing DescendantRebaser. I believe that the new
algorithm makes DescendatRebaser easier to understand, though, and it feels more
elegant to reduce code duplication.

- The de-duplication optimization here is independent of other changes, and
could be used on its own.
2023-12-12 19:35:51 -08:00
Yuya Nishihara
2abbb637e3 index: add wrapper functions to DefaultReadonlyIndex to remove pub(super) field 2023-12-13 08:09:48 +09:00
Yuya Nishihara
c0a12a7cbc index: add methods that provides commit/change_id_length
We could add Layout struct holding these parameters, but I don't think that's
needed just for two parameters.
2023-12-13 08:09:48 +09:00
Yuya Nishihara
3831ad423c index: use as_composite().num_commits() instead of direct field access 2023-12-13 08:09:48 +09:00
Yuya Nishihara
30984b1505 index: use name() instead of direct field access 2023-12-13 08:09:48 +09:00
Yuya Nishihara
e5c8252fb4 index: use segment_parent_file() instead of direct field access 2023-12-13 08:09:48 +09:00
Yuya Nishihara
402e36bab7 index: split readonly index types to "readonly" module
Added pub(super) where needed. There are a few pub(super) fields that look
suspicious, which will be fixed by the subsequent patches.
2023-12-13 08:09:48 +09:00
Yuya Nishihara
fbec16b49f index: add wrapper functions to DefaultMutableIndex to remove pub(super) field
into_segment() could be added instead of save_in(), but I decided to wrap
save_in(). save_in() may squash ancestor files, so it could be considered an
index-level operation.
2023-12-13 08:09:48 +09:00
Yuya Nishihara
5aeeb5f723 index: split mutable index types to "mutable" module
Added pub(super) where needed or makes sense.
2023-12-13 08:09:48 +09:00
Yuya Nishihara
ab2742f2c9 index: split RevWalk types to "rev_walk" module
Added pub(super) where needed.
2023-12-12 08:07:52 +09:00
Yuya Nishihara
caa1b99c24 index: add CompositeIndex constructor instead of pub(super)-ing field
This wouldn't matter, but seemed slightly better.
2023-12-12 08:07:52 +09:00
Yuya Nishihara
679518fdf2 index: split CompositeIndex and stats types to "composite" module
Added pub(super) where needed or makes sense.
2023-12-12 08:07:52 +09:00
Yuya Nishihara
2423558e68 index: split DefaultIndexStore and Load/StoreError types to "store" module
IndexLoadError isn't store-specific, but I think it's better to put I/O
stuff in the store module.
2023-12-12 08:07:52 +09:00
Yuya Nishihara
cdcd465c79 index: move default_index_store.rs to sub directory named default_index
default_index_store.rs is relatively big, and it contains types and impls in
arbitrary order. Let's split them into sub modules. After everything moved,
mod.rs will only contain tests.
2023-12-12 08:07:52 +09:00
Yuya Nishihara
f86b338681 revset: inline walk_ancestors() 2023-12-11 09:14:03 +09:00
Yuya Nishihara
cd0b24ef14 revset: inline walk_children()
There's only one caller, and we have common code at the call site.
2023-12-11 09:14:03 +09:00
Yuya Nishihara
d28bd8fa0f revset: inline collect_dag_range() 2023-12-11 09:14:03 +09:00
Yuya Nishihara
73fb922517 index: reimplement collect_dag_range() of revset engine as iterator
I'm going to remove 'index lifetime from InternalRevset so Revset<'static>
can be easily constructed from DefaultReadonlyIndex. As the first step, this
series removes some lifetime complexity from EvaluationContext methods.

We don't need an descendant iterator API, but it helps to add separate function
to collect into HashSet<IndexPosition> instead of returning a pair of
ordered vec and set.
2023-12-11 09:14:03 +09:00
Yuya Nishihara
cbbe38ba7b index: rename MutableIndexImpl to MutableIndexSegment 2023-12-10 11:03:07 +09:00
Yuya Nishihara
c94e1de6d2 index: add DefaultMutableIndex wrapper, move Index impls to it
The wrapper type isn't needed for the mutable layer, but this mirrors the
readonly type structure. Test cases are also migrated to be using the index
wrapper so long as we don't have to care for the nesting of the segment files.
2023-12-10 11:03:07 +09:00
Yuya Nishihara
ce312ae288 index: duplicate add_commit() to MutableIndexImpl 2023-12-10 11:03:07 +09:00
Yuya Nishihara
e0206a82f2 index: extract merge_in() function that works on segment types
Prepares for splitting MutableIndexImpl into segment and index wrapper types.
2023-12-10 11:03:07 +09:00
Yuya Nishihara
a110ec6d95 cli: print failed git export reason for each ref
Not all reasons are actionable, but we print hint in common cryptic cases.
2023-12-09 23:37:00 +09:00
Yuya Nishihara
990edcefc9 index: impl Index for DefaultReadonlyIndex instead of ReadonlyIndexSegment
The idea is that the ReadonlyIndexSegment is a sub component of the index. The
Index trait could be implemented for any Segment type, but we don't need a
public interface to access sub segment as an index.
2023-12-09 15:18:36 +09:00
Yuya Nishihara
1cbd2ddb4b index: rename ReadonlyIndexImpl to ReadonlyIndexSegment
I'm going to split the internal Segment types and the public Index types
in order to clarify the layering concept. The public Index types will be
wrappers like DefaultReadonlyIndex.

Strictly speaking, ReadonlyIndexImpl is a segment + parent pointer pair,
but I think calling it a segment is pretty okay. It could be called a
ReadonlyIndexFile, but "File" can't apply to the mutable part.
2023-12-09 15:18:36 +09:00
Yuya Nishihara
172043e968 index: make ReadonlyIndexImpl private
There are no external callers.
2023-12-09 15:18:36 +09:00
Yuya Nishihara
6c57ba7f21 index: rename ReadonlyIndexWrapper to DefaultReadonlyIndex
This matches the store naming: impl IndexStore for DefaultIndexStore. I also
added minimal doc comment and Debug.
2023-12-09 15:18:36 +09:00
Yuya Nishihara
5f6e28c8cf git: migrate export_refs() to gix::Repository
FailedToDelete/Set reasons are boxed because gix error types aren't small.
They could be casted to std::error::Error if needed.
2023-12-09 15:18:19 +09:00
Yuya Nishihara
2d76907048 git: unimplement PartialEq on FailedRefExportReason
Gitoxide errors don't implement PartialEq. We could instead stringify the
errors, but there aren't many callers who expect FailedRefExportReason to
be comparable.
2023-12-09 15:18:19 +09:00
Yuya Nishihara
9f8831e825 git: unimplement PartialEq on GitExportError
Gitoxide errors don't implement PartialEq, and I don't think it makes sense
to test equality of InternalGitError objects.
2023-12-09 15:18:19 +09:00
Yuya Nishihara
a77eed648b git: have export_refs() obtain git2::Repository instance from store 2023-12-09 15:18:19 +09:00
Yuya Nishihara
0f37027646 index: remove unneeded Any trait bound from MutableIndex
We use .as_any() to downcast to the backend impl instead.
2023-12-08 23:30:35 +09:00
Yuya Nishihara
c197add39b git_backend: do not try to resolve git_target path as working directory path
The git_target path is normalized and managed by jj, so we don't need a
fallback mechanism. Let's make it stricter.
2023-12-07 08:43:49 +09:00
Yuya Nishihara
77c811163f tests: make sure to specify external git repository path including ".git" 2023-12-07 08:43:49 +09:00
Yuya Nishihara
25fcc3e403 workspace: consider .git symlink when generating relative git_target path
Before, an absolute path would be saved in the git_target file if .git is a
symlink. That's not wrong, but seemed a bit weird. Let's consolidate the
behavior across .git file types.
2023-12-05 14:23:59 -08:00
Yuya Nishihara
787fa1340b workspace: remove redundant cloning from init_external_git()
Apparently, I forgot to update it in 1db033504c "repo, workspace: remove
'static lifetime bound from initializer functions."
2023-12-05 14:23:59 -08:00
Yuya Nishihara
899c6375a0 git_backend: don't fully canonicalize .git symlink
Apparently, libgit2 doesn't deduce "core.bare" config from the directory name,
but gitoxide implements it correctly. So we shouldn't blindly canonicalize
the Git repository path. Fortunately, the saved git_target path isn't a fully-
canonicalized form (unless user explicitly sepcified "--git-repo ./.git"), so
we don't need a hack to remap git_target back to the symlink path.

is_colocated_git_workspace() is adjusted since the git_workdir is no longer
resolved from the fully-canonicalized repo path, at least in our code. Still we
have the ".git/.." fallback because test_init_git_colocated_symlink_gitlink()
would otherwise fail. I haven't figured out why, and the test might be actually
wrong compared to the git CLI behavior, but let's not change that for now.

Fixes #2668
2023-12-05 14:23:59 -08:00
Martin von Zweigbergk
1cc271441f gc: implement basic GC for Git backend
This adds an initial `jj util gc` command, which simply calls `git gc`
when using the Git backend. That should already be useful in
non-colocated repos because it's not obvious how to GC (repack) such
repos. In my own jj repo, it shrunk `.jj/repo/store/` from 2.4 GiB to
780 MiB, and `jj log --ignore-working-copy` was sped up from 157 ms to
86 ms.

I haven't added any tests because the functionality depends on having
`git` binary on the PATH, which we don't yet depend on anywhere
else. I think we'll still be able to test much of the future parts of
garbage collection without a `git` binary because the interesting
parts are about manipulating the Git repo before calling `git gc` on
it.
2023-12-03 07:40:12 -08:00
Yuya Nishihara
35f718f212 merged_tree: remove canceling terms prior to resolving file-level conflict
I think this is a variant of the problem fixed by 7fda80fc22 "tree: simplify
conflict before resolving at hunk level." We need to simplify() the conflict
before and after extracting file ids because the source conflict values may
contain trees to be cancelled out, and the file values may differ only in exec
bits. Since the legacy tree passes a simplified conflict in to this function,
I made the merged tree do the same.

Fixes #2654
2023-12-03 07:44:58 +09:00
Yuya Nishihara
4ffbf40c82 merged_tree: do not propagate conflicting empty tree value to parent
Otherwise an empty subtree would be added to the parent tree.

If the stored tree contained an empty subtree, simplify() wouldn't work
against new "absent" subtree representation. I don't know if there's a
such code path, but I believe it's very rare to encounter the problem.

#2654
2023-12-03 07:44:58 +09:00
Yuya Nishihara
1db033504c repo, workspace: remove 'static lifetime bound from initializer functions 2023-12-03 07:44:41 +09:00
Yuya Nishihara
d747879aee signing: pass SigningFn by reference
write_commit() doesn't need ownership of the signing function.
2023-12-01 22:55:04 +09:00
Anton Bulakh
eb1c0ab4a2 sign: Implement a test signing backend and add a few basic tests 2023-11-30 23:36:56 +02:00
Anton Bulakh
d7229a3f90 sign: Define signing backend API and integrate it
Finished everything except actual signing backend implementation(s) and
the UI.
2023-11-30 23:36:56 +02:00
Yuya Nishihara
076b49b610 merged_tree: use merged_tree_entry_diff() in stream version 2023-12-01 00:05:06 +09:00
Yuya Nishihara
97a260b1bf merged_tree: reimplement TreeEntryDiffIterator by using iterator adapter
We don't need a named type anymore.
2023-12-01 00:05:06 +09:00
Yuya Nishihara
fd1c03d037 merged_tree: use sync get_tree() in TreeDiffIterator
This basically backs out the change 1b9a3e27e0 "merged_tree: read before/after
trees concurrently." As we decided to add a separate impl for async access, it
doesn't make sense to read before/after pair in parallel.

The async single_tree() is moved to TreeDiffStreamImpl. It will help remove
the sync version when the performance problem is solved.
2023-12-01 00:05:06 +09:00
Yuya Nishihara
601be0d480 working_copy: narrow file_states recursively while visiting directories
This saves another ~10ms.

Without watchman:
```
% hyperfine --sort command --warmup 3 --runs 20 -L bin jj-1,jj-2 \
"target/release-with-debug/{bin} -R ~/mirrors/linux files ~/mirrors/linux/no-match"
Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux files ~/mirrors/linux/no-match
  Time (mean ± σ):     327.7 ms ±  24.9 ms    [User: 1059.1 ms, System: 654.3 ms]
  Range (min … max):   296.0 ms … 385.4 ms    20 runs

Benchmark 3: target/release-with-debug/jj-2 -R ~/mirrors/linux files ~/mirrors/linux/no-match
  Time (mean ± σ):     311.0 ms ±  24.8 ms    [User: 960.0 ms, System: 643.1 ms]
  Range (min … max):   274.9 ms … 358.5 ms    20 runs
```
2023-11-30 12:09:31 +09:00
Yuya Nishihara
a935a4f70c working_copy: use proto file states without rebuilding BTreeMap
In snapshot(), changed_file_states are received in arbitrary order. For the
other callers, entries are in diff_stream order, so we don't have to sort
them.

With watchman enabled, we can see the cost of sorting the sorted proto entries.
I don't think this is significant, but we can mitigate it by adding
is_file_states_sorted flag to the proto message if needed:
```
% hyperfine --sort command --warmup 3 --runs 20 -L bin jj-0,jj-1 \
"target/release-with-debug/{bin} -R ~/mirrors/linux files ~/mirrors/linux/no-match"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux files ~/mirrors/linux/no-match
  Time (mean ± σ):     164.8 ms ±  16.6 ms    [User: 50.2 ms, System: 111.7 ms]
  Range (min … max):   148.1 ms … 195.0 ms    20 runs

Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux files ~/mirrors/linux/no-match
  Time (mean ± σ):     171.8 ms ±  13.6 ms    [User: 61.7 ms, System: 109.0 ms]
  Range (min … max):   159.5 ms … 192.1 ms    20 runs
```

Without watchman:
```
% hyperfine --sort command --warmup 3 --runs 20 -L bin jj-0,jj-1 \
"target/release-with-debug/{bin} -R ~/mirrors/linux files ~/mirrors/linux/no-match"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux files ~/mirrors/linux/no-match
  Time (mean ± σ):     367.3 ms ±  30.3 ms    [User: 1415.2 ms, System: 633.8 ms]
  Range (min … max):   325.4 ms … 421.7 ms    20 runs

Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux files ~/mirrors/linux/no-match
  Time (mean ± σ):     327.7 ms ±  24.9 ms    [User: 1059.1 ms, System: 654.3 ms]
  Range (min … max):   296.0 ms … 385.4 ms    20 runs
```

I haven't measured snapshotting against dirty working copy, but I don't think
it would be slower than the original implementation.
2023-11-30 12:09:31 +09:00
Yuya Nishihara
fca3690dda working_copy: add file states wrapper that provides map-like API
I'll replace the current lazy loading mechanism with this. Read-only methods
are implemented on the borrowed type so that we can narrow lookup scope
recursively.
2023-11-30 12:09:31 +09:00
Yuya Nishihara
9292af5e52 working_copy: update file states in bulk
This helps migrate BTreeMap<RepoPath, _> to sorted Vec.
2023-11-30 12:09:31 +09:00
Yuya Nishihara
c9150d02fc working_copy: don't look up file state twice while visiting directories 2023-11-30 12:09:31 +09:00
Yuya Nishihara
6ce7bd5338 repo_path: replace .contains() with .starts_with(), flipping the arguments
self.contains(other) means that the self tree contains the other tree (i.e.
the self path is prefix of the other), but it could be confused the other way
around if we were thinking about the path literal, not the tree. Let's add
.starts_with() instead by copying the std::path::Path definition.
2023-11-29 08:41:23 +09:00
Yuya Nishihara
266690a46b repo_path: make strip_prefix() public function returning &RepoPath
There are no external callers, but I think it's useful.
2023-11-29 08:41:23 +09:00
Yuya Nishihara
73690ed54e matchers: clean up .walk_to(dir) to yield &RepoPath instead of iterator 2023-11-29 08:41:23 +09:00
Yuya Nishihara
bc9725c73c working_copy: use RepoPath::parent() which no longer allocates temporary object 2023-11-29 08:41:23 +09:00
Yuya Nishihara
016fc2b5cc repo_path: change .split() and .parent() to return &RepoPath 2023-11-29 08:41:23 +09:00
Yuya Nishihara
28ab9593c3 repo_path: split RepoPath into owned and borrowed types
This enables cheap str-to-RepoPath cast, which is useful when sorting and
filtering a large Vec<(String, _)> list by using matcher for example. It
will also eliminate temporary allocation by repo_path.parent().
2023-11-28 07:33:28 +09:00
Yuya Nishihara
0a1bc2ba42 repo_path: add stub RepoPathBuf type, update callers
Most RepoPath::from_internal_string() callers will be migrated to the function
that returns &RepoPath, and cloning &RepoPath won't work.
2023-11-28 07:33:28 +09:00
Yuya Nishihara
f5938985f0 repo_path: make RepoPath::from_internal_string() accept owned string
I'm going to add borrowed RepoPath type, and most from_internal_string()
callers will be migrated to it. For the remaining callers, it makes more
sense to move the ownership of String to RepoPathBuf.
2023-11-28 07:33:28 +09:00
Yuya Nishihara
d322df0c8d matchers: make Files/PrefixMatcher constructors accept slice of borrowed paths
RepoPath will become slice type (like str), and it doesn't make sense to
require &[RepoPathBuf] here.
2023-11-28 07:33:28 +09:00
Yuya Nishihara
a23bb5b958 matchers: in tests, use alias to RepoPath::from_internal_string()
It looked verbose to fully spell the function name.
2023-11-28 07:33:28 +09:00
Ilya Grigoriev
6aef4bb52e cli rebase: do not allow -r --skip-empty
This follows up on 3967f63 (see that commit's description for more
motivation) and e79c8b6.

In a discussion linked below, it was decided that forbidding `-r --skip-empty`
entirely is preferable to the mixed behavior introduced in 3967f63.

3967f637dc (commitcomment-133539911)
2023-11-27 10:16:36 -08:00
Yuya Nishihara
55f75278bc repo_path: make to_internal_file_string() return &str, rename accordingly 2023-11-27 08:42:09 +09:00
Yuya Nishihara
12d7f8be16 repo_path: turn RepoPath into String wrapper
RepoPath::from_components() is removed since it is no longer a primitive
function.

The components iterator could be implemented on top of str::split(), but
it's not as we'll probably want to add components.as_path() -> &RepoPath.

Tree walking and tree_states map construction get slightly faster thanks to
fewer allocations and/or better cache locality. If we add a borrowed RepoPath
type, we can also implement a cheap &str to &RepoPath conversion on top. Then,
we can get rid of BTreeMap<RepoPath, FileState> construction at all.

Snapshot without watchman:
```
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1 \
"target/release-with-debug/{bin} -R ~/mirrors/linux status"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status
  Time (mean ± σ):     950.1 ms ±  24.9 ms    [User: 1642.4 ms, System: 681.1 ms]
  Range (min … max):   913.8 ms … 990.9 ms    10 runs

Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status
  Time (mean ± σ):     872.1 ms ±  14.5 ms    [User: 1922.3 ms, System: 625.8 ms]
  Range (min … max):   853.2 ms … 895.9 ms    10 runs

Relative speed comparison
        1.09 ±  0.03  target/release-with-debug/jj-0 -R ~/mirrors/linux status
        1.00          target/release-with-debug/jj-1 -R ~/mirrors/linux status
```

Tree walk:
```
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1 \
"target/release-with-debug/{bin} -R ~/mirrors/linux files --ignore-working-copy"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux files --ignore-working-copy
  Time (mean ± σ):     375.3 ms ±  15.4 ms    [User: 223.3 ms, System: 151.8 ms]
  Range (min … max):   359.4 ms … 394.1 ms    10 runs

Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux files --ignore-working-copy
  Time (mean ± σ):     357.1 ms ±  16.2 ms    [User: 214.7 ms, System: 142.6 ms]
  Range (min … max):   341.6 ms … 378.9 ms    10 runs

Relative speed comparison
        1.05 ±  0.06  target/release-with-debug/jj-0 -R ~/mirrors/linux files --ignore-working-copy
        1.00          target/release-with-debug/jj-1 -R ~/mirrors/linux files --ignore-working-copy
```
2023-11-27 08:42:09 +09:00
Yuya Nishihara
974a6870b3 repo_path: make RepoPath::components() return iterator
This allows us to change the backing type from Vec<String> to String.
2023-11-27 08:42:09 +09:00
Yuya Nishihara
aba8c640be repo_path: capture current Vec<String> ordering by tests
The added test would fail if paths were purely ordered by concatenated strings.
I'm not sure if we want to preserve the current ordering, but let's not break
it for the moment.
2023-11-27 08:42:09 +09:00
Ilya Grigoriev
3967f637dc cli rebase: do not allow -r --skip-empty to drop emptied descendants
This follows up on @matts1 's #2609.

We still allow the `-r` commit to become empty. I would be more comfortable if
there was a test for that, but I haven't done that (yet?) and it seems pretty
safe. If that's a problem, I'm happy to forbid `-r --skip-empty` entirely,
since it is far less useful than `-s --skip-empty` or `-b --skip-empty`.

I think it is undesired to abandon emptied descendants. As far as descendants
of `A` are concerned, `jj rebase -r A` should be equivalent to `jj abandon A`,
and `jj abandon` does not remove emptied commits. It also doesn't seem very
useful to do that, since I think descendant commits of an abandoned (or moved
with `-r`) commit only become empty in pathological cases.

Additionally, if we did want -r to empty descendants of `A`, we'd have to add
thorough tests and possibly improve the algorithm. I want to refactor `rebase
-r` and add features to it, and having to consider cases of commits becoming
abandoned makes everything harder.

For example, if we have

```
root -> A -> B -> C
```

and `jj rebase -r A -d C` empties commit `B` (or `C`), I do not know whether
the current algorithm will work correctly. It seems possible that it would, but
that depends on the fact that empty merge commits are not abandoned for
descendants. That seems dangerous to rely on without tests.

I hope (but can't promise) that in the near future, making DescendantRebaser
return more information  should help make it possible to create such
functionality in a more robust way. I am likely to attempt this as part of
implementing `-r --after`.
2023-11-26 10:56:58 -08:00
Yuya Nishihara
59ef3f0023 repo_path: split RepoPathComponent into owned and borrowed types
This is a step towards introducing a borrowed RepoPath type. The current
RepoPath type is inefficient as each component String is usually short. We
could apply short-string optimization, but still each inlined component would
consume 24 bytes just for e.g. "src", and increase the chance of random memory
access. If the owned RepoPath type is backed by String, we can implement cheap
cast from &str to borrowed &RepoPath type.
2023-11-26 18:21:40 +09:00
Yuya Nishihara
f2096da2d6 repo_path: add stub type to introduce borrowed RepoPathComponent type
The current RepoPathComponent will be renamed to RepoPathComponentBuf, and
new str wrapper will be added as RepoPathComponent.
2023-11-26 18:21:40 +09:00
Yuya Nishihara
e14b31a033 repo_path: reject leading slash and empty path components
Leading/trailing slashes would introduce a bit of complexity if we migrate
the backing type from Vec<String> to String. Empty components are okay, but
let's reject them as they are cryptic and invalid.
2023-11-26 18:21:40 +09:00
Yuya Nishihara
755af75c30 repo_path: in tests, use alias to RepoPath::from_internal_string()
It seemed too verbose to spell the full function name in tests.
2023-11-26 18:21:40 +09:00
Yuya Nishihara
b7543f8a08 rewrite: fix check for newly-empty commit in optimized path
'old_base_tree_id == None' means the rebased tree is unchanged, so the commit
shouldn't be considered newly-empty.
2023-11-26 14:42:17 +09:00
Yuya Nishihara
2f93de9299 rewrite: flatten mapping from EmptyBehaviour to desired action
I think this is slightly easier to follow.
2023-11-26 14:42:17 +09:00
Ilya Grigoriev
c32847696d rewrite.rs: rename new_parents to parent_mapping
The function `new_parents` makes sense, but I found the mapping
being named `new_parents` confusing.
2023-11-25 21:36:35 -08:00
Yuya Nishihara
6344cd56b3 repo_path: remove RepoPathJoin trait, just implement join() on the type
I don't think we'll add join() that takes different types.
2023-11-26 07:14:47 +09:00
Yuya Nishihara
d7df2516c5 repo_path: remove RepoPathComponent::string(), use as_str() instead
There are only two callers, and one does further conversion to BString.
2023-11-26 07:14:47 +09:00
Martin von Zweigbergk
6d54afa60e revset: make evaluate_programmatic() optimize expression
It seems generally useful to optimize revset expressions in
`evaluate_programmatic()` so the caller doesn't have to remember to do
it. It should generally be cheap to do so even if it's often not
needed.
2023-11-24 21:13:58 -10:00
Martin von Zweigbergk
550164209c revset: add a RevsetExpression::evaluate_programmatic()
We often resolve a programmatic revset and then immediately evaluate
it. This patch adds a convenience method for those two steps.
2023-11-24 21:13:58 -10:00
Martin von Zweigbergk
f2602f78cf revset: make resolve_programmatic() not return a Result
I think it's always a programming error if `resolve_programmatic()`
returns a `Result`, so it shouldn't have to return a `Result`.
2023-11-24 21:13:58 -10:00
Martin von Zweigbergk
f27f52984e revset: rename resolve() to resolve_programmatic()
`RevsetExpression::resolve()` is meant for programmatically created
expressions. In particular, it may not contain symbols. Let's try to
clarify that by renaming the function and documenting it.
2023-11-24 21:13:58 -10:00
Matt Stark
0a95e20ebe lib: Implement skipping of empty commits 2023-11-24 14:48:06 +11:00
Matt Stark
dc89566039 lib: Create struct RebaseOptions 2023-11-24 14:48:06 +11:00
Anton Bulakh
5c3c0e9f6e sign: Implement generic commit signing on the backend 2023-11-23 22:52:20 +02:00
Anton Bulakh
5ab00e197a backend: Inline gix::Repository::commit_as to prepare for signing
Additional bonus is that this allows us to avoid creating keep refs for the
intermediate commits in the data race preventing loop.
2023-11-23 22:52:20 +02:00
Yuya Nishihara
042d26049c working_copy: lazily construct file_states BTreeMap
While it got faster to build a large BTreeMap<RepoPath, _>, there's still
a measurable cost. Let's eliminate it if watchman is enabled and the working
copy is clean. Perhaps, we should introduce new serialization format that
supports instant loading and lookup, but this hack works for the moment.
I'm not sure if the new tree_state format should be flat (RepoPath, _) list,
or tree like the backend storage btw.

In my "linux" repo (watchman enabled):
    % hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1 \
      "target/release-with-debug/{bin} -R ~/mirrors/linux status"
    Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status
      Time (mean ± σ):     768.9 ms ±  14.2 ms    [User: 630.7 ms, System: 131.2 ms]
      Range (min … max):   742.3 ms … 783.1 ms    10 runs

    Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status
      Time (mean ± σ):     713.0 ms ±  16.8 ms    [User: 587.9 ms, System: 116.2 ms]
      Range (min … max):   681.5 ms … 731.1 ms    10 runs

    Relative speed comparison
            1.08 ±  0.03  target/release-with-debug/jj-0 -R ~/mirrors/linux status
            1.00          target/release-with-debug/jj-1 -R ~/mirrors/linux status
2023-11-23 18:48:14 +09:00
Yuya Nishihara
12cd657837 working_copy: extract file_states_to_proto() helper
Just minimizing the changes in the next commit. As we already have
file_states_from_proto(), it makes sense to extract the "to" function.
2023-11-23 18:48:14 +09:00
Yuya Nishihara
74c4ef32aa fsmonitor: exclude .git and .jj directories from changed files
This ensures that the root fsmonitor_matcher matches nothing if there are no
working-copy changes. The query result can be observed by "jj debug watchman
query-changed-files".

I don't have expertise on watchman query language, but using the watchman API
is probably better than .filter()-ing the result manually.
2023-11-23 18:48:14 +09:00
Yuya Nishihara
1ddcaa43b3 fsmonitor: don't apply prefix matching to paths obtained from watchman
If I understand it, watchman returns changed files and directories, and a
directory change doesn't mean we need to scan all files under the directory.
2023-11-23 10:06:00 +09:00
Yuya Nishihara
767e94f5af fsmonitor: drop unneeded mut from make_fsmonitor_matcher()
We only need &self.working_copy_path here.
2023-11-23 10:06:00 +09:00
Yuya Nishihara
c16c89bc27 fsmonitor: keep paths relative to the workspace root
Since the caller wants repo-relative paths, it doesn't make sense to convert
them back and forth.
2023-11-23 10:06:00 +09:00
Yuya Nishihara
a4f6e0de0b repo_path: extract helper that converts Path to RepoPath literally 2023-11-23 10:06:00 +09:00
Yuya Nishihara
31def4b131 cleanup: don't use debug format to print source errors 2023-11-23 10:05:37 +09:00
Yuya Nishihara
16620e0e4c merged_tree: drop legacy tree handling from ConflictsDirItem constructor
No callers pass in a legacy tree.
2023-11-21 07:45:30 +09:00
Yuya Nishihara
4ad3db2e84 merged_tree: extract value() function of non-legacy trees 2023-11-21 07:45:30 +09:00
Yuya Nishihara
ca3f549c9e merged_tree: remove redundant clone() from ConflictIterator construction 2023-11-21 07:45:30 +09:00
Martin von Zweigbergk
acc35a89a8 merged_tree: inline non-recursive entry iterator 2023-11-19 20:29:40 -10:00
Martin von Zweigbergk
426f6d0cdd merged_tree: inline non-recursive conflict iterator
The abstraction is no longer useful since we made the types not
self-referential.
2023-11-19 20:29:40 -10:00
Yuya Nishihara
5186066cf5 working_copy: simply collect() proto file states into BTreeMap
Suppose the input list is presorted, sorting a sorted vec would be cheaper
than .insert()-ing sorted items one by one.

In my "linux" repo (watchman eanbled):
 - jj-0: baseline
 - jj-1: previous (don't randomize by HashMap)
 - jj-2: this

    % hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1,jj-2 \
        "target/release-with-debug/{bin} -R ~/mirrors/linux status"
    Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status
      Time (mean ± σ):      1.034 s ±  0.020 s    [User: 0.881 s, System: 0.212 s]
      Range (min … max):    1.011 s …  1.068 s    10 runs

    Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status
      Time (mean ± σ):     849.3 ms ±  13.8 ms    [User: 710.7 ms, System: 199.3 ms]
      Range (min … max):   821.7 ms … 870.2 ms    10 runs

    Benchmark 3: target/release-with-debug/jj-2 -R ~/mirrors/linux status
      Time (mean ± σ):     786.2 ms ±  16.7 ms    [User: 650.7 ms, System: 204.1 ms]
      Range (min … max):   760.8 ms … 805.2 ms    10 runs

    Relative speed comparison
            1.32 ±  0.04  target/release-with-debug/jj-0 -R ~/mirrors/linux status
            1.08 ±  0.03  target/release-with-debug/jj-1 -R ~/mirrors/linux status
            1.00          target/release-with-debug/jj-2 -R ~/mirrors/linux status
2023-11-20 08:29:33 +09:00
Yuya Nishihara
ee6a1e2c0a working_copy: don't build intermediate HashMap from proto file states
According to the doc, this is compatible with the map syntax.
https://protobuf.dev/programming-guides/proto3/#maps

This change means that the serialized file states are sorted by RepoPath,
so BTreeMap<RepoPath, _> can be reconstructed with fewer cache misses.

In my "linux" repo (watchman enabled):
 - jj-0: baseline
 - jj-1: this

    % hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1,jj-2 \
      "target/release-with-debug/{bin} -R ~/mirrors/linux status"
    Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status
      Time (mean ± σ):      1.034 s ±  0.020 s    [User: 0.881 s, System: 0.212 s]
      Range (min … max):    1.011 s …  1.068 s    10 runs

    Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status
      Time (mean ± σ):     849.3 ms ±  13.8 ms    [User: 710.7 ms, System: 199.3 ms]
      Range (min … max):   821.7 ms … 870.2 ms    10 runs

    Relative speed comparison
            1.32 ±  0.04  target/release-with-debug/jj-0 -R ~/mirrors/linux status
            1.08 ±  0.03  target/release-with-debug/jj-1 -R ~/mirrors/linux status

Cache-misses got reduced:

    % perf stat -e task-clock,cycles,instructions,cache-references,cache-misses \
      -- ./target/release-with-debug/jj-0 -R ~/mirrors/linux --no-pager status

              1,091.68 msec task-clock                       #    1.032 CPUs utilized
         4,179,596,978      cycles                           #    3.829 GHz
         6,166,231,489      instructions                     #    1.48  insn per cycle
           134,032,047      cache-references                 #  122.776 M/sec
            29,322,707      cache-misses                     #   21.88% of all cache refs

           1.057474164 seconds time elapsed

           0.897042000 seconds user
           0.194819000 seconds sys

    % perf stat -e task-clock,cycles,instructions,cache-references,cache-misses \
      -- ./target/release-with-debug/jj-1 -R ~/mirrors/linux --no-pager status

                927.05 msec task-clock                       #    1.083 CPUs utilized
         3,451,299,198      cycles                           #    3.723 GHz
         6,222,418,272      instructions                     #    1.80  insn per cycle
            98,499,363      cache-references                 #  106.251 M/sec
            11,998,523      cache-misses                     #   12.18% of all cache refs

           0.855938336 seconds time elapsed

           0.720568000 seconds user
           0.207924000 seconds sys
2023-11-20 08:29:33 +09:00
Yuya Nishihara
56047cb7ec working_copy: don't pass all proto data to from_proto() functions
Just a code cleanup. This allows us to consume proto fields if needed.
I also removed redundant .clone() and .as_str().
2023-11-20 08:29:33 +09:00