Commit graph

797 commits

Author SHA1 Message Date
Martin von Zweigbergk
6e302bb3a2 op_store: add a virtual root operation, similar to root commit
It seems obvious in hindsight to have a virtual root operation just
like we have a virtual root commit. It removes the same kind of
problems by making sure there's always a common ancestor (or multiple)
between any two commits.

I think the reason I didn't add a root operation from the beginning
was that there used to be a mandatory working-copy commit in the view
(this was before support for multiple workspaces).

Perhaps we should remove the "initialize repo" operation now. The only
difference between their view objects is that the "initialize repo"
operation adds the root commit as a head. We could add that to the
root operation, but then the root operation's value depends on the
commit backend.
2024-01-14 10:15:14 -08:00
Martin von Zweigbergk
c9af8bf43a view: drop tracking of public heads
We've had the public_heads for as long as we've had the View object,
IIRC (I didn't check), but we still don't use it for anything. I don't
have any concrete plans for using it either. Maybe our config for
immutable commits is good enough, or maybe we'll want something more
generic (like Mercurial's phases). For now, I think we should simplify
by removing it the storage for public heads.
2024-01-13 22:23:57 -08:00
Yuya Nishihara
b7eb551cf7 index: fix reindexing to scan all referenced commits such as hidden remote refs
Since hidden commits can be looked up by remote_branches() revset for example,
reindexing should traverse ancestors from all named refs in addition to the
visible heads.
2024-01-12 12:53:16 +09:00
Yuya Nishihara
83ede241e3 op_walk: don't resolve heads beyond @ operation
Since `jj undo --at-op=OP @` resolves @ to OP, I think OP should be the head
in that context, and the descendants of OP shouldn't be accessible by @+.
2024-01-12 08:01:13 +09:00
Yuya Nishihara
d5a98df046 git_backend: teach "format.tree-level-conflicts" config by constructor
Since GitBackend constructors now depend on &UserSettings, it makes sense to
initialize the formatting options there.
2024-01-10 08:57:51 +09:00
Yuya Nishihara
e5286aed08 index: move lifetimed change_id_index() to MutableIndex, rename 'static version
change_id_index() is only used by Readonly/MutableRepo, so we don't need an
abstraction at Index. evaluate_revset() is somewhat similar, but the callers
rely on &dyn Repo.
2024-01-09 10:38:00 +09:00
Yuya Nishihara
dc68f1eeb2 revset: remove unused lifetime parameter from Revset<'index> 2024-01-09 10:37:43 +09:00
Yuya Nishihara
e9d31177cb op_store: implement GC of unreachble operations and views
Since new operations and views may be added concurrently by another process,
there's a risk of data corruption. The keep_newer parameter is a mitigation
for this problem. It's set to preserve files modified within the last 2 weeks,
which is the default of "git gc". Still, a concurrent process may replace an
existing view which is about to be deleted by the gc process, and the view
file would be lost.

#12
2024-01-09 10:37:03 +09:00
Yuya Nishihara
5894f3dfba operation: add shorthand for &store_operation().view_id 2024-01-09 10:37:03 +09:00
Martin von Zweigbergk
c98b0d76af index: move Revset::change_id_index() to Index
We current have `Revset::change_id_index()` for creating a
`ChangeIdIndex` for a given revset. I think it will be hard to make it
performant for general revsets, especially in very large repos and
with custom index implementations, like the one we have at Google. If
we instead restrict it to including all ancestors of a set of heads, I
think it will be much easier to implement. We only use
`Revset::change_id_index()` with revsets including all visible commits
today, so we won't lose any current functionality by making it more
restricted.
2024-01-08 06:06:47 -08:00
Martin von Zweigbergk
2f4594540a tests: move ChangeIdIndex test from test_revset to test_index 2024-01-08 06:06:47 -08:00
Martin von Zweigbergk
1508f28567 tests: update ChangeIdIndex test to include ancestors in set
I plan to replace `Revset::change_id_index()` by
`Index::change_id_index(heads)`, but one of the tests currently uses a
set of commits that does not include ancestors. This patch updates it
to include ancestors (and changes the set of heads to keep the set
small enough for the test).
2024-01-08 06:06:47 -08:00
Martin von Zweigbergk
f0182ad4b8 default_index: adopt revset engine and graph iterator modules
The revset engine and the graph iterator are specific to the default
index implementation, so they belong in the same module.
2024-01-07 05:37:47 -08:00
Yuya Nishihara
a6616e9cea object_id: don't allow ObjectId::from_hex() a dynamically allocated string
This isn't technically needed, but it prevents API misuse. Another option
is to do some compile-time substitution, but most callers are tests and the
runtime performance wouldn't matter.
2024-01-06 00:26:36 +09:00
Yuya Nishihara
95d83cbfe5 object_id: make ObjectId constructors non-trait methods
I'm going to add try_from_hex(), which requires Self: Sized. Such trait bound
could be added, but I don't think we'll need abstracted ObjectId constructors
at all.
2024-01-05 23:36:57 +09:00
Yuya Nishihara
31b236a70d object_id: move HexPrefix and PrefixResolution from index module 2024-01-05 10:20:57 +09:00
Yuya Nishihara
fa5e40719c object_id: extract ObjectId trait and macros to separate module
I'm going to add a prefix resolution method to OpStore, but OpStore is
unrelated to the index. I think ObjectId, HexPrefix, and PrefixResolution can
be extracted to this module.
2024-01-05 10:20:57 +09:00
Yuya Nishihara
e5255135bb op_walk: add function that reparents (and abandons) operation range
This will be used in "jj op abandon ..op_id" command. The "op_id..@" range will
be reparented onto the root operation.

The current implementation is good enough for local repos, but it won't scale.
We might want to extract it as a trait method or introduce OpIndex for
efficient DAG operation.
2024-01-04 11:44:36 +09:00
Matt Stark
3f0a49dafe Ensure you never drop the working commit with --skip-empty
See #2766 for discussions
2024-01-04 13:33:24 +11:00
Matt Stark
a4aed2391f Rewrite instead of abandoning empty commits.
Fixes #2760


Given the tree:
```
A-B-C
 \
  B2
```
And the command `jj rebase -s B -d B2`

We were previously marking B as abandoned, despite the comment stating that we were marking it as being succeeded by B2. This resulted in a call to `rewrite(rewrites={}, abandoned={B})` instead of `rewrite(rewrites={B=>B2}, abandoned={})`, which then made the new parent of `C` into `A` instead of `B2`
2024-01-04 13:33:24 +11:00
Ilya Grigoriev
45cd0bf11b test_rewrite.rs: stop using DescendantRebaser when testing EmptyBehavior
This completes the process of removing DescendantRebaser-related APIs from
tests. It requires creating some new test utils and a new
`rebase_descendants_with_option_return_map`.
2024-01-01 18:51:36 -08:00
Ilya Grigoriev
4461d61254 test_rewrite: test branches of descendants of divergent commits
A TODO left over from a previous PR
2024-01-01 18:51:36 -08:00
Ilya Grigoriev
b2abba07e9 tests: (mostly) stop using soon-to-be-private DescendantRebaser-related APIs
This removes uses of `DescendantRebaser::new` or
`MutRepo::create_descendant_rebaser` from most tests. The exceptions  are the
tests having to do with abandoning empty commits on rebase, since adjusting
those is a bit more elaborate (see follow-up commits).
2024-01-01 18:51:36 -08:00
Yuya Nishihara
3eafca65ea op_walk: add support for op_id+ (children) operator
A possible use case is when doing some archaeology around a certain operation.

The current implementation is quadratic if + is repeated. Suppose op_id is
usually close to the current op heads, I think it'll practically work better
than building a reverse lookup table.
2024-01-02 10:30:08 +09:00
Yuya Nishihara
51691ea22c tests: add lib tests for op id resolution, migrate some from cli
CLI testing is slow and harder to set up crafted environment.
2024-01-02 10:30:08 +09:00
Yuya Nishihara
f9e9058b9b index: show bad operation id if commit lookup failed during reindexing
My jj repo contains such head commits, and "jj debug reindex" fails. To address
this problem, we'll probably need to implement GC, and the user will discard
operations before the first bad op id.
2023-12-29 13:05:58 +09:00
Ilya Grigoriev
1fb9df252b split.rs: stop using DescendantRebaser::new
This requires creating a new public API as a substitute. I took the opportunity
to also add some comments to the
`MutRepo::record_rewritten_commit`/`record_abandoned_commit` functions.

I imade the simplest possible addition to the API; it is not a very elegant
one. Eventually, the entire `record_rewritten_commit` API should probably be
refactored again.

I also added some comments explaining what these functions do.
2023-12-24 19:25:16 -08:00
Yuya Nishihara
b954bab0ca index: fix partial reindexing to not lose commits only reachable from one side
Spotted while adding error propagation there. This wouldn't likely be a real
problem because "jj debug reindex" removes all of the operation links.

The "} else {" condition is removed because it doesn't make sense to exclude
only the exact parent_op_id operation. This can be optimized to not walk
ancestors of the parent_op_id operation, but I don't see a motivation to add
tests covering such scenarios. It's pretty rare that an intermediate operation
link is missing.
2023-12-24 23:31:16 +09:00
Yuya Nishihara
55b4f69fb6 repo: propagate store error from add_heads() 2023-12-24 00:22:30 +09:00
Yuya Nishihara
6971ec239a tests: set git_settings.auto_local_branch where it matters 2023-12-17 08:30:24 +09:00
Yuya Nishihara
d9e8297059 index: add 'static version of evaluate_revset() to ReadonlyIndex
We'll probably need a better abstraction, but a separate method is good
enough to remove unsafe code from ReadonlyRepo.

I'm not sure if this is feasible for the other backends, but I guess there
would be less lifetimed variables than DefaultReadonlyIndex.
2023-12-15 16:10:28 +09:00
Yuya Nishihara
2ba50c76c7 revset: abstract evaluated RevsetImpl over owned/borrowed index types 2023-12-15 16:10:28 +09:00
Yuya Nishihara
72d9cd019b index: extract as_composite() to trait method
The revset engine will accept abstract AsCompositeIndex type, and the
evaluated revset can be 'static if the index is behind Arc<T>.
2023-12-15 16:10:28 +09:00
Martin von Zweigbergk
60fae3114e transaction: take description at end instead of start
It seems better to have the caller pass the transaction description
when we finish the transaction than when we start it. That way we have
all the information we want to include more readily available.
2023-12-13 08:12:49 -08:00
Ilya Grigoriev
316ab8efb8 rewrite.rs: refactor new_parents to depend only on parent_mapping
Previously, the function relied on both the `self.parent_mapping` and
`self.rebased`. If `(A,B)` was in `parent_mapping` and `(B,C)` was in `rebased`,
`new_parents` would map `A` to `C`.

Now, `self.rebased` is ignored by `new_parents`. In the same situation,
DescendantRebaser is changed so that both `(A,B)` and `(B,C)` are in
`parent_mapping` before. `new_parents` now applies `parent_mapping` repeatedly,
and will map `A` to `C` in this situation.

## Cons

- The semantics are changed; `new_parents` now panics if `self.parent_mapping`
  contain cycles. AFAICT, such cycles never happen in `jj` anyway, except for
one test that I had to fix. I think it's a sensible restriction to live with;
if you do want to swap children of two commits, you can call
`rebase_descendants` twice.

## Pros

- I find the new logic much easier to reason about. I plan to extract it into a
function, to be used in refactors for `jj rebase -r` and `jj new --after`. It
will make it much easier to have a correct implementation of `jj rebase -r
--after`, even when rebasing onto a descendant.

- The de-duplication is no longer O(n^2). I tried to keep the common case fast.

## Alternatives

- We could make `jj rebase` and `jj new` use a separate function with the
algorithm shown here, without changing DescendantRebaser. I believe that the new
algorithm makes DescendatRebaser easier to understand, though, and it feels more
elegant to reduce code duplication.

- The de-duplication optimization here is independent of other changes, and
could be used on its own.
2023-12-12 19:35:51 -08:00
Yuya Nishihara
cdcd465c79 index: move default_index_store.rs to sub directory named default_index
default_index_store.rs is relatively big, and it contains types and impls in
arbitrary order. Let's split them into sub modules. After everything moved,
mod.rs will only contain tests.
2023-12-12 08:07:52 +09:00
Yuya Nishihara
c94e1de6d2 index: add DefaultMutableIndex wrapper, move Index impls to it
The wrapper type isn't needed for the mutable layer, but this mirrors the
readonly type structure. Test cases are also migrated to be using the index
wrapper so long as we don't have to care for the nesting of the segment files.
2023-12-10 11:03:07 +09:00
Yuya Nishihara
6c57ba7f21 index: rename ReadonlyIndexWrapper to DefaultReadonlyIndex
This matches the store naming: impl IndexStore for DefaultIndexStore. I also
added minimal doc comment and Debug.
2023-12-09 15:18:36 +09:00
Yuya Nishihara
cee69d1665 tests: remove index downcast helpers called only by as_<type>_composite()
I'm going to rename the impl types, and I don't want to think about the
names of these downcast functions.
2023-12-09 15:18:36 +09:00
Yuya Nishihara
2d76907048 git: unimplement PartialEq on FailedRefExportReason
Gitoxide errors don't implement PartialEq. We could instead stringify the
errors, but there aren't many callers who expect FailedRefExportReason to
be comparable.
2023-12-09 15:18:19 +09:00
Yuya Nishihara
9f8831e825 git: unimplement PartialEq on GitExportError
Gitoxide errors don't implement PartialEq, and I don't think it makes sense
to test equality of InternalGitError objects.
2023-12-09 15:18:19 +09:00
Yuya Nishihara
a77eed648b git: have export_refs() obtain git2::Repository instance from store 2023-12-09 15:18:19 +09:00
Yuya Nishihara
77c811163f tests: make sure to specify external git repository path including ".git" 2023-12-07 08:43:49 +09:00
Yuya Nishihara
35f718f212 merged_tree: remove canceling terms prior to resolving file-level conflict
I think this is a variant of the problem fixed by 7fda80fc22 "tree: simplify
conflict before resolving at hunk level." We need to simplify() the conflict
before and after extracting file ids because the source conflict values may
contain trees to be cancelled out, and the file values may differ only in exec
bits. Since the legacy tree passes a simplified conflict in to this function,
I made the merged tree do the same.

Fixes #2654
2023-12-03 07:44:58 +09:00
Yuya Nishihara
4ffbf40c82 merged_tree: do not propagate conflicting empty tree value to parent
Otherwise an empty subtree would be added to the parent tree.

If the stored tree contained an empty subtree, simplify() wouldn't work
against new "absent" subtree representation. I don't know if there's a
such code path, but I believe it's very rare to encounter the problem.

#2654
2023-12-03 07:44:58 +09:00
Yuya Nishihara
1db033504c repo, workspace: remove 'static lifetime bound from initializer functions 2023-12-03 07:44:41 +09:00
Anton Bulakh
eb1c0ab4a2 sign: Implement a test signing backend and add a few basic tests 2023-11-30 23:36:56 +02:00
Yuya Nishihara
a935a4f70c working_copy: use proto file states without rebuilding BTreeMap
In snapshot(), changed_file_states are received in arbitrary order. For the
other callers, entries are in diff_stream order, so we don't have to sort
them.

With watchman enabled, we can see the cost of sorting the sorted proto entries.
I don't think this is significant, but we can mitigate it by adding
is_file_states_sorted flag to the proto message if needed:
```
% hyperfine --sort command --warmup 3 --runs 20 -L bin jj-0,jj-1 \
"target/release-with-debug/{bin} -R ~/mirrors/linux files ~/mirrors/linux/no-match"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux files ~/mirrors/linux/no-match
  Time (mean ± σ):     164.8 ms ±  16.6 ms    [User: 50.2 ms, System: 111.7 ms]
  Range (min … max):   148.1 ms … 195.0 ms    20 runs

Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux files ~/mirrors/linux/no-match
  Time (mean ± σ):     171.8 ms ±  13.6 ms    [User: 61.7 ms, System: 109.0 ms]
  Range (min … max):   159.5 ms … 192.1 ms    20 runs
```

Without watchman:
```
% hyperfine --sort command --warmup 3 --runs 20 -L bin jj-0,jj-1 \
"target/release-with-debug/{bin} -R ~/mirrors/linux files ~/mirrors/linux/no-match"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux files ~/mirrors/linux/no-match
  Time (mean ± σ):     367.3 ms ±  30.3 ms    [User: 1415.2 ms, System: 633.8 ms]
  Range (min … max):   325.4 ms … 421.7 ms    20 runs

Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux files ~/mirrors/linux/no-match
  Time (mean ± σ):     327.7 ms ±  24.9 ms    [User: 1059.1 ms, System: 654.3 ms]
  Range (min … max):   296.0 ms … 385.4 ms    20 runs
```

I haven't measured snapshotting against dirty working copy, but I don't think
it would be slower than the original implementation.
2023-11-30 12:09:31 +09:00
Yuya Nishihara
28ab9593c3 repo_path: split RepoPath into owned and borrowed types
This enables cheap str-to-RepoPath cast, which is useful when sorting and
filtering a large Vec<(String, _)> list by using matcher for example. It
will also eliminate temporary allocation by repo_path.parent().
2023-11-28 07:33:28 +09:00
Yuya Nishihara
0a1bc2ba42 repo_path: add stub RepoPathBuf type, update callers
Most RepoPath::from_internal_string() callers will be migrated to the function
that returns &RepoPath, and cloning &RepoPath won't work.
2023-11-28 07:33:28 +09:00