`CommitRewriter` wraps 3 of the arguments, so I think it makes sense
to pass it instead. More importantly, I hope to continue refactoring
so many of the callers already have a `CommitRewriter`.
It's cheap to look up commits again from the cache in `Store` but it
can be expensive to look up commits we didn't end up needing. This
will make it easier to refactor further and be able to cheaply set
preliminary parents for a rewritten commits and then let the caller
update them.
I'm going to add a helper struct to help with rewriting commits. I
want to make that struct own the old commit and the new parents to
simplify lifetimes. This patch prepares for that by passing the
commits by value to `rebase_commit()`.
I don't think we have any callers left that call
`record_rewritten_commit()` multiple times within a transaction and
expect it to result in divergence. I think we should consider it a bug
to do that.
This removes the special handling of the working-copy commit. By
recording when an empty/emptied commit was abanoned, we rebase
descendants correctly and create a new empty working-copy commit on
top.
I think the conclusion from #2600 is that at least auto-rebasing
should not simplify merge commits that merge a commit with its
ancestor. Let's start by adding an option for that in the library.
This mostly reverts https://github.com/martinvonz/jj/pull/2901 as well as its
fixup https://github.com/martinvonz/jj/pull/2903. The related bug is reopened,
see https://github.com/martinvonz/jj/issues/2869#issuecomment-1920367932.
The problem is that while the fix did fix#2869 in most cases, it did
reintroduce the more severe bug https://github.com/martinvonz/jj/issues/2760
in one case, if the working copy is the commit being rebased.
For example, suppose you have the tree
```
root -> A -> B -> @ (empty) -> C
```
### Before this commit
#### Case 1
`jj rebase -s B -d root --skip-empty` would work perfectly before this
commit, resulting in
```
root -> A
\-------B -> C
\- @ (new, empty)
```
#### Case 2
Unfortunately, if you run `jj rebase -s @ -d A --skip-empty`, you'd have the
following result (before this commit), which shows the reintroduction of #2760:
```
root -> A @ -> C
\-- B
```
with the working copy at `A`. The reason for this is explained in
https://github.com/martinvonz/jj/pull/2901#issuecomment-1920043560.
### After this commit
After this commit, both case 1 and case 2 will be wrong in the sense of #2869,
but it will no longer exhibit the worse bug #2760 in the second case.
Case 1 would result in:
```
root -> A
\-------B -> @ (empty) -> C
```
Case 2 would result in:
```
root -> A -> @ -> C
\-- B
```
with the working copy remaining a descendant of A
Fixes#2760
Given the tree:
```
A-B-C
\
B2
```
And the command `jj rebase -s B -d B2`
We were previously marking B as abandoned, despite the comment stating that we were marking it as being succeeded by B2. This resulted in a call to `rewrite(rewrites={}, abandoned={B})` instead of `rewrite(rewrites={B=>B2}, abandoned={})`, which then made the new parent of `C` into `A` instead of `B2`
This completes the process of removing DescendantRebaser-related APIs from
tests. It requires creating some new test utils and a new
`rebase_descendants_with_option_return_map`.
This removes uses of `DescendantRebaser::new` or
`MutRepo::create_descendant_rebaser` from most tests. The exceptions are the
tests having to do with abandoning empty commits on rebase, since adjusting
those is a bit more elaborate (see follow-up commits).
This requires creating a new public API as a substitute. I took the opportunity
to also add some comments to the
`MutRepo::record_rewritten_commit`/`record_abandoned_commit` functions.
I imade the simplest possible addition to the API; it is not a very elegant
one. Eventually, the entire `record_rewritten_commit` API should probably be
refactored again.
I also added some comments explaining what these functions do.
It seems better to have the caller pass the transaction description
when we finish the transaction than when we start it. That way we have
all the information we want to include more readily available.
Previously, the function relied on both the `self.parent_mapping` and
`self.rebased`. If `(A,B)` was in `parent_mapping` and `(B,C)` was in `rebased`,
`new_parents` would map `A` to `C`.
Now, `self.rebased` is ignored by `new_parents`. In the same situation,
DescendantRebaser is changed so that both `(A,B)` and `(B,C)` are in
`parent_mapping` before. `new_parents` now applies `parent_mapping` repeatedly,
and will map `A` to `C` in this situation.
## Cons
- The semantics are changed; `new_parents` now panics if `self.parent_mapping`
contain cycles. AFAICT, such cycles never happen in `jj` anyway, except for
one test that I had to fix. I think it's a sensible restriction to live with;
if you do want to swap children of two commits, you can call
`rebase_descendants` twice.
## Pros
- I find the new logic much easier to reason about. I plan to extract it into a
function, to be used in refactors for `jj rebase -r` and `jj new --after`. It
will make it much easier to have a correct implementation of `jj rebase -r
--after`, even when rebasing onto a descendant.
- The de-duplication is no longer O(n^2). I tried to keep the common case fast.
## Alternatives
- We could make `jj rebase` and `jj new` use a separate function with the
algorithm shown here, without changing DescendantRebaser. I believe that the new
algorithm makes DescendatRebaser easier to understand, though, and it feels more
elegant to reduce code duplication.
- The de-duplication optimization here is independent of other changes, and
could be used on its own.
This enables cheap str-to-RepoPath cast, which is useful when sorting and
filtering a large Vec<(String, _)> list by using matcher for example. It
will also eliminate temporary allocation by repo_path.parent().
We had similar code in two places for restoring paths from one tree to
another. Let's reuse it instead.
I put the new function in the `rewrite` module. I'm not sure if that's
right place. Maybe it belongs in `tree`?
The state field isn't saved yet. git import/export code paths are migrated,
but new tracking state is always calculated based on git.auto-local-branch
setting. So the tracking state is effectively a global flag.
As we don't know whether the existing remote branches have been merged in to
local branches, we assume that remote branches are "tracking" so long as the
local counterparts exist. This means existing locally-deleted branch won't
be pushed without re-tracking it. I think it's rare to leave locally-deleted
branches for long. For "git.auto-local-branch = false" setup, users might have
to untrack branches if they've manually "merged" remote branches and want to
continue that workflow. I considered using git.auto-local-branch setting in the
migration path, but I don't think that would give a better result. The setting
may be toggled after the branches got merged, and I'm planning to change it
default off for better Git interop.
Implementation-wise, the state enum can be a simple bool. It's enum just
because I originally considered to pack "forgotten" concept into it. I have
no idea which will be better for future extension.
Since set_remote_branch_target() is called while merging refs, its tracking
state shouldn't be reinitialized. The other callers are migrated to new setter
to keep the story simple.
I don't think the backend should matter for any of these tests, so
let's test with only one, and let's make that the strictest one - the
new test backend.
This reduces the number of tests by 74 (from 974 to 900), but saves no
measurable run time.
I don't think there's any reason to use the local backend in tests
instead of using the stricter test backend.
I think we should generally use the test backend in tests and only use
the local backend or git backend when there's a particular reason to
do so (such as in `test_bad_locking` where the on-disk directory
structure matters). But this patch only deals with the simpler cases
where we were only testing with the local backend.
It makes the call sites clearer if we pass the `TestRepoBackend` enum
instead of the boolean `use_git` value. It's also more extensible (I
plan to add another backend for tests).
I think most tests want a `MergedTree`, so this makes `create_tree()`
return that. I kept the old function as `create_single_tree()`. That's
now only used in `test_merge_trees` and `test_merged_tree`.
I also consistently imported the functions now, something I've
considered doing for a long time.
Copied from MergedTree::has_conflict(). I feel it's slightly better since
RefTarget "is" always a Conflict-based type. It could be inverted to
RefTarget::is_resolved(), but refs are usually resolved, and all callers
have special case for !is_resolved() state.
Since RefTarget will be reimplemented on top of Conflict<Option<CommitId>>,
we won't be able to simply return a slice of type &[CommitId]. These functions
are also renamed in order to disambiguate from Conflict::adds()/removes().
It's named after Conflict::from_legacy_form(). If RefTarget is migrated to
new Conflict type, from_legacy_form([], [add]) will create a normal target,
and from_legacy_form([], []) will be equivalent to the current None target.
That's why this function isn't named as RefTarget::conflict().
If RefTarget is migrated to new Conflict type, this function will create
RefTarget(Conflict::resolved(Some(id))).
We still need some .unwrap() to insert Option<RefTarget> into map, but maps
will be changed to store new RefTarget type, and their mutation API will
guarantee that Conflict::resolved(None) is eliminated.