If you import Git refs, then rebase a commit pointed to by some Git
ref, and then re-import Git refs, you don't want the old commit to be
made a visible head again. That's particularly annoying when Git refs
are automatically updated by every command.
I recently (0c441d9558) made it so we don't create an operation when
nothing changed. Soon thereafter (94e03f5ac8), I broke that when I
introduced a cache-invalidation bug when I made the filtering-out of
non-heads be lazy. This patch fixes that and also adds a test to
prevent regressions.
This patch adds a place for tracking the current `HEAD` commit in the
underlying Git repo. It updates `git::import_refs()` to record it. We
don't use it anywhere yet.
This is part of #44.
I'm about to change `ReadonlyRepo::load()` to take the path to the
`.jj/` directory, so this patch prepares for that. It already works
because `ReadonlyRepo::load()` will search up the directory tree for
the `.jj/` entry.
`ReadonlyRepo::init_*()` currently calls `WorkingCopy::init()`. In
order to remove that dependency, this patch wraps the
`ReadonlyRepo::init_*()` functions in new `Workspace` functions. A
later patch will have those functions call `WorkspaceCopy::init()`.`
The `Repo` doesn't do anything with the `WorkingCopy` except keeping a
reference to it for its users to use. In fact, the entire lib crate
doesn't do antyhing with the `WorkingCopy`. It therefore seems simpler
to have the users of the crate manage the `WorkingCopy` instance. This
patch does that by letting `Workspace` own it. By not keeping an
instance in `Repo`, which is `Sync`, we can also drop the
`Arc<Mutex<>>` wrapping.
I left `Repo::working_copy()` for convenience for now, but now it
creates a new instance every time. It's only used in tests.
This further decoupling should help us add support for multiple
working copies (#13).
The recent e5dd93cbf7, whose description says "cleanup: make Vec
inside CommitId etc. non-public", made all ID types in the `backend`
module *except* for `CommitId` non-public :P This patch makes
A while ago, I replaced a call to git2-rs's `Remote::fetch()` by calls
to `Remote::download()` and `Remote::update_tips()`. The function is
documented to be a convenience for those function, but it turns out
that the pruning of deleted remote refs is a separate call
(`Remote::prune()`), so we need to call that too.
Since the working copy can now handle conflicts, we don't need to
materialize conflicts when checking out a commit.
Before this patch, we used to create a new commit on top whenever we
checked out a commit with conflicts. That new commit was intended just
for resolving the conflicts. The typical workflow was the resolve the
conflicts and then amend. To use the same workflow after this patch,
one needs to explicitly create a new commit on top with `jj new` after
checking out a commit with conflict.
I realized only recently that we can try to parse conflict markers in
files and leave them as conflicted if they haven't changed. If they
have changed and some conflict markers have been removed, we can even
update the conflict with that partial resolution.
This change teaches the working copy to write conflicts to the working
copy. It used to expect that the caller had already updated the tree
by materializing conflicts. With this change, we also start parsing
the conflict markers and leave the conflicts unresolved in the working
copy if the conflict markers remain.
There are some cases that we don't handle yet. For example, we don't
even try to set the executable bit correctly when we write
conflicts. OTOH, we didn't do that even before this change.
We still never actually write conflicts to the working copy (outside
of tests) because we currently materialize conflicts in
`MutRepo::check_out()`. I'll change that next.
I initially made the working copy materialize conflicts in its
`check_out()` method. Then I changed it later (exactly a year ago, on
Halloween of 2020, actually) so that the working copy expected
conflicts to already have been materalized, which happens in
`MutableRepo::check_out`().
I think my reasoning then was that the file system cannot represent a
conflict. While it's true that the file system itself doesn't have
information to know whether a file represents a conflict, we can
record that ourselves. We already record whether a file is executable
or not and then preserve that if we're on a file system that isn't
able to record it. It's not that different to do the same for
conflicts if we're on a file system that doesn't understand conflicts
(i.e. all file systems).
The plan is to have the working copy remember whether a file
represents a conflict. When we check if it has changed, we parse the
file, including conflict markers, and recreate the conflict from
it. We should be able to do that losslessly (and we should adjust
formats to make it possible if we find cases where it's not).
Having the working copy preserve conflict states has several
advantages:
* Because conflicts are not materialized in the working copy, you can
rebase the conflicted commit and the working copy without causing
more conflicts (that's currently a UX bug I run into every now and
then).
* If you don't change anything in the working copy, it will be
unchanged compared to its parent, which means we'll automatically
abandon it if you update away from it.
* The user can choose to resolve only some of the conflicts in a file
and squash those in, and it'll work they way you'd hope.
* It should make it easier to implement support for external merge
tools (#18) without having them treat the working copy differently.
This patch prepares for that work by adding support for parsing
materialized conflicts.
While working on demos, I noticed that `jj log` output in the
octocat/Hello-World repo was unstable: sometimes the first parent of
the merge was on the left and sometimes it was on the right. This
patch fixes that by sorting the edges by position in the index just
before returning them. It seems that most applications would want
stable output so I put it in the `RevsetGraphIterator` rather than
doing at the call site in the CLI. I ordered them with the reverse
index position rather than forward because it seemed to make the
graphs in the git.git repo slight nicer, with the left-most edge going
between subsequent releases.
There performance difference is within the noise level.
If you rewrite a commit that's also available on some remote, you'll
currently see both the old version and the new version in the view,
which means they're divergent. They're not logically divergent (the
rewritten version should replace the old version), so this is a UX
bug. I think it indicates that the set of current heads should be
redefined to be the *desired* heads. That's also what I had suspected
in the TODO removed by this change. I think another indication that
we should hide the old heads even if they have e.g. a remote branch
pointing to them is that we don't want them to be rebased if we
rewrite an ancestor.
So that's what I decided to do: let the view's heads be the desired
heads. The user can still define revsets for showing non-current
commits pointed to by e.g. remote branches.
This fixes a bug I've run into somewhat frequently. What happens is
that if you have a conflict on top of another conflict and you resolve
the conflict in the bottom commit, we just simplify the `Conflict`
object in the second commit, but we don't try to resolve the new
conflict. That shows up as an unexpected "conflict" in `jj log`
output, and when you check out the commit, there are actually no
conflicts, so you can just `jj squash` right away.
This patch fixes that bug. It also teaches the code to work with more
than 3 parts in the merge, so if there's a 5-way conflict, for
example, we still try to resolve it if possible.
With this change, you can do e.g. `heads(remote_branches())`. That
should currently be the same as `public_heads()`, except that we don't
yet remove public heads when remote branches have been updated. Having
this support should be generally useful, but I may use it in the short
term specifically for depending less on the public heads, until I get
around to keeping them up to date.
It's been a lot of work, but now we're finally able to remove the
`Evolution` state! `jj obslog` still works as before (it just walks
the predecessor pointers).
The removal of hidden heads was just there to help with the transition
away from evolution (#32). Now that we no longer depend on evolution
for removing old heads, we can remove the hack.
This patch teaches `DescendantRebaser` to also update heads. That's
done at the end of the rebase (when `rebase_next()` starts returning
`None`), which is a little weird. We should probably change the
interface, but this will do for now.
With this change, we should no longer need to remove hidden heads when
the transaction commits. That will remove one of the last bits of
dependence on evolution from most commands (#32).
Now that we remove hidden heads whenever a transaction commits,
`non_obsolete_heads()` should always be the same as `all_heads()`,
except during a transaction. I don't think we depend on the difference
even during a transaction. Let's simplify a bit by removing the revset
function `all_heads()` and renaming `non_obsolete_heads()` to
`heads()`. This is part of issue #32.
This is similar to how a recent change taught `DescendantRebaser` to
update branches pointing to rewritten commits. Now we also update the
checkout if it pointed to a rewritten commit.
This patch moves the logic for updating branches from
`update_branches_after_rewrite()` into `DescendantRebaser`. The
branches are now updated along with each rebased commit rather than
all being updated at the end. The new code uses the information about
rewritten and abandoned commits that `DescendantRebaser` gets from
`MutableRepo`. That is different from the old code, which used the
evolution state. This patch thus moves us one step closer to removing
evolution (#32).
I'm going to teach `DescendantRebaser` to also update local branches
pointing to rewritten commits, taking over the responsibility from
`rewrite::update_branches_after_rewrite()`. For commits that have been
rewritten as multiple new commits (divergent, not split), that
function makes local branches pointing to the old commit point to all
the new commits. To replicate that behavior in `DescendantRebaser`, it
needs to know about divergent changes. This change addresses that.
I recently made the CLI remove hidden heads when a transaction is
committed (38474a9). Let's move that to `Transaction::commit()`, so
the library crate becomes more similar to how the CLI behaves and more
similar to our evolution-less future (#32).
The next patch would otherwise make this test fail because
"transaction 2" tries to point a branch to a commit that's not visible
(because it's created by the concurrent "transaction 1").
This is part of removing support for evolution (#32). Since
`CommitBuilder` now records rewritten commits in `MutableRepo`, we can
use that recorded information to automatically rebase descendants.
When we remove support for evolution (#32), we need to still make it
easy for application code to rebase descendants of rewritten and
abandoned commits. The way applications currently do that is by using
e.g. `CommitBuilder::for_rewrite_from()` followed by
`evolve_orphans()`. This patch puts some bookkeeping in `MutableRepo`
for rewritten and abandoned commits, along with a function for
creating a `DescendantRebaser` based on it. I'll then make
`CommitBuilder` record rewritten commits there.
The default branch relies on checking the value of `HEAD`. The `empty_git_commit` function updates the ref `refs/heads/main`, but since `HEAD` was never updated to point to that ref, the default branch can't be determined. The fix is to explicitly set `HEAD`.
Personally, this test failed reliably for me on macOS. I don't know why this behavior would be non-deterministic on other platforms.
It seems it wasn't Windows that behaved differently when it comes
getting the remote's default branch; the test failed on Ubuntu
too.
The documentation for `Remote::default_branch()` says that it can be
called even after the connection has been closed, but let's see if
calling it while the connection is open helps anyway. To do that, we
have to replicate what `Remote::fetch()` does.
Descendants of abandoned commits should be rebased onto their parents,
or the rewritten parents if they had been rewritten. This patch
teaches `DescendantRebaser` to do that. It updates `jj rebase -r` to
use the functionality. I plan to also use it in `jj abandon`
(naturally, given the name), and for rebasing descendants of deleted
refs imported from `jj git refresh/fetch/push`.
The fact that `DescendantRebaser` visits some commits that don't need
to be rebased is mostly an implementation detail. I can't think of a
reason that callers would care about these commits.
The command's help text says "Abandon a revision", which I think is a
good indication that the command's name should be `abandon`. This
patch renames the command and other user-facing occurrences of the
word. The remaining occurrences should be removed when I remove
support for evolution.
This patch moves the function for updating branches after rewrite from
`commands.rs` into `rewrite.rs`.
It also changes the function to update branches even if they were
conflicted or become conflicted. I think that seems better than
leaving branches on old commits. For example, let's say you have start
with this:
```
C main
|
B origin@main
|
A
```
You now pull from origin, which has updated the main branch from B to
B'. We apply that change to both the remote branch and the local
branch, which results in a conflict in the local branch:
```
C main?
|
B B' main? origin@main
|/
A
```
If you now rewrite C to C', the conflicted main branch will still
point to C, which is just weird. This patch changes that so the
conflicted side of main gets repointed to C'.
I also refactored the code to reuse our existing
`MutableRepo::merge_single_ref()`, which improves the behavior in
several cases, such as the conflict-resolution case in the last test
case.
As the updates test case shows, when rebasing forward, we missed
commits that fork off from the section between the source and the
destination.
As part of the fix, I also restructured the code a bit to prepare for
support for rebasing descendants of multiple rewritten commits.
Before this change, you could end up with an index segment with 10
commits, then a child segment with 9 commits, then another child with
8 commits, and so on. That's not what I had intended. This changes
makes it so we squash if a segment has more than half as many commits
as its parent instead.
Git doesn't want `.git` entries in its trees, so at least when using
the Git backend, we need to ignore such paths. Let's just ignore
`.git` paths regardless of backend to keep it simple.
Closes#24.
When I added the function for rebasing descendants, I forgot to call
the existing `rebase()` function and instead simply created a new
commit with the new parents but the old contents.
This should be useful in lots of places. For example, `jj rebase -r`
currently rebases all descendants, because that's what the auto-evolve
feature does. I think it would be nice to instead copy from
Mercurial's `-s` flag for also rebasing descendants. Then `jj rebase
-r` can be made to pull a commit out of a stack, rebasing descendants
onto the rebased commit's parents. I also intend to use this
functionality for rebasing descendants when remote branches have been
rewritten.
The auto-rebasing of descendants doesn't work if you have an open
commit checked out, which means that you may still end up with orphans
in that case (though that's usually a short-lived problem since they
get rebased when you close the commit). I'm also about to make
branches update to successors, but that also doesn't work when the
branch is on a working copy commit that gets rewritten. To fix this
problem, I've decided to let the caller of `WorkingCopy::commit()`
responsible for the transaction.
I expect that some of the code that this change moves from the lib
crate to the cli crate will later move back into the lib crate in some
form.
With this change, we no longer fail if the user moves a branch
sideways or backwards and then push.
The push should ideally only succeed if the remote branch is where we
thought it was (like `git push --force-with-lease`), but that requires
rust-lang/git2-rs#733 to be fixed first.
Now that we have our own representation of branches and tags, let's
update them when we import git refs. The View object's git refs are
now just a record of what the refs are in the underlying git ref last
time we imported them (we don't -- and won't -- provide a way for the
user to update our record of the git refs). We can therefore do a nice
3-way ref-merge using the `refs` module we added recently. That means
that we'll detect conflicts caused by changes made concurrently in the
underlying git repo and in jj's view.
I've finally decided to copy Git's branching model (issue #21), except
that I'm letting the name identify the branch across
remotes. Actually, now that I think about, that makes them more like
Mercurial's "bookmarks". Each branch will record the commit it points
to locally, as well as the commits it points to on each remote (as far
as the repo knows, of course). Those records are effectively the same
thing as Git's "remote-tracking branches"; the difference is that we
consider them the same branch. Consequently, when you pull a new
branch from a remote, we'll create that branch locally.
For example, if you pull branch "main" from a remote called "origin",
that will result in a local branch called "main", and also a record of
the position on the remote, which we'll show as "main@origin" in the
CLI (not part of this commit). If you then update the branch locally
and also pull a new target for it from "origin", the local "main"
branch will be divergent. I plan to make it so that pushing "main"
will update the remote's "main" iff it was currently at "main@origin"
(i.e. like using Git's `git push --force-with-lease`).
This commit adds a place to store information about branches in the
view model. The existing git_refs field will be used as input for the
branch information. For example, we can use it to tell if
"refs/heads/main" has changed and how it has changed. We will then use
that ref diff to update our own record of the "main" branch. That will
come later. In order to let git_refs take a back seat, I've also added
tags (like Git's lightweight tags) to the model in this commit.
I haven't ruled out *also* having some more persistent type of
branches (like Mercurials branches or topics).
I'm about to add some support for branches and tags (for issue #21)
and it seems that we didn't have explicit testing of merging of
views. There was `test_import_refs_merge()` in `test_git.rs` but
that's specifically for git refs. It seems that it's made obsolete by
the tests added by this commit, so I'm removing it.
I had previously created commit messages based only on the ref name,
which meant that `commit4` and `commit5` ended up being the same
commit. This fixes that problem.
There were some tests that discarded a transaction only because it
used to be easier to do that than to commit and reload the repo. We
get the new repo back when we commit the transaction these days, so
now it's often easier to commit the transaction instead.
When there are two concurrent operations, we would resolve conflicting
updates of git refs quite arbitrarily before this change. This change
introduces a new `refs` module with a function for doing a 3-way merge
of ref targets. For example, if both sides moved a ref forward but by
different amounts, we pick the descendant-most target. If we can't
resolve it, we leave it as a conflict. That's fine to do for git refs
because they can be resolved by simply running `jj git refresh` to
import refs again (the underlying git repo is the source of truth).
As with the previous change, I'm doing this now because mostly because
it is a good stepping stone towards branch support (issue #21). We'll
soon use the same 3-way merging for updating the local branch
definition (once we add that) when a branch changes in the git repo or
on a remote.
This adds support for having conflicting git refs in the view, but we
never create conflicts yet. The `git_refs()` revset includes all "add"
sides of any conflicts. Similarly `origin/main` (for example) resolves
to all "adds" if it's conflicted (meaning that `jj co origin/main` and
many other commands will error out if `origin/main` is
conflicted). The `git_refs` template renders the reference for all
"adds" and adds a "?" as suffix for conflicted refs.
The reason I'm adding this now is not because it's high priority on
its own (it's likely extremely uncommon to run two concurrent `jj git
refresh` and *also* update refs in the underlying git repo at the same
time) but because it's a building block for the branch support I've
planned (issue #21).
I don't know why these used to fail. Perhaps it was just that the
GitHub's Windows machines were not powerful to run them with 100
threads doing concurrent commits. Maybe they will pass now that we
limit the number of threads to the number of CPUs. This change enables
the tests so we can see what GitHub CI thinks.