We don't seem to have any tests that our protection from undetected
changes caused by writes happening right after checkout, so let's add
one. The test case loops 100 times and each iteration fails slightly
more than 80% of the time on my machine (if I remove the protection in
`TreeState::update_file_state()`), so it seems quite good at
triggering the race.
I don't see any reason for these tests to differ depending on backend,
so let's avoid running them twice (there are probably lots of other
tests that we're also running with different backends for little
reason).
Copied from MergedTree::has_conflict(). I feel it's slightly better since
RefTarget "is" always a Conflict-based type. It could be inverted to
RefTarget::is_resolved(), but refs are usually resolved, and all callers
have special case for !is_resolved() state.
With `MergedTree`, we can iterate over conflicts by descending into
only the subdirectories that cannot be trivially resolved. We assume
that the trees have previously been resolved as much as possible, so
we don't attempt to resolve conflicts again.
This adds a function for resolving conflicts that can be automatically
resolved, i.e. like our current `merge_trees()` function. However, the
new function is written to merge an arbitrary number of trees and, in
case of unresolvable conflicts, to produce a `Conflict<TreeId>` as
result instead of writing path-level conflicts to the backend. Like
`merge_trees()`, it still leaves conflicts unresolved at the file
level if any hunks conflict, and it resolves paths that can be
trivially resolved even if there are other paths that do conflict.
In order to store conflicts in the commit, as conflicts between a set
of trees, we want to be able merge those trees on the fly. This
introduces a type for that. It has a `Merge(Conflict(Tree))` variant,
where the individual trees cannot have path-level conflicts. It also
has a `Legacy(Tree)` variant, which does allow path-level conflicts. I
think that should help us with the migration.
Alternatively, we can wrap BTreeMap<String, Option<RefTarget>> to flatten
Option<&Option<..>> internally, but doing that would be tedious. It would
also be unclear if map.remove(name) should construct an absent RefTarget if
the ref doesn't exist.
The next commit will change these maps to store Option<RefTarget> entries, but
None entries will still be omitted from the serialized data. Since ContentHash
should describe the serialized data, relying on the generic ContentHash would
cause future hash conflict where absent RefTarget entries will be preserved.
For example, ([remove], [None, add]) will be serialized as ([remove], [add]),
and deserialized to ([remove], [add, None]). If we add support for lossless
serialization, hash(([remove], [None, add])) should differ from the lossy one.
Summary: Unneeded with the MSRV bump. The other one
will have to wait until Rust 1.72.0
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: Ifb3d862aedd3f2aeb05a86ce76978d4f
Since RefTarget will be reimplemented on top of Conflict<Option<CommitId>>,
we won't be able to simply return a slice of type &[CommitId]. These functions
are also renamed in order to disambiguate from Conflict::adds()/removes().
We saw a failure in `test_merge_views_git_heads()` in the GitHub CI,
but I wasn't able to reproduce it locally. Using the
`commit_transactions()` helper in the test should fix it.
It's named after Conflict::from_legacy_form(). If RefTarget is migrated to
new Conflict type, from_legacy_form([], [add]) will create a normal target,
and from_legacy_form([], []) will be equivalent to the current None target.
That's why this function isn't named as RefTarget::conflict().
If RefTarget is migrated to new Conflict type, this function will create
RefTarget(Conflict::resolved(Some(id))).
We still need some .unwrap() to insert Option<RefTarget> into map, but maps
will be changed to store new RefTarget type, and their mutation API will
guarantee that Conflict::resolved(None) is eliminated.
Perhaps, all view.get_<kind>(name) functions can return reference, but I'm
not willing to change the interface at this point. I'll revisit this after
migrating Option<RefTarget> to new Conflict-based type.
If we migrate RefTarget to new Conflict-based type, it won't store
Conflict<CommitId>, but Conflict<Option<CommitId>>. As the Option will
be internalized, new RefTarget type will also represent an absent target.
The 'target: Option<RefTarget>' argument will be replaced with new RefTarget
type.
I've also renamed the function for consistency with the following changes.
It would be surprising if set_local_branch(name, target) could remove the
branch. I feel the name set_local_branch_target() is less confusing.
The order of conflicted ids slightly changed since Conflict::simplify()
tries to preserve diff pairs. It shouldn't matter so long as the result is
stable.
Git CLI rejects it (though the data model would probably work fine with
"HEAD" branch.) This ensures that HEAD@git in JJ world is not an exported
branch named "HEAD".
Almost everyone calls the project "jj", and there seeems to be
consensus that we should rename the crates. I originally wanted the
crates to be called `jj` and `jj-lib`, but `jj` was already
taken. `jj-cli` is probably at least as good for it anyway.
Once we've published a 0.8.0 under the new names, we'll release 0.7.1
versions under the old names with pointers to the new crates names.
It seemed too verbose to always include @remote branches, so synced remotes
are omitted by default. If the given symbol contained '@', all remote symbols
are populated so that the distance of remote fragment is taken into account.
Since `Conflict`s can represent the resolved state, so
`Conflict<ContentHunk>` can represent the states that we use
`MergeHunk` for. `MergeHunk` does force the user to handle the
resolved case, which may be useful. I suppose one could use the same
argument for making `Conflict` an enum, i.e. if we think that
`MergeHunk`'s two variants are beneficial, then we should consider
making `Conflict` an enum with those two variants.
It's useful to have a more readable `Debug` format for `Vec<u8>`
(`"foo"` is better than `[102, 111, 111]`). It might also make types
in function signatures and elsewhere more readable.
This only parses the fields relevant to us, i.e.:
- name: the stable identifier of the submodule
- path: the path to the submodule in the current commit
- url: the remote we can clone the submodule from
The full list of .gitmodules fields can be found at
https://git-scm.com/docs/gitmodules.
This commit fixes#1305
Before this commit, running `jj init --git-repo=./` in a folder that
does not have a .git would cause jj to panick and leave an unfinished corrupted jj repo.
This commit fixes that by changing the call chain to return an error
instead of calling .unwrap() and panicking. This commit also adds logic to delete the unfinished jj
repository when the git backend initialization failed.
Before this commit, running the above command would result in the following
```
Running `jj/target/debug/jj init --git-repo=./`
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error { code: -3, klass: 2, message: "failed to resolve path '/Users/kevincliao/github/jj/test-repo/.jj/repo/store/../../../.git': No such file or directory" }', lib/src/git_backend.rs:83:75
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```
After this commit, the result is the following and the jj repo is deleted:
```
Running `jj/target/debug/jj init --git-repo=./`
Error: Failed to access the repository: Error: Failed to open git repository: failed to resolve path '/Users/kevincliao/github/jj/test-repo/.jj/repo/store/../../../.git': No such file or directory; class=Os (2); code=NotFound (-3)
```
..and other assorted boilerplate. These are just stubs for now, but now
that we've reserved the `submodule_store` subdirectory, we can start
adding more functionality.
Before we had `conflicts::Conflict`, most of these functions took a
`backend::Conflict`. I think I didn't want to pollute the `backend`
module with this kind of logic, trying to keep it focused on
storage. Now that we have the type in `conflicts`, however, I think it
makes sense to move these functions onto it.
This gets rid of round-trip conversion from queries like "(main..)-". I have
such expression in my default log/disambiguation revset, and the query could
take ~150ms to convert head positions back and forth if the repository had
tons of unmerged commits.
For tree-level conflicts (#1624), I plan to remove `ConflictId`
completely. This commit removes `ConflictId` from
`update_conflict_from_content()` by instead making it take a
`Conflict<Option<TreeValue>>` and return a possibly different such
value.
I made the call site in `working_copy` avoid writing the conflict to
the store if it's unchanged, but I didn't make the same optimization
in `merge_tools` becuase it's much more likely to have changed there.
Use `br@git` instead.
Before, if there is not a local branch `br`, jj tried to resolve
it as a git ref `refs/heads/br`. Unchanged from before, `br` can
still be resolved as a tag `refs/tag/br`.
Currently, if the user modifies a modify/delete conflict, we always
consider the result resolved. That happens because we materialize the
missing side of the conflict as an empty string but when we parse the
conflict, we expect only the number of sides in the input
conflict. For example, if the input is a regular modify/delete
conflict with one remove and one add, the materialized markers will
have one remove and two adds (one of them empty), but when we try to
parse it, we expect one remove and only one add. When we fail to parse
it, we consider it resolved.
This commit fixes the bug by using
`conflicts::Conflict<Option<TreeValue>>` and keeping track of which
sides were supposed to be empty. We could have fixed the bug without
switching to `conflicts::Conflict`, but we want to switch anyway, and
the fix happens naturally when switching.
I made a typo and got something like this:
```
Error: Commit or change id prefix "wl" is ambiguous
```
Since we can tell commit ids from change ids these days, let's make
the error message say which kind of id it is. Changing that also kind
of forced me to make a special error for empty strings. Otherwise we
would have to arbitrarily say that an empty string is a commit id or
change id. A specific error message for empty strings seems helpful,
so that's probably for the better anyway.
We can't get rid of the other "impl Index"es because .as_composite() must
return a real reference type. Maybe we could turn CompositeIndex into an
owned wrapper, but I don't know if that would be worth the effort.
It might sound scary to add public .mutable_index() accessor, but I think
it's okay because immutable MutableIndex reference has no more power than
Index.
This allows us to implement Index for lifetime-bound type such as
CompositeIndex<'_>.
The idea is that .as_composite() is equivalent to .as_index(), but for the
implementation type. I'm going to add "impl Index for CompositeIndex" to
clean up index references passed to revset engine.
Otherwise, "jj init --git-repo ." would create extra table files per commit,
and merge them.
I considered adding an explicit GitBackend method to be called from
git::import_refs(), but the call order matters. The method should be invoked
before calling store.get_commit(..) or mut_repo.add_head(..). Since commits
are likely to be loaded from the head, we can instead make read_commit()
import ancestor metadata at all.
Alternatively, we could make a Git commit hidden until it's inserted into
the extra table. It's rather big change, and I wouldn't like to do that
without thinking more thoroughly.
My first attempt was to fix up corrupted index when merging, but it turned
out to be not easy because the self side may contain corrupted data. It's
also possible that two concurrent commit operations have exactly the same
view state (because change id isn't hashed into commit id), and only the
table heads diverge.
#924
This bug concerns the way `import_refs` that gets called by `fetch` computes
the heads that should be visible after the import.
Previously, the list of such heads was computed *before* local branches were
updated based on changes to the remote branches. So, commits that should have
been abandoned based on this update of the local branches weren't properly
abandoned.
Now, `import_refs` tracks the heads that need to be visible because of some ref
in a mapping keyed by the ref. If the ref moves or is deleted, the
corresponding heads are updated.
Fixes#864
Now that we return the written commit from `write_commit()`, let's
make the timestamps match what was actually written, accounting for
the whole-second precision and the adjustment we do to avoid
collisions.
This has several advantages:
* Makes it possible to downcast to non-Git custom backends (might be
useful at Google, but we haven't needed it yet)
* Lets us access more specific functionality on the `GitBackend`,
making it possible to access the `git2::Repository` without
creating a copy of it.
* Removes the dependency on Git from the backend
In large repos, the unique prefixes can get somewhat long (~6 hex
digits seems typical in the Linux repo), which makes them less useful
for manually entering on the CLI. The user typically cares most about
a small set of commits, so it would be nice to give shorter unique ids
to those. That's what Mercurial enables with its
`experimental.revisions.disambiguatewithin` config. This commit
provides an implementation of that feature in `IdPrefixContext`.
In very large repos, it can also be slow to calculate the unique
prefixes, especially if it involves a request to a server. This
feature becomes much more important in such repos.
When creating `RevsetExpression` programmatically, I think we should
use commit ids instead of symbols in the expression. This commit adds
a check for that by using a `SymbolResolver` that always errors
out.
I would eventually want the `SymbolResolver` to be customizable (in
custom `jj` binaries), so we want to make sure we always use the
customized version of it.
I left `RevsetExpression::resolve()` unchanged. I consider that to be
for programmatically created expressions.
The current behavior was introduced by 20eb9ecec1 "git: don't abandon
HEAD commit when it loses a branch." While the change made HEAD mutation
behavior more consistent with a plain ref operation, HEAD can also move on
checkout, and checkout shouldn't be considered a history rewriting operation.
I'm not saying the new behavior is always correct, but I think it's safer
than losing old HEAD branch. I also think this change will help if we want
to extract HEAD management function from git::import_refs().
Fixes#1042.
The substitution rule and tests are copied from ancestors/parents. The backend
logic will be reimplemented later. For now, it naively repeats children().
It no longer needs to be on the `Index` trait, thereby removing the
last direct use of `IndexEntry` in the trait (it's still used
indirectly in `walk_revs()`).
New ResolvedExpression enum ensures that the evaluation engine doesn't have
to know the symbol resolution details. In this commit, I've moved Filter
and NotIn resolution to resolve_visibility(). Implicit All/VisibleHeads
resolution will be migrated later.
It's tempting to combine resolve_symbols() and resolve_visibility() to get
rid of panic!()s, but the resolution might have to be two passes to first
resolve&collect explicit commit ids, and then substitute "all()" with
"(:visible_heads())|commit_id|..". It's also possible to apply some tree
transformation after symbol resolution.