ok/jj
1
0
Fork 0
forked from mirrors/jj
Commit graph

1893 commits

Author SHA1 Message Date
Martin von Zweigbergk
9138bb5517 working_copy: use MergedTree for current tree when snapshotting
We now have all the pieces in place to read the current tree as a
`MergedTree` when snapshotting the working copy. For now, it's still
always a legacy tree. We'll need to update the working copy state file
to support storing multiple trees before we can create a `MergedTree`
with multiple sides here.
2023-08-15 07:56:55 -07:00
Martin von Zweigbergk
c126e75b2b working_copy: make write_path_to_store() work with merged values
For tree-level conflicts, we're going to be getting
`Merge<Option<TreeValue>>` from the current tree and produce a new
such value if contents changes on disk. This commit gets us a little
closer to that by passing in a value of that type into
`write_path_to_store()`.

This seems to have a small but measurable performance
impact. Snapshotting the working copy in the git repo with all files
`touch`ed went from 2.36 s to 2.43 s (3%). I think that's okay,
especially since most files' mtimes rarely change, and we only pay the
price when it has.
2023-08-15 07:56:55 -07:00
Martin von Zweigbergk
1d55a404cc merged_tree: add path_value() 2023-08-15 07:56:55 -07:00
Martin von Zweigbergk
2238c87da1 merged_tree: import create_tree() in tests to reduce line wrapping 2023-08-15 07:56:55 -07:00
Martin von Zweigbergk
3f97a6da78 working_copy: avoid adding unchanged values to tree builder
If the value at a path hasn't changed, there's no need to send it over
the channel and have the receiver add it to `TreeBuilder`. I couldn't
measure any performance impact.

Now we should no longer send `TreeValue::Conflict` variants over the
tree entry channel.
2023-08-14 23:32:52 -07:00
Martin von Zweigbergk
eacdad3ebd working_copy: move writing of conflicts to receiver side of channel
When writing tree-level conflicts, we're going to be writing multiple
tree (maybe using some new `MergedTreeBuilder`), so we'll need the
full `Merge<Option<TreeValue>>` object. This gets us closer to that by
sending such objects over the channel and having the receiver write
the conflict object.

Note that we still sometimes send `TreeValue::Conflict` variants over
the channel. That only happens if they're unchanged.
2023-08-14 23:32:52 -07:00
Martin von Zweigbergk
03f00bbf30 working_copy: return Merge<Option<TreeValue>> over channel
When writing tree-level conflicts, we won't pass `TreeValue::Conflict`
over the `tree_entries` channel. Instead, we're going to pass possibly
unresolved `Merge<Option<TreeValue>>` instances. This commit prepares
for that by changing the type even though we'll only pass
`Merge::normal()` over the channel at this point.

I did this partly to see what the performance impact is. I tested that
by touching all files in the git.git repo to force the trees (and
files) to be rewritten. There was no measurable impact at all
(best-of-10 time was 2.44 s before and 2.40 s after, but I assume that
was a fluke).
2023-08-14 23:32:52 -07:00
Martin von Zweigbergk
6c5d6d7e39 working_copy: delete duplicate comment
I copied a comment that I should have just moved in 37a770e8b4.
2023-08-14 23:32:52 -07:00
Martin von Zweigbergk
4eadb06251 working_copy: propagate errors from writing conflict parts to store 2023-08-14 23:32:52 -07:00
Yuya Nishihara
6286cde543 index: import commits in chronological order
This basically means that heads in a filtered graph appear in reverse
chronological order. Before, "jj log -r 'tags()'" in linux-stable repo would
look randomly sorted once you ran "jj debug reindex" in it.

With this change, indexing is more like breadth-first search, and BFS is
known to be bad at rendering nice graph (because branches run in parallel.)
However, we have a post process to group topological branches, so we don't
have this problem. For serialization formats like Mercurial's revlog iirc,
BFS leads to bad compression ratio, but our index isn't that kind of data.

Reindexing gets slightly slower, but I think this is negligible.

  (in Git repository)
  % hyperfine --warmup 3 --runs 10 "jj debug reindex --ignore-working-copy"
  (original)
  Time (mean ± σ):      1.521 s ±  0.027 s    [User: 1.307 s, System: 0.211 s]
  Range (min … max):    1.486 s …  1.573 s    10 runs
  (new)
  Time (mean ± σ):      1.568 s ±  0.027 s    [User: 1.368 s, System: 0.197 s]
  Range (min … max):    1.531 s …  1.625 s    10 runs

Another idea is to sort heads chronologically and run DFS-based topological
sorting. It's ad-hoc, but worked surprisingly well for my local repositories.
For repositories with lots of long-running branches, this commit will provide
more predictable result than DFS-based one.
2023-08-15 15:03:45 +09:00
Yuya Nishihara
cc6e9150d5 dag_walk: add topological sort that runs Kahn's algorithm with heap queue
This is a bit more involved than DFS-based implementation, but it allows us
to sort commits chronologically without breaking topological ordering.
2023-08-15 15:03:45 +09:00
Martin von Zweigbergk
f1b817e8ca cleanup: fix warnings from nightly clippy 2023-08-14 22:11:56 -07:00
Martin von Zweigbergk
b16fd3b6b9 conflicts: combine loops for adds/removes in update_from_content()
Similar to the previous commit, now that we can `Merge::iter()`, we
can combine that with `zip()` and simplify.
2023-08-14 08:44:38 -07:00
Martin von Zweigbergk
f45b8052e1 conflicts: check earlier for edited absent part in conflict markers
With the new `Merge::iter()`, we can simplify the code a bit by
combining that with `zip`.

I'll simplify the last part of `update_from_content()` next.
2023-08-14 08:44:38 -07:00
Martin von Zweigbergk
01ac97f999 merge: implement Iterator and FromIterator
Implementing `Iterator` and `FromIterator` on `Merge<T>` provides much
more flexibility than the current `map()`, `try_map()`, etc.

`Merge::from_iter()` wouldn't have a way of failing if it's given an
unexpected (even) number of items. I would be fine with having it
panic, but we can't even usefully do that, because
e.g. `Option::from_iter()` will pass us an iterator ends early if the
input interator ends early. For example,
`Merge::resolved(None).iter().collect()` would call
`Merge::from_iter()` with an empty iterator (first item `None`). So, I
instead created a `MergeBuilder` type implementing `FromIterator`, and
let `MergeBuilder::build()` panic if there were an even number of
items.

I re-implemented some existing `Merge` methods using the new
facilities in this commit. Maybe we should remove some of the methods.
2023-08-14 08:44:38 -07:00
Martin von Zweigbergk
dffe069985 conflicts: remove redundant check of num_sides from condition
Since `Merge` always has one more "adds" than "removes", there's no
need to check both of them. I really should have noticed this in
0b3b62a777.
2023-08-14 08:44:38 -07:00
Yuya Nishihara
7ddced7f3f git: scan new commits all at once from multiple heads
The visiting order is DFS from heads sorted in lexicographical order, but
I plan to change it to chronological order.
2023-08-14 07:48:55 +09:00
Yuya Nishihara
73a4b7f5bf repo: extract add_heads() that can import commits from multiple heads
This allows us to reorder commits to be indexed in bulk.

The incremental update optimization is applied only for a single head. This
could be tried for multiple heads, but it's unlikely that every head has
a single new commit for each.
2023-08-14 07:48:55 +09:00
Yuya Nishihara
157a0e748b git: add separate step to apply HEAD@git change
I'm going to extract a step to import new commits all at once.
2023-08-14 07:48:55 +09:00
Yuya Nishihara
359c871545 git: remove redundant id.clone() from diff_refs_to_import() 2023-08-14 07:48:55 +09:00
Martin von Zweigbergk
e414f3b73c cleanup: use fs:read() instead of File::open().read_to_end() 2023-08-13 14:04:59 +00:00
Martin von Zweigbergk
0b3b62a777 conflicts: remove redundant num_removes argument from parse_conflict()
Merges always have exactly one more "adds" than "removes" these days.
2023-08-13 09:54:16 +00:00
Yuya Nishihara
72271c0d1f repo: micro-optimize add_head() to not instantiate indexed commit object 2023-08-13 18:52:17 +09:00
Yuya Nishihara
15fb8b95b0 index: rewrite topological sort by leveraging dag_walk function
This is similar to what mut_repo.add_head() does.

I'm going to adjust the visiting order so the bulk-imported history preserves
chronological order. It might be a small adjustment on the current DFS
approach, or new function based on Kahn's algorithm. Either way, it's important
that both "jj git import" and "jj debug reindex" use the same underlying
function.
2023-08-13 18:52:17 +09:00
Yuya Nishihara
8652bae925 index: add tracing output to "jj debug reindex" path 2023-08-13 18:52:17 +09:00
Martin von Zweigbergk
f9e0feaaf8 working_copy: return early from write_path_to_store() for non-files
Almost the entire method deals with `FileType::Normal`, so we can
reduce indentation and repeated matching on the file type by doing it
early and returning in the non-normal-file cases.
2023-08-13 01:00:31 +00:00
Martin von Zweigbergk
23f54b8151 working_copy: propagate errors when reading conflicted file 2023-08-13 01:00:31 +00:00
Martin von Zweigbergk
33a93b6d2d working_copy: reduce scope of a content variable
This also avoids reading non-file conflict from disk.
2023-08-13 01:00:31 +00:00
Martin von Zweigbergk
585c212617 working_copy: reduce scope of an executable variable 2023-08-13 01:00:31 +00:00
Martin von Zweigbergk
2102de94b0 working_copy: inline write_conflict_to_store()
For tree-level conflicts, we're eventually not going to have
`ConflictId`. We'd want to make `write_conflict_to_store()` take a
`Merge<Option<TreeValue>>` and return an updated such value. That
would leave very little logic in the function, so let's just inline it
instead.
2023-08-13 01:00:31 +00:00
Martin von Zweigbergk
4c46398b1c conflicts: make update_from_content() write resolved content to store
`update_from_content()` already writes file content for each term of
an unresolved merge, so it seems consistent for it to also write the
file content for resolved merges. I think this should simplify further
refactoring for tree-level conflicts and for preserving the executable
bit.
2023-08-11 23:59:44 +00:00
Martin von Zweigbergk
0b85f06e3d conflicts: make update_from_content() work with only FileIds
Since `update_from_contents()` only works with file contents and not
the executable or other kinds of paths, I think it makes more sense
for it to deal with `FileId`s instead of `TreeValue`s.
2023-08-11 23:59:44 +00:00
Martin von Zweigbergk
94c14d454a tests: levarage the materialize_conflict_string() helper in more places 2023-08-11 23:59:44 +00:00
Martin von Zweigbergk
adf9679d4c tree: inline simplify_conflict()
The function is just a few lines now. I don't think we need the long
documentation in it either since that's now in
docs/technical/conflicts.md.
2023-08-11 21:11:25 +00:00
Martin von Zweigbergk
d4e755b4e4 merged_tree: rename some symbols away from "conflict"
There were still many instances of `conflict` left from before we
renamed `Conflict<T>` to `Merge<T>`. I decided to rename many of them
based on the type parameter instead of the container. I think that
made it more readable in many cases.
2023-08-11 21:11:25 +00:00
Martin von Zweigbergk
a995c66635 merge: move some methods back to conflicts as free functions
I think I moved way too many functions onto `Merge<Option<TreeValue>>`
in 82883e648d. This effectively reverts almost all of that
commit. The `Merge<T>` type is simple container and it seems like it
should be at fairly low level in the dependency graph. By moving
functions off of it, we can get rid of the back-depdencies from the
`merge` module to the `conflict` module that I introduced when I moved
`Merge` to the `merge` module. I'm thinking the `conflict` module can
focus on materialized conflicts.
2023-08-11 21:11:25 +00:00
Yuya Nishihara
925d54614d revset: remove round-trip conversion from heads() evaluation
This wouldn't matter much in practice, but I think it's better to stick to
low-level index primitives during revset evaluation.
2023-08-12 02:16:29 +09:00
Martin von Zweigbergk
d1dbe6de98 git: propagate errors for missing commits when importing refs 2023-08-11 05:06:36 +00:00
Martin von Zweigbergk
abc7312dbc working_copy: avoid an unused variable on Windows 2023-08-11 01:14:52 +00:00
Martin von Zweigbergk
0570963fe3 merge: add a Merge::into_resolved() to avoid cloning
I don't know if this has any measurable impact. It just seems like we
should be able to take a resolved value out of a `Merge` without
clonning.
2023-08-09 21:58:15 +00:00
Martin von Zweigbergk
f7160cf936 merge: add absent() and normal() to Merge<Option<T>>
These mimic the `RefTarget` functions. They're very useful in
`MergedTree`.

I might copy over other helpers from `RefTarget` later.
2023-08-09 21:58:15 +00:00
Yuya Nishihara
530547eb9c tests: test that git::import_refs() can update conflicted remote branch
Per discussion in #2009. This behavior isn't affected by e7e49527ef "git:
ensure that remote branches never diverge", but it's subtle enough to write
a test.
2023-08-10 06:27:16 +09:00
Yuya Nishihara
552c71ed36 tests: move commit_transactions() helper to testutils 2023-08-10 06:27:16 +09:00
Yuya Nishihara
e7e49527ef git: ensure that remote branches never diverge
I was considering how refs would be imported if we had a per-remote view of
named branches (and tags): Each remote has a view, and jj remembers the last
known view state to compute diffs. That's the same for the pseudo "git" remote.
Under the current storage, these view states are represented as follows:

  git_refs["refs/heads/{name}"]             # pseudo "git" remote branches
  git_refs["refs/tags/{name}"]              # pseudo "git" remote tags
  git_refs["refs/remotes/{remote}/{name}"]  # real remote branches

and the diffs are merged in to branches[name].local_target and tags[name].

We also have branches[name].remote_targets[remote], but I think it's redundant
because a tracking branch should also be the last known state, not something
that can diverge from the actual state. To make that clear, this commit
replaces the use of the "merge" API.
2023-08-09 15:22:45 +09:00
Martin von Zweigbergk
1d2324ae5c git: refactor SSH key callbacks to allow multiple keys
This is to prepare for adding support for checking other keys than
just id_rsa.
2023-08-09 03:44:03 +00:00
Benjamin Saunders
75636d626f local_backend: don't reference uninitialized memory 2023-08-08 13:08:26 -07:00
Martin von Zweigbergk
c752b43db1 git: only try to use ssh-agent once per connection
As reported in #1970, SSH authentication would sometimes run into a
loop where it repeatedly tries to use ssh-agent for authentication
without making progess. The problem can be reproduced by simply
removing `$SSH_AUTH_KEY` from your environment (and not having a Git
credentials helper configured, I think).

This seems to be a bug introduced by b104f8e154c21. That commit meant
to make it so we attempt to use ssh-agent and fall back to using
(password-less) keys after that. The problem is that
`git2::Cred::ssh_key_from_agent()` just returns an object that will be
used later for looking up the credentials from ssh-agent, so the call
will not fail because ssh-agent is not reachable.

This commit attempts to fix the problem by having the credentials
callback attempt to use ssh-agent only once.
2023-08-08 07:41:13 +00:00
Ilya Grigoriev
74d9970908 config: Rename push.branch-prefix option to git.push-branch-prefix
This is for consistency with other `git.` options. See also
https://github.com/martinvonz/jj/pull/1962#discussion_r1282605185
2023-08-07 19:10:10 -07:00
Yuya Nishihara
2619200657 refs: rename RefTarget::as_conflict() to as_merge()
Follows up ecc030848d. It's also nice that we have more distinction between
has_conflict() ans as_merge().
2023-08-07 08:05:57 +09:00
Martin von Zweigbergk
b9b285c985 conflicts: move Merge tests to merge module
I missed the tests when I moved the type.
2023-08-06 23:05:21 +00:00