Commit graph

3124 commits

Author SHA1 Message Date
Yuya Nishihara
085e17e1cc files: micro-optimzie DiffLine::reset_line() to not clone hunks 2024-08-16 09:30:30 +09:00
Yuya Nishihara
a973c7b0ea files: replace precomputed has_left/right_content flags with functions
I don't think the iteration cost would matter here, and it doesn't make sense
that has_left/right_content are cached whereas is_unmodified() isn't.
2024-08-16 09:30:30 +09:00
Yuya Nishihara
cca5277184 diff: clarify that DiffLine hunk doesn't have [left, right] diff pair
This will simplify users of line.hunks[] which I'm going to add.
2024-08-16 09:30:30 +09:00
Yuya Nishihara
ac27365290 files: ensure that DiffLineIterator hunk has exactly two inputs
I've made the constructor public, so let's add more sanity checks.
2024-08-16 09:30:30 +09:00
Matt Kulukundis
95e8dd51eb copy-tracking: add support for diff --git 2024-08-15 11:03:39 -04:00
Yuya Nishihara
78c0128ec3 files: make DiffLineIterator accept generic DiffLine iterator
I'm thinking of adding some heuristics to render hunks containing lots of
small word changes differently, in a similar manner to the unified diffs. This
patch might help add some pre/post-processing at consumer.

files::diff() is inlined to caller to get around 'self borrowing.
2024-08-15 20:06:12 +09:00
Yuya Nishihara
f85792288f files: replace Vec + index access with iterator in DiffLineIterator 2024-08-15 20:06:12 +09:00
Yuya Nishihara
54f5c01eae files: use imported DiffHunk type in DiffLineIterator 2024-08-15 20:06:12 +09:00
Yuya Nishihara
a62c8776e8 diff: move empty content optimization from diff() to Diff::for_tokenizer()
unchanged_ranges() already has the fast path for empty content, but we can
also disable tokenization.
2024-08-15 20:06:12 +09:00
Yuya Nishihara
73e4daf5ce tests: add more empty content diff samples 2024-08-15 20:06:12 +09:00
Benjamin Tan
ab604b4ecd rewrite::move_commits(): preserve order of parent commits
When rebasing a new child commit on top of the moved commit(s), the
order of the new child commit's parent commits is now correctly
preserved if the original parent commit is now a parent of the moved
commit(s).

Closes #3969.
2024-08-15 17:51:03 +08:00
Matt Kulukundis
b67b774ba7 fix: small clippy warning 2024-08-14 10:21:28 -04:00
Matt Kulukundis
ec99a17ae8 copy-tracking: improve --summary and add --stat
- add support for copy tracking to `diff --stat`
- switch `--summary` to match git's output more closely
- rework `show_diff_summary` signature to be more consistent
2024-08-13 21:37:45 -04:00
Aaron Bull Schaefer
e803bed845 config: expand tilde in ssh key filepaths
Add home directory expansion for SSH key filepaths. This allows the
`signing.key` configuration value to work more universally across both
Linux and macOS without requiring an absolute path.

This moved and renamed the previous `expand_git_path` function to a more
generic location, and the prior use was updated accordingly.
2024-08-13 08:06:43 -07:00
Yuya Nishihara
a609580204 revset: avoid merging whole parent trees by file()/diff_contains() query
Perhaps, we should also cache merged trees, but this patch saves time until
we implement the bookkeeping. Even if we had a cache, it wouldn't be ideal to
calculate uncached merged trees during revset evaluation.

```
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-1,jj-2 \
  'target/release-with-debug/{bin} -R ~/mirrors/git --ignore-working-copy \
   log -r "::@ & file(root:builtin)" --no-graph -n50'
Benchmark 1: target/release-with-debug/jj-1 ..
  Time (mean ± σ):      3.512 s ±  0.014 s    [User: 3.391 s, System: 0.119 s]
  Range (min … max):    3.489 s …  3.528 s    10 runs

Benchmark 2: target/release-with-debug/jj-2 ..
  Time (mean ± σ):      1.351 s ±  0.010 s    [User: 1.275 s, System: 0.074 s]
  Range (min … max):    1.332 s …  1.366 s    10 runs

Relative speed comparison
        2.60 ±  0.02  target/release-with-debug/jj-1 ..
        1.00          target/release-with-debug/jj-2 ..
```
2024-08-13 15:02:24 +09:00
Yuya Nishihara
13f0a2f008 revset: inline materialized_diff_stream() in diff_contains() evaluation function
I'll add conflict resolution there.

This change adds more synchronization points, which is probably bad for
concurrency. However, this module is a revset engine for the default index,
so the store backends are supposed to be fast local disks.
2024-08-13 15:02:24 +09:00
Yuya Nishihara
c651930e9a revset: pass valid file paths to diff_contains() error 2024-08-13 15:02:24 +09:00
Yuya Nishihara
145f942d99 merged_tree: add function that resolves file conflicts non-recursively
Conflict resolution is expensive, so I'm going to make file()/diff_contains()
revsets not resolve the whole parent trees.
2024-08-13 15:02:24 +09:00
Yuya Nishihara
a6566832c2 merged_tree: extract file-conflict resolution from merge_tree_values()
I'll add a public function that resolves file conflicts. This function will
take owned MergedTreeValue, and that's why the extracted function returns
None instead of cloning the passed value.
2024-08-13 15:02:24 +09:00
Benjamin Tan
38f6ee8918 cargo: bump git2 to 0.19.0
This includes a bump of `libgit2` to v1.8.1.
2024-08-13 11:47:21 +08:00
Yuya Nishihara
f7377fbbcd merged_tree: replace MergedTreeVal<'_> by Merge<Option<&TreeValue>>
MergedTreeVal was roughly equivalent to Merge<Option<Cow<_>>. As we've dropped
support for the legacy trees, it can be simplified to Merge<Option<&_>>.
2024-08-12 23:01:46 +09:00
Yuya Nishihara
2977900482 merge: move non-consuming Merge<Option<TreeValue>> methods to generic type
The next patch will add .is_tree() callers, and the other methods don't
required owned type.
2024-08-12 23:01:46 +09:00
Yuya Nishihara
8268af9b4f merge: add helper function to match Option<impl Borrow<TreeValue>>
More callers will be added by the next commit.
2024-08-12 23:01:46 +09:00
Yuya Nishihara
accd1e337a merge: add .cloned() method that maps inner Option<&T> to Option<T>
MergedTreeVal::to_merge() will be replaced with this.
2024-08-12 23:01:46 +09:00
Benjamin Tan
e2ab6d4f42 rewrite: migrate move_commits function from rebase command 2024-08-12 21:48:17 +08:00
Benjamin Tan
9c1b627f9b jj_lib: include indexmap as dependency
This is in preparation for shifting of `move_commits` function to
`jj_lib::rewrite`.
2024-08-12 21:48:17 +08:00
Yuya Nishihara
fd52efa0ba merged_tree: leverage Merge<Tree> entries iterator in all_tree_entries() 2024-08-12 10:20:34 +09:00
Yuya Nishihara
88018e84fc merged_tree: micro-optimize Merge<Tree> entries iterator to return &TreeValue
try_resolve_file_conflict() is also updated. It could be a generic function,
but there are only two callers, and the legacy tree one is used only in tests.
2024-08-12 10:20:34 +09:00
Yuya Nishihara
6d6f5990de merged_tree: add merge-join iterator over Merge<Tree> entries
For the same reason as 2cb7e91d "merged_tree: do not re-look up non-conflicting
tree values by name." This appears to bring a similar performance improvement.

I assume this change is/will be covered by test_merged_tree.rs. I considered
adding a few unit tests, but constructing Tree object isn't trivial, and the
iterator implementation is relatively straightforward.
2024-08-12 10:20:34 +09:00
Matt Kulukundis
5911e5c9b2 copy-tracking: Add copy tracking as a post iteration step
- force each diff command to explicitly enable copy tracking
- enable copy tracking in diff_summary
- post-process for diff iterator
- post-process for diff stream
- update changelog
2024-08-11 17:01:45 -04:00
Matt Kulukundis
0349d9ead3 copy-tracking: extract next_impl from next in diff iter/stream 2024-08-11 17:01:45 -04:00
Matt Kulukundis
34b0f87584 copy-tracking: plumb CopyRecordMap through diff method 2024-08-11 17:01:45 -04:00
Matt Kulukundis
6bae5eaf9d copy-tracking: create a MaterializedTreeDiffEntry type 2024-08-11 17:01:45 -04:00
Matt Kulukundis
e123eb21b9 copy-tracking: add source field to TreeDiffEntry
- add the field and make it compile, but don't use it yet
2024-08-11 17:01:45 -04:00
Matt Kulukundis
8e84c60157 copy-tracking: create an explicit TreeDiffEntry struct 2024-08-11 17:01:45 -04:00
Matt Kulukundis
ee6b922144 copy-tracking: create CopyRecordMap and add it to diff summaries 2024-08-11 17:01:45 -04:00
Matt Kulukundis
e667a2b403 copy-tracking: adjust backend signature
- use a single commit instead of an array of them.  This simplifies the
  implementation.  A higher level api can wrap this when an array of
  commits is desired and those semantics are figured out.
- since this API is directly 1-1 on parents, there are no conflicts
- if we introduce a higher level API that handles lists of commits, we
  may need to restore the conflict/resolved distinction, but for now
  simplify
2024-08-11 17:01:45 -04:00
Yuya Nishihara
c9e147c425 merged_tree: allow to postpone resolution of intermediate trees
This allows us to diff trees without fully resolving conflicts:

    let from_tree = merge_no_resolve(..);
    for (path, (from, to)) in from_tree.diff(to_tree, matcher) {
        let from = resolve_conflicts(from);
        if from == to {
            continue; // resolved file may be identical
        ...

I originally considered adding a matcher argument to merge() functions, but the
resulting API looked misleading. If merge() took a matcher, callers might expect
unmatched trees and files were omitted, not left unresolved. It's also slower
than diffing unresolved trees because merge(.., matcher) would have to write
partially resolved trees to the store.

Since "ancestor_tree" isn't resolved by itself, this patch has subtle behavior
change. For example, "jj diff -r9eaef582" in the "git" repository is no longer
empty. I think the new behavior is also technically correct, but I'm not pretty
sure.
2024-08-11 18:23:21 +09:00
Yuya Nishihara
5d141befc2 tests: evaluate file()/diff_contains() revset against merged parents
These tests would fail if trees are compared without resolving file conflicts.
2024-08-11 18:23:21 +09:00
Yuya Nishihara
dac04960f0 rewrite: remove redundant commit_id.clone() from merge_commit_trees*() 2024-08-11 18:23:21 +09:00
Yuya Nishihara
ed1c07e73e tree: fill in valid id to null tree, rename function to empty()
If a null tree were written to the store, GitBackend would crash because of
invalid hash length.
2024-08-11 18:23:21 +09:00
Yuya Nishihara
2cb7e91dc7 merged_tree: do not re-look up non-conflicting tree values by name
While measuring file(path) query, I noticed BTreeMap lookup appears in perf.
It actually has a measurable cost if the history is linear and parent trees
don't have to be merged dynamically. For merge-heavy history, the cost of
tree merges is more significant. I'll address that separately.

```
% hyperfine --sort command --warmup 3 --runs 50 -L bin jj-1,jj-2 \
  'target/release-with-debug/{bin} -R ~/mirrors/git --ignore-working-copy \
   log -r "::trunk() & ~merges() & file(root:builtin)" --no-graph -n100'
Benchmark 1: target/release-with-debug/jj-1 ..
  Time (mean ± σ):     239.7 ms ±   7.1 ms    [User: 192.1 ms, System: 46.5 ms]
  Range (min … max):   222.2 ms … 249.7 ms    50 runs

Benchmark 2: target/release-with-debug/jj-2 ..
  Time (mean ± σ):     201.7 ms ±   6.9 ms    [User: 153.7 ms, System: 46.6 ms]
  Range (min … max):   184.2 ms … 211.1 ms    50 runs

Relative speed comparison
        1.19 ±  0.05  target/release-with-debug/jj-1 ..
        1.00          target/release-with-debug/jj-2 ..
```
2024-08-09 00:17:37 +09:00
Yuya Nishihara
19b62d29ba merged_tree: leverage .to_tree_merge() in TreeDiffIterator 2024-08-08 23:05:37 +09:00
Yuya Nishihara
6fc7cec4a5 merged_tree: make TreeDiffIterator accept trees as &Merge<Tree>
For the same reason as the patch for TreeEntriesIterator. It's probably
better to assume that MergedTree represents the root tree.
2024-08-08 23:05:37 +09:00
Yuya Nishihara
9378adedb7 merged_tree: hold store globally by TreeDiffIterator
Since TreeDiffDirItem is now calculated eagerly, it doesn't make sense to
keep MergedTree in it.
2024-08-08 23:05:37 +09:00
Yuya Nishihara
37c41d0eaf tests: do not pass in commit objects loaded from different store
Otherwise the assertion would fail in the next patch.
2024-08-08 23:05:37 +09:00
Yuya Nishihara
8b72dad095 merged_tree: replace explicit .is_tree() call in TreeEntriesIterator
The value here shouldn't be absent, so .is_tree() is equivalent to
.to_tree_merge().is_some().
2024-08-08 23:05:37 +09:00
Yuya Nishihara
12434b49b8 merged_tree: make TreeEntriesIterator accept trees as &Merge<Tree>
Suppose we add copy information to MergedTree, a MergedTree can be considered
a root tree representation plus global metadata. I think Merge<Tree> is a better
type for sub trees.
2024-08-08 23:05:37 +09:00
Yuya Nishihara
8a3e4ad966 merged_tree: hold store globally by TreeEntriesIterator
Since TreeEntriesDirItem is now calculated eagerly, it doesn't make sense to
keep MergedTree in it.
2024-08-08 23:05:37 +09:00
Martin von Zweigbergk
ec7725064b merged_tree: make MergedTree a struct
I considered making `MergedTree` just a newtype (1-tuple) but I went
with a struct instead because we may want to add copy information in a
separate field in the future.
2024-08-08 05:32:16 -07:00