I played with max-inline-alternation = 3 for a couple of weeks, and it's pretty
good. I think somewhere between 2 and 4 is good default because one or two
remove + add sequences are easy to parse.
In this patch, I use the number of adds<->removes alternation as a threshold,
which approximates the visual complexity of diff hunks. I don't think user can
choose the threshold intuitively, but we need a config knob to try out some.
I set `max-inline-alternation = 3` locally. 0 and 1 mean "disable inlining"
and "inline adds-only/removes-only lines" respectively.
I've added "diff.<format>" config namespace assuming "ui.diff" will be
reorganized as "ui.diff-formatter" or something. #3327
Some other metrics I've tried:
```
// Per-line alternation. This also works well, but can't measure complexity of
// changes across lines.
fn count_max_diff_alternation_per_line(diff_lines: &[DiffLine]) -> usize {
diff_lines
.iter()
.map(|line| {
let sides = line.hunks.iter().map(|&(side, _)| side);
sides
.filter(|&side| side != DiffLineHunkSide::Both)
.dedup() // omit e.g. left->both->left
.count()
})
.max()
.unwrap_or(0)
}
// Per-line occupancy of changes. Large diffs don't always look complex.
fn max_diff_token_ratio_per_line(diff_lines: &[DiffLine]) -> f32 {
diff_lines
.iter()
.filter_map(|line| {
let [both_len, left_len, right_len] =
line.hunks.iter().fold([0, 0, 0], |mut acc, (side, data)| {
let index = match side {
DiffLineHunkSide::Both => 0,
DiffLineHunkSide::Left => 1,
DiffLineHunkSide::Right => 2,
};
acc[index] += data.len();
acc
});
// left/right-only change is readable
(left_len != 0 && right_len != 0).then(|| {
let diff_len = left_len + right_len;
let total_len = both_len + left_len + right_len;
(diff_len as f32) / (total_len as f32)
})
})
.reduce(f32::max)
.unwrap_or(0.0)
}
// Total occupancy of changes. Large diffs don't always look complex.
fn total_change_ratio(diff_lines: &[DiffLine]) -> f32 {
let (diff_len, total_len) = diff_lines
.iter()
.flat_map(|line| &line.hunks)
.fold((0, 0), |(diff_len, total_len), (side, data)| {
let l = data.len();
match side {
DiffLineHunkSide::Both => (diff_len, total_len + l),
DiffLineHunkSide::Left => (diff_len + l, total_len + l),
DiffLineHunkSide::Right => (diff_len + l, total_len + l),
}
});
(diff_len as f32) / (total_len as f32)
}
```
- add support for copy tracking to `diff --stat`
- switch `--summary` to match git's output more closely
- rework `show_diff_summary` signature to be more consistent
- force each diff command to explicitly enable copy tracking
- enable copy tracking in diff_summary
- post-process for diff iterator
- post-process for diff stream
- update changelog
It's inconvenient that we have to quote glob patterns as 'glob:"*.rs"'. Suppose
filesets are usually specified in shell, it's better to allow unquoted strings
if possible. This change also means we'll probably abandon #2101 "make the
parsing of string arguments stricter."
Note that we can no longer introduce ? operator or [] subscript syntax in
filesets.
Closes#4053
This not only makes the output easier to read, but also protects against
implementation detail changes in `write!` when used with a format
string (especially, how many times and with what strings it calls the
underlying writer).
The output looks somewhat similar to color-words diffs. Unified diffs are
verbose, but are easier to follow if adjacent lines are added/removed + modified
for example.
Word-level diffing is forcibly enabled. We can also add a config knob (or
!color condition) to turn it off to save CPU time.
I originally considered disabling highlights in block insertion/deletion, but
that wasn't always great. This can be addressed separately as it also applies
to color-words diffs. #3958
The last hunk could be truncated instead, but the .peekable() version is easier
to follow. If we truncated lines, we would have to adjust line ranges
accordingly.
It's common to create empty working-copy commits while using jj, and
currently the author timestamp for a commit is only set when it is first
created. If you create an empty commit, then don't work on a repo for a
few days, and then start working on a new feature without abandoning the
working-copy commit, the author timestamp will remain as the time the
commit was created rather than being updated to the time that work began
or finished.
This commit changes the behavior so that discardable commits (empty
commits with no description) by the current user have their author
timestamps reset when they are rewritten, meaning that the author
timestamp will become finalized whenever a commit is given a description
or becomes non-empty.
Mercurial appears to resolve cwd-relative path first, so "glob:*.c" could be
parsed as "**/*.c" if cwd was literally "**". It wouldn't practically matter,
but isn't correct. Instead, jj's parser first splits glob into literal part
and pattern. That's mainly because we want to parse the user input texts into
type-safe objects, and (RepoPathBuf, glob::Pattern) pairs are the simplest
ones. The current parser can't handle patterns like "foo/*/.." (= "foo" ?),
and errors out. I believe this restriction is acceptable.
Unlike literal paths, the 'glob:' pattern anchors to the whole file path. I
don't think "prefix"-matching glob is useful, and making it the default would
be rather confusing.
Maybe we can optimize it to check paths during diffing, but I think it's okay
to add extra lookup cost at the end. The size of the path arguments is usually
small.
Closes#505
It's inconsistent that some warnings have headings and some don't, and it seems
the choice is arbitrary. Let's unify the style. There are two exceptions:
1. continued line following labeled message,
2. "unrecognized response" followed by prompt.
As discussed in #2900, the milliseconds are rarely useful, and it can
be confusing with different timezones because it makes harder to
compare timestamps.
I added an environment variable to control the timestamp in a
cross-platform way. I didn't document because it exists only for tests
(like `JJ_RANDOMNESS_SEED`).
Closes#2900
Before, --tool=:builtin argument was ignored and the tool was loaded from
"ui.diff.tool" option. Since there is no single builtin diff format, :builtin
doesn't make sense here. Maybe we can translate ":<format>" to the internal
diff format instead, but that will also mean "ui.diff.tool" and ".format" can
be merged.
This partially reverts 409356fa5b "merge_tools: enable `:builtin` as default
diff/merge editor."
this greatly speeds up the time to run all tests, at the cost of slightly larger recompile times for individual tests.
this unfortunately adds the requirement that all tests are listed in `runner.rs` for the crate.
to avoid forgetting, i've added a new test that ensures the directory is in sync with the file.
## benchmarks
before this change, recompiling all tests took 32-50 seconds and running a single test took 3.5 seconds:
```
; hyperfine 'touch lib/src/lib.rs && cargo t --test test_working_copy'
Time (mean ± σ): 3.543 s ± 0.168 s [User: 2.597 s, System: 1.262 s]
Range (min … max): 3.400 s … 3.847 s 10 runs
```
after this change, recompiling all tests take 4 seconds:
```
; hyperfine 'touch lib/src/lib.rs ; cargo t --test runner --no-run'
Time (mean ± σ): 4.055 s ± 0.123 s [User: 3.591 s, System: 1.593 s]
Range (min … max): 3.804 s … 4.159 s 10 runs
```
and running a single test takes about the same:
```
; hyperfine 'touch lib/src/lib.rs && cargo t --test runner -- test_working_copy'
Time (mean ± σ): 4.129 s ± 0.120 s [User: 3.636 s, System: 1.593 s]
Range (min … max): 3.933 s … 4.346 s 10 runs
```
about 1.4 seconds of that is the time for the runner, of which .4 is the time for the linker. so
there may be room for further improving the times.
previously, `jj diff` would show the full contents of binary files such as images.
after this change, it instead shows "(binary)". it still shows the filename and metadata so that
users can open the file in the viewer of their choce.
future work could involve showing binary files as Sixel or similar; finding a way to compare large
non-binary files without filling up the screen; or extending the data backends to avoid having to
read the whole file contents into memory.
If the path is too long to fit on the screen, this patch makes it so
we elide the first part of it. It goes a bit further and trims it down
to ~70% of the screen, giving some room for the stat. This seems
somewhat similar to what Git does.
`insta` ignores leading indentation (as long as it's consistent within
the snapshot), but when a test case fails and you let `cargo insta`
update it, it's going to use a specific indentation. There were a few
tests that didn't match that indentation, which could lead to
surprising diffs if the tests fail at some point.
This shows that there's too much padding because we pad based on
number of bytes.
I had to reduce the path names for the file names not to get too long
for my file system.
We can still crash on terminals that are less than 4 characters wide
(maybe it doesn't matter if we do because the user can't tell the
crash report from a diffstat in such a terminal?). This patch fixes
the crash.