These flags only apply to line-based diffs. This was easy to implement, and it
still seems useful to highlight whitespace changes (which line diffing may
ignore).
I've added short options only to "diff"-like commands. Their meaning would be
unclear if they were added to deeply-nested commands such as "op log".
Closes #3781
We're likely to use the right (or new) context lines in rendered diffs, but
it's odd that the hunks iterator chooses which context hunk to return. We'll
also need both contents to calculate left/right line numbers.
Since the hunk content types are the same, I also split enum DiffHunk into
a { kind, contents } pair.
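A minimal sketch of what such a { kind, contents } pair could look like. The field types, constructor names, and the `right_content` helper are assumptions made for illustration, not the definitions from the patch:

```rust
/// Whether the hunk's contents agree across all inputs.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DiffHunkKind {
    Matching,
    Different,
}

/// A hunk that carries one content slice per input, whatever its kind.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct DiffHunk<'input> {
    pub kind: DiffHunkKind,
    pub contents: Vec<&'input [u8]>,
}

impl<'input> DiffHunk<'input> {
    pub fn matching(contents: Vec<&'input [u8]>) -> Self {
        DiffHunk { kind: DiffHunkKind::Matching, contents }
    }

    pub fn different(contents: Vec<&'input [u8]>) -> Self {
        DiffHunk { kind: DiffHunkKind::Different, contents }
    }
}

/// Hypothetical helper: the caller, not the iterator, picks the right (new)
/// side. Returns an empty slice if the hunk has no contents.
pub fn right_content<'input>(hunk: &DiffHunk<'input>) -> &'input [u8] {
    hunk.contents.last().copied().unwrap_or_default()
}
```

With this shape, a renderer can read both sides of a matching hunk, for example to count left and right line numbers, and decide for itself which side to print.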
This change made some diff benches slower, maybe because the generated code
becomes slightly worse due to the added abstraction? I'll revisit the
performance problem later. There are a couple of ways to mitigate it.
```
group                             new                     old
-----                             ---                     ---
bench_diff_git_git_read_tree_c    1.02   61.0±0.23µs      1.00   59.7±0.38µs
bench_diff_lines/modified/10k     1.00   41.6±0.24ms      1.02   42.3±0.22ms
bench_diff_lines/modified/1k      1.00    3.8±0.07ms      1.00    3.8±0.03ms
bench_diff_lines/reversed/10k     1.29   23.4±0.20ms      1.00   18.2±0.26ms
bench_diff_lines/reversed/1k      1.05  517.2±5.55µs      1.00  493.7±59.72µs
bench_diff_lines/unchanged/10k    1.00    3.9±0.10ms      1.08    4.2±0.10ms
bench_diff_lines/unchanged/1k     1.01  356.8±2.33µs      1.00  353.7±1.99µs
```
(I don't get stable results on my noisy machine, so the results may vary.)
The added comparison functions correspond to --ignore-all-space and
--ignore-space-change. --ignore-space-at-eol can be combined with the other
flags, so it will have to be implemented as a preprocessing function.
--ignore-blank-lines will also require some change in the tokenizer function.
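As a rough illustration of the two comparison modes, here is a sketch over byte slices. The function names are hypothetical, and it assumes ASCII whitespace only. It also assumes newline-terminated lines, so trailing whitespace collapses together with the newline; whitespace at the end of an unterminated line is not handled.

```rust
/// Yields the bytes of `text` with all ASCII whitespace removed.
fn without_whitespace(text: &[u8]) -> impl Iterator<Item = u8> + '_ {
    text.iter().copied().filter(|b| !b.is_ascii_whitespace())
}

/// Yields the bytes of `text` with each whitespace run collapsed to one space.
fn collapse_whitespace(text: &[u8]) -> impl Iterator<Item = u8> + '_ {
    let mut prev_space = false;
    text.iter().filter_map(move |&b| {
        let is_space = b.is_ascii_whitespace();
        let out = match (is_space, prev_space) {
            (true, true) => None,        // inside a run: drop
            (true, false) => Some(b' '), // start of a run: emit one space
            (false, _) => Some(b),
        };
        prev_space = is_space;
        out
    })
}

/// Comparison in the spirit of --ignore-all-space.
pub fn eq_ignore_all_space(left: &[u8], right: &[u8]) -> bool {
    without_whitespace(left).eq(without_whitespace(right))
}

/// Comparison in the spirit of --ignore-space-change.
pub fn eq_ignore_space_change(left: &[u8], right: &[u8]) -> bool {
    collapse_whitespace(left).eq(collapse_whitespace(right))
}
```

Both functions compare lazily through iterators, so no intermediate buffer is allocated per comparison.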
This could be implemented as a newtype `Wrapper<'a>(&'a [u8])`, but the lifetime
of the wrap function's output couldn't be specified correctly:
    fn diff(left: &[u8], right: &[u8], wrap_fn: F, ..)
    where
        F: for<'a> Fn(&'a [u8]) -> W<'a>, // F::Output<'a> can't be specified
        W: Copy + Eq + Hash
If the wrapper were of `&Wrapper([u8])` type, `Fn(&[u8]) -> &W` works. However,
it means we can no longer set a comparison parameter (such as Regex) dynamically.
Another idea is to add some filter function of `Fn(&[u8]) -> Cow<'_, [u8]>`
type, but I don't think we would want to pay the allocation cost in
hashing/comparison code. `Fn(&[u8]) -> impl Iterator<Item = &[u8]>` might work,
but it would be equally complex.
We'll use the low-level HashTable to customize Eq/Hash without implementing
newtype wrappers.
Unneeded default features are disabled for now. Note that the new default
hasher, foldhash, is released under the Zlib license, which isn't currently
included in the allow list.
Most collection references implement `.into_iter()` or its mutable version,
so it is possible to iterate over the elements without using an explicit
method to do so.
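A small example of the pattern, using only the standard library (the function names are made up for illustration):

```rust
use std::collections::BTreeMap;

/// `for word in words` on a `&[String]` is equivalent to `words.iter()`.
pub fn total_len(words: &[String]) -> usize {
    let mut total = 0;
    for word in words {
        total += word.len();
    }
    total
}

/// `for .. in counts` on a `&mut BTreeMap` is equivalent to
/// `counts.iter_mut()`.
pub fn bump_all(counts: &mut BTreeMap<String, u32>) {
    for (_name, count) in counts {
        *count += 1;
    }
}
```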
I forgot to add the `snapshot.auto-track` config option to `config.md`
when I added it. This patch copies it from `working-copy.md` and
modifies it slightly.
Check whether only the email or only the name is missing from the config and name the missing one specifically, instead of always reporting that both may be missing.
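The check can be sketched as a match over the two emptiness flags. The function name and the message wording below are assumptions, not the text the command actually prints:

```rust
/// Returns a warning naming exactly what is missing, or `None` if both the
/// name and the email are configured. Message text is illustrative only.
pub fn missing_identity_warning(name: &str, email: &str) -> Option<String> {
    let missing = match (name.is_empty(), email.is_empty()) {
        (true, true) => "Name and email",
        (true, false) => "Name",
        (false, true) => "Email",
        (false, false) => return None,
    };
    Some(format!("{missing} not configured."))
}
```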
We have two options to achieve "diff --ignore-*-space":
a. preprocess contents to be diffed, then translate hunk ranges back
b. add hooks to customize eq and hash functions
I originally thought (a) would be easier, but actually, there aren't many
changes needed to implement (b). And (b) should have fewer logic errors.
This patch removes the assumption that each unchanged region has the same
content length. It won't be true if whitespace characters are ignored.
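To make option (b) concrete, here is a sketch of what eq/hash hooks might look like. The trait and type names are hypothetical, and the hasher is the standard library's `DefaultHasher` rather than whatever the diff code uses. The key invariant is that texts the hook considers equal must also hash equally:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical hook for customizing how diff tokens are compared and hashed.
pub trait CompareBytes {
    fn eq(&self, left: &[u8], right: &[u8]) -> bool;
    fn hash<H: Hasher>(&self, text: &[u8], state: &mut H);
}

/// Byte-for-byte comparison.
pub struct CompareExactly;

impl CompareBytes for CompareExactly {
    fn eq(&self, left: &[u8], right: &[u8]) -> bool {
        left == right
    }

    fn hash<H: Hasher>(&self, text: &[u8], state: &mut H) {
        text.hash(state);
    }
}

/// Comparison that skips ASCII whitespace in both eq and hash.
pub struct IgnoreAllSpace;

impl CompareBytes for IgnoreAllSpace {
    fn eq(&self, left: &[u8], right: &[u8]) -> bool {
        let left = left.iter().filter(|b| !b.is_ascii_whitespace());
        let right = right.iter().filter(|b| !b.is_ascii_whitespace());
        left.eq(right)
    }

    fn hash<H: Hasher>(&self, text: &[u8], state: &mut H) {
        // Feed only the bytes that `eq` looks at, so equal texts hash equally.
        for b in text.iter().filter(|b| !b.is_ascii_whitespace()) {
            state.write_u8(*b);
        }
    }
}

/// Hashes `text` through the hook with a fresh `DefaultHasher`.
pub fn hash_with<C: CompareBytes>(comp: &C, text: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    comp.hash(text, &mut hasher);
    hasher.finish()
}
```

Under `IgnoreAllSpace`, `b"a b\n"` and `b"ab\n"` compare equal even though their lengths differ, which is why unchanged regions can no longer be assumed to have the same content length on both sides.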
Since we've moved the default log revset to config/*.toml in 3dab92d2, we don't
have to repeat the default value. It can be queried by "jj config list". I also
split the help paragraphs.
Intersection of unchanged ranges becomes a simple merge-join loop, so I've
removed the existing tests. I also added a fast path for the common 2-way
diffs in which we don't have to build vec![(pos, vec![pos])].
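A merge-join of this kind might look like the following sketch. The function name and the tuple layout are assumptions for illustration: `current` holds `(base_pos, other_positions)` entries, `new` holds `(base_pos, other_pos)` entries, and both are sorted by base position.

```rust
/// Keeps only the entries whose base position appears in both lists, and
/// appends the matching position from `new` to each surviving entry.
pub fn intersect_unchanged(
    current: Vec<(usize, Vec<usize>)>,
    new: &[(usize, usize)],
) -> Vec<(usize, Vec<usize>)> {
    let mut result = Vec::new();
    let mut new_iter = new.iter().copied().peekable();
    for (base_pos, mut other_positions) in current {
        // Skip entries of `new` that sort before the current base position.
        while new_iter.next_if(|&(pos, _)| pos < base_pos).is_some() {}
        // Keep the entry only if `new` has the same base position.
        if let Some((_, other_pos)) = new_iter.next_if(|&(pos, _)| pos == base_pos) {
            other_positions.push(other_pos);
            result.push((base_pos, other_positions));
        }
    }
    result
}
```

Each list is walked once, so the cost is linear in the combined length of the two inputs.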
One source of confusion introduced by this change is that WordPosition means
both global and local indices. This is covered by the added tests, but I might
add separate local/global types later.
It's silly that we build a new Vec for each level of recursion and merge the
elements back. I don't see a measurable performance difference in the diff
bench, but this change will help simplify the next patch. If a result vec were
created for each unchanged_ranges() invocation, it would probably make more
sense to return a list of "local" word positions. Then, callers would have to
translate the returned positions to the caller's local positions.
When `format_short_signature(signature)` is set to `signature.name()`, the author names are not yellow like other signature types (e.g. email and username). When the commit signatures have no colors, they blend in, making it hard to distinguish between signatures and commit messages.
If just `name` were set to `yellow`, like email and username, it would also affect the colorization of branch names, making them yellow even though they are designated as magenta. Setting `author` and `committer` to `yellow` is specific enough to let branches keep their colors while still coloring signature names. This is known to affect signatures in both 'log' and 'show'.