jj/cli
Martin von Zweigbergk fefe07b3c3 diff: consider uncommon words to match only if they have the same count
Patience diff starts by lining up unique elements (e.g. lines) to find
matching segments of the inputs. After that, it refines the
non-matching segments by repeating the process. Histogram diff extends
that: rather than considering only unique elements, it continues with
elements of count 2, then 3, etc.

Before this commit, when diffing "a b a b b" against "a b a b a b", we
would match the two "a"s in the first input against the first two "a"s
in the second input. After this patch, we ignore the "a"s because
their counts differ, so we try to align the "b"s instead.
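
As a rough illustration of the rule described above (not the actual
code in this commit), the sketch below picks the words that may serve
as alignment anchors: those with the same count on both sides,
restricted to the lowest such count. The names `word_counts` and
`anchor_words` are hypothetical and exist only for this example.

```rust
use std::collections::HashMap;

/// Count how many times each word occurs in `words`.
fn word_counts<'a>(words: &[&'a str]) -> HashMap<&'a str, usize> {
    let mut counts = HashMap::new();
    for &word in words {
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

/// Hypothetical helper: return the words that occur the same number of
/// times in both inputs, keeping only those with the lowest such count.
/// The result is sorted so that it is deterministic.
fn anchor_words<'a>(left: &[&'a str], right: &[&'a str]) -> Vec<&'a str> {
    let left_counts = word_counts(left);
    let right_counts = word_counts(right);

    // Words whose counts differ between the two sides are ignored.
    let same: Vec<(&'a str, usize)> = left_counts
        .iter()
        .filter(|(word, count)| right_counts.get(*word) == Some(*count))
        .map(|(word, count)| (*word, *count))
        .collect();

    // Start from the rarest words (count 1 first, then 2, then 3, ...).
    let min_count = match same.iter().map(|&(_, count)| count).min() {
        Some(count) => count,
        None => return Vec::new(),
    };

    let mut anchors: Vec<&'a str> = same
        .into_iter()
        .filter(|&(_, count)| count == min_count)
        .map(|(word, _)| word)
        .collect();
    anchors.sort_unstable();
    anchors
}
```

For the inputs from the example above, "a" occurs 2 times on the left
and 3 times on the right, so it is skipped, while "b" occurs 3 times
on both sides and is the only anchor candidate.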

I have had this commit lying around since I wrote the histogram diff
implementation in 1e657c5331. I vaguely remember thinking that the
way I had implemented it (without this commit) was a bit weird, but I
wasn't sure whether this commit would be an improvement. Today's bug
report from @chooglen, showing a case where we behave differently
from Git, is enough to convince me that we should make this change
after all.

#761
2024-07-09 20:35:36 +09:00
examples copy-tracking: stub get_copy_records 2024-07-03 20:26:30 -04:00
src cli_util: short-prefixes for commit summary in transaction 2024-07-08 08:23:39 -05:00
testing diff: add a file-by-file variant for external diff tools 2024-07-03 20:09:17 -04:00
tests diff: consider uncommon words to match only if they have the same count 2024-07-09 20:35:36 +09:00
build.rs build: update rerun-if conditions to watch .git/HEAD in colocated repo 2023-08-06 12:16:11 +09:00
Cargo.toml windows: avoid UNC paths in run_ui_editor 2024-07-04 11:30:20 +10:00
LICENSE cargo: add LICENSE file to each crate we publish 2023-09-22 21:48:28 -07:00