forked from mirrors/jj
jj/lib/benches/diff_bench.rs
Martin von Zweigbergk 1e657c5331 diff: add a histogram(-like?) diff algorithm
The current diff algorithm does a full LCS on the words of the texts,
which is really slow. Diffing the working copy when e.g.
`src/commands.py` has changes far apart takes seconds. This patch adds
an implementation inspired by JGit's Histogram diff. I say "inspired"
because I just didn't quite understand it :P In particular, I didn't
understand what it does when it finds non-unique elements. I decided
to line up the leading common elements on both sides of the merge. I
don't know if that usually gives good enough results in practice.

I'm sure this can still be optimized a lot, but this seems good enough
as a start. There are also many things to improve about the quality of
the diffs.
2021-03-31 22:15:36 -07:00


#![feature(test)]

extern crate test;

use jujube_lib::diff;
use test::Bencher;

/// Returns two identical texts of `count` lines each.
fn unchanged_lines(count: usize) -> (String, String) {
    let mut lines = vec![];
    for i in 0..count {
        lines.push(format!("left line {}\n", i));
    }
    (lines.join(""), lines.join(""))
}

/// Returns two texts of `count` lines each in which every line differs.
fn modified_lines(count: usize) -> (String, String) {
    let mut left_lines = vec![];
    let mut right_lines = vec![];
    for i in 0..count {
        left_lines.push(format!("left line {}\n", i));
        right_lines.push(format!("right line {}\n", i));
    }
    (left_lines.join(""), right_lines.join(""))
}

/// Returns a text of `count` lines and the same lines in reverse order.
fn reversed_lines(count: usize) -> (String, String) {
    let mut left_lines = vec![];
    for i in 0..count {
        left_lines.push(format!("left line {}\n", i));
    }
    let mut right_lines = left_lines.clone();
    right_lines.reverse();
    (left_lines.join(""), right_lines.join(""))
}

#[bench]
fn bench_diff_1k_unchanged_lines(b: &mut Bencher) {
    let (left, right) = unchanged_lines(1000);
    b.iter(|| diff::diff(left.as_bytes(), right.as_bytes()));
}

#[bench]
fn bench_diff_10k_unchanged_lines(b: &mut Bencher) {
    let (left, right) = unchanged_lines(10000);
    b.iter(|| diff::diff(left.as_bytes(), right.as_bytes()));
}

#[bench]
fn bench_diff_1k_modified_lines(b: &mut Bencher) {
    let (left, right) = modified_lines(1000);
    b.iter(|| diff::diff(left.as_bytes(), right.as_bytes()));
}

#[bench]
fn bench_diff_10k_modified_lines(b: &mut Bencher) {
    let (left, right) = modified_lines(10000);
    b.iter(|| diff::diff(left.as_bytes(), right.as_bytes()));
}

#[bench]
fn bench_diff_1k_lines_reversed(b: &mut Bencher) {
    let (left, right) = reversed_lines(1000);
    b.iter(|| diff::diff(left.as_bytes(), right.as_bytes()));
}

#[bench]
fn bench_diff_10k_lines_reversed(b: &mut Bencher) {
    let (left, right) = reversed_lines(10000);
    b.iter(|| diff::diff(left.as_bytes(), right.as_bytes()));
}