mirrors/jj

mirror of https://github.com/martinvonz/jj.git synced 2024-12-27 14:57:14 +00:00

Author	SHA1	Message	Date
Martin von Zweigbergk	f4a41f3880	trees: make tree diff return an iterator instead of taking a callback This is yet another step towards making it easy to propagate `BrokenPipe` errors. The `jj diff` code (naturally) diffs two trees and prints the diffs. If the printing fails, we shouldn't just crash like we do today. The new code is probably slower since it does more copying (the callback got references to the `FileRepoPath` and `TreeValue`). I hope that won't make a noticeable difference. At least `jj diff -r 334afbc76fbd --summary` didn't seem to get measurably slower.	2021-04-07 23:18:00 -07:00
Martin von Zweigbergk	8b2ce18254	trees: make diff_entries() return an iterator instead of taking a callback The iterator version is easier to use and we get rid of the ugly type parameter for the error type. I also simplified the code by using `Peekable` iterators.	2021-04-07 15:48:11 -07:00
Martin von Zweigbergk	5c10c93e64	diff: fix tests broken by the previous commit Sorry, I forgot to run the automated tests again :(	2021-04-07 11:00:04 -07:00
Martin von Zweigbergk	0dd000d236	diff: do final refinement at byte-level for non-word bytes This results in significantly more readable diffs on commits like `659393bec2` in this repo. Before: test bench_diff_10k_lines_reversed ... bench: 38,122,998 ns/iter (+/- 557,688) test bench_diff_10k_modified_lines ... bench: 32,556,563 ns/iter (+/- 548,114) test bench_diff_10k_unchanged_lines ... bench: 4,231 ns/iter (+/- 15) test bench_diff_1k_lines_reversed ... bench: 958,296 ns/iter (+/- 46,963) test bench_diff_1k_modified_lines ... bench: 3,014,723 ns/iter (+/- 15,830) test bench_diff_1k_unchanged_lines ... bench: 249 ns/iter (+/- 2) test bench_diff_git_git_read_tree_c ... bench: 78,599 ns/iter (+/- 1,079) After: test bench_diff_10k_lines_reversed ... bench: 38,289,493 ns/iter (+/- 413,712) test bench_diff_10k_modified_lines ... bench: 37,352,516 ns/iter (+/- 1,293,950) test bench_diff_10k_unchanged_lines ... bench: 4,238 ns/iter (+/- 13) test bench_diff_1k_lines_reversed ... bench: 967,253 ns/iter (+/- 8,506) test bench_diff_1k_modified_lines ... bench: 3,358,028 ns/iter (+/- 37,154) test bench_diff_1k_unchanged_lines ... bench: 233 ns/iter (+/- 1) test bench_diff_git_git_read_tree_c ... bench: 95,787 ns/iter (+/- 740) So the biggest slowdown is when there are modified lines.	2021-04-07 10:27:17 -07:00
Martin von Zweigbergk	f634ff0e3f	files: make diff() return an iterator instead of using a callback Iterators are generally nicer to work with. My immediate goal is to be able to propagate errors when failing to write to stdout.	2021-04-07 10:07:18 -07:00
Martin von Zweigbergk	d7395cc34a	diff: add copyright header	2021-04-06 21:26:37 -07:00
Martin von Zweigbergk	7e4e43f358	diff: first diff lines, then refine to words, producing better diffs The new diff algorithm produces pretty bad diffs in some cases, such as `cc4b1e9230` in this repo (the parent of this commit). I think the problem there is that many words are repeated over and over. Diffing first at the line level and then refining the diff of the changed ranges at the word level gives much better results. That's what this patch does. After this patch, `jj diff -r cc4b1e923091` looks pretty similar to the diff in GitHub's UI. I hope to get around to doing the same for the merge code soon. Impact on benchmarks: Before: test bench_diff_10k_lines_reversed ... bench: 42,647,532 ns/iter (+/- 765,347) test bench_diff_10k_modified_lines ... bench: 21,407,980 ns/iter (+/- 126,366) test bench_diff_10k_unchanged_lines ... bench: 4,235 ns/iter (+/- 16) test bench_diff_1k_lines_reversed ... bench: 1,190,483 ns/iter (+/- 7,192) test bench_diff_1k_modified_lines ... bench: 1,919,766 ns/iter (+/- 9,665) test bench_diff_1k_unchanged_lines ... bench: 231 ns/iter (+/- 1) test bench_diff_git_git_read_tree_c ... bench: 174,702 ns/iter (+/- 1,199) After: test bench_diff_10k_lines_reversed ... bench: 38,289,509 ns/iter (+/- 129,004) test bench_diff_10k_modified_lines ... bench: 33,140,659 ns/iter (+/- 3,989,339) test bench_diff_10k_unchanged_lines ... bench: 3,099 ns/iter (+/- 14) test bench_diff_1k_lines_reversed ... bench: 973,551 ns/iter (+/- 94,895) test bench_diff_1k_modified_lines ... bench: 3,033,818 ns/iter (+/- 29,513) test bench_diff_1k_unchanged_lines ... bench: 230 ns/iter (+/- 1) test bench_diff_git_git_read_tree_c ... bench: 79,100 ns/iter (+/- 963) So most of them get slower, as expected. The last one, taken from a real diff in the git.git repo, gets faster, however (which is also what I would have expected).	2021-04-04 21:50:31 -07:00
Martin von Zweigbergk	cc4b1e9230	test: fix merge tests to expect line-based merging I made a quite late change in a recent patch to make the merge code merge based on lines instead of words. I forgot to update the tests (and to even run them). Sorry :(	2021-04-01 08:27:27 -07:00
Martin von Zweigbergk	c071d412af	diff: use new diff algorithm for content diff The previous patch switched over the content-merge code to use the new histogram diff code. This patch switches over the content-diff code to use the histogram diff code. As before, the immediate goal is to speed it up. `jj diff -r c28ded83fc` in the git.git repo is a good example of a diff that's extremely slow to calculate with our current LCS-based diff. With this patch, that drops from 35 s to 0.12 s. The diff was slightly better before. I think that's mostly because of our different definition of a "word" in the data. We can improve that later. The speedup we get now is easily worth the slightly worse diff.	2021-03-31 22:22:59 -07:00
Martin von Zweigbergk	3c35dbace6	merge: use new diff algorithm for finding sync regions With the histogram diff code from the previous patch, we can now start using that for finding the "sync regions" in 3-way merge. That helps a lot with the slow merging we had before this patch. `jj diff -r 9d540e9726` in the git.git repo drops from 22 s to 0.15 s with this patch. (That commit is a rather arbitrary merge commit from around 5 years ago.) With the new diff algorithm, the output of `jj diff -r 9d540e9726` in git.git looks better if we find unchanged sync regions based on lines than on words, so that's what I'm using in this patch. That's a change compared to the LCS-based diff we used before this patch. I suspect the reason that finding sync regions based on words works worse now is not because of the change from LCS to histogram but because of the change in how we define a word. My goal right now is mostly to make it faster; I'll get back to refining the diff result later.	2021-03-31 22:16:19 -07:00
Martin von Zweigbergk	1e657c5331	diff: add a histogram(-like?) diff algorithm The current diff algorithm does a full LCS on the words of the texts, which is really slow. Diffing the working copy when e.g. `src/commands.py` has changes far apart takes seconds. This patch adds an implementation inspired by JGit's Histogram diff. I say "inspired" because I just didn't quite understand it :P In particular, I didn't understand what it does when it finds non-unique elements. I decided to line up the leading common elements on both sides of the merge. I don't know if that usually gives good enough results in practice. I'm sure this can still be optimized a lot, but this seems good enough as a start. There are also many things to improve about the quality of the diffs.	2021-03-31 22:15:36 -07:00
Martin von Zweigbergk	998e23db3c	index: add IndexEntry::parents() and predecessors() returning Vec<IndexEntry>	2021-03-31 14:48:03 -07:00
Martin von Zweigbergk	53d1757994	dag_walk: remove unused TopoIter	2021-03-18 16:42:30 -07:00
Martin von Zweigbergk	db4e8bc458	cargo: upgrade to protobuf 2.22.1 to avoid workaround for rustfmt::skip	2021-03-18 13:06:42 -07:00
Martin von Zweigbergk	07c2b2316f	repo: remove obsolete part of a TODO (we use the index to filter out non-heads)	2021-03-17 08:28:21 -07:00
Martin von Zweigbergk	30cd94f842	dag_walk: rename unreachable() to heads() to match name we use in index module	2021-03-16 23:54:51 -07:00
Martin von Zweigbergk	5aec8b9d77	evolution: use index for filtering out ancestors of candidates in new_parent() This speeds up `jj evolve` of 100 linear commits of the "what's cooking" branch in the git.git repo further, from ~700 ms to ~400 ms.	2021-03-16 23:43:44 -07:00
Martin von Zweigbergk	73f20c8696	transaction: delete write_commit() and as_repo_ref() helpers With this patch, the simple delegating helpers are gone from `Transaction`.	2021-03-16 22:45:58 -07:00
Martin von Zweigbergk	f9873c49ec	transaction: remove add_head(), remove_head(), and set_view() helpers	2021-03-16 22:31:28 -07:00
Martin von Zweigbergk	06df609482	transaction: delete check_out() and set_checkout() helpers	2021-03-16 22:31:28 -07:00
Martin von Zweigbergk	808d0af66d	transaction: remove evolution() and store() helpers	2021-03-16 22:31:24 -07:00
Martin von Zweigbergk	16d97ef8c0	transaction: remove index() and view() helpers	2021-03-16 22:05:51 -07:00
Martin von Zweigbergk	5ed14185a0	git: take a MutableRepo instead of a Transaction	2021-03-16 22:05:51 -07:00
Martin von Zweigbergk	769f88bbae	tests: rename test_transaction to test_mut_repo The test doesn't test any logic in the `Transaction` type itself anymore.	2021-03-16 22:05:51 -07:00
Martin von Zweigbergk	2c2b5fb3b7	evolution: take a MutableRepo instead of a Transaction	2021-03-16 22:05:51 -07:00
Martin von Zweigbergk	c3b9d1cd13	rewrite: take a MutableRepo instead of a Transaction	2021-03-16 22:05:51 -07:00
Martin von Zweigbergk	ee8423a69e	MutableRepo: rename `repo` to `base_repo` to clarify its role	2021-03-16 22:05:50 -07:00
Martin von Zweigbergk	69de4698ac	tests: set $HOME in a few tests to avoid depending on developer's ~/.gitignore I just changed my `~/.gitignore` and some tests started failing because the working copy respects the user's `~/.gitignore`. We should probably not depend on `$HOME` in the library crate. For now, this patch just makes sure we set it to an arbitrary directory in the tests where it matters.	2021-03-16 22:05:36 -07:00
Martin von Zweigbergk	67e11e0fc3	git_store: wait 1 minute for lock on refs to help tests `test_commit_parallel` was failing on Mac in the GitHub CI. I suspect the reason was that it was timing out. The test runs in about 1 s on my Linux desktop and in about 3 s on my Mac laptop. It failed after 31 in the GitHub CI. This patch increases the timeout to 1 minute to try to make the test pass. It would be better to set the timeout to a higher value only in tests, but this will be good enough for now. By the way, it has turned out that git notes (at least libgit2's implementation of them) are too slow, so we should probably eventually create our own storage for the extra metadata instead.	2021-03-16 11:28:22 -07:00
Martin von Zweigbergk	81a0e0bd2a	protobuf: upgrade to version 2.22.0 I only noticed that there was a newer version when running `cargo install --path .`, which resulted in warnings about deprecated functions. There's no other reason I'm aware of to upgrade now.	2021-03-15 17:09:29 -07:00
Martin von Zweigbergk	1ebdd4ecf0	MutableRepo: use index when enforcing view invariants We can now finally use the commit index for filtering out ancestors from the sets of heads. I haven't timed the change from most of the recent work on performance, but I did a measurement after this commit. I modified a commit in the git.git repo's "what's cooking" branch (because that's linear). Then I ran `jj evolve` so the 100 commits after it would get evolved. That took ~700ms. `git rebase` of the same 100 commits took ~6s. I also compared `jj op undo` of that `jj evolve` operation. With this patch, that was sped up from ~6.8s to ~125ms.	2021-03-15 16:35:45 -07:00
Martin von Zweigbergk	3ecb4ec16b	MutableRepo: in fast-path for adding head, simply remove parent heads	2021-03-15 15:38:09 -07:00
Martin von Zweigbergk	2c92fca75a	MutableView: don't require whole Commit when CommitId is enough	2021-03-15 15:36:03 -07:00
Martin von Zweigbergk	b4b1de3ddc	view: let MutableRepo enforce view invariants `MutableRepo` has more information needed for taking fast-paths, and it will have to make the same decision for doing incremental updates of the evolution state anyway.	2021-03-15 15:17:36 -07:00
Martin von Zweigbergk	b9fe944e76	view: remove unnecessary removing of parents in add_head() We call `enforce_invariants()` right after removing the parent commits, and that will remove parents anyway.	2021-03-15 15:06:14 -07:00
Martin von Zweigbergk	12a47bd6ed	MutableRepo: don't calculate evolution state only to update it	2021-03-15 15:03:50 -07:00
Martin von Zweigbergk	f0619c07ac	MutableEvolution: make MutableRepo responsible for lazy calculation This patch continues the work from the previous patch. From this patch, we no longer calculate the evolution state just because a transaction starts. We still unnecessarily calculate it when adding a commit within the transaction, however. I'll fix that next.	2021-03-15 15:03:14 -07:00
Martin von Zweigbergk	61acee52f4	ReadonlyEvolution: make ReadonlyRepo responsible for lazy calculation This patch changes it so that `ReadonlyEvolution` does not lazily calculate its state and the caller, i.e. `ReadonlyRepo`, is instead responsible for the laziness. That will allow the caller to make decisions based on whether the state has been calculated. Specifically, we don't want to calculate the evolution state in order to update it incrementally if it hasn't already been calculated. It's better to just leave it uncalculated in that case. As a result of moving the laziness out of `ReadonlyEvolution`, we also don't need the reference to `ReadonlyRepo` anymore, which simplifies things a bunch. The next patch will continue by making the corresponding change to `MutableEvolution`, which will let us simplify even more.	2021-03-15 14:41:27 -07:00
Martin von Zweigbergk	43315bc9d2	git: fix bad formatting from commit `1e9d428406`	2021-03-14 22:28:12 -07:00
Martin von Zweigbergk	91117f36b6	cargo: work around warning in generated protobuf code with new nightly rustc	2021-03-14 22:25:43 -07:00
Martin von Zweigbergk	1e9d428406	git: skip tags pointing to GPG keys and similar when importing refs	2021-03-14 20:14:18 -07:00
Martin von Zweigbergk	429a1ad7ab	git: set authentication callback on fetch as well I guess I had not run `jj git fetch` from GitHub until I tried to fetch the result of PR #6 just now.	2021-03-14 17:18:51 -07:00
Jun Wu	d1d502c062	tests: disable tests failing on Windows This unblocks enabling GitHub CI. I took a quick look at some failures but the causes do not seem obvious to me.	2021-03-14 15:51:32 -07:00
Jun Wu	935da3e13f	lock: treat PermissionDenied on Windows as transient error On Windows it can be PermissionDenied when creating the new file exclusively. This change makes lock_concurrent test pass on Windows.	2021-03-14 15:51:32 -07:00
Jun Wu	eacab648b0	working_copy: clean up ".git" automatically TreeState::write_tree leaves a ".git" file in the working copy. This is undesirable but more problematic on Windows - The second time TreeState::write_tree would panic because Repository::init_opts will fail with a Permission Denied error. This seems to be a libgit2 defect. But for now let's just remove ".git" automatically. This makes `cargo test --test smoke_test` pass on Windows.	2021-03-14 15:49:42 -07:00
Jun Wu	4cd29a2130	working_copy: avoid std::os::unix on Windows std::os::unix::fs::PermissionsExt::mode() does not exist on Windows. Treat files on Windows as regular files.	2021-03-14 15:49:22 -07:00
Martin von Zweigbergk	5631e85502	view: don't enforce invariants in merge_views() We now only call the function from `MutableRepo::merge()`. There we pass the result to `MutableView::set_view()`, which already enforces the invariants.	2021-03-14 11:07:34 -07:00
Martin von Zweigbergk	8048d9641e	commands: rewrite `jj op undo` using new `MutableRepo::merge()`	2021-03-14 10:57:57 -07:00
Martin von Zweigbergk	a7f4f4cf5b	rustfmt: configure to merge imports by module Perhaps we should even set the config to "Item" to reduce merge conflicts.	2021-03-14 10:53:14 -07:00
Martin von Zweigbergk	4b8484e561	rustfmt: configure to group imports	2021-03-14 10:46:25 -07:00


206 commits
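
Commits `f4a41f3880`, `8b2ce18254`, and `f634ff0e3f` in this log replace callback-taking diff functions with ones that return an iterator, so that a failed write to stdout can be propagated instead of crashing. The sketch below illustrates that general pattern only; it is not the code from those commits. The `Diff` enum, `DiffIter`, `diff_entries()`, and `print_diff()` are hypothetical names, and plain `BTreeMap<String, u32>` values stand in for trees. Cloning the maps mirrors the extra copying the first of those commit messages mentions.

```rust
use std::cmp::Ordering;
use std::collections::BTreeMap;
use std::io::{self, Write};
use std::iter::Peekable;

/// One difference between two sorted name -> value maps (hypothetical type).
#[derive(Debug, PartialEq)]
enum Diff {
    Added(String, u32),
    Removed(String, u32),
    Modified(String, u32, u32),
}

/// Walks two sorted entry streams in lockstep and yields only the differences.
struct DiffIter<I: Iterator<Item = (String, u32)>> {
    left: Peekable<I>,
    right: Peekable<I>,
}

impl<I: Iterator<Item = (String, u32)>> Iterator for DiffIter<I> {
    type Item = Diff;

    fn next(&mut self) -> Option<Diff> {
        loop {
            // Peek at both heads to decide which side to advance.
            let order = match (self.left.peek(), self.right.peek()) {
                (None, None) => return None,
                (Some(_), None) => Ordering::Less,
                (None, Some(_)) => Ordering::Greater,
                (Some((l, _)), Some((r, _))) => l.cmp(r),
            };
            match order {
                Ordering::Less => {
                    let (name, value) = self.left.next().unwrap();
                    return Some(Diff::Removed(name, value));
                }
                Ordering::Greater => {
                    let (name, value) = self.right.next().unwrap();
                    return Some(Diff::Added(name, value));
                }
                Ordering::Equal => {
                    let (name, old) = self.left.next().unwrap();
                    let (_, new) = self.right.next().unwrap();
                    if old != new {
                        return Some(Diff::Modified(name, old, new));
                    }
                    // Identical entries are skipped; keep looping.
                }
            }
        }
    }
}

/// Returns an iterator over the differences instead of invoking a callback.
/// The maps are cloned so the iterator owns its data (an assumption made to
/// keep the sketch short).
fn diff_entries(
    left: &BTreeMap<String, u32>,
    right: &BTreeMap<String, u32>,
) -> impl Iterator<Item = Diff> {
    DiffIter {
        left: left.clone().into_iter().peekable(),
        right: right.clone().into_iter().peekable(),
    }
}

/// Consumer side: `?` stops at the first write error (for example a broken
/// pipe) and returns it to the caller.
fn print_diff(out: &mut impl Write, diffs: impl Iterator<Item = Diff>) -> io::Result<()> {
    for diff in diffs {
        writeln!(out, "{:?}", diff)?;
    }
    Ok(())
}
```

With a callback, the printing code would need its own channel for reporting a failure back through the diff function. With an iterator, the caller owns the loop, so an ordinary `?` is enough to stop early and return the error.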