mirrors/jj

mirror of https://github.com/martinvonz/jj.git synced 2024-12-28 15:34:22 +00:00

Author	SHA1	Message	Date
Ilya Grigoriev	a88c06068e	clippy: new nightly fixes For some reason, clippy also suggested surrounding `self.value` with parentheses. Not sure whether that's a clippy bug. Cc: https://github.com/rust-lang/rust-clippy/issues/12268	2024-02-10 16:06:28 -08:00
Yuya Nishihara	35f718f212	merged_tree: remove canceling terms prior to resolving file-level conflict I think this is a variant of the problem fixed by `7fda80fc22` "tree: simplify conflict before resolving at hunk level." We need to simplify() the conflict before and after extracting file ids because the source conflict values may contain trees to be cancelled out, and the file values may differ only in exec bits. Since the legacy tree passes a simplified conflict in to this function, I made the merged tree do the same. Fixes #2654	2023-12-03 07:44:58 +09:00
Yuya Nishihara	4ffbf40c82	merged_tree: do not propagate conflicting empty tree value to parent Otherwise an empty subtree would be added to the parent tree. If the stored tree contained an empty subtree, simplify() wouldn't work against new "absent" subtree representation. I don't know if there's such a code path, but I believe it's very rare to encounter the problem. #2654	2023-12-03 07:44:58 +09:00
Yuya Nishihara	28ab9593c3	repo_path: split RepoPath into owned and borrowed types This enables cheap str-to-RepoPath cast, which is useful when sorting and filtering a large Vec<(String, _)> list by using matcher for example. It will also eliminate temporary allocation by repo_path.parent().	2023-11-28 07:33:28 +09:00
Yuya Nishihara	0a1bc2ba42	repo_path: add stub RepoPathBuf type, update callers Most RepoPath::from_internal_string() callers will be migrated to the function that returns &RepoPath, and cloning &RepoPath won't work.	2023-11-28 07:33:28 +09:00
Yuya Nishihara	d322df0c8d	matchers: make Files/PrefixMatcher constructors accept slice of borrowed paths RepoPath will become slice type (like str), and it doesn't make sense to require &[RepoPathBuf] here.	2023-11-28 07:33:28 +09:00
Yuya Nishihara	974a6870b3	repo_path: make RepoPath::components() return iterator This allows us to change the backing type from Vec<String> to String.	2023-11-27 08:42:09 +09:00
Yuya Nishihara	59ef3f0023	repo_path: split RepoPathComponent into owned and borrowed types This is a step towards introducing a borrowed RepoPath type. The current RepoPath type is inefficient as each component String is usually short. We could apply short-string optimization, but still each inlined component would consume 24 bytes just for e.g. "src", and increase the chance of random memory access. If the owned RepoPath type is backed by String, we can implement cheap cast from &str to borrowed &RepoPath type.	2023-11-26 18:21:40 +09:00
Yuya Nishihara	f2096da2d6	repo_path: add stub type to introduce borrowed RepoPathComponent type The current RepoPathComponent will be renamed to RepoPathComponentBuf, and new str wrapper will be added as RepoPathComponent.	2023-11-26 18:21:40 +09:00
Yuya Nishihara	6344cd56b3	repo_path: remove RepoPathJoin trait, just implement join() on the type I don't think we'll add join() that takes different types.	2023-11-26 07:14:47 +09:00
Yuya Nishihara	e0c35684af	merge: rename Merge::new() to Merge::from_removes_adds() Since (removes, adds) pair is no longer the canonical representation of Merge, the name Merge::new() seems too generic. Let's give it a more verbose name.	2023-11-07 17:10:12 +09:00
Martin von Zweigbergk	d989d4093d	merged_tree: let backend influence whether to use new diff algo Since the concurrent diff algorithm is significantly slower when using the Git backend, I think we'll have to switch between the two algorithms depending on backend. Even if the concurrent version always performed as well as the sequential version, exactly how concurrent it should be probably still depends on the backend. This commit therefore adds a function to the `Backend` trait, so each backend can say how much concurrency they deal well with. I then use that number for choosing between the sequential and concurrent versions in `MergedTree::diff_stream()`, and also to decide the number of concurrent reads to do in the concurrent version.	2023-11-06 23:12:02 -08:00
Martin von Zweigbergk	f40adb84fc	merged_tree: add a `Stream` for concurrent diff of trees When diffing two trees, we currently start at the root and diff those trees. Then we diff each subtree, one at a time, recursively. When using a commit backend that uses remote storage, like our backend at Google does, diffing the subtrees one at a time gets very slow. We should be able to diff subtrees concurrently. That way, the number of roundtrips to a server becomes determined by the depth of the deepest difference instead of by the number of differing trees (times 2, even). This patch implements such an algorithm behind a `Stream` interface. It's not hooked in to `MergedTree::diff_stream()` yet; that will happen in the next commit. I timed the new implementation by updating `jj diff -s` to use the new diff stream and then ran it on the Linux repo with `jj diff --ignore-working-copy -s --from v5.0 --to v6.0`. That slowed down by ~20%, from ~750 ms to ~900 ms. Maybe we can get some of that performance back but I think it'll be hard to match `MergedTree::diff()`. We can decide later if we're okay with the difference (after hopefully reducing the gap a bit) or if we want to keep both implementations. I also timed the new implementation on our cloud-based repo at Google. As expected, it made some diffs much faster (I'm not sure if I'm allowed to share figures).	2023-11-06 23:12:02 -08:00
Martin von Zweigbergk	9af09ec236	test_merged_tree: test diffing with a matcher We didn't have any tests at all for `MergedTree::diff()` with a matcher other than `EverythingMatcher`. This patch adds a few.	2023-11-06 23:12:02 -08:00
Martin von Zweigbergk	16aa8e8f10	test_merged_tree: nest each part of `test_diff_dir_file()` I'm about to add a few more checks for diffing with a matcher. I think it will help make it readable and reduce the risk of mixing up variables between each part of the test if we use some nested blocks. I also removed some unnecessary `.clone()` calls while at it.	2023-11-06 23:12:02 -08:00
Yuya Nishihara	895bbce8c0	files: use borrowed Merge iterator in merge() Since the underlying Merge data type is no longer (Vec<T>, Vec<T>), it doesn't make sense to build removes/adds Vecs and concatenate them.	2023-11-07 06:52:35 +09:00
Martin von Zweigbergk	a1ef9dc845	merged_tree: propagate backend errors in diff iterator I want to fix error propagation before I start using async in this code. This makes the diff iterator propagate errors from reading tree objects. Errors include the path and don't stop the iteration. The idea is that we should be able to show the user an error inline in diff output if we failed to read a tree. That's going to be especially useful for backends that can return `BackendError::AccessDenied`. That error variant doesn't yet exist, but I plan to add it, and use it in Google's internal backend.	2023-10-26 06:20:56 -07:00
Martin von Zweigbergk	6ad71e658d	merged_tree: rename `MergedTreeValue` to `MergedTreeVal` I'm going to add `MergedTreeValue` as an alias for `Merge<Option<TreeValue>>`, but we already have a type by that name in `merged_tree`. This patch renames it away, to make room for the new alias. I used `MergedTreeVal` for this borrowing version to be a bit like how `str` is a borrowed version of `String`.	2023-10-26 06:20:56 -07:00
Martin von Zweigbergk	7fda80fc22	tree: simplify conflict before resolving at hunk level I ran into a bug the other day where `jj status` said there was a conflict in a file but there were no conflict markers in the working copy. The commit was created when I squashed a conflict resolution into the commit's parent. The rebased child commit then ended up in this state. I.e., it looked something like this before squashing: ``` C (no conflict) | | B conflict |/ A conflict ``` The conflict in B was different from the conflict in A. When I squashed in C, jj would try to resolve the conflicts by first creating a 7-way conflict (3 from A, 3 from B, 1 from C). Because of the exact content-level changes, the 7-way conflict couldn't be automatically resolved by `files::merge()` (the way it currently works anyway). However, after simplifying the conflict, it could be resolved. Because `MergedTree::merge()` does another round of conflict simplification of the result at the end of the function, it was the simplified version that actually got stored in the commit. So when inspecting the conflict later (e.g. in the working copy, as I did), it could be automatically resolved. I think there are at least two ways to solve this. One is to call `merge_trees()` again after calling `tree.simplify()` in `MergedTree::merge()`. However, I think it would only matter in the case of content-level conflicts. Therefore, it seems better to make the content-level resolution solve this case to start with. I've done that by simplifying the conflict before passing it into `files::merge()`. We could even do the fix in `files::merge()`, but doing it before calling it has the advantage that we can avoid reading some unchanged content from the backend.	2023-09-27 22:14:39 -07:00
Martin von Zweigbergk	e3f82cd99a	tests: leverage `TestRepo::init()` in `test_merged_tree` I forgot to update these call sites when I introduced (the new version of) `TestRepo::init()`.	2023-09-20 07:47:30 -07:00
Martin von Zweigbergk	7ecd64fde1	merged_tree: use child path when merging child This fixes a bug where we used the parent directory's path when trying to read trees and files for a child entry. Many tests in `test_merged_tree` fail after switching to the test backend there without this fix.	2023-09-18 07:53:19 -07:00
Martin von Zweigbergk	9c30d7500b	testutils: delete bool-typed `init()` in favor of enum-typed version It makes the call sites clearer if we pass the `TestRepoBackend` enum instead of the boolean `use_git` value. It's also more extensible (I plan to add another backend for tests).	2023-09-18 07:15:37 -07:00
Martin von Zweigbergk	5ef0be73c1	merged_tree: allow building trees with variable-arity overrides When restoring (`jj restore`) a 3-sided conflict from one tree into a 2-sided tree (or a resolved tree), we'll need to extend the arity of the target tree to that of the source tree. I had not considered this case before. This patch relaxes the constraint in `MergedTreeBuilder` to allow such cases. The additional trees are based on empty trees with only the larger overrides in them.	2023-09-01 06:09:37 -07:00
Martin von Zweigbergk	26581750fe	store: add an `empty_merged_tree_id()` helper Many (most?) callers of `Store::empty_tree_id()` really want a `MergedTreeId`, so let's create a helper for that. It returns the `Legacy` variant, which is what all current callers used. That should be all we need since the two variants compare equal these days, and since trees built based on the legacy variant can get promoted to the new variant on write if the config is enabled.	2023-08-30 19:58:42 -07:00
Martin von Zweigbergk	2d50d8a077	merged_tree: propagate errors from `from_legacy_tree()`	2023-08-29 08:32:04 -07:00
Martin von Zweigbergk	67832a3940	merged_tree: take store argument to `write_tree()` instead of `new()` The store isn't needed until we write the trees, so I think it makes more sense to pass it there.	2023-08-29 08:32:04 -07:00
Martin von Zweigbergk	d9ce70c176	tests: make `create_tree()` return `MergedTree` I think most tests want a `MergedTree`, so this makes `create_tree()` return that. I kept the old function as `create_single_tree()`. That's now only used in `test_merge_trees` and `test_merged_tree`. I also consistently imported the functions now, something I've considered doing for a long time.	2023-08-29 07:01:52 -07:00
Martin von Zweigbergk	873a6f0674	merged_tree: add a function for merging 3 `MergedTree`s With the already existing `MergedTree::resolve()` and all the recent refactorings into `Merge<T>`, it's now very easy to add support for 3-way merging of `MergedTree` instances.	2023-08-28 15:58:34 -07:00
Martin von Zweigbergk	49e32aa532	merged_tree: teach tree builder to build multiple trees	2023-08-27 06:49:45 -07:00
Martin von Zweigbergk	2dd2e77170	merged_tree: add `entries()` for iterating over all entries We already have `entries_matching()`, so this is just a version of that that doesn't take a matcher.	2023-08-27 06:49:45 -07:00
Martin von Zweigbergk	416fa2741c	merged_tree: add entry iterator	2023-08-25 07:06:20 -07:00
Martin von Zweigbergk	d5ceefcd8e	merged_tree: add diff iterator If we're going to be able to replace most instances of `Tree` by `MergedTree`, we'll need to be able to diff two `MergedTree`s. This implements support for that. The implementation copies a lot from the diff iterator we have for `Tree`. I suspect we should be able to reuse some of the code by introducing some traits that can then be implemented by both `Tree` and `MergedTree`. I've left a TODO about that.	2023-08-25 06:40:36 -07:00
Martin von Zweigbergk	1d55a404cc	merged_tree: add `path_value()`	2023-08-15 07:56:55 -07:00
Martin von Zweigbergk	2238c87da1	merged_tree: import `create_tree()` in tests to reduce line wrapping	2023-08-15 07:56:55 -07:00
Martin von Zweigbergk	ef5f97f8d7	conflicts: move `Merge<T>` to `merge` module The `merge` module now seems like the obvious place for this type.	2023-08-06 22:08:09 +00:00
Martin von Zweigbergk	ecc030848d	conflicts: rename `Conflict<T>` to `Merge<T>` Since `Conflict<T>` can also represent a non-conflict state (a single term), `Merge<T>` seems like a better name. Thanks to @ilyagr for the suggestion in https://github.com/martinvonz/jj/pull/1774#discussion_r1257547709 Sorry about the churn. It would have been better if I had thought of this name before I introduced `Conflict<T>`.	2023-08-06 22:08:09 +00:00
Martin von Zweigbergk	deb4ae476d	merged_tree: add an iterator over conflicts With `MergedTree`, we can iterate over conflicts by descending into only the subdirectories that cannot be trivially resolved. We assume that the trees have previously been resolved as much as possible, so we don't attempt to resolve conflicts again.	2023-07-19 22:04:16 -07:00
Martin von Zweigbergk	828d528361	merged_tree: add a function for resolving conflicts This adds a function for resolving conflicts that can be automatically resolved, i.e. like our current `merge_trees()` function. However, the new function is written to merge an arbitrary number of trees and, in case of unresolvable conflicts, to produce a `Conflict<TreeId>` as result instead of writing path-level conflicts to the backend. Like `merge_trees()`, it still leaves conflicts unresolved at the file level if any hunks conflict, and it resolves paths that can be trivially resolved even if there are other paths that do conflict.	2023-07-19 22:04:16 -07:00
Martin von Zweigbergk	4f30417ffd	merged_tree: introduce a type for a set of trees to merge on the fly In order to store conflicts in the commit, as conflicts between a set of trees, we want to be able to merge those trees on the fly. This introduces a type for that. It has a `Merge(Conflict(Tree))` variant, where the individual trees cannot have path-level conflicts. It also has a `Legacy(Tree)` variant, which does allow path-level conflicts. I think that should help us with the migration.	2023-07-19 22:04:16 -07:00
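
Several of the commits above (35f718f212, 7fda80fc22, e0c35684af) rely on the idea of "simplifying" a merge by removing terms that cancel out. The following is only a rough sketch of that idea, not jj's actual `Merge<T>`: it assumes a plain (removes, adds) representation, which e0c35684af says is no longer the canonical one, and the method bodies are illustrative guesses rather than the library's code.

```rust
/// Toy stand-in for a merge: `adds` always has exactly one more element
/// than `removes`. A merge with no removes is resolved.
/// (Assumed layout for illustration; not jj's real data structure.)
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Merge<T> {
    removes: Vec<T>,
    adds: Vec<T>,
}

impl<T: PartialEq> Merge<T> {
    /// Builds a merge from separate remove and add lists.
    pub fn from_removes_adds(removes: Vec<T>, adds: Vec<T>) -> Self {
        assert_eq!(adds.len(), removes.len() + 1, "need one more add than removes");
        Merge { removes, adds }
    }

    /// Builds a merge holding a single, non-conflicting value.
    pub fn resolved(value: T) -> Self {
        Merge { removes: Vec::new(), adds: vec![value] }
    }

    /// True if there is nothing left to merge.
    pub fn is_resolved(&self) -> bool {
        self.removes.is_empty()
    }

    /// Drops every (remove, add) pair holding equal values, since such a
    /// pair contributes nothing to the result of the merge.
    pub fn simplify(mut self) -> Self {
        let mut i = 0;
        while i < self.removes.len() {
            if let Some(j) = self.adds.iter().position(|a| *a == self.removes[i]) {
                // The pair cancels out; the length invariant is preserved
                // because one element is removed from each side.
                self.removes.remove(i);
                self.adds.remove(j);
            } else {
                i += 1;
            }
        }
        self
    }
}
```

In this sketch, a merge with removes `["a", "b"]` and adds `["b", "c", "a"]` simplifies to the resolved value `"c"`, while removes `["base"]` with adds `["left", "right"]` is left unchanged. This mirrors why simplifying before content-level resolution can turn an apparently unresolvable many-way conflict into one that resolves.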

39 commits
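
The `repo_path` commits above (59ef3f0023, 974a6870b3, 28ab9593c3) describe splitting a path type into an owned type backed by `String` and a borrowed type that wraps `str`, so that a `&str` can be cast to a borrowed path without allocating. A minimal sketch of that pattern, using only the standard library, might look like the following. The type names and `from_internal_string()` and `components()` come from the commit messages; `as_str()` and all method bodies are assumptions made for illustration, and this is not jj's actual implementation.

```rust
use std::borrow::Borrow;
use std::ops::Deref;

/// Borrowed path: a thin wrapper around `str`, analogous to `std::path::Path`.
#[derive(Debug, PartialEq, Eq)]
#[repr(transparent)]
pub struct RepoPath {
    value: str,
}

/// Owned path backed by a single `String` instead of a `Vec<String>`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RepoPathBuf {
    value: String,
}

impl RepoPath {
    /// Casts a `&str` to a `&RepoPath` without allocating.
    pub fn from_internal_string(value: &str) -> &RepoPath {
        // SAFETY: `RepoPath` is `repr(transparent)` over `str`, so both
        // types have the same layout and pointer metadata.
        unsafe { &*(value as *const str as *const RepoPath) }
    }

    /// Hypothetical accessor for the underlying string.
    pub fn as_str(&self) -> &str {
        &self.value
    }

    /// Iterates over the '/'-separated components; the root path has none.
    pub fn components(&self) -> impl Iterator<Item = &str> + '_ {
        self.value.split('/').filter(|c| !c.is_empty())
    }

    /// Returns the parent as a borrowed slice of `self`, with no allocation.
    pub fn parent(&self) -> Option<&RepoPath> {
        if self.value.is_empty() {
            return None; // the root has no parent
        }
        let end = self.value.rfind('/').unwrap_or(0);
        Some(RepoPath::from_internal_string(&self.value[..end]))
    }
}

impl Deref for RepoPathBuf {
    type Target = RepoPath;
    fn deref(&self) -> &RepoPath {
        RepoPath::from_internal_string(&self.value)
    }
}

impl Borrow<RepoPath> for RepoPathBuf {
    fn borrow(&self) -> &RepoPath {
        self
    }
}

impl ToOwned for RepoPath {
    type Owned = RepoPathBuf;
    fn to_owned(&self) -> RepoPathBuf {
        RepoPathBuf { value: self.value.to_owned() }
    }
}
```

With a layout like this, filtering a `Vec<(String, _)>` by path only needs `RepoPath::from_internal_string(&s)` per entry, and `parent()` returns a sub-slice rather than a freshly allocated path, which matches the motivation given in 28ab9593c3.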