ok/jj - ok.software

Author	SHA1	Message	Date
Yuya Nishihara	97a260b1bf	merged_tree: reimplement TreeEntryDiffIterator by using iterator adapter We don't need a named type anymore.	2023-12-01 00:05:06 +09:00
Yuya Nishihara	fd1c03d037	merged_tree: use sync get_tree() in TreeDiffIterator This basically backs out the change `1b9a3e27e0` "merged_tree: read before/after trees concurrently." As we decided to add a separate impl for async access, it doesn't make sense to read before/after pair in parallel. The async single_tree() is moved to TreeDiffStreamImpl. It will help remove the sync version when the performance problem is solved.	2023-12-01 00:05:06 +09:00
Yuya Nishihara	601be0d480	working_copy: narrow file_states recursively while visiting directories This saves another ~10ms. Without watchman: ``` % hyperfine --sort command --warmup 3 --runs 20 -L bin jj-1,jj-2 \ "target/release-with-debug/{bin} -R ~/mirrors/linux files ~/mirrors/linux/no-match" Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux files ~/mirrors/linux/no-match Time (mean ± σ): 327.7 ms ± 24.9 ms [User: 1059.1 ms, System: 654.3 ms] Range (min … max): 296.0 ms … 385.4 ms 20 runs Benchmark 3: target/release-with-debug/jj-2 -R ~/mirrors/linux files ~/mirrors/linux/no-match Time (mean ± σ): 311.0 ms ± 24.8 ms [User: 960.0 ms, System: 643.1 ms] Range (min … max): 274.9 ms … 358.5 ms 20 runs ```	2023-11-30 12:09:31 +09:00
Yuya Nishihara	a935a4f70c	working_copy: use proto file states without rebuilding BTreeMap In snapshot(), changed_file_states are received in arbitrary order. For the other callers, entries are in diff_stream order, so we don't have to sort them. With watchman enabled, we can see the cost of sorting the sorted proto entries. I don't think this is significant, but we can mitigate it by adding is_file_states_sorted flag to the proto message if needed: ``` % hyperfine --sort command --warmup 3 --runs 20 -L bin jj-0,jj-1 \ "target/release-with-debug/{bin} -R ~/mirrors/linux files ~/mirrors/linux/no-match" Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux files ~/mirrors/linux/no-match Time (mean ± σ): 164.8 ms ± 16.6 ms [User: 50.2 ms, System: 111.7 ms] Range (min … max): 148.1 ms … 195.0 ms 20 runs Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux files ~/mirrors/linux/no-match Time (mean ± σ): 171.8 ms ± 13.6 ms [User: 61.7 ms, System: 109.0 ms] Range (min … max): 159.5 ms … 192.1 ms 20 runs ``` Without watchman: ``` % hyperfine --sort command --warmup 3 --runs 20 -L bin jj-0,jj-1 \ "target/release-with-debug/{bin} -R ~/mirrors/linux files ~/mirrors/linux/no-match" Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux files ~/mirrors/linux/no-match Time (mean ± σ): 367.3 ms ± 30.3 ms [User: 1415.2 ms, System: 633.8 ms] Range (min … max): 325.4 ms … 421.7 ms 20 runs Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux files ~/mirrors/linux/no-match Time (mean ± σ): 327.7 ms ± 24.9 ms [User: 1059.1 ms, System: 654.3 ms] Range (min … max): 296.0 ms … 385.4 ms 20 runs ``` I haven't measured snapshotting against dirty working copy, but I don't think it would be slower than the original implementation.	2023-11-30 12:09:31 +09:00
Yuya Nishihara	fca3690dda	working_copy: add file states wrapper that provides map-like API I'll replace the current lazy loading mechanism with this. Read-only methods are implemented on the borrowed type so that we can narrow lookup scope recursively.	2023-11-30 12:09:31 +09:00
Yuya Nishihara	9292af5e52	working_copy: update file states in bulk This helps migrate BTreeMap<RepoPath, _> to sorted Vec.	2023-11-30 12:09:31 +09:00
Yuya Nishihara	c9150d02fc	working_copy: don't look up file state twice while visiting directories	2023-11-30 12:09:31 +09:00
Yuya Nishihara	6ce7bd5338	repo_path: replace .contains() with .starts_with(), flipping the arguments self.contains(other) means that the self tree contains the other tree (i.e. the self path is prefix of the other), but it could be confused the other way around if we were thinking about the path literal, not the tree. Let's add .starts_with() instead by copying the std::path::Path definition.	2023-11-29 08:41:23 +09:00
Yuya Nishihara	266690a46b	repo_path: make strip_prefix() public function returning &RepoPath There are no external callers, but I think it's useful.	2023-11-29 08:41:23 +09:00
Yuya Nishihara	73690ed54e	matchers: clean up .walk_to(dir) to yield &RepoPath instead of iterator	2023-11-29 08:41:23 +09:00
Yuya Nishihara	bc9725c73c	working_copy: use RepoPath::parent() which no longer allocates temporary object	2023-11-29 08:41:23 +09:00
Yuya Nishihara	016fc2b5cc	repo_path: change .split() and .parent() to return &RepoPath	2023-11-29 08:41:23 +09:00
Yuya Nishihara	28ab9593c3	repo_path: split RepoPath into owned and borrowed types This enables cheap str-to-RepoPath cast, which is useful when sorting and filtering a large Vec<(String, _)> list by using matcher for example. It will also eliminate temporary allocation by repo_path.parent().	2023-11-28 07:33:28 +09:00
Yuya Nishihara	0a1bc2ba42	repo_path: add stub RepoPathBuf type, update callers Most RepoPath::from_internal_string() callers will be migrated to the function that returns &RepoPath, and cloning &RepoPath won't work.	2023-11-28 07:33:28 +09:00
Yuya Nishihara	f5938985f0	repo_path: make RepoPath::from_internal_string() accept owned string I'm going to add borrowed RepoPath type, and most from_internal_string() callers will be migrated to it. For the remaining callers, it makes more sense to move the ownership of String to RepoPathBuf.	2023-11-28 07:33:28 +09:00
Yuya Nishihara	d322df0c8d	matchers: make Files/PrefixMatcher constructors accept slice of borrowed paths RepoPath will become slice type (like str), and it doesn't make sense to require &[RepoPathBuf] here.	2023-11-28 07:33:28 +09:00
Yuya Nishihara	a23bb5b958	matchers: in tests, use alias to RepoPath::from_internal_string() It looked verbose to fully spell the function name.	2023-11-28 07:33:28 +09:00
Ilya Grigoriev	6aef4bb52e	cli rebase: do not allow `-r --skip-empty` This follows up on `3967f63` (see that commit's description for more motivation) and `e79c8b6`. In a discussion linked below, it was decided that forbidding `-r --skip-empty` entirely is preferable to the mixed behavior introduced in `3967f63`. `3967f637dc (commitcomment-133539911)`	2023-11-27 10:16:36 -08:00
Yuya Nishihara	55f75278bc	repo_path: make to_internal_file_string() return &str, rename accordingly	2023-11-27 08:42:09 +09:00
Yuya Nishihara	12d7f8be16	repo_path: turn RepoPath into String wrapper RepoPath::from_components() is removed since it is no longer a primitive function. The components iterator could be implemented on top of str::split(), but it's not as we'll probably want to add components.as_path() -> &RepoPath. Tree walking and tree_states map construction get slightly faster thanks to fewer allocations and/or better cache locality. If we add a borrowed RepoPath type, we can also implement a cheap &str to &RepoPath conversion on top. Then, we can get rid of BTreeMap<RepoPath, FileState> construction at all. Snapshot without watchman: ``` % hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1 \ "target/release-with-debug/{bin} -R ~/mirrors/linux status" Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status Time (mean ± σ): 950.1 ms ± 24.9 ms [User: 1642.4 ms, System: 681.1 ms] Range (min … max): 913.8 ms … 990.9 ms 10 runs Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status Time (mean ± σ): 872.1 ms ± 14.5 ms [User: 1922.3 ms, System: 625.8 ms] Range (min … max): 853.2 ms … 895.9 ms 10 runs Relative speed comparison 1.09 ± 0.03 target/release-with-debug/jj-0 -R ~/mirrors/linux status 1.00 target/release-with-debug/jj-1 -R ~/mirrors/linux status ``` Tree walk: ``` % hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1 \ "target/release-with-debug/{bin} -R ~/mirrors/linux files --ignore-working-copy" Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux files --ignore-working-copy Time (mean ± σ): 375.3 ms ± 15.4 ms [User: 223.3 ms, System: 151.8 ms] Range (min … max): 359.4 ms … 394.1 ms 10 runs Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux files --ignore-working-copy Time (mean ± σ): 357.1 ms ± 16.2 ms [User: 214.7 ms, System: 142.6 ms] Range (min … max): 341.6 ms … 378.9 ms 10 runs Relative speed comparison 1.05 ± 0.06 target/release-with-debug/jj-0 -R ~/mirrors/linux files --ignore-working-copy 1.00 target/release-with-debug/jj-1 -R ~/mirrors/linux files --ignore-working-copy ```	2023-11-27 08:42:09 +09:00
Yuya Nishihara	974a6870b3	repo_path: make RepoPath::components() return iterator This allows us to change the backing type from Vec<String> to String.	2023-11-27 08:42:09 +09:00
Yuya Nishihara	aba8c640be	repo_path: capture current Vec<String> ordering by tests The added test would fail if paths were purely ordered by concatenated strings. I'm not sure if we want to preserve the current ordering, but let's not break it for the moment.	2023-11-27 08:42:09 +09:00
Ilya Grigoriev	3967f637dc	cli rebase: do not allow `-r --skip-empty` to drop emptied descendants This follows up on @matts1 's #2609. We still allow the `-r` commit to become empty. I would be more comfortable if there was a test for that, but I haven't done that (yet?) and it seems pretty safe. If that's a problem, I'm happy to forbid `-r --skip-empty` entirely, since it is far less useful than `-s --skip-empty` or `-b --skip-empty`. I think it is undesired to abandon emptied descendants. As far as descendants of `A` are concerned, `jj rebase -r A` should be equivalent to `jj abandon A`, and `jj abandon` does not remove emptied commits. It also doesn't seem very useful to do that, since I think descendant commits of an abandoned (or moved with `-r`) commit only become empty in pathological cases. Additionally, if we did want -r to empty descendants of `A`, we'd have to add thorough tests and possibly improve the algorithm. I want to refactor `rebase -r` and add features to it, and having to consider cases of commits becoming abandoned makes everything harder. For example, if we have ``` root -> A -> B -> C ``` and `jj rebase -r A -d C` empties commit `B` (or `C`), I do not know whether the current algorithm will work correctly. It seems possible that it would, but that depends on the fact that empty merge commits are not abandoned for descendants. That seems dangerous to rely on without tests. I hope (but can't promise) that in the near future, making DescendantRebaser return more information should help make it possible to create such functionality in a more robust way. I am likely to attempt this as part of implementing `-r --after`.	2023-11-26 10:56:58 -08:00
Yuya Nishihara	59ef3f0023	repo_path: split RepoPathComponent into owned and borrowed types This is a step towards introducing a borrowed RepoPath type. The current RepoPath type is inefficient as each component String is usually short. We could apply short-string optimization, but still each inlined component would consume 24 bytes just for e.g. "src", and increase the chance of random memory access. If the owned RepoPath type is backed by String, we can implement cheap cast from &str to borrowed &RepoPath type.	2023-11-26 18:21:40 +09:00
Yuya Nishihara	f2096da2d6	repo_path: add stub type to introduce borrowed RepoPathComponent type The current RepoPathComponent will be renamed to RepoPathComponentBuf, and new str wrapper will be added as RepoPathComponent.	2023-11-26 18:21:40 +09:00
Yuya Nishihara	e14b31a033	repo_path: reject leading slash and empty path components Leading/trailing slashes would introduce a bit of complexity if we migrate the backing type from Vec<String> to String. Empty components are okay, but let's reject them as they are cryptic and invalid.	2023-11-26 18:21:40 +09:00
Yuya Nishihara	755af75c30	repo_path: in tests, use alias to RepoPath::from_internal_string() It seemed too verbose to spell the full function name in tests.	2023-11-26 18:21:40 +09:00
Yuya Nishihara	b5b01f4dd7	cargo: add ref-cast dependency It helps to implement transparent conversion from &str to &Wrapped(str). We could instead wrap the reference as Wrapped<'a>(&'a str), but it has various drawbacks. Notably we can't implement Borrow and Deref because these traits require a reference in return position. Since the unsafe bits are pretty small, we can instead implement cast functions without using the ref-cast crate. However, I believe we'll trust ref-cast more than hand-crafted unsafe code. https://crates.io/crates/ref-cast https://docs.rs/ref-cast/1.0.20/ref_cast/attr.ref_cast_custom.html	2023-11-26 18:21:40 +09:00
Yuya Nishihara	b7543f8a08	rewrite: fix check for newly-empty commit in optimized path 'old_base_tree_id == None' means the rebased tree is unchanged, so the commit shouldn't be considered newly-empty.	2023-11-26 14:42:17 +09:00
Yuya Nishihara	2f93de9299	rewrite: flatten mapping from EmptyBehaviour to desired action I think this is slightly easier to follow.	2023-11-26 14:42:17 +09:00
Ilya Grigoriev	c32847696d	rewrite.rs: rename `new_parents` to `parent_mapping` The function `new_parents` makes sense, but I found the mapping being named `new_parents` confusing.	2023-11-25 21:36:35 -08:00
Yuya Nishihara	6344cd56b3	repo_path: remove RepoPathJoin trait, just implement join() on the type I don't think we'll add join() that takes different types.	2023-11-26 07:14:47 +09:00
Yuya Nishihara	d7df2516c5	repo_path: remove RepoPathComponent::string(), use as_str() instead There are only two callers, and one does further conversion to BString.	2023-11-26 07:14:47 +09:00
Martin von Zweigbergk	6d54afa60e	revset: make `evaluate_programmatic()` optimize expression It seems generally useful to optimize revset expressions in `evaluate_programmatic()` so the caller doesn't have to remember to do it. It should generally be cheap to do so even if it's often not needed.	2023-11-24 21:13:58 -10:00
Martin von Zweigbergk	550164209c	revset: add a `RevsetExpression::evaluate_programmatic()` We often resolve a programmatic revset and then immediately evaluate it. This patch adds a convenience method for those two steps.	2023-11-24 21:13:58 -10:00
Martin von Zweigbergk	f2602f78cf	revset: make `resolve_programmatic()` not return a `Result` I think it's always a programming error if `resolve_programmatic()` returns a `Result`, so it shouldn't have to return a `Result`.	2023-11-24 21:13:58 -10:00
Martin von Zweigbergk	f27f52984e	revset: rename `resolve()` to `resolve_programmatic()` `RevsetExpression::resolve()` is meant for programmatically created expressions. In particular, it may not contain symbols. Let's try to clarify that by renaming the function and documenting it.	2023-11-24 21:13:58 -10:00
Yuya Nishihara	b37293fa68	tests: add upper bound to test_concurrent_read_write_commit() loop Hopefully this will fix the unfinished Windows CI issue. A possible scenario is that recent migration to gitoxide made this test flaky on Windows. For example, gitoxide might have in-memory object cache that relies on file mtime, and occasionally fails to detect new object on Windows.	2023-11-24 18:07:35 +09:00
Matt Stark	0a95e20ebe	lib: Implement skipping of empty commits	2023-11-24 14:48:06 +11:00
Matt Stark	dc89566039	lib: Create struct RebaseOptions	2023-11-24 14:48:06 +11:00
Anton Bulakh	5c3c0e9f6e	sign: Implement generic commit signing on the backend	2023-11-23 22:52:20 +02:00
Anton Bulakh	5ab00e197a	backend: Inline gix::Repository::commit_as to prepare for signing Additional bonus is that this allows us to avoid creating keep refs for the intermediate commits in the data race preventing loop.	2023-11-23 22:52:20 +02:00
Yuya Nishihara	042d26049c	working_copy: lazily construct file_states BTreeMap While it got faster to build a large BTreeMap<RepoPath, _>, there's still a measurable cost. Let's eliminate it if watchman is enabled and the working copy is clean. Perhaps, we should introduce new serialization format that supports instant loading and lookup, but this hack works for the moment. I'm not sure if the new tree_state format should be flat (RepoPath, _) list, or tree like the backend storage btw. In my "linux" repo (watchman enabled): % hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1 \ "target/release-with-debug/{bin} -R ~/mirrors/linux status" Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status Time (mean ± σ): 768.9 ms ± 14.2 ms [User: 630.7 ms, System: 131.2 ms] Range (min … max): 742.3 ms … 783.1 ms 10 runs Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status Time (mean ± σ): 713.0 ms ± 16.8 ms [User: 587.9 ms, System: 116.2 ms] Range (min … max): 681.5 ms … 731.1 ms 10 runs Relative speed comparison 1.08 ± 0.03 target/release-with-debug/jj-0 -R ~/mirrors/linux status 1.00 target/release-with-debug/jj-1 -R ~/mirrors/linux status	2023-11-23 18:48:14 +09:00
Yuya Nishihara	12cd657837	working_copy: extract file_states_to_proto() helper Just minimizing the changes in the next commit. As we already have file_states_from_proto(), it makes sense to extract the "to" function.	2023-11-23 18:48:14 +09:00
Yuya Nishihara	74c4ef32aa	fsmonitor: exclude .git and .jj directories from changed files This ensures that the root fsmonitor_matcher matches nothing if there are no working-copy changes. The query result can be observed by "jj debug watchman query-changed-files". I don't have expertise on watchman query language, but using the watchman API is probably better than .filter()-ing the result manually.	2023-11-23 18:48:14 +09:00
Yuya Nishihara	1ddcaa43b3	fsmonitor: don't apply prefix matching to paths obtained from watchman If I understand it, watchman returns changed files and directories, and a directory change doesn't mean we need to scan all files under the directory.	2023-11-23 10:06:00 +09:00
Yuya Nishihara	767e94f5af	fsmonitor: drop unneeded mut from make_fsmonitor_matcher() We only need &self.working_copy_path here.	2023-11-23 10:06:00 +09:00
Yuya Nishihara	c16c89bc27	fsmonitor: keep paths relative to the workspace root Since the caller wants repo-relative paths, it doesn't make sense to convert them back and forth.	2023-11-23 10:06:00 +09:00
Yuya Nishihara	a4f6e0de0b	repo_path: extract helper that converts Path to RepoPath literally	2023-11-23 10:06:00 +09:00
Yuya Nishihara	31def4b131	cleanup: don't use debug format to print source errors	2023-11-23 10:05:37 +09:00
Yuya Nishihara	16620e0e4c	merged_tree: drop legacy tree handling from ConflictsDirItem constructor No callers pass in a legacy tree.	2023-11-21 07:45:30 +09:00
Yuya Nishihara	4ad3db2e84	merged_tree: extract value() function of non-legacy trees	2023-11-21 07:45:30 +09:00
Yuya Nishihara	ca3f549c9e	merged_tree: remove redundant clone() from ConflictIterator construction	2023-11-21 07:45:30 +09:00
Martin von Zweigbergk	acc35a89a8	merged_tree: inline non-recursive entry iterator	2023-11-19 20:29:40 -10:00
Martin von Zweigbergk	426f6d0cdd	merged_tree: inline non-recursive conflict iterator The abstraction is no longer useful since we made the types not self-referential.	2023-11-19 20:29:40 -10:00
Yuya Nishihara	5186066cf5	working_copy: simply collect() proto file states into BTreeMap Suppose the input list is presorted, sorting a sorted vec would be cheaper than .insert()-ing sorted items one by one. In my "linux" repo (watchman enabled): - jj-0: baseline - jj-1: previous (don't randomize by HashMap) - jj-2: this % hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1,jj-2 \ "target/release-with-debug/{bin} -R ~/mirrors/linux status" Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status Time (mean ± σ): 1.034 s ± 0.020 s [User: 0.881 s, System: 0.212 s] Range (min … max): 1.011 s … 1.068 s 10 runs Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status Time (mean ± σ): 849.3 ms ± 13.8 ms [User: 710.7 ms, System: 199.3 ms] Range (min … max): 821.7 ms … 870.2 ms 10 runs Benchmark 3: target/release-with-debug/jj-2 -R ~/mirrors/linux status Time (mean ± σ): 786.2 ms ± 16.7 ms [User: 650.7 ms, System: 204.1 ms] Range (min … max): 760.8 ms … 805.2 ms 10 runs Relative speed comparison 1.32 ± 0.04 target/release-with-debug/jj-0 -R ~/mirrors/linux status 1.08 ± 0.03 target/release-with-debug/jj-1 -R ~/mirrors/linux status 1.00 target/release-with-debug/jj-2 -R ~/mirrors/linux status	2023-11-20 08:29:33 +09:00
Yuya Nishihara	ee6a1e2c0a	working_copy: don't build intermediate HashMap from proto file states According to the doc, this is compatible with the map syntax. https://protobuf.dev/programming-guides/proto3/#maps This change means that the serialized file states are sorted by RepoPath, so BTreeMap<RepoPath, _> can be reconstructed with fewer cache misses. In my "linux" repo (watchman enabled): - jj-0: baseline - jj-1: this % hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1,jj-2 \ "target/release-with-debug/{bin} -R ~/mirrors/linux status" Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status Time (mean ± σ): 1.034 s ± 0.020 s [User: 0.881 s, System: 0.212 s] Range (min … max): 1.011 s … 1.068 s 10 runs Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status Time (mean ± σ): 849.3 ms ± 13.8 ms [User: 710.7 ms, System: 199.3 ms] Range (min … max): 821.7 ms … 870.2 ms 10 runs Relative speed comparison 1.32 ± 0.04 target/release-with-debug/jj-0 -R ~/mirrors/linux status 1.08 ± 0.03 target/release-with-debug/jj-1 -R ~/mirrors/linux status Cache-misses got reduced: % perf stat -e task-clock,cycles,instructions,cache-references,cache-misses \ -- ./target/release-with-debug/jj-0 -R ~/mirrors/linux --no-pager status 1,091.68 msec task-clock # 1.032 CPUs utilized 4,179,596,978 cycles # 3.829 GHz 6,166,231,489 instructions # 1.48 insn per cycle 134,032,047 cache-references # 122.776 M/sec 29,322,707 cache-misses # 21.88% of all cache refs 1.057474164 seconds time elapsed 0.897042000 seconds user 0.194819000 seconds sys % perf stat -e task-clock,cycles,instructions,cache-references,cache-misses \ -- ./target/release-with-debug/jj-1 -R ~/mirrors/linux --no-pager status 927.05 msec task-clock # 1.083 CPUs utilized 3,451,299,198 cycles # 3.723 GHz 6,222,418,272 instructions # 1.80 insn per cycle 98,499,363 cache-references # 106.251 M/sec 11,998,523 cache-misses # 12.18% of all cache refs 0.855938336 seconds time elapsed 0.720568000 seconds user 0.207924000 seconds sys	2023-11-20 08:29:33 +09:00
Yuya Nishihara	56047cb7ec	working_copy: don't pass all proto data to from_proto() functions Just a code cleanup. This allows us to consume proto fields if needed. I also removed redundant .clone() and .as_str().	2023-11-20 08:29:33 +09:00
Yuya Nishihara	d7e63837f4	index: use extend_wanted/unwanted() to initialize RevWalk	2023-11-17 21:38:56 +09:00
Yuya Nishihara	a9f3dd95e5	index: remove index field from RevWalkQueue, move it to caller	2023-11-17 21:38:56 +09:00
Yuya Nishihara	ff1bb3133e	index: move index-dependent operations out of RevWalkQueue RevWalkQueue no longer depends on the RevWalkIndex trait, and the index field can be removed.	2023-11-17 21:38:56 +09:00
Yuya Nishihara	32a7b33e56	index: optimize RevWalk to store IndexPosition instead of IndexEntry The idea is the same as the heads_pos() change in `9832ee205d`. While IndexEntry::position() should be cheap, saving 20 bytes per entry appears to improve the performance in mid-size repos. In my "linux" repo: revsets/all() ------------- baseline 1.24 156.0±1.06ms this 1.00 126.0±0.51ms I don't see significant difference in small-sized repos like "jj" or "git". IndexEntryByPosition isn't removed since it's still used by the revset engine.	2023-11-17 21:38:56 +09:00
Martin von Zweigbergk	9be24db051	tree: make TreeEntriesDirItem not self-referential This removes the last use of `ouroboros`. Since `TreeEntriesDirItem` is only used in "legacy trees" (before tree-level conflicts), I didn't bother to check the performance impact. I also didn't bother to check the matcher before adding the entries to the list, instead leaving that where it is in `Iterator::next()`.	2023-11-17 03:50:34 -08:00
Martin von Zweigbergk	c0295c5dbc	merged_tree: make ConflictsDirItem not self-referential This removes the last use of `ouroboros` in `merged_tree.rs`. The set of conflicts to iterate is usually so small that I didn't bother checking the performance impact.	2023-11-17 03:50:34 -08:00
Martin von Zweigbergk	e1a02c5c5b	merged_tree: make TreeDiffDirItem not self-referential This removes another dependency on `ouroboros`, for a small performance hit: ``` ❯ hyperfine --warmup 3 --runs 30 \ '/tmp/jj-before --ignore-working-copy diff -s --from v5.0 --to v6.0' \ '/tmp/jj-after --ignore-working-copy diff -s --from v5.0 --to v6.0' Benchmark 1: /tmp/jj-before --ignore-working-copy diff -s --from v5.0 --to v6.0 Time (mean ± σ): 689.7 ms ± 23.9 ms [User: 400.0 ms, System: 289.8 ms] Range (min … max): 666.9 ms … 759.2 ms 30 runs Benchmark 2: /tmp/jj-after --ignore-working-copy diff -s --from v5.0 --to v6.0 Time (mean ± σ): 710.9 ms ± 19.2 ms [User: 420.4 ms, System: 290.6 ms] Range (min … max): 688.5 ms … 752.0 ms 30 runs Summary '/tmp/jj-before --ignore-working-copy diff -s --from v5.0 --to v6.0' ran 1.03 ± 0.05 times faster than '/tmp/jj-after --ignore-working-copy diff -s --from v5.0 --to v6.0' ```	2023-11-17 03:50:34 -08:00
Martin von Zweigbergk	61d87fe296	merged_tree: make `TreeEntriesIterator` not self-referential While importing the `ouroboros` crate and the `aliasable` crate it depends on, the "unsafe Rust reviewer" expressed some concern that they contain a lot of unsafe code that's hard to review. We can avoid the unsafe code altogether by making `TreeEntriesIterator` not self-referential. Instead, we can collect the matching entries in an individual tree up front. It does have some performance cost: ``` ❯ hyperfine --warmup 3 --runs 30 \ '/tmp/jj-before --ignore-working-copy files -r v6.0' \ '/tmp/jj-after --ignore-working-copy files -r v6.0' Benchmark 1: /tmp/jj-before --ignore-working-copy files -r v6.0 Time (mean ± σ): 461.4 ms ± 14.3 ms [User: 232.1 ms, System: 229.4 ms] Range (min … max): 443.4 ms … 496.3 ms 30 runs Benchmark 2: /tmp/jj-after --ignore-working-copy files -r v6.0 Time (mean ± σ): 482.0 ms ± 14.3 ms [User: 257.2 ms, System: 224.9 ms] Range (min … max): 461.8 ms … 513.3 ms 30 runs Summary '/tmp/jj-before --ignore-working-copy files -r v6.0' ran 1.04 ± 0.04 times faster than '/tmp/jj-after --ignore-working-copy files -r v6.0' ``` I think that's acceptable.	2023-11-17 03:50:34 -08:00
Yuya Nishihara	93fbcec2f7	index: use BinaryHeap instead of BTreeSet in common_ancestors_pos() For the same reason as the heads_pos() change. We just want to omit duplicated items.	2023-11-16 08:27:59 +09:00
Yuya Nishihara	d4059520a9	index: cache generation numbers during common_ancestors_pos() computation I'm not sure if this is better, but common_ancestors_pos() would have a similar property to heads_pos().	2023-11-16 08:27:59 +09:00
Yuya Nishihara	ea4bdd718d	index: use "while let" in common_ancestors_pos()	2023-11-16 08:27:59 +09:00
Yuya Nishihara	02c84a8596	index: remove stale "allow(unstable_name_collisions)" I think this is a remainder of nightly shims.	2023-11-16 08:27:59 +09:00
Yuya Nishihara	6399c392fd	index: make heads_pos() deduplicate entries without building separate set This is much faster (maybe because of better cache locality?) Another option is to use BTreeSet, but the BinaryHeap version is slightly faster. "bench revset" result in my linux repo: revsets/heads(tags()) --------------------- baseline 3.28 560.6±4.01ms 1 2.92 500.0±2.99ms 2 1.98 339.6±1.64ms 3 (this) 1.00 171.2±0.30ms	2023-11-16 08:27:59 +09:00
Yuya Nishihara	9832ee205d	index: optimize heads_pos() to cache generation numbers during computation Apparently, IndexEntry::generation_number() isn't cheap probably because it involves random access to larger memory region, and the u32 value might not be aligned. Let's instead store the generation numbers in BinaryHeap. Also, heads_pos() becomes slightly faster by keeping the BinaryHeap entries small, so I've removed the IndexEntry entirely. This makes the default log and disambiguation revsets fast, which evaluate 'heads(immutable_heads())'. "bench revset" result in my linux repo: revsets/heads(tags()) --------------------- baseline 3.28 560.6±4.01ms 1 2.92 500.0±2.99ms 2 (this) 1.98 339.6±1.64ms	2023-11-16 08:27:59 +09:00
Yuya Nishihara	1e933b84dd	index: make IndexEntry::parents() lazy instead of collecting to Vec All callers just iterate over the parent entries. "bench revset" result in my linux repo: revsets/heads(tags()) --------------------- baseline 3.28 560.6±4.01ms 1 (this) 2.92 500.0±2.99ms	2023-11-16 08:27:59 +09:00
Yuya Nishihara	39b065f7ab	git: on import_refs(), exclude uninteresting dirs such as refs/jj/keep For loose refs, uninteresting directories can be just skipped. For packed refs, gix will have to do binary search for each prefix to find the starting point. Still it's better overall if the repository contains tons of refs/jj/keep refs. With my linux repo containing ~5k loose jj refs, this saves ~40ms: % hyperfine --warmup 3 --runs 10 \ "/tmp/jj-gix --ignore-working-copy git import -R ~/mirrors/linux" \ "/tmp/jj-gix-iter --ignore-working-copy git import -R ~/mirrors/linux" Benchmark 1: /tmp/jj-gix --ignore-working-copy git import -R ~/mirrors/linux Time (mean ± σ): 151.6 ms ± 11.4 ms [User: 38.8 ms, System: 111.6 ms] Range (min … max): 129.8 ms … 159.5 ms 10 runs Benchmark 2: /tmp/jj-gix-iter --ignore-working-copy git import -R ~/mirrors/linux Time (mean ± σ): 109.9 ms ± 11.6 ms [User: 27.5 ms, System: 82.4 ms] Range (min … max): 89.4 ms … 117.8 ms 10 runs	2023-11-14 17:35:27 +09:00
Yuya Nishihara	044716ee40	git: migrate import_refs() to gix::Repository Gitoxide errors are boxed since there are various error types and they tend to exceed the clippy size limit. Apparently, gitoxide is faster than git2: % hyperfine --warmup 3 --runs 10 \ "/tmp/jj-baseline --ignore-working-copy git import -R ~/mirrors/linux" \ "/tmp/jj-gix --ignore-working-copy git import -R ~/mirrors/linux" Benchmark 1: /tmp/jj-baseline --ignore-working-copy git import -R ~/mirrors/linux Time (mean ± σ): 205.4 ms ± 15.7 ms [User: 59.6 ms, System: 144.6 ms] Range (min … max): 189.7 ms … 223.9 ms 10 runs Benchmark 2: /tmp/jj-gix --ignore-working-copy git import -R ~/mirrors/linux Time (mean ± σ): 176.2 ms ± 13.7 ms [User: 41.2 ms, System: 134.0 ms] Range (min … max): 155.4 ms … 186.5 ms 10 runs	2023-11-14 17:35:27 +09:00
Yuya Nishihara	6c98dfcdcb	git: have import_refs() obtain git2::Repository instance from store This helps gitoxide migration. It's theoretically possible to import Git refs from non-Git backend, but I don't think such API flexibility is needed.	2023-11-14 17:35:27 +09:00
Yuya Nishihara	dbb1adaf0a	git: move import-related types close to import_refs() function	2023-11-14 17:35:27 +09:00
Yuya Nishihara	f991705e47	tests: add test for importing missing ancestor of HEAD If a commit pointed to by HEAD or ref is missing, the ref is considered invalid and excluded by import_refs(). The current test behavior appears to depend on some in-memory cache of git2::Repository.	2023-11-14 17:35:27 +09:00
Yuya Nishihara	8e143541a5	operation: propagate OpStoreError from parents() We need to .collect_vec() the parents iterator to temporary buffer since the borrowed iterator can't be returned back to the dag_walk functions. Another option is to clone op_store and parent ids to remove &self lifetime from the iterator, but that also means a temporary Vec is created.	2023-11-14 07:16:39 +09:00
Yuya Nishihara	8ddad859e8	dag_walk: add fallible topo_order_reverse_lazy() Unlike dfs_ok(), this function short-circuits at an Err as we use non-lazy topo_order_forward() internally. I think that's good enough. If we implement GC on operation log, deleted parents will be excluded (or mapped to tombstone) by caller. An Err shouldn't mean it's GC-ed.	2023-11-14 07:16:39 +09:00
Yuya Nishihara	3d5a07e86a	dag_walk: add fallible dfs(), topo_order(), heads(), and closest_common_node() This unblocks the use of Result<T, E> in op.parents(). There are two ways to encode errors: a. impl IntoIterator<Item = Result<T, E>> b. Result<V, E> where V: FromIterator<Item = T> I think (a) is more natural to algorithms like dfs(), which can process error nodes transparently. Still the caller might have to collect the source iterator to a temporary Vec to conform to the neighbors_fn signature. It's not easy for neighbors_fn to return an iterator borrowing the input node. We already have GAT, but don't have return-position impl Trait in trait yet.	2023-11-14 07:16:39 +09:00
Yuya Nishihara	e5a9a26911	dag_walk: remove unused and untested leaves() function	2023-11-14 07:16:39 +09:00
Anton Bulakh	e3a1e5b80e	sign: Implement storage for digital commit signatures Recognize signature metadata from git commit objects, implement a basic version of that for the native backend. Extract the signed data (a commit binary repr without the signature) to be verified later.	2023-11-12 03:37:13 +02:00
Yuya Nishihara	b42a69db6d	git_backend: configure committer (and author) of gix::Repository Otherwise, ref updates would fail if we port git::export_refs() to gitoxide. This change isn't strictly needed for the backend itself, but we'll reuse the gix::Repository instance created by the backend when importing and exporting Git refs.	2023-11-11 22:35:54 +09:00
Yuya Nishihara	ea32c0cb9e	git_backend: pass UserSettings to GitBackend constructors	2023-11-11 22:35:54 +09:00
Yuya Nishihara	8a2048a0e5	repo: pass UserSettings to store factories and initializers GitBackend will use it to configure gix::Repository. I think UserSettings is generally useful to pass store-specific parameters, so I've updated all factory functions.	2023-11-11 22:35:54 +09:00
Yuya Nishihara	6125fb160e	op_store: embed details in operation/view not found error This is basically a copy of BackendError::ObjectNotFound. The failed id may be either view or operation id.	2023-11-11 22:35:40 +09:00
Yuya Nishihara	ea96513fd1	op_store: deduplicate functions that map io::Error to OpStoreError io_to_read_error() also translates ErrorKind::NotFound.	2023-11-11 22:35:40 +09:00
Martin von Zweigbergk	120115a20d	cli: pass `MaterializedTreeValue` into `git_diff_part()` Just a little preparation for reading the materialized values concurrently.	2023-11-10 04:54:47 -08:00
Waleed Khan	a60733f632	tree: remove unsafe with `ouroboros` for self-referential iterators	2023-11-09 21:50:29 -08:00
Yuya Nishihara	6ff3a4f3df	repo: reimplement DirtyCell without using unsafe While the safe implementation is a bit more complex (and probably more branchy), I don't think the runtime overhead would matter here. Let's remove one more unsafe for better code maintainability.	2023-11-10 07:42:45 +09:00
Martin von Zweigbergk	9b24d24612	conflicts: add another helper for materializing a tree value We have a few places where we have a `MergedTreeValue` and need to read the data associated with it so we can write to the working copy or include it in a diff. Let's extract some of that shared logic to a function so we can reuse it. I plan to use it for reading file contents in advance while streaming a diff in `local_working_copy` soon (and probably in `jj diff` thereafter), but I think it seems like an improvement on its own.	2023-11-08 21:21:38 -08:00
Martin von Zweigbergk	65bd5cacba	working copy: on checkout, move read from store out of `write_()` functions I'd like to read N files ahead from the backend, to avoid serializing too many server calls on backends that are backed by a server. Moving the reads a little earlier is a little step towards that. The `TreeState::write_()` functions can now be made into free/static functions if we prefer.	2023-11-08 21:21:38 -08:00
Yuya Nishihara	084b99e1e2	index: rewrite CompositeIndex::entry_by_pos() by leveraging ancestors iterator We no longer have "unsafe" in this function, so let's use the iterator API instead of recursion. Apparently I haven't pushed this change before because unsafe in .find_map() looked scary.	2023-11-08 12:09:33 +09:00
Anton Bulakh	d27351b978	misc: drop a few low-hanging unsafes Remove a couple of unnecessary unsafes: - The NonZeroUsize is a constant where the unwrap will optimize away anyway and we don't have an unsafe without any good reason there :) - The other two were simply not needed, lifetimes worked fine, maybe Rust became better since that code was written? NLL? Anyway, they're gone now	2023-11-08 02:16:08 +02:00
Yuya Nishihara	2ac9865ce7	revset: exclude @git branches from remote_branches() As discussed in Discord, it's less useful if remote_branches() included Git-tracking branches. Users wouldn't consider the backing Git repo as a remote. We could allow explicit 'remote_branches(remote=exact:"git")' query by changing the default remote pattern to something like 'remote=~exact:"git"'. I don't know which will be better overall, but we don't have support for negative patterns anyway.	2023-11-08 07:34:30 +09:00
Yuya Nishihara	59640496aa	cargo: sort dependencies list alphabetically	2023-11-07 23:46:05 +09:00
Yuya Nishihara	d1b0c4cc48	merge: relax input type of Merge::from_removes_adds()	2023-11-07 17:10:12 +09:00
Yuya Nishihara	e0c35684af	merge: rename Merge::new() to Merge::from_removes_adds() Since (removes, adds) pair is no longer the canonical representation of Merge, the name Merge::new() seems too generic. Let's give more verbose name.	2023-11-07 17:10:12 +09:00
Yuya Nishihara	2c128f1b61	merged_tree: convert from legacy conflicts through interleaved list This is basically the same change as the previous commit.	2023-11-07 17:10:12 +09:00
Yuya Nishihara	a734f46130	merged_tree: build unresolved Merge<Tree> from interleaved list We no longer need to iterate removes and adds separately.	2023-11-07 17:10:12 +09:00
Yuya Nishihara	dd26b7be40	merge: add Merge constructor that accepts interleaved values Also migrated some callers of 3-way merge, where [left, base, right] order looks okay.	2023-11-07 17:10:12 +09:00
Yuya Nishihara	803b41c426	merge: load legacy Merge values without allocating intermediate buffers	2023-11-07 17:10:12 +09:00
Yuya Nishihara	09987c1d27	merge: micro-optimize allocation of Merge object for resolved value It's super common that a Merge object holds a resolved value, so let's inline up to 1 element. T of Merge<T> usually consists of a couple of pointer-sized fields. I don't see any measurable speed up, but it's no worse than the original.	2023-11-07 17:10:12 +09:00
Martin von Zweigbergk	1140295829	merged_tree: extract polling of tree futures into a function	2023-11-07 00:03:50 -08:00
Martin von Zweigbergk	c77417d4e4	merged_tree: drop outer loop in `TreeDiffStreamImpl::poll_next()` As suggested by Yuya. I also added a comment and an assertion in the case where we return `Poll::Pending`.	2023-11-07 00:03:50 -08:00
Martin von Zweigbergk	d989d4093d	merged_tree: let backend influence whether to use new diff algo Since the concurrent diff algorithm is significantly slower when using the Git backend, I think we'll have to switch between the two algorithms depending on backend. Even if the concurrent version always performed as well as the sequential version, exactly how concurrent it should be probably still depends on the backend. This commit therefore adds a function to the `Backend` trait, so each backend can say how much concurrency they deal well with. I then use that number for choosing between the sequential and concurrent versions in `MergedTree::diff_stream()`, and also to decide the number of concurrent reads to do in the concurrent version.	2023-11-06 23:12:02 -08:00
Martin von Zweigbergk	f40adb84fc	merged_tree: add a `Stream` for concurrent diff of trees When diffing two trees, we currently start at the root and diff those trees. Then we diff each subtree, one at a time, recursively. When using a commit backend that uses remote storage, like our backend at Google does, diffing the subtrees one at a time gets very slow. We should be able to diff subtrees concurrently. That way, the number of roundtrips to a server becomes determined by the depth of the deepest difference instead of by the number of differing trees (times 2, even). This patch implements such an algorithm behind a `Stream` interface. It's not hooked in to `MergedTree::diff_stream()` yet; that will happen in the next commit. I timed the new implementation by updating `jj diff -s` to use the new diff stream and then ran it on the Linux repo with `jj diff --ignore-working-copy -s --from v5.0 --to v6.0`. That slowed down by ~20%, from ~750 ms to ~900 ms. Maybe we can get some of that performance back but I think it'll be hard to match `MergedTree::diff()`. We can decide later if we're okay with the difference (after hopefully reducing the gap a bit) or if we want to keep both implementations. I also timed the new implementation on our cloud-based repo at Google. As expected, it made some diffs much faster (I'm not sure if I'm allowed to share figures).	2023-11-06 23:12:02 -08:00
Martin von Zweigbergk	9af09ec236	test_merged_tree: test diffing with a matcher We didn't have any tests at all for `MergedTree::diff()` with a matcher other than `EverythingMatcher`. This patch adds a few.	2023-11-06 23:12:02 -08:00
Martin von Zweigbergk	16aa8e8f10	test_merged_tree: nest each part of `test_diff_dir_file()` I'm about to add a few more checks for diffing with a matcher. I think it will help make it readable and reduce the risk of mixing up variables between each part of the test if we use some nested blocks. I also removed some unnecessary `.clone()` calls while at it.	2023-11-06 23:12:02 -08:00
Martin von Zweigbergk	c9ce80a82a	merged_tree: extract function for merged iterator of basenames in diff I'm going to reuse this for stream/async diffing.	2023-11-06 23:12:02 -08:00
Martin von Zweigbergk	b72f04ba61	merged_tree: rename `all_tree_conflict_names()` since it's not about conflicts	2023-11-06 23:12:02 -08:00
Yuya Nishihara	3fddc31da8	merge: remove Merge::take() which is no longer used Merge::take() is no longer a cheap function. We can add into_vec() if needed.	2023-11-07 06:52:35 +09:00
Yuya Nishihara	92dfe59ade	refs: run non-trivial merge of ref targets without destructuring Merge object	2023-11-07 06:52:35 +09:00
Yuya Nishihara	93601541cb	refs: use swap_remove() in non-trivial merge of ref targets I'm going to add a Merge method that removes negative/positive terms pair, and swap_remove() is the easiest option. The order of the conflicted ref targets doesn't matter.	2023-11-07 06:52:35 +09:00
Yuya Nishihara	895bbce8c0	files: use borrowed Merge iterator in merge() Since the underlying Merge data type is no longer (Vec<T>, Vec<T>), it doesn't make sense to build removes/adds Vecs and concatenate them.	2023-11-07 06:52:35 +09:00
Yuya Nishihara	f1898a31b5	merge: simply print interleaved conflict values in debug output We could apply that for the resolved case, but Resolved/Conflicted label seems more useful than just printing Merge([value]).	2023-11-06 07:21:06 +09:00
Yuya Nishihara	b07b370ed3	merge: simply generate content hash from interleaved values	2023-11-06 07:21:06 +09:00
Yuya Nishihara	46ffb2f0b2	merge: store negative/positive terms internally in an interleaved Vec Many callers use interleaved iterators, and recently-added serialization code is built on top of that, so I think it's better to store terms in that format. map() functions no longer use MergeBuilder as we know the mapped values are ordered properly. flatten() and simplify() are reimplemented to work with the interleaved values. The other changes are trivial.	2023-11-06 07:21:06 +09:00
Yuya Nishihara	287728fee7	merge: extract trivial_merge() that takes interleaved adds/removes iterator The Merge type will store interleaved terms instead of separate adds/removes vecs.	2023-11-06 07:21:06 +09:00
Yuya Nishihara	01523ba4f3	merge: rewrite bottom half of trivial_merge() for non-copyable types The input values of trivial_merge() will be changed to Iterator<Item = T> where T: Eq + Hash. It could be <Item = &'a T>, but it doesn't have to be.	2023-11-06 07:21:06 +09:00
Martin von Zweigbergk	7c923514ee	git: add config to disable abandoning of unreachable commits Some users prefer to have commits not get abandoned when importing refs. This adds a config option for that. Closes #2504.	2023-11-05 06:10:54 -08:00
Martin von Zweigbergk	7bf8906f9c	git: extract a function for abandoning unreachable commits The motivation for this is so we can easily skip calling the function if the user has opted out of the propagation of abandoned commits we usually do (#2504). However, it seems like a good piece of code to extract regardless of that feature.	2023-11-05 06:10:54 -08:00
Yuya Nishihara	d9fbf21794	merge: have Merge::adds()/removes() return iterator The Merge type will be changed to store interleaved values internally.	2023-11-05 16:43:06 +09:00
Yuya Nishihara	1c6913d618	merge: use Merge::iter() instead of adds()/removes() where order doesn't matter Merge::iter() will be a slice::Iter, and be more efficient than chaining adds and removes.	2023-11-05 16:43:06 +09:00
Yuya Nishihara	99e6ff493a	merge: fix copy-paste error in doc comment for adds()	2023-11-05 16:43:06 +09:00
Yuya Nishihara	f6d85c51cd	merge: add non-optional Merge accessor to the zeroth value We have a few callers which just need to obtain an object common among all the merge values. Let's add a non-failing accessor for that purpose.	2023-11-05 16:43:06 +09:00
Yuya Nishihara	b12c688ea0	merge: add method for indexed adds/removes access The current adds()/removes() will be changed to return iterators.	2023-11-05 16:43:06 +09:00
Martin von Zweigbergk	6a5615c933	rewrite: use `MergedTree::diff_stream()` when restoring from tree	2023-11-04 21:07:49 -07:00
Yuya Nishihara	602b44258e	workspace: add function that initializes colocated git repository One less git2 API use in CLI. The function name GitBackend::init_colocated() is a bit odd, but we need to specify the work-tree path, not the ".git" repo path. So we can't eliminate the notion of the working copy path anyway.	2023-11-05 08:48:35 +09:00
Yuya Nishihara	77e16243d6	tests: assert paths of initialized GitBackend	2023-11-05 08:48:35 +09:00
Yuya Nishihara	ce46c10c96	git_backend: extract inner function that initializes backend with open git repo	2023-11-05 08:48:35 +09:00
Yuya Nishihara	dce640aaf1	workspace: one less cloning of workspace_root in init_external_git() Just a trivial code cleanup.	2023-11-05 08:48:35 +09:00
Yuya Nishihara	c866b4a42d	workspace: fix repository path in init_internal_git() doc comment Also rephrased "Git backend" as "Git repo" since the new backend storage will be created.	2023-11-05 08:48:35 +09:00
Antoine Cezar	5973ab47b9	commands: move rebase_to_dest_parent to jj_lib::rewrite What makes rebase_to_dest_parent a good candidate for the jj_lib::rewrite module: - It is used both in obslog and interdiff. It's a sign that it may be moved to a lower layer. - CommandError is returned by converting from TreeMergeError, not explicitly. - It only uses jj_lib::rewrite functions.	2023-11-03 20:48:00 +01:00
Martin von Zweigbergk	904c37d36d	working copy: use `MergedTree::diff_stream()` This will make it a little faster to update the working copy at Google once we've made `MergedTree::diff_stream()` fetch trees concurrently. (It only makes it a little faster because we still fetch files serially.)	2023-11-03 08:15:10 -07:00
Martin von Zweigbergk	72245cfac5	merged_tree: add `Stream`-based version of `diff()`, delegating for now I'm going to implement a `Stream`-based version optimized for high-latency (RPC-based) commit backends. So far, that implementation is about 20% slower in the Linux repo when running `jj diff --ignore-working-copy -s --from v5.0 --to v6.0`. I think that's almost only because the algorithm is different, not because it's async per se. This commit adds a `Stream`-based version of `MergedTree::diff()` that just wraps the regular iterator in a stream. I updated `jj diff` to use it. I couldn't measure any difference on the command above in the Linux repo. I think that means we can safely use the same `Stream`-based interface regardless of backend, even if we end up needing two different implementations of the `Stream`. We would then be using the wrapped iterator from this commit for local backends, and the new implementation for remote backends. But ideally we can make the remote-friendly implementation fast enough that we don't need two implementations.	2023-11-03 08:15:10 -07:00
Martin von Zweigbergk	24b706641f	async: switch to `pollster`'s `block_on()` During the transition to using more async code, I keep running into https://github.com/rust-lang/futures-rs/issues/2090. Right now, I want to convert `MergedTree::diff()` into a `Stream`. I don't want to update all call sites at once, so instead I'm adding a `MergedTree::diff_stream()` method, which just wraps `MergedTree::diff()` in a `Stream`. However, since the iterator is synchronous, it needs to block on the async `Backend::read_tree()` calls. If we then also block on the `Stream` in the CLI, we run into the panic.	2023-11-03 08:15:10 -07:00
Martin von Zweigbergk	3a378dc234	cli: add a function for restoring part of a tree from another tree We had similar code in two places for restoring paths from one tree to another. Let's reuse it instead. I put the new function in the `rewrite` module. I'm not sure if that's the right place. Maybe it belongs in `tree`?	2023-11-02 06:07:45 -07:00
Yuya Nishihara	162dcd49b4	cli: rewrite base GitIgnoreFile lookup to use gitoxide instead of libgit2 Since gix::Repository::config_snapshot() borrows the repo instance, it has to be allocated in caller's stack. That's why GitBackend::git_config() is removed.	2023-11-02 19:33:06 +09:00
Yuya Nishihara	c88e69ad6f	git_backend: replace git2::Repository with gix::Repository My gut feeling is that gitoxide aims to be more transparent than libgit2. We'll need to know more about the underlying Git data model. Random comments on gix API: * gix::Repository provides API similar to git2::Repository, but has less "convenient" functions. For example, we need to use .find_object() + .try_to/into_<kind>() instead of .find_<kind>(). * gix::Object, Blob, etc. own raw data as bytes. gix::object and gix::objs types provide high-level views on such data. * Tree building is pretty low-level compared to git2. * gix leverages bstr (i.e. bytes) extensively. It's probably not difficult to migrate git::import/export_refs(). It might help eliminate the startup overhead of libssl initialization. The gix-based GitBackend appears to be a bit faster, but that wouldn't practically matter. #2316	2023-11-02 19:33:06 +09:00
Yuya Nishihara	9a86b77e38	tests: force gitoxide to not load config nor use "main" as default branch AFAIK, there's no global config state for gitoxide. We can use Config::isolated() in tests, but GitBackend should load config files in a normal way. https://docs.rs/gix/0.55.2/gix/open/permissions/struct.Config.html#method.isolated https://docs.rs/gix/0.55.2/gix/init/constant.DEFAULT_BRANCH_NAME.html	2023-11-02 19:33:06 +09:00
Yuya Nishihara	f5a61dc2b7	git_backend: open just-initialized repo with canonicalized path Otherwise, the initialized repo could have a different work-dir path than the load()-ed one. libgit2 appears to do some normalization somewhere, but gix won't.	2023-11-02 19:33:06 +09:00
Yuya Nishihara	fd187d266f	git_backend: box GitBackendInit/LoadError up front These error enums will wrap gix error types, and will become big enough for clippy to complain.	2023-11-02 19:33:06 +09:00
Yuya Nishihara	b48569b104	cargo: add gitoxide (or gix) dependency I've enabled the "index" component from the "basic" feature set, which would be needed to implement colocated repo functionality. The doc suggests that a library shouldn't activate "max-performance-safe", but our crate is also an application so it would be okay to enable the feature. We'll need "parallel" anyway to make GitBackend Sync. https://docs.rs/gix/latest/gix/#feature-flags	2023-11-02 19:33:06 +09:00
Isabella Basso	749d8bb15a	git: preserve HEAD when possible Closes: #2210	2023-11-01 08:23:52 -03:00
Yuya Nishihara	1788b5014e	git_backend: remove redundant copy back of author timestamp Only the committer timestamp can be updated inside a loop.	2023-10-31 06:51:27 +09:00
Yuya Nishihara	f5aa739c70	git_backend: use .strip_suffix() instead of manual slicing	2023-10-31 06:51:27 +09:00
Yuya Nishihara	9bd84c55e0	git_backend: use file mode extensively in read_tree() Both filemode() and kind() are calculated from the same underlying data, and kind() is libgit2-specific API.	2023-10-31 06:51:27 +09:00
Yuya Nishihara	b3c9cab12d	git_backend: handle read_tree() lookup/encoding errors gracefully	2023-10-31 06:51:27 +09:00
Yuya Nishihara	847adc832f	git_backend: use lossy conversion to decode non-UTF-8 commit message If message() returned None, it doesn't mean the commit message is empty. I originally mapped it to an error, but that made import of linux repo fail. https://docs.rs/git2/latest/git2/struct.Commit.html#method.message	2023-10-31 06:51:27 +09:00
Yuya Nishihara	06c254e742	git_backend: use non-owned str::from_utf8() to decode symlink target Just for consistency with the other changes. str::Utf8Error is 2 words long, so I removed the boxing.	2023-10-31 06:51:27 +09:00
Yuya Nishihara	d1c71c05c9	git_backend: remove redundant error handling for invalid hash length The only error that could be returned by libgit2 is invalid hash length, and we check that explicitly. If we switch the backends to gitoxide, there will be panicking constructor. https://docs.rs/git2/latest/git2/struct.Oid.html#method.from_bytes	2023-10-31 06:51:27 +09:00
Ilya Grigoriev	8bc3f5fd67	cli rebase: Allow `jj rebase -r` to rebase a commit onto a descendant #1188 There are some additional test changes because children and descendants are now rebased before the commit itself.	2023-10-30 10:56:27 -07:00
Yuya Nishihara	2d3fe7eee2	rewrite: replace use of "lift"ed function application with try_collect() Also removed redundant borrow + clone.	2023-10-30 13:50:37 +09:00
Martin von Zweigbergk	35a23172ec	backend: delete unused `Phase` enum The idea was to support phases like in hg, but that hasn't happened yet. We can add back this simple enum if we do add support for phases.	2023-10-29 12:02:40 -07:00
Martin von Zweigbergk	cfcdd71865	backend: make `read_conflict` synchronous again This avoids https://github.com/rust-lang/futures-rs/issues/2090. I don't think we need to worry about reading legacy conflicts asynchronously - async is really only useful for Google's backend right now, and we don't use the legacy format at Google. In particular, I don't want `MergedTree::value()` to have to be async.	2023-10-28 16:45:40 -07:00
Martin von Zweigbergk	a1ef9dc845	merged_tree: propagate backend errors in diff iterator I want to fix error propagation before I start using async in this code. This makes the diff iterator propagate errors from reading tree objects. Errors include the path and don't stop the iteration. The idea is that we should be able to show the user an error inline in diff output if we failed to read a tree. That's going to be especially useful for backends that can return `BackendError::AccessDenied`. That error variant doesn't yet exist, but I plan to add it, and use it in Google's internal backend.	2023-10-26 06:20:56 -07:00
Martin von Zweigbergk	309f1200d6	merge: introduce a type alias for `Merge<Option<TreeValue>>` Reasons to introduce this alias: * Reduces complexity of a type, to silence Clippy warnings in the future if we use this type as a type parameter * The type is used quite frequently, so it makes sense to have a name for it * It's easier to visually scan for the end of the type when you don't have to match opening and closing angle brackets	2023-10-26 06:20:56 -07:00
Martin von Zweigbergk	6ad71e658d	merged_tree: rename `MergedTreeValue` to `MergedTreeVal` I'm going to add `MergedTreeValue` as an alias for `Merge<Option<TreeValue>>`, but we already have a type by that name in `merged_tree`. This patch renames it away, to make room for the new alias. I used `MergedTreeVal` for this borrowing version to be a bit like how `str` is a borrowed version of `String`.	2023-10-26 06:20:56 -07:00
Yuya Nishihara	1bfe5b5b56	cli: add string pattern support to "git push --branch" Since "jj git fetch --branch" supports glob patterns, users would expect that "jj git push --branch glob:.." also works. The error handling bits are copied from "branch" sub commands. We might want to extract it to a common helper function, but I haven't figured out a reasonable boundary point yet.	2023-10-26 04:51:17 +09:00
Yuya Nishihara	a6ac9b46e7	git: simply call fetch() with one or more branch name filters	2023-10-25 03:58:48 +09:00
Yuya Nishihara	560b63544a	cli: parse "git fetch --branch" parameter as string pattern Even though "*" can't be used as a branch name to fetch, it should be better to explicitly enable glob matching like the other commands.	2023-10-25 03:58:48 +09:00
Martin von Zweigbergk	8157d4ff63	merge: materialize conflicts in executable files like regular files AFAICT, all callers of `Merge::to_file_merge()` are already well prepared for working with executable files. It's called from these places: * `local_working_copy.rs`: Materialized conflicts are correctly updated using `Merge::with_new_file_ids()`. * `merge_tools/`: Same as above. * `cmd_cat()`: We already ignore the executable bit when we print non-conflicted files, so it makes sense to also ignore it for conflicted files. * `git_diff_part()`: We print all conflicts with mode "100644" (the mode for regular files). Maybe it's best to use "100755" for conflicts that are unambiguously executable, or maybe it's better to use a fake mode like "000000" for all conflicts. Either way, the current behavior seems fine. * `diff_content()`: We use the diff content in various diff formats. We could add more detail about the executable bits in some of them, but I think the current output is fine. For example, instead of our current "Created conflict in my-file", we could say "Created conflict in executable file my-file" or "Created conflict in ambiguously executable file my-file". That's getting verbose, though. So, I think all we need to do is to make `Merge::to_file_merge()` not require its inputs to be non-executable. Closes #1279.	2023-10-24 06:45:45 -07:00
Martin von Zweigbergk	21b11df8a9	merge: make non-conflicted debug string for `Merge` shorter Resolved states are the most common and the current format is pretty verbose. Let's print it as if `Merge` were an enum with `Resolved` and `Conflicted` variants instead.	2023-10-24 06:45:45 -07:00
Yuya Nishihara	831dc84c5b	git: remove RefName::GitRef variant which isn't used anymore	2023-10-24 09:45:01 +09:00
Yuya Nishihara	a756333216	git: move RefName enum from view module It's no longer used by View API.	2023-10-24 09:45:01 +09:00
Yuya Nishihara	95919699d4	repo: add merge methods per ref kind, remove RefName indirection Since local/remote branches are now of different types, it doesn't make much sense to dispatch merging through RefName. Let's add merge_<kind>() methods instead.	2023-10-24 09:45:01 +09:00
Yuya Nishihara	6dfe1572a0	git: expand RefName match arms in import_refs() This prepares for the removal of merge_single_refs().	2023-10-24 09:45:01 +09:00
Yuya Nishihara	ffe7e5f142	repo: inline merge_single_ref() from view MutableRepo handles merging of the other kind of refs internally, and the merge function is short enough to inline. I also removed early returns since most callers provide non-identical ref targets, and merge_ref_targets() should be cheap if the inputs can be trivially merged.	2023-10-24 09:45:01 +09:00
Yuya Nishihara	5543f7a11c	refs: merge tracking state of remote branches Otherwise "jj op undo" can't roll back tracking states (whereas "op restore" can.)	2023-10-24 07:13:58 +09:00
Yuya Nishihara	3d45540ff6	refs: add function to diff RemoteRef pairs	2023-10-24 07:13:58 +09:00
Yuya Nishihara	6450194ccd	refs: rename diff_named_refs() to diff_named_ref_targets() I'm going to add RemoteRef variant of this function, and "refs" there will be ambiguous.	2023-10-24 07:13:58 +09:00
Yuya Nishihara	59eb03eec5	refs: add helper function to iterate local/remote ref pairs This partially reverts the change in `30fb7995c2` "view: make local/remote branches iterator yield RemoteRef instead of RefTarget." As I'm going to add diff function for RemoteRef pairs, we'll need a generic version of merge-join iterator anyway.	2023-10-24 07:13:58 +09:00
Yuya Nishihara	16ef57907b	str_util: add more StringPattern methods for convenience The Display impl helps to format error messages. Since both Regex and glob::Pattern implement Display, it's probably okay for our type to copy that.	2023-10-22 04:07:49 +09:00
Yuya Nishihara	4c4eb31a62	view: add methods that filter local/remote branches by pattern We'll use them in CLI code.	2023-10-22 04:07:49 +09:00
Martin von Zweigbergk	578d61ec2e	git: use forward slashes in relative path to backing git repo Closes #2403.	2023-10-20 22:51:14 -05:00
Yuya Nishihara	cfcc76571c	revset: add support for glob:pattern	2023-10-21 09:55:01 +09:00
Yuya Nishihara	f7c8622981	str_util: extract StringPattern filtering iterator from revset module	2023-10-21 09:55:01 +09:00
Yuya Nishihara	2d80f071de	str_util: extract function that constructs StringPattern from string	2023-10-21 09:55:01 +09:00
Yuya Nishihara	5707a194d5	str_util: extract StringPattern from revset module Branch name filtering in CLI will be migrated to this, and I'll probably add glob:<pattern> in place of --glob option.	2023-10-21 09:55:01 +09:00
Martin von Zweigbergk	8764ad9826	conflicts: make materialization async We need to let async-ness propagate up from the backend because `block_on()` doesn't like to be called recursively. The conflict materialization code is a good place to make async because it doesn't depend on anything that isn't already async-ready.	2023-10-20 07:38:34 -07:00
Martin von Zweigbergk	e900c97618	conflicts: reduce some duplication in tests by extracting a closure	2023-10-20 07:38:34 -07:00
Martin von Zweigbergk	f541f9f3a6	cleanup: import `futures::executor::block_on()` instead of qualifying It seems we'll end up using `block_on()` quite a bit, at least until we're done transitioning to async, and the function name doesn't conflict with anything else, so let's always import it when we need it.	2023-10-20 07:38:34 -07:00
Yuya Nishihara	390b3208da	revset: always include non-tracking remote branches in suggestion For the same reason as the previous commit. Non-tracking remote branch shouldn't be shadowed by a local branch of the same name.	2023-10-17 16:42:36 +09:00
Yuya Nishihara	9e0b9e6dc8	refs: error out push if non-tracking remote branches exist We can provide more actionable error message than "not fast-forwardable". If the push was fast-forwardable, "jj branch track" should be able to merge the remote branch without conflicts, so the added step would be minimal.	2023-10-17 15:06:03 +09:00
Yuya Nishihara	089503abfb	refs: classify push action based on tracking target Although this is logically correct, the error message is a bit cryptic. It's probably better to reject push if non-tracking remote branches exist. #1136	2023-10-17 15:06:03 +09:00
Yuya Nishihara	30fb7995c2	view: make local/remote branches iterator yield RemoteRef instead of RefTarget We'll use remote_ref.tracking_target() to classify push action, but not all callers of local_remote_branches() need tracking_target() instead of target.	2023-10-17 15:06:03 +09:00
Yuya Nishihara	e0965c4533	git: on push, update jj's view of remote branches without using import_refs() This means that the commits previously pinned by remote branches are no longer abandoned. I think that's more correct since "push" is the operation to propagate local view to remote, and uninteresting commits should have been locally abandoned.	2023-10-17 15:06:03 +09:00
Yuya Nishihara	c8a848d260	git: prohibit push to remote named "git" Since I'm going to make git::push_branches() update the repo view internally, it should fail fast if the remote name is reserved. Before, the problem was detected on git::import_refs().	2023-10-17 15:06:03 +09:00
Yuya Nishihara	58897d79c7	git: extract push function that processes branches instead of git refs Since pushed remote branches will share the common base targets with locals, these branches should be marked as tracking. git::push_branches() will handle that. It looks ugly that the public GitBranchPushTargets type keeps "force"-d branches as a separate set, but we'll need to rework that anyway when we implement --force-with-lease behavior. So let's leave it for now. Some of the git::push_updates() tests have been migrated to the new function. I left a couple of basic tests for git::push_updates() because push_updates() will be used to implement a low-level "jj git push-refs" command.	2023-10-17 15:06:03 +09:00
Yuya Nishihara	57167cefda	git: on import_refs(), don't abandon ancestors of newly fetched refs I made import_refs() not preserve commits referenced by remote branches at `520f692a46` "git: on import_refs(), don't preserve old branches referenced by remote refs." The idea is that remote branches are weak, and commits referenced by these refs can be freely rewritten by future local changes without moving the refs. I don't think that's wrong, but `520f692a46` also made "new" remote changes be abandoned by old remote refs. This problem occurs only when git.auto-local-branch is off. I think there are two ways to fix the problem: a. pin non-tracking remote branches just like local refs b. pin newly fetched refs in addition to local refs This patch implements (b) because it's simpler and more obvious that the fetched commits would never be abandoned immediately.	2023-10-17 14:49:49 +09:00
Martin von Zweigbergk	c3b45b6fd1	workspace: make working-copy type customizable This adds support for custom `jj` binaries to use custom working-copy backends. It works in the same way as with the other backends, i.e. we write a `.jj/working_copy/type` file when the working copy is initialized, and then we let that file control which implementation to use (see previous commit). I included an example of a (useless) working-copy implementation. I hope we can figure out a way to test the examples some day.	2023-10-16 22:33:44 -07:00
Martin von Zweigbergk	6bfd618275	workspace: load working copy implementation dynamically This makes `Workspace::load()` look for a new `.jj/working_copy/type` file in order to load the right working copy implementation, just like `Repo::load()` picks the right backends based on `.jj/store/type`, `.jj/op_store/type`, etc. We don't write the file yet, and we don't have a way of adding alternative working copy implementations, so it will always be `LocalWorkingCopy` for now.	2023-10-16 22:33:44 -07:00
Martin von Zweigbergk	e1f00d9426	working copy: pass commit instead of tree into `check_out()` Our internal working copy implementations at Google will need the commit so they can walk history backwards until they get to a "public" commit. They'll then use that to tell build tools and virtual file systems to present that as a base. I'm not sure if we'll need to update `reset()` too. It's currently only used by `jj untrack`, which doesn't change the commit's parent, so it wouldn't affect any history walks.	2023-10-16 22:33:44 -07:00
Martin von Zweigbergk	7c8a0a18f9	repo: define types for backend initializer functions `ReadonlyRepo::init()` takes callbacks for initializing each kind of backend. We called these things like `op_store_initializer`. I found that confusing because it is not a `OpStoreFactory` (which is for loading an existing backend). This patch tries to clarify that by renaming the arguments and adding types for each kind of callback function.	2023-10-16 22:33:44 -07:00
Yuya Nishihara	9cafff87e1	cli: add API and branch subcommand to track/untrack remote branches This patch adds MutableRepo::track_remote_branch() as we'll probably need to track the default branch on "jj git clone". untrack_remote_branch() is also added for consistency.	2023-10-16 23:21:05 +09:00
Yuya Nishihara	4af678848d	op_store: minimal change to load/store tracking state of remote branches We could instead migrate the storage types to (local_branches, remote_views), but that would be more involved and break forward compatibility with little benefit. Maybe we can do that later when we introduce remote tags.	2023-10-16 23:21:05 +09:00
Yuya Nishihara	4cd2518be0	git: on import_refs(), respect tracking state of existing remote refs In this commit, new behavior is tested by using in-memory view data. Data persistence and track/untrack commands will be implemented soon.	2023-10-16 23:21:05 +09:00
Yuya Nishihara	a697175674	view: add tracking state to RemoteRef The state field isn't saved yet. git import/export code paths are migrated, but new tracking state is always calculated based on git.auto-local-branch setting. So the tracking state is effectively a global flag. As we don't know whether the existing remote branches have been merged into local branches, we assume that remote branches are "tracking" so long as the local counterparts exist. This means an existing locally-deleted branch won't be pushed without re-tracking it. I think it's rare to leave locally-deleted branches for long. For "git.auto-local-branch = false" setup, users might have to untrack branches if they've manually "merged" remote branches and want to continue that workflow. I considered using git.auto-local-branch setting in the migration path, but I don't think that would give a better result. The setting may be toggled after the branches got merged, and I'm planning to change its default to off for better Git interop. Implementation-wise, the state enum can be a simple bool. It's an enum just because I originally considered packing the "forgotten" concept into it. I have no idea which will be better for future extension.	2023-10-16 23:21:05 +09:00

2482 commits