mirrors/jj

mirror of https://github.com/martinvonz/jj.git synced 2025-01-07 05:16:33 +00:00

Author	SHA1	Message	Date
Anton Bulakh	5ab00e197a	backend: Inline gix::Repository::commit_as to prepare for signing Additional bonus is that this allows us to avoid creating keep refs for the intermediate commits in the data race preventing loop.	2023-11-23 22:52:20 +02:00
Yuya Nishihara	042d26049c	working_copy: lazily construct file_states BTreeMap While it got faster to build a large BTreeMap<RepoPath, _>, there's still a measurable cost. Let's eliminate it if watchman is enabled and the working copy is clean. Perhaps, we should introduce new serialization format that supports instant loading and lookup, but this hack works for the moment. I'm not sure if the new tree_state format should be flat (RepoPath, _) list, or tree like the backend storage btw. In my "linux" repo (watchman enabled): % hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1 \ "target/release-with-debug/{bin} -R ~/mirrors/linux status" Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status Time (mean ± σ): 768.9 ms ± 14.2 ms [User: 630.7 ms, System: 131.2 ms] Range (min … max): 742.3 ms … 783.1 ms 10 runs Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status Time (mean ± σ): 713.0 ms ± 16.8 ms [User: 587.9 ms, System: 116.2 ms] Range (min … max): 681.5 ms … 731.1 ms 10 runs Relative speed comparison 1.08 ± 0.03 target/release-with-debug/jj-0 -R ~/mirrors/linux status 1.00 target/release-with-debug/jj-1 -R ~/mirrors/linux status	2023-11-23 18:48:14 +09:00
Yuya Nishihara	12cd657837	working_copy: extract file_states_to_proto() helper Just minimizing the changes in the next commit. As we already have file_states_from_proto(), it makes sense to extract the "to" function.	2023-11-23 18:48:14 +09:00
Yuya Nishihara	74c4ef32aa	fsmonitor: exclude .git and .jj directories from changed files This ensures that the root fsmonitor_matcher matches nothing if there are no working-copy changes. The query result can be observed by "jj debug watchman query-changed-files". I don't have expertise on watchman query language, but using the watchman API is probably better than .filter()-ing the result manually.	2023-11-23 18:48:14 +09:00
Yuya Nishihara	1ddcaa43b3	fsmonitor: don't apply prefix matching to paths obtained from watchman If I understand it, watchman returns changed files and directories, and a directory change doesn't mean we need to scan all files under the directory.	2023-11-23 10:06:00 +09:00
Yuya Nishihara	767e94f5af	fsmonitor: drop unneeded mut from make_fsmonitor_matcher() We only need &self.working_copy_path here.	2023-11-23 10:06:00 +09:00
Yuya Nishihara	c16c89bc27	fsmonitor: keep paths relative to the workspace root Since the caller wants repo-relative paths, it doesn't make sense to convert them back and forth.	2023-11-23 10:06:00 +09:00
Yuya Nishihara	a4f6e0de0b	repo_path: extract helper that converts Path to RepoPath literally	2023-11-23 10:06:00 +09:00
Yuya Nishihara	31def4b131	cleanup: don't use debug format to print source errors	2023-11-23 10:05:37 +09:00
Yuya Nishihara	16620e0e4c	merged_tree: drop legacy tree handling from ConflictsDirItem constructor No callers pass in a legacy tree.	2023-11-21 07:45:30 +09:00
Yuya Nishihara	4ad3db2e84	merged_tree: extract value() function of non-legacy trees	2023-11-21 07:45:30 +09:00
Yuya Nishihara	ca3f549c9e	merged_tree: remove redundant clone() from ConflictIterator construction	2023-11-21 07:45:30 +09:00
Martin von Zweigbergk	acc35a89a8	merged_tree: inline non-recursive entry iterator	2023-11-19 20:29:40 -10:00
Martin von Zweigbergk	426f6d0cdd	merged_tree: inline non-recursive conflict iterator The abstraction is no longer useful since we made the types not self-referential.	2023-11-19 20:29:40 -10:00
Yuya Nishihara	5186066cf5	working_copy: simply collect() proto file states into BTreeMap Suppose the input list is presorted, sorting a sorted vec would be cheaper than .insert()-ing sorted items one by one. In my "linux" repo (watchman eanbled): - jj-0: baseline - jj-1: previous (don't randomize by HashMap) - jj-2: this % hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1,jj-2 \ "target/release-with-debug/{bin} -R ~/mirrors/linux status" Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status Time (mean ± σ): 1.034 s ± 0.020 s [User: 0.881 s, System: 0.212 s] Range (min … max): 1.011 s … 1.068 s 10 runs Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status Time (mean ± σ): 849.3 ms ± 13.8 ms [User: 710.7 ms, System: 199.3 ms] Range (min … max): 821.7 ms … 870.2 ms 10 runs Benchmark 3: target/release-with-debug/jj-2 -R ~/mirrors/linux status Time (mean ± σ): 786.2 ms ± 16.7 ms [User: 650.7 ms, System: 204.1 ms] Range (min … max): 760.8 ms … 805.2 ms 10 runs Relative speed comparison 1.32 ± 0.04 target/release-with-debug/jj-0 -R ~/mirrors/linux status 1.08 ± 0.03 target/release-with-debug/jj-1 -R ~/mirrors/linux status 1.00 target/release-with-debug/jj-2 -R ~/mirrors/linux status	2023-11-20 08:29:33 +09:00
Yuya Nishihara	ee6a1e2c0a	working_copy: don't build intermediate HashMap from proto file states According to the doc, this is compatible with the map syntax. https://protobuf.dev/programming-guides/proto3/#maps This change means that the serialized file states are sorted by RepoPath, so BTreeMap<RepoPath, _> can be reconstructed with fewer cache misses. In my "linux" repo (watchman enabled): - jj-0: baseline - jj-1: this % hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1,jj-2 \ "target/release-with-debug/{bin} -R ~/mirrors/linux status" Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status Time (mean ± σ): 1.034 s ± 0.020 s [User: 0.881 s, System: 0.212 s] Range (min … max): 1.011 s … 1.068 s 10 runs Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status Time (mean ± σ): 849.3 ms ± 13.8 ms [User: 710.7 ms, System: 199.3 ms] Range (min … max): 821.7 ms … 870.2 ms 10 runs Relative speed comparison 1.32 ± 0.04 target/release-with-debug/jj-0 -R ~/mirrors/linux status 1.08 ± 0.03 target/release-with-debug/jj-1 -R ~/mirrors/linux status Cache-misses got reduced: % perf stat -e task-clock,cycles,instructions,cache-references,cache-misses \ -- ./target/release-with-debug/jj-0 -R ~/mirrors/linux --no-pager status 1,091.68 msec task-clock # 1.032 CPUs utilized 4,179,596,978 cycles # 3.829 GHz 6,166,231,489 instructions # 1.48 insn per cycle 134,032,047 cache-references # 122.776 M/sec 29,322,707 cache-misses # 21.88% of all cache refs 1.057474164 seconds time elapsed 0.897042000 seconds user 0.194819000 seconds sys % perf stat -e task-clock,cycles,instructions,cache-references,cache-misses \ -- ./target/release-with-debug/jj-1 -R ~/mirrors/linux --no-pager status 927.05 msec task-clock # 1.083 CPUs utilized 3,451,299,198 cycles # 3.723 GHz 6,222,418,272 instructions # 1.80 insn per cycle 98,499,363 cache-references # 106.251 M/sec 11,998,523 cache-misses # 12.18% of all cache refs 0.855938336 seconds time elapsed 0.720568000 seconds user 0.207924000 seconds sys	2023-11-20 08:29:33 +09:00
Yuya Nishihara	56047cb7ec	working_copy: don't pass all proto data to from_proto() functions Just a code cleanup. This allows us to consume proto fields if needed. I also removed redundant .clone() and .as_str().	2023-11-20 08:29:33 +09:00
Yuya Nishihara	d7e63837f4	index: use extend_wanted/unwanted() to initialize RevWalk	2023-11-17 21:38:56 +09:00
Yuya Nishihara	a9f3dd95e5	index: remove index field from RevWalkQueue, move it to caller	2023-11-17 21:38:56 +09:00
Yuya Nishihara	ff1bb3133e	index: move index-dependent operations out of RevWalkQueue RevWalkQueue no longer depend on the RevWalkIndex trait, and the index field can be removed.	2023-11-17 21:38:56 +09:00
Yuya Nishihara	32a7b33e56	index: optimize RevWalk to store IndexPosition instead of IndexEntry The idea is the same as the heads_pos() change in `9832ee205d`. While IndexEntry::position() should be cheap, saving 20 bytes per entry appears to improve the performance in mid-size repos. In my "linux" repo: revsets/all() ------------- baseline 1.24 156.0±1.06ms this 1.00 126.0±0.51ms I don't see significant difference in small-sized repos like "jj" or "git". IndexEntryByPosition isn't removed since it's still used by the revset engine.	2023-11-17 21:38:56 +09:00
Martin von Zweigbergk	9be24db051	tree: make TreeEntriesDirItem not self-referential This removes the last use of `ouroboros`. Since `TreeEntriesDirItem` is only used in "legacy trees" (before tree-level conflicts), I didn't bother to check the performance impact. I also didn't bother to check the matcher before adding the entries to the list, instead leaving that where it is in `Iterator::next()`.	2023-11-17 03:50:34 -08:00
Martin von Zweigbergk	c0295c5dbc	merged_tree: make ConflictsDirItem not self-referential This removes the last use of `ouroboros` in `merged_tree.rs`. The set of conflicts to iterate is usually so small that I didn't bother checking the performance impact.	2023-11-17 03:50:34 -08:00
Martin von Zweigbergk	e1a02c5c5b	merged_tree: make TreeDiffDirItem not self-referential This removes another dependency on `ouroboros`, for a small performance hit: ``` ❯ hyperfine --warmup 3 --runs 30 \ '/tmp/jj-before --ignore-working-copy diff -s --from v5.0 --to v6.0' \ '/tmp/jj-after --ignore-working-copy diff -s --from v5.0 --to v6.0' Benchmark 1: /tmp/jj-before --ignore-working-copy diff -s --from v5.0 --to v6.0 Time (mean ± σ): 689.7 ms ± 23.9 ms [User: 400.0 ms, System: 289.8 ms] Range (min … max): 666.9 ms … 759.2 ms 30 runs Benchmark 2: /tmp/jj-after --ignore-working-copy diff -s --from v5.0 --to v6.0 Time (mean ± σ): 710.9 ms ± 19.2 ms [User: 420.4 ms, System: 290.6 ms] Range (min … max): 688.5 ms … 752.0 ms 30 runs Summary '/tmp/jj-before --ignore-working-copy diff -s --from v5.0 --to v6.0' ran 1.03 ± 0.05 times faster than '/tmp/jj-after --ignore-working-copy diff -s --from v5.0 --to v6.0' ```	2023-11-17 03:50:34 -08:00
Martin von Zweigbergk	61d87fe296	merged_tree: make `TreeEntriesIterator` not self-referential While importing the `ouroboros` crate and the `aliasable` crate it depends on, the "unsafe Rust reviewer" expressed some concern that they contain a lot of unsafe code that's hard to review. We can avoid the unsafe code altogether by making `TreeEntriesIterator` not self-refential. Instead, we can collect the matching entries in an individual tree up front. It does have some performance cost: ``` ❯ hyperfine --warmup 3 --runs 30 \ '/tmp/jj-before --ignore-working-copy files -r v6.0' \ '/tmp/jj-after --ignore-working-copy files -r v6.0' Benchmark 1: /tmp/jj-before --ignore-working-copy files -r v6.0 Time (mean ± σ): 461.4 ms ± 14.3 ms [User: 232.1 ms, System: 229.4 ms] Range (min … max): 443.4 ms … 496.3 ms 30 runs Benchmark 2: /tmp/jj-after --ignore-working-copy files -r v6.0 Time (mean ± σ): 482.0 ms ± 14.3 ms [User: 257.2 ms, System: 224.9 ms] Range (min … max): 461.8 ms … 513.3 ms 30 runs Summary '/tmp/jj-before --ignore-working-copy files -r v6.0' ran 1.04 ± 0.04 times faster than '/tmp/jj-after --ignore-working-copy files -r v6.0' ``` I think that's acceptable.	2023-11-17 03:50:34 -08:00
Yuya Nishihara	93fbcec2f7	index: use BinaryHeap instead of BTreeSet in common_ancestors_pos() For the same reason as the heads_pos() change. We just want to omit duplicated items.	2023-11-16 08:27:59 +09:00
Yuya Nishihara	d4059520a9	index: cache generation numbers during common_ancestors_pos() computation I'm not sure if this is better, but common_ancestors_pos() would have a similar property to heads_pos().	2023-11-16 08:27:59 +09:00
Yuya Nishihara	ea4bdd718d	index: use "while let" in common_ancestors_pos()	2023-11-16 08:27:59 +09:00
Yuya Nishihara	02c84a8596	index: remove stale "allow(unstable_name_collisions)" I think this is remainder of nightly shims.	2023-11-16 08:27:59 +09:00
Yuya Nishihara	6399c392fd	index: make heads_pos() deduplicate entries without building separate set This is much faster (maybe because of better cache locality?) Another option is to use BTreeSet, but the BinaryHeap version is slightly faster. "bench revset" result in my linux repo: revsets/heads(tags()) --------------------- baseline 3.28 560.6±4.01ms 1 2.92 500.0±2.99ms 2 1.98 339.6±1.64ms 3 (this) 1.00 171.2±0.30ms	2023-11-16 08:27:59 +09:00
Yuya Nishihara	9832ee205d	index: optimize heads_pos() to cache generation numbers during computation Apparently, IndexEntry::generation_number() isn't cheap probably because it involves random access to larger memory region, and the u32 value might not be aligned. Let's instead store the generation numbers in BinaryHeap. Also, heads_pos() becomes slightly faster by keeping the BinaryHeap entries small, so I've removed the IndexEntry at all. This makes the default log and disambiguation revsets fast, which evaluate 'heads(immutable_heads())'. "bench revset" result in my linux repo: revsets/heads(tags()) --------------------- baseline 3.28 560.6±4.01ms 1 2.92 500.0±2.99ms 2 (this) 1.98 339.6±1.64ms	2023-11-16 08:27:59 +09:00
Yuya Nishihara	1e933b84dd	index: make IndexEntry::parents() lazy instead of collecting to Vec All callers just iterate over the parent entries. "bench revset" result in my linux repo: revsets/heads(tags()) --------------------- baseline 3.28 560.6±4.01ms 1 (this) 2.92 500.0±2.99ms	2023-11-16 08:27:59 +09:00
Yuya Nishihara	39b065f7ab	git: on import_refs(), exclude uninteresting dirs such as refs/jj/keep For loose refs, uninteresting directories can be just skipped. For packed refs, gix will have to do binary search for each prefix to find the starting point. Still it's better overall if the repository contains tons of refs/jj/keep refs. With my linux repo containing ~5k loose jj refs, this saves ~40ms: % hyperfine --warmup 3 --runs 10 \ "/tmp/jj-gix --ignore-working-copy git import -R ~/mirrors/linux" \ "/tmp/jj-gix-iter --ignore-working-copy git import -R ~/mirrors/linux" Benchmark 1: /tmp/jj-gix --ignore-working-copy git import -R ~/mirrors/linux Time (mean ± σ): 151.6 ms ± 11.4 ms [User: 38.8 ms, System: 111.6 ms] Range (min … max): 129.8 ms … 159.5 ms 10 runs Benchmark 2: /tmp/jj-gix-iter --ignore-working-copy git import -R ~/mirrors/linux Time (mean ± σ): 109.9 ms ± 11.6 ms [User: 27.5 ms, System: 82.4 ms] Range (min … max): 89.4 ms … 117.8 ms 10 runs	2023-11-14 17:35:27 +09:00
Yuya Nishihara	044716ee40	git: migrate import_refs() to gix::Repository Gitoxide errors are boxed since there are various error types and they tend to exceed the clippy size limit. Apparently, gitoxide is faster than git2: % hyperfine --warmup 3 --runs 10 \ "/tmp/jj-baseline --ignore-working-copy git import -R ~/mirrors/linux" \ "/tmp/jj-gix --ignore-working-copy git import -R ~/mirrors/linux" Benchmark 1: /tmp/jj-baseline --ignore-working-copy git import -R ~/mirrors/linux Time (mean ± σ): 205.4 ms ± 15.7 ms [User: 59.6 ms, System: 144.6 ms] Range (min … max): 189.7 ms … 223.9 ms 10 runs Benchmark 2: /tmp/jj-gix --ignore-working-copy git import -R ~/mirrors/linux Time (mean ± σ): 176.2 ms ± 13.7 ms [User: 41.2 ms, System: 134.0 ms] Range (min … max): 155.4 ms … 186.5 ms 10 runs	2023-11-14 17:35:27 +09:00
Yuya Nishihara	6c98dfcdcb	git: have import_refs() obtain git2::Repository instance from store This helps gitoxide migration. It's theoretically possible to import Git refs from non-Git backend, but I don't think such API flexibility is needed.	2023-11-14 17:35:27 +09:00
Yuya Nishihara	dbb1adaf0a	git: move import-related types close to import_refs() function	2023-11-14 17:35:27 +09:00
Yuya Nishihara	f991705e47	tests: add test for importing missing ancestor of HEAD If a commit pointed to by HEAD or ref is missing, the ref is considered invalid and excluded by import_refs(). The current test behavior appears to depend on some in-memory cache of git2::Repository.	2023-11-14 17:35:27 +09:00
Yuya Nishihara	8e143541a5	operation: propagate OpStoreError from parents() We need to .collect_vec() the parents iterator to temporary buffer since the borrowed iterator can't be returned back to the dag_walk functions. Another option is to clone op_store and parent ids to remove &self lifetime from the iterator, but that also means a temporary Vec is created.	2023-11-14 07:16:39 +09:00
Yuya Nishihara	8ddad859e8	dag_walk: add fallible topo_order_reverse_lazy() Unlike dfs_ok(), this function short-circuits at an Err as we use non-lazy topo_order_forward() internally. I think that's good enough. If we implement GC on operation log, deleted parents will be excluded (or mapped to tombstone) by caller. An Err shouldn't mean it's GC-ed.	2023-11-14 07:16:39 +09:00
Yuya Nishihara	3d5a07e86a	dag_walk: add fallible dfs(), topo_order(), heads(), and closest_common_node() This unblocks the use of Result<T, E> in op.parents(). There are two ways to encode errors: a. impl IntoIterator<Item = Result<T, E>> b. Result<V, E> where V: FromIterator<Item = T> I think (a) is more natural to algorithms like dfs(), which can process error nodes transparently. Still the caller might have to collect the source iterator to temporary Vec to conform to the neighbors_fn signature. It's not easy for neighbors_fn to return an iterator borrowing the input node. We already have GAT, but doesn't have return-position impl Trait in trait yet.	2023-11-14 07:16:39 +09:00
Yuya Nishihara	e5a9a26911	dag_walk: remove unused and untested leaves() function	2023-11-14 07:16:39 +09:00
Anton Bulakh	e3a1e5b80e	sign: Implement storage for digital commit signatures Recognize signature metadata from git commit objects, implement a basic version of that for the native backend. Extract the signed data (a commit binary repr without the signature) to be verified later.	2023-11-12 03:37:13 +02:00
Yuya Nishihara	b42a69db6d	git_backend: configure committer (and author) of gix::Repository Otherwise, ref updates would fail if we port git::export_refs() to gitoxide. This change isn't strictly needed for the backend itself, but we'll reuse the gix::Repository instance created by the backend when importing and exporting Git refs.	2023-11-11 22:35:54 +09:00
Yuya Nishihara	ea32c0cb9e	git_backend: pass UserSettings to GitBackend constructors	2023-11-11 22:35:54 +09:00
Yuya Nishihara	8a2048a0e5	repo: pass UserSettings to store factories and initializers GitBackend will use it to configure gix::Repository. I think UserSettings is generally useful to pass store-specific parameters, so I've updated all factory functions.	2023-11-11 22:35:54 +09:00
Yuya Nishihara	6125fb160e	op_store: embed details in operation/view not found error This is basically a copy of BackendError::ObjectNotFound. The failed id may be either view or operation id.	2023-11-11 22:35:40 +09:00
Yuya Nishihara	ea96513fd1	op_store: deduplicate functions that map io::Error to OpStoreError io_to_read_error() also translates ErrorKind::NotFound.	2023-11-11 22:35:40 +09:00
Martin von Zweigbergk	120115a20d	cli: pass `MaterializedTreeValue` into `git_diff_part()` Just a little preparation for reading the materialized values concurrently.	2023-11-10 04:54:47 -08:00
Waleed Khan	a60733f632	tree: remove unsafe with `ouroboros` for self-referential iterators	2023-11-09 21:50:29 -08:00
Yuya Nishihara	6ff3a4f3df	repo: reimplement DirtyCell without using unsafe While the safe implementation is a bit more complex (and probably more branchy), I don't think the runtime overhead would matter here. Let's remove one more unsafe for better code maintainability.	2023-11-10 07:42:45 +09:00

1 2 3 4 5 ...

2291 commits