Commit graph

2312 commits

Author SHA1 Message Date
Yuya Nishihara
974a6870b3 repo_path: make RepoPath::components() return iterator
This allows us to change the backing type from Vec<String> to String.
2023-11-27 08:42:09 +09:00
Yuya Nishihara
aba8c640be repo_path: capture current Vec<String> ordering by tests
The added test would fail if paths were purely ordered by concatenated strings.
I'm not sure if we want to preserve the current ordering, but let's not break
it for the moment.
2023-11-27 08:42:09 +09:00
Ilya Grigoriev
3967f637dc cli rebase: do not allow -r --skip-empty to drop emptied descendants
This follows up on @matts1 's #2609.

We still allow the `-r` commit to become empty. I would be more comfortable if
there was a test for that, but I haven't done that (yet?) and it seems pretty
safe. If that's a problem, I'm happy to forbid `-r --skip-empty` entirely,
since it is far less useful than `-s --skip-empty` or `-b --skip-empty`.

I think it is undesired to abandon emptied descendants. As far as descendants
of `A` are concerned, `jj rebase -r A` should be equivalent to `jj abandon A`,
and `jj abandon` does not remove emptied commits. It also doesn't seem very
useful to do that, since I think descendant commits of an abandoned (or moved
with `-r`) commit only become empty in pathological cases.

Additionally, if we did want -r to empty descendants of `A`, we'd have to add
thorough tests and possibly improve the algorithm. I want to refactor `rebase
-r` and add features to it, and having to consider cases of commits becoming
abandoned makes everything harder.

For example, if we have

```
root -> A -> B -> C
```

and `jj rebase -r A -d C` empties commit `B` (or `C`), I do not know whether
the current algorithm will work correctly. It seems possible that it would, but
that depends on the fact that empty merge commits are not abandoned for
descendants. That seems dangerous to rely on without tests.

I hope (but can't promise) that in the near future, making DescendantRebaser
return more information  should help make it possible to create such
functionality in a more robust way. I am likely to attempt this as part of
implementing `-r --after`.
2023-11-26 10:56:58 -08:00
Yuya Nishihara
59ef3f0023 repo_path: split RepoPathComponent into owned and borrowed types
This is a step towards introducing a borrowed RepoPath type. The current
RepoPath type is inefficient as each component String is usually short. We
could apply short-string optimization, but still each inlined component would
consume 24 bytes just for e.g. "src", and increase the chance of random memory
access. If the owned RepoPath type is backed by String, we can implement cheap
cast from &str to borrowed &RepoPath type.
2023-11-26 18:21:40 +09:00
Yuya Nishihara
f2096da2d6 repo_path: add stub type to introduce borrowed RepoPathComponent type
The current RepoPathComponent will be renamed to RepoPathComponentBuf, and
new str wrapper will be added as RepoPathComponent.
2023-11-26 18:21:40 +09:00
Yuya Nishihara
e14b31a033 repo_path: reject leading slash and empty path components
Leading/trailing slashes would introduce a bit of complexity if we migrate
the backing type from Vec<String> to String. Empty components are okay, but
let's reject them as they are cryptic and invalid.
2023-11-26 18:21:40 +09:00
Yuya Nishihara
755af75c30 repo_path: in tests, use alias to RepoPath::from_internal_string()
It seemed too verbose to spell the full function name in tests.
2023-11-26 18:21:40 +09:00
Yuya Nishihara
b5b01f4dd7 cargo: add ref-cast dependency
It helps to implement transparent conversion from &str to &Wrapped(str). We
could instead wrap the reference as Wrapped<'a>(&'a str), but it has various
drawbacks. Notably we can't implement Borrow and Deref because these traits
require a reference in return position.

Since the unsafe bits are pretty small, we can instead implement cast functions
without using the ref-cast crate. However, I believe we'll trust ref-cast more
than hand-crafted unsafe code.

https://crates.io/crates/ref-cast
https://docs.rs/ref-cast/1.0.20/ref_cast/attr.ref_cast_custom.html
2023-11-26 18:21:40 +09:00
Yuya Nishihara
b7543f8a08 rewrite: fix check for newly-empty commit in optimized path
'old_base_tree_id == None' means the rebased tree is unchanged, so the commit
shouldn't be considered newly-empty.
2023-11-26 14:42:17 +09:00
Yuya Nishihara
2f93de9299 rewrite: flatten mapping from EmptyBehaviour to desired action
I think this is slightly easier to follow.
2023-11-26 14:42:17 +09:00
Ilya Grigoriev
c32847696d rewrite.rs: rename new_parents to parent_mapping
The function `new_parents` makes sense, but I found the mapping
being named `new_parents` confusing.
2023-11-25 21:36:35 -08:00
Yuya Nishihara
6344cd56b3 repo_path: remove RepoPathJoin trait, just implement join() on the type
I don't think we'll add join() that takes different types.
2023-11-26 07:14:47 +09:00
Yuya Nishihara
d7df2516c5 repo_path: remove RepoPathComponent::string(), use as_str() instead
There are only two callers, and one does further conversion to BString.
2023-11-26 07:14:47 +09:00
Martin von Zweigbergk
6d54afa60e revset: make evaluate_programmatic() optimize expression
It seems generally useful to optimize revset expressions in
`evaluate_programmatic()` so the caller doesn't have to remember to do
it. It should generally be cheap to do so even if it's often not
needed.
2023-11-24 21:13:58 -10:00
Martin von Zweigbergk
550164209c revset: add a RevsetExpression::evaluate_programmatic()
We often resolve a programmatic revset and then immediately evaluate
it. This patch adds a convenience method for those two steps.
2023-11-24 21:13:58 -10:00
Martin von Zweigbergk
f2602f78cf revset: make resolve_programmatic() not return a Result
I think it's always a programming error if `resolve_programmatic()`
returns a `Result`, so it shouldn't have to return a `Result`.
2023-11-24 21:13:58 -10:00
Martin von Zweigbergk
f27f52984e revset: rename resolve() to resolve_programmatic()
`RevsetExpression::resolve()` is meant for programmatically created
expressions. In particular, it may not contain symbols. Let's try to
clarify that by renaming the function and documenting it.
2023-11-24 21:13:58 -10:00
Yuya Nishihara
b37293fa68 tests: add upper bound to test_concurrent_read_write_commit() loop
Hopefully this will fix the unfinished Windows CI issue. A possible scenario
is that recent migration to gitoxide made this test flaky on Windows. For
example, gitoxide might have in-memory object cache that relies on file mtime,
and occasionally fails to detect new object on Windows.
2023-11-24 18:07:35 +09:00
Matt Stark
0a95e20ebe lib: Implement skipping of empty commits 2023-11-24 14:48:06 +11:00
Matt Stark
dc89566039 lib: Create struct RebaseOptions 2023-11-24 14:48:06 +11:00
Anton Bulakh
5c3c0e9f6e sign: Implement generic commit signing on the backend 2023-11-23 22:52:20 +02:00
Anton Bulakh
5ab00e197a backend: Inline gix::Repository::commit_as to prepare for signing
Additional bonus is that this allows us to avoid creating keep refs for the
intermediate commits in the data race preventing loop.
2023-11-23 22:52:20 +02:00
Yuya Nishihara
042d26049c working_copy: lazily construct file_states BTreeMap
While it got faster to build a large BTreeMap<RepoPath, _>, there's still
a measurable cost. Let's eliminate it if watchman is enabled and the working
copy is clean. Perhaps, we should introduce new serialization format that
supports instant loading and lookup, but this hack works for the moment.
I'm not sure if the new tree_state format should be flat (RepoPath, _) list,
or tree like the backend storage btw.

In my "linux" repo (watchman enabled):
    % hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1 \
      "target/release-with-debug/{bin} -R ~/mirrors/linux status"
    Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status
      Time (mean ± σ):     768.9 ms ±  14.2 ms    [User: 630.7 ms, System: 131.2 ms]
      Range (min … max):   742.3 ms … 783.1 ms    10 runs

    Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status
      Time (mean ± σ):     713.0 ms ±  16.8 ms    [User: 587.9 ms, System: 116.2 ms]
      Range (min … max):   681.5 ms … 731.1 ms    10 runs

    Relative speed comparison
            1.08 ±  0.03  target/release-with-debug/jj-0 -R ~/mirrors/linux status
            1.00          target/release-with-debug/jj-1 -R ~/mirrors/linux status
2023-11-23 18:48:14 +09:00
Yuya Nishihara
12cd657837 working_copy: extract file_states_to_proto() helper
Just minimizing the changes in the next commit. As we already have
file_states_from_proto(), it makes sense to extract the "to" function.
2023-11-23 18:48:14 +09:00
Yuya Nishihara
74c4ef32aa fsmonitor: exclude .git and .jj directories from changed files
This ensures that the root fsmonitor_matcher matches nothing if there are no
working-copy changes. The query result can be observed by "jj debug watchman
query-changed-files".

I don't have expertise on watchman query language, but using the watchman API
is probably better than .filter()-ing the result manually.
2023-11-23 18:48:14 +09:00
Yuya Nishihara
1ddcaa43b3 fsmonitor: don't apply prefix matching to paths obtained from watchman
If I understand it, watchman returns changed files and directories, and a
directory change doesn't mean we need to scan all files under the directory.
2023-11-23 10:06:00 +09:00
Yuya Nishihara
767e94f5af fsmonitor: drop unneeded mut from make_fsmonitor_matcher()
We only need &self.working_copy_path here.
2023-11-23 10:06:00 +09:00
Yuya Nishihara
c16c89bc27 fsmonitor: keep paths relative to the workspace root
Since the caller wants repo-relative paths, it doesn't make sense to convert
them back and forth.
2023-11-23 10:06:00 +09:00
Yuya Nishihara
a4f6e0de0b repo_path: extract helper that converts Path to RepoPath literally 2023-11-23 10:06:00 +09:00
Yuya Nishihara
31def4b131 cleanup: don't use debug format to print source errors 2023-11-23 10:05:37 +09:00
Yuya Nishihara
16620e0e4c merged_tree: drop legacy tree handling from ConflictsDirItem constructor
No callers pass in a legacy tree.
2023-11-21 07:45:30 +09:00
Yuya Nishihara
4ad3db2e84 merged_tree: extract value() function of non-legacy trees 2023-11-21 07:45:30 +09:00
Yuya Nishihara
ca3f549c9e merged_tree: remove redundant clone() from ConflictIterator construction 2023-11-21 07:45:30 +09:00
Martin von Zweigbergk
acc35a89a8 merged_tree: inline non-recursive entry iterator 2023-11-19 20:29:40 -10:00
Martin von Zweigbergk
426f6d0cdd merged_tree: inline non-recursive conflict iterator
The abstraction is no longer useful since we made the types not
self-referential.
2023-11-19 20:29:40 -10:00
Yuya Nishihara
5186066cf5 working_copy: simply collect() proto file states into BTreeMap
Suppose the input list is presorted, sorting a sorted vec would be cheaper
than .insert()-ing sorted items one by one.

In my "linux" repo (watchman eanbled):
 - jj-0: baseline
 - jj-1: previous (don't randomize by HashMap)
 - jj-2: this

    % hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1,jj-2 \
        "target/release-with-debug/{bin} -R ~/mirrors/linux status"
    Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status
      Time (mean ± σ):      1.034 s ±  0.020 s    [User: 0.881 s, System: 0.212 s]
      Range (min … max):    1.011 s …  1.068 s    10 runs

    Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status
      Time (mean ± σ):     849.3 ms ±  13.8 ms    [User: 710.7 ms, System: 199.3 ms]
      Range (min … max):   821.7 ms … 870.2 ms    10 runs

    Benchmark 3: target/release-with-debug/jj-2 -R ~/mirrors/linux status
      Time (mean ± σ):     786.2 ms ±  16.7 ms    [User: 650.7 ms, System: 204.1 ms]
      Range (min … max):   760.8 ms … 805.2 ms    10 runs

    Relative speed comparison
            1.32 ±  0.04  target/release-with-debug/jj-0 -R ~/mirrors/linux status
            1.08 ±  0.03  target/release-with-debug/jj-1 -R ~/mirrors/linux status
            1.00          target/release-with-debug/jj-2 -R ~/mirrors/linux status
2023-11-20 08:29:33 +09:00
Yuya Nishihara
ee6a1e2c0a working_copy: don't build intermediate HashMap from proto file states
According to the doc, this is compatible with the map syntax.
https://protobuf.dev/programming-guides/proto3/#maps

This change means that the serialized file states are sorted by RepoPath,
so BTreeMap<RepoPath, _> can be reconstructed with fewer cache misses.

In my "linux" repo (watchman enabled):
 - jj-0: baseline
 - jj-1: this

    % hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1,jj-2 \
      "target/release-with-debug/{bin} -R ~/mirrors/linux status"
    Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status
      Time (mean ± σ):      1.034 s ±  0.020 s    [User: 0.881 s, System: 0.212 s]
      Range (min … max):    1.011 s …  1.068 s    10 runs

    Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status
      Time (mean ± σ):     849.3 ms ±  13.8 ms    [User: 710.7 ms, System: 199.3 ms]
      Range (min … max):   821.7 ms … 870.2 ms    10 runs

    Relative speed comparison
            1.32 ±  0.04  target/release-with-debug/jj-0 -R ~/mirrors/linux status
            1.08 ±  0.03  target/release-with-debug/jj-1 -R ~/mirrors/linux status

Cache-misses got reduced:

    % perf stat -e task-clock,cycles,instructions,cache-references,cache-misses \
      -- ./target/release-with-debug/jj-0 -R ~/mirrors/linux --no-pager status

              1,091.68 msec task-clock                       #    1.032 CPUs utilized
         4,179,596,978      cycles                           #    3.829 GHz
         6,166,231,489      instructions                     #    1.48  insn per cycle
           134,032,047      cache-references                 #  122.776 M/sec
            29,322,707      cache-misses                     #   21.88% of all cache refs

           1.057474164 seconds time elapsed

           0.897042000 seconds user
           0.194819000 seconds sys

    % perf stat -e task-clock,cycles,instructions,cache-references,cache-misses \
      -- ./target/release-with-debug/jj-1 -R ~/mirrors/linux --no-pager status

                927.05 msec task-clock                       #    1.083 CPUs utilized
         3,451,299,198      cycles                           #    3.723 GHz
         6,222,418,272      instructions                     #    1.80  insn per cycle
            98,499,363      cache-references                 #  106.251 M/sec
            11,998,523      cache-misses                     #   12.18% of all cache refs

           0.855938336 seconds time elapsed

           0.720568000 seconds user
           0.207924000 seconds sys
2023-11-20 08:29:33 +09:00
Yuya Nishihara
56047cb7ec working_copy: don't pass all proto data to from_proto() functions
Just a code cleanup. This allows us to consume proto fields if needed.
I also removed redundant .clone() and .as_str().
2023-11-20 08:29:33 +09:00
Yuya Nishihara
d7e63837f4 index: use extend_wanted/unwanted() to initialize RevWalk 2023-11-17 21:38:56 +09:00
Yuya Nishihara
a9f3dd95e5 index: remove index field from RevWalkQueue, move it to caller 2023-11-17 21:38:56 +09:00
Yuya Nishihara
ff1bb3133e index: move index-dependent operations out of RevWalkQueue
RevWalkQueue no longer depend on the RevWalkIndex trait, and the index field
can be removed.
2023-11-17 21:38:56 +09:00
Yuya Nishihara
32a7b33e56 index: optimize RevWalk to store IndexPosition instead of IndexEntry
The idea is the same as the heads_pos() change in 9832ee205d. While
IndexEntry::position() should be cheap, saving 20 bytes per entry appears to
improve the performance in mid-size repos.

In my "linux" repo:

    revsets/all()
    -------------
    baseline  1.24     156.0±1.06ms
    this      1.00     126.0±0.51ms

I don't see significant difference in small-sized repos like "jj" or "git".

IndexEntryByPosition isn't removed since it's still used by the revset engine.
2023-11-17 21:38:56 +09:00
Martin von Zweigbergk
9be24db051 tree: make TreeEntriesDirItem not self-referential
This removes the last use of `ouroboros`. Since `TreeEntriesDirItem`
is only used in "legacy trees" (before tree-level conflicts), I didn't
bother to check the performance impact. I also didn't bother to check
the matcher before adding the entries to the list, instead leaving
that where it is in `Iterator::next()`.
2023-11-17 03:50:34 -08:00
Martin von Zweigbergk
c0295c5dbc merged_tree: make ConflictsDirItem not self-referential
This removes the last use of `ouroboros` in `merged_tree.rs`. The set
of conflicts to iterate is usually so small that I didn't bother
checking the performance impact.
2023-11-17 03:50:34 -08:00
Martin von Zweigbergk
e1a02c5c5b merged_tree: make TreeDiffDirItem not self-referential
This removes another dependency on `ouroboros`, for a small
performance hit:


```
❯ hyperfine --warmup 3 --runs 30 \
      '/tmp/jj-before --ignore-working-copy diff -s --from v5.0 --to v6.0' \
      '/tmp/jj-after --ignore-working-copy diff -s --from v5.0 --to v6.0'
Benchmark 1: /tmp/jj-before --ignore-working-copy diff -s --from v5.0 --to v6.0
  Time (mean ± σ):     689.7 ms ±  23.9 ms    [User: 400.0 ms, System: 289.8 ms]
  Range (min … max):   666.9 ms … 759.2 ms    30 runs

Benchmark 2: /tmp/jj-after --ignore-working-copy diff -s --from v5.0 --to v6.0
  Time (mean ± σ):     710.9 ms ±  19.2 ms    [User: 420.4 ms, System: 290.6 ms]
  Range (min … max):   688.5 ms … 752.0 ms    30 runs

Summary
  '/tmp/jj-before --ignore-working-copy diff -s --from v5.0 --to v6.0' ran
    1.03 ± 0.05 times faster than '/tmp/jj-after --ignore-working-copy diff -s --from v5.0 --to v6.0'
```
2023-11-17 03:50:34 -08:00
Martin von Zweigbergk
61d87fe296 merged_tree: make TreeEntriesIterator not self-referential
While importing the `ouroboros` crate and the `aliasable` crate it
depends on, the "unsafe Rust reviewer" expressed some concern that
they contain a lot of unsafe code that's hard to review. We can avoid
the unsafe code altogether by making `TreeEntriesIterator` not
self-refential. Instead, we can collect the matching entries in an
individual tree up front. It does have some performance cost:


```
❯ hyperfine --warmup 3 --runs 30 \
      '/tmp/jj-before --ignore-working-copy files -r v6.0' \
      '/tmp/jj-after --ignore-working-copy files -r v6.0'
Benchmark 1: /tmp/jj-before --ignore-working-copy files -r v6.0
  Time (mean ± σ):     461.4 ms ±  14.3 ms    [User: 232.1 ms, System: 229.4 ms]
  Range (min … max):   443.4 ms … 496.3 ms    30 runs

Benchmark 2: /tmp/jj-after --ignore-working-copy files -r v6.0
  Time (mean ± σ):     482.0 ms ±  14.3 ms    [User: 257.2 ms, System: 224.9 ms]
  Range (min … max):   461.8 ms … 513.3 ms    30 runs

Summary
  '/tmp/jj-before --ignore-working-copy files -r v6.0' ran
    1.04 ± 0.04 times faster than '/tmp/jj-after --ignore-working-copy files -r v6.0'
```

I think that's acceptable.
2023-11-17 03:50:34 -08:00
Yuya Nishihara
93fbcec2f7 index: use BinaryHeap instead of BTreeSet in common_ancestors_pos()
For the same reason as the heads_pos() change. We just want to omit duplicated
items.
2023-11-16 08:27:59 +09:00
Yuya Nishihara
d4059520a9 index: cache generation numbers during common_ancestors_pos() computation
I'm not sure if this is better, but common_ancestors_pos() would have a
similar property to heads_pos().
2023-11-16 08:27:59 +09:00
Yuya Nishihara
ea4bdd718d index: use "while let" in common_ancestors_pos() 2023-11-16 08:27:59 +09:00
Yuya Nishihara
02c84a8596 index: remove stale "allow(unstable_name_collisions)"
I think this is remainder of nightly shims.
2023-11-16 08:27:59 +09:00