ok/jj
1
0
Fork 0
forked from mirrors/jj
Commit graph

2642 commits

Author SHA1 Message Date
Yuya Nishihara
787fa1340b workspace: remove redundant cloning from init_external_git()
Apparently, I forgot to update it in 1db033504c "repo, workspace: remove
'static lifetime bound from initializer functions."
2023-12-05 14:23:59 -08:00
Yuya Nishihara
899c6375a0 git_backend: don't fully canonicalize .git symlink
Apparently, libgit2 doesn't deduce "core.bare" config from the directory name,
but gitoxide implements it correctly. So we shouldn't blindly canonicalize
the Git repository path. Fortunately, the saved git_target path isn't a fully-
canonicalized form (unless user explicitly sepcified "--git-repo ./.git"), so
we don't need a hack to remap git_target back to the symlink path.

is_colocated_git_workspace() is adjusted since the git_workdir is no longer
resolved from the fully-canonicalized repo path, at least in our code. Still we
have the ".git/.." fallback because test_init_git_colocated_symlink_gitlink()
would otherwise fail. I haven't figured out why, and the test might be actually
wrong compared to the git CLI behavior, but let's not change that for now.

Fixes #2668
2023-12-05 14:23:59 -08:00
Martin von Zweigbergk
1cc271441f gc: implement basic GC for Git backend
This adds an initial `jj util gc` command, which simply calls `git gc`
when using the Git backend. That should already be useful in
non-colocated repos because it's not obvious how to GC (repack) such
repos. In my own jj repo, it shrunk `.jj/repo/store/` from 2.4 GiB to
780 MiB, and `jj log --ignore-working-copy` was sped up from 157 ms to
86 ms.

I haven't added any tests because the functionality depends on having
`git` binary on the PATH, which we don't yet depend on anywhere
else. I think we'll still be able to test much of the future parts of
garbage collection without a `git` binary because the interesting
parts are about manipulating the Git repo before calling `git gc` on
it.
2023-12-03 07:40:12 -08:00
Yuya Nishihara
35f718f212 merged_tree: remove canceling terms prior to resolving file-level conflict
I think this is a variant of the problem fixed by 7fda80fc22 "tree: simplify
conflict before resolving at hunk level." We need to simplify() the conflict
before and after extracting file ids because the source conflict values may
contain trees to be cancelled out, and the file values may differ only in exec
bits. Since the legacy tree passes a simplified conflict in to this function,
I made the merged tree do the same.

Fixes #2654
2023-12-03 07:44:58 +09:00
Yuya Nishihara
4ffbf40c82 merged_tree: do not propagate conflicting empty tree value to parent
Otherwise an empty subtree would be added to the parent tree.

If the stored tree contained an empty subtree, simplify() wouldn't work
against new "absent" subtree representation. I don't know if there's a
such code path, but I believe it's very rare to encounter the problem.

#2654
2023-12-03 07:44:58 +09:00
Yuya Nishihara
1db033504c repo, workspace: remove 'static lifetime bound from initializer functions 2023-12-03 07:44:41 +09:00
Yuya Nishihara
d747879aee signing: pass SigningFn by reference
write_commit() doesn't need ownership of the signing function.
2023-12-01 22:55:04 +09:00
Anton Bulakh
eb1c0ab4a2 sign: Implement a test signing backend and add a few basic tests 2023-11-30 23:36:56 +02:00
Anton Bulakh
d7229a3f90 sign: Define signing backend API and integrate it
Finished everything except actual signing backend implementation(s) and
the UI.
2023-11-30 23:36:56 +02:00
Yuya Nishihara
076b49b610 merged_tree: use merged_tree_entry_diff() in stream version 2023-12-01 00:05:06 +09:00
Yuya Nishihara
97a260b1bf merged_tree: reimplement TreeEntryDiffIterator by using iterator adapter
We don't need a named type anymore.
2023-12-01 00:05:06 +09:00
Yuya Nishihara
fd1c03d037 merged_tree: use sync get_tree() in TreeDiffIterator
This basically backs out the change 1b9a3e27e0 "merged_tree: read before/after
trees concurrently." As we decided to add a separate impl for async access, it
doesn't make sense to read before/after pair in parallel.

The async single_tree() is moved to TreeDiffStreamImpl. It will help remove
the sync version when the performance problem is solved.
2023-12-01 00:05:06 +09:00
Yuya Nishihara
601be0d480 working_copy: narrow file_states recursively while visiting directories
This saves another ~10ms.

Without watchman:
```
% hyperfine --sort command --warmup 3 --runs 20 -L bin jj-1,jj-2 \
"target/release-with-debug/{bin} -R ~/mirrors/linux files ~/mirrors/linux/no-match"
Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux files ~/mirrors/linux/no-match
  Time (mean ± σ):     327.7 ms ±  24.9 ms    [User: 1059.1 ms, System: 654.3 ms]
  Range (min … max):   296.0 ms … 385.4 ms    20 runs

Benchmark 3: target/release-with-debug/jj-2 -R ~/mirrors/linux files ~/mirrors/linux/no-match
  Time (mean ± σ):     311.0 ms ±  24.8 ms    [User: 960.0 ms, System: 643.1 ms]
  Range (min … max):   274.9 ms … 358.5 ms    20 runs
```
2023-11-30 12:09:31 +09:00
Yuya Nishihara
a935a4f70c working_copy: use proto file states without rebuilding BTreeMap
In snapshot(), changed_file_states are received in arbitrary order. For the
other callers, entries are in diff_stream order, so we don't have to sort
them.

With watchman enabled, we can see the cost of sorting the sorted proto entries.
I don't think this is significant, but we can mitigate it by adding
is_file_states_sorted flag to the proto message if needed:
```
% hyperfine --sort command --warmup 3 --runs 20 -L bin jj-0,jj-1 \
"target/release-with-debug/{bin} -R ~/mirrors/linux files ~/mirrors/linux/no-match"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux files ~/mirrors/linux/no-match
  Time (mean ± σ):     164.8 ms ±  16.6 ms    [User: 50.2 ms, System: 111.7 ms]
  Range (min … max):   148.1 ms … 195.0 ms    20 runs

Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux files ~/mirrors/linux/no-match
  Time (mean ± σ):     171.8 ms ±  13.6 ms    [User: 61.7 ms, System: 109.0 ms]
  Range (min … max):   159.5 ms … 192.1 ms    20 runs
```

Without watchman:
```
% hyperfine --sort command --warmup 3 --runs 20 -L bin jj-0,jj-1 \
"target/release-with-debug/{bin} -R ~/mirrors/linux files ~/mirrors/linux/no-match"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux files ~/mirrors/linux/no-match
  Time (mean ± σ):     367.3 ms ±  30.3 ms    [User: 1415.2 ms, System: 633.8 ms]
  Range (min … max):   325.4 ms … 421.7 ms    20 runs

Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux files ~/mirrors/linux/no-match
  Time (mean ± σ):     327.7 ms ±  24.9 ms    [User: 1059.1 ms, System: 654.3 ms]
  Range (min … max):   296.0 ms … 385.4 ms    20 runs
```

I haven't measured snapshotting against dirty working copy, but I don't think
it would be slower than the original implementation.
2023-11-30 12:09:31 +09:00
Yuya Nishihara
fca3690dda working_copy: add file states wrapper that provides map-like API
I'll replace the current lazy loading mechanism with this. Read-only methods
are implemented on the borrowed type so that we can narrow lookup scope
recursively.
2023-11-30 12:09:31 +09:00
Yuya Nishihara
9292af5e52 working_copy: update file states in bulk
This helps migrate BTreeMap<RepoPath, _> to sorted Vec.
2023-11-30 12:09:31 +09:00
Yuya Nishihara
c9150d02fc working_copy: don't look up file state twice while visiting directories 2023-11-30 12:09:31 +09:00
Yuya Nishihara
6ce7bd5338 repo_path: replace .contains() with .starts_with(), flipping the arguments
self.contains(other) means that the self tree contains the other tree (i.e.
the self path is prefix of the other), but it could be confused the other way
around if we were thinking about the path literal, not the tree. Let's add
.starts_with() instead by copying the std::path::Path definition.
2023-11-29 08:41:23 +09:00
Yuya Nishihara
266690a46b repo_path: make strip_prefix() public function returning &RepoPath
There are no external callers, but I think it's useful.
2023-11-29 08:41:23 +09:00
Yuya Nishihara
73690ed54e matchers: clean up .walk_to(dir) to yield &RepoPath instead of iterator 2023-11-29 08:41:23 +09:00
Yuya Nishihara
bc9725c73c working_copy: use RepoPath::parent() which no longer allocates temporary object 2023-11-29 08:41:23 +09:00
Yuya Nishihara
016fc2b5cc repo_path: change .split() and .parent() to return &RepoPath 2023-11-29 08:41:23 +09:00
Yuya Nishihara
28ab9593c3 repo_path: split RepoPath into owned and borrowed types
This enables cheap str-to-RepoPath cast, which is useful when sorting and
filtering a large Vec<(String, _)> list by using matcher for example. It
will also eliminate temporary allocation by repo_path.parent().
2023-11-28 07:33:28 +09:00
Yuya Nishihara
0a1bc2ba42 repo_path: add stub RepoPathBuf type, update callers
Most RepoPath::from_internal_string() callers will be migrated to the function
that returns &RepoPath, and cloning &RepoPath won't work.
2023-11-28 07:33:28 +09:00
Yuya Nishihara
f5938985f0 repo_path: make RepoPath::from_internal_string() accept owned string
I'm going to add borrowed RepoPath type, and most from_internal_string()
callers will be migrated to it. For the remaining callers, it makes more
sense to move the ownership of String to RepoPathBuf.
2023-11-28 07:33:28 +09:00
Yuya Nishihara
d322df0c8d matchers: make Files/PrefixMatcher constructors accept slice of borrowed paths
RepoPath will become slice type (like str), and it doesn't make sense to
require &[RepoPathBuf] here.
2023-11-28 07:33:28 +09:00
Yuya Nishihara
a23bb5b958 matchers: in tests, use alias to RepoPath::from_internal_string()
It looked verbose to fully spell the function name.
2023-11-28 07:33:28 +09:00
Ilya Grigoriev
6aef4bb52e cli rebase: do not allow -r --skip-empty
This follows up on 3967f63 (see that commit's description for more
motivation) and e79c8b6.

In a discussion linked below, it was decided that forbidding `-r --skip-empty`
entirely is preferable to the mixed behavior introduced in 3967f63.

3967f637dc (commitcomment-133539911)
2023-11-27 10:16:36 -08:00
Yuya Nishihara
55f75278bc repo_path: make to_internal_file_string() return &str, rename accordingly 2023-11-27 08:42:09 +09:00
Yuya Nishihara
12d7f8be16 repo_path: turn RepoPath into String wrapper
RepoPath::from_components() is removed since it is no longer a primitive
function.

The components iterator could be implemented on top of str::split(), but
it's not as we'll probably want to add components.as_path() -> &RepoPath.

Tree walking and tree_states map construction get slightly faster thanks to
fewer allocations and/or better cache locality. If we add a borrowed RepoPath
type, we can also implement a cheap &str to &RepoPath conversion on top. Then,
we can get rid of BTreeMap<RepoPath, FileState> construction at all.

Snapshot without watchman:
```
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1 \
"target/release-with-debug/{bin} -R ~/mirrors/linux status"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status
  Time (mean ± σ):     950.1 ms ±  24.9 ms    [User: 1642.4 ms, System: 681.1 ms]
  Range (min … max):   913.8 ms … 990.9 ms    10 runs

Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status
  Time (mean ± σ):     872.1 ms ±  14.5 ms    [User: 1922.3 ms, System: 625.8 ms]
  Range (min … max):   853.2 ms … 895.9 ms    10 runs

Relative speed comparison
        1.09 ±  0.03  target/release-with-debug/jj-0 -R ~/mirrors/linux status
        1.00          target/release-with-debug/jj-1 -R ~/mirrors/linux status
```

Tree walk:
```
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1 \
"target/release-with-debug/{bin} -R ~/mirrors/linux files --ignore-working-copy"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux files --ignore-working-copy
  Time (mean ± σ):     375.3 ms ±  15.4 ms    [User: 223.3 ms, System: 151.8 ms]
  Range (min … max):   359.4 ms … 394.1 ms    10 runs

Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux files --ignore-working-copy
  Time (mean ± σ):     357.1 ms ±  16.2 ms    [User: 214.7 ms, System: 142.6 ms]
  Range (min … max):   341.6 ms … 378.9 ms    10 runs

Relative speed comparison
        1.05 ±  0.06  target/release-with-debug/jj-0 -R ~/mirrors/linux files --ignore-working-copy
        1.00          target/release-with-debug/jj-1 -R ~/mirrors/linux files --ignore-working-copy
```
2023-11-27 08:42:09 +09:00
Yuya Nishihara
974a6870b3 repo_path: make RepoPath::components() return iterator
This allows us to change the backing type from Vec<String> to String.
2023-11-27 08:42:09 +09:00
Yuya Nishihara
aba8c640be repo_path: capture current Vec<String> ordering by tests
The added test would fail if paths were purely ordered by concatenated strings.
I'm not sure if we want to preserve the current ordering, but let's not break
it for the moment.
2023-11-27 08:42:09 +09:00
Ilya Grigoriev
3967f637dc cli rebase: do not allow -r --skip-empty to drop emptied descendants
This follows up on @matts1 's #2609.

We still allow the `-r` commit to become empty. I would be more comfortable if
there was a test for that, but I haven't done that (yet?) and it seems pretty
safe. If that's a problem, I'm happy to forbid `-r --skip-empty` entirely,
since it is far less useful than `-s --skip-empty` or `-b --skip-empty`.

I think it is undesired to abandon emptied descendants. As far as descendants
of `A` are concerned, `jj rebase -r A` should be equivalent to `jj abandon A`,
and `jj abandon` does not remove emptied commits. It also doesn't seem very
useful to do that, since I think descendant commits of an abandoned (or moved
with `-r`) commit only become empty in pathological cases.

Additionally, if we did want -r to empty descendants of `A`, we'd have to add
thorough tests and possibly improve the algorithm. I want to refactor `rebase
-r` and add features to it, and having to consider cases of commits becoming
abandoned makes everything harder.

For example, if we have

```
root -> A -> B -> C
```

and `jj rebase -r A -d C` empties commit `B` (or `C`), I do not know whether
the current algorithm will work correctly. It seems possible that it would, but
that depends on the fact that empty merge commits are not abandoned for
descendants. That seems dangerous to rely on without tests.

I hope (but can't promise) that in the near future, making DescendantRebaser
return more information  should help make it possible to create such
functionality in a more robust way. I am likely to attempt this as part of
implementing `-r --after`.
2023-11-26 10:56:58 -08:00
Yuya Nishihara
59ef3f0023 repo_path: split RepoPathComponent into owned and borrowed types
This is a step towards introducing a borrowed RepoPath type. The current
RepoPath type is inefficient as each component String is usually short. We
could apply short-string optimization, but still each inlined component would
consume 24 bytes just for e.g. "src", and increase the chance of random memory
access. If the owned RepoPath type is backed by String, we can implement cheap
cast from &str to borrowed &RepoPath type.
2023-11-26 18:21:40 +09:00
Yuya Nishihara
f2096da2d6 repo_path: add stub type to introduce borrowed RepoPathComponent type
The current RepoPathComponent will be renamed to RepoPathComponentBuf, and
new str wrapper will be added as RepoPathComponent.
2023-11-26 18:21:40 +09:00
Yuya Nishihara
e14b31a033 repo_path: reject leading slash and empty path components
Leading/trailing slashes would introduce a bit of complexity if we migrate
the backing type from Vec<String> to String. Empty components are okay, but
let's reject them as they are cryptic and invalid.
2023-11-26 18:21:40 +09:00
Yuya Nishihara
755af75c30 repo_path: in tests, use alias to RepoPath::from_internal_string()
It seemed too verbose to spell the full function name in tests.
2023-11-26 18:21:40 +09:00
Yuya Nishihara
b5b01f4dd7 cargo: add ref-cast dependency
It helps to implement transparent conversion from &str to &Wrapped(str). We
could instead wrap the reference as Wrapped<'a>(&'a str), but it has various
drawbacks. Notably we can't implement Borrow and Deref because these traits
require a reference in return position.

Since the unsafe bits are pretty small, we can instead implement cast functions
without using the ref-cast crate. However, I believe we'll trust ref-cast more
than hand-crafted unsafe code.

https://crates.io/crates/ref-cast
https://docs.rs/ref-cast/1.0.20/ref_cast/attr.ref_cast_custom.html
2023-11-26 18:21:40 +09:00
Yuya Nishihara
b7543f8a08 rewrite: fix check for newly-empty commit in optimized path
'old_base_tree_id == None' means the rebased tree is unchanged, so the commit
shouldn't be considered newly-empty.
2023-11-26 14:42:17 +09:00
Yuya Nishihara
2f93de9299 rewrite: flatten mapping from EmptyBehaviour to desired action
I think this is slightly easier to follow.
2023-11-26 14:42:17 +09:00
Ilya Grigoriev
c32847696d rewrite.rs: rename new_parents to parent_mapping
The function `new_parents` makes sense, but I found the mapping
being named `new_parents` confusing.
2023-11-25 21:36:35 -08:00
Yuya Nishihara
6344cd56b3 repo_path: remove RepoPathJoin trait, just implement join() on the type
I don't think we'll add join() that takes different types.
2023-11-26 07:14:47 +09:00
Yuya Nishihara
d7df2516c5 repo_path: remove RepoPathComponent::string(), use as_str() instead
There are only two callers, and one does further conversion to BString.
2023-11-26 07:14:47 +09:00
Martin von Zweigbergk
6d54afa60e revset: make evaluate_programmatic() optimize expression
It seems generally useful to optimize revset expressions in
`evaluate_programmatic()` so the caller doesn't have to remember to do
it. It should generally be cheap to do so even if it's often not
needed.
2023-11-24 21:13:58 -10:00
Martin von Zweigbergk
550164209c revset: add a RevsetExpression::evaluate_programmatic()
We often resolve a programmatic revset and then immediately evaluate
it. This patch adds a convenience method for those two steps.
2023-11-24 21:13:58 -10:00
Martin von Zweigbergk
f2602f78cf revset: make resolve_programmatic() not return a Result
I think it's always a programming error if `resolve_programmatic()`
returns a `Result`, so it shouldn't have to return a `Result`.
2023-11-24 21:13:58 -10:00
Martin von Zweigbergk
f27f52984e revset: rename resolve() to resolve_programmatic()
`RevsetExpression::resolve()` is meant for programmatically created
expressions. In particular, it may not contain symbols. Let's try to
clarify that by renaming the function and documenting it.
2023-11-24 21:13:58 -10:00
Yuya Nishihara
b37293fa68 tests: add upper bound to test_concurrent_read_write_commit() loop
Hopefully this will fix the unfinished Windows CI issue. A possible scenario
is that recent migration to gitoxide made this test flaky on Windows. For
example, gitoxide might have in-memory object cache that relies on file mtime,
and occasionally fails to detect new object on Windows.
2023-11-24 18:07:35 +09:00
Matt Stark
0a95e20ebe lib: Implement skipping of empty commits 2023-11-24 14:48:06 +11:00
Matt Stark
dc89566039 lib: Create struct RebaseOptions 2023-11-24 14:48:06 +11:00
Anton Bulakh
5c3c0e9f6e sign: Implement generic commit signing on the backend 2023-11-23 22:52:20 +02:00
Anton Bulakh
5ab00e197a backend: Inline gix::Repository::commit_as to prepare for signing
Additional bonus is that this allows us to avoid creating keep refs for the
intermediate commits in the data race preventing loop.
2023-11-23 22:52:20 +02:00
Yuya Nishihara
042d26049c working_copy: lazily construct file_states BTreeMap
While it got faster to build a large BTreeMap<RepoPath, _>, there's still
a measurable cost. Let's eliminate it if watchman is enabled and the working
copy is clean. Perhaps, we should introduce new serialization format that
supports instant loading and lookup, but this hack works for the moment.
I'm not sure if the new tree_state format should be flat (RepoPath, _) list,
or tree like the backend storage btw.

In my "linux" repo (watchman enabled):
    % hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1 \
      "target/release-with-debug/{bin} -R ~/mirrors/linux status"
    Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status
      Time (mean ± σ):     768.9 ms ±  14.2 ms    [User: 630.7 ms, System: 131.2 ms]
      Range (min … max):   742.3 ms … 783.1 ms    10 runs

    Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status
      Time (mean ± σ):     713.0 ms ±  16.8 ms    [User: 587.9 ms, System: 116.2 ms]
      Range (min … max):   681.5 ms … 731.1 ms    10 runs

    Relative speed comparison
            1.08 ±  0.03  target/release-with-debug/jj-0 -R ~/mirrors/linux status
            1.00          target/release-with-debug/jj-1 -R ~/mirrors/linux status
2023-11-23 18:48:14 +09:00
Yuya Nishihara
12cd657837 working_copy: extract file_states_to_proto() helper
Just minimizing the changes in the next commit. As we already have
file_states_from_proto(), it makes sense to extract the "to" function.
2023-11-23 18:48:14 +09:00
Yuya Nishihara
74c4ef32aa fsmonitor: exclude .git and .jj directories from changed files
This ensures that the root fsmonitor_matcher matches nothing if there are no
working-copy changes. The query result can be observed by "jj debug watchman
query-changed-files".

I don't have expertise on watchman query language, but using the watchman API
is probably better than .filter()-ing the result manually.
2023-11-23 18:48:14 +09:00
Yuya Nishihara
1ddcaa43b3 fsmonitor: don't apply prefix matching to paths obtained from watchman
If I understand it, watchman returns changed files and directories, and a
directory change doesn't mean we need to scan all files under the directory.
2023-11-23 10:06:00 +09:00
Yuya Nishihara
767e94f5af fsmonitor: drop unneeded mut from make_fsmonitor_matcher()
We only need &self.working_copy_path here.
2023-11-23 10:06:00 +09:00
Yuya Nishihara
c16c89bc27 fsmonitor: keep paths relative to the workspace root
Since the caller wants repo-relative paths, it doesn't make sense to convert
them back and forth.
2023-11-23 10:06:00 +09:00
Yuya Nishihara
a4f6e0de0b repo_path: extract helper that converts Path to RepoPath literally 2023-11-23 10:06:00 +09:00
Yuya Nishihara
31def4b131 cleanup: don't use debug format to print source errors 2023-11-23 10:05:37 +09:00
Yuya Nishihara
16620e0e4c merged_tree: drop legacy tree handling from ConflictsDirItem constructor
No callers pass in a legacy tree.
2023-11-21 07:45:30 +09:00
Yuya Nishihara
4ad3db2e84 merged_tree: extract value() function of non-legacy trees 2023-11-21 07:45:30 +09:00
Yuya Nishihara
ca3f549c9e merged_tree: remove redundant clone() from ConflictIterator construction 2023-11-21 07:45:30 +09:00
Martin von Zweigbergk
acc35a89a8 merged_tree: inline non-recursive entry iterator 2023-11-19 20:29:40 -10:00
Martin von Zweigbergk
426f6d0cdd merged_tree: inline non-recursive conflict iterator
The abstraction is no longer useful since we made the types not
self-referential.
2023-11-19 20:29:40 -10:00
Yuya Nishihara
5186066cf5 working_copy: simply collect() proto file states into BTreeMap
Suppose the input list is presorted, sorting a sorted vec would be cheaper
than .insert()-ing sorted items one by one.

In my "linux" repo (watchman eanbled):
 - jj-0: baseline
 - jj-1: previous (don't randomize by HashMap)
 - jj-2: this

    % hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1,jj-2 \
        "target/release-with-debug/{bin} -R ~/mirrors/linux status"
    Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status
      Time (mean ± σ):      1.034 s ±  0.020 s    [User: 0.881 s, System: 0.212 s]
      Range (min … max):    1.011 s …  1.068 s    10 runs

    Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status
      Time (mean ± σ):     849.3 ms ±  13.8 ms    [User: 710.7 ms, System: 199.3 ms]
      Range (min … max):   821.7 ms … 870.2 ms    10 runs

    Benchmark 3: target/release-with-debug/jj-2 -R ~/mirrors/linux status
      Time (mean ± σ):     786.2 ms ±  16.7 ms    [User: 650.7 ms, System: 204.1 ms]
      Range (min … max):   760.8 ms … 805.2 ms    10 runs

    Relative speed comparison
            1.32 ±  0.04  target/release-with-debug/jj-0 -R ~/mirrors/linux status
            1.08 ±  0.03  target/release-with-debug/jj-1 -R ~/mirrors/linux status
            1.00          target/release-with-debug/jj-2 -R ~/mirrors/linux status
2023-11-20 08:29:33 +09:00
Yuya Nishihara
ee6a1e2c0a working_copy: don't build intermediate HashMap from proto file states
According to the doc, this is compatible with the map syntax.
https://protobuf.dev/programming-guides/proto3/#maps

This change means that the serialized file states are sorted by RepoPath,
so BTreeMap<RepoPath, _> can be reconstructed with fewer cache misses.

In my "linux" repo (watchman enabled):
 - jj-0: baseline
 - jj-1: this

    % hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1,jj-2 \
      "target/release-with-debug/{bin} -R ~/mirrors/linux status"
    Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status
      Time (mean ± σ):      1.034 s ±  0.020 s    [User: 0.881 s, System: 0.212 s]
      Range (min … max):    1.011 s …  1.068 s    10 runs

    Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status
      Time (mean ± σ):     849.3 ms ±  13.8 ms    [User: 710.7 ms, System: 199.3 ms]
      Range (min … max):   821.7 ms … 870.2 ms    10 runs

    Relative speed comparison
            1.32 ±  0.04  target/release-with-debug/jj-0 -R ~/mirrors/linux status
            1.08 ±  0.03  target/release-with-debug/jj-1 -R ~/mirrors/linux status

Cache-misses got reduced:

    % perf stat -e task-clock,cycles,instructions,cache-references,cache-misses \
      -- ./target/release-with-debug/jj-0 -R ~/mirrors/linux --no-pager status

              1,091.68 msec task-clock                       #    1.032 CPUs utilized
         4,179,596,978      cycles                           #    3.829 GHz
         6,166,231,489      instructions                     #    1.48  insn per cycle
           134,032,047      cache-references                 #  122.776 M/sec
            29,322,707      cache-misses                     #   21.88% of all cache refs

           1.057474164 seconds time elapsed

           0.897042000 seconds user
           0.194819000 seconds sys

    % perf stat -e task-clock,cycles,instructions,cache-references,cache-misses \
      -- ./target/release-with-debug/jj-1 -R ~/mirrors/linux --no-pager status

                927.05 msec task-clock                       #    1.083 CPUs utilized
         3,451,299,198      cycles                           #    3.723 GHz
         6,222,418,272      instructions                     #    1.80  insn per cycle
            98,499,363      cache-references                 #  106.251 M/sec
            11,998,523      cache-misses                     #   12.18% of all cache refs

           0.855938336 seconds time elapsed

           0.720568000 seconds user
           0.207924000 seconds sys
2023-11-20 08:29:33 +09:00
Yuya Nishihara
56047cb7ec working_copy: don't pass all proto data to from_proto() functions
Just a code cleanup. This allows us to consume proto fields if needed.
I also removed redundant .clone() and .as_str().
2023-11-20 08:29:33 +09:00
Yuya Nishihara
d7e63837f4 index: use extend_wanted/unwanted() to initialize RevWalk 2023-11-17 21:38:56 +09:00
Yuya Nishihara
a9f3dd95e5 index: remove index field from RevWalkQueue, move it to caller 2023-11-17 21:38:56 +09:00
Yuya Nishihara
ff1bb3133e index: move index-dependent operations out of RevWalkQueue
RevWalkQueue no longer depend on the RevWalkIndex trait, and the index field
can be removed.
2023-11-17 21:38:56 +09:00
Yuya Nishihara
32a7b33e56 index: optimize RevWalk to store IndexPosition instead of IndexEntry
The idea is the same as the heads_pos() change in 9832ee205d. While
IndexEntry::position() should be cheap, saving 20 bytes per entry appears to
improve the performance in mid-size repos.

In my "linux" repo:

    revsets/all()
    -------------
    baseline  1.24     156.0±1.06ms
    this      1.00     126.0±0.51ms

I don't see significant difference in small-sized repos like "jj" or "git".

IndexEntryByPosition isn't removed since it's still used by the revset engine.
2023-11-17 21:38:56 +09:00
Martin von Zweigbergk
9be24db051 tree: make TreeEntriesDirItem not self-referential
This removes the last use of `ouroboros`. Since `TreeEntriesDirItem`
is only used in "legacy trees" (before tree-level conflicts), I didn't
bother to check the performance impact. I also didn't bother to check
the matcher before adding the entries to the list, instead leaving
that where it is in `Iterator::next()`.
2023-11-17 03:50:34 -08:00
Martin von Zweigbergk
c0295c5dbc merged_tree: make ConflictsDirItem not self-referential
This removes the last use of `ouroboros` in `merged_tree.rs`. The set
of conflicts to iterate is usually so small that I didn't bother
checking the performance impact.
2023-11-17 03:50:34 -08:00
Martin von Zweigbergk
e1a02c5c5b merged_tree: make TreeDiffDirItem not self-referential
This removes another dependency on `ouroboros`, for a small
performance hit:


```
❯ hyperfine --warmup 3 --runs 30 \
      '/tmp/jj-before --ignore-working-copy diff -s --from v5.0 --to v6.0' \
      '/tmp/jj-after --ignore-working-copy diff -s --from v5.0 --to v6.0'
Benchmark 1: /tmp/jj-before --ignore-working-copy diff -s --from v5.0 --to v6.0
  Time (mean ± σ):     689.7 ms ±  23.9 ms    [User: 400.0 ms, System: 289.8 ms]
  Range (min … max):   666.9 ms … 759.2 ms    30 runs

Benchmark 2: /tmp/jj-after --ignore-working-copy diff -s --from v5.0 --to v6.0
  Time (mean ± σ):     710.9 ms ±  19.2 ms    [User: 420.4 ms, System: 290.6 ms]
  Range (min … max):   688.5 ms … 752.0 ms    30 runs

Summary
  '/tmp/jj-before --ignore-working-copy diff -s --from v5.0 --to v6.0' ran
    1.03 ± 0.05 times faster than '/tmp/jj-after --ignore-working-copy diff -s --from v5.0 --to v6.0'
```
2023-11-17 03:50:34 -08:00
Martin von Zweigbergk
61d87fe296 merged_tree: make TreeEntriesIterator not self-referential
While importing the `ouroboros` crate and the `aliasable` crate it
depends on, the "unsafe Rust reviewer" expressed some concern that
they contain a lot of unsafe code that's hard to review. We can avoid
the unsafe code altogether by making `TreeEntriesIterator` not
self-refential. Instead, we can collect the matching entries in an
individual tree up front. It does have some performance cost:


```
❯ hyperfine --warmup 3 --runs 30 \
      '/tmp/jj-before --ignore-working-copy files -r v6.0' \
      '/tmp/jj-after --ignore-working-copy files -r v6.0'
Benchmark 1: /tmp/jj-before --ignore-working-copy files -r v6.0
  Time (mean ± σ):     461.4 ms ±  14.3 ms    [User: 232.1 ms, System: 229.4 ms]
  Range (min … max):   443.4 ms … 496.3 ms    30 runs

Benchmark 2: /tmp/jj-after --ignore-working-copy files -r v6.0
  Time (mean ± σ):     482.0 ms ±  14.3 ms    [User: 257.2 ms, System: 224.9 ms]
  Range (min … max):   461.8 ms … 513.3 ms    30 runs

Summary
  '/tmp/jj-before --ignore-working-copy files -r v6.0' ran
    1.04 ± 0.04 times faster than '/tmp/jj-after --ignore-working-copy files -r v6.0'
```

I think that's acceptable.
2023-11-17 03:50:34 -08:00
Yuya Nishihara
93fbcec2f7 index: use BinaryHeap instead of BTreeSet in common_ancestors_pos()
For the same reason as the heads_pos() change. We just want to omit duplicated
items.
2023-11-16 08:27:59 +09:00
Yuya Nishihara
d4059520a9 index: cache generation numbers during common_ancestors_pos() computation
I'm not sure if this is better, but common_ancestors_pos() would have a
similar property to heads_pos().
2023-11-16 08:27:59 +09:00
Yuya Nishihara
ea4bdd718d index: use "while let" in common_ancestors_pos() 2023-11-16 08:27:59 +09:00
Yuya Nishihara
02c84a8596 index: remove stale "allow(unstable_name_collisions)"
I think this is remainder of nightly shims.
2023-11-16 08:27:59 +09:00
Yuya Nishihara
6399c392fd index: make heads_pos() deduplicate entries without building separate set
This is much faster (maybe because of better cache locality?) Another option
is to use BTreeSet, but the BinaryHeap version is slightly faster.

"bench revset" result in my linux repo:

    revsets/heads(tags())
    ---------------------
    baseline  3.28     560.6±4.01ms
    1         2.92     500.0±2.99ms
    2         1.98     339.6±1.64ms
    3 (this)  1.00     171.2±0.30ms
2023-11-16 08:27:59 +09:00
Yuya Nishihara
9832ee205d index: optimize heads_pos() to cache generation numbers during computation
Apparently, IndexEntry::generation_number() isn't cheap probably because it
involves random access to larger memory region, and the u32 value might not
be aligned. Let's instead store the generation numbers in BinaryHeap.

Also, heads_pos() becomes slightly faster by keeping the BinaryHeap entries
small, so I've removed the IndexEntry at all.

This makes the default log and disambiguation revsets fast, which evaluate
'heads(immutable_heads())'.

"bench revset" result in my linux repo:

    revsets/heads(tags())
    ---------------------
    baseline  3.28     560.6±4.01ms
    1         2.92     500.0±2.99ms
    2 (this)  1.98     339.6±1.64ms
2023-11-16 08:27:59 +09:00
Yuya Nishihara
1e933b84dd index: make IndexEntry::parents() lazy instead of collecting to Vec
All callers just iterate over the parent entries.

"bench revset" result in my linux repo:

    revsets/heads(tags())
    ---------------------
    baseline  3.28     560.6±4.01ms
    1 (this)  2.92     500.0±2.99ms
2023-11-16 08:27:59 +09:00
Yuya Nishihara
39b065f7ab git: on import_refs(), exclude uninteresting dirs such as refs/jj/keep
For loose refs, uninteresting directories can be just skipped. For packed refs,
gix will have to do binary search for each prefix to find the starting point.
Still it's better overall if the repository contains tons of refs/jj/keep refs.

With my linux repo containing ~5k loose jj refs, this saves ~40ms:

    % hyperfine --warmup 3 --runs 10 \
        "/tmp/jj-gix --ignore-working-copy git import -R ~/mirrors/linux" \
        "/tmp/jj-gix-iter --ignore-working-copy git import -R ~/mirrors/linux"
    Benchmark 1: /tmp/jj-gix --ignore-working-copy git import -R ~/mirrors/linux
      Time (mean ± σ):     151.6 ms ±  11.4 ms    [User: 38.8 ms, System: 111.6 ms]
      Range (min … max):   129.8 ms … 159.5 ms    10 runs
    Benchmark 2: /tmp/jj-gix-iter --ignore-working-copy git import -R ~/mirrors/linux
      Time (mean ± σ):     109.9 ms ±  11.6 ms    [User: 27.5 ms, System: 82.4 ms]
      Range (min … max):    89.4 ms … 117.8 ms    10 runs
2023-11-14 17:35:27 +09:00
Yuya Nishihara
044716ee40 git: migrate import_refs() to gix::Repository
Gitoxide errors are boxed since there are various error types and they tend
to exceed the clippy size limit.

Apparently, gitoxide is faster than git2:

    % hyperfine --warmup 3 --runs 10 \
        "/tmp/jj-baseline --ignore-working-copy git import -R ~/mirrors/linux" \
        "/tmp/jj-gix --ignore-working-copy git import -R ~/mirrors/linux"
    Benchmark 1: /tmp/jj-baseline --ignore-working-copy git import -R ~/mirrors/linux
      Time (mean ± σ):     205.4 ms ±  15.7 ms    [User: 59.6 ms, System: 144.6 ms]
      Range (min … max):   189.7 ms … 223.9 ms    10 runs
    Benchmark 2: /tmp/jj-gix --ignore-working-copy git import -R ~/mirrors/linux
      Time (mean ± σ):     176.2 ms ±  13.7 ms    [User: 41.2 ms, System: 134.0 ms]
      Range (min … max):   155.4 ms … 186.5 ms    10 runs
2023-11-14 17:35:27 +09:00
Yuya Nishihara
6c98dfcdcb git: have import_refs() obtain git2::Repository instance from store
This helps gitoxide migration. It's theoretically possible to import Git refs
from non-Git backend, but I don't think such API flexibility is needed.
2023-11-14 17:35:27 +09:00
Yuya Nishihara
dbb1adaf0a git: move import-related types close to import_refs() function 2023-11-14 17:35:27 +09:00
Yuya Nishihara
f991705e47 tests: add test for importing missing ancestor of HEAD
If a commit pointed to by HEAD or ref is missing, the ref is considered
invalid and excluded by import_refs(). The current test behavior appears to
depend on some in-memory cache of git2::Repository.
2023-11-14 17:35:27 +09:00
Yuya Nishihara
8e143541a5 operation: propagate OpStoreError from parents()
We need to .collect_vec() the parents iterator to temporary buffer since the
borrowed iterator can't be returned back to the dag_walk functions. Another
option is to clone op_store and parent ids to remove &self lifetime from the
iterator, but that also means a temporary Vec is created.
2023-11-14 07:16:39 +09:00
Yuya Nishihara
8ddad859e8 dag_walk: add fallible topo_order_reverse_lazy()
Unlike dfs_ok(), this function short-circuits at an Err as we use non-lazy
topo_order_forward() internally. I think that's good enough. If we implement
GC on operation log, deleted parents will be excluded (or mapped to tombstone)
by caller. An Err shouldn't mean it's GC-ed.
2023-11-14 07:16:39 +09:00
Yuya Nishihara
3d5a07e86a dag_walk: add fallible dfs(), topo_order(), heads(), and closest_common_node()
This unblocks the use of Result<T, E> in op.parents().

There are two ways to encode errors:
 a. impl IntoIterator<Item = Result<T, E>>
 b. Result<V, E> where V: FromIterator<Item = T>
I think (a) is more natural to algorithms like dfs(), which can process error
nodes transparently.

Still the caller might have to collect the source iterator to temporary Vec
to conform to the neighbors_fn signature. It's not easy for neighbors_fn to
return an iterator borrowing the input node. We already have GAT, but doesn't
have return-position impl Trait in trait yet.
2023-11-14 07:16:39 +09:00
Yuya Nishihara
e5a9a26911 dag_walk: remove unused and untested leaves() function 2023-11-14 07:16:39 +09:00
Anton Bulakh
e3a1e5b80e sign: Implement storage for digital commit signatures
Recognize signature metadata from git commit objects, implement
a basic version of that for the native backend.
Extract the signed data (a commit binary repr without the signature) to
be verified later.
2023-11-12 03:37:13 +02:00
Yuya Nishihara
b42a69db6d git_backend: configure committer (and author) of gix::Repository
Otherwise, ref updates would fail if we port git::export_refs() to gitoxide.
This change isn't strictly needed for the backend itself, but we'll reuse the
gix::Repository instance created by the backend when importing and exporting
Git refs.
2023-11-11 22:35:54 +09:00
Yuya Nishihara
ea32c0cb9e git_backend: pass UserSettings to GitBackend constructors 2023-11-11 22:35:54 +09:00
Yuya Nishihara
8a2048a0e5 repo: pass UserSettings to store factories and initializers
GitBackend will use it to configure gix::Repository. I think UserSettings
is generally useful to pass store-specific parameters, so I've updated all
factory functions.
2023-11-11 22:35:54 +09:00
Yuya Nishihara
6125fb160e op_store: embed details in operation/view not found error
This is basically a copy of BackendError::ObjectNotFound. The failed id may
be either view or operation id.
2023-11-11 22:35:40 +09:00
Yuya Nishihara
ea96513fd1 op_store: deduplicate functions that map io::Error to OpStoreError
io_to_read_error() also translates ErrorKind::NotFound.
2023-11-11 22:35:40 +09:00
Martin von Zweigbergk
120115a20d cli: pass MaterializedTreeValue into git_diff_part()
Just a little preparation for reading the materialized values
concurrently.
2023-11-10 04:54:47 -08:00
Waleed Khan
a60733f632 tree: remove unsafe with ouroboros for self-referential iterators 2023-11-09 21:50:29 -08:00
Yuya Nishihara
6ff3a4f3df repo: reimplement DirtyCell without using unsafe
While the safe implementation is a bit more complex (and probably more branchy),
I don't think the runtime overhead would matter here. Let's remove one more
unsafe for better code maintainability.
2023-11-10 07:42:45 +09:00
Martin von Zweigbergk
9b24d24612 conflicts: add another helper for materializing a tree value
We have a few places where we have a `MergedTreeValue` and need to
read the data associated with it so we can write to the working copy
or include it in a diff. Let's extract some of that shared logic to a
function so we can reuse it. I plan to use it for reading file
contents in advance while streaming a diff in `local_working_copy`
soon (and probably in `jj diff` thereafter), but I think it seems like
an improvement on its own.
2023-11-08 21:21:38 -08:00
Martin von Zweigbergk
65bd5cacba working copy: on checkout, move read from store out of write_*() functions
I'd like to read N files ahead from the backend, to avoid serializing
too many server calls on backends that are backed by a server. Moving
the reads a little earlier is a little step towards that.

The `TreeState::write_*()` functions can now be made into free/static
functions if we prefer.
2023-11-08 21:21:38 -08:00
Yuya Nishihara
084b99e1e2 index: rewrite CompositeIndex::entry_by_pos() by leveraging ancestors iterator
We no longer have "unsafe" in this function, so let's use the iterator API
instead of recursion. Apparently I haven't pushed this change before because
unsafe in .find_map() looked scary.
2023-11-08 12:09:33 +09:00
Anton Bulakh
d27351b978 misc: drop a few low-hanging unsafes
Remove a couple of unnecessary unsafes:
- The NonZeroUsize is a constant where the unwrap will optimize away
anyway and we don't have an unsafe without any good reason there :)
- The other two were simply not needed, lifetimes worked fine, maybe
Rust became better since that code was written? NLL? Anyway, they're
gone now
2023-11-08 02:16:08 +02:00
Yuya Nishihara
2ac9865ce7 revset: exclude @git branches from remote_branches()
As discussed in Discord, it's less useful if remote_branches() included
Git-tracking branches. Users wouldn't consider the backing Git repo as
a remote.

We could allow explicit 'remote_branches(remote=exact:"git")' query by changing
the default remote pattern to something like 'remote=~exact:"git"'. I don't
know which will be better overall, but we don't have support for negative
patterns anyway.
2023-11-08 07:34:30 +09:00
Yuya Nishihara
59640496aa cargo: sort dependencies list alphabetically 2023-11-07 23:46:05 +09:00
Yuya Nishihara
d1b0c4cc48 merge: relax input type of Merge::from_removes_adds() 2023-11-07 17:10:12 +09:00
Yuya Nishihara
e0c35684af merge: rename Merge::new() to Merge::from_removes_adds()
Since (removes, adds) pair is no longer the canonical representation of Merge,
the name Merge::new() seems too generic. Let's give more verbose name.
2023-11-07 17:10:12 +09:00
Yuya Nishihara
2c128f1b61 merged_tree: convert from legacy conflicts through interleaved list
This is basically the same change as the previous commit.
2023-11-07 17:10:12 +09:00
Yuya Nishihara
a734f46130 merged_tree: build unresolved Merge<Tree> from interleaved list
We no longer need to iterate removes and adds separately.
2023-11-07 17:10:12 +09:00
Yuya Nishihara
dd26b7be40 merge: add Merge constructor that accepts interleaved values
Also migrated some callers of 3-way merge, where [left, base, right] order
looks okay.
2023-11-07 17:10:12 +09:00
Yuya Nishihara
803b41c426 merge: load legacy Merge values without allocating intermediate buffers 2023-11-07 17:10:12 +09:00
Yuya Nishihara
09987c1d27 merge: micro-optimize allocation of Merge object for resolved value
It's super common that a Merge object holds a resolved value, so let's inline
up to 1 element. T of Merge<T> usually consists of a couple of pointer-sized
fields. I don't see any measurable speed up, but it's no worse than the
original.
2023-11-07 17:10:12 +09:00
Martin von Zweigbergk
1140295829 merged_tree: extract polling of tree futures into a function 2023-11-07 00:03:50 -08:00
Martin von Zweigbergk
c77417d4e4 merged_tree: drop outer loop in TreeDiffStreamImpl::poll_next()
As suggested by Yuya. I also added a comment and an assertion in the
case where return `Poll::Pending`.
2023-11-07 00:03:50 -08:00
Martin von Zweigbergk
d989d4093d merged_tree: let backend influence whether to use new diff algo
Since the concurrent diff algorithm is significantly slower when using
the Git backend, I think we'll have to use switch between the two
algorithms depending on backend. Even if the concurrent version always
performed as well as the sequential version, exactly how concurrent it
should be probably still depends on the backend. This commit therefore
adds a function to the `Backend` trait, so each backend can say how
much concurrency they deal well with. I then use that number for
choosing between the sequential and concurrent versions in
`MergedTree::diff_stream()`, and also to decide the number of
concurrent reads to do in the concurrent version.
2023-11-06 23:12:02 -08:00
Martin von Zweigbergk
f40adb84fc merged_tree: add a Stream for concurrent diff off trees
When diffing two trees, we currently start at the root and diff those
trees. Then we diff each subtree, one at a time, recursively. When
using a commit backend that uses remote storage, like our backend at
Google does, diffing the subtrees one at a time gets very slow. We
should be able to diff subtrees concurrently. That way, the number of
roundtrips to a server becomes determined by the depth of the deepest
difference instead of by the number of differing trees (times 2,
even). This patch implements such an algorithm behind a `Stream`
interface. It's not hooked in to `MergedTree::diff_stream()` yet; that
will happen in the next commit.

I timed the new implementation by updating `jj diff -s` to use the new
diff stream and then ran it on the Linux repo with `jj diff
--ignore-working-copy -s --from v5.0 --to v6.0`. That slowed down by
~20%, from ~750 ms to ~900 ms. Maybe we can get some of that
performance back but I think it'll be hard to match
`MergedTree::diff()`. We can decide later if we're okay with the
difference (after hopefully reducing the gap a bit) or if we want to
keep both implementations.

I also timed the new implementation on our cloud-based repo at
Google. As expected, it made some diffs much faster (I'm not sure if
I'm allowed to share figures).
2023-11-06 23:12:02 -08:00
Martin von Zweigbergk
9af09ec236 test_meregd_tree: test diffing with a matcher
We didn't have any tests at all for `MergedTree::diff()` with a
matcher other than `EverythingMatcher`. This patch adds a few.
2023-11-06 23:12:02 -08:00
Martin von Zweigbergk
16aa8e8f10 test_merged_tree: nest each part of test_diff_dir_file()
I'm about to add a few more checks for diffing with a matcher. I think
it will help make it readable and reduce the risk of mixing up
variables between each part of the test if we use some nested blocks.

I also removed some unnecessary `.clone()` calls while at it.
2023-11-06 23:12:02 -08:00
Martin von Zweigbergk
c9ce80a82a merged_tree: extract function for merged iterator of basenames in diff
I'm going to reuse this for stream/async diffing.
2023-11-06 23:12:02 -08:00
Martin von Zweigbergk
b72f04ba61 merged_tree: rename all_tree_conflict_names() since it's not about conflicts 2023-11-06 23:12:02 -08:00
Yuya Nishihara
3fddc31da8 merge: remove Merge::take() which is no longer used
Merge::take() is no longer a cheap function. We can add into_vec() if needed.
2023-11-07 06:52:35 +09:00
Yuya Nishihara
92dfe59ade refs: run non-trivial merge of ref targets without destructuring Merge object 2023-11-07 06:52:35 +09:00
Yuya Nishihara
93601541cb refs: use swap_remove() in non-trivial merge of ref targets
I'm going to add a Merge method that removes negative/positive terms pair, and
swap_remove() is the easiest option. The order of the conflicted ref targets
doesn't matter.
2023-11-07 06:52:35 +09:00
Yuya Nishihara
895bbce8c0 files: use borrowed Merge iterator in merge()
Since the underlying Merge data type is no longer (Vec<T>, Vec<T>), it doesn't
make sense to build removes/adds Vecs and concatenate them.
2023-11-07 06:52:35 +09:00
Yuya Nishihara
f1898a31b5 merge: simply print interleaved conflict values in debug output
We could apply that for the resolved case, but Resolved/Conflicted label
seems more useful than just printing Merge([value]).
2023-11-06 07:21:06 +09:00
Yuya Nishihara
b07b370ed3 merge: simply generate content hash from interleaved values 2023-11-06 07:21:06 +09:00
Yuya Nishihara
46ffb2f0b2 merge: store negative/positive terms internally in an interleaved Vec
Many callers use interleaved iterators, and recently-added serialization code
is built on top of that, so I think it's better to store terms in that format.

map() functions no longer use MergeBuilder as we know the mapped values are
ordered properly. flatten() and simplify() are reimplemented to work with the
interleaved values. The other changes are trivial.
2023-11-06 07:21:06 +09:00
Yuya Nishihara
287728fee7 merge: extract trivial_merge() that takes interleaved adds/removes iterator
The Merge type will store interleaved terms instead of separate adds/removes
vecs.
2023-11-06 07:21:06 +09:00
Yuya Nishihara
01523ba4f3 merge: rewrite bottom half of trivial_merge() for non-copyable types
The input values of trivial_merge() will be changed to Iterator<Item = T>
where T: Eq + Hash. It could be <Item = &'a T>, but it doesn't have to be.
2023-11-06 07:21:06 +09:00
Martin von Zweigbergk
7c923514ee git: add config to disable abandoning of unreachable commits
Some users prefer to have commits not get abandoned when importing
refs. This adds a config option for that.

Closes #2504.
2023-11-05 06:10:54 -08:00
Martin von Zweigbergk
7bf8906f9c git: extract a function for abandoning unreachable commits
This motivation for this is so we can easily skip calling the function
if the user has opted out of the propagation of abandoned commits we
usually do (#2504). However, it seems like a good piece of code to
extract regardless of that feature.
2023-11-05 06:10:54 -08:00
Yuya Nishihara
d9fbf21794 merge: have Merge::adds()/removes() return iterator
The Merge type will be changed to store interleaved values internally.
2023-11-05 16:43:06 +09:00
Yuya Nishihara
1c6913d618 merge: use Merge::iter() instead of adds()/removes() where order doesn't matter
Merge::iter() will be a slice::Iter, and be more efficient than chaining adds
and removes.
2023-11-05 16:43:06 +09:00
Yuya Nishihara
99e6ff493a merge: fix copy-paste error in doc comment for adds() 2023-11-05 16:43:06 +09:00
Yuya Nishihara
f6d85c51cd merge: add non-optional Merge accessor to the zeroth value
We have a few callers which just need to obtain an object common among all
the merge values. Let's add a non-failing accessor for that purpose.
2023-11-05 16:43:06 +09:00
Yuya Nishihara
b12c688ea0 merge: add method for indexed adds/removes access
The current adds()/removes() will be changed to return iterators.
2023-11-05 16:43:06 +09:00
Martin von Zweigbergk
6a5615c933 rewrite: use MergedTree::diff_stream() when restoring from tree 2023-11-04 21:07:49 -07:00
Yuya Nishihara
602b44258e workspace: add function that initializes colocated git repository
One less git2 API use in CLI.

The function name GitBackend::init_colocated() is a bit odd, but we need to
specify the work-tree path, not the ".git" repo path. So we can't eliminate
the notion of the working copy path anyway.
2023-11-05 08:48:35 +09:00
Yuya Nishihara
77e16243d6 tests: assert paths of initialized GitBackend 2023-11-05 08:48:35 +09:00
Yuya Nishihara
ce46c10c96 git_backend: extract inner function that initializes backend with open git repo 2023-11-05 08:48:35 +09:00
Yuya Nishihara
dce640aaf1 workspace: one less cloning of workspace_root in init_external_git()
Just a trivial code cleanup.
2023-11-05 08:48:35 +09:00
Yuya Nishihara
c866b4a42d workspace: fix repository path in init_internal_git() doc comment
Also rephrased "Git backend" as "Git repo" since the new backend storage will
be created.
2023-11-05 08:48:35 +09:00
Antoine Cezar
5973ab47b9 commands: move rebase_to_dest_parent to jj_lib::rewrite
What make rebase_to_dest_parent a good candidate for jj_lib::rewrite module:

- It is used both in obslog and interdiff. It's a sign that it may be moved to a lower layer
- CommandError is returned by converting from TreeMergeError. Not explicitly.
- It only use jj_lib::rewrite fonctions.
2023-11-03 20:48:00 +01:00
Martin von Zweigbergk
904c37d36d working copy: use MergedTree::diff_stream()
This will make it a little faster to update the working copy at Google
once we've made `MergedTree::diff_stream()` fetch trees
concurrently. (It only makes it a little faster because we still fetch
files serially.)
2023-11-03 08:15:10 -07:00
Martin von Zweigbergk
72245cfac5 merged_tree: add Stream-based version of diff(), delegating for now
I'm going to implement a `Stream`-based version optimized for
high-latency (RPC-based) commit backends. So far, that implementation
is about 20% slower in the Linux repo when running `jj diff
--ignore-working-copy -s --from v5.0 --to v6.0`. I think that's almost
only because the algorithm is different, not because it's async per
se.

This commit adds a `Stream`-based version of `MergedTree::diff()` that
just wraps the regular iterator in stream. I updated `jj diff` to use
it. I couldn't measure any difference on the command above in the
Linux repo. I think that means we can safely use the same
`Stream`-based interface regardless of backend, even if we end up
needing two different implementations of the `Stream`. We would then
be using the wrapped iterator from this commit for local backends, and
the new implementation for remote backends. But ideally we can make
the remote-friendly implementation fast enough that we don't need two
implementations.
2023-11-03 08:15:10 -07:00
Martin von Zweigbergk
24b706641f async: switch to pollster's block_on()
During the transition to using more async code, I keep running into
https://github.com/rust-lang/futures-rs/issues/2090. Right now, I want
to convert `MergedTree::diff()` into a `Stream`. I don't want to
update all call sites at once, so instead I'm adding a
`MergedTree::diff_stream()` method, which just wraps
`MergedTree::diff()` in a `Stream. However, since the iterator is
synchronous, it needs to block on the async `Backend::read_tree()`
calls. If we then also block on the `Stream` in the CLI, we run into
the panic.
2023-11-03 08:15:10 -07:00
Martin von Zweigbergk
3a378dc234 cli: add a function for restoring part of a tree from another tree
We had similar code in two places for restoring paths from one tree to
another. Let's reuse it instead.

I put the new function in the `rewrite` module. I'm not sure if that's
right place. Maybe it belongs in `tree`?
2023-11-02 06:07:45 -07:00
Yuya Nishihara
162dcd49b4 cli: rewrite base GitIgnoreFile lookup to use gitoxide instead of libgit2
Since gix::Repository::config_snapshot() borrows the repo instance, it has to
be allocated in caller's stack. That's why GitBackend::git_config() is removed.
2023-11-02 19:33:06 +09:00
Yuya Nishihara
c88e69ad6f git_backend: replace git2::Repository with gix::Repository
My gut feeling is that gitoxide aims to be more transparent than libgit2. We'll
need to know more about the underlying Git data model.

Random comments on gix API:

 * gix::Repository provides API similar to git2::Repository, but has less
   "convenient" functions. For example, we need to use .find_object() +
   .try_to/into_<kind>() instead of .find_<kind>().
 * gix::Object, Blob, etc. own raw data as bytes. gix::object and gix::objs
   types provide high-level views on such data.
 * Tree building is pretty low-level compared to git2.
 * gix leverages bstr (i.e. bytes) extensively.

It's probably not difficult to migrate git::import/export_refs(). It might
help eliminate the startup overhead of libssl initialization. The gix-based
GitBackend appears to be a bit faster, but that wouldn't practically matter.

#2316
2023-11-02 19:33:06 +09:00
Yuya Nishihara
9a86b77e38 tests: force gitoxide to not load config nor use "main" as default branch
AFAIK, there's no global config state for gitoxide. We can use
Config::isolated() in tests, but GitBackend should load config files in a
normal way.

https://docs.rs/gix/0.55.2/gix/open/permissions/struct.Config.html#method.isolated
https://docs.rs/gix/0.55.2/gix/init/constant.DEFAULT_BRANCH_NAME.html
2023-11-02 19:33:06 +09:00
Yuya Nishihara
f5a61dc2b7 git_backend: open just-initialized repo with canonicalized path
Otherwise, the initialized repo could have a different work-dir path than the
load()-ed one. libgit2 appears to do some normalization somewhere, but gix
won't.
2023-11-02 19:33:06 +09:00
Yuya Nishihara
fd187d266f git_backend: box GitBackendInit/LoadError up front
These error enums will wrap gix error types, and will become bigger enough for
clippy to complain.
2023-11-02 19:33:06 +09:00
Yuya Nishihara
b48569b104 cargo: add gitoxide (or gix) dependency
I've enabled the "index" component from the "basic" feature set, which would
be needed to implement colocated repo functionality. The doc suggests that
a library shouldn't activate "max-performance-safe", but our crate is also
an application so it would be okay to enable the feature. We'll need "parallel"
anyway to make GitBackend Sync.

https://docs.rs/gix/latest/gix/#feature-flags
2023-11-02 19:33:06 +09:00
Isabella Basso
749d8bb15a git: preserve HEAD when possible
Closes: #2210
2023-11-01 08:23:52 -03:00
Yuya Nishihara
1788b5014e git_backend: remove redundant copy back of author timestamp
Only the committer timestamp can be updated inside a loop.
2023-10-31 06:51:27 +09:00
Yuya Nishihara
f5aa739c70 git_backend: use .strip_suffix() instead of manual slicing 2023-10-31 06:51:27 +09:00
Yuya Nishihara
9bd84c55e0 git_backend: use file mode extensively in read_tree()
Both filemode() and kind() are calculated from the same underlying data,
and kind() is libgit2-specific API.
2023-10-31 06:51:27 +09:00
Yuya Nishihara
b3c9cab12d git_backend: handle read_tree() lookup/encoding errors gracefully 2023-10-31 06:51:27 +09:00
Yuya Nishihara
847adc832f git_backend: use lossy conversion to decode non-UTF-8 commit message
If message() returned None, it doesn't mean the commit message is empty. I
originally mapped it to an error, but that made import of linux repo fail.

https://docs.rs/git2/latest/git2/struct.Commit.html#method.message
2023-10-31 06:51:27 +09:00
Yuya Nishihara
06c254e742 git_backend: use non-owned str::from_utf8() to decode symlink target
Just for consistency with the other changes. str::Utf8Error is 2 words long,
so I removed the boxing.
2023-10-31 06:51:27 +09:00
Yuya Nishihara
d1c71c05c9 git_backend: remove redundant error handling for invalid hash length
The only error that could be returned by libgit2 is invalid hash length, and
we check that explicitly. If we switch the backends to gitoxide, there will be
panicking constructor.

https://docs.rs/git2/latest/git2/struct.Oid.html#method.from_bytes
2023-10-31 06:51:27 +09:00
Ilya Grigoriev
8bc3f5fd67 cli rebase: Allow jj rebase -r to rebase a commit onto a descendant
#1188

There are some additional test changes because children and descendants are now
rebased before the commit itself.
2023-10-30 10:56:27 -07:00
Yuya Nishihara
2d3fe7eee2 rewrite: replace use of "lift"ed function application with try_collect()
Also removed redundant borrow + clone.
2023-10-30 13:50:37 +09:00
Martin von Zweigbergk
35a23172ec backend: delete unused Phase enum
The idea was to support phases like in hg, but that hasn't happened
yet. We can add back this simple enum if we do add support for phases.
2023-10-29 12:02:40 -07:00
Martin von Zweigbergk
cfcdd71865 backend: make read_conflict synchronous again
This avoids https://github.com/rust-lang/futures-rs/issues/2090. I
don't think we need to worry about reading legacy conflicts
asynchronously - async is really only useful for Google's backend
right now, and we don't use the legacy format at Google. In
particular, I don't want `MergedTree::value()` to have to be async.
2023-10-28 16:45:40 -07:00
Martin von Zweigbergk
a1ef9dc845 merged_tree: propagate backend errors in diff iterator
I want to fix error propagation before I start using async in this
code. This makes the diff iterator propagate errors from reading tree
objects.

Errors include the path and don't stop the iteration. The idea is that
we should be able to show the user an error inline in diff output if
we failed to read a tree. That's going to be especially useful for
backends that can return `BackendError::AccessDenied`. That error
variant doesn't yet exist, but I plan to add it, and use it in
Google's internal backend.
2023-10-26 06:20:56 -07:00
Martin von Zweigbergk
309f1200d6 merge: introduce a type alias for Merge<Option<TreeValue>>
Reasons to introduce this alias:

* Reduces complexity of a type, to silence Clippy warnings in the
  future if we use this type as a type parameter

* The type is used quite frequently, so it makes sense to have a name
  for it

* It's easier to visually scan for the end of the type when you don't
  have to match opening and closing angle brackets
2023-10-26 06:20:56 -07:00
Martin von Zweigbergk
6ad71e658d merged_tree: rename MergedTreeValue to MergedTreeVal
I'm going to add `MergedTreeValue` as an alias for
`Merge<Option<TreeValue>>`, but we already have a type by that name in
`merged_tree`. This patch renames it away, to make room for the new
alias. I used `MergedTreeVal` for this borrowing version to be a bit
like how `str` is a borrowed version of `String`.
2023-10-26 06:20:56 -07:00
Yuya Nishihara
1bfe5b5b56 cli: add string pattern support to "git push --branch"
Since "jj git fetch --branch" supports glob patterns, users would expect that
"jj git push --branch glob:.." also works.

The error handling bits are copied from "branch" sub commands. We might want to
extract it to a common helper function, but I haven't figured out a reasonable
boundary point yet.
2023-10-26 04:51:17 +09:00
Yuya Nishihara
a6ac9b46e7 git: simply call fetch() with one or more branch name filters 2023-10-25 03:58:48 +09:00
Yuya Nishihara
560b63544a cli: parse "git fetch --branch" parameter as string pattern
Even though "*" can't be used as a branch name to fetch, it should be better
to explicitly enable glob matching like the other commands.
2023-10-25 03:58:48 +09:00
Martin von Zweigbergk
8157d4ff63 merge: materialize conflicts in executable files like regular files
AFAICT, all callers of `Merge::to_file_merge()` are already well
prepared for working with executable files. It's called from these
places:

* `local_working_copy.rs`: Materialized conflicts are correctly
  updated using `Merge::with_new_file_ids()`.

* `merge_tools/`: Same as above.

* `cmd_cat()`: We already ignore the executable bit when we print
  non-conflicted files, so it makes sense to also ignore it for
  conflicted files.

* `git_diff_part()`: We print all conflicts with mode "100644" (the
  mode for regular files). Maybe it's best to use "100755" for
  conflicts that are unambiguously executable, or maybe it's better to
  use a fake mode like "000000" for all conflicts. Either way, the
  current behavior seems fine.

* `diff_content()`: We use the diff content in various diff
  formats. We could add more detail about the executable bits in some
  of them, but I think the current output is fine. For example,
  instead of our current "Created conflict in my-file", we could say
  "Created conflict in executable file my-file" or "Created conflict
  in ambiguously executable file my-file". That's getting verbose,
  though.

So, I think all we need to do is to make `Merge::to_file_merge()` not
require its inputs to be non-executable.

Closes #1279.
2023-10-24 06:45:45 -07:00
Martin von Zweigbergk
21b11df8a9 merge: make non-conflicted debug string for Merge shorter
Resolves states are most common and the current format is pretty
verbose. Let's print it as if `Merge` were an enum with `Resolved` and
`Conflicted` variants instead.
2023-10-24 06:45:45 -07:00
Yuya Nishihara
831dc84c5b git: remove RefName::GitRef variant which isn't used anymore 2023-10-24 09:45:01 +09:00
Yuya Nishihara
a756333216 git: move RefName enum from view module
It's no longer used by View API.
2023-10-24 09:45:01 +09:00
Yuya Nishihara
95919699d4 repo: add merge methods per ref kind, remove RefName indirection
Since local/remote branches are now of different types, it doesn't make much
sense to dispatch merging through RefName. Let's add merge_<kind>() methods
instead.
2023-10-24 09:45:01 +09:00
Yuya Nishihara
6dfe1572a0 git: expand RefName match arms in import_refs()
This prepares for the removal of merge_single_refs().
2023-10-24 09:45:01 +09:00
Yuya Nishihara
ffe7e5f142 repo: inline merge_single_ref() from view
MutableRepo handles merging of the other kind of refs internally, and the
merge function is short enough to inline. I also removed early returns since
most callers provide non-identical ref targets, and merge_ref_targets() should
be cheap if the inputs can be trivially merged.
2023-10-24 09:45:01 +09:00
Yuya Nishihara
5543f7a11c refs: merge tracking state of remote branches
Otherwise "jj op undo" can't roll back tracking states (whereas "op restore"
can.)
2023-10-24 07:13:58 +09:00
Yuya Nishihara
3d45540ff6 refs: add function to diff RemoteRef pairs 2023-10-24 07:13:58 +09:00
Yuya Nishihara
6450194ccd refs: rename diff_named_refs() to diff_named_ref_targets()
I'm going to add RemoteRef variant of this function, and "refs" there will
be ambiguous.
2023-10-24 07:13:58 +09:00
Yuya Nishihara
59eb03eec5 refs: add helper function to iterate local/remote ref pairs
This partially reverts the change in 30fb7995c2 "view: make local/remote
branches iterator yield RemoteRef instead of RefTarget." As I'm going to add
diff function for RemoteRef pairs, we'll need a generic version of merge-join
iterator anyway.
2023-10-24 07:13:58 +09:00
Yuya Nishihara
16ef57907b str_util: add more StringPattern methods for convenience
The Display impl helps to format error messages. Since both Regex and
glob::Pattern implement Display, it's probably okay for our type to copy that.
2023-10-22 04:07:49 +09:00
Yuya Nishihara
4c4eb31a62 view: add methods that filter local/remote branches by pattern
We'll use them in CLI code.
2023-10-22 04:07:49 +09:00
Martin von Zweigbergk
578d61ec2e git: use forward slashes in relative path to backing git repo
Closes #2403.
2023-10-20 22:51:14 -05:00
Yuya Nishihara
cfcc76571c revset: add support for glob:pattern 2023-10-21 09:55:01 +09:00
Yuya Nishihara
f7c8622981 str_util: extract StringPattern filtering iterator from revset module 2023-10-21 09:55:01 +09:00
Yuya Nishihara
2d80f071de str_util: extract function that constructs StringPattern from string 2023-10-21 09:55:01 +09:00
Yuya Nishihara
5707a194d5 str_util: extract StringPattern from revset module
Branch name filtering in CLI will be migrated to this, and I'll probably add
glob:<pattern> in place of --glob option.
2023-10-21 09:55:01 +09:00
Martin von Zweigbergk
8764ad9826 conflicts: make materialization async
We need to let async-ness propagate up from the backend because
`block_on()` doesn't like to be called recursively. The conflict
materialization code is a good place to make async because it doesn't
depends on anything that isn't already async-ready.
2023-10-20 07:38:34 -07:00
Martin von Zweigbergk
e900c97618 conflicts: reduce some duplication in tests by extracting a closure 2023-10-20 07:38:34 -07:00
Martin von Zweigbergk
f541f9f3a6 cleanup: import futures::exectutor::block_on() instead of qualifying
It seems we'll end up using `block_on()` quite a bit, at least until
we're done transitioning to async, and the function name doesn't
conflict with anything else, so let's always import it when we need
it.
2023-10-20 07:38:34 -07:00
Yuya Nishihara
390b3208da revset: always include non-tracking remote branches in suggestion
For the same reason as the previous commit. Non-tracking remote branch
shouldn't be shadowed by a local branch of the same name.
2023-10-17 16:42:36 +09:00
Yuya Nishihara
9e0b9e6dc8 refs: error out push if non-tracking remote branches exist
We can provide more actionable error message than "not fast-forwardable". If
the push was fast-forwardable, "jj branch track" should be able to merge the
remote branch without conflicts, so the added step would be minimal.
2023-10-17 15:06:03 +09:00
Yuya Nishihara
089503abfb refs: classify push action based on tracking target
Although this is logically correct, the error message is a bit cryptic. It's
probably better to reject push if non-tracking remote branches exist.

#1136
2023-10-17 15:06:03 +09:00
Yuya Nishihara
30fb7995c2 view: make local/remote branches iterator yield RemoteRef instead of RefTarget
We'll use remote_ref.tracking_target() to classify push action, but not all
callers of local_remote_branches() need tracking_target() instead of target.
2023-10-17 15:06:03 +09:00
Yuya Nishihara
e0965c4533 git: on push, update jj's view of remote branches without using import_refs()
This means that the commits previously pinned by remote branches are no longer
abandoned. I think that's more correct since "push" is the operation to
propagate local view to remote, and uninteresting commits should have been
locally abandoned.
2023-10-17 15:06:03 +09:00
Yuya Nishihara
c8a848d260 git: prohibit push to remote named "git"
Since I'm going to make git::push_branches() update the repo view internally,
it should fail fast if the remote name is reserved. Before, the problem was
detected on git::import_refs().
2023-10-17 15:06:03 +09:00
Yuya Nishihara
58897d79c7 git: extract push function that processes branches instead of git refs
Since pushed remote branches will share the common base targets with locals,
these branches should be marked as tracking. git::push_branches() will handle
that. It looks ugly that the public GitBranchPushTargets type keeps "force"-d
branches as a separate set, but we'll need to rework that anyway when we
implement --force-with-lease behavior. So let's leave it for now.

Some of the git::push_updates() tests have been migrated to the new function.
I left a couple of basic tests for git::push_updates() because push_updates()
will be used to implement a low-level "jj git push-refs" command.
2023-10-17 15:06:03 +09:00
Yuya Nishihara
57167cefda git: on import_refs(), don't abandon ancestors of newly fetched refs
I made import_refs() not preserve commits referenced by remote branches at
520f692a46 "git: on import_refs(), don't preserve old branches referenced by
remote refs." The idea is that remote branches are weak, and commits referenced
by these refs can be freely rewritten by future local changes without moving
the refs. I don't think that's wrong, but 520f692a46 also made "new" remote
changes be abandoned by old remote refs. This problem occurs only when
git.auto-local-branch is off.

I think there are two ways to fix the problem:
 a. pin non-tracking remote branches just like local refs
 b. pin newly fetched refs in addition to local refs
This patch implements (b) because it's simpler and more obvious that the
fetched commits would never be abandoned immediately.
2023-10-17 14:49:49 +09:00
Martin von Zweigbergk
c3b45b6fd1 workspace: make working-copy type customizable
This add support for custom `jj` binaries to use custom working-copy
backends. It works in the same way as with the other backends, i.e. we
write a `.jj/working_copy/type` file when the working copy is
initialized, and then we let that file control which implementation to
use (see previous commit).

I included an example of a (useless) working-copy implementation. I
hope we can figure out a way to test the examples some day.
2023-10-16 22:33:44 -07:00
Martin von Zweigbergk
6bfd618275 workspace: load working copy implementation dynamically
This makes `Workspace::load()` look a new `.jj/working_copy/type` file
in order to load the right working copy implementation, just like
`Repo::load()` picks the right backends based on `.jj/store/type`,
`.jj/op_store/type`, etc. We don't write the file yet, and we don't
have a way of adding alternative working copy implementations, so it
will always be `LocalWorkingCopy` for now.
2023-10-16 22:33:44 -07:00
Martin von Zweigbergk
e1f00d9426 working copy: pass commit instead of tree into check_out()
Our internal working copy implementations at Google will need the
commit so they can walk history backwards until they get to a "public"
commit. They'll then use that to tell build tools and virtual file
systems to present that as a base.

I'm not sure if we'll need to update `reset()` too. It's currently
only used by `jj untrack`, which doesn't change the commit's parent,
so it wouldn't affect any history walks.
2023-10-16 22:33:44 -07:00
Martin von Zweigbergk
7c8a0a18f9 repo: define types for backend initializer functions
`ReadonlyRepo::init()` takes callbacks for initializing each kind of
backend. We called these things like `op_store_initializer`. I found
that confusing because it is not a `OpStoreFactory` (which is for
loading an existing backend). This patch tries to clarify that by
renaming the arguments and adding types for each kind of callback
function.
2023-10-16 22:33:44 -07:00
Yuya Nishihara
9cafff87e1 cli: add API and branch subcommand to track/untrack remote branches
This patch adds MutableRepo::track_remote_branch() as we'll probably need to
track the default branch on "jj git clone". untrack_remote_branch() is also
added for consistency.
2023-10-16 23:21:05 +09:00
Yuya Nishihara
4af678848d op_store: minimal change to load/store tracking state of remote branches
We could instead migrate the storage types to (local_branches, remote_views),
but that would be more involved and break forward compatibility with little
benefit. Maybe we can do that later when we introduce remote tags.
2023-10-16 23:21:05 +09:00
Yuya Nishihara
4cd2518be0 git: on import_refs(), respect tracking state of existing remote refs
In this commit, new behavior is tested by using in-memory view data. Data
persistence and track/untrack commands will be implemented soon.
2023-10-16 23:21:05 +09:00
Yuya Nishihara
a697175674 view: add tracking state to RemoteRef
The state field isn't saved yet. git import/export code paths are migrated,
but new tracking state is always calculated based on git.auto-local-branch
setting. So the tracking state is effectively a global flag.

As we don't know whether the existing remote branches have been merged in to
local branches, we assume that remote branches are "tracking" so long as the
local counterparts exist. This means existing locally-deleted branch won't
be pushed without re-tracking it. I think it's rare to leave locally-deleted
branches for long. For "git.auto-local-branch = false" setup, users might have
to untrack branches if they've manually "merged" remote branches and want to
continue that workflow. I considered using git.auto-local-branch setting in the
migration path, but I don't think that would give a better result. The setting
may be toggled after the branches got merged, and I'm planning to change it
default off for better Git interop.

Implementation-wise, the state enum can be a simple bool. It's enum just
because I originally considered to pack "forgotten" concept into it. I have
no idea which will be better for future extension.
2023-10-16 23:21:05 +09:00
Martin von Zweigbergk
0582893144 working copy: return Box<dyn LockedWorkingCopy> from start_mutation() 2023-10-15 16:13:19 -07:00
Martin von Zweigbergk
580586d008 working copy: return Box<dyn WorkingCopy> from finish() 2023-10-15 16:13:19 -07:00
Martin von Zweigbergk
6a13fa8264 working copy: add tree_id() to backend trait
Looks like I missed this earlier. I think it makes sense to have on
all working copy implementations.
2023-10-15 16:13:19 -07:00
Martin von Zweigbergk
a733fceba9 working copy: add functions to start/finish modification to backend trait
To keep this patch small, the functions still return the concrete
local-disk implementations. I'll fix that soon.
2023-10-15 16:13:19 -07:00
Martin von Zweigbergk
63654d064b working copy: add sparse pattern functions to backend trait 2023-10-15 15:59:49 -07:00
Martin von Zweigbergk
6457a13260 working copy: add reset() function to the backend trait
This includes documenting the new function and the other types moved
to the `working_copy` module.
2023-10-15 15:59:49 -07:00
Martin von Zweigbergk
0d2247b0df working copy: add check_out() function to the backend trait
This includes documenting the new function and the other types moved
to the `working_copy` module.
2023-10-15 15:59:49 -07:00
Martin von Zweigbergk
781859cb51 working copy: add snapshot() function to the backend trait
This includes documenting the new function and the other types moved
to the `working_copy` module.
2023-10-15 15:59:49 -07:00
Martin von Zweigbergk
3aa57b1a04 working copy: start defining a trait for a locked working copy
As with the `WorkingCopy` trait, this just contains some trivial
methods for now.
2023-10-15 15:59:49 -07:00
Martin von Zweigbergk
49637cb0fd working copy: don't clear tree_state_dirty just before it's dropped 2023-10-15 15:59:49 -07:00
Martin von Zweigbergk
8826639e4e working copy: remove reference from locked instance to base instance
It's going to be easier to define a `LockedWorkingCopy` trait if it
doesn't need to borrow from `WorkingCopy`, so let's remove the
reference we currently have and have
`LockedLocalWorkingCopy::finish()` return the new `LocalWorkingCopy`
instead.

I think the main disadvantage is that we now have to remember to
replace the old `LocalWorkingCopy` instance by the new one, whereas
the compiler would remind us before this commit. We could make
`start_modification()` take an owned `self`, but that would be a bit
annoying to work with when we have the instance stored in a field.
2023-10-15 15:59:49 -07:00
Martin von Zweigbergk
c8184845d7 workspace: replace working_copy_mut() by wrapper type
I'm about to make `LockedLocalWorkingCopy` not borrow from
`LocalWorkingCopy`. That will make it easier to forget to update any
`LocalWorkingCopy` variables when the modifications have been
committed. This patch introduces a wrapper around
`LockedLocalWorkingCopy` to help prevent that.

Thanks to Yuya for the suggestion.
2023-10-15 15:59:49 -07:00
Martin von Zweigbergk
35f9c12cb5 working copy: move LocalWorkingCopy::check_out() to Workspace
`LocalWorkingCopy::check_out()` can be expressed using the planned
`WorkingCopy` trait, so it doesn't need to be in the trait itself
`WorkingCopy`. I wasn't sure if I should make it a free function in
`working_copy`, but I ended up moving it onto `Workspace`.
2023-10-15 15:59:49 -07:00
Martin von Zweigbergk
9fb5a4bcef tests: rename a repo1 variable to repo since there's only one
I think the `repo1` name was a leftover from when `ReadonlyRepo` had a
`WorkingCopy`.
2023-10-15 15:59:49 -07:00
Yuya Nishihara
8b46712bcb view: add stub method to set remote branch target and state, migrate callers
Since set_remote_branch_target() is called while merging refs, its tracking
state shouldn't be reinitialized. The other callers are migrated to new setter
to keep the story simple.
2023-10-16 05:12:19 +09:00
Yuya Nishihara
35ea607abd view: make get_remote_branch() return RemoteRef instead of RefTarget
I'm going to add tracking state to RemoteRef, and we should compare both
target and state in the tests.
2023-10-16 05:12:19 +09:00
Yuya Nishihara
d730b250a0 view: add absent RemoteRef constructors, proxy methods, and conversion impls
Copied from RefTarget.
2023-10-16 05:12:19 +09:00
Yuya Nishihara
ab1a6e2f71 view: make remote branches iterator yield RemoteRef instead of RefTarget
git::import_refs() will need to read RemoteRef's tracking state.
2023-10-16 05:12:19 +09:00
Yuya Nishihara
e7d93e5bf1 view: turn BranchTarget into borrowed type
This isn't important, but I'm going to change remote_targets to store RemoteRef
instead of RefTarget, so I went ahead and change the other field types as well.
2023-10-16 05:12:19 +09:00
Yuya Nishihara
92facbf21a view: add method to iterate branches of specified remote 2023-10-16 05:12:19 +09:00
Yuya Nishihara
8bdef924c8 view: rename remote_branches() to all_remote_branches()
I'm going to add a method that iterates branches of certain remote, and I
can't find a better name for it than remote_branches(remote_name).
2023-10-16 05:12:19 +09:00
Martin von Zweigbergk
6ca7b5d352 repo: levarage the new static <backend type>::name() functions 2023-10-14 15:21:53 -07:00
Martin von Zweigbergk
f8be0b2030 backends: deduplicate definition of backend names
I copied the example set by `DefaultSubmoduleStore`.
2023-10-14 06:38:35 -07:00
Yuya Nishihara
9186d0fd38 workspace: convert external git repo path to store-relative by constructor
We could fix do_git_clone() instead, but it seemed a bit weird that the
git_repo_path is relative to the store path which is unknown to callers.

Fixes #2374
2023-10-14 22:20:09 +09:00
Yuya Nishihara
b7c7b19eb8 view: migrate in-memory structure to per-remote branches map
There's a subtle behavior change. Unlike the original remove_remote_branch(),
remote_views entry is not discarded when the branches map becomes empty. The
reasoning here is that the remote view can be added/removed when the remote
is added/removed respectively, though that's not implemented yet. Since the
serialized data cannot represent an empty remote, such view may generate
non-unique content hash.
2023-10-14 22:20:00 +09:00
Yuya Nishihara
3ec3cac90b revset: use op_store::View type to resolve branches()/remote_branches()
These functions depend heavily on the underlying data structure, and I haven't
decided abstract View API to access to per-remote data types. Let's use the
underlying data type for now.
2023-10-14 22:20:00 +09:00
Yuya Nishihara
0160eaefd9 view: add function that converts branches to/from per-remote view-like data 2023-10-14 22:20:00 +09:00
Yuya Nishihara
2b78275e22 view: add per-remote view types and iterator that reconstructs BranchTarget
I'm planning to add support for untracked remote branches, and under that
model, there will be many remote branches without local counterparts. That's
the main reason why remote branches are grouped by remote, not by branch name.

The added helper functions will be used by simple_op_store and view.
2023-10-14 22:20:00 +09:00
Yuya Nishihara
198dfa5cbe view: extract remove_remote() from git module
This will be just self.data.remote_views.remove(remote_name), so I'm not gonna
refactor the implementation at this point.
2023-10-13 18:12:45 +09:00
Yuya Nishihara
1d3c830e85 view: rewrite get_branch() callers in tests, remove the method
get_branch() would need to reconstruct the remote_targets map if we migrate
the underlying data structure to per-remote views. Let's remove the method as
it is only used in tests.
2023-10-13 18:12:45 +09:00
Yuya Nishihara
3fb0a3b926 view: add has_branch(name), replace some of get_branch(name) callers
get_branch(name) will be removed soon.
2023-10-13 18:12:45 +09:00
Yuya Nishihara
d840610bad view: rewrite set_branch() callers in tests, remove the method
I'm going to reorganize the underlying data structure, and set_branch() won't
be as simple as it is now.
2023-10-13 18:12:45 +09:00
Martin von Zweigbergk
0aa5f1ae10 working copy: rename working_copy_path() to just path()
It seems pretty clear from the context. Turns out we only use the
function in a test case. Maybe we don't even need it. It's easy to
provide it, though.
2023-10-12 16:10:38 -07:00
Martin von Zweigbergk
9e43207911 working copy: don't expose TreeStateError in LocalWorkingCopy API
The `TreeStateError` type is specific to the current local-disk
working-copy backend, so it should not be part of the generic
working-copy interface I'm trying to create.
2023-10-12 16:10:38 -07:00
Martin von Zweigbergk
0e09d53ce6 working copy: make some reset errors less specific
Same reasoning as the previous commits.
2023-10-12 16:10:38 -07:00
Martin von Zweigbergk
645be615b4 working copy: make some snapshot errors less specific
Same reasoning as the previous commit.
2023-10-12 16:10:38 -07:00
Martin von Zweigbergk
324c40d4c5 working copy: make some checkout errors less specific
I think some of the errors variants in `CheckoutError` are too
specific to the local-disk implementation. Let's merge them and make
them less specific, so it's easier to define a reasonable trait for
the working copy.
2023-10-12 16:10:38 -07:00
Martin von Zweigbergk
33d27ed09f working copy: start defining a working copy trait
This just extracts a trait for the trivial bits to start with.
2023-10-12 16:10:38 -07:00
Yuya Nishihara
69a30b47af refs: migrate classify_branch_push_action() to local/remote targets pair 2023-10-12 16:50:09 +09:00
Yuya Nishihara
420bf79217 view: add method to iterate local/remote pairs of certain remote
This is common operation, and we don't need a map of all remote targets
to calculate push action.
2023-10-12 16:50:09 +09:00
Yuya Nishihara
679a591a22 repo: rewrite diffing of named refs to compare each targets pair
As I'm going to split branches into per-remote map, .get_branch(name) will need
to gather remote branches by name to construct remote_targets map. Let's instead
iterate local/remote branches separately. I also migrated diffing of the other
kinds of refs to filter out unchanged entries upfront.
2023-10-11 19:24:24 +09:00
Yuya Nishihara
6bc19bfa95 git: remove workaround for "branch forget && git fetch"
As we now diff incoming git refs against our known remote branches, the problem
described in the comment no longer occurs. test_branch_forget_fetched_branch()
passes, and the inline comments in the test are still valid.
2023-10-11 06:18:36 +09:00
Yuya Nishihara
7d78ef60d1 git: rewrite diffing of exportable branches to not re-lookup targets by name
As we need to build a set of all branch names anyway, we can also put old/new
targets there. InvalidGitName is moved to caller since the diff function no
longer converts RefName to "refs/" string.
2023-10-11 06:18:36 +09:00
Yuya Nishihara
6f5cc2fd32 git: on export_refs(), copy already-exported local branches to "git" remote
This ensures that our view of the "git" remote is updated even if the last
imported/exported git_refs were out of sync because of "op restore".
2023-10-11 06:18:36 +09:00
Yuya Nishihara
aaf1bbcb4a git: use HashMap to track failed branches internally in export_refs()
This helps to filter out unexported refs in the next commit.
2023-10-11 06:18:36 +09:00
Yuya Nishihara
e160970b79 git: migrate import_refs() to diffing git_refs and known remote refs 2023-10-09 22:31:20 +09:00
Yuya Nishihara
36ee24379b git: build separate lists of git/remote refs to be imported
The idea is that the "remote" refs could have been "op restore"-d whereas
view.git_refs() will never be. The next commit will update known_remote_refs
to be constructed from the current remote branches.

Instead of building these lists in a single loop, we could load new git_refs
to the view first, and then build diffs of the remote refs. I considered that,
but I feel it would be a bit awkward to update refs before importing commits
to the view.

The "remote" refs are stored in BTreeMap since merging order should be stable.
2023-10-09 22:31:20 +09:00
Yuya Nishihara
5bd2ab76f4 git: check reserved remote name while diffing
As I'm going to add separate lists of changed git_refs/remote_refs, it'll
become a bit unclear which one we should check for reserved remotes. The
diff might also be reorganized as a list of (remote, name, kind, old_target,
new_target) where remote == "git" means the git-tracking branch. In this
data structure, the notion of reserved remote name would be lost.
2023-10-09 22:31:20 +09:00
Yuya Nishihara
f062df6da7 git: rename local variables in import_refs()
I'm going to add separate lists of changes for git_refs and remote refs, and
the current changed_git_refs will be the list for the remote refs.
2023-10-09 22:31:20 +09:00
Martin von Zweigbergk
1b9a3e27e0 merged_tree: read before/after trees concurrently
I'm going to rewrite `TreeDiffIterator` to fetch one level (depth) of
the tree at a time and concurrently. One step towards that is to
convert the iterator to a `Stream`. I'd like to do that by making the
current `Iterator` implementation call the new `Stream`
implementation. However, we can't call `futures::executor::block_on()`
on a future that itself calls `futures::executor::block_on()` (as
`Store::read_tree()` does), so the first step is to bubble up the
async-ness a bit. This patch does that by fetching both sides of the
diff concurrently. That should give close to a 2x speedup on
high-latency backends. (It doesn't help with our backend at Google,
however, because we have a daemon process that does some speculative
prefetching that usually downloads the child trees anyway.)
2023-10-08 23:36:49 -07:00
Martin von Zweigbergk
815cf9bf07 merge: implement Default and Extend on MergeBuilder
`futures::stream::Stream::collect()` requires a collection that
implements `Default` and `Extend`, and I would like to to be able to
collect a stream of trees.
2023-10-08 23:36:49 -07:00
Martin von Zweigbergk
5174489959 backend: make read functions async
The commit backend at Google is cloud-based (and so are the other
backends); it reads and writes commits from/to a server, which stores
them in a database. That makes latency much higher than for disk-based
backends. To reduce the latency, we have a local daemon process that
caches and prefetches objects. There are still many cases where
latency is high, such as when diffing two uncached commits. We can
improve that by changing some of our (jj's) algorithms to read many
objects concurrently from the backend. In the case of tree-diffing, we
can fetch one level (depth) of the tree at a time. There are several
ways of doing that:

 * Make the backend methods `async`
 * Use many threads for reading from the backend
 * Add backend methods for batch reading

I don't think we typically need CPU parallelism, so it's wasteful to
have hundreds of threads running in order to fetch hundreds of objects
in parallel (especially when using a synchronous backend like the Git
backend). Batching would work well for the tree-diffing case, but it's
not as composable as `async`. For example, if we wanted to fetch some
commits at the same time as we were doing a diff, it's hard to see how
to do that with batching. Using async seems like our best bet.

I didn't make the backend interface's write functions async because
writes are already async with the daemon we have at Google. That
daemon will hash the object and immediately return, and then send the
object to the server in the background. I think any cloud-based
solution will need a similar daemon process. However, we may need to
reconsider this if/when jj gets used on a server with a custom backend
that writes directly to a database (i.e. no async daemon in between).

I've tried to measure the performance impact. That's the largest
difference I've been able to measure was on `jj diff
--ignore-working-copy -s --from v5.0 --to v6.0` in the Linux repo,
which increases from 749 ms to 773 ms (3.3%). In most cases I've
tested, there's no measurable difference. I've tried diffing from the
root commit, as well as `jj --ignore-working-copy log --no-graph -r
'::v3.0 & author(torvalds)' -T 'commit_id ++ "\n"'` (to test a
commit-heavy load).
2023-10-08 23:36:49 -07:00
Martin von Zweigbergk
bd5eef9c5e git_backend: rename some store variables backend in tests
This is to avoid confusion with instances of the `Store` type.
2023-10-08 23:36:49 -07:00
Martin von Zweigbergk
b9a122ffe7 working_copy: inline apply_diff closure
This effectively undoes d8a313cdd4, which is no longer needed since
we just changed that error handling. It should make it easier to share
some of the current if/else blocks.
2023-10-07 14:02:31 -07:00
Martin von Zweigbergk
44eb902171 working_copy: don't crash when updating and tracked file exits on disk
Before this patch, when updating to a commit that has a file that's
currently an ignored file on disk, jj would crash. After this patch,
we instead leave the conflicting files or directories on disk. We
print a helpful message about how to inspect the differences between
the intended working copy and the actual working copy, and how to
discard the unintended changes.

Closes #976.
2023-10-07 14:02:31 -07:00
Martin von Zweigbergk
4601c87710 working_copy: move creation of parent dirs to one place
I'm about to add handling of parent dirs that are existing ignored
files, so it's better to have it in one place. The only functional
difference should be that we now create parent directories for git
submodules. I don't think that matters.
2023-10-07 14:02:31 -07:00
Martin von Zweigbergk
187ba9430a working_copy: rename to local_working_copy
It's about time we make the working copy a pluggable backend like we
have for the other storage. We will use it at Google for at least two
reasons:

 * To support our virtual file system. That will be a completely
   separate working copy backend, which will interact with the virtual
   file system to update and snapshot the working copy.

 * On local disk, we need to tell our build system where to find the
   paths that are not in the sparse patterns. We plan to do that by
   wrapping the standard local working copy backend (the one moved in
   this commit), writing a symlink that points to the mainline commit
   where the "background" files can be read from.

Let's start by renaming the exising implementation to
`local_working_copy`.
2023-10-07 08:19:03 -07:00
Yuya Nishihara
d0bc34e0f2 git: look up "git" remote branches normally 2023-10-07 19:33:35 +09:00
Yuya Nishihara
717d0d3d6d git: on deserialize/import/export, copy refs/heads/* to remote named "git"
I've added a boolean flag to the store to ensure that the migration never runs
more than once after the view gets "op restore"-d. I'll probably reorganize the
branches structure to support non-tracking branches later, but updating the
storage format in a single commit would be too involved.

If jj is downgraded, these "git" remote refs would be exported to the Git repo.
Users might have to remove them manually.
2023-10-07 19:33:35 +09:00
Yuya Nishihara
9407d4ecca view: on deserialize, remove reserved "git" remote refs stored by old jj
I'm going to migrate "refs/heads/" branches to .remote_targets["git"]. This
commit will simplify the story as we won't have to exclude "refs/remotes/git/"
refs when diffing or renaming/removing remote.
2023-10-07 19:33:35 +09:00
Yuya Nishihara
0e82e52c3a revset: remove extra step to resolve full commit id, use prefix matching
Since both has_id() and resolve_prefix() do binary search, their costs are
practically the same. I think has_id() would complete with fewer ops, but such
level of optimization wouldn't be needed here. More importantly, this ensures
that unreachable commits aren't imported by GitBackend::read_commit().
2023-10-07 02:08:36 +09:00
Yuya Nishihara
f0ad1f53ea git_backend: on read_commit(), fall back to importing extras as needed
One problematic scenario is that we have commits imported by old jj, and all
of their descendant commits are created by jj. Therefore import_head_commits()
wouldn't reach the old ancestor commits.

This change might bury a real bug, but I don't have a better alternative. Maybe
we can remove this hack after a couple of jj releases, and add a debug command
that imports all reachable Git commits from all historical heads.

Closes #2343
2023-10-07 02:08:36 +09:00
Yuya Nishihara
a302090de9 git: add hack to unset Git HEAD by using placeholder ref
As we can set HEAD to an arbitrary ref by using .reference_symbolic(), we don't
have to manage a ref that can also be valid as a branch name.

Fixes #1495
2023-10-05 01:32:48 +09:00
Yuya Nishihara
7ccbc0424c git: extract function that resets Git HEAD
I'll add a workaround for the root parent issue #1495 there. We can pass in
the wc parent id instead of the wc_commit object, but we might want to use
wc_commit.id() to generate a unique placeholder ref name.
2023-10-04 01:43:34 +09:00
Yuya Nishihara
7c96cead34 git_backend: rename git_repo_clone() as it isn't just cloning, propagate error
Since git2::Repository::open() will access to the filesystem, it can technically
fail.
2023-10-04 00:04:24 +09:00
Yuya Nishihara
837dc4f47c git_backend: rewrite remaining git_repo() callers, make it private
While debugging git issues, I often ended up creating a deadlock by adding
debug prints. It's also not obvious that git::export_refs() works even if the
git_repo() has already been locked, whereas git::import_refs() wouldn't. Let's
consolidate lock handling to the backend implementation.
2023-10-04 00:04:24 +09:00
Yuya Nishihara
0d63223dad git_backend: proxy git2::Repository methods to manage lock scope internally
Since git_repo() acquires Mutex, it's super easy to create a deadlock.
2023-10-04 00:04:24 +09:00
Yuya Nishihara
65ecac10e9 git: exclude hidden commits from list of commits to be abandoned
This wasn't a problem before, but we wouldn't want to report previously-hidden
commits as abandoned.
2023-10-02 17:31:05 +09:00
Yuya Nishihara
16d3bcd4c5 git: make import_refs() return abandoned commits to caller
It'll be reported to user.
2023-10-02 17:31:05 +09:00
Martin von Zweigbergk
58de7c292b cli: redefine default log revset using immutable_heads()
I think most users who change the set of immutable heads away from
`trunk() | tags()` are going to also want to change the default log
revset to include the newly mutable commit and to exclude the newly
immutable commits. So let's update the default log revset to use
`immutable_heads()` instead.

`test_templater` changed because we have overridden the set of
immutable commits there so `jj log` now includes the remote branch.
2023-10-01 11:15:30 -07:00
Yuya Nishihara
f0f1d72cf3 view: add method that iterates remote branches only
There aren't many callers, but let's add it for consistency.
2023-09-30 12:02:35 +09:00
Yuya Nishihara
c5474ff505 view: extract method that iterates local branches only
I'll probably reorganize the local/remote branches structure, so let's
minimize call sites which rely on the BranchTarget struct.
2023-09-30 12:02:35 +09:00
Yuya Nishihara
520f692a46 git: on import_refs(), don't preserve old branches referenced by remote refs
If we made @git branches real .remote_targets["git"], remotely-rewritten
commits could also be pinned by the @git branches, and therefore wouldn't be
abandoned. We could exclude the "git" remote, but I don't think local commits
should be pinned by remote refs in general. If we squashed a fetched commit,
remote ref would point to a hidden commit anyway.
2023-09-30 12:02:35 +09:00
Martin von Zweigbergk
7fda80fc22 tree: simplify conflict before resolving at hunk level
I ran into a bug the other day where `jj status` said there was a
conflict in a file but there were no conflict markers in the working
copy. The commit was created when I squashed a conflict resolution
into the commit's parent. The rebased child commit then ended up in
this state. I.e., it looked something like this before squashing:

```
C (no conflict)
|
| B conflict
|/
A conflict
```

The conflict in B was different from the conflict in A. When I
squashed in C, jj would try to resolve the conflicts by first creating
a 7-way conflict (3 from A, 3 from B, 1 from C). Because of the exact
content-level changes, the 7-way conflict couldn't be automatically
resolved by `files::merge()` (the way it currently works
anyway). However, after simplifying the conflict, it could be
resolved. Because `MergedTree::merge()` does another round of conflict
simplification of the result at the end of the function, it was the
simplifed version that actually got stored in the commit. So when
inspecting the conflict later (e.g. in the working copy, as I did), it
could be automatically resolved.

I think there are at least two ways to solve this. One is to call
`merge_trees()` again after calling `tree.simplify()` in
`MergedTree::merge()`. However, I think it would only matter in the
case of content-level conflicts. Therefore, it seems better to make
the content-level resolution solve this case to start with. I've done
that by simplifying the conflict before passing it into
`files::merge()`. We could even do the fix in `files::merge()`, but
doing it before calling it has the advantage that we can avoid reading
some unchanged content from the backend.
2023-09-27 22:14:39 -07:00
Martin von Zweigbergk
af80e4e407 files: take Merge argument to merge()
All non-test callers already have a `Merge` object, so let's pass that
instead. We thereby simplify the callers a little, and we enforce the
"adds.len() == removes.len() + 1" constraint in the type.
2023-09-27 22:14:39 -07:00
Martin von Zweigbergk
e50f6acab1 templater: fast-path empty and conflict to not read trees
When there's a single parent, we can determine if a commit is empty by
just comparing the tree ids. Also, when using tree-level conflicts, we
don't need to read the trees to determine if there's a conflict. This
patch adds both of those fast paths, speeding up `jj log -r ::main`
from 317 ms to 227 ms (-28.4%). It has much larger impact with our
cloud-based backend at Google (~5x faster).

I made the same fix in the revset engine and the Git push code (thanks
to Yuya for the suggestion).
2023-09-26 18:18:52 -07:00
Martin von Zweigbergk
a6ef3f0b6c cli: make set of immutable commits configurable
This adds a new `revset-aliases.immutable_heads()s` config for
defining the set of immutable commits. The set is defined as the
configured revset, as well as its ancestors, and the root commit
commit (even if the configured set is empty).

This patch also adds enforcement of the config where we already had
checks preventing rewrite of the root commit. The working-copy commit
is implicitly assumed to be writable in most cases. Specifically, we
won't prevent amending the working copy even if the user includes it
in the config but we do prevent `jj edit @` in that case. That seems
good enough to me. Maybe we should emit a warning when the working
copy is in the set of immutable commits.

Maybe we should add support for something more like [Mercurial's
phases](https://wiki.mercurial-scm.org/Phases), which is propagated on
push and pull. There's already some affordance for that in the view
object's `public_heads` field. However, this is simpler, especially
since we can't propagate the phase to Git remotes, and seems like a
good start. Also, it lets you say that commits authored by other users
are immutable, for example.

For now, the functionality is in the CLI library. I'm not sure if we
want to move it into the library crate. I'm leaning towards letting
library users do whatever they want without being restricted by
immutable commits. I do think we should move the functionality into a
future `ui-lib` or `ui-util` crate. That crate would have most of the
functionality in the current `cli_util` module (but in a
non-CLI-specific form).
2023-09-25 15:41:45 -07:00
Martin von Zweigbergk
551abee1d6 local_backend: don't write commits with no parents 2023-09-25 15:41:45 -07:00
Yuya Nishihara
0ca5cf48db git: make fetch() import local tags in addition to remote branches 2023-09-26 00:47:00 +09:00
Martin von Zweigbergk
380e204e73 test: use test backend in most remaining tests too
I don't think the backend should matter for any of these tests, so
let's test with only one, and let's make that the strictest one - the
new test backend.

This reduces the number of tests by 74 (from 974 to 900), but saves no
measurable run time.
2023-09-24 21:24:01 -07:00
Martin von Zweigbergk
9d8be29d94 watchman: use single-threaded async runtime
The `#[tokio::main]` annotation uses a multi-threaded runtime by
default. We don't need that for querying watchman. Switching to the
single-threaded runtime saves about 20 ms.
2023-09-24 15:46:13 -07:00
Yuya Nishihara
9697970f71 revset: simplify RevsetAliasesMap getters to not construct id
I originally attempted to embed function parameters in RevsetAliasId. That's
probably why these getters return id. Let's move id construction to callers
since the id only serves as a recursion blocker.
2023-09-24 23:21:23 +09:00
Martin von Zweigbergk
acf84f5cd8 cargo: add LICENSE file to each crate we publish
According to https://github.com/clap-rs/clap/pull/2810, the Apache
license must be distributed with sources, including with the sources
we publish to crates.io.
2023-09-22 21:48:28 -07:00
Martin von Zweigbergk
9946e52fdf tree: leverage Merge::try_map() when reading file contents to merge 2023-09-22 19:33:48 -07:00
Yuya Nishihara
4974065edb git: update comment about remote-tracking branches to be exported
Spotted while porting it to per-remote views. Undone fetch/push is tricky. If
we want to detect git/jj conflicts in that scenario, we would need to track
both "known" and "current" remote targets. It's probably okay to export jj's
remote targets as we do the reverse for import_refs(), but I need to think
that a bit more.
2023-09-23 09:56:56 +09:00
Martin von Zweigbergk
6f53d3a103 tests: test views, operations, and mutable repos only with test backend
I don't think the backend type should matter for any of these.
2023-09-20 07:47:30 -07:00
Martin von Zweigbergk
e3f82cd99a tests: leverage TestRepo::init() in test_merged_tree
I forgot to update these call sites when I introduced (the new version
of) `TestRepo::init()`.
2023-09-20 07:47:30 -07:00
Martin von Zweigbergk
f39b0d24c8 tests: use test backend in working copy tests, fix MergedTree bug
Only tests dealing with Git submodules care about the backend type.

Switching the tests to use the test backend also uncovered another bug
in `MergedTree`, so I fixed that too. The bug only happens with legacy
trees (path-level conflicts) and backends that care about the conflict
path, so it wouldn't happen with Git backends, and it wouldn't happen
at Google either (because we use tree-level conflicts).
2023-09-19 20:49:41 -07:00
Martin von Zweigbergk
0f7054e8c3 tests: wherever we test with only one backend, use the test backend
I don't think there's any reason to use the local backend in tests
instead of using the stricter test backend.

I think we should generally use the test backend in tests and only use
the local backend or git backend when there's a particular reason to
do so (such as in `test_bad_locking` where the on-disk directory
structure matters). But this patch only deals with the simpler cases
where we were only testing with the local backend.
2023-09-19 20:49:41 -07:00
Martin von Zweigbergk
d575aaeca8 backend: move constant functions first
`root_commit_id()`, `root_change_id()`, and `empty_tree_id()` were
strangely ordered between `write_symlink()` and `read_tree().
2023-09-19 05:24:51 -07:00
Martin von Zweigbergk
7ecd64fde1 merged_tree: use child path when merging child
This fixes a bug where we used the parent directory's path when trying
read trees and files for a child entry. Many tests in
`test_merged_tree` fail after switching to the test backend there
without this fix/
2023-09-18 07:53:19 -07:00
Martin von Zweigbergk
63ba2a6346 tests: add a strict backend for use in tests
We ran into a bug in `MergedTree` with our commit backend at
Google. The problem there was that `MergedTree` sometimes uses the
wrong path when reading files and trees. We didn't catch the bug in
our tests (outside of Google) because both our backends let you read
files and trees at any path.

This commit introduces a stricter backend that we can use in tests to
catch this kind of bug. For simplicity, it stores all data in
memory. Since tests are short-lived, I think that should be fine.

For now, this backend is stricter only in that it doesn't mix objects
written to different paths. We can make it strict/lossy in other ways
later (e.g. modifying written commit objects).

I think having a backend designed for tests can also be useful for
later making it possible to control the backend, e.g. to inject
errors.

We may want to replace almost all uses of the local backend in tests
with uses of this new test backend.
2023-09-18 07:53:19 -07:00
Martin von Zweigbergk
cfffbb6cd7 content_hash: make public
I'm going to use `content_hash` in a new commit backend for use in
tests.
2023-09-18 07:53:19 -07:00
Martin von Zweigbergk
9c30d7500b testutils: delete bool-typed init() in favor of enum-typed version
It makes the call sites clearer if we pass the `TestRepoBackend` enum
instead of the boolean `use_git` value. It's also more extensible (I
plan to add another backend for tests).
2023-09-18 07:15:37 -07:00
Martin von Zweigbergk
50596c499e testutils: allow passing TestRepoBackend to TestWorkspace too 2023-09-18 07:15:37 -07:00
Martin von Zweigbergk
c6cf9d54f6 testutils: add an enum for TestRepo backend
I plan to add another backend for use in tests.
2023-09-18 07:15:37 -07:00
Martin von Zweigbergk
79527d707c testutils: use .jj-internal git repos in most tests
I don't think there's much reason to run most tests with a `.git`
directory outside of `.jj`. I think it's just that way for historical
reasons. It's been that way since I added support for `.jj`-internal
repos in a8a9f7dedd.

The reason I want to switch is to make it a little easier to create
test repos for different backends. The problem with `.jj`-external git
repos is that they depend on an additional path.

I had to update `test_bad_locking.rs` to make the code merging
directories able handle missing directories on some side, because
git's loose objects result in directories getting created on one or
both sides.
2023-09-18 07:15:37 -07:00
Martin von Zweigbergk
2bc641c57c test_bad_locking: make merge_directories() expect non-existent target
I ran into some issues here when switching our tests to use
`.jj`-internal git repos. For example, the `std::fs::copy()` calls
started failing, which may be related to #2103. I think one problem is
that we could end up calling `merge_directories()` twice for the same
directory. This patch fixes that by deduping the paths we call with,
and makes the function assume that the output directory doesn't exist.
2023-09-18 06:59:28 -07:00
Yuya Nishihara
fc86d91b15 tests: fix test_merge_views_branches() to set up local "feature" branches
Git refs aren't involved in this test.
2023-09-11 01:07:29 +09:00
Yuya Nishihara
b80d04319d tests: rewrite "@"/"root" revset resolution tests without using bad git refs 2023-09-11 01:07:29 +09:00
Martin von Zweigbergk
86206e83a4 repo_path: avoid repeated copying of PathBuf in to_fs_path()
I've noticed `WorkspaceCommandHelper::format_file_path()` appear in
profiles a few times. A big part of that is spent in
`RepoPath::to_fs_path()`. I think I had been thinking that
`PathBuf::join()` takes `self` by value and mutates it, but it turns
out it creates a new instance. So our `result = result.join(...)` in a
loop was copying the `PathBuf` over and over. This fixes that and also
reserves the expected size. That speeds up `jj files
--ignore-working-copy -r v6.0` in the Linux repo from 546.0 s to 509.3
s (6.7%).
2023-09-09 08:43:29 -07:00
Yuya Nishihara
d05aaf30b4 git: promote imported commits to tree-conflict format if configured 2023-09-08 21:54:43 +09:00
Yuya Nishihara
0c4e667e48 git: import new head commits all at once 2023-09-08 21:54:43 +09:00
Yuya Nishihara
f744e5920b git: create no-gc ref by GitBackend
Since we now have an explicit method to import heads, it makes more sense to
manage no-gc refs by the backend.
2023-09-08 21:54:43 +09:00
Yuya Nishihara
17f502c83a git: do not import unknown commits by read_commit(), use explicit method call
The main goal of this change is to enable tree-level conflict format, but it
also allows us to bulk-import commits on clone/init. I think a separate method
will help if we want to provide progress information, enable check for
.jjconflict entries under certain condition, etc.

Since git::import_refs() now depends on GitBackend type, it might be better to
remove git_repo from the function arguments.
2023-09-08 21:54:43 +09:00
Yuya Nishihara
dde1c1ec9c revset: use ancestors(x, 0) in tree substitution test 2023-09-08 09:28:14 +09:00
Yuya Nishihara
8b825796ff cli: rewrite "x | x-" in default log revset as "ancestors(x, 2)" 2023-09-08 09:28:14 +09:00
James Sully
0946934ca6 revset: Add optional argument n to ancestors() in revset language 2023-09-08 02:50:58 +10:00
James Sully
6ea10906a8 revset: add ancestors_range() method 2023-09-08 02:50:58 +10:00
Yuya Nishihara
c4769e0b7c revset: translate symbol rules in error message
Since we have overloaded operator symbols, we need to deduplicate them
upfront. Legacy and compat operators are also removed from the suggestion.

It's a bit ugly to mutate the error struct before calling Error::renamed_rule(),
but I think it's still better than reimplementing message formatting function.
2023-09-07 15:29:39 +09:00
Yuya Nishihara
a868b2d9a5 revset: do not lookup unimported tags or remote branches by unqualified name
Since e7e49527ef "git: ensure that remote branches never diverge", the last
known "refs/remotes" ref should be synced with the corresponding remote branch.
So we can always trust the branch@remote expression. We don't need "refs/tags"
lookup either since tags should have been imported by git::import_refs().

FWIW, I'm thinking of reorganizing view.git_refs() map as per-remote views.
It would be nice if we can get rid of revsets and template keywords exposing
low-level Git ref primitives.
2023-09-06 13:42:22 +09:00
Yuya Nishihara
4e6323c0a4 revset: add basic test for tag name resolution 2023-09-06 13:42:22 +09:00
Yuya Nishihara
8f4512b109 git: rename local variables in diff_refs_to_export()
old/new_branch sounds like a branch name, but it's a RefTarget. And
jj_known_refs_passing_filter may contain "new" ref names.
2023-09-06 08:17:49 +09:00
Yuya Nishihara
119cc88832 git: extract calculation step from export_some_refs()
export_some_refs() is big, let's split it to functions of reasonable size.
2023-09-06 08:17:49 +09:00
Yuya Nishihara
89f1d83a22 git: do not export new ref if racy process created one with the same name
Since we've checked the ref doesn't exist in this code path, I think it's
better to not overwrite the existing ref.
2023-09-06 08:17:49 +09:00
Yuya Nishihara
a4dd598e3e git: extract helper that exports single updated ref 2023-09-06 08:17:49 +09:00
Yuya Nishihara
d1a08782c9 git: extract helper that exports single deleted ref 2023-09-06 08:17:49 +09:00
Yuya Nishihara
7282b517e8 git: remove stale TODO comment about FailedRefExportReason 2023-09-06 08:17:49 +09:00
Philip Metzger
8dda7e21e0 revsets: Add descendants_at and ancestors_at
They are needed for `next` and `prev`.
They behave symmetrically for parents and children.
2023-09-05 23:13:39 +02:00
Martin von Zweigbergk
f198a1e5f9 git: skip branches on root commit on export
Git doesn't have a root commit, so we should skip branches pointing to
it on export, just like we do with conflicted branches (which Git also
doesn't support).
2023-09-04 20:08:11 -07:00
Martin von Zweigbergk
ef550a9d6d git: include reason for each failed ref export 2023-09-04 20:08:11 -07:00
Martin von Zweigbergk
f6c24fbcef git: return refs that failed to export in sorted order
Before this patch, the order would depend on the reason we failed to
export a ref, because we would add to the `failed_branches` list in
several different places. What's worse, when the export failed because
the branch was conflicted or had an invalid name (from Git's
perspective), it was non-deterministic because we iterated over a
HashSet. This patch fixes that by sorting at the end.

Note that we still want the `branches_to_update` map to be a
`BTreeMap` so we update branches in deterministic order. Otherwise the
error when trying to export both branches `main` and `main/sub` will
become non-deterministic.
2023-09-04 20:08:11 -07:00
Yuya Nishihara
b0c8e9ef62 revset: add 0-ary "::" and ".." operators as short for "all()" and "~root()"
Suppose "x::y" is the operator that defaults to "root()::visible_heads()"
respectively, "::" is identical to "all()". Since we've just changed the
behavior of "..y", ".." is now "root()..visible_heads()" meaning "~root()".
2023-09-05 10:40:04 +09:00
Yuya Nishihara
6b2ad23f8f revset: evaluate "..y" expression to "root()..y"
This seems useful since the root commit is often uninteresting. It's also
consistent with "x::y" in a way that the left operand defaults to "root()".
2023-09-05 10:40:04 +09:00
Yuya Nishihara
f2c1697362 revset: fix comment about "@" expression 2023-09-05 10:40:04 +09:00
Yuya Nishihara
e3c85d6ecc revset: convert root symbol to function
The idea is that we can fully eliminate special symbols that would otherwise
shadow user branches, tags, or change ID prefixes.

Closes #2095
2023-09-04 10:36:30 +09:00
Martin von Zweigbergk
e3825495e3 store: don't include cached objects in Debug format
I was looking into some overly verbose logs and happened to notice
that `Store` uses the default derived, which presumably means it's
going to include all objects in its cached. Just including the
backend's debug string seems enough.
2023-09-01 16:49:23 -07:00
Martin von Zweigbergk
16c897d8b4 conflicts: leverage Merge::map() in materialize_merge_result() 2023-09-01 16:23:39 -07:00
Martin von Zweigbergk
b8f71a4b30 working_copy: in LockedWorkingCopy::drop(), discard unsaved changes
In `LockedWorkingCopy::drop()`, we panic if the caller had not called
`finish()`. IIRC, the idea was both to find bugs where we forgot to
call `finish()` and to prevent continuing with a modified
`WorkingCopy` instance. I don't think the former has been a problem in
practice. It has been a problem in practice to call `discard()` to
avoid the panic, though. To address that, we can make the `Drop`
implementation discard the changes (forcing a reload of the state if
the working copy is accessed again).
2023-09-01 12:25:47 -07:00
Martin von Zweigbergk
5ef0be73c1 merged_tree: allow building trees with variable-arity overrides
When restoring (`jj restore`) a 3-sided conflict from one tree into a
2-sided tree (or a resolved tree), we'll need to extend the size arity
of the target tree to that of the source tree. I had not considered
this case before. This patch relaxes the constraint in
`MergedTreeBuilder` to allow such cases. The additional trees are
based on empty trees with only the larger overrides in them.
2023-09-01 06:09:37 -07:00
Martin von Zweigbergk
d29c831ef9 merge: add a function for padding a Merge to a given size
This will be useful for padding merge objects in `MergedTreeBuilder`
to the size of the widest override.
2023-09-01 06:09:37 -07:00
Martin von Zweigbergk
be08707a3a op_heads_store: propagate errors from OpStore 2023-08-31 12:36:47 -07:00
Martin von Zweigbergk
9d9b2cd057 tests: leverage create_tree() in a few more tests 2023-08-30 19:58:42 -07:00
Martin von Zweigbergk
26581750fe store: add a empty_merged_tree_id() helper
Many (most?) callers of `Store::empty_tree_id()` really want a
`MergedTreeId`, so let's create a helper for that. It returns the
`Legacy` variant, which is what all current callers used. That should
be all we need since the two variants compare equal these days, and
since trees built based on the legacy variant can get promoted to the
new variant on write if the config is enabled.
2023-08-30 19:58:42 -07:00
Martin von Zweigbergk
61501db8ec merged_trees: consider conflict-format-change-only commits empty
When we start writing tree-level conflicts in an existing repo, we
don't want commits that change the format to be non-empty if they
don't change any content. This patch updates `MergeTreeId::eq()` to
consider two resolved trees equal even if only their `MergedTreeId`
variant is different (one is path-level and one is tree-level).

I think I've gone through all places we compare tree ids and checked
that it's safe to compare them this way. One consequence is that
rebasing a commit without changing the parents (typically
auto-rebasing after `jj describe`) will not lead to the tree id
getting upgraded, due to an optimization we have for that case. I
don't think that's serious enough to handle specially; we'll have to
support the old format for existing repos for a while regardless of a
few commits not getting upgraded right away.

The number of failing tests with the config option enabled drop from
108 to 11 with this patch.
2023-08-30 06:17:21 -07:00
Martin von Zweigbergk
8e47d2d66f merged_tree: add config option to write trees using new format
We're finally ready to start writing trees using the new format where
we represent conflicts by having multiple trees in the commit instead
of having a single tree with multiple entries at a path. This patch
adds a config option for that. It's not ready to be used yet, so I
haven't updated the release notes or other documentation.

I added only a simple CLI test for testing what happens when the
config is enabled in an existing repo. 108 tests currently fail if we
flip the default.
2023-08-30 06:17:21 -07:00
Martin von Zweigbergk
bb755ae754 working_copy: use MergedTreeBuilder in test 2023-08-30 06:17:21 -07:00
Martin von Zweigbergk
6b3fa3ee79 working_copy: avoid a nested TreeBuilder in a test
It's just slightly simpler to not create a `TreeBuilder` to create a
`TreeValue`; we can instead write to the outer builder.
2023-08-30 06:17:21 -07:00
Martin von Zweigbergk
7e6930b56f backend: remove last few instances of MergedTreeId::as_legacy_tree_id() 2023-08-30 06:17:21 -07:00
Martin von Zweigbergk
962da1947e tests: make dump_tree() work with merged trees
My goal is to minimize impact on tests when we start using the new
format.
2023-08-30 06:17:21 -07:00