Commit graph

730 commits

Author SHA1 Message Date
Yuya Nishihara
ea32c0cb9e git_backend: pass UserSettings to GitBackend constructors 2023-11-11 22:35:54 +09:00
Yuya Nishihara
8a2048a0e5 repo: pass UserSettings to store factories and initializers
GitBackend will use it to configure gix::Repository. I think UserSettings
is generally useful to pass store-specific parameters, so I've updated all
factory functions.
2023-11-11 22:35:54 +09:00
Yuya Nishihara
2ac9865ce7 revset: exclude @git branches from remote_branches()
As discussed in Discord, it's less useful if remote_branches() included
Git-tracking branches. Users wouldn't consider the backing Git repo as
a remote.

We could allow explicit 'remote_branches(remote=exact:"git")' query by changing
the default remote pattern to something like 'remote=~exact:"git"'. I don't
know which will be better overall, but we don't have support for negative
patterns anyway.
2023-11-08 07:34:30 +09:00
Yuya Nishihara
e0c35684af merge: rename Merge::new() to Merge::from_removes_adds()
Since (removes, adds) pair is no longer the canonical representation of Merge,
the name Merge::new() seems too generic. Let's give more verbose name.
2023-11-07 17:10:12 +09:00
Yuya Nishihara
dd26b7be40 merge: add Merge constructor that accepts interleaved values
Also migrated some callers of 3-way merge, where [left, base, right] order
looks okay.
2023-11-07 17:10:12 +09:00
Martin von Zweigbergk
d989d4093d merged_tree: let backend influence whether to use new diff algo
Since the concurrent diff algorithm is significantly slower when using
the Git backend, I think we'll have to use switch between the two
algorithms depending on backend. Even if the concurrent version always
performed as well as the sequential version, exactly how concurrent it
should be probably still depends on the backend. This commit therefore
adds a function to the `Backend` trait, so each backend can say how
much concurrency they deal well with. I then use that number for
choosing between the sequential and concurrent versions in
`MergedTree::diff_stream()`, and also to decide the number of
concurrent reads to do in the concurrent version.
2023-11-06 23:12:02 -08:00
Martin von Zweigbergk
f40adb84fc merged_tree: add a Stream for concurrent diff off trees
When diffing two trees, we currently start at the root and diff those
trees. Then we diff each subtree, one at a time, recursively. When
using a commit backend that uses remote storage, like our backend at
Google does, diffing the subtrees one at a time gets very slow. We
should be able to diff subtrees concurrently. That way, the number of
roundtrips to a server becomes determined by the depth of the deepest
difference instead of by the number of differing trees (times 2,
even). This patch implements such an algorithm behind a `Stream`
interface. It's not hooked in to `MergedTree::diff_stream()` yet; that
will happen in the next commit.

I timed the new implementation by updating `jj diff -s` to use the new
diff stream and then ran it on the Linux repo with `jj diff
--ignore-working-copy -s --from v5.0 --to v6.0`. That slowed down by
~20%, from ~750 ms to ~900 ms. Maybe we can get some of that
performance back but I think it'll be hard to match
`MergedTree::diff()`. We can decide later if we're okay with the
difference (after hopefully reducing the gap a bit) or if we want to
keep both implementations.

I also timed the new implementation on our cloud-based repo at
Google. As expected, it made some diffs much faster (I'm not sure if
I'm allowed to share figures).
2023-11-06 23:12:02 -08:00
Martin von Zweigbergk
9af09ec236 test_meregd_tree: test diffing with a matcher
We didn't have any tests at all for `MergedTree::diff()` with a
matcher other than `EverythingMatcher`. This patch adds a few.
2023-11-06 23:12:02 -08:00
Martin von Zweigbergk
16aa8e8f10 test_merged_tree: nest each part of test_diff_dir_file()
I'm about to add a few more checks for diffing with a matcher. I think
it will help make it readable and reduce the risk of mixing up
variables between each part of the test if we use some nested blocks.

I also removed some unnecessary `.clone()` calls while at it.
2023-11-06 23:12:02 -08:00
Yuya Nishihara
93601541cb refs: use swap_remove() in non-trivial merge of ref targets
I'm going to add a Merge method that removes negative/positive terms pair, and
swap_remove() is the easiest option. The order of the conflicted ref targets
doesn't matter.
2023-11-07 06:52:35 +09:00
Yuya Nishihara
895bbce8c0 files: use borrowed Merge iterator in merge()
Since the underlying Merge data type is no longer (Vec<T>, Vec<T>), it doesn't
make sense to build removes/adds Vecs and concatenate them.
2023-11-07 06:52:35 +09:00
Yuya Nishihara
f1898a31b5 merge: simply print interleaved conflict values in debug output
We could apply that for the resolved case, but Resolved/Conflicted label
seems more useful than just printing Merge([value]).
2023-11-06 07:21:06 +09:00
Martin von Zweigbergk
7c923514ee git: add config to disable abandoning of unreachable commits
Some users prefer to have commits not get abandoned when importing
refs. This adds a config option for that.

Closes #2504.
2023-11-05 06:10:54 -08:00
Yuya Nishihara
d9fbf21794 merge: have Merge::adds()/removes() return iterator
The Merge type will be changed to store interleaved values internally.
2023-11-05 16:43:06 +09:00
Yuya Nishihara
602b44258e workspace: add function that initializes colocated git repository
One less git2 API use in CLI.

The function name GitBackend::init_colocated() is a bit odd, but we need to
specify the work-tree path, not the ".git" repo path. So we can't eliminate
the notion of the working copy path anyway.
2023-11-05 08:48:35 +09:00
Yuya Nishihara
77e16243d6 tests: assert paths of initialized GitBackend 2023-11-05 08:48:35 +09:00
Martin von Zweigbergk
24b706641f async: switch to pollster's block_on()
During the transition to using more async code, I keep running into
https://github.com/rust-lang/futures-rs/issues/2090. Right now, I want
to convert `MergedTree::diff()` into a `Stream`. I don't want to
update all call sites at once, so instead I'm adding a
`MergedTree::diff_stream()` method, which just wraps
`MergedTree::diff()` in a `Stream. However, since the iterator is
synchronous, it needs to block on the async `Backend::read_tree()`
calls. If we then also block on the `Stream` in the CLI, we run into
the panic.
2023-11-03 08:15:10 -07:00
Martin von Zweigbergk
3a378dc234 cli: add a function for restoring part of a tree from another tree
We had similar code in two places for restoring paths from one tree to
another. Let's reuse it instead.

I put the new function in the `rewrite` module. I'm not sure if that's
right place. Maybe it belongs in `tree`?
2023-11-02 06:07:45 -07:00
Martin von Zweigbergk
a1ef9dc845 merged_tree: propagate backend errors in diff iterator
I want to fix error propagation before I start using async in this
code. This makes the diff iterator propagate errors from reading tree
objects.

Errors include the path and don't stop the iteration. The idea is that
we should be able to show the user an error inline in diff output if
we failed to read a tree. That's going to be especially useful for
backends that can return `BackendError::AccessDenied`. That error
variant doesn't yet exist, but I plan to add it, and use it in
Google's internal backend.
2023-10-26 06:20:56 -07:00
Martin von Zweigbergk
6ad71e658d merged_tree: rename MergedTreeValue to MergedTreeVal
I'm going to add `MergedTreeValue` as an alias for
`Merge<Option<TreeValue>>`, but we already have a type by that name in
`merged_tree`. This patch renames it away, to make room for the new
alias. I used `MergedTreeVal` for this borrowing version to be a bit
like how `str` is a borrowed version of `String`.
2023-10-26 06:20:56 -07:00
Yuya Nishihara
a6ac9b46e7 git: simply call fetch() with one or more branch name filters 2023-10-25 03:58:48 +09:00
Martin von Zweigbergk
21b11df8a9 merge: make non-conflicted debug string for Merge shorter
Resolves states are most common and the current format is pretty
verbose. Let's print it as if `Merge` were an enum with `Resolved` and
`Conflicted` variants instead.
2023-10-24 06:45:45 -07:00
Yuya Nishihara
a756333216 git: move RefName enum from view module
It's no longer used by View API.
2023-10-24 09:45:01 +09:00
Yuya Nishihara
5543f7a11c refs: merge tracking state of remote branches
Otherwise "jj op undo" can't roll back tracking states (whereas "op restore"
can.)
2023-10-24 07:13:58 +09:00
Yuya Nishihara
cfcc76571c revset: add support for glob:pattern 2023-10-21 09:55:01 +09:00
Martin von Zweigbergk
8764ad9826 conflicts: make materialization async
We need to let async-ness propagate up from the backend because
`block_on()` doesn't like to be called recursively. The conflict
materialization code is a good place to make async because it doesn't
depends on anything that isn't already async-ready.
2023-10-20 07:38:34 -07:00
Martin von Zweigbergk
e900c97618 conflicts: reduce some duplication in tests by extracting a closure 2023-10-20 07:38:34 -07:00
Yuya Nishihara
390b3208da revset: always include non-tracking remote branches in suggestion
For the same reason as the previous commit. Non-tracking remote branch
shouldn't be shadowed by a local branch of the same name.
2023-10-17 16:42:36 +09:00
Yuya Nishihara
e0965c4533 git: on push, update jj's view of remote branches without using import_refs()
This means that the commits previously pinned by remote branches are no longer
abandoned. I think that's more correct since "push" is the operation to
propagate local view to remote, and uninteresting commits should have been
locally abandoned.
2023-10-17 15:06:03 +09:00
Yuya Nishihara
58897d79c7 git: extract push function that processes branches instead of git refs
Since pushed remote branches will share the common base targets with locals,
these branches should be marked as tracking. git::push_branches() will handle
that. It looks ugly that the public GitBranchPushTargets type keeps "force"-d
branches as a separate set, but we'll need to rework that anyway when we
implement --force-with-lease behavior. So let's leave it for now.

Some of the git::push_updates() tests have been migrated to the new function.
I left a couple of basic tests for git::push_updates() because push_updates()
will be used to implement a low-level "jj git push-refs" command.
2023-10-17 15:06:03 +09:00
Yuya Nishihara
57167cefda git: on import_refs(), don't abandon ancestors of newly fetched refs
I made import_refs() not preserve commits referenced by remote branches at
520f692a46 "git: on import_refs(), don't preserve old branches referenced by
remote refs." The idea is that remote branches are weak, and commits referenced
by these refs can be freely rewritten by future local changes without moving
the refs. I don't think that's wrong, but 520f692a46 also made "new" remote
changes be abandoned by old remote refs. This problem occurs only when
git.auto-local-branch is off.

I think there are two ways to fix the problem:
 a. pin non-tracking remote branches just like local refs
 b. pin newly fetched refs in addition to local refs
This patch implements (b) because it's simpler and more obvious that the
fetched commits would never be abandoned immediately.
2023-10-17 14:49:49 +09:00
Martin von Zweigbergk
c3b45b6fd1 workspace: make working-copy type customizable
This add support for custom `jj` binaries to use custom working-copy
backends. It works in the same way as with the other backends, i.e. we
write a `.jj/working_copy/type` file when the working copy is
initialized, and then we let that file control which implementation to
use (see previous commit).

I included an example of a (useless) working-copy implementation. I
hope we can figure out a way to test the examples some day.
2023-10-16 22:33:44 -07:00
Martin von Zweigbergk
6bfd618275 workspace: load working copy implementation dynamically
This makes `Workspace::load()` look a new `.jj/working_copy/type` file
in order to load the right working copy implementation, just like
`Repo::load()` picks the right backends based on `.jj/store/type`,
`.jj/op_store/type`, etc. We don't write the file yet, and we don't
have a way of adding alternative working copy implementations, so it
will always be `LocalWorkingCopy` for now.
2023-10-16 22:33:44 -07:00
Martin von Zweigbergk
e1f00d9426 working copy: pass commit instead of tree into check_out()
Our internal working copy implementations at Google will need the
commit so they can walk history backwards until they get to a "public"
commit. They'll then use that to tell build tools and virtual file
systems to present that as a base.

I'm not sure if we'll need to update `reset()` too. It's currently
only used by `jj untrack`, which doesn't change the commit's parent,
so it wouldn't affect any history walks.
2023-10-16 22:33:44 -07:00
Martin von Zweigbergk
7c8a0a18f9 repo: define types for backend initializer functions
`ReadonlyRepo::init()` takes callbacks for initializing each kind of
backend. We called these things like `op_store_initializer`. I found
that confusing because it is not a `OpStoreFactory` (which is for
loading an existing backend). This patch tries to clarify that by
renaming the arguments and adding types for each kind of callback
function.
2023-10-16 22:33:44 -07:00
Yuya Nishihara
4cd2518be0 git: on import_refs(), respect tracking state of existing remote refs
In this commit, new behavior is tested by using in-memory view data. Data
persistence and track/untrack commands will be implemented soon.
2023-10-16 23:21:05 +09:00
Yuya Nishihara
a697175674 view: add tracking state to RemoteRef
The state field isn't saved yet. git import/export code paths are migrated,
but new tracking state is always calculated based on git.auto-local-branch
setting. So the tracking state is effectively a global flag.

As we don't know whether the existing remote branches have been merged in to
local branches, we assume that remote branches are "tracking" so long as the
local counterparts exist. This means existing locally-deleted branch won't
be pushed without re-tracking it. I think it's rare to leave locally-deleted
branches for long. For "git.auto-local-branch = false" setup, users might have
to untrack branches if they've manually "merged" remote branches and want to
continue that workflow. I considered using git.auto-local-branch setting in the
migration path, but I don't think that would give a better result. The setting
may be toggled after the branches got merged, and I'm planning to change it
default off for better Git interop.

Implementation-wise, the state enum can be a simple bool. It's enum just
because I originally considered to pack "forgotten" concept into it. I have
no idea which will be better for future extension.
2023-10-16 23:21:05 +09:00
Martin von Zweigbergk
0582893144 working copy: return Box<dyn LockedWorkingCopy> from start_mutation() 2023-10-15 16:13:19 -07:00
Martin von Zweigbergk
580586d008 working copy: return Box<dyn WorkingCopy> from finish() 2023-10-15 16:13:19 -07:00
Martin von Zweigbergk
6a13fa8264 working copy: add tree_id() to backend trait
Looks like I missed this earlier. I think it makes sense to have on
all working copy implementations.
2023-10-15 16:13:19 -07:00
Martin von Zweigbergk
63654d064b working copy: add sparse pattern functions to backend trait 2023-10-15 15:59:49 -07:00
Martin von Zweigbergk
0d2247b0df working copy: add check_out() function to the backend trait
This includes documenting the new function and the other types moved
to the `working_copy` module.
2023-10-15 15:59:49 -07:00
Martin von Zweigbergk
781859cb51 working copy: add snapshot() function to the backend trait
This includes documenting the new function and the other types moved
to the `working_copy` module.
2023-10-15 15:59:49 -07:00
Martin von Zweigbergk
8826639e4e working copy: remove reference from locked instance to base instance
It's going to be easier to define a `LockedWorkingCopy` trait if it
doesn't need to borrow from `WorkingCopy`, so let's remove the
reference we currently have and have
`LockedLocalWorkingCopy::finish()` return the new `LocalWorkingCopy`
instead.

I think the main disadvantage is that we now have to remember to
replace the old `LocalWorkingCopy` instance by the new one, whereas
the compiler would remind us before this commit. We could make
`start_modification()` take an owned `self`, but that would be a bit
annoying to work with when we have the instance stored in a field.
2023-10-15 15:59:49 -07:00
Martin von Zweigbergk
c8184845d7 workspace: replace working_copy_mut() by wrapper type
I'm about to make `LockedLocalWorkingCopy` not borrow from
`LocalWorkingCopy`. That will make it easier to forget to update any
`LocalWorkingCopy` variables when the modifications have been
committed. This patch introduces a wrapper around
`LockedLocalWorkingCopy` to help prevent that.

Thanks to Yuya for the suggestion.
2023-10-15 15:59:49 -07:00
Martin von Zweigbergk
35f9c12cb5 working copy: move LocalWorkingCopy::check_out() to Workspace
`LocalWorkingCopy::check_out()` can be expressed using the planned
`WorkingCopy` trait, so it doesn't need to be in the trait itself
`WorkingCopy`. I wasn't sure if I should make it a free function in
`working_copy`, but I ended up moving it onto `Workspace`.
2023-10-15 15:59:49 -07:00
Martin von Zweigbergk
9fb5a4bcef tests: rename a repo1 variable to repo since there's only one
I think the `repo1` name was a leftover from when `ReadonlyRepo` had a
`WorkingCopy`.
2023-10-15 15:59:49 -07:00
Yuya Nishihara
8b46712bcb view: add stub method to set remote branch target and state, migrate callers
Since set_remote_branch_target() is called while merging refs, its tracking
state shouldn't be reinitialized. The other callers are migrated to new setter
to keep the story simple.
2023-10-16 05:12:19 +09:00
Yuya Nishihara
35ea607abd view: make get_remote_branch() return RemoteRef instead of RefTarget
I'm going to add tracking state to RemoteRef, and we should compare both
target and state in the tests.
2023-10-16 05:12:19 +09:00
Yuya Nishihara
ab1a6e2f71 view: make remote branches iterator yield RemoteRef instead of RefTarget
git::import_refs() will need to read RemoteRef's tracking state.
2023-10-16 05:12:19 +09:00