While working on ancestor generation, I noticed Mercurial has this
substitution rule: since it's easier to deal with Ancestors() than
Range {}, 'roots..heads' is first decomposed to ':heads & ~:roots'.
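A hedged sketch of that substitution in code, assuming
RevsetExpression-style ancestors()/minus() helpers (the exact method
names and imports from the revset module are assumptions):

    use std::rc::Rc;

    // 'roots..heads' => ':heads & ~:roots'
    fn decompose_range(
        roots: &Rc<RevsetExpression>,
        heads: &Rc<RevsetExpression>,
    ) -> Rc<RevsetExpression> {
        heads.ancestors().minus(&roots.ancestors())
    }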
I failed to solve the type puzzle for to_predicate_fn<'a>(&'a self)
where 'repo: 'a, so struct RevWalkRevset<'repo, T> is bounded by T to
consume the lifetime parameter.
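The resulting shape looks roughly like this (a minimal sketch with a
stand-in IndexEntry type); the where clause on T is what keeps 'repo
from being reported as unused:

    use std::marker::PhantomData;

    struct IndexEntry<'repo>(PhantomData<&'repo ()>); // stand-in

    struct RevWalkRevset<'repo, T>
    where
        T: Iterator<Item = IndexEntry<'repo>>,
    {
        walk: T,
    }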
This will be a building block of the 'parents(base)' revset. For
example, 'base---' will be .filter_by_generation(3..4). I think
'ancestors(base)' can also take an optional generation parameter, but
I haven't considered any particular syntax yet.
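To illustrate what the generation filter means, here is a
self-contained model over plain (generation, id) pairs rather than
jj's RevWalk (the names and types are simplified assumptions):

    use std::ops::Range;

    // Keep only entries whose depth from the starting set falls in
    // the given range; 'base---' corresponds to the range 3..4.
    fn filter_by_generation(
        walk: impl Iterator<Item = (u32, u64)>, // (generation, id)
        range: Range<u32>,
    ) -> impl Iterator<Item = (u32, u64)> {
        walk.filter(move |(gen, _)| range.contains(gen))
    }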
Even though I couldn't determine whether RevWalkGenerationRange has a
measurable cost compared to RevWalk, I'm not comfortable with
enabling generation tracking by default, so this patch adds a
separate struct. I duplicated the Iterator::next() method as it
seemed rather complicated to extract a common iterator wrapper.
The actual filtering function and tests will be added in the next
commit.
We're more likely to filter out empty commits, so this should be slightly
faster in practice.
The extra Option<> isn't needed, but it should clarify that "prefix([])"
is not "everything".
This basically transforms 's1 & (f() | s2)' into
's1.iter().filter(all && f || s2)'. Even though the predicate part
includes "all", the filter function doesn't need to load commit data
for every entry, since 's1.iter().filter(all)' is tested first. To
optimize the "all" predicate out, maybe we can add a wrapper that
returns '|_: &IndexEntry| true'.
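A minimal, self-contained sketch of such predicate building blocks
(hypothetical helpers with simplified types, not the actual API),
where the "all" wrapper is the constant-true closure mentioned above:

    type Pred<'a, T> = Box<dyn Fn(&T) -> bool + 'a>;

    fn all_pred<'a, T>() -> Pred<'a, T> {
        Box::new(|_| true) // the '|_: &IndexEntry| true' wrapper
    }

    fn and<'a, T: 'a>(p1: Pred<'a, T>, p2: Pred<'a, T>) -> Pred<'a, T> {
        Box::new(move |e| p1(e) && p2(e))
    }

    fn or<'a, T: 'a>(p1: Pred<'a, T>, p2: Pred<'a, T>) -> Pred<'a, T> {
        Box::new(move |e| p1(e) || p2(e))
    }

With combinators like these, the predicate for 's1 & (f() | s2)'
would be built as or(and(all_pred(), f), s2_contains).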
Instead of inserting an AsFilter(_) node, I could add a recursive
is_filter() function. That would also work so long as the height of
the RevsetExpression tree is limited. I chose node insertion just for
ease of snapshot testing.
The idea behind this is to extend RevWalk to track the generation
(i.e. the depth from the initial wanted items). A basic DAG walk
doesn't need such data, but a query like 'rev---' could be translated
to a RevWalk yielding nth ancestors. The default log revset can also
be expressed as the 0/1-th ancestors of
'(remote_branches() | tags())..'.
Also, this appears to be faster than using boundary sets, based on
the bench extracted from test_index_commits_criss_cross().
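A self-contained sketch of the idea (not the actual RevWalk): walk a
DAG breadth-first while carrying the generation, so that 'rev---'
becomes "nodes reachable by a parent path of length exactly 3":

    use std::collections::{HashMap, HashSet, VecDeque};

    fn ancestors_at_generation(
        parents: &HashMap<u64, Vec<u64>>,
        start: u64,
        target_gen: u32,
    ) -> HashSet<u64> {
        let mut result = HashSet::new();
        let mut seen = HashSet::new(); // (id, generation) pairs walked
        let mut queue = VecDeque::from([(start, 0u32)]);
        while let Some((id, gen)) = queue.pop_front() {
            if !seen.insert((id, gen)) {
                continue;
            }
            if gen == target_gen {
                result.insert(id); // a path of length target_gen ends here
                continue; // no need to walk past the target generation
            }
            for &p in parents.get(&id).into_iter().flatten() {
                queue.push_back((p, gen + 1));
            }
        }
        result
    }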
In order to optimize a query like '(author(_) | @) & main..', we'll
probably need a predicate form of an iterable set so that the query
can be evaluated to '(main..).iter().filter(author(_) | @)'. And if a
predicate function can terminate the source iterator early (by
returning true/false/false_forever), the complexity of a filtered
revset is basically the same as an intersection of an iterator pair.
This means we can eventually merge IntersectionRevset with
FilterRevset.
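A hedged sketch of the true/false/false_forever idea (a hypothetical
enum, not something that exists at this point): the predicate can
signal that no later entry can match, letting a filtered iterator
stop early just like an intersection of sorted iterators would:

    enum PredicateResult {
        True,         // include this entry
        False,        // skip this entry
        FalseForever, // skip, and stop consuming the source iterator
    }

    fn filter_early_stop<T>(
        source: impl Iterator<Item = T>,
        mut pred: impl FnMut(&T) -> PredicateResult,
    ) -> impl Iterator<Item = T> {
        source
            .map(move |item| (pred(&item), item))
            .take_while(|(r, _)| !matches!(r, PredicateResult::FalseForever))
            .filter(|(r, _)| matches!(r, PredicateResult::True))
            .map(|(_, item)| item)
    }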
With that in mind, this patch removes the redundant 'candidates'
field from the filter node, which would otherwise appear in the
predicate function as 'candidates.contains(entry)'. A filter node
with candidates was somewhat useful while rewriting the tree, but
that can be dealt with by a view function like the
as_filter_intersection() added in this patch.
This also simplifies the subsequent filter transformation, as we no
longer need to test whether candidates == All.
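A self-contained sketch of the view-function idea with a drastically
simplified expression type (the real RevsetExpression is richer, and
the signature here is an assumption):

    enum Expr {
        All,
        Filter(&'static str), // predicate name stand-in
        Intersection(Box<Expr>, Box<Expr>),
    }

    // Views 'candidates & Filter(pred)' as a (candidates, predicate)
    // pair instead of storing candidates inside the filter node.
    fn as_filter_intersection(expr: &Expr) -> Option<(&Expr, &'static str)> {
        static ALL: Expr = Expr::All;
        match expr {
            Expr::Intersection(lhs, rhs) => match rhs.as_ref() {
                Expr::Filter(pred) => Some((lhs.as_ref(), *pred)),
                _ => None,
            },
            // A bare filter is implicitly 'all & filter', so callers
            // never need a 'candidates == All' special case.
            Expr::Filter(pred) => Some((&ALL, *pred)),
            _ => None,
        }
    }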
Previously we only had a test for left recursion. The added test
covers a right-recursion path, which would have caught the error I
made while working on the next "unary filter node" patch.
The doc comment summarizes what I'm going to implement. I'm not sure if
we'll add all of them because revset evaluation isn't the key performance
bottleneck at the moment. Anyway, I don't think any of these ideas would
logically conflict with segmented changelog adaptation unless we decide to
replace the whole revset stack with Eden/Sapling's.
To prevent git's GC from breaking a repo, we already add a git ref to
commits we create in the git backend. However, we don't add refs to
commits we import from git. This patch fixes that.
Closes #815.
There's no need to update our record of the ref if it didn't
change. This is just about making the code clearer; I doubt it will
have a measurable performance impact.
This patch adds a `legacy-thrift` Cargo feature that's enabled by
default. If it's disabled, the upgrade from the Thrift-based
operation log does not happen, and the `thrift` dependency is not
included.
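A rough sketch (with hypothetical function names) of how the upgrade
path can be compiled out when the feature is disabled:

    #[cfg(feature = "legacy-thrift")]
    fn upgrade_from_thrift(path: &std::path::Path) {
        // ...read the Thrift operation log and rewrite it as Protobuf...
        let _ = path;
    }

    fn open_op_store(path: &std::path::Path) {
        // With `legacy-thrift` disabled, this call and the `thrift`
        // dependency behind it are compiled out entirely.
        #[cfg(feature = "legacy-thrift")]
        upgrade_from_thrift(path);
        let _ = path;
        // ...then open the Protobuf-based store as usual...
    }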
With this patch, we auto-upgrade existing repos that use the Thrift
format for the operation log to the Protobuf format. That only
affects repos used with an unreleased version of jj after 0.5.1
(which may be the majority of repos?).
The upgrade from Thrift is simpler because we now use the same hashing
scheme for the Protobuf-based storage, so the operation and view IDs
remain the same as they were in the Thrift-based storage. We could
simplify the code a bit more as a result, but since this code is
supposed to be short-lived, I didn't bother.
Since the change from the Protobuf format with the old hashing scheme
to the (same) Protobuf format with the new hashing scheme shouldn't
impact users, I removed the entry we had in the changelog about the
format change.
The code for migrating from Protobuf to Thrift is almost completely
independent of which direction the upgrade goes, so we can very
easily reuse it for migrating from Thrift to Protobuf. This patch
renames some variables to "old/new" instead of "proto/thrift", making
the next patch even simpler.
Since we're now allowed to use the `protobuf` crate, I'm going to make
`SimpleOpStore` use it again. This moves the `ThriftOpStore` into a
new `legacy_thrift_op_store.rs` file.
Since we now have approval to use the `protobuf` crate at Google,
it's no longer the "legacy" format, so we should remove it. I'll
almost definitely soon add a `legacy-thrift` feature instead.
This refactors `conflicts.rs` to:
1. Make `describe_conflict` public.
2. Extract the functionality to create a text version of a conflict
   as the `materialize_merge_result` function.
3. Extract the functionality to turn a conflicted file into the
   complete contents of each version of the file, "added" or
   "removed" (when possible). This becomes the
   `extract_file_conflict_data` function.
This is useful in order to present these text versions in a merge
tool (see the sketch below).
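A hedged sketch of the extracted shapes with heavily simplified types
(the real functions take jj's store and conflict types; these
signatures are assumptions for illustration only):

    use std::io::Write;

    struct Conflict; // stand-in for the real conflict representation

    // Writes the conflict as text with conflict markers, suitable for
    // showing to the user or feeding to a merge tool.
    fn materialize_merge_result(
        _conflict: &Conflict,
        out: &mut impl Write,
    ) -> std::io::Result<()> {
        writeln!(out, "<<<<<<<")?;
        writeln!(out, ">>>>>>>") // simplified stand-in for real markers
    }

    // Recovers the complete contents of each "added"/"removed" side
    // of the file, when the conflict structure allows it.
    fn extract_file_conflict_data(
        _conflict: &Conflict,
    ) -> Option<(Vec<Vec<u8>>, Vec<Vec<u8>>)> {
        None // simplified: the real function reconstructs per-side contents
    }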
The function doesn't do much at all now and there's a single caller,
so let's inline it.
I tried to clean up the code a bit further so it wouldn't even create
the `old_view`, but it was harder than I had hoped. I might get back
to it later.
@yuja asked on #701 about the difference between the state in the
`git_export_view` and what we have in `mut_repo.view()`. It's true
that the branches in `mut_repo.view().git_refs()` should match what we
wrote to disk. We can therefore remove the on-disk storage and
simplify quite a bit. For now, I create the `last_export_view` from
the `mut_repo.view().git_refs()` before calling
`export_changes()`. I'll clean up a bit more next.
I think this is correct even considering e.g. undo. Let's consider
what would happen in a non-colocated Git repo (not because tricky
cases cannot happen there but because the explicit exports and imports
make it easier to discuss, and more cases can occur). If the user
moved a branch and then did `jj git export`, `jj undo`, and then `jj
git export` again, we would think on the second export that we should
perform the same changes to the Git repo, which should have no effect.
This patch also fixes the bug we were forced to work around in the
test case in the previous patch.
This removes one of our uses of Thrift.
This fixes the bugs shown by the tests added in the previous patch by
checking that the git branches we're about to update have not been
updated by git since our last export. If they have, we fail those
branches. The user can then re-import from the git repo and resolve
any conflicts before exporting again.
I had to update `test_export_import_sequence` to make it pass. That
revealed a new bug, which I'll fix next. The problem is that the
exported view doesn't get updated on import, so we would try to
export changes compared to an earlier export, even though we actually
knew (because of the `jj git import`) that the state in git had
changed.