Commit graph

132 commits

Author SHA1 Message Date
Yuya Nishihara
6c2525cb93 revset: add "resolve" method to RevsetExpression, always call it
I'll make the resolution stage mandatory, and have it return a "resolved"
type. RevsetExpression::evaluate() will be moved to the "resolved" type.
2023-04-10 00:39:58 +09:00
Yuya Nishihara
adfd52445b revset: reimplement children to not scan visible ancestors twice
It's slightly faster, and removes the use of RevsetExpression::descendants()
API.
2023-04-08 12:13:30 +09:00
Yuya Nishihara
85fb1f74c3 revset: for roots:heads, terminate ancestor lookup at min(roots) 2023-04-08 12:13:30 +09:00
Martin von Zweigbergk
24a512683b revset: add a revset function for finding commits with conflicts
This adds `conflict()` revset that selects commits with conflicts. We
may want to extend it later to consider only conflicts at certain
paths.
2023-04-06 16:46:21 -07:00
Martin von Zweigbergk
e1c57338a1 revset: split out no-args head() to visible_heads()
The `heads()` revset function with one argument is the counterpart to
`roots()`. Without arguments, it returns the visible heads in the
repo, i.e. `heads(all())`. The two use cases are quite different, and
I think it would be good to clarify that the no-arg form returns the
visible heads, so let's split that out to a new `visible_heads()`
function.
2023-04-03 23:46:34 -07:00
Yuya Nishihara
982062bd75 revset: do not always evaluate filter node to InternalRevset
This basically removes hidden 'all() &' from union/negation of filters. To
achieve that, I have two options: 1. add separate evaluation path (like the
one this commit introduced), or 2. wrap "all()" revset to override predicate
as Box::new(|_| true) function. I took the former since it's less ad-hoc.

We can add an explicit RevsetExpression node to branch between evaluate()
and evaluate_predicate(), but I don't think it would simplify the
implementation at this point. We might need such node if we want to resolve
"all()" at resolve_symbols(). It might be even better to extract a subset of
RevsetExpression enum, which only contains evaluatable nodes.

The cost of 'all() &' isn't significant for most filters. '~merges()' is
the exception. For jj repo,

    revsets/:v0.3.0 & (author(martinvonz) | committer(martinvonz))
    --------------------------------------------------------------
    base     1.06      11.2±0.04m
    new      1.00      10.5±0.05m

    revsets/~merges()
    -----------------
    base     1.69     750.0±8.47µ
    new      1.00     444.1±3.50µ
2023-04-04 15:21:21 +09:00
Yuya Nishihara
c28d2d7784 revset: split RevsetError into RevsetResolution/EvaluationError
This makes it clear that RevsetExpression::Present node is noop at the
evaluation stage.

RevsetEvaluationError::StoreError is unused right now, but I'm not sure if
it should be removed. It makes some sense that evaluate() can propagate
StoreError as it has access to the store.
2023-04-03 10:55:03 +09:00
Martin von Zweigbergk
3ff1ab520b revset: remove public_heads()
The `public_heads()` revset only contains the root commit in
practice. I'm not sure what we want to do about phases, but since we
don't have any real support for them yet, let's just remove this
revset. I didn't update the changelog because we don't seem to have
documented the revset function (and it seems unlikely that users who
found out about it found it useful enough to use it when they could
just use `root`).
2023-03-30 20:15:45 -07:00
Yuya Nishihara
0532301e03 revset: add latest(candidates, count) predicate
This serves the role of limit() in Mercurial. Since revsets in JJ is
(conceptually) an unordered set, a "limit" predicate should define its
ordering criteria. That's why the added predicate is named as "latest".

Closes #1110
2023-03-25 23:48:50 +09:00
Martin von Zweigbergk
75605e36af revset: iterate over commit ids instead of index entries
There are no remaining places where we iterate over a revset and need
the `IndexEntry`s, so we can now make `Revset::iter()` yield
`CommitId`s instead.
2023-03-23 21:58:15 -07:00
Martin von Zweigbergk
27a7fccefa revset: add a method returning a change id index
One of the remaining places we depend on index positions is when
creating a `ChangeIdIndex`. This moves that into the revset engine
(which is coupled to the commit index implementation) by adding a
`Revset::change_id_index()` method. We will also use this function
later when add support for resolving change id prefixes within a small
revset.

The current implementation simply creates an in-memory index using the
existing `IdIndex` we have in `repo.rs`.

The custom implementation at Google might do the same for small
revsets that are available on the client, but for revsets involving
many commits on the server, it might use a suboptimmal implementation
that uses longer-than-necessary prefixes for performance reasons. That
can be done by querying a server-side index including changes not in
the revset, and then verifying that the resulting commits are actually
in the revset.
2023-03-23 20:49:15 -07:00
Martin von Zweigbergk
d3cf543abc revset: move revset_for_commits() to test
The function is only used in tests, so it doesn't belong in
`default_revset_engine`. Also, it's not specific to that
implementation, so I rewrote as a revset evaluation.
2023-03-23 04:50:33 -07:00
Martin von Zweigbergk
5f74dd5db3 repo: implement Repo on ReadonlyRepo instead of its Arc
I'd like to be able to pass a `self` of `type `&ReadonlyRepo` to
functions that take a `&dyn Repo`. For that, we need `ReadonlyRepo`
itself to implement `Repo` instead of having `Arc<ReadonlyRepo>`
implement it. I could have solved it in a different way, but the `Arc`
requirement seems like an unnecessary constraint.
2023-03-21 21:43:44 -07:00
Martin von Zweigbergk
01d7239732 revset: make graph iterator yield commit ids (not index entries)
We only need `CommitId`s, and `IndexEntry` is specific to the default
index implementation.
2023-03-20 01:45:54 -07:00
Martin von Zweigbergk
2f876861ae graphlog: key by commit id (not index position)
The index position is specific to the default index implementation and
we don't want to use it in outside of there. This commit removes the
use of it as a key for nodes in the graphlog.

I timed it on the git.git repo using `jj log -r 'all()' -T commit_id`
(the worst case I can think of) and it slowed down from ~2.02 s to
~2.20 s (~9%).
2023-03-20 01:45:54 -07:00
Martin von Zweigbergk
70d4a0f42e revset: remove context parameter from evaluate()
The `RevsetWorkspaceContext` argument is now instead used by the new
`resolve_symbol()` function.
2023-03-17 22:42:41 -07:00
Martin von Zweigbergk
d971148e4e revset: move resolve_symbol() back to revset module
The only caller is now in `revset.rs`.
2023-03-17 22:42:41 -07:00
Martin von Zweigbergk
94aec90bee revset: resolve symbols earlier, before passing to revset engine
For large repos, it's useful to be able to use shorter change id and
commit id prefixes by resolving the prefix in a limited subset of the
repo (typically the same subset that you'd want to see in your default
log output). For very large repos, like Google's internal one, the
shortest unique prefix evaluated within the whole repo is practically
useless because it's long enough that the user would want to copy and
paste it anyway.

Mercurial supports this with its `revisions.disambiguatewithin` config
(added in https://www.mercurial-scm.org/repo/hg/rev/503f936489dd). I'd
like to add the same feature to jj. Mercurial's implementation works
by attempting to resolve the prefix in the whole repo and then, if the
prefix was ambiguous, it resolves it in the configured subset
instead. The advantage of doing it that way is that there's no extra
cost of resolving the revset defining the subset if the prefix was not
ambiguous within the whole repo. However, there are two important
reasons to do it differently in jj:

* We support very large repos using custom backends, and it's probably
  cheaper to resolve a prefix within the subset because it can all be
  cached on the client. Resolving the prefix within the whole repo
  requires a roundtrip to the server.

* We want to be able to resolve change id prefixes, which is always
  done in *some* revset. That revset is currently `all()`, i.e. all
  visible commits. Even on local disk, it's probably cheaper to
  resolve a small revset first and then resolve the prefix within that
  than it is to build up the index of all visible change ids.

We could achieve the goal by letting each revset engine respect the
configured subset, but since the solution proposed above makes sense
also for local-disk repos, I think it's better to do it outside of the
revset engine, so all revset engines can share the code.

This commit prepares for the new functionality by moving the symbol
resolution out of `Index::evaluate_revset()`.
2023-03-17 22:42:41 -07:00
Martin von Zweigbergk
3871efd2f9 revset: move ReverseRevsetGraphIterator into revset module
The iterator is not specific to the implementation in
`revset_graph_iterator`, so it belongs in the standard `revset`
module.
2023-03-14 05:32:02 -07:00
Martin von Zweigbergk
eed0b23009 revset: move current implementation to new module
We want to allow customization of the revset engine, so it can query
server indexes, for example. The current revset implementation will be
our default implementation for now. What's left in the `revset` module
after this commit is mostly parsing code.
2023-03-14 05:32:02 -07:00
Martin von Zweigbergk
ada48c6f71 revset: rename file() test for consistency
This should have been part of bbd6ef0c7b.
2023-03-13 07:20:35 -07:00
Martin von Zweigbergk
bbd6ef0c7b revset: remove filter_by_diff(), have caller intersect expression
To be able to make e.g. `jj log some/path` perform well on cloud-based
repos, a custom revset engine needs to be able to see the paths to
filter by. That way it is able pass those to a server-side index. This
commit helps with that by effectively converting `jj log -r foo
some/path` into `jj log -r 'foo & file(some/path)'`.
2023-02-28 17:45:34 -08:00
Martin von Zweigbergk
bc9f66dad3 revset: replace RevsetIterator wrapper by extension
The type doesn't seem to provide any benefit. I don't think I had a
good reason for creating it in the first place; it was probably just
unfamiliarity with Rust.
2023-02-19 21:37:26 -08:00
Martin von Zweigbergk
d8997999f2 repo: replace RepoRef by Repo trait 2023-02-15 19:15:17 -08:00
Martin von Zweigbergk
f6a4cb57da repo: extract a Repo trait for Arc<ReadonlyRepo> and MutableRepo
This will soon replace the `RepoRef` enum, just like how the `Index`
trait replaced the `IndexRef` enum.
2023-02-15 19:15:17 -08:00
Martin von Zweigbergk
9261bfe5fc revset: resolve change ids only using the new hex digits
Now that we use the new hex digits when we display change ids, we no
longer need to be able to resolve the old (conventional) digits.
2023-02-13 22:49:21 -08:00
Martin von Zweigbergk
39640cc288 revset: allow resolving change id using hex digits from reverse alphabet
By separating the value spaces change ids and commit ids, we can
simplify lookup of a prefix. For example, if we know that a prefix is
for a change id, we don't have to try to find matching commit ids. I
think it might also help new users more quickly understand that change
ids are not commit ids.

This commit is a step towards that separation. It allows resolving
change ids by using hex digits from the back of the alphabet instead
of 0-f, so 'z'='0', 'y'='1', etc, and 'k'='f'. Thanks to @ilyagr for
the idea. The regular hex digits are still allowed.
2023-02-13 22:49:21 -08:00
Martin von Zweigbergk
4e8fbaa210 git: allow conflicts in "HEAD@git"
Git's HEAD ref is similar to other refs and can logically have
conflicts just like the other refs in `git_refs`. As with the other
refs, it can happen if you run concurrent commands importing two
different updates from Git. So let's treat `git_head` the same as
`git_refs` by making it an `Option<RefTarget>`.
2023-01-30 09:05:03 -08:00
Glen Choo
3418c8ff73 git: add git.auto-local-branch
Add a new git.auto-local-branch config option. When set to false, a
remote-tracking branch imported from Git will not automatically create a
local branch target. This is implemented by a new GitSettings struct
that passes Git-related settings from UserSettings.

This behavior is particularly useful in a co-located jj and Git repo,
because a Git remote might have branches that are not of everyday
interest to the user, so it does not make sense to export them as local
branches in Git. E.g. https://github.com/gitster/git, the maintainer's
fork of Git, has 379 branches, most of which are topic branches kept
around for historical reasons, and Git developers wouldn't be expected
to have local branches for each remote-tracking branch.
2023-01-29 20:17:49 -08:00
Yuya Nishihara
824f2106fd repo: migrate revset::resolve_change_id() to use IdIndex for ReadonlyRepo
The MutableRepo implementation is the same as before.
2023-01-26 14:10:26 +09:00
Martin von Zweigbergk
10725c095f cleanup: update more "checkout" to "working-copy commit" and similar
I've preferred "working-copy commit" over "checkout" for a while
because I think it's clearer, but there were lots of places still
using "checkout". I've left "checkout" in places where it refers to
the action of updating the working copy or the working-copy commit.
2023-01-25 11:02:59 -08:00
Vamsi Avula
60d1537731 let branches and remote_branches revset functions take needles as arguments
- branches has the signature branches([needle]), meaning the needle is optional (branches() is equivalent to branches("")) and it matches all branches whose name contains needle as a substring
- remote_branches has the signature remote_branches([branch_needle[, remote_needle]]), meaning it can be called with no arguments, or one argument (in which case, it's similar to branches), or two arguments where the first argument matches branch names and the second argument matches remote names (similar to branches, remote_branches(), remote_branches("") and remote_branches("", "") are all equivalent)
2023-01-16 12:15:30 +05:30
Ilya Grigoriev
aef0801917 Fix random seed in all tests that use testutils::user_settings
This doesn't change any tests, but could prevent change ids
randomly matching commit id prefixes from causing tests
to fail in the future.

Follows up on bbd49cdf29, https://github.com/martinvonz/jj/issues/1024
and https://github.com/martinvonz/jj/pull/1033.
2023-01-15 10:15:44 -08:00
Yuya Nishihara
bbd49cdf29 tests: stabilize change id in test_resolve_symbol_commit_id()
Hopefully fixes #1024.
2023-01-14 23:49:16 +09:00
Waleed Khan
e299963fae backend: remove PartialEq/Eq implementations
As soon as we start tracking the `#[source]` for error variants, we won't be able to rely on the presence of `Eq` implementations.
2023-01-02 12:28:51 -06:00
Waleed Khan
7f8a196ab2 backend: create ObjectId trait
This lets us operate over various kinds of objects polymorphically (e.g. call `.hex()` on any kind of object hash).
2023-01-02 12:28:51 -06:00
Yuya Nishihara
996ac22921 matchers: simplify FilesMatcher::new() to take slice of paths 2022-12-30 14:15:27 +09:00
Martin von Zweigbergk
812ef97adb repo: add MutableRepo::new_commit() returning CommitBuilder
Since `CommitBuilder` now has a reference to `MutableRepo`, it's
convenient to create instances of it by calling a method on
`MutableRepo`.
2022-12-26 23:30:52 -08:00
Martin von Zweigbergk
f3208f59c4 store: propagate error from Backend::write_commit() 2022-12-26 23:30:52 -08:00
Martin von Zweigbergk
f1d7bbe508 testutils: create a function for writing a random commit to MutableRepo
We already have `create_random_commit()`, which returns a
`CommitBuilder`. Most callers directly write that to a
`MutableRepo`. That currently returns a `Commit`, but I'm about to
make it propagate errors from the backend. That would add an
`unwrap()` to this sequence, making it longer. Let's create a simple
helper for these callers to simplify this common pattern.
2022-12-26 23:30:52 -08:00
Martin von Zweigbergk
49b2f3b6ca commit_builder: keep MutableRepo reference
When you're done with the `CommitBuilder`, you're going to have to
call `write_to_repo()`, passing it a mutable `MutableRepo`
reference. It's a bit simpler to pass that reference when we create
the `CommitBuilder` instead, so that's what this patch does.

A drawback of passing in the mutable reference when we create the
builder is that we can't have multiple unfinished `CommitBuilder`
instance live at the same time. We don't have any such use cases yet,
and it's not hard to work around them, so I think this change is worth
it.
2022-12-26 23:30:52 -08:00
Yuya Nishihara
986649623d commit_builder: relax type of description parameter
The next commit will introduce a newtype for -m/--message argument which
can be converted Into<String>.

Since CommitBuilder is a thin wrapper, code bloat caused by generic parameters
wouldn't matter. I have another set of commits that makes all builder methods
accept Into/IntoIterator, which will remove some of .clone() calls from tests.
2022-12-22 14:59:03 +09:00
Martin von Zweigbergk
7f9a0a2820 cleanup: let new Clippy move variables into format strings
I ran an upgraded Clippy on the codebase. All the changes seem to be
about using variables directly in format strings instead of passing
them as separate arguments.
2022-12-14 21:30:58 -08:00
Yuya Nishihara
6237f3cdfd revset: fold nested parents expressions
Some other ancestors() expressions can also be substituted. Practically,
this is the rule to fold repeated '-' operators to evaluate them lazily.
2022-12-13 15:55:18 +09:00
Yuya Nishihara
951eb0b61a revset: use filter intersection for tree containing filter
This basically transforms 's1 & (f() | s2)' to
's1.iter().filter(all && f || s2)'. Still the predicate part includes "all",
the filter function doesn't need to load commit data for every entry since
's1.iter().filter(all)' is tested first. To optimize "all" predicate out,
maybe we can add a wrapper that returns '|_: &IndexEntry| true'.

Instead of inserting AsFilter(_) node, I could add a recursive is_filter()
function. That would also work so long as the height of RevsetExpression tree
is limited. I chose node insertion just for ease of snapshot testing.
2022-12-07 11:01:59 +09:00
Yuya Nishihara
48d10d648c revset: add unary negate (or set complement) operator '~y'
Because a unary negation node '~y' is more primitive than the corresponding
difference node 'x~y', '~y' is easier to deal with while rewriting the tree.
That's the main reason to add RevsetExpression::NotIn node.

As we have a NotIn node, it makes sense to add an operator for that. This
patch reuses '~' token, which I feel intuitive since the other set operators
looks like bitwise ops. Another option is '!'.

The unary '~' operator has the highest precedence among the set operators,
but they are lower than the ranges. This might be counter intuitive, but
useful because a prefix range ':x' can be negated without parens.

Maybe we can remove the redundant infix operator 'x ~ y', but it isn't
decided yet.
2022-11-29 15:46:15 +09:00
Martin von Zweigbergk
d8feed9be4 copyright: change from "Google LLC" to "The Jujutsu Authors"
Let's acknowledge everyone's contributions by replacing "Google LLC"
in the copyright header by "The Jujutsu Authors". If I understand
correctly, it won't have any legal effect, but maybe it still helps
reduce concerns from contributors (though I haven't heard any
concerns).

Google employees can read about Google's policy at
go/releasing/contributions#copyright.
2022-11-28 06:05:45 -10:00
Yuya Nishihara
e40c041384 revset: merge AmbiguousChange/CommitIdPrefix error into one
Follows up c5ed3e1477. Now change/commit ids are resolved at the same
precedence, which means there are at least three types of ambiguity.
I don't think we would need to discriminate these.
2022-11-28 22:49:07 +09:00
Yuya Nishihara
c5ed3e1477 revset: for short hash, look up both commit and change ids to disambiguate
Because the use of the change id is recommended, any operation should abort
if a valid change id happens to match a commit id. We still try the commit
id lookup first as the change id lookup is more costly.

Ambiguous change/commit id is reported as AmbiguousCommitIdPrefix for now.
Maybe we can merge AmbiguousCommit/ChangeIdPrefix errors into one?

Closes #799
2022-11-28 17:30:53 +09:00
Yuya Nishihara
7632466cc0 revset: add table of symbol aliases and pass around parse functions
The CLI will load aliases from config, insert them one by one, and warn if
declaration part is invalid. That's why RevsetAliasesMap is a public struct
and needs to be instantiated by the caller.
2022-11-27 20:12:22 +09:00