I'm going to split RevsetError into symbol resolution and evaluation errors,
and evaluate_revset() is the only place where CLI crate explicitly specifies
RevsetError type.
I want to adjust --sample-size while running slow revset benchmarks. The
baseline option seems also useful.
Short options are derived from Criterion::configure_from_args().
If you set e.g.`ui.pager = 5`, we currently ignore that and fall back
to the default. It seems better to let the user know that their config
is invalid, as we generally do. This commit does that. With this
change, the error message will look like this:
```
Config error: Invalid `ui.pager`: data did not match any variant of untagged enum CommandNameAndArgs
```
Maybe the key name will be redundant once the `config` crate releases
a version including https://github.com/mehcode/config-rs/pull/413
(thanks, Yuya).
The `Repo` is a higher-level type that the index shouldn't have to
know about. With this change, a custom revset implementation should be
able evaluate the revset on a server without knowing which repo it
refers to.
We already pass a `CompositeIndex` to
`default_revset_engine::evaluate()` so let's use that wherever we
currently use `repo.index()`. That will help us remove the `repo`
argument, and it will also let us internal types (like `IndexEntry`)
in the index methods we call.
I'm about to replace the `&dyn Repo` argument by several smaller
types, and it's easier to collect those in a single context type than
to pass them separately as arguments.
I also moved `revset_for_commit_ids()` and `take_latest_revset()` onto
the new type because it was easy. `build_predicate_fn()` and
`has_diff_from_parent()` ran into some lifetime issue when I tried.
The `public_heads()` revset only contains the root commit in
practice. I'm not sure what we want to do about phases, but since we
don't have any real support for them yet, let's just remove this
revset. I didn't update the changelog because we don't seem to have
documented the revset function (and it seems unlikely that users who
found out about it found it useful enough to use it when they could
just use `root`).
The `ProtoOpStore` was separated out to simplify the migration from
Thrift. Now that the `ThriftOpStore` is gone, we can inline
`ProtoOpStore` as the TODO says.
The impact of not having configured one's name and email is not apparent from the warning message. Under the Toulmin model:
- Claim (implicit): You should configure your name and email.
- Grounds: Your name and email are not currently configured.
- Warrant (currently missing): Configuring your name and email will let you do...
Some benchmark numbers in the following order:
- original
- fresh repo, BatchSize::SmallInput
- fresh but index-preloaded repo, BatchSize::SmallInput
- fresh but index-preloaded repo, BatchSize::LargeInput
% cargo run --release --features bench -- bench revset ':main'
revsets/:main time: [271.49 µs 271.74 µs 272.07 µs]
revsets/:main time: [754.17 µs 758.22 µs 764.02 µs]
revsets/:main time: [367.11 µs 372.65 µs 381.99 µs]
revsets/:main time: [341.76 µs 342.98 µs 344.35 µs]
% cargo run --release --features bench -- bench revset 'author(martinvonz)'
revsets/author(martinvonz)
time: [767.43 µs 770.52 µs 775.59 µs]
revsets/author(martinvonz)
time: [31.960 ms 31.984 ms 32.011 ms]
revsets/author(martinvonz)
time: [31.478 ms 31.538 ms 31.615 ms]
revsets/author(martinvonz)
time: [31.503 ms 31.526 ms 31.550 ms]
I think the fresh but index-preloaded repo is close to the practical
evaluation environment. With BatchSize::SmallInput, it appears to consume
~600MB (RES) memory (compared to ~50MB in LargeInput.) I don't think that's
huge, but it might affect cache usage, so I chose LargeInput.
https://docs.rs/criterion/latest/criterion/enum.BatchSize.html
Inline diffs on multi-byte UTF-8 characters would match individual
bytes, causing garbled diffs in some cases. For example, replacing
`⊢` with `⊣`, which differ in the final byte only, caused the
diff to display a diff of the bytes instead the character.
This commit uses a workaround present in Mercurial by treating all
bytes 0x80 and above as word characters, causing any multi-byte
character to be treated as a word and not segmented.
https://www.mercurial-scm.org/repo/hg/file/6.3.3/mercurial/patch.py#l51
This command is similar to Mercurial's revset benchmarking command. It
lets you pass in a file containing revsets. I also included a file
with some revsets to test on the git.git repo. I put it in `testing/`,
which doesn't seem perfect. I'm happy to hear suggestions for better
places, or we can move it later if we find a better place.
Note that these tests don't clear caches between each run (or even
between tests), so revsets that rely on filtering commit data that's
not indexed appear faster than they typically are in reality.
I suspect the `jj bench walkrevs` command was from before we had
support for revsets. Now there doesn't seem to be any reason to have a
specific command for only range revsets (`foo..bar`), so let's replace
it by a command for benchmarking an arbitrary revset.
The `jj bench` commands are mostly meant for developers, so lets hide
the command from help and behind a `bench` feature flag. The feature
flags avoids bloating the binary with the `criterion` dependencies,
which was the reason I removed the command in 18c0b97d9d.
This just backs out commit 18c0b97d9d without making any changes,
except for resolving conflicts.
I want a way to benchmark different revsets on e.g. the Git Core repo
or the Linux repo.