The index position is specific to the default index implementation and
we don't want to use it in outside of there. This commit removes the
use of it as a key for nodes in the graphlog.
I timed it on the git.git repo using `jj log -r 'all()' -T commit_id`
(the worst case I can think of) and it slowed down from ~2.02 s to
~2.20 s (~9%).
Since we hid the graph iterator implementation behind
`Revset::iter_graph()`, I don't think we have any callers of
`Revset::iter()` require the iteration to be in index position order,
so let's not promise that. We do want to promise that the iteration is
in topological order with children before parents, however.
I think requests to reset the author came up twice in the last week,
so let's just add support for it. I copied git's behavior of resetting
the name, email, and timestamp. The flag name is also from git.
We need 1.64 to bump `clap` to `4.1`. We don't really need to upgrade
to that, but being on an older version causes minor confusions like
#1393. Rust 1.64 is very close to 6 months old at this point.
For large repos, it's useful to be able to use shorter change id and
commit id prefixes by resolving the prefix in a limited subset of the
repo (typically the same subset that you'd want to see in your default
log output). For very large repos, like Google's internal one, the
shortest unique prefix evaluated within the whole repo is practically
useless because it's long enough that the user would want to copy and
paste it anyway.
Mercurial supports this with its `revisions.disambiguatewithin` config
(added in https://www.mercurial-scm.org/repo/hg/rev/503f936489dd). I'd
like to add the same feature to jj. Mercurial's implementation works
by attempting to resolve the prefix in the whole repo and then, if the
prefix was ambiguous, it resolves it in the configured subset
instead. The advantage of doing it that way is that there's no extra
cost of resolving the revset defining the subset if the prefix was not
ambiguous within the whole repo. However, there are two important
reasons to do it differently in jj:
* We support very large repos using custom backends, and it's probably
cheaper to resolve a prefix within the subset because it can all be
cached on the client. Resolving the prefix within the whole repo
requires a roundtrip to the server.
* We want to be able to resolve change id prefixes, which is always
done in *some* revset. That revset is currently `all()`, i.e. all
visible commits. Even on local disk, it's probably cheaper to
resolve a small revset first and then resolve the prefix within that
than it is to build up the index of all visible change ids.
We could achieve the goal by letting each revset engine respect the
configured subset, but since the solution proposed above makes sense
also for local-disk repos, I think it's better to do it outside of the
revset engine, so all revset engines can share the code.
This commit prepares for the new functionality by moving the symbol
resolution out of `Index::evaluate_revset()`.
The callers don't need to hold on to the revset expression once it's
been evaluated, and having an owned expression (well, an expression
with shared ownership) will avoid a clone in the next commit.
A mapped template is basically a combined function that takes context: &C,
extracts Vec<O>, and formats each item with Template<C>. It cannot be cleanly
turned into a function of (&C) -> Vec<Template<()>> type. So list-like methods
are implemented on Box<dyn ListTemplate<C>> instead.
I'm going to add a trait that provides .join() -> Box<dyn Template>.
wrap_template() should handle it transparently, but the current interface
would require excessive boxing.
This involves a little hack to insert a lambda parameter 'x' to be used at
keyword position. If the template language were dynamically typed (and were
interpreted), .map() implementation would be simpler. I considered that, but
interpreter version has its own warts (late error reporting, uneasy to cache
static object, etc.), and I don't think the current template engine is
complex enough to rewrite from scratch.
.map() returns template, which can't be join()-ed. This will be fixed later.
A lambda expression will be allowed only in .map() operation. The syntax is
borrowed from Rust closure.
In Mercurial, a map operation is implemented by context substitution. For
example, 'parents % "{node}"' prints parents[i].node for each. There are two
major problems: 1. the top-level context cannot be referred from the inner map
expression. 2. context of different types inserts arbitrarily-named keywords
(e.g. a dict type inserts "{key}" and "{value}", but how we could know.)
These issues should be avoided by using explicitly named parameters.
parents.map(|parent| parent.commit_id ++ " " ++ commit_id)
^^^^^^^^^ global keyword
A downside is that we can't reuse template fragment in map expression. Suppose
we have -T commit_summary, -T 'parents.map(commit_summary)' doesn't work.
# only usable as a top-level template
'commit_summary' = 'commit_id.short() ++ " " ++ description.first_line()'
Another problem is that a lambda expression might be confused with an alias
function.
# .map(f) doesn't work, but .map(g) does
'f(x)' = 'x'
'g' = '|x| x'
The `jj debug` commands are hidden from help and are described as
"Low-level commands not intended for users", but e.g. `jj debug
completion` is intended for users, and should be visible in the help
output.
By using one letter for the path type before and one letter for path
type after, we can encode much more information than just the current
'M'/'A'/'R'. In particular, we can indicate new and resolved
conflicts. The color still encodes the same information as before. The
output looks a bit weird after many years of using `hg status`. It's a
bit more similar to the `git status -s` format with one letter for the
index and one with the working copy. Will we get used to it and find
it useful?
I'm going to add a lambda expression, and the current type-error message
wouldn't work for the lambda type. I also renamed "argument" to "expression"
as the expect_<type>() helper may be called against any expression node.
This is similar to the structure of RevsetParseError. It's unlikely we would
need to discriminate parsing errors, so let's avoid wasting time on naming
things.
In templater, it's easier to handle invalid format string at parsing stage, so
I want to build formatting items upfront. Since the formatting items borrow
the input string by reference, we need to manually convert them to the owned
variants.
While measuring overhead of interpreter version of the template engine, I
noticed the templater spend some time in chrono. I don't think this would
matter in practice, but it's easy to cache the formatting items.
% jj log -r'all()' -T'".\n"' --no-graph | wc -l
2996
% hyperfine --warmup 3 --runs 20 "jj log --ignore-working-copy -r 'all()' -Tshow --no-graph"
(original)
Time (mean ± σ): 120.0 ms ± 18.7 ms [User: 97.5 ms, System: 22.5 ms]
Range (min … max): 96.7 ms … 144.1 ms 20 runs
(new)
Time (mean ± σ): 106.2 ms ± 12.3 ms [User: 86.1 ms, System: 20.1 ms]
Range (min … max): 96.3 ms … 130.4 ms 20 runs
Regarding the template engine rewrites, I'm yet sure that the interpreter
version is strictly better. It's simpler, but could make some caching story
difficult. So I'm not gonna replace the engine anytime soon.
We want to allow custom revset engines define their own graph
iterator. This commit helps with that by adding a
`Revset::iter_graph()` function that returns an abstract iterator.
The current `RevsetGraphIterator` can be configured to skip or include
transitive edges. It skips them by default and we don't expose option
in the CLI. I didn't bother including that functionality in the new
`iter_graph()` either. At least for now, it will be up to the
implementation whether it includes such edges (it would of course be
free to ignore the caller's request even if we added an option for it
in the API).
This commit adds an `evaluate_revset()` function to the `Index`
trait. It will require some further cleanup, but it already achieves
the goal of letting the index implementation decide which revset
engine to use.
We want to allow customization of the revset engine, so it can query
server indexes, for example. The current revset implementation will be
our default implementation for now. What's left in the `revset` module
after this commit is mostly parsing code.