Commit graph

1285 commits

Author SHA1 Message Date
Yuya Nishihara
5dd99db250 revset: make evaluation helper not create trait object eagerly
We wouldn't care for the cost of virtual dispatch at this level, but I
think a concrete struct type is easier to deal with than trait object.
2023-04-08 12:13:30 +09:00
Yuya Nishihara
85fb1f74c3 revset: for roots:heads, terminate ancestor lookup at min(roots) 2023-04-08 12:13:30 +09:00
Yuya Nishihara
ddff089286 revset: do not evaluate roots() candidates three times 2023-04-08 12:13:30 +09:00
Yuya Nishihara
eef6a77aa4 revset: reuse reachable dag-range set to calculate roots
This also removes the use of RevsetExpression::connected() API from the
evaluation engine.
2023-04-08 12:13:30 +09:00
Yuya Nishihara
20aa31336e revset: extract dag-range calculation to function
The returned reachable set can be reused to calculate roots() expression.
2023-04-08 12:13:30 +09:00
Yuya Nishihara
7dc35b82b0 revset: evaluate ancestors without using RevsetExpression builder API
I'm thinking of transforming RevsetExpression to an enum dedicated for
the evaluation stage. To help the migration, I want to remove the use of
the RevsetExpression builder API from the evaluation engine.

Fewer virtual dispatches are also better.
2023-04-08 12:13:30 +09:00
Martin von Zweigbergk
24a512683b revset: add a revset function for finding commits with conflicts
This adds `conflict()` revset that selects commits with conflicts. We
may want to extend it later to consider only conflicts at certain
paths.
2023-04-06 16:46:21 -07:00
Yuya Nishihara
308a5b9eae revset: make empty()/file(".") not load root tree for linear history
TreeDiffIterator wouldn't load identical subtrees, but it's up to caller to
optimize out the root tree loading.
2023-04-05 21:53:24 +09:00
Martin von Zweigbergk
e1c57338a1 revset: split out no-args head() to visible_heads()
The `heads()` revset function with one argument is the counterpart to
`roots()`. Without arguments, it returns the visible heads in the
repo, i.e. `heads(all())`. The two use cases are quite different, and
I think it would be good to clarify that the no-arg form returns the
visible heads, so let's split that out to a new `visible_heads()`
function.
2023-04-03 23:46:34 -07:00
Yuya Nishihara
982062bd75 revset: do not always evaluate filter node to InternalRevset
This basically removes hidden 'all() &' from union/negation of filters. To
achieve that, I have two options: 1. add separate evaluation path (like the
one this commit introduced), or 2. wrap "all()" revset to override predicate
as Box::new(|_| true) function. I took the former since it's less ad-hoc.

We can add an explicit RevsetExpression node to branch between evaluate()
and evaluate_predicate(), but I don't think it would simplify the
implementation at this point. We might need such node if we want to resolve
"all()" at resolve_symbols(). It might be even better to extract a subset of
RevsetExpression enum, which only contains evaluatable nodes.

The cost of 'all() &' isn't significant for most filters. '~merges()' is
the exception. For jj repo,

    revsets/:v0.3.0 & (author(martinvonz) | committer(martinvonz))
    --------------------------------------------------------------
    base     1.06      11.2±0.04m
    new      1.00      10.5±0.05m

    revsets/~merges()
    -----------------
    base     1.69     750.0±8.47µ
    new      1.00     444.1±3.50µ
2023-04-04 15:21:21 +09:00
Yuya Nishihara
69794f2585 revset: add method to upcast InternalRevset to ToPredicateFn 2023-04-04 15:21:21 +09:00
Yuya Nishihara
426f3e4e0a revset: simplify evaluation of "all()"
I think this is more readable, and apparently it produces slightly better code
maybe because the compiler can determine that there are no unwanted markers.
2023-04-04 15:21:21 +09:00
Yuya Nishihara
0bfdbcaa1e revset: don't rewrite '~set & filter' as difference
Since filter is slow in general, its input set should be minimized. This has
measurable impact on artificial query like '~(v0.4.0..) & author(_)'. If it
were evaluated as a difference of sets, all commits would have to be loaded.
2023-04-04 15:21:21 +09:00
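The effect described in the commit above can be illustrated with a minimal sketch. It is not the jj evaluation engine: commits are modelled as plain `u32` ids, and `narrow_then_filter` and `filter_then_subtract` are hypothetical names. The point is only that the number of (slow) filter calls depends on the evaluation order.

```rust
use std::collections::HashSet;

/// Evaluates `~excluded & filter` by narrowing the candidates first and
/// running the expensive filter only on what is left.
fn narrow_then_filter(
    all: &[u32],
    excluded: &HashSet<u32>,
    mut filter: impl FnMut(u32) -> bool,
) -> Vec<u32> {
    all.iter()
        .copied()
        .filter(|id| !excluded.contains(id))
        .filter(|id| filter(*id))
        .collect()
}

/// Evaluates the same query as a set difference: the filter runs on every
/// commit, and the excluded ones are dropped afterwards.
fn filter_then_subtract(
    all: &[u32],
    excluded: &HashSet<u32>,
    mut filter: impl FnMut(u32) -> bool,
) -> Vec<u32> {
    let matched: Vec<u32> = all.iter().copied().filter(|id| filter(*id)).collect();
    matched
        .into_iter()
        .filter(|id| !excluded.contains(id))
        .collect()
}
```

Both functions return the same ids, but the second one calls the filter once per commit in `all`, which corresponds to loading every commit.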
Yuya Nishihara
3927c01d08 revset: make error type opaque to try_transform_expression()
It no longer handles RevsetResolutionError.
2023-04-03 10:55:03 +09:00
Yuya Nishihara
f1e2d19d57 revset: fully consume Present(_) node by resolve_symbols()
Since resolve_symbols() now removes Present(_) node, it makes sense to
handle symbol resolution error there. That's why I added a "pre" callback
to try_transform_expression().

Perhaps "operation" scope (#1283) can be implemented in a similar way
(but it would somehow need to resolve the operation id and call repo.reload_at(op)).
2023-04-03 10:55:03 +09:00
Yuya Nishihara
aeb93c7591 revset: insert pre-order callback that can terminate transformation early
This will be a hook for resolve_symbols() to transform Present(_) subtree.
2023-04-03 10:55:03 +09:00
Yuya Nishihara
feaad6b5fa revset: add type alias for Option<Rc<RevsetExpression>>
I'm going to parameterize error type of TransformResult, and the result type
will be replaced with Result<TransformedExpression, E>.
2023-04-03 10:55:03 +09:00
Yuya Nishihara
c28d2d7784 revset: split RevsetError into RevsetResolution/EvaluationError
This makes it clear that RevsetExpression::Present node is noop at the
evaluation stage.

RevsetEvaluationError::StoreError is unused right now, but I'm not sure if
it should be removed. It makes some sense that evaluate() can propagate
StoreError as it has access to the store.
2023-04-03 10:55:03 +09:00
Yuya Nishihara
429562ca2f revset: implement Debug for RevsetImpl and add trait bound accordingly 2023-04-02 22:54:46 +09:00
Yuya Nishihara
2aab6c7825 revset: implement Debug for InternalRevset objects
Even though predicate function and RevWalk internals can't be debug printed,
it's useful to see an overview of InternalRevset tree.
2023-04-02 22:54:46 +09:00
Yuya Nishihara
b297b7c965 revset: turn PurePredicateFn into newtype struct
This will avoid extra boxing when converting PurePredicateFn to dyn
ToPredicateFn object.
2023-04-02 22:54:46 +09:00
Yuya Nishihara
fbb292f7c9 revset: relax lifetime bound of ToPredicateFn
We don't have to require that the input IndexEntry<'_> has 'index lifetime.
2023-04-02 22:54:46 +09:00
Yuya Nishihara
2404dc8cd3 revset: remove redundant type bound from RevWalkRevset
The "'index: 'a" bound can be removed by bypassing the Box<dyn> indirection
of self.iter().
2023-04-02 22:54:46 +09:00
Ilya Grigoriev
a58af4f19d Work around a couple of false positives for recent nightly clippy
This is likely https://github.com/rust-lang/rust-clippy/issues/10577
2023-04-01 18:35:38 -07:00
Martin von Zweigbergk
3546cc1bf6 revset: pass in store, index, and heads instead of whole Repo
The `Repo` is a higher-level type that the index shouldn't have to
know about. With this change, a custom revset implementation should be
able to evaluate the revset on a server without knowing which repo it
refers to.
2023-03-30 20:15:45 -07:00
Martin von Zweigbergk
9f9e356f3d revset: use the default index impl more in default revset engine
We already pass a `CompositeIndex` to
`default_revset_engine::evaluate()` so let's use that wherever we
currently use `repo.index()`. That will help us remove the `repo`
argument, and it will also let us use internal types (like `IndexEntry`)
in the index methods we call.
2023-03-30 20:15:45 -07:00
Martin von Zweigbergk
002ec1ac68 revset: move internal_evaluate() onto new context type
I'm about to replace the `&dyn Repo` argument by several smaller
types, and it's easier to collect those in a single context type than
to pass them separately as arguments.

I also moved `revset_for_commit_ids()` and `take_latest_revset()` onto
the new type because it was easy. `build_predicate_fn()` and
`has_diff_from_parent()` ran into some lifetime issue when I tried.
2023-03-30 20:15:45 -07:00
Martin von Zweigbergk
3ff1ab520b revset: remove public_heads()
The `public_heads()` revset only contains the root commit in
practice. I'm not sure what we want to do about phases, but since we
don't have any real support for them yet, let's just remove this
revset. I didn't update the changelog because we don't seem to have
documented the revset function (and it seems unlikely that users who
found out about it found it useful enough to use it when they could
just use `root`).
2023-03-30 20:15:45 -07:00
Martin von Zweigbergk
2a3d402d0c revset: also resolve branches(), tags(), etc. when resolving symbols
This is another step towards removing the `Repo` argument from
`Index::evaluate_revset()`.
2023-03-30 20:15:45 -07:00
Martin von Zweigbergk
6643fb2bff op_store: inline ProtoOpStore into SimpleOpStore
The `ProtoOpStore` was separated out to simplify the migration from
Thrift. Now that the `ThriftOpStore` is gone, we can inline
`ProtoOpStore` as the TODO says.
2023-03-30 20:00:33 -07:00
Martin von Zweigbergk
68fb46b2af op_store: drop support for upgrading from Thrift implementation 2023-03-30 20:00:33 -07:00
B Wilson
01a9ce0c71 diff: Treat multi-byte UTF-8 runes as word characters
Inline diffs on multi-byte UTF-8 characters would match individual
bytes, causing garbled diffs in some cases. For example, replacing
`⊢` with `⊣`, which differ in the final byte only, caused the
diff to display a diff of the bytes instead of the character.

This commit uses a workaround present in Mercurial by treating all
bytes 0x80 and above as word characters, causing any multi-byte
character to be treated as a word and not segmented.

https://www.mercurial-scm.org/repo/hg/file/6.3.3/mercurial/patch.py#l51
2023-03-30 00:06:56 +09:00
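The workaround described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the actual jj diff code: `is_word_byte` and `split_words` are hypothetical names, and the choice of ASCII alphanumerics plus `_` as the other word bytes is an assumption.

```rust
/// Returns true if the byte should be treated as part of a word.
/// Bytes >= 0x80 only occur inside multi-byte UTF-8 sequences, so
/// treating them as word bytes keeps such characters in one piece.
/// (Assumption: ASCII alphanumerics and `_` are the other word bytes.)
fn is_word_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_' || b >= 0x80
}

/// Splits `text` into runs of word bytes and single non-word bytes.
fn split_words(text: &[u8]) -> Vec<&[u8]> {
    let mut tokens = Vec::new();
    let mut start = 0;
    while start < text.len() {
        let mut end = start + 1;
        if is_word_byte(text[start]) {
            // Extend the token over the whole run of word bytes.
            while end < text.len() && is_word_byte(text[end]) {
                end += 1;
            }
        }
        tokens.push(&text[start..end]);
        start = end;
    }
    tokens
}
```

With this tokenization, `⊢` and `⊣` each form a single three-byte token, so a word-level diff compares the whole characters rather than their final bytes.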
Yuya Nishihara
0532301e03 revset: add latest(candidates, count) predicate
This serves the role of limit() in Mercurial. Since a revset in JJ is
(conceptually) an unordered set, a "limit" predicate should define its
ordering criteria. That's why the added predicate is named as "latest".

Closes #1110
2023-03-25 23:48:50 +09:00
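As a rough illustration of why a "limit" over an unordered set needs an explicit ordering, here is a minimal sketch of a `latest(candidates, count)`-style helper. The `Commit` struct, its `timestamp` field, and the tie-breaking by id are assumptions made for the example; the commit message above does not say which timestamp jj uses.

```rust
/// Hypothetical stand-in for a commit: an id and a timestamp.
#[derive(Clone, Debug, PartialEq, Eq)]
struct Commit {
    id: String,
    timestamp: i64,
}

/// Returns up to `count` commits with the newest timestamps.
fn latest(candidates: &[Commit], count: usize) -> Vec<Commit> {
    let mut sorted = candidates.to_vec();
    // Newest first; ties are broken by id so the result is deterministic.
    sorted.sort_unstable_by(|a, b| {
        b.timestamp.cmp(&a.timestamp).then_with(|| b.id.cmp(&a.id))
    });
    sorted.truncate(count);
    sorted
}
```

Because the ordering criterion is part of the function, the result does not depend on the order in which the candidates happen to be stored.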
Yuya Nishihara
185549f031 revset: extract helper to parse literal to e.g. usize 2023-03-25 23:48:50 +09:00
Yuya Nishihara
d04556cf18 revset: use unstable sort to enforce ordering of commit ids
This wouldn't matter in practice, but there should be no reason to stick
to stable sort.
2023-03-25 23:48:50 +09:00
Martin von Zweigbergk
ce5c90b4e5 revset: use Index::has_id() for checking if a commit has been indexed
This avoids another use of `IndexEntry`.
2023-03-24 10:09:40 -07:00
Martin von Zweigbergk
a5b79f9b0e index: make topo_order() return commit ids instead of index entries
`IndexEntry` is specific to the default index store; we don't want it
in the interface.
2023-03-24 10:09:40 -07:00
Martin von Zweigbergk
772cb1a0e9 revset: replace an unnecessary iterator adapter by a simple map()
As noted by @yuja in #1423.
2023-03-24 10:09:40 -07:00
Martin von Zweigbergk
75605e36af revset: iterate over commit ids instead of index entries
There are no remaining places where we iterate over a revset and need
the `IndexEntry`s, so we can now make `Revset::iter()` yield
`CommitId`s instead.
2023-03-23 21:58:15 -07:00
Martin von Zweigbergk
b5ea79f32e revset: add new graph iterator function for tests
I'm about to make `Revset::iter()` yield just `CommitId`s, but the
tests in `test_default_revset_graph_iterator.rs` need an `IndexEntry`
iterator so they can pass it into `RevsetGraphIterator::new()`. This
commit prepares for the change by adding a
`RevsetImpl::iter_graph_impl()` that returns `RevsetGraphIterator`,
keeping `InternalRevset` still hidden within the revset engine. We
could instead have made that (and `ToPredicateFn`) visible to tests. I
can't say which is better.
2023-03-23 21:58:15 -07:00
Martin von Zweigbergk
c8f387d5b3 revset: pass IndexEntry iterator to graph iterator
The graph iterator is specific to the index implementation, and it
needs access to `IndexEntry`, which `Revset::iter()` will soon not
yield.
2023-03-23 21:58:15 -07:00
Martin von Zweigbergk
0b506d8461 index: remove position-based methods 2023-03-23 20:49:15 -07:00
Martin von Zweigbergk
d4e1156957 repo: move IdIndex to revset engine 2023-03-23 20:49:15 -07:00
Martin von Zweigbergk
13a000caa7 repo: get change id index from revset also for mutable repo
I don't know if we ever resolve revsets in a mutable repo, but now
that we can get a change id index from a revset, it's easier to
implement this functionality that way.
2023-03-23 20:49:15 -07:00
Martin von Zweigbergk
68cff2fa22 repo: get change id index from revset instead of building it in repo
This replaces the direct use of `IdIndex` in `ReadonlyRepo` by use of
`Revset::change_id_index()`.

I made the `Index` trait require `Send` and `Sync` in order to be able
to store an instance of it in `ReadonlyRepo` (via `ChangeIdIndex`) and
still have that be `Send` and `Sync`. We could alternatively store the
`ChangeIdIndex` in a `Mutex`. Now that will be up to the
`ChangeIdIndex` instead.
2023-03-23 20:49:15 -07:00
Martin von Zweigbergk
27a7fccefa revset: add a method returning a change id index
One of the remaining places we depend on index positions is when
creating a `ChangeIdIndex`. This moves that into the revset engine
(which is coupled to the commit index implementation) by adding a
`Revset::change_id_index()` method. We will also use this function
later when we add support for resolving change id prefixes within a small
revset.

The current implementation simply creates an in-memory index using the
existing `IdIndex` we have in `repo.rs`.

The custom implementation at Google might do the same for small
revsets that are available on the client, but for revsets involving
many commits on the server, it might use a suboptimal implementation
that uses longer-than-necessary prefixes for performance reasons. That
can be done by querying a server-side index including changes not in
the revset, and then verifying that the resulting commits are actually
in the revset.
2023-03-23 20:49:15 -07:00
Martin von Zweigbergk
a65e0e771c revset: remove unnecessary wrapping of every node in RevsetImpl
Thanks to @yuja for the suggestion.
2023-03-23 20:49:15 -07:00
Yuya Nishihara
9d661d6f69 cli: render other kind of revset error suggestion as hint 2023-03-23 23:08:17 +09:00
Yuya Nishihara
ddeb645d7f cli: provide hint for typo of revset function name
This is similar to what Mercurial does. The similarity threshold is copied
from clap, but we might want to adjust it later.
2023-03-23 23:08:17 +09:00
Yuya Nishihara
d3d8afc77b revset: rewrite match table of builtin functions as HashMap
So that we can build a list of function names.
2023-03-23 23:08:17 +09:00
Martin von Zweigbergk
5709822f05 rewrite: keep commits to visit instead of looking up again
Since the commits are cached in `Store`, this doesn't have any impact
on performance, but it's a little simpler.
2023-03-23 04:50:33 -07:00
Martin von Zweigbergk
e81890b319 rewrite: work with commits instead of index entries in setup code
When deciding the order to visit commits to rebase, we currently look
up parents in the index. I'm trying to remove the current `IndexEntry`
type and will probably have revsets iterators yield simply
`CommitId`. Let's therefore look up commit objects here.

I timed this by rewriting all commits in the jj repo. I couldn't
measure any difference. That makes sense since we cache the commits in
`Store` and we would read the commit when rebasing it anyway.
2023-03-23 04:50:33 -07:00
Martin von Zweigbergk
d3cf543abc revset: move revset_for_commits() to test
The function is only used in tests, so it doesn't belong in
`default_revset_engine`. Also, it's not specific to that
implementation, so I rewrote as a revset evaluation.
2023-03-23 04:50:33 -07:00
Martin von Zweigbergk
5f74dd5db3 repo: implement Repo on ReadonlyRepo instead of its Arc
I'd like to be able to pass a `self` of type `&ReadonlyRepo` to
functions that take a `&dyn Repo`. For that, we need `ReadonlyRepo`
itself to implement `Repo` instead of having `Arc<ReadonlyRepo>`
implement it. I could have solved it in a different way, but the `Arc`
requirement seems like an unnecessary constraint.
2023-03-21 21:43:44 -07:00
Martin von Zweigbergk
d089c22232 repo: move base_repo() off of Repo trait
We only ever call `base_repo()` on `MutableRepo` instances.
2023-03-21 21:43:44 -07:00
Martin von Zweigbergk
ea498df539 repo: when resolving change id prefix, directly return commit ids
The functions resolving a change id to commits currently return a
`Vec<IndexEntry>`. We want to avoid depending on `IndexEntry` and we
only need the commit ids here.
2023-03-21 21:43:44 -07:00
Martin von Zweigbergk
01d7239732 revset: make graph iterator yield commit ids (not index entries)
We only need `CommitId`s, and `IndexEntry` is specific to the default
index implementation.
2023-03-20 01:45:54 -07:00
Martin von Zweigbergk
2f876861ae graphlog: key by commit id (not index position)
The index position is specific to the default index implementation and
we don't want to use it outside of there. This commit removes the
use of it as a key for nodes in the graphlog.

I timed it on the git.git repo using `jj log -r 'all()' -T commit_id`
(the worst case I can think of) and it slowed down from ~2.02 s to
~2.20 s (~9%).
2023-03-20 01:45:54 -07:00
Martin von Zweigbergk
e721a81780 revset: update documentation of Revset::iter()
Since we hid the graph iterator implementation behind
`Revset::iter_graph()`, I don't think we have any callers of
`Revset::iter()` that require the iteration to be in index position order,
so let's not promise that. We do want to promise that the iteration is
in topological order with children before parents, however.
2023-03-20 01:45:54 -07:00
Martin von Zweigbergk
f758b646a9 commit_builder: add accessors for most fields
I'd like to be able to access the current committer on a
`CommitBuilder`.
2023-03-19 00:48:05 -07:00
Martin von Zweigbergk
2495c8f27e cargo: update MSRV to 1.64
We need 1.64 to bump `clap` to `4.1`. We don't really need to upgrade
to that, but being on an older version causes minor confusions like
#1393. Rust 1.64 is very close to 6 months old at this point.
2023-03-17 22:44:29 -07:00
Martin von Zweigbergk
70d4a0f42e revset: remove context parameter from evaluate()
The `RevsetWorkspaceContext` argument is now instead used by the new
`resolve_symbol()` function.
2023-03-17 22:42:41 -07:00
Martin von Zweigbergk
d971148e4e revset: move resolve_symbol() back to revset module
The only caller is now in `revset.rs`.
2023-03-17 22:42:41 -07:00
Martin von Zweigbergk
94aec90bee revset: resolve symbols earlier, before passing to revset engine
For large repos, it's useful to be able to use shorter change id and
commit id prefixes by resolving the prefix in a limited subset of the
repo (typically the same subset that you'd want to see in your default
log output). For very large repos, like Google's internal one, the
shortest unique prefix evaluated within the whole repo is practically
useless because it's long enough that the user would want to copy and
paste it anyway.

Mercurial supports this with its `revisions.disambiguatewithin` config
(added in https://www.mercurial-scm.org/repo/hg/rev/503f936489dd). I'd
like to add the same feature to jj. Mercurial's implementation works
by attempting to resolve the prefix in the whole repo and then, if the
prefix was ambiguous, it resolves it in the configured subset
instead. The advantage of doing it that way is that there's no extra
cost of resolving the revset defining the subset if the prefix was not
ambiguous within the whole repo. However, there are two important
reasons to do it differently in jj:

* We support very large repos using custom backends, and it's probably
  cheaper to resolve a prefix within the subset because it can all be
  cached on the client. Resolving the prefix within the whole repo
  requires a roundtrip to the server.

* We want to be able to resolve change id prefixes, which is always
  done in *some* revset. That revset is currently `all()`, i.e. all
  visible commits. Even on local disk, it's probably cheaper to
  resolve a small revset first and then resolve the prefix within that
  than it is to build up the index of all visible change ids.

We could achieve the goal by letting each revset engine respect the
configured subset, but since the solution proposed above makes sense
also for local-disk repos, I think it's better to do it outside of the
revset engine, so all revset engines can share the code.

This commit prepares for the new functionality by moving the symbol
resolution out of `Index::evaluate_revset()`.
2023-03-17 22:42:41 -07:00
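The motivation in the commit above, that a prefix which is ambiguous in the whole repo can still be unique within a smaller subset, can be sketched as follows. This is an illustrative helper only: `Resolution` and `resolve_prefix` are hypothetical names, and ids are plain hex strings rather than jj's id types.

```rust
/// Outcome of resolving an id prefix within some set of ids.
#[derive(Debug, PartialEq, Eq)]
enum Resolution<'a> {
    NoMatch,
    SingleMatch(&'a str),
    Ambiguous,
}

/// Resolves `prefix` against `ids` by linear scan (a real index would
/// use a sorted structure instead; the scan keeps the sketch short).
fn resolve_prefix<'a>(prefix: &str, ids: &[&'a str]) -> Resolution<'a> {
    let mut matches = ids.iter().copied().filter(|id| id.starts_with(prefix));
    match (matches.next(), matches.next()) {
        (None, _) => Resolution::NoMatch,
        (Some(id), None) => Resolution::SingleMatch(id),
        (Some(_), Some(_)) => Resolution::Ambiguous,
    }
}
```

Calling `resolve_prefix` with the same prefix on the full id list and on a subset shows the difference: the shorter prefix only has to be unique among the ids actually passed in.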
Martin von Zweigbergk
1711cb61fb revset: add a Result version of transform_expression_bottom_up()
I'd like to add a stage after optimization for resolving most symbols
to commit IDs, and that needs to be able to fail.
2023-03-17 22:42:41 -07:00
Martin von Zweigbergk
5afe5091a0 revset: add default_ prefix to graph iterator module
The current revset graph iterator is the default one, which the
default revset engine provides.
2023-03-14 05:32:02 -07:00
Martin von Zweigbergk
3871efd2f9 revset: move ReverseRevsetGraphIterator into revset module
The iterator is not specific to the implementation in
`revset_graph_iterator`, so it belongs in the standard `revset`
module.
2023-03-14 05:32:02 -07:00
Martin von Zweigbergk
f62fac24ac revset: move graph iteration onto Revset trait
We want to allow custom revset engines to define their own graph
iterator. This commit helps with that by adding a
`Revset::iter_graph()` function that returns an abstract iterator.

The current `RevsetGraphIterator` can be configured to skip or include
transitive edges. It skips them by default and we don't expose an option
in the CLI. I didn't bother including that functionality in the new
`iter_graph()` either. At least for now, it will be up to the
implementation whether it includes such edges (it would of course be
free to ignore the caller's request even if we added an option for it
in the API).
2023-03-14 05:32:02 -07:00
Martin von Zweigbergk
28cbd7b1c5 revset: move evaluation into index
This commit adds an `evaluate_revset()` function to the `Index`
trait. It will require some further cleanup, but it already achieves
the goal of letting the index implementation decide which revset
engine to use.
2023-03-14 05:32:02 -07:00
Martin von Zweigbergk
eed0b23009 revset: move current implementation to new module
We want to allow customization of the revset engine, so it can query
server indexes, for example. The current revset implementation will be
our default implementation for now. What's left in the `revset` module
after this commit is mostly parsing code.
2023-03-14 05:32:02 -07:00
Martin von Zweigbergk
6a57456067 revset: remove is_empty() implementation from trait
Now that there's a single implementation of `Revset`, I think it makes
more sense for `is_empty()` to be defined there. Maybe different
revset engines have different ways of implementing it. Even if they
don't, this is trivial to re-implement in each revset engine.
2023-03-13 07:20:35 -07:00
Martin von Zweigbergk
2d1b13b338 revset: make ToPredicateFn private, decouple public type from private
As the comment above `ToPredicateFn` says, it could be a private
type. This commit makes that happen by making the private `Revset`
implementations (`DifferenceRevset` etc.) instead implement an
internal revset type called `InternalRevset`. That type is what
extends `ToPredicateFn`, so the public type doesn't have to. The new
type will not need to implement the new functions I'm about to add to
the `Revset` trait.
2023-03-13 07:20:35 -07:00
Martin von Zweigbergk
9e6c139fa0 revset: create wrapper for current revset implementation
We don't want the public `Revset` interface to know about
`ToPredicateFn`. In order to hide it, I'm wrapping the internal type
in another type, so only the internal type can keep implementing
`ToPredicateFn`.
2023-03-13 07:20:35 -07:00
Martin von Zweigbergk
6c9cefb8a0 revset: make evaluate_revset() private and use it internally
I'd like to be able to change the return type of `evaluate_revset()`
to be an internal type. Since all external callers currently call the
function via `RevsetExpression::evaluate()`, it turns out it's easy to
make it private. To benefit from an internal type, we also need to
make the recursive calls be directly to the internal function.
2023-03-13 07:20:35 -07:00
Martin von Zweigbergk
74ffe7f688 index: remove num_commits() from API 2023-03-12 22:08:31 -07:00
Martin von Zweigbergk
52e4bee3fe index: remove stats() from API
The `stats()` function is specific to the default implementation, so
it shouldn't be part of the `Index` trait.
2023-03-12 22:08:31 -07:00
Martin von Zweigbergk
5423feb8e1 tests: call stats() on specific implementation
This removes the remaining calls to `Index::stats()`.
2023-03-12 22:08:31 -07:00
Martin von Zweigbergk
8c5c0f0e83 cli: make jj debug index get stats from specific implementation
We don't want custom index implementations to have to conform to the
same kind of stats as the default implementation. This commit also
makes the command error out on non-default index types.
2023-03-12 22:08:31 -07:00
Martin von Zweigbergk
504a2b3fd0 cli: add test of jj debug reindex, do full reindexing
I broke the commands in a27da7d8d5 and thought I just fixed it in
c7cf914694a8. However, as I added a test, I realized that I made it
only reindex the commits since the previous operation. I meant for the
command to do a full reindexing of the repo. This fixes that.
2023-03-12 22:08:31 -07:00
Martin von Zweigbergk
63b3c0899a index: make jj debug reindex actually reindex
I broke `jj debug reindex` in a27da7d8d5. From that commit, we no
longer delete the pointer to the old index, so nothing happens when we
reload the index. This commit fixes that, and also makes the command
error out if run on a repo with a non-default index type.
2023-03-12 03:23:59 -07:00
Martin von Zweigbergk
ff1b6ce3d1 index: extract a MutableIndex trait
This is yet another step towards making the index pluggable. The
`IndexStore` trait seems reasonable after this commit. There's still a
lot of work to remove `IndexPosition` from the `Index` trait.
2023-03-11 22:22:46 -08:00
Martin von Zweigbergk
b02fac7786 index: drop pub keyword from functions on private types
It has no effect, and these functions were not meant to be public.
2023-03-11 22:22:46 -08:00
Martin von Zweigbergk
65a6353f06 index: stop implementing Index trait on wrapper type
Since `ReadonlyIndex` and `Index` are different types, we don't need
to implement `Index` on `ReadonlyIndexWrapper`.
2023-03-11 22:22:46 -08:00
Martin von Zweigbergk
2eab85964a index: extract a ReadonlyIndex trait
I didn't make `ReadonlyIndex` extend `Index` because it needed an
`as_index()` to convert to `&dyn Index` trait object
anyway. Separating the types also gives us flexibility to implement
the two traits on different types.
2023-03-11 22:22:46 -08:00
Martin von Zweigbergk
d2457d3f38 index: wrap ReadonlyIndex in new type, hiding Arc
Not all index implementations may want to store the readonly index
implementation in an Arc. Exposing the Arc in the interface is also
problematic because `Arc<IndexImpl>` cannot be cast to `Arc<dyn
Index>`.
2023-03-11 22:22:46 -08:00
Martin von Zweigbergk
6ab8d9d0d0 index: merge index_store.rs into index.rs
These two files are closely related, and `Index` and `IndexStore` are
expected to be customized together, so it seems better to keep them in
a single file.
2023-03-11 22:22:46 -08:00
Martin von Zweigbergk
e5ba9d9e42 index: move current implementation to default_index_store.rs
The idea is that `index.rs` contains the interface, similar to
`backend.rs` for commits, and `op_store.rs` for operations.
2023-03-11 22:22:46 -08:00
Martin von Zweigbergk
8fc2062a38 index: move DefaultIndexStore to new default_index_store.rs 2023-03-11 22:22:46 -08:00
Martin von Zweigbergk
37151e0ff9 index: load store based on type recorded in .jj/repo/index/type
This is another step towards allowing a custom `jj` binary to have its
own index type. We're going to have a server-backed index
implementation at Google, for example.
2023-03-11 22:22:46 -08:00
Martin von Zweigbergk
3e4e0dc916 index_store: extract trait
This is a step towards making the index storage pluggable. The
interface will probably change a bit soon, but let's start with
functions that match the current implementation.

I called the current implementation the `DefaultIndexStore`. Calling
it `SimpleIndexStore` (like `SimpleOpStore` and `SimpleOpHeadsStore`)
didn't seem accurate.
2023-03-11 22:22:46 -08:00
Samuel Tardieu
616058c2fa lib: add Commit::is_discardable() 2023-03-05 23:50:20 +01:00
Yuya Nishihara
8f9bc4e7a6 revset: ignore all ascii whitespace characters 2023-03-04 00:01:54 +09:00
Martin von Zweigbergk
ba94f58d7e index_store: remove unused reinit() function 2023-03-02 12:33:11 -08:00
Martin von Zweigbergk
94bdbb7ff7 index: make IndexStore factory functions take &Path
This is just for consistency with the other backends.
2023-03-02 12:33:11 -08:00
Martin von Zweigbergk
da1c259211 index_store: use custom error type for write errors
Public APIs should use custom error types (not `io::Error` as
here). The caller isn't affected by this commit because it just
unwraps the error.
2023-03-02 12:33:11 -08:00
Martin von Zweigbergk
2cc15f40ef store: remove obsolete comment about root commit
The commit backends are responsible for defining the root commit since
5ab2854df6.
2023-03-02 12:33:11 -08:00
Samuel Tardieu
d4b13d7495 git: use our own default refspec 2023-03-02 10:09:08 +01:00
Samuel Tardieu
5ecdeed606 git: only consider references matching globs when fetching 2023-03-02 10:09:08 +01:00
Samuel Tardieu
182919ff6f git: add function to import a selection of the git refs 2023-03-02 10:09:08 +01:00
Samuel Tardieu
0ca4e2dad2 git: absence of globs is None rather than &[]
In `git_fetch()`, any glob present in `globs` is an "allow" mark. Using
`&[]` to represent an "allow-all" may be misleading, as it could
indicate that no branch (only the git HEAD) should be fetched.

By using an `Option<&[&str]>`, it is clearer that `None` means that
all branches are fetched.
2023-03-02 10:09:08 +01:00
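The distinction drawn in the commit above between "no filter" and "empty filter" can be sketched like this. It is not the real `git_fetch()` code: `should_fetch` is a hypothetical helper, and the glob matching is simplified to exact names and a trailing `*`.

```rust
/// Decides whether a branch should be fetched.
/// `None` means "no filter, fetch everything"; `Some(&[])` allows nothing.
fn should_fetch(branch: &str, globs: Option<&[&str]>) -> bool {
    match globs {
        None => true,
        // Simplification: exact match or a trailing `*` wildcard only.
        Some(globs) => globs.iter().any(|glob| match glob.strip_suffix('*') {
            Some(prefix) => branch.starts_with(prefix),
            None => branch == *glob,
        }),
    }
}
```

With an `Option`, an empty slice keeps its natural meaning of "allow nothing", and the allow-all case is spelled out explicitly as `None`.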
Samuel Tardieu
6fd65cca30 git: use &[&str] instead of &[String]
Using &[String] forces the caller to materialize owned strings if they
have only references, which is costly. Using &[&str] makes it cheap
if the caller owns strings as well.
2023-03-02 10:09:08 +01:00
Martin von Zweigbergk
bbd6ef0c7b revset: remove filter_by_diff(), have caller intersect expression
To be able to make e.g. `jj log some/path` perform well on cloud-based
repos, a custom revset engine needs to be able to see the paths to
filter by. That way it is able pass those to a server-side index. This
commit helps with that by effectively converting `jj log -r foo
some/path` into `jj log -r 'foo & file(some/path)'`.
2023-02-28 17:45:34 -08:00
Martin von Zweigbergk
fd6c7f3bb3 op_heads_store: let caller create initial operation
It makes the APIs much simpler if we don't have to pass in information
about the initial operation when we create the `OpHeadsStore`. It also
makes the alternative `OpHeadsStore` implementations simpler since we
move some logic into a shared location (`ReadonlyRepo::init()`).

This effectively undoes ec07104126. Maybe some further refactoring
made it possible to move it back as I'm doing in this commit?
2023-02-28 08:08:31 -08:00
Martin von Zweigbergk
4c5695f0cd index_store: take OperationId when writing new index
By taking an `OperationId` argument to `IndexStore::write_index()`, we
can remove `associate_file_with_operation()` from the trait. That
simplifies the interface a little bit. The reason I noticed this was
that I'm trying to extract a trait for `IndexStore`, and the word
"file" in it is too specific for e.g. a cloud-based implementation.
2023-02-27 09:44:28 -08:00
Martin von Zweigbergk
346e3c849b repo: propagate error when failing to look up backend type 2023-02-27 09:44:28 -08:00
Martin von Zweigbergk
011d9e3486 workspace: make WorkspaceLoader::load() return WorkspaceLoadError
I plan to make `RepoLoader::init()` return a `Result`, which means
that `WorkspaceLoader::load()` will need to return more kinds of
errors. Making it return `WorkspaceLoadError` is a good start. By also
extracting a function for converting `WorkspaceLoadError` to
`CommandError`, we can reuse the handling of `PathError` in
`cli_util`.
2023-02-27 09:44:28 -08:00
Martin von Zweigbergk
491ecc6b2e repo: replace load_at_head() by helper in tests
I'm about to make `RepoLoader::init()` return a `Result`, and I don't
want to have to wrap that in a new error in
`ReadonlyRepo::load_at_head()` since that's only used in tests.
2023-02-27 09:44:28 -08:00
Yuya Nishihara
0b083b2ddb conflicts: in materialize_merge_result(), always use adds.get(index)
I don't think this is more readable than the original code, but it gives
diff1/right1 and diff2/right2 pairs consistent names.
2023-02-24 19:58:10 +09:00
Yuya Nishihara
410be93339 conflicts: in materialize_merge_result(), borrow both adds/removes sides
Just for consistency.
2023-02-24 19:58:10 +09:00
Yuya Nishihara
da16bf340c conflicts: fix off-by-one error in materialize_merge_result()
This should fix #1304. I think the added test simulates the behavior of
multiple rebase conflicts, but I don't have expertise around this.

add_index could be replaced with a peekable iterator, but the iterator version
wouldn't be as readable as the current implementation.
2023-02-24 19:58:10 +09:00
Ilya Grigoriev
30d03a66e6 cmd: --branch option for git fetch.
Thanks to @samueltardieu for noticing a subtle bug in the refspecs, providing
the fix, as well as the two `conflicting_branches` tests.
2023-02-21 18:33:40 -08:00
Samuel Tardieu
e9009cf21e style: simplify topo_order()
- Calling `.into_iter()` on an iterator is a no-op.
- There is no need to wrap index entries into another type to unwrap
  them right after sorting.
2023-02-20 11:05:54 +01:00
Yuya Nishihara
b5f1728ffb templater: migrate op log to template language
The outermost "op-log" label isn't moved to the default template. I think
it belongs to the command's formatter rather than the template.

Old bikeshedding items:
- "current_head", "is_head", or "is_head_op"
  => renamed to "current_operation"
- "templates.op-log" vs "templates.op_log" (the whole template is labeled
  as "op-log")
  => renamed to "op_log"
- "template-aliases.'format_operation_duration(time_range)'"
  => renamed to 'format_time_range(time_range)'
2023-02-20 18:20:41 +09:00
Ilya Grigoriev
39dd1a99c1 Add comment for topo_order 2023-02-20 00:36:32 -08:00
Martin von Zweigbergk
bc9f66dad3 revset: replace RevsetIterator wrapper by extension
The type doesn't seem to provide any benefit. I don't think I had a
good reason for creating it in the first place; it was probably just
unfamiliarity with Rust.
2023-02-19 21:37:26 -08:00
Martin von Zweigbergk
91e56c7f2f revset: add type parameters to remaining iterator adapters
This is another step towards removing `RevsetIterator`. These types
are private, so someone using the library can't accidentally create a
`UnionRevsetIterator` with inputs in different order, for example.
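
As an illustration of an adapter that is generic over its input
iterators, here is a sketch of a union over two descending-sorted
inputs. `UnionDesc` and `union_desc()` are made-up names, and plain
`u32` positions stand in for index entries:

```rust
use std::cmp::Ordering;
use std::iter::Peekable;

/// Hypothetical union adapter: merges two iterators that both yield
/// positions in descending order, emitting shared positions once.
pub struct UnionDesc<I1: Iterator<Item = u32>, I2: Iterator<Item = u32>> {
    left: Peekable<I1>,
    right: Peekable<I2>,
}

pub fn union_desc<I1, I2>(left: I1, right: I2) -> UnionDesc<I1, I2>
where
    I1: Iterator<Item = u32>,
    I2: Iterator<Item = u32>,
{
    UnionDesc { left: left.peekable(), right: right.peekable() }
}

impl<I1, I2> Iterator for UnionDesc<I1, I2>
where
    I1: Iterator<Item = u32>,
    I2: Iterator<Item = u32>,
{
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        // Decide which side holds the larger head before advancing.
        let order = match (self.left.peek(), self.right.peek()) {
            (Some(l), Some(r)) => l.cmp(r),
            (Some(_), None) => Ordering::Greater,
            (None, Some(_)) => Ordering::Less,
            (None, None) => return None,
        };
        match order {
            Ordering::Greater => self.left.next(),
            Ordering::Less => self.right.next(),
            Ordering::Equal => {
                // Same position on both sides: consume both, emit one.
                self.right.next();
                self.left.next()
            }
        }
    }
}
```

The adapter silently produces wrong output if either input is not
sorted in descending order, which is the kind of misuse that keeping
such types private guards against.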
2023-02-19 21:37:26 -08:00
Martin von Zweigbergk
44e6ef9bae revset: make into_predicate_fn() a free function
I'm about to convert `RevsetIterator` to a trait and I don't want
`into_predicate_fn()` on it.
2023-02-19 21:37:26 -08:00
Martin von Zweigbergk
36aa6e0be6 revset: add type parameter in commit iterator adapters
The standard way of creating iterator adapters seems to have the input
iterator type by a parameter to the function, so let's follow that
pattern.
2023-02-19 21:37:26 -08:00
Martin von Zweigbergk
30160f4d20 revset: pass revset, not iterator, into RevsetGraphIterator
I was thinking of replacing `RevsetIterator` by a regular
`Iterator<Item=IndexEntry>`. However, that would make it easier to
pass in an iterator that produces revisions in a non-topological order
into `RevsetGraphIterator`, which would produce unexpected results (it
would result in nodes that are not connected to their parents, if
their parents had already been emitted). I think it makes sense to
instead pass in a revset into `RevsetGraphIterator`.

Incidentally, it will also be useful to have the full revset available
in `RevsetGraphIterator` if we rewrite the algorithm to be more
similar to Mercurial's and Sapling's algorithm, which involves asking
the revset if it contains parent revisions.
2023-02-19 21:37:26 -08:00
Martin von Zweigbergk
f70e6987b5 conflicts: preserve order of adds in materialized conflict
We write conflicts to the working copy by materializing them as
conflict markers in a file. When the file has been modified (or just
the mtime has changed), we parse the markers to reconstruct the
conflict. For example, let's say we see this conflict marker:

```
<<<<<<<
+++++++
b
%%%%%%%
-a
+c
>>>>>>>
```

Then we will create a hunk with ["a"] as removed and ["b", "c"] as
added.
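
A simplified sketch of that parsing step, assuming exactly the marker
layout shown above. `Hunk` and `parse_conflict()` are hypothetical
names, and each term keeps its trailing newline:

```rust
/// Hypothetical result type: the negative and positive terms of one hunk.
#[derive(Debug, Default, PartialEq)]
pub struct Hunk {
    pub removes: Vec<String>,
    pub adds: Vec<String>,
}

enum Section {
    Outside,
    Snapshot,
    Diff,
}

fn append(term: Option<&mut String>, line: &str) {
    if let Some(term) = term {
        term.push_str(line);
        term.push('\n');
    }
}

/// Parses one materialized conflict. A `+++++++` section is a full
/// snapshot of an add; a `%%%%%%%` section is a diff from a remove to
/// an add.
pub fn parse_conflict(text: &str) -> Hunk {
    let mut hunk = Hunk::default();
    let mut section = Section::Outside;
    for line in text.lines() {
        match line {
            "<<<<<<<" => section = Section::Outside,
            "+++++++" => {
                hunk.adds.push(String::new());
                section = Section::Snapshot;
            }
            "%%%%%%%" => {
                hunk.removes.push(String::new());
                hunk.adds.push(String::new());
                section = Section::Diff;
            }
            ">>>>>>>" => break,
            _ => match section {
                Section::Snapshot => append(hunk.adds.last_mut(), line),
                Section::Diff => {
                    if let Some(rest) = line.strip_prefix('-') {
                        append(hunk.removes.last_mut(), rest);
                    } else if let Some(rest) = line.strip_prefix('+') {
                        append(hunk.adds.last_mut(), rest);
                    } else if let Some(rest) = line.strip_prefix(' ') {
                        // Context lines belong to both sides of the diff.
                        append(hunk.removes.last_mut(), rest);
                        append(hunk.adds.last_mut(), rest);
                    }
                }
                Section::Outside => {}
            },
        }
    }
    hunk
}
```

Note how the order of `adds` follows the order of the sections in the
file, which is why reordering them during materialization matters.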

Now, since commit b84be06c08, when we materialize conflicts, we
minimize the diff part of the marker (the `%%%%%%%` part). The problem
is that that minimization may result in a different order of the
positive conflict terms. That's particularly bad because we do the
minimization per hunk, so we can end up reconstructing an input that
never existed.

This commit fixes the bug by only considering the next add and the one
after that, and emitting either only the first with `%%%%%%%`, or both
of them, with the first one in `+++++++` and the second one in
`%%%%%%%`.

Note that the recent fix to add context to modify/delete conflicts
means that when we parse such modified conflicts, we'll always
consider them resolved, since the expected adds/removes we pass will
not match what's actually in the file. That doesn't seem so bad, and
it's not obvious what the fix should be, so I'll leave that for later.
2023-02-18 22:01:25 -08:00
Martin von Zweigbergk
92f9fe5a1b tree: make conflict_term_to_conflict() take a TreeValue
The function only needs the `TreeValue` so it makes more sense this
way, I think. That will also let the caller keep the rest of the
`Conflict` value owned (though there is nothing but the `value` field
in it right now).
2023-02-18 00:09:51 -08:00
Martin von Zweigbergk
cf672de792 tree: avoid some cloning by passing by value 2023-02-18 00:09:51 -08:00
Martin von Zweigbergk
a87125d08b backend: rename ConflictPart to ConflictTerm
It took a while before I realized that conflicts could be modeled as
simple algebraic expressions with positive and negative terms (they
were modeled as recursive 3-way conflicts initially). We've been
thinking of them that way for a while now, so let's make the
`ConflictPart` name match that model.
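
To illustrate the algebraic view, here is a sketch in which a term
that appears on both the positive and the negative side cancels out.
`Conflict<T>` and `simplify()` are hypothetical; whether and where the
real code performs this cancellation is not stated here:

```rust
/// Hypothetical model: a conflict is the sum of `adds` minus the sum
/// of `removes`.
#[derive(Debug, Clone, PartialEq)]
pub struct Conflict<T> {
    pub removes: Vec<T>,
    pub adds: Vec<T>,
}

/// Cancels each negative term against one equal positive term,
/// like simplifying `A - B + C - C + D` to `A - B + D`.
pub fn simplify<T: PartialEq>(mut conflict: Conflict<T>) -> Conflict<T> {
    let mut i = 0;
    while i < conflict.removes.len() {
        let found = conflict
            .adds
            .iter()
            .position(|add| *add == conflict.removes[i]);
        if let Some(j) = found {
            conflict.removes.remove(i);
            conflict.adds.remove(j);
        } else {
            i += 1;
        }
    }
    conflict
}
```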
2023-02-17 23:28:50 -08:00
Martin von Zweigbergk
e48ace56d1 conflicts: replace missing files by empty in materialized conflict
When we materialize modify/delete conflicts, we currently don't
include any context lines. That's because modify/delete conflicts have
only two sides, so there's no common base to compare to. Hunks that
are unchanged on the "modify" side are therefore not considered
conflicting, and since they don't contribute new changes, they're
simply skipped (here:
3dfedf5814/lib/src/files.rs (L228-L230)).

It seems more useful to instead pretend that the missing side is an
empty file. That way we'll get a conflict in the entire file.

We can still decide later to make e.g. `jj resolve` prompt the user on
modify/delete conflicts just like `hg resolve` does (or maybe it
actually happens earlier there, I don't remember).

Closes #1244.
2023-02-17 22:19:04 -08:00
Martin von Zweigbergk
7f334656b1 tree: return early when trying to resolve modify/delete conflict
Modify/delete conflicts cannot be automatically resolved, so there's
no point in wasting resources calculating the diff(s).
2023-02-17 22:05:44 -08:00
Martin von Zweigbergk
8966580ba4 revset: clean up 'revset lifetimes on dyn Revset
If I understand correctly, the 'revset lifetimes on `Box<dyn
Revset<'index> + 'revset>` are not constrained by the lifetime of a
revset; we don't have any revsets that borrow data from other
revsets. Instead, they're all about constraining a boxed revset to the
index's lifetime. Without the lifetime annotation, it would default to
'static, and the borrow-checker doesn't like `dyn Revset<'index> +
'static`, since the revset could then live longer than the index it
borrows.
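
A small self-contained sketch of that rule. The `Index`, `Revset` and
`All` types below are simplified stand-ins, not the real definitions:

```rust
/// Stand-in for the index that revsets borrow from.
pub struct Index {
    pub positions: Vec<u32>,
}

/// Stand-in trait: a revset yields positions borrowed from the index.
pub trait Revset<'index> {
    fn iter(&self) -> Box<dyn Iterator<Item = u32> + 'index>;
}

struct All<'index> {
    index: &'index Index,
}

impl<'index> Revset<'index> for All<'index> {
    fn iter(&self) -> Box<dyn Iterator<Item = u32> + 'index> {
        let index: &'index Index = self.index;
        Box::new(index.positions.iter().copied())
    }
}

/// The `+ 'index` bound is needed: `Box<dyn Revset<'index>>` alone
/// would mean `+ 'static`, which `All<'index>` cannot satisfy because
/// it holds a reference into the index.
pub fn all<'index>(index: &'index Index) -> Box<dyn Revset<'index> + 'index> {
    Box::new(All { index })
}
```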
2023-02-17 07:52:17 -08:00
Martin von Zweigbergk
0159092c31 revset: elide lifetime parameter on iter()
The self-lifetime is the default when elided.
2023-02-17 07:52:17 -08:00
Martin von Zweigbergk
53565a816c revset: rename 'repo lifetime to the more accurate 'index 2023-02-17 07:52:17 -08:00
Ilya Grigoriev
e4aa2cb2e5 Rename ui.relative-timestamps to ui.oplog-relative-timestamps 2023-02-15 21:26:14 -08:00
Ilya Grigoriev
859b0f680c Make ui.relative-timestamps default to true
This seems like a better default for `jj op log`, which is now the only thing
this option affects.
2023-02-15 21:26:14 -08:00
Martin von Zweigbergk
d8997999f2 repo: replace RepoRef by Repo trait 2023-02-15 19:15:17 -08:00
Martin von Zweigbergk
f6a4cb57da repo: extract a Repo trait for Arc<ReadonlyRepo> and MutableRepo
This will soon replace the `RepoRef` enum, just like how the `Index`
trait replaced the `IndexRef` enum.
2023-02-15 19:15:17 -08:00
Martin von Zweigbergk
b7dad291df repo: make all base_repo() functions return &Arc<ReadonlyRepo>
This is to prepare for a `Repo` trait implemented for
`Arc<ReadonlyRepo>` (not for `ReadonlyRepo` itself).
2023-02-15 19:15:17 -08:00
Martin von Zweigbergk
8a067282c8 repo: make ReadonlyRepo::index() return a &dyn Index
This is just a little preparation for extracting a `Repo` trait that's
implemented by both `ReadonlyRepo` and `MutableRepo`. The `index()`
function in that trait will of course have to return the same type in
both implementations, and that type will be `&dyn Index`.
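
Roughly, the shape being prepared looks like the sketch below. All
types here are simplified stand-ins with made-up fields, and
`num_commits()` is only an illustrative method:

```rust
use std::sync::Arc;

pub trait Index {
    fn num_commits(&self) -> usize;
}

pub struct ReadonlyIndex(pub usize);
pub struct MutableIndex(pub usize);

impl Index for ReadonlyIndex {
    fn num_commits(&self) -> usize {
        self.0
    }
}

impl Index for MutableIndex {
    fn num_commits(&self) -> usize {
        self.0
    }
}

pub struct ReadonlyRepo {
    pub index: ReadonlyIndex,
}

pub struct MutableRepo {
    pub base_repo: Arc<ReadonlyRepo>,
    pub index: MutableIndex,
}

/// Both implementations return the same type, `&dyn Index`, even
/// though the concrete index types differ.
pub trait Repo {
    fn index(&self) -> &dyn Index;
}

impl Repo for Arc<ReadonlyRepo> {
    fn index(&self) -> &dyn Index {
        &self.index
    }
}

impl Repo for MutableRepo {
    fn index(&self) -> &dyn Index {
        &self.index
    }
}

/// Code written against `&dyn Repo` works with either kind of repo.
pub fn count_commits(repo: &dyn Repo) -> usize {
    repo.index().num_commits()
}
```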
2023-02-15 19:15:17 -08:00
Yuya Nishihara
3c61e9239c config: remove ui.log-author-format in favor of template alias 2023-02-16 11:43:17 +09:00
Yuya Nishihara
a00767bc0f config: remove ui.unique-prefixes/log-id-preferred-length in favor of alias 2023-02-16 11:43:17 +09:00
Martin von Zweigbergk
2d8aa2d90e index: delete IndexRef, use Index trait
I don't know why I didn't create a trait to begin with. Maybe I had
trouble with lifetimes or object-safety.
2023-02-14 06:51:49 -08:00
Martin von Zweigbergk
b955e3de03 index: extract a trait for the index
Even though we don't know the details yet, we know that we want to
make the index pluggable like the commit and opstore
backends. Defining a trait for it should be a good step. We can refine
the trait later.
2023-02-14 06:51:49 -08:00
Martin von Zweigbergk
7a985ed122 index: remove lifetime parameter to IndexRef::heads()/topo_order()
I want to replace `IndexRef` by a trait, and I want that trait to be
object-safe.
2023-02-14 06:51:49 -08:00
Martin von Zweigbergk
81af5f820b repo: calculate shortest unique prefix separately for commit/change
We now resolve the two kinds of ids in separate spaces, so the
shortest prefixes should also be calculated in separate spaces.
2023-02-13 22:49:21 -08:00
Martin von Zweigbergk
222709196a repo: remove code for conflict between root commit/change id
The two ids no longer share a prefix, so we don't need to worry about
one being a prefix of the other.
2023-02-13 22:49:21 -08:00
Martin von Zweigbergk
d6909002f0 repo: elide lifetime on resolve_change_id_prefix() 2023-02-13 22:49:21 -08:00
Martin von Zweigbergk
04a0c60b16 revset: remove code for conflict between commit/change id
Commit ids and change ids now use non-overlapping symbols for their
digits, so they can't share a prefix.
2023-02-13 22:49:21 -08:00
Martin von Zweigbergk
9261bfe5fc revset: resolve change ids only using the new hex digits
Now that we use the new hex digits when we display change ids, we no
longer need to be able to resolve the old (conventional) digits.
2023-02-13 22:49:21 -08:00
Martin von Zweigbergk
39640cc288 revset: allow resolving change id using hex digits from reverse alphabet
By separating the value spaces of change ids and commit ids, we can
simplify lookup of a prefix. For example, if we know that a prefix is
for a change id, we don't have to try to find matching commit ids. I
think it might also help new users more quickly understand that change
ids are not commit ids.

This commit is a step towards that separation. It allows resolving
change ids by using hex digits from the back of the alphabet instead
of 0-f, so 'z'='0', 'y'='1', etc, and 'k'='f'. Thanks to @ilyagr for
the idea. The regular hex digits are still allowed.
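
The digit mapping can be sketched as below. The function names are
made up for this example, and the sketch handles only the reversed
digits:

```rust
/// Converts reverse hex ('z' = 0, 'y' = 1, ..., 'k' = f) to regular
/// hex. Returns `None` if any character is outside 'k'..='z'.
pub fn to_forward_hex(reverse_hex: &str) -> Option<String> {
    reverse_hex
        .bytes()
        .map(|b| match b {
            b'k'..=b'z' => char::from_digit(u32::from(b'z' - b), 16),
            _ => None,
        })
        .collect()
}

/// Converts regular hex to the reverse alphabet. Returns `None` if
/// any character is not a hex digit.
pub fn to_reverse_hex(forward_hex: &str) -> Option<String> {
    forward_hex
        .chars()
        .map(|c| c.to_digit(16).map(|d| char::from(b'z' - d as u8)))
        .collect()
}
```

Since no character is valid in both alphabets, a prefix written with
reversed digits can never be mistaken for a commit id prefix.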
2023-02-13 22:49:21 -08:00
Martin von Zweigbergk
4fbf09ed8c revset: remove obsolete silencing of Clippy check 2023-02-12 23:07:09 -08:00
Martin von Zweigbergk
a67fbb6714 cli: switch default graph style to be Sapling's curved style
We seem to quite unanimously prefer this style, so let's make it the
default.
2023-02-12 07:23:29 -08:00
Samuel Tardieu
c0c3f87574 git fetch: prune old branch names before adding new ones 2023-02-12 02:10:17 +01:00
Vamsi Avula
daf7b656e3 config: add and parse ui.log_author_format for use in the default template
Supported values are,

- `none` for no author information,
- `full` for both the name and email,
- `name` for just the name,
- `username` for username part of the email,
- (default) `email` (or any other gibberish for that matter) for the full email.
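
The listed values amount to a match with a catch-all arm, roughly as
sketched here. `format_author()` is a hypothetical function, and the
`Name <email>` layout for `full` is an assumption:

```rust
/// Hypothetical formatter for the author according to the configured value.
pub fn format_author(format: &str, name: &str, email: &str) -> String {
    match format {
        "none" => String::new(),
        // Assumed layout; the exact output for `full` is not given above.
        "full" => format!("{} <{}>", name, email),
        "name" => name.to_string(),
        "username" => email.split('@').next().unwrap_or(email).to_string(),
        // `email` and any unrecognized value fall back to the full email.
        _ => email.to_string(),
    }
}
```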
2023-02-11 20:54:23 +05:30
Martin von Zweigbergk
3744ae4508 index: remove unused iter() function 2023-02-10 10:31:42 -08:00
Yuya Nishihara
038497638f revset: parse keyword arguments, accept remote_branches(remote=needle)
The syntax is identical to Mercurial's revset, which is derived from Python.
2023-02-09 12:11:58 +09:00
Yuya Nishihara
b2825c22d7 revset: move whitespace rule out of expression
There's a subtle difference between
 - 'expression = { whitespace* ... whitespace* }', and
 - '_{ whitespace* ~ expression ~ whitespace* }'.

The former includes surrounding whitespace in an "expression", the latter
doesn't. This affects the span of error indication.
2023-02-09 12:11:58 +09:00
Yuya Nishihara
78227dc7bc revset: consolidate argument parsing functions
The added expect_arguments() is basically a copy from the template_parser.
I'll reimplement it to support keyword arguments, so I don't care much about
the current implementation.

I leave expect_no/one_argument() as wrappers because parsing 0/1 arguments
is pretty common.

Error messages are slightly changed. I personally prefer not to add extra
code for singular/plural handling, but if we do, I'll add 'if N == 1' case.
2023-02-09 12:11:58 +09:00
Martin von Zweigbergk
d1dc22d957 backend: let backend decide length of change id
As mentioned in the previous commit, our internal backend at Google
uses a 32-byte long change id. This commit will make us able to use
that.
2023-02-07 22:31:34 -08:00
Martin von Zweigbergk
e6693d0f68 backend: let backend choose root change id
Our internal backend at Google uses a 32-byte change id, so I'd like
to make the backend able to decide the length. To start with, let's
make the backend able to decide what the root change id should
be. That's consistent with how we already let the backend decide what
the root commit id should be.
2023-02-07 22:31:34 -08:00
Martin von Zweigbergk
98259346df backend: make hash_length() specifically about commit IDs
The function is currently only about the length of commit IDs, so
let's clarify that. I'm going to add another function for the length
of change IDs next. I don't know if we're going to care about lengths
of other hashes in the future. We might even be able to remove the
current restriction that all commit IDs and all change IDs have the
same length.
2023-02-07 22:31:34 -08:00
Yuya Nishihara
fa045d632c revset: allow trailing comma
It's unlikely we would write a multi-line function call in a revset, but let's
allow trailing comma for consistency.
2023-02-07 23:19:36 +09:00
Martin von Zweigbergk
f4374086b3 git_backend: return error when told to write commit without parents
There should be no other commits than the root commit without parents.
2023-02-05 22:52:23 -08:00
Martin von Zweigbergk
8c63fbc4ed git_backend: don't panic if told to write merge with root commit
I think the CLI currently checks that the backend is not told to write
a merge commit with the root as one parent, but we should not panic if
those checks fail.
2023-02-05 22:52:23 -08:00
Martin von Zweigbergk
2b2a9a36d7 git_backend: test conversion of parents, including root
We didn't seem to have any tests showing how we convert the set of
parents, and especially how we handle the root commit, so let's add
some.
2023-02-05 22:52:23 -08:00
Ilya Grigoriev
5fb17925eb jj log: option to specify preferred id length
The new option is `ui.log-id-preferred-length`. Setting it to 6
is quite convenient for the `jj` repo, for example.

Screenshot: https://user-images.githubusercontent.com/4123047/216535699-ad1e2ac8-73dd-44be-b28a-ebdebc00c63c.png
2023-02-05 21:18:42 -08:00
Ilya Grigoriev
6e05c5a829 jj log: Change the default of ui.unique-prefixes to "styled" 2023-01-30 22:48:38 -08:00
Martin von Zweigbergk
fafa9b70fc view: also merge git_heads when merging views
I don't know if I had just forgotten to merge `git_heads` when I added
it to the view object, but it seems like it should be merged just like
refs.
2023-01-30 09:05:03 -08:00
Martin von Zweigbergk
4e8fbaa210 git: allow conflicts in "HEAD@git"
Git's HEAD ref is similar to other refs and can logically have
conflicts just like the other refs in `git_refs`. As with the other
refs, it can happen if you run concurrent commands importing two
different updates from Git. So let's treat `git_head` the same as
`git_refs` by making it an `Option<RefTarget>`.
2023-01-30 09:05:03 -08:00
Glen Choo
3418c8ff73 git: add git.auto-local-branch
Add a new git.auto-local-branch config option. When set to false, a
remote-tracking branch imported from Git will not automatically create a
local branch target. This is implemented by a new GitSettings struct
that passes Git-related settings from UserSettings.

This behavior is particularly useful in a co-located jj and Git repo,
because a Git remote might have branches that are not of everyday
interest to the user, so it does not make sense to export them as local
branches in Git. E.g. https://github.com/gitster/git, the maintainer's
fork of Git, has 379 branches, most of which are topic branches kept
around for historical reasons, and Git developers wouldn't be expected
to have local branches for each remote-tracking branch.
2023-01-29 20:17:49 -08:00
Martin von Zweigbergk
5fecb396aa working_copy: write tree_state file on init
I don't think there's a good reason not to write the
`.jj/working_copy/tree_state` file on init. Being able to assume that
the file exists means that we won't need the store object to lazily
load the `TreeState` object. Well, except that `TreeState` keeps an
`Arc<Store>`, but I'm trying to change that.
2023-01-29 20:01:22 -08:00
Martin von Zweigbergk
2971c45e04 index_store: don't look up whole commit when only id is needed
When building an initial index from an existing Git repo, for example,
we walk parents and predecessors to find all commits to index. Part of
that code was looking up the whole parent and predecessor commits even
though it only needed the ids. I don't know if this has a measurable
impact on performance, but it's not really any more complex to just
get the ids anyway.
2023-01-29 10:45:03 -08:00
Martin von Zweigbergk
be638d0205 dag_walk: delete unused common_ancestor() 2023-01-29 10:42:11 -08:00
Martin von Zweigbergk
aaf75b4793 repo: inline single-caller, and surprising, Commit::is_empty()
I would expect `Commit::is_empty()` to check if the commit is empty in
our usual sense, i.e. that there are no changes compared to the
auto-merged parents. However, it would return `false` for any merge
commit (and for the root commit). Since we only use it in one place,
let's inline it there. The use there does seem reasonable, because
it's about abandoning an "uninteresting" working-copy commit.
2023-01-28 15:54:03 -08:00
Samuel Tardieu
a7aed0171d style: fix typos found by codespell 2023-01-28 07:23:45 -08:00
Martin von Zweigbergk
d8942d5f96 cli: rename ui.graph.format to ui.graph.style
I think of it more as style than a format, so using `style` in the
config key makes sense to me.

I didn't bother making upgrades easy by supporting the old name since
this was just released and only a few developers probably have it set.
2023-01-27 10:36:26 -08:00
Martin von Zweigbergk
0b99e5b16e graphlog: enable Sapling's graph styles by default
I would also rename the feature, but I hope we can instead soon make
it a non-optional dependency and delete the feature.
2023-01-27 09:46:57 -08:00
Yuya Nishihara
d771c12637 index: make HexPrefix accessor simply return "min" prefix as bytes slice
This is a low-level function, so I think using &[u8] should be good here.
2023-01-27 03:37:44 +09:00
Yuya Nishihara
770ca72a1f index: use iterator to simplify segment_resolve_prefix() a bit further 2023-01-27 03:37:44 +09:00
Yuya Nishihara
956a2d5f83 index: remove redundant prefix tests from resolve_prefix functions
The "min" prefix guarantees that the first entry matches the hex prefix
if any. Spotted by @ilyagr.
2023-01-27 03:37:44 +09:00
Yuya Nishihara
b4c837fd4a index: simplify segment_resolve_prefix() loop to make both impls look close 2023-01-27 03:37:44 +09:00
Yuya Nishihara
b9fc6d4203 templater: rewrite divergent property by leveraging IdIndex 2023-01-26 14:10:26 +09:00
Yuya Nishihara
824f2106fd repo: migrate revset::resolve_change_id() to use IdIndex for ReadonlyRepo
The MutableRepo implementation is the same as before.
2023-01-26 14:10:26 +09:00
Yuya Nishihara
4f15d1f779 repo: implement method to look up change_id prefix by using IdIndex
revset::resolve_change_id() for ReadonlyRepo will be replaced with this
implementation. This doesn't mean revset query will speed up. A trivial
query will become slower due to the initialization cost of the change id
index. "jj log -r hex" will get faster since we have to pay the cost anyway.

Benchmark numbers (against my "linux" repo):

Command:
    hyperfine --warmup 3 --runs 20 \
      "jj log -r $hex -T '' --no-commit-working-copy --no-graph"

Linear search (e874570947):
    Time (mean ± σ):     223.9 ms ±  16.2 ms    [User: 181.2 ms, System: 42.7 ms]
    Range (min … max):   207.7 ms … 247.6 ms    50 runs

Building IdIndex:
    Time (mean ± σ):     855.0 ms ±  21.7 ms    [User: 788.4 ms, System: 66.6 ms]
    Range (min … max):   822.6 ms … 927.5 ms    50 runs

Building IdIndex, but hacked to store SmallVec<[u8; 20]>:
    Time (mean ± σ):     406.1 ms ±  15.9 ms    [User: 354.1 ms, System: 52.0 ms]
    Range (min … max):   382.2 ms … 428.6 ms    50 runs

For my "jj" work repo, changes are < ~1ms.
2023-01-26 14:10:26 +09:00
Yuya Nishihara
38a9180bb7 repo: generalize IdIndex over key and value types
Though we'll only need IdIndex<ChangeId, IndexPosition>, this allows us to
write unit tests without setting up MutableIndex.
2023-01-26 14:10:26 +09:00
Martin von Zweigbergk
10725c095f cleanup: update more "checkout" to "working-copy commit" and similar
I've preferred "working-copy commit" over "checkout" for a while
because I think it's clearer, but there were lots of places still
using "checkout". I've left "checkout" in places where it refers to
the action of updating the working copy or the working-copy commit.
2023-01-25 11:02:59 -08:00
Martin von Zweigbergk
37ba17589d simple_op_heads_store: rename storage directory
`SimpleOpHeadsStore` currently stores its files in
`.jj/repo/op_heads/simple_op_heads/`. The `.jj/repo/op_heads/type`
file indicates the type of op-heads backend. If that contains
"simple_op_head_store", we use the `SimpleOpHeadsStore`
backend. There's no need for the `simple_op_heads` directory to also
indicate the type of backend in its name. I kept just the `heads` in
the name to make it less redundant with the parent directory (which is
`op_heads`). We could alternatively call the directory `values` or
similar.
2023-01-25 09:22:38 -08:00
Martin von Zweigbergk
0d1ec835c1 repo: rename .jj/repo/store/backend to .jj/repo/store/type
We decided to call the files identifying the backend type `type`. We
already use that name for `OpStore` and `OpHeadsStore`.
2023-01-25 09:22:38 -08:00
Yuya Nishihara
c018ef229b repo: proxy shortest unique prefix function through RepoRef
Since this function depends on both index and view, it can't be moved to
one of the storage objects. If we go forward with this approach, some
revset::resolve_*() functions will also be migrated to RepoRef.

This patch slightly changes the function name since a "prefix" might have
various meanings.
2023-01-25 10:47:39 +09:00
Yuya Nishihara
c0c5e8f041 repo: rewrite "all()" query to clarify data dependency 2023-01-25 10:47:39 +09:00
Martin von Zweigbergk
ce094c618b repo: propagate error when current working-copy commit is not found
This should fix the panic in the case reported in #1107. It's a bit
hard to reproduce because we normally notice the missing commit when
we snapshot the working copy, but it's possible to reproduce it using
`--no-commit-working-copy`.

I suspect the added test is too brittle because it checks the exact
error message. On the other hand, it might be useful to have one test
case like this so we catch accidental changes in the format.
2023-01-24 12:20:28 -08:00
Martin von Zweigbergk
63aa484046 repo: add a specific error type for MutableRepo::check_out() 2023-01-24 12:20:28 -08:00
Martin von Zweigbergk
eb7de6dd3c repo: inline leave_commit() into single caller 2023-01-24 12:20:28 -08:00
Martin von Zweigbergk
4777508df0 repo: make check_out() call edit()
This reduces duplication a little, and it makes logical sense.
2023-01-24 12:20:28 -08:00
Martin von Zweigbergk
dd3472924b repo: add a specific error type for MutableRepo::edit()
The new type is just an enum version of `RewriteRootCommit`.  I'll add
another variant soon.
2023-01-24 12:20:28 -08:00
Yuya Nishihara
c82a62cf99 repo: turn IdIndex into sorted Vec, use binary search
Since IdIndex is immutable, we don't need fast insertion provided by BTreeMap.
Let's simply use Vec for some speed up. More importantly, this allows us to
store multiple (ChangeId, CommitId) pairs for the same change id, and will
unblock the use of IdIndex in revset::resolve_symbol().
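
A sketch of the idea, using whole-byte prefixes for brevity. The types
below are simplified stand-ins that reuse the `IdIndex` name, and
`from_vec()` and `resolve_prefix()` are illustrative, not the real
signatures:

```rust
#[derive(Debug, PartialEq)]
pub enum PrefixResolution<T> {
    NoMatch,
    SingleMatch(T),
    AmbiguousMatch,
}

/// Immutable lookup table: entries sorted by id, duplicates allowed.
pub struct IdIndex<V> {
    entries: Vec<(Vec<u8>, V)>,
}

impl<V: Clone> IdIndex<V> {
    pub fn from_vec(mut entries: Vec<(Vec<u8>, V)>) -> Self {
        // Stable sort keeps the input order of entries sharing an id.
        entries.sort_by(|a, b| a.0.cmp(&b.0));
        IdIndex { entries }
    }

    /// Binary-searches for the first id that is >= `prefix`, then scans
    /// the contiguous run of ids starting with `prefix`.
    pub fn resolve_prefix(&self, prefix: &[u8]) -> PrefixResolution<Vec<V>> {
        let start = self
            .entries
            .partition_point(|(id, _)| id.as_slice() < prefix);
        let rest = &self.entries[start..];
        let len = rest
            .iter()
            .take_while(|(id, _)| id.starts_with(prefix))
            .count();
        let matches = &rest[..len];
        match matches.first() {
            None => PrefixResolution::NoMatch,
            Some((first_id, _)) if matches.iter().all(|(id, _)| id == first_id) => {
                // One id, possibly with several values attached to it.
                let values = matches.iter().map(|(_, v)| v.clone()).collect();
                PrefixResolution::SingleMatch(values)
            }
            Some(_) => PrefixResolution::AmbiguousMatch,
        }
    }
}
```

Because equal ids sit next to each other in the sorted vector, a
single id can carry several values, which a map keyed by id would not
allow without nesting a collection.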

Some benchmark numbers (against my "linux" repo) follow.

Command:
    hyperfine --warmup 3 "jj log -r master \
      -T 'commit_id.short_prefix_and_brackets()' \
      --no-commit-working-copy --no-graph"

Original:
    Time (mean ± σ):      1.892 s ±  0.031 s    [User: 1.800 s, System: 0.092 s]
    Range (min … max):    1.833 s …  1.935 s    10 runs

This commit:
    Time (mean ± σ):     867.5 ms ±   2.7 ms    [User: 809.9 ms, System: 57.7 ms]
    Range (min … max):   862.3 ms … 871.0 ms    10 runs
2023-01-23 07:38:04 +09:00
Yuya Nishihara
879f585b21 repo: leverage stored index to calculate shortest prefix in commit id space
With my "jj" work repo, this saves ~4ms to show the log with default revset.

Command:
    JJ_CONFIG=/dev/null hyperfine --warmup 3 --runs 100 \
      "jj log -T 'commit_id.short_prefix_and_brackets() \
                  change_id.short_prefix_and_brackets()' \
              --no-commit-working-copy"

Baseline (a7541e1ba4):
    Time (mean ± σ):      54.1 ms ±  16.4 ms    [User: 46.4 ms, System: 7.8 ms]
    Range (min … max):    36.5 ms …  78.1 ms    100 runs

This commit:
    Time (mean ± σ):      49.5 ms ±  16.4 ms    [User: 42.4 ms, System: 7.2 ms]
    Range (min … max):    31.4 ms …  70.9 ms    100 runs
2023-01-22 17:24:03 +09:00
Yuya Nishihara
2e9468772b index: add method to calculate shortest commit_id prefix
For simplicity, I made a public API that returns the shortest length.
2023-01-22 17:24:03 +09:00
Yuya Nishihara
5a0931885d index: add ancestor iterators to CompositeIndex and rewrite loop/recursion
This iterator will be used to merge neighbor commit ids across segments.

resolve_prefix() is simplified to non-short-circuiting loop. I think that's
fine because visiting parents is cheap, and the costly operation here is
segment_resolve_prefix().

entry_by_pos() could also be migrated to iterator, but I leave the unsafe
bits there.
2023-01-22 17:24:03 +09:00
Yuya Nishihara
e71e9c99b2 index: add neighbor commit_id lookup to IndexSegment
ReadonlyIndex implementation leverages the existing binary search
function. MutableIndex one is basically the same as repo::IdIndex.

Shortest prefix length could be calculated for each segment, but I think
returning neighbors is better for testing.
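
Given the two neighbors, the shortest unique prefix length can be
derived as in this sketch. `shortest_unique_prefix_len()` is a
hypothetical helper, `common_hex_len()` is written from scratch here,
and the neighbors are assumed to differ from the id itself:

```rust
/// Number of leading hex digits (half-bytes) shared by two ids.
pub fn common_hex_len(a: &[u8], b: &[u8]) -> usize {
    for (i, (x, y)) in a.iter().zip(b).enumerate() {
        if x != y {
            // The bytes differ in either the high or the low half-byte.
            return if (*x >> 4) != (*y >> 4) { i * 2 } else { i * 2 + 1 };
        }
    }
    a.len().min(b.len()) * 2
}

/// In sorted order, only the previous and next ids can share the
/// longest prefix with `id`, so one digit more than the longest
/// shared prefix is enough to identify it.
pub fn shortest_unique_prefix_len(
    id: &[u8],
    prev: Option<&[u8]>,
    next: Option<&[u8]>,
) -> usize {
    let common = prev
        .into_iter()
        .chain(next)
        .map(|neighbor| common_hex_len(id, neighbor))
        .max()
        .unwrap_or(0);
    common + 1
}
```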
2023-01-22 17:24:03 +09:00
Yuya Nishihara
a7541e1ba4 repo: add workaround for shortest prefix calculation of root ids
This is ugly, but we need a special case because root_change_id and
root_commit_id aren't equal but share the same prefix bytes. In practice,
no one would care for the shortest root id prefix, but we'll need to deal
with a similar problem when migrating prefix id resolution to repo layer.
2023-01-22 12:03:08 +09:00
Yuya Nishihara
1a4b5c5ee6 index: make IdIndex store raw bytes, not hex bytes
This helps us to migrate commit_id index to ReadonlyIndex. For large
repositories, this also reduces initialization cost, but that's not the main
intent of this change.

https://github.com/martinvonz/jj/pull/1041#issuecomment-1399225876

common_hex_len() and iter_half_bytes() are added to backend.rs since more
call sites will be added to index.rs, and I feel index.rs isn't a good place
to host this kind of utility functions.
2023-01-22 12:03:08 +09:00
Yuya Nishihara
65a659347e tests: pad odd-length hex bytes passed in to repo::IdIndex
This allows us to migrate IdIndex to raw bytes. In practice, these ids are
full hashes which should never be odd length.
2023-01-22 12:03:08 +09:00
Yuya Nishihara
1d2642de1e repo: split commit_id and change_id indices
The goal is to replace the commit_id index with ReadonlyIndex to save the
initialization cost, but this also helps to fix root id handling.
2023-01-22 12:03:08 +09:00
Yuya Nishihara
8c0f7d7707 backend: define root change id statically
I made it a free function. Alternatively, the root id could be instantiated
by and obtained through backend, but I don't think we'll need such a level of
abstraction.

I'm going to add a workaround for shortest prefix calculation of the root ids,
where this function will be used.
2023-01-22 12:03:08 +09:00