Commit graph

50 commits

Author SHA1 Message Date
Martin von Zweigbergk
05e9149157 revsets: add a non_obsolete_heads() revset
This change adds a `non_obsolete_heads(<set>)` revset, which walks up
ancestors of the input set until it gets to a non-obsolete and
non-pruned commit. That's what we do by default in `jj log`
(i.e. without `--all`). Now we can make `jj log` use revsets and teach
it a `-r` option!
2021-04-18 22:45:10 -07:00
Martin von Zweigbergk
0d62a336af revsets: initial support for Mercurial-style revsets
This patch adds initial support for a DSL for specifying revisions
inspired by Mercurial's "revset" language. The initial support
includes prefix operators ":" (parents) and "*:" (ancestors) with
naive parsing of the revsets. Mercurial uses postfix operator "^" for
parent 1 just like Git does. It uses prefix operator "::" for
ancestors and the same operator as postfix operator for descendants. I
did it differently because I like the idea of using the same operator
as prefix/postfix depending on desired direction, so I wanted to apply
that to parents/children as well (and for
predecessors/successors). The "*" in the "*:" operator is copied from
regular expression syntax. Let's see how it works out. This is an
experimental VCS, after all.

I've updated the CLI to use the new revset support.

The implementation feels a little messy, but you have to start
somewhere...
2021-04-18 21:25:51 -07:00
Martin von Zweigbergk
7861968f64 index: make IndexRef::entry_by_id() etc return entry with repo's lifetime
It's useful to be able to know that given a `repo: RepoRef<'a>`, the
the lifetime of `repo.index().entry_by_id()` will also be `'a`.
2021-04-15 07:00:04 -07:00
Martin von Zweigbergk
4c3d73ff3b evolution: walk orphans using index
This actually seems to make it slightly slower, but it fixes an
important bug (we used to evolve only one topological branch per `jj
evolve` call). The slowdown seemed to be on the order of 5% when
evolving 100 commits on git.git's "what's cooking" branch.
2021-04-14 08:25:14 -07:00
Martin von Zweigbergk
40f75ec641 revsets: don't crash if given non-hex symbol 2021-04-10 10:08:47 -07:00
Martin von Zweigbergk
998e23db3c index: add IndexEntry::parents() and predecessors() returning Vec<IndexEntry> 2021-03-31 14:48:03 -07:00
Martin von Zweigbergk
4b8484e561 rustfmt: configure to group imports 2021-03-14 10:46:25 -07:00
Martin von Zweigbergk
f755c3f740 cleanup: access integer types' MAX constants directly on the type
Using `std::u32::MAX` is deprecated.
2021-03-08 23:18:17 -08:00
Martin von Zweigbergk
1e623bd019 index: update in memory and on disk while resolving operation conflicts
Updating the index on disk means that reader won't have to calculate
the state. Updating it in memory means that we can take advantage of
it while resolving conflicts. We will do that soon.
2021-03-06 23:30:03 -08:00
Martin von Zweigbergk
d1509ffdd4 index: extract a function for adding all commits from a segment
I also restructured the loop in `maybe_squash_with_ancestors()` to
hopefully make it a little clearer.
2021-03-06 20:30:12 -08:00
Martin von Zweigbergk
3fc35288c0 index: remove dir field from ReadonlyIndex and MutableIndex 2021-03-06 10:02:19 -08:00
Martin von Zweigbergk
502ba895f5 index: move ReadonlyFile::associate_with_operation() to IndexStore
After this patch, the `index` module no longer knows about the
".jj/index/operations/" directory; that knowledge is now only in
`IndexStore`.
2021-03-06 10:00:30 -08:00
Martin von Zweigbergk
c4fe7aab10 index: move ReadonlyRepo::load_at_operation() to IndexStore 2021-03-06 09:52:44 -08:00
Martin von Zweigbergk
12bfbc489c index: move ReadonlyIndex::index() to IndexStore 2021-03-06 09:52:38 -08:00
Martin von Zweigbergk
2fdf9721c0 index: move load() to IndexStore 2021-03-06 09:52:27 -08:00
Martin von Zweigbergk
403e86c138 index: introduce IndexStore, which owns ReadonlyIndex files
This patch introduces a new `IndexStore` struct. The idea is that it
will know about the directory in which the index files are stored, the
associations with operations. It may also cache `Arc<ReadonlyIndex>`
instances so if multiple `ReadonlyIndex` instances are loaded, they
can be returned from the cache. That may be useful when merging
operations because the operations are likely to share a large parent
index file. For now, however, all the new type has is `init()`,
`load()`, and `reinit()`.
2021-03-06 09:52:16 -08:00
Martin von Zweigbergk
e2e9fe8f0d index: add stats for number of change ids and pruned commits 2021-03-06 09:50:22 -08:00
Martin von Zweigbergk
bc64cf02c7 index: don't use an all-0 change id in tests
It was weird to have the same change id for all commits. I think that
was a leftover from a me just quickly getting tests to pass.
2021-03-06 09:30:52 -08:00
Martin von Zweigbergk
031a39ecba cleanup: fix lots of issues found in the lib crate by clippy
I had forgotten to pass `--workspace` to clippy all this time :P
2021-02-26 23:15:43 -08:00
Martin von Zweigbergk
d961f61623 evolution: calculate state using index
All the information needed for calculating the evolution state is now
in the index, so let's use it. This speeds up calculation of the
evolution state from 1.53s to 150ms in the git.git repo. In the Linux
repo, it was sped up from 28.9s to 3.07s. That's still unbearably slow
(and still pretty slow in the git.git repo too). We may need to keep a
persistent cache of the evolution state, but that will have to come
later; this improvement is good enough for now.
2021-02-26 21:19:18 -08:00
Martin von Zweigbergk
190899fe76 index: make rev_walk() iterate over IndexEntry instead of CommitId
We have the `IndexEntry` available in the iterator, so let's return it
so the caller doesn't need to look it up themselves.
2021-02-26 16:49:27 -08:00
Martin von Zweigbergk
8be84c345b index: also index change id 2021-02-26 10:33:34 -08:00
Martin von Zweigbergk
3a53a187ff index: add flag indicating pruned commit 2021-02-26 10:33:34 -08:00
Martin von Zweigbergk
d80903ce48 index: also index predecessors
Evolution needs to have fast access to the predecessors. This change
adds that information to the commit index.

Evolution also needs fast access to the change id and the bit saying
whether a commit is pruned. We'll add those soon.

Some tests changed because they previously added commits with
predecessors that were not indexed, which is no longer allowed from
this change. (We'll probably eventually want to allow that again, so
that the user can prune predecessors they no longer care about from
the repo.)
2021-02-26 10:33:34 -08:00
Martin von Zweigbergk
afc59a210a index: make tests more focused, and add tests of octopus merge 2021-02-26 10:33:32 -08:00
Martin von Zweigbergk
2a531832d6 rewrite: make merge_commit_trees() use index for finding common ancestors
The index is now always kept up to date and it has functionality for
finding common ancestors, so let's use it! This should make merging
commits a little faster if their common ancestor is far away (which is
rare). It's probably much more important that the index-based
algorithm is more correct. Also, it returns multiple common ancestors
in the criss-cross case, which lets us do a recursive merge like git
does. I'm leaving the recursive merge for later, though.
2021-02-23 20:49:18 -08:00
Martin von Zweigbergk
bb94516175 index: add support for finding common ancestors
We currently need to read the commit objects for finding common
ancestors. That can be very slow when the common ancestor is far back
in history. This patch adds a function for finding common ancestors
using the index instead.

Unlike the current algorithm, which only returns one common ancestor,
the new index-based one correctly handles criss-cross merges.

Here are some timings for finding the common ancestors in the git.git
repo:

                          |      Without index     |       With Index       |
                          | First run | Subsequent | First run | Subsequent |
v2.30.0-rc0 v2.30.0-rc1   |   5.68 ms |    5.94 us |   40.3 us |    4.77 us |
v2.25.4 v2.26.1           |   1.75 ms |    1.42 us |   13.8 ms |    4.29 ms |
v1.0.0 v2.0.0             |    492 ms |    2.79 ms |   23.4 ms |    6.41 ms |

Finding ancestors of v2.25.4 and v2.26.1 got much slower because the
new algorithm finds all common ancestors. Therefore, it also finds
v2.24.2, v2.23.2, v2.22.3, v2.21.2, v2.20.3, v2.19.4, v2.18.3, and
v2.17.4, which it then filters out because they're all ancestors of
v2.25.3.

Also note that the result was incorrect before, because the old
algorithm would return as soon as it had found a common ancestor, even
if it's not the latest common ancestor. For example, for the common
ancestor between v1.0.0 and v2.0.0, it returned an ancestor of v1.0.0
because it happened to get there by following some side branch that
led there more quickly.

The only place we currently need to find the common ancestor is when
merging trees, which we only do when the user runs `jj merge`, as well
as when operating on existing merge commits (e.g. to diff or rebase
them). That means that this change won't be very noticeable. However,
it's something we clearly want to do sooner or later, so we might as
well get it done.
2021-02-23 17:29:23 -08:00
Martin von Zweigbergk
422d333d4b index: make heads() return result in index order instead of hash order
It's nice to have a non-random order for tests (we can revisit later
if it shows up in profiling). I'm changing the order to be the index
order so the future caller of `heads_pos()` (not `heads()`) will also
get consistent order.
2021-02-23 17:24:55 -08:00
Martin von Zweigbergk
1481935472 index: extract a function for removing ancestors of set based on positions
We already have the `heads()` function, which works on
`CommitId`s. This just extracts a function that works on
positions. I'll use it soon.
2021-02-21 22:28:44 -08:00
Martin von Zweigbergk
62ce5782b5 index: when writing incremental index, squash into parent file if smaller
We currently write a new incremental index file every time. That means
that the stack of index files quickly gets deep, which makes it slow
to read the index. This commit makes it so that we squash the new
index segment into its parent if the parent has fewer commits. That
means we'll limit the number of files to O(log n). Writes time will
also be O(log n) on average.
2021-02-16 23:47:43 -08:00
Martin von Zweigbergk
a51543b752 index: make first level in stats be the root index
I've confused myself a few times already thinking that level 0 is the
root, so that's probably more intuitive. It also makes tests simpler
because the initial part of the list is unchanged when a new
transaction commits.
2021-02-16 23:45:54 -08:00
Martin von Zweigbergk
b122f33312 index: don't write empty incremental index file 2021-02-16 23:45:52 -08:00
Martin von Zweigbergk
a7b6bcfd79 transaction: write incremental index on commit
With this change, we start writing the incremental index to disk, so
the next reader won't have to re-read the commits and create the
index.

As of this change, we simply write a new index file for each
transaction. That will clearly mean that the stack of files gets deep
pretty quickly. For now, the user will have to do `jj debug reindex`
when things get slow. I plan to change it so instead of writing an
incremental index file every time, we first check if the new index
file would have at least as many commits as the parent file, and if it
will, we write a combined one instead. That should apply recursively,
so we'd have O(log n) index files.
2021-02-15 11:03:41 -08:00
Martin von Zweigbergk
86915f0a6f index: fix check for adding existing commit to index
The check for adding an existing commit to the index only checked if
the commit was already in the `MutableIndex`, not if it was already in
the parent `ReadonlyIndex`.
2021-02-15 10:28:18 -08:00
Martin von Zweigbergk
3c832cbbbe index: let index structs keep track of the index directory
This matches how it's done for the other struct (View, WorkingCopy).
2021-02-14 01:03:49 -08:00
Martin von Zweigbergk
b77740e58a index: move function for saving MutableIndex onto the struct 2021-02-14 01:03:49 -08:00
Martin von Zweigbergk
713d32d803 index: keep up to date within transaction
With tons of groundwork done, wee can now finally keep the index up to
date within a transaction! That means that we can start relying on the
index to always be valid, so we can use it e.g. for finding common
ancestors within a transaction. That should help speed up `jj evolve`
immensely on large repos.

We still don't write the updated index to disk when the transaction
closes. That will come later.
2021-02-14 00:58:11 -08:00
Martin von Zweigbergk
f05a12d301 index: make CompositeIndex non-public and add new IndexRef enum instead
We're getting close to finally having a `RepoRef::index()` method.
2021-02-13 13:56:26 -08:00
Martin von Zweigbergk
face4d637f index: define methods from CompositeIndex directly on {Readonly,Mutable}Index
This is one step towards making `CompositeIndex` non-public (and maybe
deleting it). Next, we'll add an `IndexRef` enum similar to `RepoRef`
etc.
2021-02-13 13:46:58 -08:00
Martin von Zweigbergk
8dda4b05e4 index: add "segment_" prefix to methods in IndexSegment
I'm about to move the functions from `CompositeIndex` to an new
`Index` trait implemeted by `ReadonlyIndex` and `MutableIndex`. Since
those types already implement `IndexSegment`, the names would conflict
and it would get annoying to have to disambiguate them. This commit
therefore prepares for that by adding a `segment_` prefix to the
functions in `IndexSegment`.
2021-02-13 13:45:31 -08:00
Martin von Zweigbergk
72aebc9da3 view: replace View trait by enum with Readonly and Mutable variants 2021-02-13 08:31:41 -08:00
Martin von Zweigbergk
f1666375bd repo: replace Repo trait by enum with readonly and mutable variants
I want to keep the index updated within the transaction. I tried doing
that by adding a `trait Index`, implemented by `ReadonlyIndex` and
`MutableIndex`. However, `ReadonlyRepo::index` is of type
`Mutex<Option<Arc<IndexFile>>>` (because it is lazily initialized),
and we cannot get a `&dyn Index` that lives long enough to be returned
from a `Repo::index()` from that. It seems the best solution is to
instead create an `Index` enum (instead of a trait), with one readonly
and one mutable variant. This commit starts the migration to that
design by replacing the `Repo` trait by an enum. I never intended for
there there to be more implementations of `Repo` than `ReadonlyRepo`
and `MutableRepo` anyway.
2021-02-13 08:31:23 -08:00
Martin von Zweigbergk
fa30cf768f index: rename UnsavedIndexData to MutableIndex 2021-02-07 23:35:37 -08:00
Martin von Zweigbergk
8170c06573 index: rename IndexFile to ReadonlyIndex 2021-02-07 23:35:22 -08:00
Martin von Zweigbergk
51373b75ff index: use correct per-level file name in stats (previously always top-level) 2021-02-07 23:34:57 -08:00
Martin von Zweigbergk
9c8f13608f cargo: update blake2
For no reason other than to stay up to date.
2020-12-24 01:15:38 -08:00
Martin von Zweigbergk
14e7df995a index: move static functions from Index to IndexFile and delete it
The Index struct no longer has any state, so it's not needed.
2020-12-18 16:12:45 -08:00
Martin von Zweigbergk
00fb670c9c index: make Index::load() return Arc<IndexFile> instead of Index
This removes one level of indirection, which is nice because it was
visible to the callers. The `Index` struct is now empty. The next step
is obviously to delete it (and perhaps rename `IndexFile` to `Index`
or `ReadonlyIndex`).
2020-12-18 16:12:45 -08:00
Martin von Zweigbergk
af1760b02e index: load index file eagerly in Index::load() now that Repo::index() is lazy 2020-12-18 16:12:45 -08:00
Martin von Zweigbergk
6b1427cb46 import commit 0f15be02bf4012c116636913562691a0aaa7aed2 from my hg repo 2020-12-12 00:23:38 -08:00