Commit graph

1175 commits

Author SHA1 Message Date
Yuya Nishihara
770ca72a1f index: use iterator to simplify segment_resolve_prefix() a bit further 2023-01-27 03:37:44 +09:00
Yuya Nishihara
956a2d5f83 index: remove redundant prefix tests from resolve_prefix functions
The "min" prefix guarantees that the first entry matches the hex prefix
if any. Spotted by @ilyagr.
2023-01-27 03:37:44 +09:00
Yuya Nishihara
b4c837fd4a index: simplify segment_resolve_prefix() loop to make both impls look close 2023-01-27 03:37:44 +09:00
Yuya Nishihara
b9fc6d4203 templater: rewrite divergent property by leveraging IdIndex 2023-01-26 14:10:26 +09:00
Yuya Nishihara
824f2106fd repo: migrate revset::resolve_change_id() to use IdIndex for ReadonlyRepo
The MutableRepo implementation is the same as before.
2023-01-26 14:10:26 +09:00
Yuya Nishihara
4f15d1f779 repo: implement method to look up change_id prefix by using IdIndex
revset::resolve_change_id() for ReadonlyRepo will be replaced with this
implementation. This doesn't mean revset query will speed up. A trivial
query will become slower due to the initialization cost of the change id
index. "jj log -r hex" will get faster since we have to pay the cost anyway.

Benchmark numbers (against my "linux" repo):

Command:
    hyperfine --warmup 3 --runs 20 \
      "jj log -r $hex -T '' --no-commit-working-copy --no-graph"

Linear search (e874570947):
    Time (mean ± σ):     223.9 ms ±  16.2 ms    [User: 181.2 ms, System: 42.7 ms]
    Range (min … max):   207.7 ms … 247.6 ms    50 runs

Building IdIndex:
    Time (mean ± σ):     855.0 ms ±  21.7 ms    [User: 788.4 ms, System: 66.6 ms]
    Range (min … max):   822.6 ms … 927.5 ms    50 runs

Building IdIndex, but hacked to store SmallVec<[u8; 20]>:
    Time (mean ± σ):     406.1 ms ±  15.9 ms    [User: 354.1 ms, System: 52.0 ms]
    Range (min … max):   382.2 ms … 428.6 ms    50 runs

For my "jj" work repo, changes are < ~1ms.
2023-01-26 14:10:26 +09:00
Yuya Nishihara
38a9180bb7 repo: generalize IdIndex over key and value types
Though we'll only need IdIndex<ChangeId, IndexPosition>, this allows us to
write unit tests without setting up MutableIndex.
2023-01-26 14:10:26 +09:00
Martin von Zweigbergk
10725c095f cleanup: update more "checkout" to "working-copy commit" and similar
I've preferred "working-copy commit" over "checkout" for a while
because I think it's clearer, but there were lots of places still
using "checkout". I've left "checkout" in places where it refers to
the action of updating the working copy or the working-copy commit.
2023-01-25 11:02:59 -08:00
Martin von Zweigbergk
37ba17589d simple_op_heads_store: rename storage directory
`SimpleOpHeadsStore` currently stores its files in
`.jj/repo/op_heads/simple_op_heads/`. The `.jj/repo/op_heads/type`
file indicates the type of op-heads backend. If that contains
"simple_op_head_store", we use the `SimpleOpHeadsStore`
backend. There's no need for the `simple_op_heads` directory to also
indicate the type of backend in its name. I kept just the `heads` in
the name to make it less redundant with the parent directory (which is
`op_heads)`. We could alternatively call the directory `values` or
similar.
2023-01-25 09:22:38 -08:00
Martin von Zweigbergk
0d1ec835c1 repo: rename .jj/repo/store/backend to .jj/repo/store/type
We decided to call the files identifying the backend type `type`. We
already use that name for `OpStore` and `OpHeadsStore`.
2023-01-25 09:22:38 -08:00
dependabot[bot]
7f4e714ffe cargo: bump pest_derive from 2.5.3 to 2.5.4
Bumps [pest_derive](https://github.com/pest-parser/pest) from 2.5.3 to 2.5.4.
- [Release notes](https://github.com/pest-parser/pest/releases)
- [Commits](https://github.com/pest-parser/pest/compare/v2.5.3...v2.5.4)

---
updated-dependencies:
- dependency-name: pest_derive
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-01-25 07:49:05 -08:00
Yuya Nishihara
c018ef229b repo: proxy shortest unique prefix function through RepoRef
Since this function depends on both index and view, it can't be moved to
one of the storage objects. If we go forward with this approach, some
revset::resolve_*() functions will also be migrated to RepoRef.

This patch slightly changes the function name since a "prefix" might have
various meanings.
2023-01-25 10:47:39 +09:00
Yuya Nishihara
c0c5e8f041 repo: rewrite "all()" query to clarify data dependency 2023-01-25 10:47:39 +09:00
Martin von Zweigbergk
ce094c618b repo: propagate error when current working-copy commit is not found
This should fix the panic in the case reported in #1107. It's a bit
hard to reproduce because we normally notice the missing commit when
we snapshot the working copy, but it's possible to reproduce it using
`--no-commit-working-copy`.

I suspect the added test is too brittle because it checks the exact
error message. On the other hand, it might be useful to have one test
case like this so we catch accidental changes in the format.
2023-01-24 12:20:28 -08:00
Martin von Zweigbergk
63aa484046 repo: add a specific error type for MutableRepo::check_out() 2023-01-24 12:20:28 -08:00
Martin von Zweigbergk
eb7de6dd3c repo: inline leave_commit() into single caller 2023-01-24 12:20:28 -08:00
Martin von Zweigbergk
4777508df0 repo: make check_out() call edit()
This reduces duplication a little, and it makes logical sense.
2023-01-24 12:20:28 -08:00
Martin von Zweigbergk
dd3472924b repo: add a specific error type for MutableRepo::edit()
The new type is just an enum version of `RewriteRootCommit`.  I'll add
another variant soon.
2023-01-24 12:20:28 -08:00
dependabot[bot]
9dd5fe108a cargo: bump pest from 2.5.3 to 2.5.4
Bumps [pest](https://github.com/pest-parser/pest) from 2.5.3 to 2.5.4.
- [Release notes](https://github.com/pest-parser/pest/releases)
- [Commits](https://github.com/pest-parser/pest/compare/v2.5.3...v2.5.4)

---
updated-dependencies:
- dependency-name: pest
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-01-24 16:50:17 +01:00
Yuya Nishihara
c82a62cf99 repo: turn IdIndex into sorted Vec, use binary search
Since IdIndex is immutable, we don't need fast insertion provided by BTreeMap.
Let's simply use Vec for some speed up. More importantly, this allows us to
store multiple (ChangeId, CommitId) pairs for the same change id, and will
unblock the use of IdIndex in revset::resolve_symbol().

Some benchmark numbers (against my "linux" repo) follow.

Command:
    hyperfine --warmup 3 "jj log -r master \
      -T 'commit_id.short_prefix_and_brackets()' \
      --no-commit-working-copy --no-graph"

Original:
    Time (mean ± σ):      1.892 s ±  0.031 s    [User: 1.800 s, System: 0.092 s]
    Range (min … max):    1.833 s …  1.935 s    10 runs

This commit:
    Time (mean ± σ):     867.5 ms ±   2.7 ms    [User: 809.9 ms, System: 57.7 ms]
    Range (min … max):   862.3 ms … 871.0 ms    10 runs
2023-01-23 07:38:04 +09:00
Yuya Nishihara
879f585b21 repo: leverage stored index to calculate shortest prefix in commit id space
With my "jj" work repo, this saves ~4ms to show the log with default revset.

Command:
    JJ_CONFIG=/dev/null hyperfine --warmup 3 --runs 100 \
      "jj log -T 'commit_id.short_prefix_and_brackets() \
                  change_id.short_prefix_and_brackets()' \
              --no-commit-working-copy"

Baseline (a7541e1ba4):
    Time (mean ± σ):      54.1 ms ±  16.4 ms    [User: 46.4 ms, System: 7.8 ms]
    Range (min … max):    36.5 ms …  78.1 ms    100 runs

This commit:
    Time (mean ± σ):      49.5 ms ±  16.4 ms    [User: 42.4 ms, System: 7.2 ms]
    Range (min … max):    31.4 ms …  70.9 ms    100 runs
2023-01-22 17:24:03 +09:00
Yuya Nishihara
2e9468772b index: add method to calculate shortest commit_id prefix
For simplicity, I made public API that returns the shortest length.
2023-01-22 17:24:03 +09:00
Yuya Nishihara
5a0931885d index: add ancestor iterators to CompositeIndex and rewrite loop/recursion
This iterator will be used to merge neighbor commit ids across segments.

resolve_prefix() is simplified to non-short-circuiting loop. I think that's
fine because visiting parents is cheap, and the costly operation here is
segment_resolve_prefix().

entry_by_pos() could also be migrated to iterator, but I leave the unsafe
bits there.
2023-01-22 17:24:03 +09:00
Yuya Nishihara
e71e9c99b2 index: add neighbor commit_id lookup to IndexSegment
ReadonlyIndex implementation leverages the existing binary search
function. MutableIndex one is basically the same as repo::IdIndex.

Shortest prefix length could be calculated for each segment, but I think
returning neighbors is better for testing.
2023-01-22 17:24:03 +09:00
Yuya Nishihara
a7541e1ba4 repo: add workaround for shortest prefix calculation of root ids
This is ugly, but we need a special case because root_change_id and
root_commit_id aren't equal but share the same prefix bytes. In practice,
no one would care for the shortest root id prefix, but we'll need to deal
with a similar problem when migrating prefix id resolution to repo layer.
2023-01-22 12:03:08 +09:00
Yuya Nishihara
1a4b5c5ee6 index: make IdIndex store raw bytes, not hex bytes
This helps us to migrate commit_id index to ReadonlyIndex. For large
repositories, this also reduces initialization cost, but that's not the main
intent of this change.

https://github.com/martinvonz/jj/pull/1041#issuecomment-1399225876

common_hex_len() and iter_half_bytes() are added to backend.rs since more
call sites will be added to index.rs, and I feel index.rs isn't a good place
to host this kind of utility functions.
2023-01-22 12:03:08 +09:00
Yuya Nishihara
65a659347e tests: pad odd-length hex bytes passed in to repo::IdIndex
This allows us to migrate IdIndex to raw bytes. In practice, these ids are
full hashes which should never be odd length.
2023-01-22 12:03:08 +09:00
Yuya Nishihara
1d2642de1e repo: split commit_id and change_id indices
The goal is to replace the commit_id index with ReadonlyIndex to save the
initialization cost, but this also helps to fix root id handling.
2023-01-22 12:03:08 +09:00
Yuya Nishihara
8c0f7d7707 backend: define root change id statically
I made it a free function. Alternatively, the root id could be instantiated
by and obtained through backend, but I don't think we'll need such level of
abstraction.

I'm going to add a workaround for shortest prefix calculation of the root ids,
where this function will be used.
2023-01-22 12:03:08 +09:00
Yuya Nishihara
ef33bd76df backend: declare CHANGE_ID_HASH_LENGTH as constant 2023-01-22 12:03:08 +09:00
Samuel Tardieu
8b644846a4 refactor: use #[from] on error alternative 2023-01-21 09:46:54 +01:00
Martin von Zweigbergk
8a1b21ff73 backend: implement equality for commits and trees
It can be useful in tests to be able to compare two commits or
trees. Most other structs already implement equality.
2023-01-20 23:26:20 -08:00
Samuel Tardieu
9846bf6c7f style: use bool::then() 2023-01-21 01:14:45 +01:00
dependabot[bot]
034a23f4ec cargo: bump git2 from 0.16.0 to 0.16.1
Bumps [git2](https://github.com/rust-lang/git2-rs) from 0.16.0 to 0.16.1.
- [Release notes](https://github.com/rust-lang/git2-rs/releases)
- [Commits](https://github.com/rust-lang/git2-rs/compare/git2-curl-0.16.0...0.16.1)

---
updated-dependencies:
- dependency-name: git2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-01-21 00:05:57 +00:00
Daniel Ploch
bd43580437 op_heads_store: remove LockedOpHeads
Make op resolution a closed operation, powered by a callback provided by the
caller which runs under an internal lock scope. This allows for greatly
simplifying the internal lifetime structuring.
2023-01-20 15:18:08 -08:00
Daniel Ploch
19962fcded op_heads_store: move the ancestry walk into the OpHeadsStore trait 2023-01-20 15:18:08 -08:00
Yuya Nishihara
763a8cc0f1 index: remove allocation from bisection loop of commit_id prefix search
This wouldn't actually matter since the depth of binary search is O(log(N)),
but I feel it's better to avoid allocation in comparison loop.
2023-01-21 01:44:21 +09:00
Yuya Nishihara
2b7021664d index: deduplicate binary search functions of commit_id lookup 2023-01-21 01:44:21 +09:00
Yuya Nishihara
049a9261ab index: change return type of commit_id_byte_prefix_to_pos()
I don't think the returned position here is an IndexPosition, but a local
"lookup" index of the entry.
2023-01-21 01:44:21 +09:00
Yuya Nishihara
a574987955 index: remove redundant slicing from commit_id-prefix search function
If commit_id[..prefix_len] < prefix, commit_id < prefix is obviously true.
If commit_id[..prefix_len] == prefix, commit_id < prefix returns false. So
slicing isn't needed.

This makes commit_id_byte_prefix_to_pos() basically the same as
segment_commit_id_to_pos(), and these two functions can be merged.
2023-01-21 01:44:21 +09:00
Yuya Nishihara
55dd3a3747 index: do not build hex string to test prefix match, use .as_bytes()
matches() is called from resolve_change_id() loop right now, so it's better to
not allocate String there. Regarding new IdIndex integration, I'll probably make
IdIndex store raw byte ids instead of hexes, and use HexPrefix to look up
range and test prefixes. I think this is basically the same as prefix lookup
in MutableIndex, but I have no idea if we can factor out a common interface.

I made HexPrefix store (Vec<u8>, bool) instead of (Vec<u8>, Option<u8>) so
both min/partial prefixes can be borrowed as slice.
2023-01-19 22:41:29 +09:00
Yuya Nishihara
7e0ba8c002 index: abstract target type of HexPrefix by leveraging ObjectId trait
Another option is HexPrefix<T: ObjectId>, but we might want to build HexPrefix
once, and test it against CommitId and ChangeId.
2023-01-19 22:41:29 +09:00
Michael Forster
27228ce292 Update MSRV to 1.61
This is needed for compatibility with the sapling dag crate.
2023-01-19 10:29:39 +01:00
Martin von Zweigbergk
0f8622dd5c repo: move test_id_index() into a tests module
This is the usual convention (to save on compilation time when not
running tests).
2023-01-18 16:59:16 -08:00
Yuya Nishihara
11c6903786 transaction: remove useless Option wrapping MutableRepo 2023-01-18 09:00:21 -08:00
Martin von Zweigbergk
985555f393 git_backend: avoid redoing some steps when retrying in write_commit()
By inlining `wite_commit_internal()` into `write_commit()`, we can
avoid redoing some steps when we retry. This includes taking the mutex
lock, and reading the tree object and parent commits. It also means
that we avoid cloning the input commit object, which we otherwise
would even in the non-retrying case. I haven't measured if any of this
makes a significant difference, but I think it also slightly
simplifies the code, so it doesn't have to.
2023-01-17 23:12:50 -08:00
Ilya Grigoriev
606eefa8c4 A BTree-based index of commit & change ids to optimize unique_prefix
This is fast enough to be used on medium-sized repositories such as git/git.
It is a bit slow, but bearable, on huge repositories such as torvalds/linux.

There is 0 performance penalty if the display of unique prefixes is disabled

A trie-based implementation will be submitted for consideration in a
follow-up PR. It is faster, but more complicated.

**Update:** I also just discovered https://sapling-scm.com/docs/internals/indexedlog/

There are three important aspects of performance that seemed relevant:

1. Speed of computing the shortest unique prefix per id. It is worlds faster
  than the naive implementation before this commit. It can be optimized
  furher by using a trie or maybe the `fst` crate.

2. Speed of inital loading of the index that happens before the first commit is
  shown. This is the part that's noticeable but bearable on torvalds/linux. 
  
  This could be optimized by storing a sorted list of commit and change ids on
  disk.  This would likely involve reworking the `Index`.

  Failing that, the speed of inital loading doesn't change if a trie is used
  and would likely be worse with the `fst` crate

3. Memory use is unremarkable here. I don't have good tools to measure it
  precisely, but it does not balloon to gigabytes even on the linux repo.
2023-01-17 22:01:09 -08:00
Ilya Grigoriev
e7c434d492 Make ui.unique-prefixes default to brackets 2023-01-17 22:01:09 -08:00
Ilya Grigoriev
67b81a77b8 Config: ui.unique-prefixes to show id shortest unique prefixes
Currently, the possible values are `underscore` and `none`. For now, `none`
is the default, since the `underscore` value messes up copy and pasting of
ids. In the future, an `underline` value should be implemented and will
likely become the default.

Screenshot of `underscore`: https://user-images.githubusercontent.com/4123047/212502483-4119fb17-0601-4335-9770-196e36a6bc31.png
2023-01-17 22:01:09 -08:00
Ilya Grigoriev
19d341d32a Templater: naive implementation of shortest prefix highlight for ids
This creates a templater function `short_underscore_prefix` for commit and
change ids. It is similar to `short` function, but shows one fewer hexadecimal
digit and inserts an underscore after the shortest unique prefix.

Highlighting with an underline and perhaps color/bold will be in a follow-up
PR.

The implementation is quadratic, a simple comparison of each id with every
other id. It is replaced in a subsequent commit. The problem with it is that,
while it works fine for a `jj`-sized repo, it becomes is painfully slow with a
repo the size of git/git. 

Still, this naive implemenation is included here since it's simple, and could
be used as a reference implementation. 

The `shortest_unique_prefix_length` function goes into `repo.rs` since that's
convenient for follow-up commits in this PR to have nicer diffs.
2023-01-17 22:01:09 -08:00