It took a while before I realized that conflicts could be modeled as
simple algebraic expressions with positive and negative terms (they
were modeled as recursive 3-way conflicts initially). We've been
thinking of them that way for a while now, so let's make the
`ConflictPart` name match that model.
When we materialize modify/delete conflicts, we currently don't
include any context lines. That's because modify/delete conflicts have
only two sides, so there's no common base to compare to. Hunks that
are unchanged on the "modify" side are therefore not considered
conflicting, and since they they don't contribute new changes, they're
simply skipped (here:
3dfedf5814/lib/src/files.rs (L228-L230)).
It seems more useful to instead pretend that the missing side is an
empty file. That way we'll get a conflict in the entire file.
We can still decide later to make e.g. `jj resolve` prompt the user on
modify/delete conflicts just like `hg resolve` does (or maybe it
actually happens earlier there, I don't remember).
Closes#1244.
If I understand correctly, the 'revset lifetimes on `Box<dyn
Revset<'index> + 'revset>` are not constrained by the lifetime of a
revset; we don't have any revsets that borrow data from other
revsets. Instead, they're all about constraining a boxed revset to the
index's lifetime. Without the lifetime annotation, it would default to
'static, and the borrow-checker doesn't like `dyn Revset<'index> +
'static`, since the revset could then live longer than the index it
borrows.
It's been about 10 weeks and 730 commits since 0.6.0, compared to
about 7 weeks and 350 commits between 0.5.0 and 0.6.0, so it's time
for a new release. There's been significant user-visible changes and
code-quality improvements. Thanks, everyone!
This is just a little preparation for extracting a `Repo` trait that's
implemented by both `ReadonlyRepo` and `MutableRepo`. The `index()`
function in that trait will of course have to return the same type in
both implementations, and that type will be `&dyn Index`.
Even though we don't know the details yet, we know that we want to
make the index pluggable like the commit and opstore
backends. Defining a trait for it should be a good step. We can refine
the trait later.
By separating the value spaces change ids and commit ids, we can
simplify lookup of a prefix. For example, if we know that a prefix is
for a change id, we don't have to try to find matching commit ids. I
think it might also help new users more quickly understand that change
ids are not commit ids.
This commit is a step towards that separation. It allows resolving
change ids by using hex digits from the back of the alphabet instead
of 0-f, so 'z'='0', 'y'='1', etc, and 'k'='f'. Thanks to @ilyagr for
the idea. The regular hex digits are still allowed.
Supported values are,
- `none` for no author information,
- `full` for both the name and email,
- `name` for just the name,
- `username` for username part of the email,
- (default) `email` (or any other gibberish for that matter) for the full email.
There's a subtle difference between
- 'expression = { whitespace* ... whitespace* }', and
- '_{ whitespace* ~ expression ~ whitespace* }'.
The former includes surrounding whitespace in an "expression", the latter
doesn't. This affects the span of error indication.
The added expect_arguments() is basically a copy from the template_parser.
I'll reimplement it to support keyword arguments, so I don't care much about
the current implementation.
I leave expect_no/one_argument() as wrappers because parsing 0/1 arguments
is pretty common.
Error messages are slightly changed. I personally prefer not to add extra
code for singular/plural handling, but if we do, I'll add 'if N == 1' case.
Our internal backend at Google uses a 32-byte change id, so I'd like
to make the backend able to decide the length. To start with, let's
make the backend able to decide what the root change id should
be. That's consistent with how we already let the backend decide what
the root commit id should be.
The function is currently only about the length of commit IDs, so
let's clarify that. I'm going to add another function for the length
of change IDs next. I don't know if we're going to care about lengths
of other hashes in the future. We might even be able to remove the
current restriction that all commit IDs and all change IDs have the
same length.
I think the CLI currently checks that the backend is not told to write
a merge commit with the root as one parent, but we should not panic if
those checks fail.
Git's HEAD ref is similar to other refs and can logically have
conflicts just like the other refs in `git_refs`. As with the other
refs, it can happen if you run concurrent commands importing two
different updates from Git. So let's treat `git_head` the same as
`git_refs` by making it an `Option<RefTarget>`.
Add a new git.auto-local-branch config option. When set to false, a
remote-tracking branch imported from Git will not automatically create a
local branch target. This is implemented by a new GitSettings struct
that passes Git-related settings from UserSettings.
This behavior is particularly useful in a co-located jj and Git repo,
because a Git remote might have branches that are not of everyday
interest to the user, so it does not make sense to export them as local
branches in Git. E.g. https://github.com/gitster/git, the maintainer's
fork of Git, has 379 branches, most of which are topic branches kept
around for historical reasons, and Git developers wouldn't be expected
to have local branches for each remote-tracking branch.
I don't think there's a good reason not to write the
`.jj/working_copy/tree_state` file on init. Being able to assume that
the file exists means that we won't need the store object to to lazily
load the `TreeState` object. Well, except that `TreeState` keeps an
`Arc<Store>`, but I'm trying to change that.
When building an initial index from an existing Git repo, for example,
we walk parents and predecessors to find all commits to index. Part of
that code was looking up the whole parent and predecessor commits even
though it only needed the ids. I don't know if this has a measurable
impact on performance, but it's not really any more complex to just
get the ids anyway.
I would expect `Commit::is_empty()` to check if the commit is empty in
our usual sense, i.e. that there are no changes compared to the
auto-merged parents. However, it would return `false` for any merge
commit (and for the root commit). Since we only use it in one place,
let's inline it there. The use there does seem reasonable, because
it's about abandoning an "uninteresting" working-copy commit.
I think of it more as style than a format, so using `style` in the
config key makes sense to me.
I didn't bother making upgrades easy by supporting the old name since
this was just released and only a few developers probably have it set.
revset::resolve_change_id() for ReadonlyRepo will be replaced with this
implementation. This doesn't mean revset query will speed up. A trivial
query will become slower due to the initialization cost of the change id
index. "jj log -r hex" will get faster since we have to pay the cost anyway.
Benchmark numbers (against my "linux" repo):
Command:
hyperfine --warmup 3 --runs 20 \
"jj log -r $hex -T '' --no-commit-working-copy --no-graph"
Linear search (e874570947):
Time (mean ± σ): 223.9 ms ± 16.2 ms [User: 181.2 ms, System: 42.7 ms]
Range (min … max): 207.7 ms … 247.6 ms 50 runs
Building IdIndex:
Time (mean ± σ): 855.0 ms ± 21.7 ms [User: 788.4 ms, System: 66.6 ms]
Range (min … max): 822.6 ms … 927.5 ms 50 runs
Building IdIndex, but hacked to store SmallVec<[u8; 20]>:
Time (mean ± σ): 406.1 ms ± 15.9 ms [User: 354.1 ms, System: 52.0 ms]
Range (min … max): 382.2 ms … 428.6 ms 50 runs
For my "jj" work repo, changes are < ~1ms.
I've preferred "working-copy commit" over "checkout" for a while
because I think it's clearer, but there were lots of places still
using "checkout". I've left "checkout" in places where it refers to
the action of updating the working copy or the working-copy commit.
`SimpleOpHeadsStore` currently stores its files in
`.jj/repo/op_heads/simple_op_heads/`. The `.jj/repo/op_heads/type`
file indicates the type of op-heads backend. If that contains
"simple_op_head_store", we use the `SimpleOpHeadsStore`
backend. There's no need for the `simple_op_heads` directory to also
indicate the type of backend in its name. I kept just the `heads` in
the name to make it less redundant with the parent directory (which is
`op_heads)`. We could alternatively call the directory `values` or
similar.
Since this function depends on both index and view, it can't be moved to
one of the storage objects. If we go forward with this approach, some
revset::resolve_*() functions will also be migrated to RepoRef.
This patch slightly changes the function name since a "prefix" might have
various meanings.
This should fix the panic in the case reported in #1107. It's a bit
hard to reproduce because we normally notice the missing commit when
we snapshot the working copy, but it's possible to reproduce it using
`--no-commit-working-copy`.
I suspect the added test is too brittle because it checks the exact
error message. On the other hand, it might be useful to have one test
case like this so we catch accidental changes in the format.
Since IdIndex is immutable, we don't need fast insertion provided by BTreeMap.
Let's simply use Vec for some speed up. More importantly, this allows us to
store multiple (ChangeId, CommitId) pairs for the same change id, and will
unblock the use of IdIndex in revset::resolve_symbol().
Some benchmark numbers (against my "linux" repo) follow.
Command:
hyperfine --warmup 3 "jj log -r master \
-T 'commit_id.short_prefix_and_brackets()' \
--no-commit-working-copy --no-graph"
Original:
Time (mean ± σ): 1.892 s ± 0.031 s [User: 1.800 s, System: 0.092 s]
Range (min … max): 1.833 s … 1.935 s 10 runs
This commit:
Time (mean ± σ): 867.5 ms ± 2.7 ms [User: 809.9 ms, System: 57.7 ms]
Range (min … max): 862.3 ms … 871.0 ms 10 runs
With my "jj" work repo, this saves ~4ms to show the log with default revset.
Command:
JJ_CONFIG=/dev/null hyperfine --warmup 3 --runs 100 \
"jj log -T 'commit_id.short_prefix_and_brackets() \
change_id.short_prefix_and_brackets()' \
--no-commit-working-copy"
Baseline (a7541e1ba4):
Time (mean ± σ): 54.1 ms ± 16.4 ms [User: 46.4 ms, System: 7.8 ms]
Range (min … max): 36.5 ms … 78.1 ms 100 runs
This commit:
Time (mean ± σ): 49.5 ms ± 16.4 ms [User: 42.4 ms, System: 7.2 ms]
Range (min … max): 31.4 ms … 70.9 ms 100 runs
This iterator will be used to merge neighbor commit ids across segments.
resolve_prefix() is simplified to non-short-circuiting loop. I think that's
fine because visiting parents is cheap, and the costly operation here is
segment_resolve_prefix().
entry_by_pos() could also be migrated to iterator, but I leave the unsafe
bits there.
ReadonlyIndex implementation leverages the existing binary search
function. MutableIndex one is basically the same as repo::IdIndex.
Shortest prefix length could be calculated for each segment, but I think
returning neighbors is better for testing.
This is ugly, but we need a special case because root_change_id and
root_commit_id aren't equal but share the same prefix bytes. In practice,
no one would care for the shortest root id prefix, but we'll need to deal
with a similar problem when migrating prefix id resolution to repo layer.
This helps us to migrate commit_id index to ReadonlyIndex. For large
repositories, this also reduces initialization cost, but that's not the main
intent of this change.
https://github.com/martinvonz/jj/pull/1041#issuecomment-1399225876
common_hex_len() and iter_half_bytes() are added to backend.rs since more
call sites will be added to index.rs, and I feel index.rs isn't a good place
to host this kind of utility functions.
I made it a free function. Alternatively, the root id could be instantiated
by and obtained through backend, but I don't think we'll need such level of
abstraction.
I'm going to add a workaround for shortest prefix calculation of the root ids,
where this function will be used.