The VS Code "Better TOML" plugin (which I think most of our VS Code developers use?) doesn't support the `x.y = z` syntax at the top level, even though it's valid TOML.
This is also useful if we ever want to add additional properties in different sub-crates (although unlikely for the near future).
When the main `TreeState::snapshot()` thread doesn't receive any
updated tree entries over the channel, it correctly doesn't write a
new tree. However, it also doesn't write the working copy state file
(`.jj/working_copy/tree_state`). This resulted in a performance
regression in 3f97a6da78. From that commit, repeated snapshotting
would have to re-read all files from disk because it didn't remember
the updated mtime from the previous time.
This patch fixes the bug by also writing the file if there were any
new file states.
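As a rough illustration of the fixed condition, here is a minimal sketch.
The `TreeState`, `FileState`, `SnapshotOutcome`, and `apply_scan()` below are
simplified, hypothetical stand-ins, not the real jj-lib types; the only
point is that the state file is considered dirty when either the tree or
any file state changed.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for one record in the working copy state file.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct FileState {
    pub mtime_millis: i64,
    pub size: u64,
}

/// Hypothetical, simplified tree state (not jj's real `TreeState`).
#[derive(Default)]
pub struct TreeState {
    pub file_states: BTreeMap<String, FileState>,
    pub tree_id: u64,
}

pub struct SnapshotOutcome {
    pub tree_changed: bool,
    /// True if the state file has to be rewritten.
    pub must_save_state: bool,
}

impl TreeState {
    /// Applies the results of one scan. `new_tree_id` is `Some` only if
    /// tree entries changed; `new_states` holds the refreshed file states.
    pub fn apply_scan(
        &mut self,
        new_tree_id: Option<u64>,
        new_states: Vec<(String, FileState)>,
    ) -> SnapshotOutcome {
        let tree_changed = new_tree_id.is_some();
        if let Some(id) = new_tree_id {
            self.tree_id = id;
        }
        let mut states_changed = false;
        for (path, state) in new_states {
            if self.file_states.get(&path) != Some(&state) {
                self.file_states.insert(path, state);
                states_changed = true;
            }
        }
        SnapshotOutcome {
            tree_changed,
            // Save even when only mtimes changed, so the next snapshot
            // doesn't have to re-read the file contents.
            must_save_state: tree_changed || states_changed,
        }
    }
}
```

With this shape, a scan that only sees a `touch`ed file reports an unchanged
tree but still asks for the state file to be written.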
This doesn't seem to make any difference right now, but it will if we
write the state file when there are mtime-only changes, which we
currently don't do.
`revset::parse()` already has a `RevsetWorkspaceContext` argument, so
I think it makes sense to put that and the other context arguments
into a larger `RevsetParseContext` object.
We resolve file paths into repo-relative paths while parsing the
revset expression, so I think it's consistent to also resolve which
workspace "@" refers to while parsing it. That means we won't need the
workspace context both while parsing and while resolving symbols.
In order not to break things like `author("martinvonz@")` (thanks to @yuja
for catching this), I also changed the parsing of working-copy
expressions so they are not allowed to be
quoted. `author(martinvonz@)` will therefore be an error now. That
seems like a small improvement anyway, since we have recently talked
about making `root` and `[workspace]@` not parsed as other symbols.
Per discussion in #2107, I believe "exact" is preferred.
We can also change the default to exact match, but it doesn't always make
sense. Exact match would be useful for branches(), but not for description().
We could define default per predicate function, but I'm pretty sure I cannot
remember which one is which.
git-branchless calls it a substring, so let's do the same.
FWIW, I copied literal:_ from Mercurial, but it's exact:_ in git-branchless.
I have no idea which one is preferred. Since this feature isn't released, we
can freely change it if exact:_ makes more sense.
https://github.com/arxanas/git-branchless/wiki/Reference:-Revsets#patterns
This commit replaces the functions `UserSettings::user_name_placeholder()` and
`UserSettings::user_email_placeholder()` with `const` `&str`s to emphasize that
the placeholder strings must not be changed to support commits without
names or email addresses made before this change.
The code for getting the current tree object was repeated a few times
over. I'm going to soon make it return a `MergedTree` and I don't want
to repeat that code (it's more complicated than the current code).
The syntax is slightly different from Mercurial. In Mercurial, a pattern must
be quoted like "<kind>:<needle>". In JJ, <kind> is a separate parsing node, and
it must not appear in a quoted string. This allows us to report an unknown prefix
as an error.
There's another subtle behavior difference. In Mercurial, branch(unknown) is
an error, whereas our branches(literal:unknown) is resolved to an empty set.
I think erroring out doesn't make sense for JJ since branches() by default
performs substring matching, so its behavior is more like a filter.
The parser abuses DAG range syntax for now. It can be rewritten once we remove
the deprecated x:y range syntax.
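A minimal sketch of the behavior described above, assuming a simplified
`StringPattern` type. The type, `from_kind()`, and `filter_branches()` are
illustrative names, not necessarily the ones in the codebase. The kind is
passed in as an already-parsed node, an unknown kind is an error, and a
literal pattern that matches nothing yields an empty set.

```rust
/// Hypothetical, simplified pattern type; names are illustrative only.
#[derive(Debug, PartialEq, Eq)]
pub enum StringPattern {
    Literal(String),
    Substring(String),
}

impl StringPattern {
    /// `kind` is the separately parsed prefix (`None` when there is none).
    pub fn from_kind(kind: Option<&str>, needle: &str) -> Result<Self, String> {
        match kind {
            // No prefix: fall back to substring matching.
            None => Ok(StringPattern::Substring(needle.to_owned())),
            Some("literal") => Ok(StringPattern::Literal(needle.to_owned())),
            // Because the kind is its own node, a typo can be reported.
            Some(other) => Err(format!("unknown string pattern kind: {other}")),
        }
    }

    pub fn matches(&self, haystack: &str) -> bool {
        match self {
            StringPattern::Literal(s) => haystack == s,
            StringPattern::Substring(s) => haystack.contains(s.as_str()),
        }
    }
}

/// Filters names by pattern; no match is an empty result, not an error.
pub fn filter_branches<'a>(names: &[&'a str], pattern: &StringPattern) -> Vec<&'a str> {
    names.iter().copied().filter(|n| pattern.matches(n)).collect()
}
```

Treating the predicate as a filter keeps the default (substring) and the
literal case consistent with each other.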
Add type annotation to `vec` to avoid the following build error if you
additionally import `bstr`:
```
~/jj> cargo test
Compiling jj-lib v0.8.0 (/home/aspotashev/jj/lib)
warning: unused import: `bstr`
--> lib/src/default_index_store.rs:30:5
|
30 | use bstr;
| ^^^^
|
= note: `#[warn(unused_imports)]` on by default
error[E0282]: type annotations needed
--> lib/src/default_index_store.rs:564:14
|
564 | .as_mut()
| ^^^^^^
565 | .write_u32::<LittleEndian>(parent_overflow.len() as u32)
| --------- type must be known at this point
|
help: try using a fully qualified path to specify the expected types
|
563 | <[u8] as AsMut<T>>::as_mut(&mut buf[parent_overflow_offset..parent_overflow_offset + 4])
| +++++++++++++++++++++++++++++++ ~
For more information about this error, try `rustc --explain E0282`.
warning: `jj-lib` (lib) generated 1 warning
error: could not compile `jj-lib` (lib) due to previous error; 1 warning emitted
```
Reason to support `bstr` being imported: in a Bazel environment where
crates are imported with certain features enabled, jj-lib may pull in
bstr as part of the following dependency chain:
jj-lib -> insta -> similar -> bstr.
We now have all the pieces in place to read the current tree as a
`MergedTree` when snapshotting the working copy. For now, it's still
always a legacy tree. We'll need to update the working copy state file
to support storing multiple trees before we can create a `MergedTree`
with multiple sides here.
For tree-level conflicts, we're going to be getting
`Merge<Option<TreeValue>>` from the current tree and produce a new
such value if the contents change on disk. This commit gets us a little
closer to that by passing in a value of that type into
`write_path_to_store()`.
This seems to have a small but measurable performance
impact. Snapshotting the working copy in the git repo with all files
`touch`ed went from 2.36 s to 2.43 s (3%). I think that's okay,
especially since most files' mtimes rarely change, and we only pay the
price when they have.
If the value at a path hasn't changed, there's no need to send it over
the channel and have the receiver add it to `TreeBuilder`. I couldn't
measure any performance impact.
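Roughly, the idea looks like the sketch below. It uses plain `u64` values as
a stand-in for tree values and a hypothetical `snapshot_changes()` helper;
the real code sends tree values over the tree entry channel to a receiver
that feeds `TreeBuilder`.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

/// Compares on-disk values against the current tree on a scanner thread
/// and sends only the entries that differ. `u64` stands in for a tree value.
pub fn snapshot_changes(
    current_tree: HashMap<String, u64>,
    on_disk: Vec<(String, u64)>,
) -> Vec<(String, u64)> {
    let (tx, rx) = mpsc::channel();
    let scanner = thread::spawn(move || {
        for (path, new_value) in on_disk {
            // Unchanged value: nothing for the receiver to do, so skip it.
            if current_tree.get(&path) != Some(&new_value) {
                tx.send((path, new_value)).unwrap();
            }
        }
        // `tx` is dropped here, which closes the channel.
    });
    // The receiver only ever sees changed entries.
    let mut changes: Vec<(String, u64)> = rx.iter().collect();
    scanner.join().unwrap();
    changes.sort();
    changes
}
```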
Now we should no longer send `TreeValue::Conflict` variants over the
tree entry channel.
When writing tree-level conflicts, we're going to be writing multiple
trees (maybe using some new `MergedTreeBuilder`), so we'll need the
full `Merge<Option<TreeValue>>` object. This gets us closer to that by
sending such objects over the channel and having the receiver write
the conflict object.
Note that we still sometimes send `TreeValue::Conflict` variants over
the channel. That only happens if they're unchanged.
When writing tree-level conflicts, we won't pass `TreeValue::Conflict`
over the `tree_entries` channel. Instead, we're going to pass possibly
unresolved `Merge<Option<TreeValue>>` instances. This commit prepares
for that by changing the type even though we'll only pass
`Merge::normal()` over the channel at this point.
I did this partly to see what the performance impact is. I tested that
by touching all files in the git.git repo to force the trees (and
files) to be rewritten. There was no measurable impact at all
(best-of-10 time was 2.44 s before and 2.40 s after, but I assume that
was a fluke).
This basically means that heads in a filtered graph appear in reverse
chronological order. Before, "jj log -r 'tags()'" in linux-stable repo would
look randomly sorted once you ran "jj debug reindex" in it.
With this change, indexing is more like breadth-first search, and BFS is
known to be bad at rendering a nice graph (because branches run in parallel).
However, we have a post process to group topological branches, so we don't
have this problem. For serialization formats like Mercurial's revlog, iirc,
BFS leads to a bad compression ratio, but our index isn't that kind of data.
Reindexing gets slightly slower, but I think this is negligible.
(in Git repository)
% hyperfine --warmup 3 --runs 10 "jj debug reindex --ignore-working-copy"
(original)
Time (mean ± σ): 1.521 s ± 0.027 s [User: 1.307 s, System: 0.211 s]
Range (min … max): 1.486 s … 1.573 s 10 runs
(new)
Time (mean ± σ): 1.568 s ± 0.027 s [User: 1.368 s, System: 0.197 s]
Range (min … max): 1.531 s … 1.625 s 10 runs
Another idea is to sort heads chronologically and run DFS-based topological
sorting. It's ad-hoc, but worked surprisingly well for my local repositories.
For repositories with lots of long-running branches, this commit will provide
a more predictable result than the DFS-based one.
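To illustrate the kind of ordering discussed here, below is a minimal
sketch of a Kahn-style topological sort that always emits the newest ready
commit first. It is not the actual indexing code: `Commit` and
`topo_order_newest_first()` are hypothetical, and the real index works on
its own entry types.

```rust
use std::collections::{BinaryHeap, HashMap};

/// Hypothetical, minimal commit record for this sketch.
#[derive(Clone, Debug)]
pub struct Commit {
    pub id: u32,
    pub timestamp: i64,
    pub parents: Vec<u32>,
}

/// Returns commit ids so that every commit precedes its parents, and
/// among the commits that are ready, the newest one comes first.
pub fn topo_order_newest_first(commits: &[Commit]) -> Vec<u32> {
    let by_id: HashMap<u32, &Commit> = commits.iter().map(|c| (c.id, c)).collect();
    // Number of not-yet-emitted children for each commit.
    let mut pending_children: HashMap<u32, usize> =
        commits.iter().map(|c| (c.id, 0)).collect();
    for commit in commits {
        for parent in &commit.parents {
            if let Some(count) = pending_children.get_mut(parent) {
                *count += 1;
            }
        }
    }
    // Max-heap keyed by (timestamp, id); heads are the initial entries.
    let mut ready: BinaryHeap<(i64, u32)> = commits
        .iter()
        .filter(|c| pending_children[&c.id] == 0)
        .map(|c| (c.timestamp, c.id))
        .collect();
    let mut order = Vec::with_capacity(commits.len());
    while let Some((_, id)) = ready.pop() {
        order.push(id);
        for parent in &by_id[&id].parents {
            // Parents outside the given set are ignored.
            if let Some(count) = pending_children.get_mut(parent) {
                *count -= 1;
                if *count == 0 {
                    ready.push((by_id[parent].timestamp, *parent));
                }
            }
        }
    }
    order
}
```

Note how two parallel branches get interleaved by timestamp in the output,
which is the BFS-like behavior mentioned above.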
With the new `Merge::iter()`, we can simplify the code a bit by
combining that with `zip`.
I'll simplify the last part of `update_from_content()` next.
Implementing `Iterator` and `FromIterator` on `Merge<T>` provides much
more flexibility than the current `map()`, `try_map()`, etc.
`Merge::from_iter()` wouldn't have a way of failing if it's given an
unexpected (even) number of items. I would be fine with having it
panic, but we can't even usefully do that, because
e.g. `Option::from_iter()` will pass us an iterator that ends early if the
input iterator ends early. For example,
`Merge::resolved(None).iter().collect()` would call
`Merge::from_iter()` with an empty iterator (first item `None`). So, I
instead created a `MergeBuilder` type implementing `FromIterator`, and
let `MergeBuilder::build()` panic if there were an even number of
items.
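A minimal sketch of that design, using a simplified `Merge<T>` that just
stores an odd-length list of terms. The field layout here is an assumption
for illustration and not the real type.

```rust
/// Simplified stand-in for `Merge<T>`: an odd-length list of terms.
#[derive(Debug, PartialEq, Eq, Clone)]
pub struct Merge<T> {
    values: Vec<T>,
}

impl<T> Merge<T> {
    pub fn resolved(value: T) -> Self {
        Merge { values: vec![value] }
    }

    pub fn iter(&self) -> impl Iterator<Item = &T> {
        self.values.iter()
    }
}

/// Collects any number of items; validation is deferred to `build()`.
#[derive(Debug)]
pub struct MergeBuilder<T> {
    values: Vec<T>,
}

impl<T> FromIterator<T> for MergeBuilder<T> {
    fn from_iter<I: IntoIterator<Item = T>>(iter: I) -> Self {
        // Must not panic here: adapters such as `Option::from_iter()`
        // may hand us a truncated (even empty) iterator.
        MergeBuilder { values: iter.into_iter().collect() }
    }
}

impl<T> MergeBuilder<T> {
    /// Panics if the number of collected items is even.
    pub fn build(self) -> Merge<T> {
        assert!(self.values.len() % 2 == 1, "Merge needs an odd number of terms");
        Merge { values: self.values }
    }
}
```

Collecting into `Option<MergeBuilder<T>>` then works as expected: a `None`
term short-circuits to `None` without the builder ever being built.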
I re-implemented some existing `Merge` methods using the new
facilities in this commit. Maybe we should remove some of the methods.
This allows us to reorder commits to be indexed in bulk.
The incremental update optimization is applied only for a single head. This
could be tried for multiple heads, but it's unlikely that every head has
a single new commit for each.
This is similar to what mut_repo.add_head() does.
I'm going to adjust the visiting order so the bulk-imported history preserves
chronological order. It might be a small adjustment on the current DFS
approach, or a new function based on Kahn's algorithm. Either way, it's important
that both "jj git import" and "jj debug reindex" use the same underlying
function.
Almost the entire method deals with `FileType::Normal`, so we can
reduce indentation and repeated matching on the file type by doing it
early and returning in the non-normal-file cases.
For tree-level conflicts, we're eventually not going to have
`ConflictId`. We'd want to make `write_conflict_to_store()` take a
`Merge<Option<TreeValue>>` and return an updated such value. That
would leave very little logic in the function, so let's just inline it
instead.
`update_from_content()` already writes file content for each term of
an unresolved merge, so it seems consistent for it to also write the
file content for resolved merges. I think this should simplify further
refactoring for tree-level conflicts and for preserving the executable
bit.
Since `update_from_content()` only works with file contents and not
the executable or other kinds of paths, I think it makes more sense
for it to deal with `FileId`s instead of `TreeValue`s.
There were still many instances of `conflict` left from before we
renamed `Conflict<T>` to `Merge<T>`. I decided to rename many of them
based on the type parameter instead of the container. I think that
made it more readable in many cases.
I think I moved way too many functions onto `Merge<Option<TreeValue>>`
in 82883e648d. This effectively reverts almost all of that
commit. The `Merge<T>` type is a simple container and it seems like it
should be at a fairly low level in the dependency graph. By moving
functions off of it, we can get rid of the back-dependencies from the
`merge` module to the `conflict` module that I introduced when I moved
`Merge` to the `merge` module. I'm thinking the `conflict` module can
focus on materialized conflicts.
Per discussion in #2009. This behavior isn't affected by e7e49527ef "git:
ensure that remote branches never diverge", but it's subtle enough to write
a test.
I was considering how refs would be imported if we had a per-remote view of
named branches (and tags): Each remote has a view, and jj remembers the last
known view state to compute diffs. That's the same for the pseudo "git" remote.
Under the current storage, these view states are represented as follows:
git_refs["refs/heads/{name}"] # pseudo "git" remote branches
git_refs["refs/tags/{name}"] # pseudo "git" remote tags
git_refs["refs/remotes/{remote}/{name}"] # real remote branches
and the diffs are merged in to branches[name].local_target and tags[name].
We also have branches[name].remote_targets[remote], but I think it's redundant
because a tracking branch should also be the last known state, not something
that can diverge from the actual state. To make that clear, this commit
replaces the use of the "merge" API.
As reported in #1970, SSH authentication would sometimes run into a
loop where it repeatedly tries to use ssh-agent for authentication
without making progress. The problem can be reproduced by simply
removing `$SSH_AUTH_SOCK` from your environment (and not having a Git
credentials helper configured, I think).
This seems to be a bug introduced by b104f8e154c21. That commit meant
to make it so we attempt to use ssh-agent and fall back to using
(password-less) keys after that. The problem is that
`git2::Cred::ssh_key_from_agent()` just returns an object that will be
used later for looking up the credentials from ssh-agent, so the call
will not fail because ssh-agent is not reachable.
This commit attempts to fix the problem by having the credentials
callback attempt to use ssh-agent only once.
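The shape of the fix can be sketched without the real library. The `Cred`
enum and `make_credentials_callback()` below are hypothetical stand-ins and
deliberately not the `git2` API; the sketch only shows a callback that
remembers it has already offered ssh-agent, so a repeated invocation moves
on instead of looping.

```rust
/// Hypothetical stand-in for credential kinds; this is not the `git2` API.
#[derive(Debug, PartialEq, Eq, Clone)]
pub enum Cred {
    SshAgent { username: String },
    SshKeyFile { username: String, path: String },
}

/// Builds a callback that offers ssh-agent on the first call only, then
/// an optional key file once, and finally gives up with an error.
pub fn make_credentials_callback(
    mut key_path: Option<String>,
) -> impl FnMut(&str) -> Result<Cred, String> {
    let mut tried_ssh_agent = false;
    move |username: &str| {
        if !tried_ssh_agent {
            // Creating the agent credential cannot tell us whether the
            // agent is reachable, so remember that we already offered it.
            tried_ssh_agent = true;
            return Ok(Cred::SshAgent { username: username.to_owned() });
        }
        match key_path.take() {
            Some(path) => Ok(Cred::SshKeyFile { username: username.to_owned(), path }),
            None => Err("no more credentials to try".to_owned()),
        }
    }
}
```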
Perhaps the most important invariant in `.jj/working_copy/tree_state`
is that the set of files in it matches the files in its tree. In
particular, if a file that exists in the tree doesn't exist in the
file state and doesn't exist on disk either, we won't notice that it's
gone, and we will therefore not delete it from the tree on future
rounds of snapshotting either.
Now that we process the outputs from the file system traversal by
reading from channels, we can separate the processing from the file
system traversal. When the working copy is unchanged, processing tree
entries and deleted files takes practically no time, but processing
file states and present files takes significant time.
Since `Conflict<T>` can also represent a non-conflict state (a single
term), `Merge<T>` seems like a better name.
Thanks to @ilyagr for the suggestion in
https://github.com/martinvonz/jj/pull/1774#discussion_r1257547709
Sorry about the churn. It would have been better if I had thought of this
name before I introduced `Conflict<T>`.
Summary: There's no need to go around specifying `rust-version` or `edition` or
`version` several times, now that we have a global workspace. Instead, inherit
workspace metadata from the top-level Cargo.toml file.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: Iaf905445978ed2b3377239dcdb8a6c32
Summary: This moves all dependencies across the jj-lib and jj-cli crates into
the top-level Cargo file; with that, we can change each crate instead to just
inherit the workspace version, with the toggled features enabled, by setting
a dependency such as:
dep.workspace = true
in the relevant Cargo.toml file.
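For illustration, the layout looks roughly like the fragment below. The
member paths, the `thiserror` dependency, and the version numbers are
placeholders chosen for the example, not a copy of the real manifests.

```toml
# Top-level Cargo.toml (member paths and versions are placeholders)
[workspace]
members = ["cli", "lib"]

[workspace.dependencies]
thiserror = "1.0"

# lib/Cargo.toml
[package]
name = "jj-lib"
version = "0.8.0"

[dependencies]
# Inline-table form of `thiserror.workspace = true`; both are valid TOML.
thiserror = { workspace = true }
```

The inline-table form and the dotted-key form are equivalent, so either can
be used depending on what local tooling handles best.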
This doesn't actually change any of the build semantics (from what I can tell)
nor the lockfile, and seems to respond normally. There are more cleanups that
can follow.
Two notes:
- Dependabot seems to work fine, based on what I've seen in other repos.
- `cargo add` doesn't seem to know how to add packages to a top-level
`workspace.dependencies` field; instead you can `cargo add -p jj-cli`
and move the entries, at least.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: I307827e5f15c0d8ea8e2a80ec793d3c7
This is simpler than carefully tracking mutation through old/new git refs and
merged local branches. There are two subtle behavior changes:
a. unimported git refs excluded by git_ref_filter() are not pinned.
b. unexported branches are pinned (so fetched deletion doesn't abandon the
branch if it's referenced by another branch.)
I think (a) is okay (and even more correct) since such refs aren't known to jj
yet. (b) is desired.