This commit moves the parse_string_pattern helper function into the
str_util module in jj lib and adds tests for it.
I'd like to reuse this code in a function defined by `UserSettings`, which is
part of the jj lib crate and cannot use functions from the cli crate.
This makes the summary line more informative. Even though it just duplicates
the message printed later, I think it's easier to follow.
This patch also adjusts some RevsetParseError messages because it seemed
redundant to repeat "revset function", "argument", etc.
Because the CLI error handler now prints error sources in multi-line format,
it doesn't make much sense to render Revset/TemplateParseError differently.
This patch also fixes the source() of the SyntaxError kind. It should be
self.pest_error.source() (= None), not self.pest_error.
I'm going to make TemplateParseError hold RevsetParseError as Box<dyn _>, but
Box<dyn std::error::Error ..> doesn't implement Eq. I could remove Eq from
ErrorKind enums, but it's handly if these enums remain as value types.
This change will also simplify fmt::Display and error::Error impls.
It's common to normalize an empty directory path as ".". This change unblocks
the use of from_relative_path() in edit_sparse().
There are a couple of callers who do to_fs_path(Path::new("")), but they all
translate non-directory paths, which should never be empty.
We currently include the commits in `parent_mapping` and `abandoned`
in the set of commits to visit when rebasing descendants. The reason
was that we used to update branches and working copies when we visited
these commits. Since we started updating refs after rebasing all
commits, there's no need to even visit these commits.
We only use `new_commits` in `update_heads()`, so let's calculate it
there. It should also be more correct in case other commits were
created after we initialized `DescendantRebaser`.
Now that we only call `update_references()` in one place, there's no
reason to have it also update `heads_to_add` and `heads_to_remove`. By
moving it out of the function, we can consolidate the logic in one
place.
When `rebase_commit_with_options()` decides to abandons a commit, it
records the new parents in the `MutableRepo`, but it's currently the
caller's responsibility to remember to mark it as abandoned. Let's
move that logic into the function to reduce the risk of future bugs.
By adding the abandoned commit's parents to `parent_mapping`, we can
remove a bit more of the special handling of abandoned commitsin
`DescendantRebaser`.
In the normal case when we don't abandon a commit because it became
empty, then `CommitBuilder::write()` will have recorded the new commit
as a rewrite of the old commit. We don't need to do that again in
`rebase_one()`.
A subset of the state in `DescendantRebaser` now matches exactly what
`MutableRepo` already stores, so we can avoid copying that state and
have `DescendantRebaser` use it directly instead. Having a single
source of truth for the state will enable further simplifications and
improvements.
I'm going to make `DescendantRebaser` share the state about rewritten
commits with `MutableRepo` next. That means that the call to
`rebase_commit_with_options()` will update that state, which would
make this assertion fail. So let's move it a little earlier to avoid
that.
This is just to match `DescendantRebaser`, to make the next commit a
bit simpler. I think `MutableRepo` still has few enough fields that
just `abandoned` is clear enough. Maybe we'll move the three
rewrite-related fields into a new struct at some point.
With this patch, `MutableRepo` has the same tracking of rewritten
commits as `DescendantRebaser`, so we can simply pass that state into
`DescendantRebaser` when we create it. The next step is to remove the
state from `DescendantRebaser`.
I don't think we have any callers left that call
`record_rewritten_commit()` multiple times within a transaction and
expect it to result in divergence. I think we should consider it a bug
to do that.
When rebasing descendants, we generally move branches, child commits,
the working copy to the rewritten commit(s). However, we don't move
the working copy to the new rewritten commit (s) if the old commit had
been abandoned, and we don't move child commits if the rewriten was
divergent.
This patch aims to make it clearer that there's only one mapping from
old to new parents, and that is in `parent_mapping`. It does so by
merging the current `divergent` map into it, and makes the `divergent`
just a set instead. When finding the new parents for a child, we leave
the existing parent if it's in the set.
My longer-term goal is to move `parent_mapping`, `abandoned`, and
`divergent` into `MutableRepo` (maybe in a nested struct), so we can
do some transformations on descendants as we rebase them. By having
the state in a single place (not moving it from `MutableRepo` to
`DescendantRebaser` as we currently do), I hope it will be easier to
write a `MutableRepo::transform_descendants(callback)`, where the
callback gets a `CommitBuilder` and can change parents of the commit,
for example.
Apart from (IMO) looking nicer, this will also sidestep the potential problem
that if the file contains actual jj conflict markers (`>>>>>>>` in the beginning
of a line, for example), jj would currently have trouble materializing and
subsequently parsing conflicts in the file if it actually became conflicted.
I'll demo this bug in either this or a subsequent PR. It's the kind of bug that
sounds serious in theory but might never cause a problem in practice.
After this PR, only `docs/tutorial.md` has a conflict marker that's not indented.
There's only one there, so hopefully it won't be too much of a pain to deal with.
I also indented other strings in `test_conflicts.rs`. IMO, this looks nice and
more consistent with the `insta::assert_snapshot` output. I didn't spend the
time to do the same for `test_resolve_command`.
Suppose we have an alias 'immutable()' = '::immutable_heads()', user can
express (visible) mutable set as '~immutable()'. 'immutable_heads()..' can
terminate early, but a generic difference 'all() & ~immutable()' can't.
Suppose the generation value is usually small, it should be faster to do
bounded range look up first 'y-', then walk ancestors with the unwanted set
'y-..x'.
When an operation is missing and we recover the workspace, we create a
new working-copy commit on top of the desired working-copy commit (per
the available head operation). We then reset the working copy to an
empty tree because it shouldn't really matter much which commit we
reset to. However, when the workspace is sparse, it does matter, as
the test case from the previous patch shows. This patch fixes it by
replacing the `reset_to_empty()` method by a new `recover(&Commit)`,
which effectively resets to the empty tree and then resets to the
commit. That way, any subsequent snapshotting will result keep the
paths from that tree for paths outside the sparse patterns.
Prepares for removing &CompositeIndex from the RevsetGraphIterator struct.
The input iterator will also be changed to position-based.
I've turned self.look_ahead.get().unwrap() into assertion, but it's not super
important here. It's just for sanity that we've mapped missing edges properly.
FWIW, we could say RevsetGraphIterator is an example of iterating *and* testing
membership of the input revset (though the yielded entries are discarded.)
For the same reason as the previous commit. Since self.inner.positions()
basically clones the underlying evaluation tree, there is no reason to stick
to &self lifetime. Perhaps, some of the CLI utility can be changed to not
collect() the iterator.
Migrating iter_graph() requires non-trivial changes, so it will be done
separately.
This allows callers to cache the returned function at 'index lifetime. It's
important in templater. It also means the returned function could be 'static
if the index were Arc<_> and we had a trait interface to achieve that.
Option<Box<dyn ..>> is removed since RevWalk is fused.
This makes the whole evaluation tree 'static, and we can freely move it without
keeping the root RevsetImpl object alive.
Perhaps, "Self: 'a" can be replaced with 'static, but let's leave it for now.
It's not technically wrong to store lifetimed object in InternalRevset.
Prepares for dropping &self lifetime from to_predicate_fn(). All predicate
functions could be wrapped as Box::new(PurePredicateFn(Rc::new(f))) instead, but
I don't think the .clone() cost matters.
This is the step towards removing &CompositeIndex references from the revset
evaluation tree. The filter input is changed from &IndexEntry to IndexPosition
to simplify the lifetime thingy. We might want to pass around CommitId or
Commit object once it's loaded, but that can be implemented later. I don't
see significant performance difference in revset benches.
This serves the same role as templater::Literal. I'm going to add basic
RevWalk adapters so that the revset evaluation tree can be constructed without
capturing the index. EagerRevWalk will help to write tests for these adapters.
Now as default and elided node symbols come from the config, the next logical
step is to use them directly bypassing GraphLog. Note that commands like `jj op
log` and `jj obslog` do not use the elided node symbol at all.
Just a minor code cleanup. We still need Index for &CompositeIndex because the
type is unsized, and unsized type cannot be converted to another dyn reference.
This helps to eliminate higher-ranked trait bounds from RevWalkRevset and
RevWalk combinators to be added. Since &CompositeIndex is now a real reference,
it can be passed to functions as index: &T.
This helps to migrate CompositeIndex<'_> wrapper to &CompositeIndex. If
the wrapped reference had a lifetimed field, it couldn't be represented as
a trivial reference type.
We haven't used custom Git commit headers for two main reasons:
1. I don't want commits created by jj to be different from any other
commits. I don't want Git projects to get annoyed by such commit
and reject them.
2. I've been concerned that tools don't know how to handle such
headers, perhaps even resulting in crashes.
The first argument doesn't apply to commits with conflicts because
such commits would never be accepted by a project whether or not they
use custom commit headers. The second argument is less relevant for
conflicted commits because most tools will be confused by such commits
anyway.
Storing conflict information in commit headers means that we can
transfer them via the regular Git wire protocol. We already include
the tree objects nested inside the root-level tree, so they will also
be transferred.
So, let's start by writing the information redundantly to the commit
header and to the existing storage. That way we can roll it back if we
realize there's a problem with using commit headers.
Initially we were thinking to have `Revset` return something like
`CachedRevset`:
```
pub trait CachedRevset {
fn iter(&self) -> Box<dyn Iterator<Item = Commit>>;
fn contains(&self, &CommitId) -> bool;
}
```
But we weren't sure what use case for `iter` would be, so we dropped the `iter`
method. `CachedRevset` with single `contains` method needed a better name. We
weren't able to come up with one, so we decided instead to have a method on
`Revset` that returns a closure to check if a commit is in a revset.
"for<'index> RevWalk<CompositeIndex<'index>, .." works as of now, but it won't
be composed well. So I'll turn CompositeIndex<'_> into &CompositeIndex in the
next batch, and remove "for<'index>".
This eliminates lifetimed fields from RevWalk objects, and the RevWalk object
will be embedded directly in RevWalkRevset.
This patch adds two separate iterator adapters. They are identical at this
point, but I'm going to add detach/reattach methods only to the borrowed
version. I'm also planning to change CompositeIndex<'_> to &CompositeIndex
to get around higher-ranked trait bound restrictions.
This simplifies the RevWalkIndex API. It would probably add fractional msecs of
overhead per next() call, but I don't see significant difference in revset
benches.
I'm going to make CompositeIndex<'_> detachable from the RevWalk, and
"F: Fn(CompositeIndex) -> Box<dyn Iterator<..>>" of RevWalkRevset<F> will
be replaced with "W: RevWalk<CompositeIndex>". This will simplify the code
structure, but also means that we can no longer apply .take_while() here and
convert it back to RevWalk. Fortunately, ancestors_until_roots() is the only
function I need to reimplement.
It doesn't make sense to build BinaryHeap with intermediate type, and I'm
going to reimplement take_until_roots() in a way that the queue drops
uninteresting items.
The current RevWalk constructors insert intermediate items to BinaryHeap
and convert them as needed. This is redundant, and I'm going to add another
parameter that should be applied to the queue first. That's why I decided
to factor out a builder type. I considered adding a few set of factory
functions that receive all parameters, but they looked messy because most of
the parameters are of [IndexPosition] type.
This patch also adds must_use to the builder and its return types, which are
all iterator-like.
Although watchman client appears to fail at decoding non-UTF-8 path (somewhere
in serde), jj shouldn't panic if watchman could deal with that.
The outer error message "path not in the repo" would sounds odd, but I think
that's okay because 1. it's unlikely that a user input is not UTF-8, and 2.
it's technically correct that a non-UTF-8 path is not contained in the repo.
This should address both use cases:
1. If from_relative_path() is directly called, the error says ".." shouldn't
be included in the (normalized) relative path.
2. If parse_fs_path() is used, the error message contains paths relative to
cwd. #3216
Some of the RevWalk methods could be generalized, but I decided to not try that
for now. I'll probably need to do more cleanup to (hopefully) remove 'index
lifetime from these types.
This requires a code tweak to avoid clippy failures, as `whoami` 1.5.0 has
deprecated the default `hostname()` function.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
This will also provide a better error indication. If write() failed, the child
process would presumably have exited with non-zero status and error message to
stderr.
`cargo doc` complains that two URLs aren't actually links:
```
warning: this URL is not a hyperlink
--> lib/src/fsmonitor.rs:66:6
|
66 | /// (https://facebook.github.io/watchman/). Requires `watchman` to already be
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: use an automatic link instead: `<https://facebook.github.io/watchman/>`
|
= note: bare URLs are not automatically turned into clickable links
= note: `#[warn(rustdoc::bare_urls)]` on by default
warning: `jj-lib` (lib doc) generated 1 warning (run `cargo fix --lib -p jj-lib` to apply 1 suggestion)
Documenting jj-cli v0.14.0 (/Users/emesterhazy/oss/github.com/martinvonz/jj/cli)
Documenting testutils v0.14.0 (/Users/emesterhazy/oss/github.com/martinvonz/jj/lib/testutils)
warning: this URL is not a hyperlink
--> cli/src/cli_util.rs:2077:41
|
2077 | /// To get started, see the tutorial at https://github.com/martinvonz/jj/blob/main/docs/tutorial.md.
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: use an automatic link instead: `<https://github.com/martinvonz/jj/blob/main/docs/tutorial.md.>`
|
= note: bare URLs are not automatically turned into clickable links
= note: `#[warn(rustdoc::bare_urls)]` on by default
warning: `jj-cli` (lib doc) generated 1 warning (run `cargo fix --lib -p jj-cli` to apply 1 suggestion)
```
This commit fixes the warnings by making the watchman URL a hyperlink and by
disabling the lint for the jj-cli error. Disabling the link is the right thing
to do because the comment is captured by clap and printed when `jj --help`
runs and any markdown formatting like `<>` is passed through.
The only thing we need from the `RepoLoader` is the `OpHeadsStore`, so we can
extract it in UnpublishedOperation::new instead of keeping the entire
`RepoLoader` around.
`NewRepoData` is just a container that holds data used to construct a
`ReadonlyRepo`. The `ReaonlyRepo` is always constructed before the
`UnpublishedOperation` is dropped, so we can simply construct the
`ReadonlyRepo` upfront and delete the `NewRepoData` type.
The custom Drop impl prevents us from moving members of UnpublishedOperation,
and is the reason why `NewRepoData` is wrapped in an `Option`. We don't use
custom Drop functions like this for debugging elsewhere in the codebase, and in
some ways #[must_use] provides better protection since it will typically cause
a compiler error if the UnpublishedOperation isn't used.
MutableRepo and CommitBuilder both define public (now crate-public) functions
which should only be called by each other. This commit adds documentation and
restricts visibility of these functions to the jj_lib crate. It might be even
better to move CommitBuilder to the same module as MutableRepo so that these
codependent functions can be private to the module to avoid misuse.
These comments are intended to make it easier for new developers to get up to
speed with the project. This is just a starting point... there are other types
and functions that could benefit from documentation.
There's no need to have a block of code at the beginning of the function to
cache the rewrite source id. We can simply check the necessary condition before
calling record_rewritten_commit.
This tweak makes the function a little easier to read since we don't check the
condition until we're ready to do the work.
This reverts dc074363d1 "no-op: Move external git repo canonicalization into
Workspace::init_git_external." As I said in the PR comment, appending ".git"
is normalization of the user input, which is IMHO more appropriate to be done
in the CLI layer.
This allows us to define documentation comments for types implemented using the
id_type! macro. Comments defined above the type inside the macro will be
captured and visible in generated docs.
Example:
```
id_type!(
/// Stable identifier for a [`Commit`]. Unlike the `CommitId`, the `ChangeId`
/// follows the commit and is not updated when the commit is rewritten.
pub ChangeId
);
```
This commit also adds documentation for the `CommitId` and `ChangeId` types
defined using the `id_type!` macro.
Follows up 7552f939c6 "tests: disable most gpg integration tests on Windows."
I couldn't find this test failing in a few samples before, but it does now.
This removes the special handling of the working-copy commit. By
recording when an empty/emptied commit was abanoned, we rebase
descendants correctly and create a new empty working-copy commit on
top.
This partially reverts changes in a9f489ccdf "Switch to ignore crate for
gitignore handling." Since child ignore object no longer needs to access the
root to resolve the prefix path, it's simpler to store a matcher per node.
With the current implementation, the file3 pattern is set to the prefix
"foo/foo/bar". I don't know if (unrooted) "baz" prefixed with "foo/foo/bar"
should match "foo/bar/baz", but apparently it is. Anyway, that wouldn't be
the case in practice because adjacent .gitignore files shouldn't be loaded.
We do clone Operation object in several places, and I'm going to add one more
.clone() in the templater. Since the underlying metadata has many fields, I
think it's better to wrap it with Arc just like a Commit object.
The default immutable_heads() includes tags(), which makes sense, but computing
heads(tags()) can be expensive because the tags() set is usually sparse. For
example, "jj bench revset 'heads(tags())'" took 157ms in my linux stable
mirror. We can of course optimize the heads evaluation by using bit set or
segmented index, but the query includes many historical heads if the repository
has per-release branches, which are uninteresting anyway. So, this patch
replaces heads(immutable_heads()) with trunk().
The reason we include heads(immutable_heads()) is to mitigate the following
problem. Suppose trunk() is the branch to be based off, I think using trunk()
here is pretty good.
```
A B
*---*----* trunk() ⊆ immutable_heads()
\
* C
```
https://github.com/martinvonz/jj/pull/2247#discussion_r1335078879
In my linux stable mirror, this makes the default log revset evaluation super
fast. immutable_heads(), if configured properly, includes many historical
branch heads which are also the visible heads.
revsets/immutable_heads()..
---------------------------
0 12.27 117.1±0.77m
3 1.00 9.5±0.08m
I'm going to add pre-filtering to the 'roots..heads' evaluation path, and
difference_by() will be used there to calculate 'heads ~ roots'.
Union and intersection iterators are slightly changed so that all iterators
prioritize iter1's item.
This adds a guard to the gpg signing tests which will skip the test if
`gpg` is not installed on the system.
This is done in order to avoid requiring all collaborators to have setup
all the tools on their local machines that are required to test commit
signing.
When doing things like testing snapshot performance differences,
this allows you to turn off the monitor, no matter what the enabled
user or repository configuration has, e.g.
jj st --config-toml='core.fsmonitor="none"'
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Consider this code:
```
struct NoContentHash {}
#[derive(ContentHash)]
enum Hashable {
NoCanHash(NoContentHash),
Empty,
}
```
Before this commit, it generates an error like this:
```
error[E0277]: the trait bound `NoContentHash: ContentHash` is not satisfied
--> lib/src/content_hash.rs:150:10
|
150 | #[derive(ContentHash)]
| ^^^^^^^^^^^ the trait `ContentHash` is not implemented for `NoContentHash`
151 | enum Hashable {
152 | NoCanHash(NoContentHash),
| --------- required by a bound introduced by this call
|
= help: the following other types implement trait `ContentHash`:
bool
i32
i64
u8
u32
u64
std::collections::HashMap<K, V>
BTreeMap<K, V>
and 35 others
For more information about this error, try `rustc --explain E0277`.
```
After this commit, it generates a better error message:
```
error[E0277]: the trait bound `NoContentHash: ContentHash` is not satisfied
--> lib/src/content_hash.rs:152:15
|
152 | NoCanHash(NoContentHash),
| ^^^^^^^^^^^^^ the trait `ContentHash` is not implemented for `NoContentHash`
|
= help: the following other types implement trait `ContentHash`:
bool
i32
i64
u8
u32
u64
std::collections::HashMap<K, V>
BTreeMap<K, V>
and 35 others
For more information about this error, try `rustc --explain E0277`.
error: could not compile `jj-lib` (lib) due to 1 previous error
```
It also works for enum variants with named fields:
```
error[E0277]: the trait bound `NoContentHash: ContentHash` is not satisfied
--> lib/src/content_hash.rs:152:23
|
152 | NoCanHash { named: NoContentHash },
| ^^^^^^^^^^^^^ the trait `ContentHash` is not implemented for `NoContentHash`
|
= help: the following other types implement trait `ContentHash`:
bool
i32
i64
u8
u32
u64
std::collections::HashMap<K, V>
BTreeMap<K, V>
and 35 others
For more information about this error, try `rustc --explain E0277`.
```
This is a no-op in terms of function, but provides a nicer way to derive the
ContentHash trait for structs using the `#[derive(ContentHash)]` syntax used
for other traits such as `Debug`.
This commit only adds the macro. A subsequent commit will replace uses of
`content_hash!{}` with `#[derive(ContentHash)]`.
The new macro generates nice error messages, just like the old macro:
```
error[E0277]: the trait bound `NotImplemented: content_hash::ContentHash` is not satisfied
--> lib/src/content_hash.rs:265:16
|
265 | z: NotImplemented,
| ^^^^^^^^^^^^^^ the trait `content_hash::ContentHash` is not implemented for `NotImplemented`
|
= help: the following other types implement trait `content_hash::ContentHash`:
bool
i32
i64
u8
u32
u64
std::collections::HashMap<K, V>
BTreeMap<K, V>
and 38 others
```
This commit does two things to make proc macros re-exported by jj_lib useable
by deps:
1. jj_lib needs to be able refer to itself as `jj_lib` which it does
by adding an `extern crate self as jj_lib` declaration.
2. jj_lib::content_hash needs to re-export the `digest::Update` type so that
users of jj_lib can use the `#[derive(ContentHash)]` proc macro without
directly depending on the digest crate. This is done by re-exporting it
as `DigestUpdate`.
#3054
It should be useful at least in the presentation layer to know which
operations correspond to working-copy snapshots. They might be
rendered differently in the graph, for example. Or maybe an undo
command wants to warn if you just undid a snapshot operation. This
patch just introduces a field in the metadata to store the
information.
I think the conclusion from #2600 is that at least auto-rebasing
should not simplify merge commits that merge a commit with its
ancestor. Let's start by adding an option for that in the library.