CommitIds are often manipulated by reference, so this makes the API more
flexible for cases where the caller doesn't already have a Vec or array of
owned CommitIds.
In many cases `rewrite_parents()` does not even need to clone the input
CommitIds. This refactor allows the clone to be avoided if it's unnecessary.
There might be other APIs that would benefit from a similar change. In general,
it seems like there are a lot of places where we're writing
`&[commit_x.id().clone, commit_y.id().clone()]` and similiar.
- [Rust API Guidelines](https://rust-lang.github.io/api-guidelines/flexibility.html#functions-minimize-assumptions-about-parameters-by-using-generics-c-generic)
## Feature Description
If enabled in the user or repository settings, the local branches pointing to the
parents of the revision targeted by `jj commit` will be advanced to the newly
created commit. Support for `jj new` will be added in a future change.
This behavior can be enabled by default for all branches by setting
the following in the config.toml:
```
[experimental-advance-branches]
enabled-branches = ["glob:*"]
```
Specific branches can also be disabled:
```
[experimental-advance-branches]
enabled-branches = ["glob:*"]
disabled-branches = ["main"]
```
Branches that match a disabled pattern will not be advanced, even if they also
match an enabled pattern.
This implements feature request #2338.
It's reasonable for a `WorkingCopy` implementation to want to return
an error. `LocalWorkingCopyFactory` doesn't because it loads all data
lazily. The VFS-based one at Google wants to be able to return an
error, however.
Previously, this command would work:
jj --config-toml='snapshot.max-new-file-size="1"' st
And is equivalent to this:
jj --config-toml='snapshot.max-new-file-size="1B"' st
But this would not work, despite looking like it should:
jj --config-toml='snapshot.max-new-file-size=1' st
This is extremely confusing for users.
This config value is deserialized via serde; and while the `HumanByteSize`
struct allegedly implemented Serde's `visit_u64` method, it was not called by
the deserialize visitor. Strangely, adding an `visit_i64` method *did* work, but
then requires handling of overflow, etc. This is likely because TOML integers
are naturally specified in `i64`.
Instead, just don't bother with any of that; implement a `TryFrom<String>`
instance for `HumanByteSize` that uses `u64::from_str` to try parsing the string
immediately; *then* fall back to `parse_human_byte_size` if that doesn't work.
This not only fixes the behavior but, IMO, is much simpler to reason about; we
get our `Deserialize` instance for free from the `TryFrom` instance.
Finally, this adjusts the test for `max-new-file-size` to now use a raw integer
literal, to ensure it doesn't regress. (There are already in-crate tests for
parsing the human readable strings.)
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: I8dafa2358d039ad1c07e9a512c1d10fed5845738
`jj parallelize` was a good example of a command that can be
simplified by the new API, so I decided to rewrite it as an example.
The rewritten version is more flexible and doesn't actually need the
restrictions from the old version (such as checking that the commits
are connected). I still left the check for now to keep this patch
somewhat small. A subsequent commit will remove the restrictions.
There are several existing commands that would benefit from an API
that makes it easier to rewrite a whole graph of commits while
transforming them in some way.
`jj squash` is one example. When squashing into an ancestor, that
command currently rewrites the ancestor, then rebases descendants, and
then rewrites the rewritten source commit. It would be better to
rewrite the source commit (and any descendants) only once.
Another example is the future `jj fix`. That command will want to
rewrite a graph while updating the trees. There's currently no good
API for that; you have to manually iterate over descendants and
rewrite them.
This patch adds a new `MutableRepo::transform_descendants()` method
that takes a callback which gets a `CommitRewriter` passed to it. The
callback can then decide to change the parents, the tree, etc. The
callback is also free to leave the commit in place or to abandon it.
I updated the regular `rebase_descendants()` to use the new function
in order to exercise it. I hope we can replace all of the
`rebase_descendant_*()` flavors later.
I added a `replace_parent()` method that was a bit useful for the test
case. It could easily be hard-coded in the test case instead, but I
think the method will be useful for `jj git sync` and similar in the
future.
`CommitRewriter` wraps 3 of the arguments, so I think it makes sense
to pass it instead. More importantly, I hope to continue refactoring
so many of the callers already have a `CommitRewriter`.
The new `rebase()` method is meant to be called after deciding on the
new parents (typically by leaving them unchanged). It returns a
`CommitBuilder` for setting any additional values.
There will probably be a `reparent()` method in the future.
This patch adds a struct that's meant to help when rewriting
commits. It contains the old commits and the new parents. I hope to
move most of the logic from `rebase_commit_with_options()` onto it in
coming patches. Then this type can be passed in a callback to make it
easier to do custom rewriting of commits that is currently hard to do
because `rebase_descendants()` does not give the caller any control
over the process.
The helper is similar to `CommmitBuilder`, but it is a bit different
by also embedding information about the source commit, so I don't
think the API would be as convenient if we just used `CommitBuilder`
directly.
Mercurial appears to resolve cwd-relative path first, so "glob:*.c" could be
parsed as "**/*.c" if cwd was literally "**". It wouldn't practically matter,
but isn't correct. Instead, jj's parser first splits glob into literal part
and pattern. That's mainly because we want to parse the user input texts into
type-safe objects, and (RepoPathBuf, glob::Pattern) pairs are the simplest
ones. The current parser can't handle patterns like "foo/*/.." (= "foo" ?),
and errors out. I believe this restriction is acceptable.
Unlike literal paths, the 'glob:' pattern anchors to the whole file path. I
don't think "prefix"-matching glob is useful, and making it the default would
be rather confusing.
Patterns are specified as (dir, pattern) pairs because we need to handle
parse errors prior to constructing a matcher, and it's convenient to split
literal directory paths there.
It's cheap to look up commits again from the cache in `Store` but it
can be expensive to look up commits we didn't end up needing. This
will make it easier to refactor further and be able to cheaply set
preliminary parents for a rewritten commits and then let the caller
update them.
I'm going to add a helper struct to help with rewriting commits. I
want to make that struct own the old commit and the new parents to
simplify lifetimes. This patch prepares for that by passing the
commits by value to `rebase_commit()`.
Running `cargo publish` from a non-colocated repo (such as my usual
repo) is currently quite scary because it uploads all non-hidden
files, even if they're ignored by `.gitignore`
(https://github.com/rust-lang/cargo/issues/2063). I noticed this a
while ago and have always run the command from a fresh clone since
then. To avoid the need for that, let's use the workaround mentioned
on the bug, which is to explicitly list patterns we want to publish.
This prepares for adding glob matcher, which will be backed by
RepoPathTree<Vec<glob::Pattern>>.
FilesNodeKind/PrefixNodeKind are basically boolean types, but implemented as
enums for better code readability.
The is_dir flag will be removed soon. Since FilesMatcher doesn't set is_dir
flag explicitly, is_dir is equivalent to !entries.is_empty(). OTOH,
PrefixMatcher always sets is_dir, so all tree nodes are directories.
Perhaps, I didn't do that because it's important to initialize is_dir/file to
false. Since I'm going to extract a generic map-like API, and is_dir/file will
be an enum, this won't be a problem.
I'm going to extract generic map from RepoPathTree, and .get_visit_sets()
will be inlined into FilesMatcher/PrefixMatcher. These removed tests should
be covered by the corresponding matcher tests.
The functions now depend only on `MutableRepo`, so I think they belong
on that type. This gets us closer to being able to make
`parent_mapping` private again.
I think the recent refactorings (especially 9c382fd8c6) make it
pretty clear that `DescendantRebaser` will not attempt to rebase the
same commit twice, so I think we can remove the assertions. This
removes some of the places where `DescendantRebaser` reaches into
`MutableRepo`'s internals.
The Pijul maintainer has opinions that I don't understand about how we
mention Pijul (they consider the current mentions offensive as
"bashing Pijul"). Let's just remove the references so we don't have to
deal with it. I think the references to Darcs we already had in most
of these places are sufficient.
This implements the other workaround described in 57167cefda "git: on
import_refs(), don't abandon ancestors of newly fetched refs":
> I think there are two ways to fix the problem:
> a. pin non-tracking remote branches just like local refs
> b. pin newly fetched refs in addition to local refs
> This patch implements (b) because it's simpler and more obvious that the
> fetched commits would never be abandoned immediately.
The idea of (a) is that untracked remote branches are independent read-only
refs, and read-only branches shouldn't be rewritten implicitly. Once the
branch gets rewritten or abandoned by user, these remote refs will be hidden,
and won't be pinned anymore.
Since (a) effectively supersedes (b), this patch also removes the original
workaround.
Fixes#3495
If we ever implement some sort of ABI for dynamic extension loading, we'll need these underlying APIs to support multiple extensions, so we might as well do that first.
If this doesn't work out, maybe we can try one of these:
a. fall back to bare file name if expression doesn't contain any operator-like
characters (e.g. "f(x" is an error, but "f x" can be parsed as bare string)
b. introduce command-line flag to opt in (e.g. -e FILESET)
c. introduce pattern prefix to opt in (e.g. set:FILESET)
Closes#3239, #2915, #2286
This command checks not only whether Watchman works, but also whether
it's enabled in the config. Also, the output is easier to understand
than that of the other `jj debug watchman` commands.
It would be nice if `jj debug watchman` called `jj debug watchman
status`, but it's not trivial in `clap` to have a default subcommand.
The primary use case is to warn unmatched paths. I originally thought paths in
negated expressions shouldn't be checked, but doing that seems rather
inconsistent than useful. For example, "~x" in "jj split '~x'" should match at
least one file to split to non-empty revisions.
Since fileset is primarily used in CLI, it's better to avoid inner quoting if
possible. For example, ".." would have to be quoted in the original grammar
derived from the revset.
This patch also adds a stricter version of an identifier rule. If we add a
symbol alias, it will follow the "strict_identifier" rule.
The fileset grammar is basically a stripped-down version of the revset grammar,
with a few adjustments:
* extract function call to "function" rule (like templater)
* inline "symbol" rule (because "identifier" and "string" should be treated
differently at the early parsing stage.)
The parser will have a separate name resolution stage. This will help to do
alias substitution properly. I'll probably rewrite the revset parser in the
same way. It will also help if we want to embed fileset expression in file()
revset.
There are no more callers of parse_function_argument_to_string(), so it's
removed. This function was a thin wrapper of literal parser, and can be
easily reintroduced if needed.
FilesetExpression is similar to RevsetExpression, but there are two major
differences:
- Union is represented as N-ary operator,
- Expression node isn't Rc-ed.
The former is because of the nature of the runtime Matcher objects. It's easier
to construct a Matcher from flattened union expressions than from a binary tree.
The latter choice comes from UnionAll(Vec<FilesetExpression>), which doesn't
have to be Vec<Rc<FilesetExpression>>, and Rc<[FilesetExpression]> can't be
constructed from [Rc<_>, ..]. Anyway, the internal representation may change as
needed.
Another design decision I made is Vec<Pattern(RepoPathBuf)> vs
Pattern(Vec<RepoPathBuf>). I chose the former because it will be more closer
to the parsed tree of the fileset language.
The nightly compiler has several clippy fix-its that, if applied, break the
build. There are various bugs about this, but there isn't enough space in the
margins to detail it all.
Just ignore these on a per-function basis; about 70% of them are just multiple
instances happening inside a single function.
This makes `cargo clippy --workspace --all-targets` run clean, even with the
nightly compiler.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: Ic26a025d3c62b12fbf096171308b56e38f7d1bb9
This will be needed to concatenate patterns of different types (such as
"prefix/dir" exact:"file/path".)
The implementation is basically a copy of IntersectionMatcher, with some
logical adjustments. In Mercurial, unionmatcher supports list of matchers
as input, but I think binary version is good enough.
In order to implement a fileset, we'll need owned variants of these matchers.
We can of course let callers move Box<dyn Matcher> into these adapters, but
we might need to somehow clone Box<dyn Matcher>. So, I simply made adapters
generic.
This function doesn't actually need commits, it only needs their IDs. In some
contexts we may only have commit IDs, so there's no need to require an iterator
of Commits.
This commit also adds a `CommitIteratorExt` that makes it easy to convert an
iterator of `&Commit` to an iterator of `&CommitId`.
Templater doesn't have the one yet, but I think it belongs to the same
category.
For clap::Error, we could use clap's own mechanism to render suggestions as
"tip: ...", but I feel "Hint: ..." looks better because our error/hint message
is capitalized.
I'm going to add RevsetParseError constructor for InvalidFunctionArguments,
with/without a source error, and I don't want to duplicate code for all
combinations. The templater change is just for consistency.
I couldn't find a good naming convention for the builder-like API, so it's
called .with_source(mut self, _). Another option was .source_set(source).
Apparently, it's not uncommon to name consuming constructor as
with_<something>().
Now that we no longer bother to keep the set of heads to add and
remove updated while we rewrite descendants, we can simplify how we
find the set of heads to remove - it's simply all commits that have
been marked rewritten, divergent, or abandoned, i.e. the keys in
`parent_mapping`.
I don't think we have any transactions that mark commit as abandoned
and then later mark it as rewritten or divergent. But if we ever do, I
think it should be considered just rewritten/divergent. So let's
enforce that invariant by removing the old value from the set of
abandoned commits.
This commit moves the parse_string_pattern helper function into the
str_util module in jj lib and adds tests for it.
I'd like to reuse this code in a function defined by `UserSettings`, which is
part of the jj lib crate and cannot use functions from the cli crate.
This makes the summary line more informative. Even though it just duplicates
the message printed later, I think it's easier to follow.
This patch also adjusts some RevsetParseError messages because it seemed
redundant to repeat "revset function", "argument", etc.
Because the CLI error handler now prints error sources in multi-line format,
it doesn't make much sense to render Revset/TemplateParseError differently.
This patch also fixes the source() of the SyntaxError kind. It should be
self.pest_error.source() (= None), not self.pest_error.
I'm going to make TemplateParseError hold RevsetParseError as Box<dyn _>, but
Box<dyn std::error::Error ..> doesn't implement Eq. I could remove Eq from
ErrorKind enums, but it's handly if these enums remain as value types.
This change will also simplify fmt::Display and error::Error impls.
It's common to normalize an empty directory path as ".". This change unblocks
the use of from_relative_path() in edit_sparse().
There are a couple of callers who do to_fs_path(Path::new("")), but they all
translate non-directory paths, which should never be empty.
We currently include the commits in `parent_mapping` and `abandoned`
in the set of commits to visit when rebasing descendants. The reason
was that we used to update branches and working copies when we visited
these commits. Since we started updating refs after rebasing all
commits, there's no need to even visit these commits.