Commit graph

3282 commits

Author SHA1 Message Date
Yuya Nishihara
9f4a7318c7 tests: compare git refs loaded from disk, not in-memory cache values
This addresses the test instability. The underlying problem still exists, but
it's unlikely to trigger user-facing issues because of that. A repo instance
won't be reused after gc() call.

Fixes #3537
2024-04-22 18:46:28 +09:00
Yuya Nishihara
527713a851 tests: fix potential mtime flakiness in git gc tests
Apparently, these gc() invocations rely on the previous "git gc" having packed
all refs, so there are no loose refs whose mtimes could be compared. If there
were new (or remaining) loose refs, the mtime comparison could fail. I also
added +1sec to effectively turn off the keep_newer option, which isn't
important in these tests.
2024-04-22 18:46:28 +09:00
Evan Mesterhazy
f9a3021a7a Simplify calls to CommitRewriter::replace_parents()
Now that it takes `IntoIterator` the caller doesn't need to clone
the input `CommitIds`.
2024-04-21 23:31:17 -04:00
Evan Mesterhazy
2b0aa84c9d CommitRewriter::rewrite_parents(): Take IntoIterator instead of &[CommitId]
CommitIds are often manipulated by reference, so this makes the API more
flexible for cases where the caller doesn't already have a Vec or array of
owned CommitIds.

In many cases `rewrite_parents()` does not even need to clone the input
CommitIds.  This refactor allows the clone to be avoided if it's unnecessary.

There might be other APIs that would benefit from a similar change. In general,
it seems like there are a lot of places where we're writing
`&[commit_x.id().clone(), commit_y.id().clone()]` and similar.

- [Rust API Guidelines](https://rust-lang.github.io/api-guidelines/flexibility.html#functions-minimize-assumptions-about-parameters-by-using-generics-c-generic)
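
A minimal sketch of the pattern described above, assuming a stand-in `CommitId` type and hypothetical function names (`collect_parents`, `parents_from_refs`) that are not taken from jj-lib:

```rust
/// Hypothetical stand-in for jj's `CommitId`; the real type lives in jj-lib.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct CommitId(pub String);

/// Accepts anything that yields owned IDs: a `Vec`, an array, or an
/// iterator that clones borrowed IDs on demand.
pub fn collect_parents(parents: impl IntoIterator<Item = CommitId>) -> Vec<CommitId> {
    parents.into_iter().collect()
}

/// A caller that only holds references can clone lazily, without first
/// building a temporary `Vec` or array of owned IDs.
pub fn parents_from_refs(ids: &[&CommitId]) -> Vec<CommitId> {
    collect_parents(ids.iter().map(|id| (*id).clone()))
}
```

With a slice parameter the caller would have to materialize `&[a.clone(), b.clone()]` first; with `IntoIterator` the same function also accepts `vec![a, b]` by value, so no clone happens when the caller already owns the IDs.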
2024-04-21 23:31:17 -04:00
Evan Mesterhazy
bbd9c7c7cb Implement advance-branches for jj commit
## Feature Description

If enabled in the user or repository settings, the local branches pointing to the
parents of the revision targeted by `jj commit` will be advanced to the newly
created commit. Support for `jj new` will be added in a future change.

This behavior can be enabled by default for all branches by setting
the following in the config.toml:

```
[experimental-advance-branches]
enabled-branches = ["glob:*"]
```

Specific branches can also be disabled:
```
[experimental-advance-branches]
enabled-branches = ["glob:*"]
disabled-branches = ["main"]
```

Branches that match a disabled pattern will not be advanced, even if they also
match an enabled pattern.
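
The precedence rule can be sketched roughly as follows. This is an illustration only: the pattern matching is heavily simplified, and the function names are hypothetical rather than taken from the jj source.

```rust
/// Hypothetical, simplified matching: `glob:*` matches every branch name,
/// `glob:NAME` without wildcards matches only NAME, and a pattern without
/// the `glob:` prefix must equal the branch name exactly.
fn pattern_matches(pattern: &str, branch: &str) -> bool {
    match pattern.strip_prefix("glob:") {
        Some(glob) => glob == "*" || glob == branch,
        None => pattern == branch,
    }
}

/// Returns true if any pattern in the list matches the branch.
fn any_match(patterns: &[&str], branch: &str) -> bool {
    patterns.iter().any(|p| pattern_matches(p, branch))
}

/// A branch is advanced only if it matches an enabled pattern and does
/// not match any disabled pattern; disabled patterns always win.
pub fn should_advance(branch: &str, enabled: &[&str], disabled: &[&str]) -> bool {
    any_match(enabled, branch) && !any_match(disabled, branch)
}
```

Under these assumptions, `should_advance("main", &["glob:*"], &["main"])` is false while `should_advance("feature", &["glob:*"], &["main"])` is true, which mirrors the second config example above.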

This implements feature request #2338.
2024-04-20 10:26:04 -04:00
Martin von Zweigbergk
8bb92fa6fa working_copy: allow load_working_copy() to return error
It's reasonable for a `WorkingCopy` implementation to want to return
an error. `LocalWorkingCopyFactory` doesn't because it loads all data
lazily. The VFS-based one at Google wants to be able to return an
error, however.
2024-04-19 15:22:37 -07:00
Austin Seipp
ddfdf5e357 cli: allow snapshot.max-new-file-size to be a raw u64
Previously, this command would work:

    jj --config-toml='snapshot.max-new-file-size="1"' st

And is equivalent to this:

    jj --config-toml='snapshot.max-new-file-size="1B"' st

But this would not work, despite looking like it should:

    jj --config-toml='snapshot.max-new-file-size=1' st

This is extremely confusing for users.

This config value is deserialized via serde; and while the `HumanByteSize`
struct allegedly implemented Serde's `visit_u64` method, it was not called by
the deserialize visitor. Strangely, adding a `visit_i64` method *did* work, but
then requires handling of overflow, etc. This is likely because TOML integers
are naturally specified in `i64`.

Instead, just don't bother with any of that; implement a `TryFrom<String>`
instance for `HumanByteSize` that uses `u64::from_str` to try parsing the string
immediately; *then* fall back to `parse_human_byte_size` if that doesn't work.
This not only fixes the behavior but, IMO, is much simpler to reason about; we
get our `Deserialize` instance for free from the `TryFrom` instance.

Finally, this adjusts the test for `max-new-file-size` to now use a raw integer
literal, to ensure it doesn't regress. (There are already in-crate tests for
parsing the human readable strings.)
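
A rough sketch of the parsing order described above, using only the standard library. The serde wiring is omitted, and the `parse_human_byte_size` body here is a simplified stand-in with an assumed set of units, not the real implementation:

```rust
use std::convert::TryFrom;
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct HumanByteSize(pub u64);

/// Simplified stand-in for the real helper: digits followed by a unit.
/// The supported units below are an assumption made for this sketch.
fn parse_human_byte_size(v: &str) -> Result<u64, String> {
    let digit_end = v.find(|c: char| !c.is_ascii_digit()).unwrap_or(v.len());
    if digit_end == 0 {
        return Err(format!("missing number in {v:?}"));
    }
    let (digits, unit) = v.split_at(digit_end);
    let num = u64::from_str(digits).map_err(|e| e.to_string())?;
    let multiplier: u64 = match unit.trim() {
        "B" => 1,
        "KiB" => 1 << 10,
        "MiB" => 1 << 20,
        "GiB" => 1 << 30,
        other => return Err(format!("unknown unit {other:?}")),
    };
    num.checked_mul(multiplier)
        .ok_or_else(|| "size overflows u64".to_string())
}

impl TryFrom<String> for HumanByteSize {
    type Error = String;

    fn try_from(value: String) -> Result<Self, Self::Error> {
        // Try a bare integer first, then fall back to the human-readable form.
        match u64::from_str(&value) {
            Ok(bytes) => Ok(HumanByteSize(bytes)),
            Err(_) => parse_human_byte_size(&value).map(HumanByteSize),
        }
    }
}
```

In this sketch both `"1"` and `"1B"` convert to `HumanByteSize(1)`, so the raw and suffixed spellings behave the same.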

Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: I8dafa2358d039ad1c07e9a512c1d10fed5845738
2024-04-19 13:03:24 -05:00
Martin von Zweigbergk
d6b41c18c9 parallelize: rewrite using transform_descendants()
`jj parallelize` was a good example of a command that can be
simplified by the new API, so I decided to rewrite it as an example.

The rewritten version is more flexible and doesn't actually need the
restrictions from the old version (such as checking that the commits
are connected). I still left the check for now to keep this patch
somewhat small. A subsequent commit will remove the restrictions.
2024-04-18 21:06:52 -07:00
Martin von Zweigbergk
e682543570 repo: take owned commit IDs to MutableRepo::new_parents()
We always call `.to_vec()` on the slice, so let's just have the caller
pass in an owned vector instead.
2024-04-18 21:06:52 -07:00
Martin von Zweigbergk
87c65ee0f9 rewrite: make CommitRewriter::replace_parents() remove repeats
2024-04-18 21:06:52 -07:00
Martin von Zweigbergk
96f5ca47d4 repo: add method for transforming descendants, use in rebase_descendants()
There are several existing commands that would benefit from an API
that makes it easier to rewrite a whole graph of commits while
transforming them in some way.

`jj squash` is one example. When squashing into an ancestor, that
command currently rewrites the ancestor, then rebases descendants, and
then rewrites the rewritten source commit. It would be better to
rewrite the source commit (and any descendants) only once.

Another example is the future `jj fix`. That command will want to
rewrite a graph while updating the trees. There's currently no good
API for that; you have to manually iterate over descendants and
rewrite them.

This patch adds a new `MutableRepo::transform_descendants()` method
that takes a callback which gets a `CommitRewriter` passed to it. The
callback can then decide to change the parents, the tree, etc. The
callback is also free to leave the commit in place or to abandon it.

I updated the regular `rebase_descendants()` to use the new function
in order to exercise it. I hope we can replace all of the
`rebase_descendant_*()` flavors later.

I added a `replace_parent()` method that was a bit useful for the test
case. It could easily be hard-coded in the test case instead, but I
think the method will be useful for `jj git sync` and similar in the
future.
2024-04-18 21:06:52 -07:00
Yuya Nishihara
18f94bbb8b cli: suggest root:"<path>" if cwd-relative path is not in workspace
Closes #3216
2024-04-19 09:35:47 +09:00
Martin von Zweigbergk
d38228d0c5 rewrite: move check for unchanged parents onto CommitRewriter
2024-04-18 08:08:51 -07:00
Martin von Zweigbergk
ad1ee2d1d2 rewrite: pass root commits into find_descendants_to_rebase()
I'm going to add another caller that wants to rebase from given roots
instead.
2024-04-18 08:08:51 -07:00
Martin von Zweigbergk
a5e6b1f997 rewrite: inline specialized rebase_commit_with_options() in rebase()
`rebase_commit_with_options()` now does very little, and we don't want
most of it in `rebase()`.
2024-04-18 08:08:51 -07:00
Martin von Zweigbergk
2859277941 rewrite: pass CommitRewriter into rebase_commit_with_options()
`CommitRewriter` wraps 3 of the arguments, so I think it makes sense
to pass it instead. More importantly, I hope to continue refactoring
so many of the callers already have a `CommitRewriter`.
2024-04-18 08:08:51 -07:00
Martin von Zweigbergk
b2993f2b23 rewrite: add rebase() method to CommitRewriter
The new `rebase()` method is meant to be called after deciding on the
new parents (typically by leaving them unchanged). It returns a
`CommitBuilder` for setting any additional values.

There will probably be a `reparent()` method in the future.
2024-04-18 08:08:51 -07:00
Martin von Zweigbergk
b13cb8db26 rewrite: make EmptyBehavior implement Copy
2024-04-18 08:08:51 -07:00
Martin von Zweigbergk
402d94dbd7 rewrite: add a method for simplifying ancestors to CommitRewriter
2024-04-18 08:08:51 -07:00
Martin von Zweigbergk
dc6c7a98d6 rewrite: create a helper type for rewriting commits
This patch adds a struct that's meant to help when rewriting
commits. It contains the old commits and the new parents. I hope to
move most of the logic from `rebase_commit_with_options()` onto it in
coming patches. Then this type can be passed in a callback to make it
easier to do custom rewriting of commits that is currently hard to do
because `rebase_descendants()` does not give the caller any control
over the process.

The helper is similar to `CommitBuilder`, but it is a bit different
by also embedding information about the source commit, so I don't
think the API would be as convenient if we just used `CommitBuilder`
directly.
2024-04-18 08:08:51 -07:00
Ilya Grigoriev
9fa01e0246 lib git.rs: minor simplification, fixup to 62b14e1f
As suggested by @yuja in
https://github.com/martinvonz/jj/pull/3516#discussion_r1568466814
2024-04-17 19:51:57 -07:00
Yuya Nishihara
4474577ceb fileset: parse cwd/root-glob patterns
Mercurial appears to resolve cwd-relative path first, so "glob:*.c" could be
parsed as "**/*.c" if cwd was literally "**". It wouldn't practically matter,
but isn't correct. Instead, jj's parser first splits glob into literal part
and pattern. That's mainly because we want to parse the user input texts into
type-safe objects, and (RepoPathBuf, glob::Pattern) pairs are the simplest
ones. The current parser can't handle patterns like "foo/*/.." (= "foo" ?),
and errors out. I believe this restriction is acceptable.

Unlike literal paths, the 'glob:' pattern anchors to the whole file path. I
don't think "prefix"-matching glob is useful, and making it the default would
be rather confusing.
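
The "split into literal part and pattern" step could look roughly like the sketch below. It works on plain strings instead of `RepoPathBuf` and `glob::Pattern`, and both the function name and the set of glob metacharacters are assumptions made for illustration:

```rust
/// Splits a slash-separated glob into a literal directory prefix and the
/// remaining pattern. The split happens at the first path component that
/// contains a glob metacharacter (assumed here to be `*`, `?`, `[`, `]`).
pub fn split_glob_path(input: &str) -> (String, String) {
    const GLOB_CHARS: &[char] = &['*', '?', '[', ']'];
    // Empty components (from leading, trailing, or doubled slashes) are dropped.
    let components: Vec<&str> = input.split('/').filter(|c| !c.is_empty()).collect();
    let split_at = components
        .iter()
        .position(|c| c.contains(GLOB_CHARS))
        .unwrap_or(components.len());
    (
        components[..split_at].join("/"),
        components[split_at..].join("/"),
    )
}
```

Because the literal prefix is separated before any cwd-relative resolution, a working directory whose name happens to contain glob characters cannot leak into the pattern part.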
2024-04-18 11:09:54 +09:00
Yuya Nishihara
147668cdf2 matchers: add matcher for glob patterns
Patterns are specified as (dir, pattern) pairs because we need to handle
parse errors prior to constructing a matcher, and it's convenient to split
literal directory paths there.
2024-04-18 11:09:54 +09:00
Ilya Grigoriev
62b14e1fa2 lib git.rs: remove workaround for a now-fixed libgit2 bug
https://github.com/libgit2/libgit2/issues/3178 is now fixed.
2024-04-17 12:00:37 -07:00
Martin von Zweigbergk
93baff0b8a rewrite: pass just IDs of new parents into rewrite::rebase*()
It's cheap to look up commits again from the cache in `Store` but it
can be expensive to look up commits we didn't end up needing. This
will make it easier to refactor further and be able to cheaply set
preliminary parents for rewritten commits and then let the caller
update them.
2024-04-17 06:13:54 -07:00
Martin von Zweigbergk
057b7c8d0b rewrite: take commit and new parents by value in rebase_commit()
I'm going to add a helper struct to help with rewriting commits. I
want to make that struct own the old commit and the new parents to
simplify lifetimes. This patch prepares for that by passing the
commits by value to `rebase_commit()`.
2024-04-17 06:13:54 -07:00
Martin von Zweigbergk
dca9c6f884 repo: propagate errors from find_descendants_to_rebase()
2024-04-17 06:13:54 -07:00
Martin von Zweigbergk
8ce099470b cargo: explicitly indicate paths to publish
Running `cargo publish` from a non-colocated repo (such as my usual
repo) is currently quite scary because it uploads all non-hidden
files, even if they're ignored by `.gitignore`
(https://github.com/rust-lang/cargo/issues/2063). I noticed this a
while ago and have always run the command from a fresh clone since
then. To avoid the need for that, let's use the workaround mentioned
on the bug, which is to explicitly list patterns we want to publish.
2024-04-15 20:37:00 -07:00
Martin von Zweigbergk
955c9bf27b jj-lib-proc-macros: add missing LICENSE file
We publish this crate on crates.io, so it should have a LICENSE file.
2024-04-15 20:37:00 -07:00
Yuya Nishihara
7bed5dd222 matchers: turn RepoPathTree into generic map-like type
This prepares for adding glob matcher, which will be backed by
RepoPathTree<Vec<glob::Pattern>>.

FilesNodeKind/PrefixNodeKind are basically boolean types, but implemented as
enums for better code readability.
2024-04-16 10:10:09 +09:00
Yuya Nishihara
10f5540b3b matchers: rewrite RepoPathTree::to_visit_sets() to not depend on is_dir flag
The is_dir flag will be removed soon. Since FilesMatcher doesn't set is_dir
flag explicitly, is_dir is equivalent to !entries.is_empty(). OTOH,
PrefixMatcher always sets is_dir, so all tree nodes are directories.
2024-04-16 10:10:09 +09:00
Yuya Nishihara
e0d5217450 matchers: inline RepoPathTree::get_visit_sets()
2024-04-16 10:10:09 +09:00
Yuya Nishihara
8e196d0025 matchers: simply derive Default for RepoPathTree
Perhaps, I didn't do that because it's important to initialize is_dir/file to
false. Since I'm going to extract a generic map-like API, and is_dir/file will
be an enum, this won't be a problem.
2024-04-16 10:10:09 +09:00
Yuya Nishihara
f92e5b911f matchers: inline RepoPathTree::add_file()
2024-04-16 10:10:09 +09:00
Yuya Nishihara
0153cc1bc7 matchers: remove tests that directly modify RepoPathTree
I'm going to extract generic map from RepoPathTree, and .get_visit_sets()
will be inlined into FilesMatcher/PrefixMatcher. These removed tests should
be covered by the corresponding matcher tests.
2024-04-16 10:10:09 +09:00
Yuya Nishihara
9a83338079 matchers: don't allow dead_code
2024-04-16 10:10:09 +09:00
Martin von Zweigbergk
0bbebaf4f9 rewrite: move calculation of set to rebase to MutableRepo
This lets us make `parent_mapping` private again.
2024-04-15 07:09:12 -07:00
Martin von Zweigbergk
53a0e23759 rewrite: move functions for updating refs to MutableRepo
The functions now depend only on `MutableRepo`, so I think they belong
on that type. This gets us closer to being able to make
`parent_mapping` private again.
2024-04-15 07:09:12 -07:00
Martin von Zweigbergk
f716116249 rewrite: remove unnecessary assertions
I think the recent refactorings (especially 9c382fd8c6) make it
pretty clear that `DescendantRebaser` will not attempt to rebase the
same commit twice, so I think we can remove the assertions. This
removes some of the places where `DescendantRebaser` reaches into
`MutableRepo`'s internals.
2024-04-15 07:09:12 -07:00
Martin von Zweigbergk
656250d6d0 rewrite: pass UserSettings into update_all_references()
With this change, `update_all_references()` only uses `self` to get to
`mut_repo`. I'll move the function onto `MutableRepo` next.
2024-04-15 07:09:12 -07:00
Martin von Zweigbergk
750002594e rewrite: inline and rewrite ref_target_update()
I rewrote `old_target` and `new_target` to more accurately represent
the change; the old target should be a normal (singleton) ref.
2024-04-15 07:09:12 -07:00
Martin von Zweigbergk
f696f5b727 rewrite: leverage root_id() helper on commit object
2024-04-15 07:09:12 -07:00
Martin von Zweigbergk
0525dc9d86 politics: delete references to Pijul
The Pijul maintainer has opinions that I don't understand about how we
mention Pijul (they consider the current mentions offensive as
"bashing Pijul"). Let's just remove the references so we don't have to
deal with it. I think the references to Darcs we already had in most
of these places are sufficient.
2024-04-14 13:16:08 -07:00
Yuya Nishihara
aaa2025dfc git: on fetch, pin visible untracked remote refs
This implements the other workaround described in 57167cefda "git: on
import_refs(), don't abandon ancestors of newly fetched refs":

> I think there are two ways to fix the problem:
>  a. pin non-tracking remote branches just like local refs
>  b. pin newly fetched refs in addition to local refs
> This patch implements (b) because it's simpler and more obvious that the
> fetched commits would never be abandoned immediately.

The idea of (a) is that untracked remote branches are independent read-only
refs, and read-only branches shouldn't be rewritten implicitly. Once the
branch gets rewritten or abandoned by user, these remote refs will be hidden,
and won't be pinned anymore.

Since (a) effectively supersedes (b), this patch also removes the original
workaround.

Fixes #3495
2024-04-14 11:38:21 +09:00
dploch
57a5d7dd64 cli_util: support multiple extensions consistently
If we ever implement some sort of ABI for dynamic extension loading, we'll need these underlying APIs to support multiple extensions, so we might as well do that first.
2024-04-12 14:07:33 -04:00
Yuya Nishihara
30984dae4a cli: if enabled, parse path arguments as fileset expressions
If this doesn't work out, maybe we can try one of these:
 a. fall back to bare file name if expression doesn't contain any operator-like
    characters (e.g. "f(x" is an error, but "f x" can be parsed as bare string)
 b. introduce command-line flag to opt in (e.g. -e FILESET)
 c. introduce pattern prefix to opt in (e.g. set:FILESET)

Closes #3239, #2915, #2286
2024-04-12 11:36:40 +09:00
Ilya Grigoriev
8fa256ebac New jj debug watchman status command
This command checks not only whether Watchman works, but also whether
it's enabled in the config. Also, the output is easier to understand
than that of the other `jj debug watchman` commands.

It would be nice if `jj debug watchman` called `jj debug watchman
status`, but it's not trivial in `clap` to have a default subcommand.
2024-04-11 10:55:59 -07:00
Yuya Nishihara
33beb8d456 fileset: add recursive iterator over explicit paths
The primary use case is to warn about unmatched paths. I originally thought
paths in negated expressions shouldn't be checked, but doing that seems more
inconsistent than useful. For example, "~x" in "jj split '~x'" should match at
least one file to split to non-empty revisions.
2024-04-11 00:51:19 +09:00
Yuya Nishihara
57b423e3d7 fileset: relax identifier rule to accept more path-like strings
Since fileset is primarily used in CLI, it's better to avoid inner quoting if
possible. For example, ".." would have to be quoted in the original grammar
derived from the revset.

This patch also adds a stricter version of an identifier rule. If we add a
symbol alias, it will follow the "strict_identifier" rule.
2024-04-09 20:42:09 +09:00
Yuya Nishihara
653173abad fileset: implement name resolution stage, add all()/none() functions
#3239
2024-04-09 20:42:09 +09:00
Yuya Nishihara
9c28fe954c fileset: add grammar and implement parser (without name resolution)
The fileset grammar is basically a stripped-down version of the revset grammar,
with a few adjustments:

 * extract function call to "function" rule (like templater)
 * inline "symbol" rule (because "identifier" and "string" should be treated
   differently at the early parsing stage.)

The parser will have a separate name resolution stage. This will help to do
alias substitution properly. I'll probably rewrite the revset parser in the
same way. It will also help if we want to embed fileset expression in file()
revset.
2024-04-09 20:42:09 +09:00
Yuya Nishihara
521bcd81ab dsl_util: deduplicate collect_similar() from revset and templater
For convenience, sort and dedup are done by collect_similar().
2024-04-09 20:42:09 +09:00
Evan Mesterhazy
379849b4b8 Fix documentation for RevsetExpression::ancestors_at and descendants_at
The current documentation is wrong. This is a follow up for
https://github.com/martinvonz/jj/pull/3461#discussion_r1555011289
2024-04-08 08:29:14 -04:00
Yuya Nishihara
73508730aa revset: rewrite identifier rule in common infix-op rule pattern
I don't remember why I made it defined recursively, but it's basically the
same as "primary ~ (infix_op ~ primary)*" rule.
2024-04-08 00:37:25 +09:00
Yuya Nishihara
7f1f73b0fa revset: move whitespace rule to top
The whitespace rule is a bit special, and it seemed weird that the rule is
defined between literals and operator tokens.
2024-04-08 00:37:25 +09:00
Yuya Nishihara
c8f93c50fc revset: remove redundant Result<..> from parse_symbol_rule_as_literal()
2024-04-08 00:37:25 +09:00
Yuya Nishihara
d442cd872f revset: backport \-escapes parsing from templater
2024-04-08 00:37:25 +09:00
Yuya Nishihara
d1ae2d72c8 revset: rename Rule::literal_string to string_literal
2024-04-08 00:37:25 +09:00
Yuya Nishihara
274183fa66 dsl_util: extract helper that parses string literal with \-escapes
The top-level assertion is removed since it's now obvious that the pair
represents a Rule::string_literal.
2024-04-08 00:37:25 +09:00
Yuya Nishihara
8b32a8a916 revset: add support for file(kind:pattern) syntax
There are no more callers of parse_function_argument_to_string(), so it's
removed. This function was a thin wrapper of literal parser, and can be
easily reintroduced if needed.
2024-04-07 19:43:29 +09:00
Yuya Nishihara
850887cf09 fileset: add basic pattern parsing functions
Naming convention is described in FilePattern::from_str_kind(). It's based
on Mercurial's pattern prefixes, but hopefully fixes some inconsistencies.
https://github.com/martinvonz/jj/issues/2915#issuecomment-1956401114

#3239
2024-04-07 19:43:29 +09:00
Yuya Nishihara
3c1d485452 revset: extract function that handles kind:"value" pattern syntax
I also removed comment about the error span. It's unclear whether the kind
was invalid or the value had syntax error.
2024-04-07 19:43:29 +09:00
Yuya Nishihara
47150d2bb4 revset: migrate file() predicate to be based on FilesetExpression
2024-04-06 23:59:54 +09:00
Yuya Nishihara
3e029537c6 fileset: add basic AST-level object and matcher builder
FilesetExpression is similar to RevsetExpression, but there are two major
differences:
 - Union is represented as N-ary operator,
 - Expression node isn't Rc-ed.
The former is because of the nature of the runtime Matcher objects. It's easier
to construct a Matcher from flattened union expressions than from a binary tree.
The latter choice comes from UnionAll(Vec<FilesetExpression>), which doesn't
have to be Vec<Rc<FilesetExpression>>, and Rc<[FilesetExpression]> can't be
constructed from [Rc<_>, ..]. Anyway, the internal representation may change as
needed.

Another design decision I made is Vec<Pattern(RepoPathBuf)> vs
Pattern(Vec<RepoPathBuf>). I chose the former because it will be closer
to the parsed tree of the fileset language.
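
A small sketch of the N-ary union idea, with strings standing in for real path patterns. Only `UnionAll(Vec<FilesetExpression>)` comes from the description above; the `None` variant, the `union_all` constructor, and the exact-match `matches` method are assumptions made for illustration:

```rust
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum FilesetExpression {
    /// Matches nothing (assumed variant for this sketch).
    None,
    /// Stand-in for a real path pattern; matches one exact path.
    Pattern(String),
    /// Union as an N-ary node: children are owned directly, not `Rc`-ed.
    UnionAll(Vec<FilesetExpression>),
}

impl FilesetExpression {
    /// Builds a union, dropping `None` operands and splicing in the
    /// children of operands that are themselves `UnionAll` nodes
    /// (one level of flattening).
    pub fn union_all(expressions: Vec<FilesetExpression>) -> Self {
        let mut flat = Vec::new();
        for expr in expressions {
            match expr {
                FilesetExpression::None => {}
                FilesetExpression::UnionAll(inner) => flat.extend(inner),
                other => flat.push(other),
            }
        }
        match flat.len() {
            0 => FilesetExpression::None,
            1 => flat.pop().unwrap(),
            _ => FilesetExpression::UnionAll(flat),
        }
    }

    /// Evaluates the expression against a path.
    pub fn matches(&self, path: &str) -> bool {
        match self {
            FilesetExpression::None => false,
            FilesetExpression::Pattern(p) => p == path,
            FilesetExpression::UnionAll(exprs) => exprs.iter().any(|e| e.matches(path)),
        }
    }
}
```

A flat list of operands like this is easy to hand to a matcher builder in one pass, which is the motivation given above for preferring it over a binary tree.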
2024-04-06 23:59:54 +09:00
Yuya Nishihara
7acfab695a matchers: impl custom Debug for RepoPathTree to get stable and concise output
The default Debug output entries aren't sorted by name, which was inconvenient
while writing snapshot tests.
2024-04-06 23:59:54 +09:00
Yuya Nishihara
c9b21a16be matchers: require Matcher to be Debug
This helps to write snapshot tests.
2024-04-06 23:59:54 +09:00
Yuya Nishihara
f3485c9efb repo_path: make Debug formatting of RepoPathComponent less verbose
Since RepoPath is formatted as a string, it should be okay for RepoPathComponent
to do the same.
2024-04-06 23:59:54 +09:00
Yuya Nishihara
1134dc159e repo_path: use write!() macro to implement Debug
2024-04-06 23:59:54 +09:00
Yuya Nishihara
0b833ea9c0 repo_path: qualify fmt::Error, use fmt::Result for short
"Error" is a super common type name, so I think it's better not to pollute the
namespace with a very specific Error type.
2024-04-06 23:59:54 +09:00
Ilya Grigoriev
93cebcd0c0 protos: cargo update prost prost-builder and regenerate protobufs
2024-04-05 16:56:20 -07:00
Austin Seipp
4b45dde8c6 clippy: disable bogus lints for nightly clippy
The nightly compiler has several clippy fix-its that, if applied, break the
build. There are various bugs about this, but there isn't enough space in the
margins to detail it all.

Just ignore these on a per-function basis; about 70% of them are just multiple
instances happening inside a single function.

This makes `cargo clippy --workspace --all-targets` run clean, even with the
nightly compiler.

Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: Ic26a025d3c62b12fbf096171308b56e38f7d1bb9
2024-04-05 11:39:29 -05:00
Yuya Nishihara
a364310b56 matchers: add binary UnionMatcher
This will be needed to concatenate patterns of different types (such as
"prefix/dir" exact:"file/path".)

The implementation is basically a copy of IntersectionMatcher, with some
logical adjustments. In Mercurial, unionmatcher supports list of matchers
as input, but I think binary version is good enough.
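
A simplified sketch of a binary union combinator that is generic over its inputs. The `Matcher` trait here has a single `matches` method on string paths; the real trait has more to it (for example directory visiting), and `PrefixMatcher` and `ExactMatcher` below are reduced stand-ins written for this example:

```rust
/// Reduced stand-in for the matcher trait: only answers "does this path match?".
pub trait Matcher {
    fn matches(&self, path: &str) -> bool;
}

/// Matches a directory and everything below it.
pub struct PrefixMatcher(pub String);

impl Matcher for PrefixMatcher {
    fn matches(&self, path: &str) -> bool {
        path == self.0
            || path
                .strip_prefix(self.0.as_str())
                .map_or(false, |rest| rest.starts_with('/'))
    }
}

/// Matches exactly one file path.
pub struct ExactMatcher(pub String);

impl Matcher for ExactMatcher {
    fn matches(&self, path: &str) -> bool {
        path == self.0
    }
}

/// Binary union: a path matches if either input matches. Being generic,
/// it can own its inputs or hold any other type implementing `Matcher`.
pub struct UnionMatcher<M1, M2> {
    input1: M1,
    input2: M2,
}

impl<M1: Matcher, M2: Matcher> UnionMatcher<M1, M2> {
    pub fn new(input1: M1, input2: M2) -> Self {
        UnionMatcher { input1, input2 }
    }
}

impl<M1: Matcher, M2: Matcher> Matcher for UnionMatcher<M1, M2> {
    fn matches(&self, path: &str) -> bool {
        self.input1.matches(path) || self.input2.matches(path)
    }
}
```

Unions of more than two matchers can be built by nesting, e.g. `UnionMatcher::new(a, UnionMatcher::new(b, c))`, which is why a binary version is enough for this sketch.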
2024-04-05 10:26:01 +09:00
Yuya Nishihara
c4d7425de5 matchers: abstract matcher combinators over Matcher trait
In order to implement a fileset, we'll need owned variants of these matchers.
We can of course let callers move Box<dyn Matcher> into these adapters, but
we might need to somehow clone Box<dyn Matcher>. So, I simply made adapters
generic.
2024-04-05 10:26:01 +09:00
Yuya Nishihara
a7d5a9c99a commit: actually remove boxing from CommitIteratorExt::ids()
Also simplified lifetime bound a bit.
2024-04-05 00:16:42 +09:00
Evan Mesterhazy
d4a04779c0 Make check_rewritable take an iterator of &CommitId instead of &Commit
This function doesn't actually need commits, it only needs their IDs. In some
contexts we may only have commit IDs, so there's no need to require an iterator
of Commits.

This commit also adds a `CommitIteratorExt` that makes it easy to convert an
iterator of `&Commit` to an iterator of `&CommitId`.
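
A sketch of what such an extension trait might look like. The `Commit` and `CommitId` types are stand-ins, the `check_rewritable` signature is hypothetical, and the `impl Iterator` return type in the trait assumes Rust 1.75 or later:

```rust
/// Stand-in types; the real ones live in jj-lib.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct CommitId(pub String);

pub struct Commit {
    pub id: CommitId,
}

impl Commit {
    pub fn id(&self) -> &CommitId {
        &self.id
    }
}

/// Adds `.ids()` to any iterator over `&Commit`.
pub trait CommitIteratorExt<'c>: Iterator<Item = &'c Commit> + Sized {
    fn ids(self) -> impl Iterator<Item = &'c CommitId> {
        self.map(|commit| commit.id())
    }
}

impl<'c, I> CommitIteratorExt<'c> for I where I: Iterator<Item = &'c Commit> {}

/// Hypothetical check that only needs IDs: fails on the first ID found
/// in the `immutable` list.
pub fn check_rewritable<'a>(
    ids: impl IntoIterator<Item = &'a CommitId>,
    immutable: &[CommitId],
) -> Result<(), String> {
    for id in ids {
        if immutable.contains(id) {
            return Err(format!("commit {} is immutable", id.0));
        }
    }
    Ok(())
}
```

Callers holding commits can write `check_rewritable(commits.iter().ids(), &immutable)`, while callers that only have IDs pass them directly.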
2024-04-04 09:31:17 -04:00
Yuya Nishihara
bb87fac1a4 revset: parse "all:" prefix rule by pest
I had to use negative lookahead !":" because we still support a dummy ":"
operator to provide a suggestion.
2024-04-03 08:59:42 +09:00
Yuya Nishihara
13dadadcdc revset: add ParseState constructor
2024-04-03 08:59:42 +09:00
Christoph Koehler
7bde6ddc29 revset: add working_copies() function
It includes the working copy commit of every workspace of the repo.

Implements #3384
2024-04-01 19:36:53 -06:00
Martin von Zweigbergk
bbe906b426 repo: merge rewrite state into single parent_mapping with enum
This simplifies the code and reduces the risk of inconsistencies in
the data.
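
To illustrate why a single map reduces the risk of inconsistencies, here is a toy sketch. The variant names, the `RewriteState` wrapper, and its methods are assumptions for this example, and `CommitId` is just a `String` alias:

```rust
use std::collections::HashMap;

/// Stand-in for the real commit ID type.
type CommitId = String;

/// One entry per old commit, so a commit has exactly one rewrite state.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Rewrite {
    /// Rewritten as a single new commit.
    Rewritten(CommitId),
    /// Rewritten as several commits.
    Divergent(Vec<CommitId>),
    /// Abandoned; descendants move onto these parents.
    Abandoned(Vec<CommitId>),
}

#[derive(Default)]
pub struct RewriteState {
    parent_mapping: HashMap<CommitId, Rewrite>,
}

impl RewriteState {
    pub fn set_rewritten(&mut self, old: CommitId, new: CommitId) {
        // Inserting replaces any earlier state for `old`.
        self.parent_mapping.insert(old, Rewrite::Rewritten(new));
    }

    pub fn record_abandoned(&mut self, old: CommitId, new_parents: Vec<CommitId>) {
        self.parent_mapping.insert(old, Rewrite::Abandoned(new_parents));
    }

    pub fn is_abandoned(&self, id: &str) -> bool {
        matches!(self.parent_mapping.get(id), Some(Rewrite::Abandoned(_)))
    }

    pub fn len(&self) -> usize {
        self.parent_mapping.len()
    }
}
```

With separate sets and maps, a commit could end up recorded as both abandoned and rewritten; with one map keyed by the old commit, the later call simply overwrites the earlier state.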

Thanks to Yuya for the suggestion.
2024-03-30 09:35:45 -07:00
Yuya Nishihara
a6615bf36d cli: render string pattern suggestion as a hint
Templater doesn't have the one yet, but I think it belongs to the same
category.

For clap::Error, we could use clap's own mechanism to render suggestions as
"tip: ...", but I feel "Hint: ..." looks better because our error/hint message
is capitalized.
2024-03-30 23:53:17 +09:00
Yuya Nishihara
d759ba11f1 revset: don't stringify StringPatternParseError
This helps to add hint at the CLI layer.
2024-03-30 23:53:17 +09:00
Yuya Nishihara
c4d48c5139 revset: add constructor for InvalidFunctionArguments error
Inlined some of the make_error() closures instead. I'll make string pattern
handler preserve the source error object.
2024-03-30 23:53:17 +09:00
Yuya Nishihara
b09732f4f8 revset, templater: split parse error constructor that sets source error object
I'm going to add RevsetParseError constructor for InvalidFunctionArguments,
with/without a source error, and I don't want to duplicate code for all
combinations. The templater change is just for consistency.

I couldn't find a good naming convention for the builder-like API, so it's
called .with_source(mut self, _). Another option was .source_set(source).
Apparently, it's not uncommon to name consuming constructor as
with_<something>().
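
A sketch of the builder-like shape, written by hand without thiserror. The type and variant names are illustrative and simplified; only the `.with_source(mut self, _)` naming comes from the description above:

```rust
use std::error::Error;
use std::fmt;

/// The kind stays a plain, comparable value type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ParseErrorKind {
    InvalidFunctionArguments { name: String, message: String },
}

/// The source error is stored next to the kind, not inside it.
#[derive(Debug)]
pub struct ParseError {
    kind: ParseErrorKind,
    source: Option<Box<dyn Error + Send + Sync>>,
}

impl ParseError {
    /// Constructor without a source error.
    pub fn new(kind: ParseErrorKind) -> Self {
        ParseError { kind, source: None }
    }

    /// Consuming, builder-style setter: one constructor per kind is enough,
    /// and the source can be attached optionally afterwards.
    pub fn with_source(mut self, source: impl Into<Box<dyn Error + Send + Sync>>) -> Self {
        self.source = Some(source.into());
        self
    }

    pub fn kind(&self) -> &ParseErrorKind {
        &self.kind
    }
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match &self.kind {
            ParseErrorKind::InvalidFunctionArguments { name, message } => {
                write!(f, "Function \"{name}\": {message}")
            }
        }
    }
}

impl Error for ParseError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.source.as_deref().map(|e| e as &(dyn Error + 'static))
    }
}
```

Usage would look like `ParseError::new(kind).with_source(cause)`, which avoids duplicating each constructor into a with-source and a without-source variant.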
2024-03-30 23:53:17 +09:00
Yuya Nishihara
73b60903ce tree: flatten TreeMergeError into BackendError
2024-03-30 22:40:05 +09:00
Yuya Nishihara
916014dc1e tree: consolidate read error variants
There isn't much difference between BackendError::ReadObject of file type
and TreeMergeError::ReadError. They are both caused by the backend.
2024-03-30 22:40:05 +09:00
Martin von Zweigbergk
bfa43d16f9 rewrite: don't collect set of heads to add unnecessarily
2024-03-30 05:21:48 -07:00
Martin von Zweigbergk
c40949208b rewrite: all rewritten commits are no longer heads
Now that we no longer bother to keep the set of heads to add and
remove updated while we rewrite descendants, we can simplify how we
find the set of heads to remove - it's simply all commits that have
been marked rewritten, divergent, or abandoned, i.e. the keys in
`parent_mapping`.
2024-03-30 05:21:48 -07:00
Martin von Zweigbergk
bb1fef3258 rewrite: drop redundant unioning of old commits with abandoned commits
We always add abandoned commits as keys in `parent_mapping`.
2024-03-30 05:21:48 -07:00
Martin von Zweigbergk
db4b905bc9 repo: when setting rewritten or divergent, remove from abandoned
I don't think we have any transactions that mark a commit as abandoned
and then later mark it as rewritten or divergent. But if we ever do, I
think it should be considered just rewritten/divergent. So let's
enforce that invariant by removing the old value from the set of
abandoned commits.
2024-03-30 05:21:48 -07:00
Yuya Nishihara
f20004fffe git_backend: classify "merge with root" as user error
Perhaps, there will be more error types that hold BackendError internally, but
this change is good enough to handle a merge error.
2024-03-30 11:14:25 +09:00
Yuya Nishihara
1e83faf4f8 tree: remove useless "Backend error" message from TreeMergeError
I don't think it adds any contextual information. TreeMergeError is somewhat
similar to BackendError.
2024-03-30 11:14:25 +09:00
Evan Mesterhazy
dd1def02e4 Move parse_string_pattern in cli to StringPattern::parse in lib
This commit moves the parse_string_pattern helper function into the
str_util module in jj lib and adds tests for it.

I'd like to reuse this code in a function defined by `UserSettings`, which is
part of the jj lib crate and cannot use functions from the cli crate.
2024-03-29 08:48:09 -04:00
Yuya Nishihara
916dc30828 revset: use common argument error instead of FsPathParseError
It's not special compared to the other argument errors, and we can now track
the error source separately.
2024-03-28 10:53:06 +09:00
Yuya Nishihara
074e6e12bc revset, templater: include short parse error description in summary line
This makes the summary line more informative. Even though it just duplicates
the message printed later, I think it's easier to follow.

This patch also adjusts some RevsetParseError messages because it seemed
redundant to repeat "revset function", "argument", etc.
2024-03-28 10:53:06 +09:00
Yuya Nishihara
d17166628f revset, templater: simplify parse error impls by using thiserror
This patch moves all "source" errors to the source field to conform to
thiserror API. It will probably help to keep ErrorKind enums comparable.
2024-03-28 10:53:06 +09:00
Yuya Nishihara
2cd70bdf14 revset, templater: render parse error as usual error chain
Because the CLI error handler now prints error sources in multi-line format,
it doesn't make much sense to render Revset/TemplateParseError differently.

This patch also fixes the source() of the SyntaxError kind. It should be
self.pest_error.source() (= None), not self.pest_error.
2024-03-28 10:53:06 +09:00
Yuya Nishihara
844d3d0ff0 revset, templater: allow any kind of error as parse error source
I'm going to make TemplateParseError hold RevsetParseError as Box<dyn _>, but
Box<dyn std::error::Error ..> doesn't implement Eq. I could remove Eq from
ErrorKind enums, but it's handy if these enums remain as value types.

This change will also simplify fmt::Display and error::Error impls.
2024-03-28 10:53:06 +09:00
Yuya Nishihara
32efb4034d revset: make span of parse error mandatory, remove Option<_>
Since all callers of RevsetParseError have some reasonable span, we don't
need a special case for WorkingCopyWithoutWorkspace error.
2024-03-28 10:53:06 +09:00
Yuya Nishihara
8ad0a703d4 repo_path: accept from_relative_path("."), make "".to_fs_path("") return "."
It's common to normalize an empty directory path as ".". This change unblocks
the use of from_relative_path() in edit_sparse().

There are a couple of callers who do to_fs_path(Path::new("")), but they all
translate non-directory paths, which should never be empty.
2024-03-28 10:52:51 +09:00
Martin von Zweigbergk
9c382fd8c6 rewrite: exclude already rewritten commits from set to rebase
We currently include the commits in `parent_mapping` and `abandoned`
in the set of commits to visit when rebasing descendants. The reason
was that we used to update branches and working copies when we visited
these commits. Since we started updating refs after rebasing all
commits, there's no need to even visit these commits.
2024-03-26 09:50:50 -07:00
Martin von Zweigbergk
49ff818e97 rewrite: calculate branches later, remove it from state 2024-03-26 09:50:50 -07:00
Martin von Zweigbergk
718e54b01a rewrite: calculate heads_to_add later, remove it from state
Similar to the previous two commits.
2024-03-26 09:50:50 -07:00
Martin von Zweigbergk
2ee1147145 rewrite: calculate heads_to_remove later, remove it from state
Similar to the previous commit.
2024-03-26 09:50:50 -07:00
Martin von Zweigbergk
b3dd038907 rewrite: calculate new_commits later, remove it from state
We only use `new_commits` in `update_heads()`, so let's calculate it
there. It should also be more correct in case other commits were
created after we initialized `DescendantRebaser`.
2024-03-26 09:50:50 -07:00
Martin von Zweigbergk
5e7a4a2028 rewrite: update heads outside update_references()
Now that we only call `update_references()` in one place, there's no
reason to have it also update `heads_to_add` and `heads_to_remove`. By
moving it out of the function, we can consolidate the logic in one
place.
2024-03-26 09:50:50 -07:00
Martin von Zweigbergk
9511de486e rewrite: extract a function for updating heads 2024-03-26 09:50:50 -07:00
Martin von Zweigbergk
0f7a86d725 rewrite: move new_parents() to MutableRepo
The function only uses state from `MutableRepo`, so it should be
implemented on that type.
2024-03-26 09:50:50 -07:00
Martin von Zweigbergk
cfdb341c6b rewrite: make rebase_commit_with_options() mark abandoned commit
When `rebase_commit_with_options()` decides to abandon a commit, it
records the new parents in the `MutableRepo`, but it's currently the
caller's responsibility to remember to mark it as abandoned. Let's
move that logic into the function to reduce the risk of future bugs.
2024-03-26 09:50:50 -07:00
Martin von Zweigbergk
3ddf9f4329 repo: add parents of abandoned commit to parent_mapping
By adding the abandoned commit's parents to `parent_mapping`, we can
remove a bit more of the special handling of abandoned commits in
`DescendantRebaser`.
2024-03-26 09:50:50 -07:00
Martin von Zweigbergk
0481e67dfd rewrite: drop now-unnecessary updating of branches map
Since we update all branches at the end now, we never update them in
several steps, so there are no intermediate locations we need to
remember.
2024-03-25 23:00:44 -07:00
Martin von Zweigbergk
5e8d7f8c6f rewrite: update references after rewriting all commits 2024-03-25 23:00:44 -07:00
Martin von Zweigbergk
e55ebd4fe6 rewrite: drop redundant update of parent_mapping after rebasing commit
In the normal case when we don't abandon a commit because it became
empty, then `CommitBuilder::write()` will have recorded the new commit
as a rewrite of the old commit. We don't need to do that again in
`rebase_one()`.
2024-03-25 23:00:44 -07:00
Martin von Zweigbergk
4406005dce rewrite: make DescendantRebaser use state stored in MutableRepo
A subset of the state in `DescendantRebaser` now matches exactly what
`MutableRepo` already stores, so we can avoid copying that state and
have `DescendantRebaser` use it directly instead. Having a single
source of truth for the state will enable further simplifications and
improvements.
2024-03-25 23:00:44 -07:00
Martin von Zweigbergk
ad16bec3a6 rewrite: move an assertion a little earlier
I'm going to make `DescendantRebaser` share the state about rewritten
commits with `MutableRepo` next. That means that the call to
`rebase_commit_with_options()` will update that state, which would
make this assertion fail. So let's move it a little earlier to avoid
that.
2024-03-25 23:00:44 -07:00
Martin von Zweigbergk
a6857a7a8f repo: rename abandoned_commits to abandoned
This is just to match `DescendantRebaser`, to make the next commit a
bit simpler. I think `MutableRepo` still has few enough fields that
just `abandoned` is clear enough. Maybe we'll move the three
rewrite-related fields into a new struct at some point.
2024-03-25 23:00:44 -07:00
Martin von Zweigbergk
6e3ceb4d1c repo: store separate divergent field, pass into DescendantRebaser
With this patch, `MutableRepo` has the same tracking of rewritten
commits as `DescendantRebaser`, so we can simply pass that state into
`DescendantRebaser` when we create it. The next step is to remove the
state from `DescendantRebaser`.
2024-03-25 23:00:44 -07:00
Ilya Grigoriev
de0de4013d hex_utils: fix typo found by clippy 2024-03-25 21:23:09 -07:00
Martin von Zweigbergk
890a8e282f repo: update working copy to first divergent commit 2024-03-25 06:53:14 -07:00
Martin von Zweigbergk
d2043f069e repo: delete record_rewritten_commit()
I don't think we have any callers left that call
`record_rewritten_commit()` multiple times within a transaction and
expect it to result in divergence. I think we should consider it a bug
to do that.
2024-03-25 06:53:14 -07:00
Martin von Zweigbergk
e55168fa3e repo: make record_rewritten_commit() accept only one replacement id
All callers now pass a single new commit and I would like to keep it
that way.
2024-03-25 06:53:14 -07:00
Martin von Zweigbergk
af7ef4d04e repo: add a method for explicitly recording divergent rewrite
I plan to remove `record_rewritten_commit()` and instead make repeated
rewrites replace the rewrite state.
2024-03-25 06:53:14 -07:00
Martin von Zweigbergk
b54ace4954 rewrite: mark divergent commits in parent_mapping too
When rebasing descendants, we generally move branches, child commits, and
the working copy to the rewritten commit(s). However, we don't move
the working copy to the new rewritten commit(s) if the old commit had
been abandoned, and we don't move child commits if the rewrite was
divergent.

This patch aims to make it clearer that there's only one mapping from
old to new parents, and that is in `parent_mapping`. It does so by
merging the current `divergent` map into it, and making `divergent`
just a set instead. When finding the new parents for a child, we leave
the existing parent if it's in the set.

My longer-term goal is to move `parent_mapping`, `abandoned`, and
`divergent` into `MutableRepo` (maybe in a nested struct), so we can
do some transformations on descendants as we rebase them. By having
the state in a single place (not moving it from `MutableRepo` to
`DescendantRebaser` as we currently do), I hope it will be easier to
write a `MutableRepo::transform_descendants(callback)`, where the
callback gets a `CommitBuilder` and can change parents of the commit,
for example.
2024-03-25 06:53:14 -07:00
Martin von Zweigbergk
ba244423e8 rewrite: avoid an unnecessary clone 2024-03-25 06:53:14 -07:00
Yuya Nishihara
c311131ee2 log: encode elided node as None
Since an elided graph entry has no associated commits, it makes some sense to
represent it as None?
2024-03-24 10:32:15 +09:00
Benjamin Tan
3034dbba3f git-push: Display messages from remote
The implementation of sideband progress message printing is aligned with
Git's implementation. See
43072b4ca1/sideband.c (L178).

Closes #3236.
2024-03-23 20:17:04 +08:00
Ilya Grigoriev
02a04d0d37 test_conflicts and test_resolve_command: use indoc! to indent conflict markers in tests
Apart from (IMO) looking nicer, this will also sidestep the potential problem
that if the file contains actual jj conflict markers (`>>>>>>>` in the beginning
of a line, for example), jj would currently have trouble materializing and
subsequently parsing conflicts in the file if it actually became conflicted.

I'll demo this bug in either this or a subsequent PR. It's the kind of bug that
sounds serious in theory but might never cause a problem in practice.

After this PR, only `docs/tutorial.md` has a conflict marker that's not indented.
There's only one there, so hopefully it won't be too much of a pain to deal with.

I also indented other strings in `test_conflicts.rs`. IMO, this looks nice and
more consistent with the `insta::assert_snapshot` output. I didn't spend the
time to do the same for `test_resolve_command`.
2024-03-22 23:27:25 -07:00
Anton Älgmyr
e2eb5bddf9 Make node symbols templatable in the graphs.
Adds config options
* templates.log_graph_node
* templates.log_graph_node_elided
* templates.op_log_graph_node
2024-03-21 17:41:31 +01:00
dploch
9380f9d529 rewrite: move handling of simplified ancestry into rebase_commit_with_options
It seems incorrect that `simplify_ancestor_merge` is ignored when it's part of the helper's input.
2024-03-20 11:57:54 -04:00
Ilya Grigoriev
4fbe6aecc9 clippy: remove some unused code beta clippy/rustc complain about
There are still some warnings from (seemingly) clippy bugs. Quoting
myself from Discord:

> PSA: the latest beta cargo clippy (from Rust 1.78) has some problems
> that affect jj: https://github.com/rust-lang/rust-clippy/issues/12467
> and https://github.com/rust-lang/rust-clippy/issues/12377.  You could
> disable clippy::assigning_clones and clippy::empty_docs as a workaround.
> VS Code can disable them in rust-analyzer, you can also use
> https://github.com/ericseppanen/cargo-cranky (you can put Cranky.toml in
> the per-user gitignore).
2024-03-19 18:33:29 -07:00
Martin von Zweigbergk
f865c1bc5d index: print a milder "Reindexing..." message on version mismatch
Closes #3323.
2024-03-18 13:50:14 -07:00
Yuya Nishihara
50363419fb revset: substitute '~(::x)' to 'x..'
Suppose we have an alias 'immutable()' = '::immutable_heads()'. A user can then
express the (visible) mutable set as '~immutable()'. 'immutable_heads()..' can
terminate early, but a generic difference 'all() & ~immutable()' can't.
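
A minimal sketch of such a substitution on a toy expression tree. The `Expr` enum and `fold_not_ancestors` are hypothetical and much simpler than jj's real revset AST, and the sketch assumes `x..` means the range from `x` to the visible heads.

```rust
// Toy revset expression tree (hypothetical, for illustration only).
#[derive(Clone, Debug, PartialEq, Eq)]
enum Expr {
    Symbol(String),
    // ::x
    Ancestors(Box<Expr>),
    // ~x
    NotIn(Box<Expr>),
    // roots..heads
    Range { roots: Box<Expr>, heads: Box<Expr> },
    // visible_heads(), the implicit right-hand side of `x..`
    VisibleHeads,
}

// Rewrites `~(::x)` into `x..` and leaves every other shape as it is.
fn fold_not_ancestors(expr: Expr) -> Expr {
    match expr {
        Expr::NotIn(inner) => match *inner {
            Expr::Ancestors(x) => Expr::Range {
                roots: Box::new(fold_not_ancestors(*x)),
                heads: Box::new(Expr::VisibleHeads),
            },
            other => Expr::NotIn(Box::new(fold_not_ancestors(other))),
        },
        Expr::Ancestors(x) => Expr::Ancestors(Box::new(fold_not_ancestors(*x))),
        Expr::Range { roots, heads } => Expr::Range {
            roots: Box::new(fold_not_ancestors(*roots)),
            heads: Box::new(fold_not_ancestors(*heads)),
        },
        other => other,
    }
}
```

The range form matters because a range walk can stop once it reaches the roots, whereas a generic negation has to be evaluated against the whole visible set.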
2024-03-17 14:50:48 +09:00
Yuya Nishihara
9207314173 revset: add substitution rule for "::x & ~(::y-)"
Assuming the generation value is usually small, it should be faster to do the
bounded range lookup 'y-' first, then walk ancestors with the unwanted set
'y-..x'.
2024-03-17 14:50:48 +09:00
Yuya Nishihara
39a460a077 revset: extract helper function that substitutes "::x & ~(::y)"
I'm going to add a similar substitution rule for "~(::y)".
2024-03-17 14:50:48 +09:00
Yuya Nishihara
3f9ac78215 revset: update legacy range syntax in comment 2024-03-17 14:50:48 +09:00
Yuya Nishihara
a777cfe98e index: remove topo_order() which is no longer used
The same thing can be achieved by evaluating the input as a revset.
2024-03-17 11:44:41 +09:00
Martin von Zweigbergk
c55e08023e workspace: don't lose sparsed-away paths when recovering workspace
When an operation is missing and we recover the workspace, we create a
new working-copy commit on top of the desired working-copy commit (per
the available head operation). We then reset the working copy to an
empty tree because it shouldn't really matter much which commit we
reset to. However, when the workspace is sparse, it does matter, as
the test case from the previous patch shows. This patch fixes it by
replacing the `reset_to_empty()` method by a new `recover(&Commit)`,
which effectively resets to the empty tree and then resets to the
commit. That way, any subsequent snapshotting will keep the
paths from that tree for paths outside the sparse patterns.
2024-03-16 07:30:36 -07:00
Alexis (Poliorcetics) Bourget
93c707a469 lib: improve error message for invalid string pattern, suggesting one of the known ones 2024-03-16 14:22:16 +01:00
Evan Mesterhazy
f30857190e Add more test cases for Index::common_ancestors 2024-03-14 12:54:13 -04:00
Evan Mesterhazy
adaedd5556 Add documentation to lib/src/index.rs and lib/src/default_index/ 2024-03-14 12:54:13 -04:00
Yuya Nishihara
5806dbfd32 revset_graph: detach CompositeIndex, reimplement as RevWalk
For API consistency. It wouldn't practically matter unless we want to reuse
.iter_graph() in lazy event-driven GUI context.

I don't see significant performance difference:
- jj-0: original impl with look-ahead IndexEntry<'_> buffer
- jj-1: this patch

With dense graph
```
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1 \
  "target/release-with-debug/{bin} -R ~/mirrors/git --ignore-working-copy log -r.. -T ''"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/git --ignore-working-copy log -r.. -T ''
  Time (mean ± σ):      1.367 s ±  0.008 s    [User: 1.261 s, System: 0.105 s]
  Range (min … max):    1.357 s …  1.380 s    10 runs

Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/git --ignore-working-copy log -r.. -T ''
  Time (mean ± σ):      1.344 s ±  0.017 s    [User: 1.245 s, System: 0.099 s]
  Range (min … max):    1.313 s …  1.369 s    10 runs

Relative speed comparison
        1.02 ±  0.01  target/release-with-debug/jj-0 -R ~/mirrors/git --ignore-working-copy log -r.. -T ''
        1.00          target/release-with-debug/jj-1 -R ~/mirrors/git --ignore-working-copy log -r.. -T ''
```

With sparse graph
```
% hyperfine --sort command --warmup 3 --runs 10 -L bin jj-0,jj-1 \
  "target/release-with-debug/{bin} -R ~/mirrors/git --ignore-working-copy log -r'tags()' -T ''"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/git --ignore-working-copy log -r'tags()' -T ''
  Time (mean ± σ):      1.347 s ±  0.017 s    [User: 1.216 s, System: 0.130 s]
  Range (min … max):    1.321 s …  1.379 s    10 runs

Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/git --ignore-working-copy log -r'tags()' -T ''
  Time (mean ± σ):      1.379 s ±  0.023 s    [User: 1.238 s, System: 0.140 s]
  Range (min … max):    1.328 s …  1.403 s    10 runs

Relative speed comparison
        1.00          target/release-with-debug/jj-0 -R ~/mirrors/git --ignore-working-copy log -r'tags()' -T ''
        1.02 ±  0.02  target/release-with-debug/jj-1 -R ~/mirrors/git --ignore-working-copy log -r'tags()' -T ''
```
2024-03-14 10:07:19 +09:00
Yuya Nishihara
3c8f22456b revset_graph: remove lifetimed IndexEntry<'_> from look_ahead buffer
Prepares for removing &CompositeIndex from the RevsetGraphIterator struct.
The input iterator will also be changed to position-based.

I've turned self.look_ahead.get().unwrap() into assertion, but it's not super
important here. It's just for sanity that we've mapped missing edges properly.
FWIW, we could say RevsetGraphIterator is an example of iterating *and* testing
membership of the input revset (though the yielded entries are discarded.)
2024-03-14 10:07:19 +09:00
Yuya Nishihara
699707905c index: reorganize revset_graph_iterator as private module of default_index
The RevsetGraphIterator type is hidden so that the Iterator trait can be
implemented differently.
2024-03-14 10:07:19 +09:00
Yuya Nishihara
17e46e0932 revset: extend lifetime of CommitId/ChangeId iterators
For the same reason as the previous commit. Since self.inner.positions()
basically clones the underlying evaluation tree, there is no reason to stick
to &self lifetime. Perhaps, some of the CLI utilities can be changed to not
collect() the iterator.

Migrating iter_graph() requires non-trivial changes, so it will be done
separately.
2024-03-13 10:47:58 +09:00
Yuya Nishihara
3bf41d0c52 revset: extend lifetime of containing_fn()
This allows callers to cache the returned function at 'index lifetime. It's
important in templater. It also means the returned function could be 'static
if the index were Arc<_> and we had a trait interface to achieve that.

Option<Box<dyn ..>> is removed since RevWalk is fused.
2024-03-13 10:47:58 +09:00
Yuya Nishihara
027bd8f03a revset: extend lifetime of internal evaluation nodes
This makes the whole evaluation tree 'static, and we can freely move it without
keeping the root RevsetImpl object alive.

Perhaps, "Self: 'a" can be replaced with 'static, but let's leave it for now.
It's not technically wrong to store a lifetimed object in InternalRevset.
2024-03-13 10:47:58 +09:00
Yuya Nishihara
bc49b6b190 revset: make PurePredicateFn clonable
Prepares for dropping &self lifetime from to_predicate_fn(). All predicate
functions could be wrapped as Box::new(PurePredicateFn(Rc::new(f))) instead, but
I don't think the .clone() cost matters.
2024-03-13 10:47:58 +09:00
dploch
6e8f1fb390 extensions_map: create a type-safe container for arbitrary objects 2024-03-12 16:52:49 -04:00
Yuya Nishihara
283907418a revset: detach index from InternalRevset::positions()
Perhaps, union/intersection/difference combinators can be moved to the
rev_walk module, but let's think about that later.
2024-03-12 20:59:38 +09:00
Yuya Nishihara
78dbaba4dc revset: remove entry-based API from InternalRevset
Now all source/sink nodes produce/consume IndexPosition, so it doesn't make
sense to keep InternalRevset::entries().
2024-03-12 20:59:38 +09:00
Yuya Nishihara
a733b0b052 revset: detach index from predicate fn, turn it into position-based
This is the step towards removing &CompositeIndex references from the revset
evaluation tree. The filter input is changed from &IndexEntry to IndexPosition
to simplify the lifetime thingy. We might want to pass around CommitId or
Commit object once it's loaded, but that can be implemented later. I don't
see significant performance difference in revset benches.
2024-03-12 20:59:38 +09:00
Yuya Nishihara
97e69d1dcc index: add filter RevWalk adapter
FilterRevset will be built on top.
2024-03-12 20:59:38 +09:00
Yuya Nishihara
cfa067a0a9 index: add peekable RevWalk adapter
This helps to migrate union/intersection/difference iterators to RevWalk.
2024-03-12 20:59:38 +09:00
Yuya Nishihara
8f0b9a0e4a index: add RevWalk wrapper for eagerly evaluated set
This serves the same role as templater::Literal. I'm going to add basic
RevWalk adapters so that the revset evaluation tree can be constructed without
capturing the index. EagerRevWalk will help to write tests for these adapters.
2024-03-12 20:59:38 +09:00
Yuya Nishihara
7d43a5c2c0 tests: alias index.as_composite() in revset combinator/accumulator tests 2024-03-12 20:59:38 +09:00
Aleksey Kuznetsov
6fd15dc7e5 graphlog: refactor out node symbols from GraphLog
Now that the default and elided node symbols come from the config, the next logical
step is to use them directly bypassing GraphLog. Note that commands like `jj op
log` and `jj obslog` do not use the elided node symbol at all.
2024-03-12 08:25:58 +05:00
Yuya Nishihara
9c1d5d155e index: remove HRTB stuff by implementing RevWalkIndex for CompositeIndex 2024-03-11 17:24:10 +09:00
Yuya Nishihara
3d0952b316 index: implement AsCompositeIndex for CompositeIndex, not for &CompositeIndex
Just a minor code cleanup. We still need Index for &CompositeIndex because the
type is unsized, and unsized type cannot be converted to another dyn reference.
2024-03-11 17:24:10 +09:00
Yuya Nishihara
243675b793 index: turn CompositeIndex into transparent reference type
This helps to eliminate higher-ranked trait bounds from RevWalkRevset and
RevWalk combinators to be added. Since &CompositeIndex is now a real reference,
it can be passed to functions as index: &T.
2024-03-11 17:24:10 +09:00
Yuya Nishihara
c8be8c3edd index: add type alias for "dyn IndexSegment" to clarify it's 'static
This helps to migrate CompositeIndex<'_> wrapper to &CompositeIndex. If
the wrapped reference had a lifetimed field, it couldn't be represented as
a trivial reference type.
2024-03-11 17:24:10 +09:00
Yuya Nishihara
64e0be2477 revset: consolidate early-return condition of PositionsAccumulator
Since consume_to() checks the bottom position yielded from the source iterator,
it makes sense to add the same check for the cached positions.
2024-03-11 17:24:01 +09:00
Martin von Zweigbergk
4d42604913 git_backend: write trees involved in conflict in git commit header
We haven't used custom Git commit headers for two main reasons:

1. I don't want commits created by jj to be different from any other
   commits. I don't want Git projects to get annoyed by such commits
   and reject them.

2. I've been concerned that tools don't know how to handle such
   headers, perhaps even resulting in crashes.

The first argument doesn't apply to commits with conflicts because
such commits would never be accepted by a project whether or not they
use custom commit headers. The second argument is less relevant for
conflicted commits because most tools will be confused by such commits
anyway.

Storing conflict information in commit headers means that we can
transfer them via the regular Git wire protocol. We already include
the tree objects nested inside the root-level tree, so they will also
be transferred.

So, let's start by writing the information redundantly to the commit
header and to the existing storage. That way we can roll it back if we
realize there's a problem with using commit headers.
2024-03-10 20:51:05 -07:00
Aleksey Kuznetsov
cd3d75ebf6 revset: introduce more performant way to check if a commit is in a revset
Initially we were thinking to have `Revset` return something like
`CachedRevset`:

```
pub trait CachedRevset {
  fn iter(&self) -> Box<dyn Iterator<Item = Commit>>;
  fn contains(&self, commit_id: &CommitId) -> bool;
}
```

But we weren't sure what the use case for `iter` would be, so we dropped the `iter`
method. `CachedRevset` with a single `contains` method needed a better name. We
weren't able to come up with one, so we decided instead to have a method on
`Revset` that returns a closure to check if a commit is in a revset.
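
A simplified sketch of the closure-returning approach, assuming an eagerly evaluated set. The `Revset` struct and the `CommitId` alias below are hypothetical stand-ins rather than the jj-lib types; `containing_fn` is the method name used elsewhere in this log.

```rust
use std::collections::HashSet;

// Simplified stand-in for jj-lib's CommitId (assumption for illustration).
type CommitId = Vec<u8>;

// Hypothetical revset that keeps its members in a hash set.
pub struct Revset {
    members: HashSet<CommitId>,
}

impl Revset {
    pub fn new(ids: impl IntoIterator<Item = CommitId>) -> Self {
        Revset { members: ids.into_iter().collect() }
    }

    // Returns a closure that answers "is this commit in the revset?".
    // The closure borrows the revset, so it cannot outlive it.
    pub fn containing_fn(&self) -> Box<dyn Fn(&CommitId) -> bool + '_> {
        Box::new(move |id| self.members.contains(id))
    }
}
```

A caller builds the closure once and reuses it for many lookups, for example while rendering one template per commit.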
2024-03-11 08:27:35 +05:00
Yuya Nishihara
8a406358af index: migrate RevWalkRevset to be based off new RevWalk trait
"for<'index> RevWalk<CompositeIndex<'index>, .." works as of now, but it won't
be composed well. So I'll turn CompositeIndex<'_> into &CompositeIndex in the
next batch, and remove "for<'index>".
2024-03-11 11:25:54 +09:00
Yuya Nishihara
4107cad80e index: migrate RevWalkDescendants to new RevWalk trait
Just for consistency. Descendants are always evaluated eagerly, so this change
isn't strictly needed.
2024-03-11 11:25:54 +09:00
Yuya Nishihara
b6cbd8b90b index: add trait and adaptor types to detach index from RevWalk*
This eliminates lifetimed fields from RevWalk objects, and the RevWalk object
will be embedded directly in RevWalkRevset.

This patch adds two separate iterator adapters. They are identical at this
point, but I'm going to add detach/reattach methods only to the borrowed
version. I'm also planning to change CompositeIndex<'_> to &CompositeIndex
to get around higher-ranked trait bound restrictions.
2024-03-11 11:25:54 +09:00
Yuya Nishihara
d780910bec index: make RevWalk yield IndexPosition instead of IndexEntry
This simplifies the RevWalkIndex API. It would probably add fractional msecs of
overhead per next() call, but I don't see significant difference in revset
benches.
2024-03-11 11:25:54 +09:00
Anton Älgmyr
099f06bf71 Add configuration options for node symbols in the graphs. 2024-03-09 21:16:58 +01:00
Yuya Nishihara
f51c5d7e57 index: consistently use IntoIterator in RevWalk builder API
Since the return type is no longer "impl Iterator<..>", there isn't a lifetime
issue anymore.
2024-03-10 01:45:30 +09:00
Yuya Nishihara
2615fed5be index: handle cut-off position of RevWalk by queue
I'm going to make CompositeIndex<'_> detachable from the RevWalk, and
"F: Fn(CompositeIndex) -> Box<dyn Iterator<..>>" of RevWalkRevset<F> will
be replaced with "W: RevWalk<CompositeIndex>". This will simplify the code
structure, but also means that we can no longer apply .take_while() here and
convert it back to RevWalk. Fortunately, ancestors_until_roots() is the only
function I need to reimplement.
2024-03-10 01:45:30 +09:00
Yuya Nishihara
34fbaaaad6 index: construct RevWalk queue after item type is settled
It doesn't make sense to build BinaryHeap with intermediate type, and I'm
going to reimplement take_until_roots() in a way that the queue drops
uninteresting items.
2024-03-10 01:45:30 +09:00
Yuya Nishihara
8480ee9e05 index: migrate RevWalk constructors to builder API
The current RevWalk constructors insert intermediate items to BinaryHeap
and convert them as needed. This is redundant, and I'm going to add another
parameter that should be applied to the queue first. That's why I decided
to factor out a builder type. I considered adding a few sets of factory
functions that receive all parameters, but they looked messy because most of
the parameters are of [IndexPosition] type.

This patch also adds must_use to the builder and its return types, which are
all iterator-like.
2024-03-10 01:45:30 +09:00
Yuya Nishihara
008adecf23 index: rename ancestors iterators from RevWalk* to RevWalkAncestors*
I'm planning to add RevWalk trait, and this patch frees up the name. It seems
also good for consistency as we have RevWalkDescendants*.
2024-03-10 01:45:30 +09:00
Yuya Nishihara
fa60026f25 repo_path: don't panic on invalid UTF-8 path component
Although watchman client appears to fail at decoding non-UTF-8 path (somewhere
in serde), jj shouldn't panic if watchman could deal with that.

The outer error message "path not in the repo" would sound odd, but I think
that's okay because 1. it's unlikely that a user input is not UTF-8, and 2.
it's technically correct that a non-UTF-8 path is not contained in the repo.
2024-03-09 11:01:43 +09:00
Yuya Nishihara
a224d0f172 repo_path: show more detailed error if filesystem path failed to parse
This should address both use cases:
 1. If from_relative_path() is directly called, the error says ".." shouldn't
    be included in the (normalized) relative path.
 2. If parse_fs_path() is used, the error message contains paths relative to
    cwd. #3216
2024-03-09 11:01:43 +09:00
Yuya Nishihara
a76f716cd1 index: remove RevWalk newtypes that were necessary to hide impl types/traits
Some of the RevWalk methods could be generalized, but I decided to not try that
for now. I'll probably need to do more cleanup to (hopefully) remove 'index
lifetime from these types.
2024-03-08 10:07:40 +09:00
Yuya Nishihara
8451453f3a index: hide walk_revs() and related types
They are now implementation details of the default index backend.
2024-03-08 10:07:40 +09:00
Yuya Nishihara
f5eb172769 tests: remove last use of walk_revs() from integration tests 2024-03-08 10:07:40 +09:00
Martin von Zweigbergk
5ce5022ee9 cargo: mark the jj-lib-proc-macros crate for publish
I don't think we can publish a new version of the other crates without
publishing `jj-lib-proc-macros`.
2024-03-06 20:35:38 -08:00
Thomas Castiglione
d661f59f9d working_copy: implement symlinks on windows with a helper function
enables symlink tests on windows, ignoring failures due to disabled developer mode,
and updates windows.md
2024-03-05 15:16:38 +08:00
Austin Seipp
bd551099f0 cargo: update whoami dependency to 1.5.0
This requires a code tweak to avoid clippy failures, as `whoami` 1.5.0 has
deprecated the default `hostname()` function.

Signed-off-by: Austin Seipp <aseipp@pobox.com>
2024-03-04 18:35:21 -06:00
Yuya Nishihara
c8023dbd8b signing: insert tracing events to command invocation paths
This might help debug command failure.
2024-03-05 09:23:15 +09:00
Yuya Nishihara
fa7864edeb signing: ensure child processes are wait()ed on I/O error
This will also provide a better error indication. If write() failed, the child
process would presumably have exited with non-zero status and error message to
stderr.
2024-03-05 09:23:15 +09:00
Evan Mesterhazy
a09ee4b9a3 Make URLs in docs hyperlinks
`cargo doc` complains that two URLs aren't actually links:

```
warning: this URL is not a hyperlink
  --> lib/src/fsmonitor.rs:66:6
   |
66 | /// (https://facebook.github.io/watchman/). Requires `watchman` to already be
   |      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: use an automatic link instead: `<https://facebook.github.io/watchman/>`
   |
   = note: bare URLs are not automatically turned into clickable links
   = note: `#[warn(rustdoc::bare_urls)]` on by default

warning: `jj-lib` (lib doc) generated 1 warning (run `cargo fix --lib -p jj-lib` to apply 1 suggestion)
 Documenting jj-cli v0.14.0 (/Users/emesterhazy/oss/github.com/martinvonz/jj/cli)
 Documenting testutils v0.14.0 (/Users/emesterhazy/oss/github.com/martinvonz/jj/lib/testutils)
warning: this URL is not a hyperlink
    --> cli/src/cli_util.rs:2077:41
     |
2077 | /// To get started, see the tutorial at https://github.com/martinvonz/jj/blob/main/docs/tutorial.md.
     |                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: use an automatic link instead: `<https://github.com/martinvonz/jj/blob/main/docs/tutorial.md.>`
     |
     = note: bare URLs are not automatically turned into clickable links
     = note: `#[warn(rustdoc::bare_urls)]` on by default

warning: `jj-cli` (lib doc) generated 1 warning (run `cargo fix --lib -p jj-cli` to apply 1 suggestion)
```

This commit fixes the warnings by making the watchman URL a hyperlink and by
disabling the lint for the jj-cli error. Disabling the lint is the right thing
to do because the comment is captured by clap and printed when `jj --help`
runs and any markdown formatting like `<>` is passed through.
2024-03-04 16:05:42 -05:00
Yuya Nishihara
24868e5192 gpg_signing: handle early termination of gpg command in verify path
Also fixes missing wait() on I/O error. We have the same problem in several
places. I'll fix them in another batch.
2024-03-03 18:35:10 +09:00
Yuya Nishihara
a0c31134ba gpg_signing: split run_command() into sign/verify variants 2024-03-03 18:35:10 +09:00
Yuya Nishihara
093f61607e gpg_signing: leverage Command builder API to eliminate type casting noises 2024-03-03 18:35:10 +09:00
Yuya Nishihara
dbe99c8fe0 gpg_signing: extract bottom half of run() to helper function
I'll split it further to fix EPIPE handling.
2024-03-03 18:35:10 +09:00
Evan Mesterhazy
ff4a5aa491 Store OpHeadsStore in UnpublishedOperation instead of RepoLoader
The only thing we need from the `RepoLoader` is the `OpHeadsStore`, so we can
extract it in UnpublishedOperation::new instead of keeping the entire
`RepoLoader` around.
2024-03-02 23:08:57 -05:00
Evan Mesterhazy
1439c902be Replace NewRepoData with ReadonlyRepo in the UnpublishedOperation struct
`NewRepoData` is just a container that holds data used to construct a
`ReadonlyRepo`. The `ReadonlyRepo` is always constructed before the
`UnpublishedOperation` is dropped, so we can simply construct the
`ReadonlyRepo` upfront and delete the `NewRepoData` type.
2024-03-02 23:08:57 -05:00
Evan Mesterhazy
c4cbf25545 Remove the std::Option around UnpublishedOperation::data
The Option is unnecessary now since `UnpublishedOperation` doesn't implement
the Drop trait (the `MustClose` member implements it instead).
2024-03-02 23:08:57 -05:00
Evan Mesterhazy
962b188b76 Replace custom Drop impl for UnpublishedOperation with #[must_use]
The custom Drop impl prevents us from moving members of UnpublishedOperation,
and is the reason why `NewRepoData` is wrapped in an `Option`. We don't use
custom Drop functions like this for debugging elsewhere in the codebase, and in
some ways #[must_use] provides better protection since it will typically cause
a compiler error if the UnpublishedOperation isn't used.
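
A small sketch of the trade-off, assuming a hypothetical `UnpublishedOp` type with one illustrative field (not the real `UnpublishedOperation`). Without a `Drop` impl, a consuming method can move fields out of `self`. An unused value triggers the `unused_must_use` lint, which is a warning by default and an error only when warnings are denied.

```rust
// Hypothetical stand-in for UnpublishedOperation; the field is illustrative.
#[must_use = "the operation must be published or explicitly left unpublished"]
pub struct UnpublishedOp {
    description: String,
}

impl UnpublishedOp {
    pub fn new(description: &str) -> Self {
        UnpublishedOp { description: description.to_string() }
    }

    // Consumes the value and moves the field out. This would not compile
    // if UnpublishedOp implemented Drop (error E0509).
    pub fn publish(self) -> String {
        self.description
    }
}
```

Calling `UnpublishedOp::new("x");` as a bare statement is what the lint flags, since the returned value is discarded.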
2024-03-02 23:08:57 -05:00
Evan Mesterhazy
6ee19589e9 Adjust visibility of codependent MutableRepo and CommitBuilder functions
MutableRepo and CommitBuilder both define public (now crate-public) functions
which should only be called by each other. This commit adds documentation and
restricts visibility of these functions to the jj_lib crate. It might be even
better to move CommitBuilder to the same module as MutableRepo so that these
codependent functions can be private to the module to avoid misuse.
2024-03-02 22:41:47 -05:00
Ilya Grigoriev
96bf190234 Nightly clippy fixes
There are a few additional warnings because of
https://github.com/rust-lang/rust-clippy/issues/12377, which is a
nightly-only bug that will hopefully be fixed.
2024-03-02 18:19:14 -08:00
Evan Mesterhazy
2f7b15b7b1 Add documentation comments for operation, transaction, and view types 2024-03-02 15:35:41 -05:00
Evan Mesterhazy
a335321c45 Add documentation comments for several types
These comments are intended to make it easier for new developers to get up to
speed with the project. This is just a starting point... there are other types
and functions that could benefit from documentation.
2024-03-02 15:01:55 -05:00
Evan Mesterhazy
b8aa9a1a2b Make a minor simplification to CommitBuilder::write
There's no need to have a block of code at the beginning of the function to
cache the rewrite source id. We can simply check the necessary condition before
calling record_rewritten_commit.

This tweak makes the function a little easier to read since we don't check the
condition until we're ready to do the work.
2024-03-02 13:40:18 -05:00
Evan Mesterhazy
276276ea01 Reorder functions in impl Repo for MutableRepo to match trait
This is just a clean-up to silence a lint that complains that the functions are
defined in a different order than they are in the trait.
2024-03-02 13:40:04 -05:00
Yuya Nishihara
5df7a42915 merge_tools: move "ui.diff-instructions" to CLI and config/misc.toml
There are no users of this option in jj-lib. Let's simplify it.
2024-03-02 23:33:45 +09:00
Evan Mesterhazy
5c252bd8e4 Add test cases where HexPrefix::new fails due to invalid inputs 2024-03-01 10:00:22 -05:00
Yuya Nishihara
4c16c05be1 cli: move --git-repo path normalization back from workspace
This reverts dc074363d1 "no-op: Move external git repo canonicalization into
Workspace::init_git_external." As I said in the PR comment, appending ".git"
is normalization of the user input, which is IMHO more appropriate to be done
in the CLI layer.
2024-02-28 09:03:16 +09:00
Evan Mesterhazy
a28beb5b8f Allow id_type! to capture doc comments
This allows us to define documentation comments for types implemented using the
id_type! macro. Comments defined above the type inside the macro will be
captured and visible in generated docs.

Example:

```
id_type!(
    /// Stable identifier for a [`Commit`]. Unlike the `CommitId`, the `ChangeId`
    /// follows the commit and is not updated when the commit is rewritten.
    pub ChangeId
);
```

This commit also adds documentation for the `CommitId` and `ChangeId` types
defined using the `id_type!` macro.
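
For illustration only, here is a simplified sketch of how a `macro_rules!` macro
can capture doc comments and forward them to the generated type. This is not the
real `id_type!` definition; the generated methods and the `ExampleId` name are
assumptions made for the example.

```rust
/// Hypothetical, simplified ID-newtype macro (not the jj_lib implementation).
macro_rules! id_type {
    // `$(#[$attr:meta])*` matches zero or more attributes. A `///` doc
    // comment is sugar for `#[doc = "..."]`, so it is captured here too.
    ($(#[$attr:meta])* $vis:vis $name:ident) => {
        // Re-emit the captured attributes on the generated struct.
        $(#[$attr])*
        #[derive(Debug, Clone, PartialEq, Eq, Hash)]
        $vis struct $name(Vec<u8>);

        impl $name {
            $vis fn new(bytes: Vec<u8>) -> Self {
                Self(bytes)
            }

            $vis fn as_bytes(&self) -> &[u8] {
                &self.0
            }
        }
    };
}

id_type!(
    /// Example identifier. This comment ends up on the generated struct.
    pub ExampleId
);

fn main() {
    let id = ExampleId::new(vec![0xab, 0xcd]);
    println!("{:?}", id.as_bytes());
}
```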
2024-02-27 10:37:05 -05:00
Yuya Nishihara
ef9d22887c tests: disable gpg unknown_key() test on Windows as well
Follows up 7552f939c6 "tests: disable most gpg integration tests on Windows."
I couldn't find this test failing in a few samples before, but it does now.
2024-02-27 00:55:06 +09:00
Martin von Zweigbergk
1cbf2b4acf rewrite: allow working-copy to be abandoned
This removes the special handling of the working-copy commit. By
recording when an empty/emptied commit was abandoned, we rebase
descendants correctly and create a new empty working-copy commit on
top.
2024-02-25 16:39:05 -08:00
Martin von Zweigbergk
3bc3a63411 rewrite: move decision about abandoned commit into update_references() 2024-02-25 16:39:05 -08:00
Yuya Nishihara
7552f939c6 tests: disable most gpg integration tests on Windows
These tests often get stuck on Windows CI for unknown reasons. Let's mark them
ignored for the moment. The unknown_key test is allowed because it somehow
appears to pass.

https://github.com/martinvonz/jj/actions/runs/8009950119/job/21879789008?pr=3123#step:7:1487

#3140
2024-02-25 17:07:05 +09:00
Yuya Nishihara
e588a9babc backend: allow cheap copy of MillisSinceEpoch(i64)
It's unlikely this type will become uncopyable.
2024-02-25 09:00:56 +09:00
Yuya Nishihara
ebf90384f6 operation: add shorthand for .store_operation().metadata 2024-02-25 09:00:56 +09:00
Yuya Nishihara
a67aa08995 gitignore: make objects chain be more Arc friendly
This partially reverts changes in a9f489ccdf "Switch to ignore crate for
gitignore handling." Since child ignore object no longer needs to access the
root to resolve the prefix path, it's simpler to store a matcher per node.
2024-02-24 15:55:10 +09:00
Yuya Nishihara
febae9f9e8 gitignore: fix prefix handling when chaining .gitignore in sub directory
The prefix is relative to the root, not to the parent .gitignore file.

Fixes #3126
2024-02-24 15:55:10 +09:00
Yuya Nishihara
2f25848883 gitignore: update file ordering test to not use relative path in patterns
With the current implementation, the file3 pattern is set to the prefix
"foo/foo/bar". I don't know if (unrooted) "baz" prefixed with "foo/foo/bar"
should match "foo/bar/baz", but apparently it does. Anyway, that wouldn't be
the case in practice because adjacent .gitignore files shouldn't be loaded.
2024-02-24 15:55:10 +09:00
Yuya Nishihara
073310547c operation: make Operation object cheaply clonable
We do clone Operation object in several places, and I'm going to add one more
.clone() in the templater. Since the underlying metadata has many fields, I
think it's better to wrap it with Arc just like a Commit object.
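
A rough sketch of the idea, using hypothetical `Op` and `Metadata` types rather
than the real jj_lib ones: wrapping the field-heavy data in an `Arc` makes
`.clone()` a reference-count bump instead of a deep copy.

```rust
use std::sync::Arc;

// Hypothetical stand-ins; the field names are assumptions for illustration.
#[derive(Debug)]
struct Metadata {
    description: String,
    hostname: String,
    username: String,
}

#[derive(Debug, Clone)]
struct Op {
    id: u64,
    // Cloning `Op` only clones the `Arc` pointer, not the strings inside.
    data: Arc<Metadata>,
}

fn main() {
    let op = Op {
        id: 1,
        data: Arc::new(Metadata {
            description: "snapshot working copy".to_owned(),
            hostname: "host".to_owned(),
            username: "user".to_owned(),
        }),
    };
    let copy = op.clone();
    // Both values share the same allocation.
    assert!(Arc::ptr_eq(&op.data, &copy.data));
    println!(
        "op {} by {}@{}: {} ({} references)",
        copy.id,
        copy.data.username,
        copy.data.hostname,
        copy.data.description,
        Arc::strong_count(&op.data)
    );
}
```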
2024-02-23 10:13:25 +09:00
Yuya Nishihara
62f0cb8c3f cli: change default log revset to not include all tagged heads
The default immutable_heads() includes tags(), which makes sense, but computing
heads(tags()) can be expensive because the tags() set is usually sparse. For
example, "jj bench revset 'heads(tags())'" took 157ms in my linux stable
mirror. We can of course optimize the heads evaluation by using bit set or
segmented index, but the query includes many historical heads if the repository
has per-release branches, which are uninteresting anyway. So, this patch
replaces heads(immutable_heads()) with trunk().

The reason we included heads(immutable_heads()) was to mitigate the following
problem. Assuming trunk() is the branch that work is based on, I think using
trunk() here is good enough.

```
A   B
*---*----* trunk() ⊆ immutable_heads()
     \
      * C
```
https://github.com/martinvonz/jj/pull/2247#discussion_r1335078879
2024-02-23 00:25:58 +09:00
Yuya Nishihara
f21c078249 revset: ad-hoc optimization for range queries containing unwanted wanted heads
In my linux stable mirror, this makes the default log revset evaluation super
fast. immutable_heads(), if configured properly, includes many historical
branch heads which are also the visible heads.

revsets/immutable_heads()..
---------------------------
0     12.27     117.1±0.77m
3      1.00       9.5±0.08m
2024-02-22 23:26:29 +09:00
Yuya Nishihara
f71f065b17 revset: rename InternalRevset::iter() to ::entries() 2024-02-22 23:26:29 +09:00
Yuya Nishihara
1572c251ef revset: add positions() iterator to InternalRevset
I just wanted to clean up the callers, but this might also be marginally
faster.
2024-02-22 23:26:29 +09:00
Yuya Nishihara
33c7e18ac8 revset: flip ordering of generic combination iterators
As a general-purpose iterator combinator, ascending order makes more sense.
2024-02-22 23:26:29 +09:00
Yuya Nishihara
22933563e8 revset: extract generic combination iterators
I'm going to add pre-filtering to the 'roots..heads' evaluation path, and
difference_by() will be used there to calculate 'heads ~ roots'.

Union and intersection iterators are slightly changed so that all iterators
prioritize iter1's item.
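
To illustrate what a generic difference combinator over sorted iterators can look
like, here is a minimal sketch. It is not the code from this commit: the
signature is an assumption, and it expects both inputs in ascending order
according to `cmp`.

```rust
use std::cmp::Ordering;

/// Yields the items of `iter1` that have no equal item in `iter2`.
/// Assumes both inputs are sorted in ascending order according to `cmp`.
fn difference_by<I1, I2, C>(iter1: I1, iter2: I2, mut cmp: C) -> impl Iterator<Item = I1::Item>
where
    I1: IntoIterator,
    I2: IntoIterator,
    C: FnMut(&I1::Item, &I2::Item) -> Ordering,
{
    let mut iter1 = iter1.into_iter();
    let mut iter2 = iter2.into_iter().peekable();
    std::iter::from_fn(move || loop {
        let item1 = iter1.next()?;
        let mut found = false;
        while let Some(item2) = iter2.peek() {
            match cmp(&item1, item2) {
                // item2 sorts before item1 and can no longer match anything.
                Ordering::Greater => {
                    iter2.next();
                }
                Ordering::Equal => {
                    found = true;
                    break;
                }
                Ordering::Less => break,
            }
        }
        if !found {
            return Some(item1);
        }
    })
}

fn main() {
    let heads = vec![1, 2, 3, 5, 8];
    let roots = vec![2, 3, 4, 8];
    let diff: Vec<i32> = difference_by(heads, roots, |a, b| a.cmp(b)).collect();
    println!("{diff:?}");
}
```

Each input is consumed once, so the cost is linear in the combined length of
the two sequences.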
2024-02-22 23:26:29 +09:00
Julien Vincent
f97e929cbf sign: Skip gpg tests if gpg is not installed
This adds a guard to the gpg signing tests which will skip the test if
`gpg` is not installed on the system.

This is done in order to avoid requiring all collaborators to have set up
all the tools on their local machines that are required to test commit
signing.
2024-02-21 13:22:53 +00:00
Yuya Nishihara
9f05aa8c46 tests: fix fun typo "singing" -> "signing" 2024-02-21 22:04:41 +09:00
Yuya Nishihara
e3d2ff2b75 signing: change default gpg program, add --keyid-format option accordingly
This is the default of Git, and Debian sid doesn't install the gpg2 symlink
by default.

https://github.com/git/git/blob/v2.43.2/gpg-interface.c#L92
https://github.com/martinvonz/jj/pull/3007#discussion_r1496877808
https://packages.debian.org/bookworm/gnupg2
2024-02-21 22:04:41 +09:00
Austin Seipp
6c31bab0d3 fsmonitor: allow core.fsmonitor = "none" to disable
When doing things like testing snapshot performance differences,
this allows you to turn off the monitor, no matter what the user or
repository configuration has enabled, e.g.

    jj st --config-toml='core.fsmonitor="none"'

Signed-off-by: Austin Seipp <aseipp@pobox.com>
2024-02-20 20:19:47 -06:00
Evan Mesterhazy
79518eafce Output better error messages when deriving ContentHash for an enum fails
Consider this code:
```
struct NoContentHash {}

#[derive(ContentHash)]
enum Hashable {
    NoCanHash(NoContentHash),
    Empty,
}
```

Before this commit, it generates an error like this:
```
error[E0277]: the trait bound `NoContentHash: ContentHash` is not satisfied
   --> lib/src/content_hash.rs:150:10
    |
150 | #[derive(ContentHash)]
    |          ^^^^^^^^^^^ the trait `ContentHash` is not implemented for `NoContentHash`
151 | enum Hashable {
152 |     NoCanHash(NoContentHash),
    |     --------- required by a bound introduced by this call
    |
    = help: the following other types implement trait `ContentHash`:
              bool
              i32
              i64
              u8
              u32
              u64
              std::collections::HashMap<K, V>
              BTreeMap<K, V>
            and 35 others

For more information about this error, try `rustc --explain E0277`.
```

After this commit, it generates a better error message:
```
error[E0277]: the trait bound `NoContentHash: ContentHash` is not satisfied
   --> lib/src/content_hash.rs:152:15
    |
152 |     NoCanHash(NoContentHash),
    |               ^^^^^^^^^^^^^ the trait `ContentHash` is not implemented for `NoContentHash`
    |
    = help: the following other types implement trait `ContentHash`:
              bool
              i32
              i64
              u8
              u32
              u64
              std::collections::HashMap<K, V>
              BTreeMap<K, V>
            and 35 others

For more information about this error, try `rustc --explain E0277`.
error: could not compile `jj-lib` (lib) due to 1 previous error
```

It also works for enum variants with named fields:
```
error[E0277]: the trait bound `NoContentHash: ContentHash` is not satisfied
   --> lib/src/content_hash.rs:152:23
    |
152 |     NoCanHash { named: NoContentHash },
    |                       ^^^^^^^^^^^^^ the trait `ContentHash` is not implemented for `NoContentHash`
    |
    = help: the following other types implement trait `ContentHash`:
              bool
              i32
              i64
              u8
              u32
              u64
              std::collections::HashMap<K, V>
              BTreeMap<K, V>
            and 35 others

For more information about this error, try `rustc --explain E0277`.
```
2024-02-20 16:29:25 -05:00
Evan Mesterhazy
e8f324ffde Replace uses of content_hash! with #[derive(ContentHash)]
This is a pure refactor with no behavior changes.

#3054
2024-02-20 14:18:13 -05:00
Evan Mesterhazy
966a5505e2 Add support for deriving ContentHash for Enums
Here's an example of what the derived output looks like for an enum:

```rust
pub enum TreeValue {
    File { id: FileId, executable: bool },
    Symlink(SymlinkId),
    Tree(TreeId),
    GitSubmodule(CommitId),
    Conflict(ConflictId),
}
#[automatically_derived]
impl ::jj_lib::content_hash::ContentHash for TreeValue {
    fn hash(&self, state: &mut impl digest::Update) {
        match self {
            Self::File { id, executable } => {
                state.update(&0u32.to_le_bytes());
                ::jj_lib::content_hash::ContentHash::hash(id, state);
                ::jj_lib::content_hash::ContentHash::hash(executable, state);
            }
            Self::Symlink(field_0) => {
                state.update(&1u32.to_le_bytes());
                ::jj_lib::content_hash::ContentHash::hash(field_0, state);
            }
            Self::Tree(field_0) => {
                state.update(&2u32.to_le_bytes());
                ::jj_lib::content_hash::ContentHash::hash(field_0, state);
            }
            Self::GitSubmodule(field_0) => {
                state.update(&3u32.to_le_bytes());
                ::jj_lib::content_hash::ContentHash::hash(field_0, state);
            }
            Self::Conflict(field_0) => {
                state.update(&4u32.to_le_bytes());
                ::jj_lib::content_hash::ContentHash::hash(field_0, state);
            }
        }
    }
}
```


#3054
2024-02-20 12:59:35 -05:00
Evan Mesterhazy
8e1a6c708f Add support for generics to #[derive(ContentHash)]
#3054
2024-02-20 12:48:25 -05:00
Daehyeok Mun
a9f489ccdf Switch to ignore crate for gitignore handling.
Co-authored-by: Waleed Khan <me@waleedkhan.name>
2024-02-20 09:12:46 -08:00
Evan Mesterhazy
965d6ce4e4 Implement a procedural macro to derive the ContentHash trait for structs
This is a no-op in terms of function, but provides a nicer way to derive the
ContentHash trait for structs using the `#[derive(ContentHash)]` syntax used
for other traits such as `Debug`.

This commit only adds the macro. A subsequent commit will replace uses of
`content_hash!{}` with `#[derive(ContentHash)]`.

The new macro generates nice error messages, just like the old macro:

```
error[E0277]: the trait bound `NotImplemented: content_hash::ContentHash` is not satisfied
   --> lib/src/content_hash.rs:265:16
    |
265 |             z: NotImplemented,
    |                ^^^^^^^^^^^^^^ the trait `content_hash::ContentHash` is not implemented for `NotImplemented`
    |
    = help: the following other types implement trait `content_hash::ContentHash`:
              bool
              i32
              i64
              u8
              u32
              u64
              std::collections::HashMap<K, V>
              BTreeMap<K, V>
            and 38 others
```

This commit does two things to make proc macros re-exported by jj_lib useable
by deps:

1. jj_lib needs to be able to refer to itself as `jj_lib` which it does
   by adding an `extern crate self as jj_lib` declaration.

2. jj_lib::content_hash needs to re-export the `digest::Update` type so that
   users of jj_lib can use the `#[derive(ContentHash)]` proc macro without
   directly depending on the digest crate. This is done by re-exporting it
   as `DigestUpdate`.


#3054
2024-02-20 11:29:05 -05:00
Ilya Grigoriev
106483ad6a clippy: run nightly cargo clippy --fix 2024-02-19 23:38:33 -08:00
Martin von Zweigbergk
11c67cf979 op_store: add metadata flag for ops representing working-copy snapshot
It should be useful at least in the presentation layer to know which
operations correspond to working-copy snapshots. They might be
rendered differently in the graph, for example. Or maybe an undo
command wants to warn if you just undid a snapshot operation. This
patch just introduces a field in the metadata to store the
information.
2024-02-19 22:44:38 -08:00
Julien Vincent
23e5fba737 sign: Add SSH backend tests 2024-02-20 00:02:08 +00:00
Julien Vincent
5e24677301 sign: Implement SSH signing backend 2024-02-20 00:02:08 +00:00
Julien Vincent
7c11a61c23 sign: GPG backend tests 2024-02-20 00:02:08 +00:00
Anton Bulakh
0efaef2da9 sign: Implement GPG signing backend
Now it is actually possible to set GPG as the main backend and have jj
"preserving" signatures on rewrites. Just no way to make signatures yet
2024-02-20 00:02:08 +00:00
Martin von Zweigbergk
3f1d75f518 rewrite: default to not simplifying ancestor merges
This means auto-rebase will no longer simplify ancestor merges.
2024-02-19 14:20:18 -08:00
Martin von Zweigbergk
a9d0300b11 rewrite: make simplification of ancestor merges optional
I think the conclusion from #2600 is that at least auto-rebasing
should not simplify merge commits that merge a commit with its
ancestor. Let's start by adding an option for that in the library.
2024-02-19 14:20:18 -08:00
Yuya Nishihara
0c0eb37f2e index: don't store commit ids in sorted lookup table to save disk space
This reduces the index file size. In my linux mirror repo containing 1591524
commits, the initial index file shrank from 122MB to 92MB. In theory, this
makes commit id lookup slow because of additional indirection and cache miss,
but I don't see significant difference. In mid-size repo, this is actually a
bit faster thanks to smaller index reads.

Alternatively, the commit id field could be removed from the CommitGraphEntry,
but doing that would introduce indirect lookup there, and the index disk size
isn't as small as this change.

- jj-0 baseline                         122MB
- jj-1 shrink CommitLookupEntry (this)   92MB
- jj-3 shrink CommitGraphEntry           98MB

Mid-size repo, "log" with default template
```
% hyperfine --sort command --warmup 3 --runs 20 -L bin jj-0,jj-1,jj-2,jj-3 \
  -s "target/release-with-debug/{bin} -R ~/mirrors/linux debug reindex" \
  "target/release-with-debug/{bin} -R ~/mirrors/linux --ignore-working-copy log -r.. -l100 --config-toml='revsets.short-prefixes=\"\"'"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux --ignore-working-copy log -r.. -l100 --config-toml='revsets.short-prefixes=""'
  Time (mean ± σ):     177.7 ms ±  12.9 ms    [User: 96.3 ms, System: 81.5 ms]
  Range (min … max):   156.8 ms … 191.2 ms    20 runs

Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux --ignore-working-copy log -r.. -l100 --config-toml='revsets.short-prefixes=""'
  Time (mean ± σ):     169.8 ms ±  13.8 ms    [User: 93.3 ms, System: 76.6 ms]
  Range (min … max):   151.1 ms … 191.5 ms    20 runs

Benchmark 4: target/release-with-debug/jj-3 -R ~/mirrors/linux --ignore-working-copy log -r.. -l100 --config-toml='revsets.short-prefixes=""'
  Time (mean ± σ):     170.3 ms ±  13.4 ms    [User: 90.1 ms, System: 79.7 ms]
  Range (min … max):   154.8 ms … 186.2 ms    20 runs

Relative speed comparison
        1.05 ±  0.11  target/release-with-debug/jj-0 -R ~/mirrors/linux --ignore-working-copy log -r.. -l100 --config-toml='revsets.short-prefixes=""'
        1.00          target/release-with-debug/jj-1 -R ~/mirrors/linux --ignore-working-copy log -r.. -l100 --config-toml='revsets.short-prefixes=""'
        1.00 ±  0.11  target/release-with-debug/jj-3 -R ~/mirrors/linux --ignore-working-copy log -r.. -l100 --config-toml='revsets.short-prefixes=""'
```

Small repo, "log" thousands of commits with -T"commit_id.shortest()"
```
% hyperfine --sort command --warmup 3 --runs 100 -L bin jj-0,jj-1,jj-2,jj-3 \
  -s "target/release-with-debug/{bin} -R ~/mirrors/git debug reindex" \
  "target/release-with-debug/{bin} -R ~/mirrors/git --ignore-working-copy log -r.. -l5000 -T'commit_id.shortest()' --config-toml='revsets.short-prefixes=\"\"'"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/git --ignore-working-copy log -r.. -l5000 -T'commit_id.shortest()' --config-toml='revsets.short-prefixes=""'
  Time (mean ± σ):     179.3 ms ±  12.8 ms    [User: 149.7 ms, System: 29.6 ms]
  Range (min … max):   155.2 ms … 191.0 ms    100 runs

Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/git --ignore-working-copy log -r.. -l5000 -T'commit_id.shortest()' --config-toml='revsets.short-prefixes=""'
  Time (mean ± σ):     179.1 ms ±  13.7 ms    [User: 148.5 ms, System: 30.5 ms]
  Range (min … max):   157.2 ms … 196.7 ms    100 runs

Benchmark 4: target/release-with-debug/jj-3 -R ~/mirrors/git --ignore-working-copy log -r.. -l5000 -T'commit_id.shortest()' --config-toml='revsets.short-prefixes=""'
  Time (mean ± σ):     178.2 ms ±  13.6 ms    [User: 148.7 ms, System: 29.6 ms]
  Range (min … max):   156.5 ms … 191.7 ms    100 runs

Relative speed comparison
        1.01 ±  0.11  target/release-with-debug/jj-0 -R ~/mirrors/git --ignore-working-copy log -r.. -l5000 -T'commit_id.shortest()' --config-toml='revsets.short-prefixes=""'
        1.01 ±  0.11  target/release-with-debug/jj-1 -R ~/mirrors/git --ignore-working-copy log -r.. -l5000 -T'commit_id.shortest()' --config-toml='revsets.short-prefixes=""'
        1.01 ±  0.11  target/release-with-debug/jj-3 -R ~/mirrors/git --ignore-working-copy log -r.. -l5000 -T'commit_id.shortest()' --config-toml='revsets.short-prefixes=""'
```
2024-02-19 11:36:45 +09:00
Vladimir Petrzhikovskii
06d67f02d8 cli: list new remote branches during git fetch 2024-02-18 17:36:01 +01:00
Yuya Nishihara
a1b16c5583 index: build reachable change ids set lazily
Instead of abstracting RevWalk over borrowed/Arc-ed index types, I decided to
implement bitset-based ancestor traversal. It's simpler and probably faster so
long as the set isn't sparse.
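
The general idea of a bitset-based ancestor traversal can be sketched as follows.
This is a simplified illustration, not the index code: it assumes commits are
addressed by positions in which every parent has a lower position than its
children, and a `Vec<bool>` stands in for a packed bit set.

```rust
/// Marks every position reachable from `heads` (including the heads).
/// `parents[pos]` lists the parent positions of `pos`; each parent position
/// is assumed to be lower than `pos`.
fn ancestors_bitset(parents: &[Vec<usize>], heads: &[usize]) -> Vec<bool> {
    let mut reachable = vec![false; parents.len()];
    for &head in heads {
        reachable[head] = true;
    }
    // Walk from the highest position down. Because parents always have lower
    // positions, a single pass is enough to propagate reachability.
    for pos in (0..parents.len()).rev() {
        if reachable[pos] {
            for &parent in &parents[pos] {
                reachable[parent] = true;
            }
        }
    }
    reachable
}

fn main() {
    // 0 is the root; 3 merges 1 and 2; 4 is a sibling branch on top of 2.
    let parents = vec![vec![], vec![0], vec![0], vec![1, 2], vec![2]];
    println!("{:?}", ancestors_bitset(&parents, &[3]));
}
```

The pass visits every position up to the highest head, which is why this kind of
traversal pays off only when the reachable set is not sparse.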

"jj log" without working copy snapshot:
```
% hyperfine --sort command --warmup 3 --runs 20 -L bin jj-0,jj-1,jj-2 \
  -s "target/release-with-debug/{bin} -R ~/mirrors/linux debug reindex" \
  "target/release-with-debug/{bin} -R ~/mirrors/linux \
   --ignore-working-copy log -r.. -l100 --config-toml='revsets.short-prefixes=\"\"'"
Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux --ignore-working-copy log -r.. -l100 --config-toml='revsets.short-prefixes=""'
  Time (mean ± σ):     271.3 ms ±   9.9 ms    [User: 183.8 ms, System: 87.7 ms]
  Range (min … max):   250.5 ms … 282.7 ms    20 runs

Benchmark 3: target/release-with-debug/jj-2 -R ~/mirrors/linux --ignore-working-copy log -r.. -l100 --config-toml='revsets.short-prefixes=""'
  Time (mean ± σ):     177.5 ms ±  12.6 ms    [User: 94.6 ms, System: 82.9 ms]
  Range (min … max):   154.4 ms … 188.7 ms    20 runs

Relative speed comparison
        1.53 ±  0.12  target/release-with-debug/jj-1 -R ~/mirrors/linux --ignore-working-copy log -r.. -l100 --config-toml='revsets.short-prefixes=""'
        1.00          target/release-with-debug/jj-2 -R ~/mirrors/linux --ignore-working-copy log -r.. -l100 --config-toml='revsets.short-prefixes=""'
```

"jj status" with working copy snapshot (watchman enabled):
```
% hyperfine --sort command --warmup 3 --runs 20 -L bin jj-0,jj-1,jj-2 \
  -s "target/release-with-debug/{bin} -R ~/mirrors/linux debug reindex" \
  "target/release-with-debug/{bin} -R ~/mirrors/linux \
   status --config-toml='revsets.short-prefixes=\"\"'"
Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status --config-toml='revsets.short-prefixes=""'
  Time (mean ± σ):     318.6 ms ±  12.6 ms    [User: 219.1 ms, System: 94.1 ms]
  Range (min … max):   294.2 ms … 333.0 ms    20 runs

Benchmark 3: target/release-with-debug/jj-2 -R ~/mirrors/linux status --config-toml='revsets.short-prefixes=""'
  Time (mean ± σ):     214.7 ms ±  15.0 ms    [User: 117.4 ms, System: 96.1 ms]
  Range (min … max):   198.4 ms … 243.3 ms    20 runs

Relative speed comparison
        1.48 ±  0.12  target/release-with-debug/jj-1 -R ~/mirrors/linux status --config-toml='revsets.short-prefixes=""'
        1.00          target/release-with-debug/jj-2 -R ~/mirrors/linux status --config-toml='revsets.short-prefixes=""'
```
2024-02-19 00:54:43 +09:00
Yuya Nishihara
adcb01ef95 index: move RevWalk tests to inner module
The main tests module is getting bigger, and these tests are very specific
to the RevWalk* implementations.
2024-02-19 00:54:43 +09:00
Yuya Nishihara
924a5fc842 index: inline entry size calculation
There aren't many callers now, and using self.commit_id_length might help
compiler remove redundant bounds checking in CommitLookupEntry.
2024-02-19 00:47:46 +09:00
Yuya Nishihara
d5c75da4f5 index: precompute base data offsets
These offsets are getting messier, so let's calculate them in one place. This
will probably help compiler optimization.
2024-02-19 00:47:46 +09:00
Yuya Nishihara
3c7aa75b9b index: switch to persistent change id index
The shortest change id prefix will become a few digits longer, but I think
that's acceptable. Entries included in the "revsets.short-prefixes" set are
unaffected.

The reachable set is calculated eagerly, but this is still faster as we no
longer need to sort the reachable entries by change id. The lazy version will
save another ~100ms in mid-size repos.

"jj log" without working copy snapshot:
```
% hyperfine --sort command --warmup 3 --runs 20 -L bin jj-0,jj-1,jj-2 \
  -s "target/release-with-debug/{bin} -R ~/mirrors/linux debug reindex" \
  "target/release-with-debug/{bin} -R ~/mirrors/linux \
   --ignore-working-copy log -r.. -l100 --config-toml='revsets.short-prefixes=\"\"'"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux --ignore-working-copy log -r.. -l100 --config-toml='revsets.short-prefixes=""'
  Time (mean ± σ):     353.6 ms ±  11.9 ms    [User: 266.7 ms, System: 87.0 ms]
  Range (min … max):   329.0 ms … 365.6 ms    20 runs

Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux --ignore-working-copy log -r.. -l100 --config-toml='revsets.short-prefixes=""'
  Time (mean ± σ):     271.3 ms ±   9.9 ms    [User: 183.8 ms, System: 87.7 ms]
  Range (min … max):   250.5 ms … 282.7 ms    20 runs

Relative speed comparison
        1.99 ±  0.16  target/release-with-debug/jj-0 -R ~/mirrors/linux --ignore-working-copy log -r.. -l100 --config-toml='revsets.short-prefixes=""'
        1.53 ±  0.12  target/release-with-debug/jj-1 -R ~/mirrors/linux --ignore-working-copy log -r.. -l100 --config-toml='revsets.short-prefixes=""'
```

"jj status" with working copy snapshot (watchman enabled):
```
% hyperfine --sort command --warmup 3 --runs 20 -L bin jj-0,jj-1,jj-2 \
  -s "target/release-with-debug/{bin} -R ~/mirrors/linux debug reindex" \
  "target/release-with-debug/{bin} -R ~/mirrors/linux \
   status --config-toml='revsets.short-prefixes=\"\"'"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux status --config-toml='revsets.short-prefixes=""'
  Time (mean ± σ):     396.6 ms ±  10.1 ms    [User: 300.7 ms, System: 94.0 ms]
  Range (min … max):   373.6 ms … 408.0 ms    20 runs

Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux status --config-toml='revsets.short-prefixes=""'
  Time (mean ± σ):     318.6 ms ±  12.6 ms    [User: 219.1 ms, System: 94.1 ms]
  Range (min … max):   294.2 ms … 333.0 ms    20 runs

Relative speed comparison
        1.85 ±  0.14  target/release-with-debug/jj-0 -R ~/mirrors/linux status --config-toml='revsets.short-prefixes=""'
        1.48 ±  0.12  target/release-with-debug/jj-1 -R ~/mirrors/linux status --config-toml='revsets.short-prefixes=""'
```
2024-02-18 09:44:57 +09:00
Yuya Nishihara
5f3a31300b index: implement index-level change id lookup methods
These methods are basically the same as the commit_id versions, but
resolve_change_id_prefix() is a bit more involved as we need to gather matches
from multiple segments.
2024-02-18 09:44:57 +09:00
Yuya Nishihara
f73e590837 index: implement segment-level change id lookup methods
In resolve_change_id_prefix(), I've implemented two different ways of
collecting the overflow items. I don't think they impact the performance,
but we can switch to the alternative method as needed.
2024-02-18 09:44:57 +09:00
Yuya Nishihara
8cdf6d752c index: move change ids to sstable, build change-id-to-pos lookup table
This basically means that the change ids are interned. We'll implement binary
search over the sorted change ids table. The table could be sorted differently
for better cache locality, but it is in lexicographical order for simplicity.
With my testing, the cost of the id lookup isn't dominant.

Unlike the parent entries, the size of the per-id overflow items isn't saved.
That's because the number of the same-change-id commits is either 1 or many.
It doesn't make sense to allocate 8 bytes for each change id. Instead, we'll
pay extra indirection cost to determine the size.
2024-02-18 09:44:57 +09:00
Yuya Nishihara
9974a46327 index: clarify parent entries are global positions
I'm going to add change id overflow table whose elements are of LocalPosition
type. Let's make sure that the serialization code would break if we changed
the underlying data type.
2024-02-18 09:44:57 +09:00
Thomas Castiglione
aaa5d6bc4f working_copy: add Send supertrait
If WorkingCopy: Send, then Workspace is Send, which is useful for long-running
servers. All existing impls are Send already, so this is just a marker.
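
A small sketch of why the supertrait matters, using hypothetical `LocalCopy` and
`Workspace` types with a much smaller trait than the real one: once the trait
requires `Send`, a boxed trait object (and any struct holding it) can be moved
to another thread.

```rust
use std::thread;

// Simplified stand-in for the real trait; the method is made up.
trait WorkingCopy: Send {
    fn name(&self) -> String;
}

struct LocalCopy {
    path: String,
}

impl WorkingCopy for LocalCopy {
    fn name(&self) -> String {
        format!("local:{}", self.path)
    }
}

// `Box<dyn WorkingCopy>` is `Send` because the trait has `Send` as a
// supertrait, so `Workspace` is automatically `Send` as well.
struct Workspace {
    working_copy: Box<dyn WorkingCopy>,
}

fn main() {
    let workspace = Workspace {
        working_copy: Box::new(LocalCopy {
            path: "repo".to_owned(),
        }),
    };
    // Moving the workspace into another thread compiles only if it is `Send`.
    let name = thread::spawn(move || workspace.working_copy.name())
        .join()
        .unwrap();
    println!("{name}");
}
```

Without the `Send` bound on the trait, the `thread::spawn` call above would be
rejected at compile time.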
2024-02-17 15:13:25 +08:00
Yuya Nishihara
5eea88d26a tests: fix concurrent git read/write test to retry on ref lock contention
Apparently, gix has a 100ms timeout. Since this test tries to create a contended
situation, it's possible that the ref lock can't be acquired. I've added an
upper bound to the retry loop at b37293fa68 "tests: add upper bound to
test_concurrent_read_write_commit() loop", so ignoring arbitrary errors
should be okay.

The problem can be reproduced on my Linux machine by inserting 10ms sleep() to
gix and increasing the concurrency.

Fixes #3069
2024-02-17 15:09:27 +09:00
Yuya Nishihara
ce295f8bc2 op_store: remove unneeded repr(u8) from RemoteRefState
It no longer makes sense after e1fd402d39 "Fix the ContentHash implementations
for std::Option, MergedTreeId, and RemoteRefState."
2024-02-17 02:13:44 +09:00
Yuya Nishihara
718d080e7a index: make reindexing message less scary 2024-02-17 01:45:23 +09:00
Evan Mesterhazy
a80d0183a2 Implement ContentHash for u32 and u64
This is for completeness and to avoid accidents such as someone calling
`ContentHash::hash(1234u32.to_le_bytes())` and expecting it to hash properly as
a u32 instead of a 4 byte slice, which produces a different hash due to hashing
the length of the slice before its contents.
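
The difference can be seen with a toy model. The `ToyHash` trait below is a
hypothetical stand-in that appends bytes to a `Vec<u8>` instead of feeding a real
digest, and the choice of a 64-bit little-endian length prefix for slices is an
assumption made for the example.

```rust
/// Toy stand-in for a content-hash trait: "hashing" appends bytes to a buffer.
trait ToyHash {
    fn toy_hash(&self, state: &mut Vec<u8>);
}

impl ToyHash for u8 {
    fn toy_hash(&self, state: &mut Vec<u8>) {
        state.push(*self);
    }
}

impl ToyHash for u32 {
    fn toy_hash(&self, state: &mut Vec<u8>) {
        // Fixed-width integer: just the little-endian bytes.
        state.extend_from_slice(&self.to_le_bytes());
    }
}

impl<T: ToyHash> ToyHash for [T] {
    fn toy_hash(&self, state: &mut Vec<u8>) {
        // Variable-length data: length prefix first, then each element.
        state.extend_from_slice(&(self.len() as u64).to_le_bytes());
        for item in self {
            item.toy_hash(state);
        }
    }
}

fn hashed<T: ToyHash + ?Sized>(value: &T) -> Vec<u8> {
    let mut state = Vec::new();
    value.toy_hash(&mut state);
    state
}

fn main() {
    let as_int = hashed(&1234u32);
    let as_slice = hashed(&1234u32.to_le_bytes()[..]);
    println!("{} bytes vs {} bytes", as_int.len(), as_slice.len());
}
```

The same four bytes produce different input to the hasher depending on whether
they are hashed as an integer or as a slice, so the resulting hashes differ.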
2024-02-16 10:23:39 -05:00
Evan Mesterhazy
e1fd402d39 Fix the ContentHash implementations for std::Option, MergedTreeId, and RemoteRefState
The `ContentHash` documentation specifies that implementations for enums should
hash the ordinal number of the variant contained in the enum as a 32-bit
little-endian number and then hash the contents of the variant, if any.

The current implementations for `std::Option`, `MergedTreeId`, and
`RemoteRefState` are non-conformant since they hash the ordinal number as a u8
with platform specific endianness.


Fixes #3051
2024-02-16 09:27:32 -05:00
Yuya Nishihara
903f18acfd index: extract helper functions for id lookup in mutable table
Similar to the previous commit, these functions will be reused by the change id
lookup methods. The return value isn't cloned because resolve_id_prefix() will
return (key, value) pair, and the current caller doesn't need a cloned value.
2024-02-16 11:12:53 +09:00
Yuya Nishihara
000cb41c7e index: extract helper struct for post processing binary search result
This code will be shared among commit id and change id lookup functions.
2024-02-16 11:12:53 +09:00
Yuya Nishihara
6fa660d9a8 index: extract inner binary search function
The callback returns Ordering instead of &[u8] due to lifetime difficulty.
2024-02-16 11:12:53 +09:00
Yuya Nishihara
2e64bf83fd index: pass bytes prefix to binary search function
This helps extract common binary search helper to be used by change id index.
2024-02-14 23:34:47 +09:00
Yuya Nishihara
91a68b950d index: adjust binary search function to conform to std behavior
This removes redundant case from resolve_neighbor_commit_ids(). The returned
position should never be lower than the prefix id.

The implementation is basically a copy of slice::binary_search_by(). We still
use (low + high) / 2 as the size wouldn't exceed 2^31.

https://github.com/rust-lang/rust/blob/1.76.0/library/core/src/slice/mod.rs#L2825
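
For reference, this is how the std convention plays out for a prefix lookup. The
sketch uses `slice::binary_search_by` directly on a sorted list of byte-string
ids; the `lookup_prefix` helper is hypothetical and assumes the ids are unique.

```rust
/// Returns all ids in `sorted_ids` that start with `prefix`.
/// Assumes `sorted_ids` is sorted lexicographically and contains no duplicates.
fn lookup_prefix<'a>(sorted_ids: &'a [Vec<u8>], prefix: &[u8]) -> Vec<&'a [u8]> {
    // `Ok(pos)` is an exact match; `Err(pos)` is the insertion point, i.e. the
    // first id that sorts after `prefix`. In both cases no id before `pos`
    // can start with `prefix`.
    let pos = match sorted_ids.binary_search_by(|id| id.as_slice().cmp(prefix)) {
        Ok(pos) | Err(pos) => pos,
    };
    sorted_ids[pos..]
        .iter()
        .map(Vec::as_slice)
        .take_while(|id| id.starts_with(prefix))
        .collect()
}

fn main() {
    let ids = vec![vec![0x12, 0x34], vec![0x12, 0x56], vec![0x13, 0x00]];
    println!("{:?}", lookup_prefix(&ids, &[0x12]));
    println!("{:?}", lookup_prefix(&ids, &[0x14]));
}
```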
2024-02-14 23:34:47 +09:00
Yuya Nishihara
e2c8a8fabd index: fix change id resolution test to not depend on deterministic order
Since IdIndex sorts the entries by using .sort_unstable_by_key(), the order of
the same-key elements is undefined. Perhaps, it's stable for short arrays, and
the test passes because of that.
2024-02-14 23:22:23 +09:00
Yuya Nishihara
8b1dfa7157 index: compact parent encoding, inline up to two parents
This saves 4 more bytes per entry, and more importantly, most commit parents
can be resolved with no indirection to the overflow table.

IIRC, Git always inlines the first parent, but that wouldn't be useful in jj
since jj diffs merge commit against the auto-merge parent. The first merge
parent is nothing special.

I'll use a similar encoding in change id sstable, where only one position
will be inlined (to optimize for imported commits.)

Benchmark number measuring the cost of change id index building:
```
% hyperfine --sort command --warmup 3 --runs 20 -L bin jj-0,jj-1 \
  -s "target/release-with-debug/{bin} -R ~/mirrors/linux \
      --ignore-working-copy debug reindex" \
  "target/release-with-debug/{bin} -R ~/mirrors/linux \
    --ignore-working-copy log -r@ --config-toml='revsets.short-prefixes=\"\"'"
Benchmark 1: target/release-with-debug/jj-0 -R ~/mirrors/linux --ignore-working-copy log -r@ --config-toml='revsets.short-prefixes=""'
  Time (mean ± σ):     342.9 ms ±  14.5 ms    [User: 202.4 ms, System: 140.6 ms]
  Range (min … max):   326.6 ms … 360.6 ms    20 runs

Benchmark 2: target/release-with-debug/jj-1 -R ~/mirrors/linux --ignore-working-copy log -r@ --config-toml='revsets.short-prefixes=""'
  Time (mean ± σ):     325.0 ms ±  13.6 ms    [User: 196.2 ms, System: 128.8 ms]
  Range (min … max):   311.6 ms … 343.2 ms    20 runs

Relative speed comparison
        1.06 ±  0.06  target/release-with-debug/jj-0 -R ~/mirrors/linux --ignore-working-copy log -r@ --config-toml='revsets.short-prefixes=""'
        1.00          target/release-with-debug/jj-1 -R ~/mirrors/linux --ignore-working-copy log -r@ --config-toml='revsets.short-prefixes=""'
```
2024-02-14 22:33:48 +09:00
Yuya Nishihara
89928ffdd8 index: remove local-global pos round trip from entry_by_id() 2024-02-14 22:33:48 +09:00
Yuya Nishihara
249449ff1a index: store local position in lookup table 2024-02-14 22:33:48 +09:00
Yuya Nishihara
1d11cffcfa index: use local position in segment-local operations
I'm going to change the index format to store local positions in the lookup
table. That's not super important, but I think it makes sense because the
lookup table should never contain inter-segment links.

The mutable segment now stores local positions in its lookup map. The readonly
segment will be updated later.
2024-02-14 22:33:48 +09:00
Yuya Nishihara
3e43108abb index: remove unused flag field from readonly index segment
This is a remainder of fdb861b957 "backend: remove unused Commit::is_pruned."
As I'm going to change the index format, let's remove unused fields, too.
2024-02-14 22:33:48 +09:00
Yuya Nishihara
26528091e6 revset: drop now unused is_legacy flag from dag ranges 2024-02-14 10:04:56 +09:00
Yuya Nishihara
815670a4ad revset: add parsing rule and expression node dedicated for kind:"pattern"
This unblocks removal of 'is_legacy: bool' fields.

Note that not all legacy dag range expressions can be accepted by the new grammar.
For example, 'x:y()' is parsed as ('x:y', error) because 'x:y' is a valid string
pattern expression, and '(' isn't an infix operator. The old compat_dag_range_op
is NOT removed as it can still translate 'x():y' or 'x:(y)' to a better error,
and we might make the string pattern syntax stricter #2101.
2024-02-14 10:04:56 +09:00
Yuya Nishihara
815437598f revset: disable parsing rules of legacy dag range operator
The legacy parsing rules are turned into compatibility errors. The x:y rule
is temporarily enabled when parsing string patterns. It's weird, but we can't
isolate the parsing function because a string pattern may be defined in an
alias.
2024-02-14 10:04:56 +09:00
Yuya Nishihara
2905a70b18 doc, tests: drop use of deprecated revset dag range operator 2024-02-14 10:04:56 +09:00
Yuya Nishihara
1f6d1de62d index: on reindexing, print error details to stderr
It's not ideal to print the error there, but using stderr should be slightly
better. It could be a tracing message, but tracing won't be displayed by
default.
2024-02-12 19:38:36 +09:00
Yuya Nishihara
b0e8e2a1af index: move segment files to sub directory, add version number
I'm going to introduce breaking changes in index format. Some of them will
affect the file size, so version number or signature won't be needed. However,
I think it's safer to detect the format change as early as possible.

I have no idea if embedded version number is the best way. Because segment
files are looked up through the operation links, the version number could be
stored there and/or the "segments" directory could be versioned. If we want to
support multiple format versions and clients, it might be better to split the
tables into data chunks (e.g. graph entries, commit id table, change id table),
and add per-chunk version/type tag. I choose the per-file version just because
it's simple and would be non-controversial.

As I'm going to introduce format change pretty soon, this patch doesn't
implement data migration. The existing index files will be deleted and new
files will be created from scratch.

Planned index format changes include:
 1. remove unused "flags" field
 2. inline commit parents up to two
 3. add sorted change ids table
2024-02-12 19:38:36 +09:00
Yuya Nishihara
4b541e6c93 index: on reinit(), don't remove "operations" directory itself
This should be slightly safer as the store may be accessed concurrently from
another process.
2024-02-12 19:38:36 +09:00
Yuya Nishihara
81837897dc index: extract dir.join("operations") to private method 2024-02-12 19:38:36 +09:00
Martin von Zweigbergk
48a9f9ef56 repo: use Transaction for creating repo-init operation
Since the operation log has a root operation, we don't need to create
the repo-initialization operation in order to create a valid
`ReadonlyRepo` instance. I think it's conceptually simpler to create
the instance at the root operation id and then add the initial
operation using the usual `Transaction` API. That's what this patch
does.

Doing that also brought two issues to light:

 1. The empty view object doesn't have the root commit as head.
 2. The initialized `OpHeadsStore` doesn't have the root operation as
     head.

Both of those seem somewhat reasonable, but maybe we should change
them. For now, I just made the initial repo (before the initial
operation) have a single op head (to compensate for (2)). It might be
worth addressing both issues so the repo is in a better state before
we create the initial operation. Until we do, we probably shouldn't
drop the initial operation.
2024-02-11 21:19:30 -08:00
Martin von Zweigbergk
305a507ae3 repo: move creation of repo-init operation to end of init()
Since we now have a root operation, we don't need the
repo-initialization operation to create the repo. Let's move it later
to clarify that.
2024-02-11 21:19:30 -08:00
Ilya Grigoriev
a9c3af8153 test_local_working_copy: use std::fs::write instead of OpenOptions 2024-02-10 16:06:28 -08:00
Ilya Grigoriev
b2e37d448b clippy: add truncate option as suggested by clippy
In the next commit, I replace the whole thing with
std::fs::write, but I'll leave this here in case
the next commit is somehow incorrect.
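
For context, a rough sketch of the two approaches follows. The function names are made up for this example, and the actual test code may differ.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Overwrites a file via `OpenOptions`. Without `.truncate(true)`, writing
/// shorter contents would leave stale bytes of the old file behind.
fn overwrite_with_open_options(path: &Path, contents: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .create(true)
        .write(true)
        .truncate(true)
        .open(path)?;
    file.write_all(contents)
}

/// The shorter equivalent: `fs::write` creates the file if needed and
/// replaces its entire contents.
fn overwrite_with_fs_write(path: &Path, contents: &[u8]) -> io::Result<()> {
    fs::write(path, contents)
}
```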
2024-02-10 16:06:28 -08:00
Ilya Grigoriev
a88c06068e clippy: new nightly fixes
For some reason, clippy also suggested surrounding
`self.value` with parentheses. Not sure whether
that's a clippy bug.

Cc: https://github.com/rust-lang/rust-clippy/issues/12268
2024-02-10 16:06:28 -08:00
dependabot[bot]
6d1faf9b03 Update strsim (changes tests), clap, clap_complete
This is #3002 with tests rerun to account for changes
to `strsim`, as @thoughtpolice noticed in
https://github.com/martinvonz/jj/pull/3002#issuecomment-1936763101

The string similarity changes include an example that
seems better and one that seems worse. Decreasing
the threshold definitely makes things worse.
2024-02-10 00:01:47 -08:00
Yuya Nishihara
e908bd9a17 simple_op_store: use TryFrom<i32> instead of deprecated from_i32() 2024-02-10 09:15:30 +09:00
Yuya Nishihara
421ab592be cargo: bump gix to 0.58.0, migrate to ObjectId::try_from()
The panicking conversion function appears to be renamed, and try_from() is
added instead.
2024-02-10 09:15:30 +09:00
Austin Seipp
5b517b542e rust: bump MSRV to 1.76.0
Signed-off-by: Austin Seipp <aseipp@pobox.com>
2024-02-09 15:48:01 -06:00
Martin von Zweigbergk
6c1aeff7a9 working copy: materialize symlinks on Windows as regular files
I was a bit surprised to learn (or be reminded?) that checking out
symlinks on Windows leads to a panic. This patch fixes the crash by
materializing symlinks from the repo as regular files. It also updates
the snapshotting code so we preserve the symlink-ness of a path. The
user can update the symlink in the repo by updating the regular file
in the working copy. This seems to match Git's behavior on Windows
when symlinks are disabled.
2024-02-09 09:20:24 -08:00
Martin von Zweigbergk
b253a28788 merge: add as_normal(), taken from RefTarget
The `RefTarget::as_normal()` function is not specific to `RefTarget`,
and I plan to use it from `local_working_copy`.
2024-02-09 09:20:24 -08:00
Martin von Zweigbergk
5a898b16a8 working_copy: handle symlink outside write_path_to_store()
The `write_path_to_store()` has almost no overlapping code between the
handling of symlinks and regular files, which suggests that we should
move out the handling of symlinks to the caller (there's only one).
2024-02-09 09:20:24 -08:00
Jonathan Tan
33f3a420a1 workspace: recover from missing operation
If the operation corresponding to a workspace is missing for some reason
(the specific situation in the test in this commit is that an operation
was abandoned and garbage-collected from another workspace), currently,
jj fails with a 255 error code. Teach jj a way to recover from this
situation.

When jj detects such a situation, it prints a message and stops
operation, similar to when a workspace is stale. The message tells the
user what command to run.

When that command is run, jj loads the repo at the @ operation (instead
of the operation of the workspace), creates a new commit on the @
commit with an empty tree, and then proceeds as usual - in particular,
including the auto-snapshotting of the working tree, which creates
another commit that obsoletes the newly created commit.

There are several design points I considered.

1) Whether the recovery should be automatic, or (as in this commit)
manual in that the user should be prompted to run a command. The user
might prefer to recover in another way (e.g. by simply deleting the
workspace) and this situation is (hopefully) rare enough that I think
it's better to prompt the user.

2) Which command the user should be prompted to run (and thus, which
command should be taught to perform the recovery). I chose "workspace
update-stale" because the circumstances are very similar to it: its
symptom is that the regular jj operation is blocked somewhere at the
beginning, and "workspace update-stale" already does some special work
before the blockage (this commit adds more of such special work). But it
might be better for something more explicitly named, or even a sequence
of commands (e.g. "create a new operation that becomes @ that no
workspace points to", "low-level command that makes a workspace point to
the operation @") but I can see how this can be unnecessarily confusing
for the user.

3) How we recover. I can think of several ways:
a) Always create a commit, and allow the automatic snapshotting to
create another commit that obsoletes this commit.
b) Create a commit but somehow teach the automatic snapshotting to
replace the created commit in-place (so it has no predecessor, as viewed
in "obslog").
c) Do either a) or b), with the added improvement that if there is no
diff between the newly created commit and the former @, to behave as if
no new commit was created (@ remains as the former @).
I chose a) since it was the simplest and most easily reasoned about,
which I think is the best way to go when recovering from a rare
situation.
2024-02-09 00:38:47 -08:00
Ilya Grigoriev
12c3be70f4 lib refs.rs: rename TrackingRefPair to LocalAndRemoteRef
As discussed in
https://github.com/martinvonz/jj/pull/2962#discussion_r1479384841, the
previous name is confusing since the struct is used for pairs where the
remote branch is not tracked by the local branch.
2024-02-07 17:06:28 -08:00
jyn
d66fcf2ca0 compile integration tests as a single binary
this greatly speeds up the time to run all tests, at the cost of slightly larger recompile times for individual tests.

this unfortunately adds the requirement that all tests are listed in `runner.rs` for the crate.
to avoid forgetting, i've added a new test that ensures the directory is in sync with the file.

 ## benchmarks

before this change, recompiling all tests took 32-50 seconds and running a single test took 3.5 seconds:

```
; hyperfine 'touch lib/src/lib.rs && cargo t --test test_working_copy'
  Time (mean ± σ):      3.543 s ±  0.168 s    [User: 2.597 s, System: 1.262 s]
  Range (min … max):    3.400 s …  3.847 s    10 runs
```

after this change, recompiling all tests takes 4 seconds:
```
;  hyperfine 'touch lib/src/lib.rs ; cargo t --test runner --no-run'
  Time (mean ± σ):      4.055 s ±  0.123 s    [User: 3.591 s, System: 1.593 s]
  Range (min … max):    3.804 s …  4.159 s    10 runs
```
and running a single test takes about the same:
```
; hyperfine 'touch lib/src/lib.rs && cargo t --test runner -- test_working_copy'
  Time (mean ± σ):      4.129 s ±  0.120 s    [User: 3.636 s, System: 1.593 s]
  Range (min … max):    3.933 s …  4.346 s    10 runs
```

about 1.4 seconds of that is the time for the runner, of which .4 is the time for the linker. so
there may be room for further improving the times.
2024-02-06 18:19:41 -08:00
Ilya Grigoriev
1741ab22e4 view.rs: clarify some internal function docstrings
Mostly, I was a bit confused that some of these functions return a
`TrackingRefPair` but don't seem to take into account whether the remote
branch is being tracked or not.
2024-02-06 17:52:01 -08:00
Martin von Zweigbergk
b343289238 working_copy: make reset() take a commit instead of a tree
Our virtual file system at Google (CitC) would like to know the commit
so it can scan backwards and find the closest mainline tree based on
it. Since we always record an operation id (which resolves to a
working-copy commit) when we write the working-copy state, it doesn't
seem like a restriction to require a commit.
2024-02-06 12:41:09 -08:00
Yuya Nishihara
77ceadbfd0 cleanup: remove remaining ": {source}" from error message templates 2024-02-04 09:13:21 +09:00
Yuya Nishihara
1efadd96c8 git: remove ": {source}" from FailedRefExportReason, walk chain by caller
The error output gets more verbose because all gix error sources are printed.
Maybe we'll need a better formatting, but changing to multi-line output doesn't
look nice either.
2024-02-04 09:13:21 +09:00
Yuya Nishihara
a0cefb8b7b revset, template: remove ": {source}" from parse error message template
These error types are special because the message is embedded in ASCII art. I
think it would be a source of bugs if some error types had ": {source}" but
others don't. So I'm going to remove all ": {source}"s, and let the callers
concatenate them when needed.
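
A minimal sketch of what "let the callers concatenate" could look like, using only `std::error::Error::source()`. The `ParseError` type and `format_error_chain()` helper are hypothetical and hand-written here for illustration; they are not the jj types.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical outer error whose message does NOT embed its source.
#[derive(Debug)]
struct ParseError {
    source: std::num::ParseIntError,
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Failed to parse revset")
    }
}

impl Error for ParseError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

/// Caller-side formatting: joins an error and all of its sources with ": ".
fn format_error_chain(err: &dyn Error) -> String {
    let mut message = err.to_string();
    let mut current = err.source();
    while let Some(source) = current {
        message.push_str(": ");
        message.push_str(&source.to_string());
        current = source.source();
    }
    message
}
```

Keeping the source out of `Display` means each message appears exactly once, whether the caller prints a single line or walks the chain.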
2024-02-04 09:13:21 +09:00
Ilya Grigoriev
d439de073d rewrite.rs: revert commits cfcc7c5e and becbc889
This mostly reverts https://github.com/martinvonz/jj/pull/2901 as well as its
fixup https://github.com/martinvonz/jj/pull/2903. The related bug is reopened,
see https://github.com/martinvonz/jj/issues/2869#issuecomment-1920367932.

The problem is that while the fix did fix #2869 in most cases, it did
reintroduce the more severe bug https://github.com/martinvonz/jj/issues/2760
in one case, if the working copy is the commit being rebased.

For example, suppose you have the tree

```
root -> A -> B -> @ (empty) -> C
```

### Before this commit

#### Case 1

`jj rebase -s B -d root --skip-empty` would work perfectly before this
commit, resulting in

```
root -> A
  \-------B -> C
           \- @ (new, empty)
```

#### Case 2

Unfortunately, if you run `jj rebase -s @ -d A --skip-empty`, you'd have the
following result (before this commit), which shows the reintroduction of #2760:

```
root -> A @ -> C
         \-- B
```

with the working copy at `A`. The reason for this is explained in
https://github.com/martinvonz/jj/pull/2901#issuecomment-1920043560.

### After this commit

After this commit, both case 1 and case 2 will be wrong in the sense of #2869,
but it will no longer exhibit the worse bug #2760 in the second case.

Case 1 would result in:

```
root -> A
  \-------B -> @ (empty) -> C
```

Case 2 would result in:

```
root -> A -> @ -> C
         \-- B
```

with the working copy remaining a descendant of A
2024-02-03 15:56:44 -08:00
Essien Ita Essien
8423c63a04 cli: Refactor workspace root directory creation
* Add file_util::create_or_reuse_dir() which is needed by all init
  functionality regardless of the backend.
2024-02-03 14:15:05 +00:00
Yuya Nishihara
ec0f2753ae repo: mark inner error of EditCommitError as source 2024-02-01 16:59:44 +09:00
Martin von Zweigbergk
7c87fe243c backends: implement as_any() on OpStore and OpHeadsStore too
It's useful for custom commands to be able to downcast to custom
backend types.
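
The downcasting pattern can be sketched as follows. `Store`, `CustomStore`, `SimpleStore`, and `endpoint_of()` are simplified, hypothetical stand-ins for this example; only the `as_any()` idea comes from the commit.

```rust
use std::any::Any;

/// Simplified stand-in for a backend trait such as `OpStore`.
trait Store: Any {
    fn name(&self) -> &str;
    /// Exposes the concrete type so callers can downcast.
    fn as_any(&self) -> &dyn Any;
}

/// Hypothetical custom backend carrying extra state.
struct CustomStore {
    endpoint: String,
}

impl Store for CustomStore {
    fn name(&self) -> &str {
        "custom"
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Another backend, used to show that a mismatched downcast yields `None`.
struct SimpleStore;

impl Store for SimpleStore {
    fn name(&self) -> &str {
        "simple"
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// A custom command could recover backend-specific data like this.
fn endpoint_of(store: &dyn Store) -> Option<&str> {
    store
        .as_any()
        .downcast_ref::<CustomStore>()
        .map(|custom| custom.endpoint.as_str())
}
```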
2024-01-31 00:15:29 -08:00
Ilya Grigoriev
cfcc7c5e34 test_rewrite: Fixup test comment after becbc88 2024-01-30 23:43:05 -08:00
Martin von Zweigbergk
9efa66e8c9 rewrite: remove return value from rebase_next()
`rebase_next()` returns an `Option<RebasedDescendant>`, but the only
way we use it is to decide whether to terminate the loop over
`to_visit`. Let's simplify by making the caller iterate over
`to_visit` instead.
2024-01-30 23:27:48 -08:00
Martin von Zweigbergk
881d75e899 rewrite: drop TODO about changing the API
The `rebase_next()` method is private, so I think we've addressed the
TODO.
2024-01-30 23:27:48 -08:00
Ilya Grigoriev
becbc88915 rewrite.rs: fix working copy position after jj rebase --abandon-empty
Fixes #2869
2024-01-30 22:53:55 -08:00
Ilya Grigoriev
1fff6e37a1 rewrite.rs DescendantRebaser: rename variable for clarity
The `edit` argument seems to be true if and only if the
old commit was *not* abandoned. So, I flipped its value
and renamed it to `abandoned_old_commit`.
2024-01-30 22:53:55 -08:00
Yuya Nishihara
976b801208 index: on reinit(), delete all segment files to save disk space
Perhaps, reinit() will evolve to gc() function? It's basically a gc() with
empty operation set.
2024-01-31 09:40:52 +09:00
Yuya Nishihara
3d68601c01 index: remove redundant stat() of operation link file, handle error instead
This wouldn't matter in practice, but the operation link file could be deleted
after testing the existence.
2024-01-31 09:40:52 +09:00
Yuya Nishihara
3d0b3d57d8 git_backend: on gc(), remove unreachable no-gc refs and compact them
With my jj repo, the number of jj/keep refs went down from 87887 to 27733.
The .git directory size is halved, but we'll need to clean up extra and index
files to save disk space. "git gc --prune=now && jj debug reindex" passed, so
the repo wouldn't be corrupted.

#12
2024-01-27 10:18:11 +09:00
Yuya Nishihara
351487b9f5 backend: pass Index and keep_newer timestamp parameters to gc()
GitBackend::gc() will need to check if a commit is reachable from any
historical operations. This could be calculated from the view and commit
objects, but the Index will do a better job.
2024-01-27 10:18:11 +09:00
Yuya Nishihara
845eb4ce01 git_backend: when running "git gc", chdir instead of specifying it by GIT_DIR
Hopefully this will be more reliable on Windows where path/environment stuff
is messy.
2024-01-27 10:18:11 +09:00
Yuya Nishihara
4e54021930 backend: have gc() return BackendError instead of opaque error type
The gc() implementation is likely to call other backend functions, which
return BackendError.
2024-01-27 10:18:11 +09:00
Yuya Nishihara
84949dd551 backend: mark BackendError::Other as transparent
The inner error should be the source, and I don't think the "Error:" prefix
gives additional context.
2024-01-27 10:18:11 +09:00
Yuya Nishihara
8a67191d25 git: simplify import_head() as it doesn't have to process multiple head commits 2024-01-27 00:01:59 +09:00
Yuya Nishihara
fc114ef217 git: extract Git HEAD handling bits from import_some_refs()
I'm going to make WorkspaceCommandHelper::maybe_snapshot() snapshot the working
copy before importing refs. git::import_some_refs() can rebase the working copy
branch and therefore @ can be moved. git::import_head() doesn't, and it should
be invoked before snapshotting.

git::import_head() is inserted to some of the git::import_refs() callers where
HEAD seems to matter. I feel it's a bit odd that the HEAD ref is imported to
non-colocated repo, but "jj init --git-repo" relies on that, and I think the
existence of HEAD@git is harmless. It's merely a ref to the revision checked
out somewhere else.
2024-01-27 00:01:59 +09:00
Yuya Nishihara
5a88180720 git_backend: fix import_head_commits() to not issue duplicated ref edits
This was broken at afa72ff496 "git_backend: inline prevent_gc() to bulk-update
refs." Since no-gc refs are created within a transaction, duplicated edits are
no longer allowed.
2024-01-27 00:00:57 +09:00
Ilya Grigoriev
dff440c4a8 clippy: Fix nightly warnings about "useless use of vec!" 2024-01-25 22:00:26 -08:00
Daniel Ploch
20cbe77bf5 workspace: support creating shares of custom workspaces 2024-01-25 11:46:07 -08:00
Daniel Ploch
cb889f0b45 workspace: combine working copy functions into a trait 2024-01-25 11:46:07 -08:00
Yuya Nishihara
5a7d8ac596 working_copy: don't follow symlinks when visiting files in gitignored directory
Fixes #2878
2024-01-24 16:38:48 +09:00
Yuya Nishihara
d0d4496258 tests: add executable files and symlinks to gitignored directory test 2024-01-24 16:38:48 +09:00
Martin von Zweigbergk
502150b2f4 conflicts: test materialization with negative snapshots
We didn't have any tests with negative snapshots (after a `-------`
line). I initially thought we couldn't produce such conflict markers
anymore. I'm not sure we want to render conflicts like the one in the
test like this. I don't think I intended for `add_index` in the code
to be able to be two steps ahead of the remove. Maybe we should
rewrite the algorithm to not do that and thus never produce negative
snapshots.
2024-01-23 07:18:54 -08:00
Ilya Grigoriev
d168fd2b09 test_rebase_abandoning_empty: add children of an empty @ to the test case
This demonstrates the minor bug discussed in
https://github.com/martinvonz/jj/pull/2766#discussion_r1442365389
AKA https://github.com/martinvonz/jj/issues/2869.

It's also interesting whether changing the definition of "discardable" commit
would affect this test, see
https://github.com/martinvonz/jj/issues/2859#issuecomment-1903275884

(I think it won't, but still)
2024-01-22 18:36:49 -08:00
Jonathan Tan
0bc1341fd0 revset: add count_estimate() to Revset trait
The count() function in this trait is used by "jj branch" to determine
(and then report) how many commits a certain branch is ahead/behind
another branch. This is currently implemented by walking all commits
in the revset, counting how many were encountered. But this could be
improved: if the number is large, it is probably sufficient to report
"at least N" (instead of walking all the way), and this does not scale
well to jj backends that may not have all commits present locally (which
may prefer to return an estimate, rather than access the network).

Therefore, add a function that is explicitly documented to be O(1)
and that can return a range of values if the backend so chooses.

Also remove count(), as it is not immediately obvious that it is an
expensive call, and callers that are willing to pay the cost can obtain
the exact same functionality through iter().count() anyway. (In this
commit, all users of count() are migrated to iter().count() to preserve
all existing functionality; they will be migrated to count_estimate() in
a subsequent commit.)

"branch" needed to be updated due to this change. Although jj
is currently only available in English, I have attempted to keep
user-visible text from being assembled piece by piece, so that if we
later decide to translate jj into other languages, things will be easier
for translators.
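
To illustrate the idea, here is a sketch with a cut-down trait. It assumes the estimate is shaped like `Iterator::size_hint()`, i.e. a lower bound plus an optional upper bound; `EagerRevset` and `describe_count()` are hypothetical names for this example.

```rust
/// Simplified stand-in for the `Revset` trait (the real trait has more methods).
trait Revset {
    /// Walks every commit; potentially expensive.
    fn iter(&self) -> Box<dyn Iterator<Item = u32> + '_>;
    /// Cheap estimate: (lower bound, optional upper bound).
    fn count_estimate(&self) -> (usize, Option<usize>);
}

/// A revset whose members are already in memory, so the estimate is exact.
struct EagerRevset(Vec<u32>);

impl Revset for EagerRevset {
    fn iter(&self) -> Box<dyn Iterator<Item = u32> + '_> {
        Box::new(self.0.iter().copied())
    }
    fn count_estimate(&self) -> (usize, Option<usize>) {
        let len = self.0.len();
        (len, Some(len))
    }
}

/// Builds each user-visible message as one complete string rather than
/// assembling it from fragments.
fn describe_count(estimate: (usize, Option<usize>)) -> String {
    match estimate {
        (lower, Some(upper)) if lower == upper => format!("{lower} commits"),
        (lower, _) => format!("at least {lower} commits"),
    }
}
```

Callers that really need the exact number can still pay for `iter().count()`, while a backend without all commits available locally could return `(n, None)`.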
2024-01-22 15:07:00 -08:00
Yuya Nishihara
c7be4d019c index: add all_heads_for_gc() that iterates heads of all indexed commits
GitBackend::gc() will recreate no-gc refs for the indexed heads. We could
collect all historical heads by traversing operation log, but it isn't enough
because there may be predecessor links to hidden commits, and "git gc" isn't
aware of predecessors.
2024-01-17 23:07:14 +09:00
Yuya Nishihara
afa72ff496 git_backend: inline prevent_gc() to bulk-update refs 2024-01-17 10:43:25 +09:00
Yuya Nishihara
96ee9bdb9f git_backend: ensure no-gc refs are created for all imported head commits
This also means that we can implement GC without taking care of extra
metadata. I haven't tried, but it wouldn't be easy to keep Git refs and extra
table in sync.
2024-01-17 10:43:25 +09:00
Yuya Nishihara
2e1aa6c49c git_backend: remove fast path testing imported commits, filter them by caller
The idea is that GC, if implemented, will clean up objects based on the Index
knowledge. It's probably okay to leave some extra metadata of unreachable
objects, but GC-ed refs should be recreated if the corresponding heads get
reimported. See also the next patch.
2024-01-17 10:43:25 +09:00
Yuya Nishihara
48c4985e34 git_backend: ensure that no-gc ref target never conflicts 2024-01-17 10:43:25 +09:00
Yuya Nishihara
f66c859fe4 git_backend: use lower-level API to create no-gc refs
This will allow us to issue multiple prevent_gc() requests all at once. It's
not important here, but will be unavoidable when implementing GC. Deleting
tons of refs from packed refs is super slow if the requests were processed one
by one.
2024-01-17 10:43:25 +09:00
Yuya Nishihara
34956f17e5 op_walk: assert that virtual root op is not reparented
This is enforced by the caller, but it's scary if it weren't.
2024-01-16 21:46:54 +09:00
Yuya Nishihara
fb3e006a45 op_store: add special case for root id resolution 2024-01-16 21:46:54 +09:00
Yuya Nishihara
660806ffed tests: set up unparented operations for id prefix tests
Otherwise we can't easily pick i to create operation id starting with "0".
2024-01-16 21:46:54 +09:00
Yuya Nishihara
df1be14aa8 tests: split op id resolution tests, don't require merged op for prefix tests
This makes it easy to set up crafted environment for prefix resolution tests.
2024-01-16 21:46:54 +09:00
Essien Ita Essien
dc074363d1 no-op: Move external git repo canonicalization into Workspace::init_git_external
* Move canonicalization of the external git repo path into the Workspace::init_git_external().
  This keeps necessary code together.
* Add a new variant of WorkspaceInitError for reporting path not found errors. The user error
  string is written to pass existing tests.
2024-01-16 10:46:02 +00:00
Yuya Nishihara
da218d19db repo: optimize enforce_view_invariants() to not traverse ancestors until root
Because the default index cuts off the traversal at min(generations), including
the root id means all ancestors will be visited. This could be worked around at
the index side, but I think it's the repo/view's responsibility. That being
said, it's not uncommon to pad a revset with "root()", so it might make sense
for the index to special case the root id.

I also removed the redundant .clone().
2024-01-15 09:57:02 +09:00
Martin von Zweigbergk
6e302bb3a2 op_store: add a virtual root operation, similar to root commit
It seems obvious in hindsight to have a virtual root operation just
like we have a virtual root commit. It removes the same kind of
problems by making sure there's always a common ancestor (or multiple)
between any two commits.

I think the reason I didn't add a root operation from the beginning
was that there used to be a mandatory working-copy commit in the view
(this was before support for multiple workspaces).

Perhaps we should remove the "initialize repo" operation now. The only
difference between their view objects is that the "initialize repo"
operation adds the root commit as a head. We could add that to the
root operation, but then the root operation's value depends on the
commit backend.
2024-01-14 10:15:14 -08:00
Martin von Zweigbergk
c9af8bf43a view: drop tracking of public heads
We've had the public_heads for as long as we've had the View object,
IIRC (I didn't check), but we still don't use it for anything. I don't
have any concrete plans for using it either. Maybe our config for
immutable commits is good enough, or maybe we'll want something more
generic (like Mercurial's phases). For now, I think we should simplify
by removing the storage for public heads.
2024-01-13 22:23:57 -08:00
Martin von Zweigbergk
a66e2a0a6d working_copy: mark commit_id field in proto reserved
By marking it reserved, we prevent accidental use. We can still read
working copy protos that have the field.
2024-01-12 17:38:23 -08:00
Yuya Nishihara
543036c753 cli: run "op log" without loading repo or merging concurrent ops
When debugging behavior of badly-GCed repos, I find it's annoying that "op log"
fails because the index can't be loaded. Since "op log" doesn't need a repo, I
think it's better to display the exact op-heads state without merging.
2024-01-13 10:38:10 +09:00
Yuya Nishihara
831a530283 op_walk: make walk_ancestors() sort head ops to stabilize output
I thought this would be done by dag_walk::topo_order_reverse_lazy_ok(), but
apparently I made it preserve the input order in a way topo_order_reverse()
would do.
2024-01-13 10:38:10 +09:00
Yuya Nishihara
b7eb551cf7 index: fix reindexing to scan all referenced commits such as hidden remote refs
Since hidden commits can be looked up by remote_branches() revset for example,
reindexing should traverse ancestors from all named refs in addition to the
visible heads.
2024-01-12 12:53:16 +09:00
Yuya Nishihara
805046ceba op_walk: extract function that resolves op expression with preloaded head op
I'm going to make "op abandon" not load the repo, and this function will
be used there instead of resolve_op_with_repo().
2024-01-12 08:01:13 +09:00
Yuya Nishihara
83ede241e3 op_walk: don't resolve heads beyond @ operation
Since `jj undo --at-op=OP @` resolves @ to OP, I think OP should be the head
in that context, and the descendants of OP shouldn't be accessible by @+.
2024-01-12 08:01:13 +09:00
Yuya Nishihara
ba42b37a67 operation: remove operation::View wrapper in favor of view::View
view::View doesn't track ViewId, but there are no callers of cheap Eq/Hash
functions.
2024-01-12 08:01:02 +09:00
Yuya Nishihara
d5a98df046 git_backend: teach "format.tree-level-conflicts" config by constructor
Since GitBackend constructors now depend on &UserSettings, it makes sense to
initialize the formatting options there.
2024-01-10 08:57:51 +09:00
Yuya Nishihara
e5286aed08 index: move lifetimed change_id_index() to MutableIndex, rename 'static version
change_id_index() is only used by Readonly/MutableRepo, so we don't need an
abstraction at Index. evaluate_revset() is somewhat similar, but the callers
rely on &dyn Repo.
2024-01-09 10:38:00 +09:00
Yuya Nishihara
dc68f1eeb2 revset: remove unused lifetime parameter from Revset<'index> 2024-01-09 10:37:43 +09:00
Yuya Nishihara
e9d31177cb op_store: implement GC of unreachable operations and views
Since new operations and views may be added concurrently by another process,
there's a risk of data corruption. The keep_newer parameter is a mitigation
for this problem. It's set to preserve files modified within the last 2 weeks,
which is the default of "git gc". Still, a concurrent process may replace an
existing view which is about to be deleted by the gc process, and the view
file would be lost.

#12
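
The `keep_newer` mitigation boils down to a timestamp cutoff. The following sketch uses hypothetical helper names and a hard-coded two-week window; it does not reproduce the actual op store code.

```rust
use std::time::{Duration, SystemTime};

/// Two weeks, matching the default mentioned above.
const KEEP_WINDOW: Duration = Duration::from_secs(14 * 24 * 60 * 60);

/// Computes the cutoff: files modified at or after this time are kept.
fn keep_newer_cutoff(now: SystemTime) -> SystemTime {
    now - KEEP_WINDOW
}

/// A file may be deleted only if it is unreachable AND older than the cutoff.
/// Recently written files are kept because a concurrent process may be about
/// to reference them.
fn is_collectable(reachable: bool, mtime: SystemTime, keep_newer: SystemTime) -> bool {
    !reachable && mtime < keep_newer
}
```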
2024-01-09 10:37:03 +09:00
Yuya Nishihara
5894f3dfba operation: add shorthand for &store_operation().view_id 2024-01-09 10:37:03 +09:00
Martin von Zweigbergk
c98b0d76af index: move Revset::change_id_index() to Index
We currently have `Revset::change_id_index()` for creating a
`ChangeIdIndex` for a given revset. I think it will be hard to make it
performant for general revsets, especially in very large repos and
with custom index implementations, like the one we have at Google. If
we instead restrict it to including all ancestors of a set of heads, I
think it will be much easier to implement. We only use
`Revset::change_id_index()` with revsets including all visible commits
today, so we won't lose any current functionality by making it more
restricted.
2024-01-08 06:06:47 -08:00
Martin von Zweigbergk
2f4594540a tests: move ChangeIdIndex test from test_revset to test_index 2024-01-08 06:06:47 -08:00
Martin von Zweigbergk
1508f28567 tests: update ChangeIdIndex test to include ancestors in set
I plan to replace `Revset::change_id_index()` by
`Index::change_id_index(heads)`, but one of the tests currently uses a
set of commits that does not include ancestors. This patch updates it
to include ancestors (and changes the set of heads to keep the set
small enough for the test).
2024-01-08 06:06:47 -08:00
Martin von Zweigbergk
f9dc00704d index: specialize evaluate_revset_static() to change_id_index_static()
I'd like to move `change_id_index()` from `Revset` to `Index` (and
make it take the set of visible heads as argument). We currently use
`evaluate_revset_static()` only to get a `ChangeIdIndex`, so a good
place to start is to convert that into `change_id_index_static()`.
2024-01-08 06:06:47 -08:00
Martin von Zweigbergk
b549090acc index: adopt ChangeIdIndex and relatives from revset module
The `ChangeIdIndex` type is currently defined in the `revset`
module because that's the only place it's used. However, I'd like to
start using it directly from `index`. The idea is to make it possible
to create a `ChangeIdIndex` given a set of heads, without first
creating a `Revset`.
2024-01-08 06:06:47 -08:00
Martin von Zweigbergk
f0182ad4b8 default_index: adopt revset engine and graph iterator modules
The revset engine and the graph iterator are specific to the default
index implementation, so they belong in the same module.
2024-01-07 05:37:47 -08:00
Yuya Nishihara
a6616e9cea object_id: don't allow ObjectId::from_hex() a dynamically allocated string
This isn't technically needed, but it prevents API misuse. Another option
is to do some compile-time substitution, but most callers are tests and the
runtime performance wouldn't matter.
2024-01-06 00:26:36 +09:00
Yuya Nishihara
837ac15052 op_store: add resolve_operation_id_prefix() trait method that uses readdir()
The OpStore backends should have a better way to look up operation by id than
traversing from the op heads. The added method is similar to the commit Index
one, but returns an OpStoreResult because the backend operation can fail.

FWIW, if we want .shortest() in the op log template, we'll probably need a
trait method that returns an OpIndex instead.
2024-01-05 23:36:57 +09:00
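A rough sketch of the prefix resolution idea in 837ac15052, assuming the three-way result shape suggested by the `PrefixResolution` name. The enum variants and `resolve_prefix` are assumptions for illustration, and a plain slice of hex ids stands in for the readdir() listing.

```rust
/// Assumed shape of a prefix lookup result; not copied from the real type.
#[derive(Debug, PartialEq, Eq)]
enum PrefixResolution<T> {
    NoMatch,
    SingleMatch(T),
    AmbiguousMatch,
}

/// Hypothetical helper: resolves `prefix` against a list of hex ids, as one
/// might do with file names read from a directory.
fn resolve_prefix<'a>(ids: &[&'a str], prefix: &str) -> PrefixResolution<&'a str> {
    let mut matches = ids.iter().filter(|id| id.starts_with(prefix));
    // Only the first two matches are needed to tell the three cases apart.
    match (matches.next(), matches.next()) {
        (None, _) => PrefixResolution::NoMatch,
        (Some(id), None) => PrefixResolution::SingleMatch(*id),
        (Some(_), Some(_)) => PrefixResolution::AmbiguousMatch,
    }
}
```

In a real backend the directory read can fail, which is why the commit message says the method returns an OpStoreResult; the sketch leaves that error path out.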
Yuya Nishihara
95ea352b0a object_id: add fallible version of ObjectId::from_hex() 2024-01-05 23:36:57 +09:00
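What a fallible hex constructor might look like, as a minimal sketch. This free function is hypothetical and returns raw bytes; the actual `ObjectId` method and its signature are not shown in the log.

```rust
/// Hypothetical fallible decoder: returns None on odd length or on any
/// character that is not a hex digit, instead of panicking.
fn try_from_hex(hex: &str) -> Option<Vec<u8>> {
    if hex.len() % 2 != 0 {
        return None;
    }
    hex.as_bytes()
        .chunks(2)
        .map(|pair| {
            let hi = (pair[0] as char).to_digit(16)?;
            let lo = (pair[1] as char).to_digit(16)?;
            Some((hi * 16 + lo) as u8)
        })
        .collect()
}
```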
Yuya Nishihara
95d83cbfe5 object_id: make ObjectId constructors non-trait methods
I'm going to add try_from_hex(), which requires Self: Sized. Such trait bound
could be added, but I don't think we'll need abstracted ObjectId constructors
at all.
2024-01-05 23:36:57 +09:00
Yuya Nishihara
31b236a70d object_id: move HexPrefix and PrefixResolution from index module 2024-01-05 10:20:57 +09:00
Yuya Nishihara
fa5e40719c object_id: extract ObjectId trait and macros to separate module
I'm going to add a prefix resolution method to OpStore, but OpStore is
unrelated to the index. I think ObjectId, HexPrefix, and PrefixResolution can
be extracted to this module.
2024-01-05 10:20:57 +09:00
Yuya Nishihara
dbaee198e6 hex_util: move common_hex_len() from backend module
This function predates the hex_util module. If there were hex_util, I would
add it there.
2024-01-05 10:20:57 +09:00
Yuya Nishihara
e5255135bb op_walk: add function that reparents (and abandons) operation range
This will be used in "jj op abandon ..op_id" command. The "op_id..@" range will
be reparented onto the root operation.

The current implementation is good enough for local repos, but it won't scale.
We might want to extract it as a trait method or introduce OpIndex for
efficient DAG operation.
2024-01-04 11:44:36 +09:00
Yuya Nishihara
392e83be42 op_heads: ensure that update_op_heads([id], id) fails
The doc states it's invalid, but I made such a bug.
2024-01-04 11:44:36 +09:00
Matt Stark
3f0a49dafe Ensure you never drop the working commit with --skip-empty
See #2766 for discussions
2024-01-04 13:33:24 +11:00
Matt Stark
a4aed2391f Rewrite instead of abandoning empty commits.
Fixes #2760


Given the tree:
```
A-B-C
 \
  B2
```
And the command `jj rebase -s B -d B2`

We were previously marking B as abandoned, despite the comment stating that we were marking it as being succeeded by B2. This resulted in a call to `rewrite(rewrites={}, abandoned={B})` instead of `rewrite(rewrites={B=>B2}, abandoned={})`, which then made the new parent of `C` into `A` instead of `B2`
2024-01-04 13:33:24 +11:00
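The difference between the two calls in a4aed2391f can be illustrated with a toy model of the `A-B-C` / `B2` tree. Everything here is hypothetical (`new_parent`, the `parent_of` map, string commit ids); it only mirrors the behavior the message describes, not the real rewrite code.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-in for a commit id.
type CommitId = &'static str;

/// Toy model: picks the new parent for a child whose old parent was `old_parent`.
fn new_parent(
    old_parent: CommitId,
    parent_of: &HashMap<CommitId, CommitId>,
    rewrites: &HashMap<CommitId, CommitId>,
    abandoned: &HashSet<CommitId>,
) -> CommitId {
    if let Some(successor) = rewrites.get(old_parent) {
        // Rewritten: children follow the successor.
        *successor
    } else if abandoned.contains(old_parent) {
        // Abandoned: children fall back to the abandoned commit's own parent.
        parent_of[old_parent]
    } else {
        old_parent
    }
}
```

With `rewrites={B=>B2}` the child `C` lands on `B2`; with `abandoned={B}` it lands on `A`, which is the behavior the fix removes.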
Ilya Grigoriev
6edaa97517 DescendantRebaser: change rebased() method to into_map() that consumes the rebaser
This prevents a clone and does not affect the public API, as suggested
in https://github.com/martinvonz/jj/pull/2738#discussion_r1438903463.
2024-01-01 21:55:18 -08:00
Ilya Grigoriev
ddec3f91b2 lib: mild refactoring made possible by previous commit
Inline `create_descendant_commits`, move some functionality of
`DescendantRebaser::rebase_next` to `rebase_all`, a seemingly more logical
location.
2024-01-01 18:51:36 -08:00
Ilya Grigoriev
277b81ff6f lib: make DescendantRebaser-related APIs private.
Finally, there are no test uses of these APIs. `DescendantRebaser` is made
`pub(crate)`, since it is used by `MutRepo`. Other functions are made private.
2024-01-01 18:51:36 -08:00
Ilya Grigoriev
45cd0bf11b test_rewrite.rs: stop using DescendantRebaser when testing EmptyBehavior
This completes the process of removing DescendantRebaser-related APIs from
tests. It requires creating some new test utils and a new
`rebase_descendants_with_option_return_map`.
2024-01-01 18:51:36 -08:00
Ilya Grigoriev
7cef879ef6 lib repo.rs & rewrite.rs: Move clearing of rewritten/abandoned commits
This commit is a little out of place in this sequence, but
it seems to make more sense for MutRepo to own these maps.

@yuja [pointed out] that any tests written using `create_descendant_rebaser` now
need to do this cleanup, but there are no longer any such tests after the
previous commits and a follow-up commit removes `create_descendant_rebaser`
entirely.

[pointed out]: https://github.com/martinvonz/jj/pull/2737#discussion_r1435754370
2024-01-01 18:51:36 -08:00
Ilya Grigoriev
4461d61254 test_rewrite: test branches of descendants of divergent commits
A TODO left over from a previous PR
2024-01-01 18:51:36 -08:00
Ilya Grigoriev
b2abba07e9 tests: (mostly) stop using soon-to-be-private DescendantRebaser-related APIs
This removes uses of `DescendantRebaser::new` or
`MutRepo::create_descendant_rebaser` from most tests. The exceptions are the
tests having to do with abandoning empty commits on rebase, since adjusting
those is a bit more elaborate (see follow-up commits).
2024-01-01 18:51:36 -08:00
Yuya Nishihara
3eafca65ea op_walk: add support for op_id+ (children) operator
A possible use case is when doing some archaeology around a certain operation.

The current implementation is quadratic if + is repeated. Assuming op_id is
usually close to the current op heads, I think it'll practically work better
than building a reverse lookup table.
2024-01-02 10:30:08 +09:00
Yuya Nishihara
ab299a6af5 op_walk: reimplement prefix lookup by using walk_ancestors() and HexPrefix
Perhaps OpStore should provide a prefix resolution method, but let's think
about that later.
2024-01-02 10:30:08 +09:00
Yuya Nishihara
c53748d732 op_walk: allow walk_ancestors() from more than one head operations 2024-01-02 10:30:08 +09:00
Yuya Nishihara
51691ea22c tests: add lib tests for op id resolution, migrate some from cli
CLI tests are slow, and it's harder to set up a crafted environment for them.
2024-01-02 10:30:08 +09:00
Yuya Nishihara
dad890b960 operation: make parent_ids() return slice instead of Vec reference 2024-01-02 02:47:41 +09:00
Yuya Nishihara
c9b581589c op_walk: simplify arguments passed to high-level "opset" query functions 2024-01-01 10:22:23 +09:00
Yuya Nishihara
26b5f38f45 op_walk: move "opset" query functions from jj_cli 2024-01-01 10:22:23 +09:00
Yuya Nishihara
e4460d5386 op_walk: add error types for fake "opset" expression
This removes CommandError dependency from these resolution functions. We might
want to refactor the error types again if we introduce a real "opset" evaluator.

The error message for unresolved op heads now includes "@" instead of the whole
expression.
2024-01-01 10:22:23 +09:00
Yuya Nishihara
94fc32ab47 op_walk: extract walk_ancestors() to new module
I'm going to extract fake "opset" resolution functions there, and I think
walk_ancestors() belongs to the same category.
2024-01-01 10:22:23 +09:00
Yuya Nishihara
6dd936f72f op_heads: let caller decide resolve_op_heads() error type
The resolver callback usually returns wider error type, which I don't think
is a variant of OpHeadResolutionError.

To help type inference, resolver's error type is E, not E1 where E: From<E1>.
2024-01-01 10:22:23 +09:00
Martin von Zweigbergk
90744fb770 working copy: read files ahead when updating
If the commit backend has high latency, it can make a big difference
to read files concurrently. This patch updates the working copy code
to do that in the update code (when reading files from the backend to
write to the working copy). Because our backend at Google reads files
from a local daemon process that already does a lot of prefetching,
this patch doesn't actually help us. I think it's still the right
thing to do for backends that don't do the same kind of
prefetching. It speeds up `jj sparse set --add` by >10x when I disable
the prefetching in our daemon (our `Backend::concurrency()` is 100).
2023-12-29 13:37:13 -08:00
Yuya Nishihara
f9e9058b9b index: show bad operation id if commit lookup failed during reindexing
My jj repo contains such head commits, and "jj debug reindex" fails. To address
this problem, we'll probably need to implement GC, and the user will discard
operations before the first bad op id.
2023-12-29 13:05:58 +09:00
Yuya Nishihara
43e016a7d1 index: add explicit reindexing method that can propagate error 2023-12-29 13:05:58 +09:00
Yuya Nishihara
ab1c8656a4 index: rename private index_at_operation methods, reorder arguments
I'm going to add a public method that rebuilds index, and its return type will
be different. I also added "build_" because "index" could be misinterpreted
as noun.

The method arguments are reordered to follow the public IndexStore interface.
2023-12-29 13:05:58 +09:00
Yuya Nishihara
3abe6be384 index: propagate DefaultIndexStore::init/reinit() errors 2023-12-29 13:05:58 +09:00
Yuya Nishihara
955f6e356a repo: add error propagation path to IndexStore initialization and loading
The error types are shared with the commit store backend. We could add per-store
error types, but it's unlikely that the caller needs to discriminate them.
2023-12-29 13:05:58 +09:00
Yuya Nishihara
bb73cd491f cleanup: don't use debug format to embed ObjectId in error message
Also fixed typo, s/a/an/.
2023-12-29 13:05:58 +09:00
Martin von Zweigbergk
d06764eb7c op heads: remove now-unused methods for adding/removing op heads 2023-12-28 09:17:42 -08:00
Martin von Zweigbergk
65a6aa61db op heads: replace last use of remove_op_head() by update_op_heads() 2023-12-28 09:17:42 -08:00
Martin von Zweigbergk
76516bb46b op heads: inline handle_ancestor_ops()
This gets us closer to being able to use the new `update_op_heads()`
function here (without calling it multiple times).
2023-12-28 09:17:42 -08:00
Martin von Zweigbergk
4221c7cf5c op heads: remove handle_ancestor_ops() from trait
I think the idea behind `handle_ancestor_ops()` was to let our backend
at Google delegate the work to the server, which could then avoid
walking ancestors. However, we're now thinking that we're going to
make our server resolve divergent operations on its own instead, so
the client will never see more than one op head, unless it manually
creates the second op head itself (e.g. because the user ran two
concurrent commands). In those cases it should be fine to do the
walk. So let's simplify the trait by removing the function.
2023-12-28 09:17:42 -08:00
Martin von Zweigbergk
f969f4b0b0 op heads: remove lifetime from OpHeadsStoreLock 2023-12-28 09:17:42 -08:00
Martin von Zweigbergk
c304777a35 op heads: remove promote_new_op()
`OpHeadsStoreLock::promote_new_op()` doesn't add much over the new
`update_op_heads()`, so let's switch to the latter.
2023-12-28 09:17:42 -08:00
Martin von Zweigbergk
b8e45d196f op heads: add a new trait method combining add and remove of op heads
Consider how one would implement the current `OpHeadsStore` interface
for a cloud-based backend. After `OpHeadsStore::add_op_head()` is
called, the set of op heads temporarily contains two heads (typically)
until `OpHeadsStore::remove_op_head()` is called. That's not invalid,
but it's annoying to have to deal with that state more than
necessary. Also, it's unnecessarily inefficient to send the addition
and removal of op heads as separate RPCs. This patch therefore adds a
`update_op_heads()` method that takes a list of old heads to remove
and a single new head to add. Coming patches will start migrating to
that method.
2023-12-28 09:17:42 -08:00
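The combined add/remove operation from b8e45d196f might look roughly like this. It's a sketch with an in-memory set, assuming string ids; `InMemoryOpHeads` is hypothetical and returning an `Err` for the invalid `update_op_heads([id], id)` case is an illustrative choice, not necessarily what the real store does.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for an operation id.
type OpId = String;

/// Hypothetical in-memory store; the real `OpHeadsStore` trait is not reproduced.
#[derive(Default)]
struct InMemoryOpHeads {
    heads: BTreeSet<OpId>,
}

impl InMemoryOpHeads {
    /// Removes `old_ids` and adds `new_id` in one call, so a backend can
    /// apply both as a single update (e.g. one RPC).
    fn update_op_heads(&mut self, old_ids: &[OpId], new_id: &OpId) -> Result<(), String> {
        if old_ids.contains(new_id) {
            // Replacing a head with itself is treated as invalid here.
            return Err(format!("new head {} is also listed as an old head", new_id));
        }
        for old_id in old_ids {
            self.heads.remove(old_id);
        }
        self.heads.insert(new_id.clone());
        Ok(())
    }
}
```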
Martin von Zweigbergk
8137975785 op heads: drop support for old location/format
We moved `.jj/repo/op_heads/*` into `.jj/repo/op_heads/heads/*` almost
a year ago, in commits 90a66ec262 and 37ba17589d. We said we would
drop support for it in 0.9+. I think we said that before we started
doing monthly releases, but I think we're still past the goal of 6 months
(which is what I think we were aiming for).
2023-12-28 09:17:42 -08:00
Yuya Nishihara
dde42b9c05 index: rename resolve_prefix() to resolve_commit_id_prefix()
I'll probably add change id lookup methods to CompositeIndex. The Index trait
won't gain resolve_change_id_prefix(), but I also renamed its resolve_prefix()
for consistency.
2023-12-26 01:03:10 +09:00
Yuya Nishihara
0f2f566188 index: remove "segment_" prefix from IndexSegment methods
Since Readonly/MutableIndexSegment no longer implement Index trait, there's
no ambiguity between segment-local and index-global operations. Let's shorten
the method names.
2023-12-26 01:03:10 +09:00
Yuya Nishihara
c9b9e2864e index: introduce newtype that represents segment-local position
I'm thinking of changing some IndexSegment methods to return LocalPosition
instead of global IndexPosition, and using u32 there would be a source of bugs.
2023-12-26 01:03:10 +09:00
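The newtype idea in c9b9e2864e, sketched under assumptions: the struct layouts and the `to_global` helper are hypothetical, and the offset arithmetic is only an illustration of why mixing the two kinds of position as bare u32 would be error-prone.

```rust
/// Position within the whole index (simplified, hypothetical layout).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct IndexPosition(u32);

/// Position within a single segment (simplified, hypothetical layout).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct LocalPosition(u32);

/// Hypothetical conversion: offsets a segment-local position by the number
/// of commits stored in the parent segments.
fn to_global(local: LocalPosition, num_parent_commits: u32) -> IndexPosition {
    IndexPosition(local.0 + num_parent_commits)
}
```

Because the two types are distinct, passing a `LocalPosition` where an `IndexPosition` is expected becomes a compile error rather than a silent bug.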
Yuya Nishihara
ee8d5e279a index: make segment-level lookup return neighbor commit ids instead of positions
Both readonly and mutable segments know the commit ids to return, and the
caller only needs the ids. Since segment_commit_id(local_pos) scans the graph
entries, doing that would increase the chance of cache miss.
2023-12-26 01:03:10 +09:00
Yuya Nishihara
0e7834feb9 index: inline segment_entry_by_pos()
There's no reasonable way to abstract the IndexEntry construction.
2023-12-26 01:03:10 +09:00
Ilya Grigoriev
1fb9df252b split.rs: stop using DescendantRebaser::new
This requires creating a new public API as a substitute. I took the opportunity
to also add some comments to the
`MutRepo::record_rewritten_commit`/`record_abandoned_commit` functions.

I made the simplest possible addition to the API; it is not a very elegant
one. Eventually, the entire `record_rewritten_commit` API should probably be
refactored again.

I also added some comments explaining what these functions do.
2023-12-24 19:25:16 -08:00
Ilya Grigoriev
6bfd09009f move.rs: remove use of MutRepo::create_descendant_rebaser.
After this, the internal function is only used in tests.
2023-12-24 19:25:16 -08:00
Ilya Grigoriev
cde8ea8985 Make CommitBuilder constructors private to the library crate
The implementation of `CommitBuilder::write` is tightly bound to the MutRepo,
so only MutRepo should construct CommitBuilder-s.
2023-12-24 19:25:16 -08:00
Yuya Nishihara
b954bab0ca index: fix partial reindexing to not lose commits only reachable from one side
Spotted while adding error propagation there. This wouldn't likely be a real
problem because "jj debug reindex" removes all of the operation links.

The "} else {" condition is removed because it doesn't make sense to exclude
only the exact parent_op_id operation. This can be optimized to not walk
ancestors of the parent_op_id operation, but I don't see a motivation to add
tests covering such scenarios. It's pretty rare that an intermediate operation
link is missing.
2023-12-24 23:31:16 +09:00
Yuya Nishihara
320d15412b index: let caller of segment-level save_in() squash segments explicitly
There are many unit tests that call mutable_segment.save_in(), but I don't
think these callers expect that the segment file could be squashed depending
on the size. Let's make it caller's responsibility.

maybe_squash_with_ancestors() should be cheap if segment_num_commits() == 0,
so it's okay to call it before checking the emptiness.
2023-12-24 00:22:47 +09:00
Yuya Nishihara
1d80bbb70a index: leverage ancestor iterator to collect segments to be squashed
I think "for" loop is easier to follow. Maybe it could be rewritten further to
.find_map() loop, but that would be too clever.

I also made ancestor_index_segments() pub(super) since it doesn't make sense
to only provide ancestor_files_without_local().
2023-12-24 00:22:47 +09:00
Yuya Nishihara
55b4f69fb6 repo: propagate store error from add_heads() 2023-12-24 00:22:30 +09:00
Yuya Nishihara
0f6a7418f2 index: propagate store error from reindexing function
If the error is permanent (because the repo predates the no-gc-ref fix for
example), there's no easy way to recover. Still, panicking in this function
seems wrong.
2023-12-24 00:22:30 +09:00
Yuya Nishihara
7a44e590dc lock: remove byteorder dependency from tests, use fs helper functions
This is the last use of Read/WriteBytesExt. The byteorder crate is great, but
we don't need an abstraction of endianness. Let's simply use the std functions.
2023-12-23 00:14:17 +09:00
Yuya Nishihara
9de6273e10 index, stacked_table: inline read_u32::<LittleEndian>()
There aren't many callers of ReadBytesExt::read_u32().
2023-12-23 00:14:17 +09:00
Yuya Nishihara
21c22be96e stacked_table: use u32::from_le_bytes() to reinterpret bytes as integer
Apparently, I forgot to update this in fb06e89649.
2023-12-23 00:14:17 +09:00
Yuya Nishihara
6f5096e266 index, stacked_table: use u32::try_from() instead of numeric cast
These .unwrap()s wouldn't be compiled out, but I don't think they would
have measurable impact. Let's use the safer method.
2023-12-22 09:03:50 +09:00
Yuya Nishihara
9ec89bcf86 index, stacked_table: use u32::to_le_bytes() to reinterpret as bytes 2023-12-22 09:03:50 +09:00
Yuya Nishihara
392539fa29 index, stacked_table: simply extend Vec<u8> to not use .write_all()
I'm going to remove use of .write_u32() there. It's not super important, but
with fewer .unwrap()s, the code looks slightly better.
2023-12-22 09:03:50 +09:00
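The serialization pattern named in 6f5096e266, 9ec89bcf86 and 392539fa29 can be sketched together. `push_len_u32_le` is a hypothetical helper; only `u32::try_from`, `u32::to_le_bytes` and `Vec::extend_from_slice` are the std calls involved.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: appends `len` to `buf` as a little-endian u32.
/// Panics if `len` doesn't fit in u32, where `len as u32` would silently truncate.
fn push_len_u32_le(buf: &mut Vec<u8>, len: usize) {
    let value = u32::try_from(len).unwrap();
    // Extending the Vec directly can't fail, unlike going through io::Write.
    buf.extend_from_slice(&value.to_le_bytes());
}
```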
Yuya Nishihara
fb06e89649 index: use u32::from_le_bytes() to reinterpret bytes as integer
It's less abstract than going through io::Read, so is probably easier for
compiler to optimize out. I also feel it's a bit more readable.
2023-12-22 09:03:50 +09:00
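The reading side described in fb06e89649, as a minimal sketch. `read_u32_le` and its bounds handling are hypothetical; the point is just that `u32::from_le_bytes` works on a plain byte slice without going through io::Read.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: decodes a little-endian u32 at `offset` in `data`.
/// Returns None if fewer than 4 bytes are available there.
fn read_u32_le(data: &[u8], offset: usize) -> Option<u32> {
    let end = offset.checked_add(4)?;
    let bytes = <[u8; 4]>::try_from(data.get(offset..end)?).ok()?;
    Some(u32::from_le_bytes(bytes))
}
```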
Yuya Nishihara
38ce914321 index: reindex on content-related I/O errors
If read_exact() or read_u32() reached EOF, the index file should be
considered corrupted. File not found error is also treated as data corruption
because an invalid file name could be read from the child segment file. It
can't handle special file names like "..", though.
2023-12-21 08:05:30 +09:00
Yuya Nishihara
e98104d6f0 index: add file name to both io/corrupt errors, combine these variants
Index file name also applies to io::Error. New error type reuses io::Error to
represent data corruption. We could add an inner Corrupt|Io enum instead, but
we'll need to remap some io::Error variants (e.g. UnexpectedEof) to Corrupt
anyway.
2023-12-21 08:05:30 +09:00
Yuya Nishihara
88f3085bb1 index: extract function that opens file and loads index segments 2023-12-21 08:05:30 +09:00
Yuya Nishihara
eccb9b7a44 index: propagate index load errors from DefaultIndexStore 2023-12-19 07:41:57 +09:00
Yuya Nishihara
dd8e686127 index: don't reload parent files after saving new segment file
This should be cheaper, and more importantly, we no longer need to propagate
ReadonlyIndexLoadError to the caller.
2023-12-19 07:41:57 +09:00
Yuya Nishihara
fb07749291 index: split load function into header and local parts as well 2023-12-19 07:41:57 +09:00
Yuya Nishihara
616a8c7f54 index: split serialization function into header and local parts
The idea is that we don't have to reload parent files as we already have the
chain of the parent segments. The resulting readonly index will be constructed
from the loaded parent segments + local entries blob.
2023-12-19 07:41:57 +09:00
Yuya Nishihara
31b6e93c6e index: move IndexLoadError to "readonly" module, rename accordingly
I thought IndexLoadError and DefaultIndexStoreError would represent "load" and
"store" failures respectively, but they aren't. Actually, DefaultIndexStoreError
is the store-level error, and IndexLoadError should be wrapped in it.
2023-12-19 07:41:57 +09:00
Yuya Nishihara
b5de16007e index: add stub IndexReadError type
This is needed to remove .unwrap()s from DefaultIndexStore.
2023-12-19 07:41:57 +09:00
Yuya Nishihara
d49b079494 index: update file format comment about ReadonlyIndexSegment
Also made it a doc comment. I think 4-byte alignment is a nice property,
so added note about that.
2023-12-19 07:41:34 +09:00
Yuya Nishihara
8909647d86 index: pass base directory path by reference 2023-12-18 08:49:21 +09:00
Yuya Nishihara
b733d52557 index: split DefaultIndexStoreError::Io variant, extract save helper
Since OpStoreError can also include io::Error, it doesn't make much sense to
have Io variant at this level. Let's split it to context-specific errors, and
extract helper method that maps io::Error.
2023-12-18 08:49:21 +09:00
Yuya Nishihara
bf4a4e70b1 index: use DefaultMutableIndex wrapper when reconstructing missing index
This allows us to extract helper method that writes index file and associates
it with the operation.
2023-12-18 08:49:21 +09:00
Yuya Nishihara
50164bb36f index: have IndexWriteError carry opaque error type instead of string
I'm going to remove some .unwrap()s from DefaultIndexStore, and the inner
error type will be consolidated to DefaultIndexStoreError.
2023-12-18 08:49:21 +09:00
Yuya Nishihara
87a8238bee git: turn git.auto-local-branch off by default
As far as I can see in the chat, there's no objection to changing the default,
and git.auto-local-branch = false is generally preferred.

docs/branches.md isn't updated as it would otherwise conflict with #2625. I
think the "Remotes" section will need a non-trivial rewrite.

#1136, #1862
2023-12-17 08:30:24 +09:00
Yuya Nishihara
6971ec239a tests: set git_settings.auto_local_branch where it matters 2023-12-17 08:30:24 +09:00
Yuya Nishihara
ac99145a28 working_copy: drop open file instance from PersistError
For the same reason as the file_util change.
2023-12-17 08:20:07 +09:00
Yuya Nishihara
c6df0ba4c3 file_util: don't try to overwrite existing content-addressed file on Windows
The doc says persist() replaces the destination file as rename() would do
on Unix. persist_noclobber() doesn't, and is probably more reliable on Windows.
I don't know if persist() is completely atomic on Windows, but if it isn't, it
might be the source of the "permission denied" error under highly contended
situation.

https://docs.rs/tempfile/latest/tempfile/struct.NamedTempFile.html#method.persist
https://github.com/Stebalien/tempfile/blob/v3.8.0/src/file/imp/windows.rs#L77

We could use persist_noclobber() on all platforms, but it's more involved on
Unix.

https://github.com/Stebalien/tempfile/blob/v3.8.0/src/file/imp/unix.rs#L107
2023-12-17 08:20:07 +09:00
Yuya Nishihara
dd325c089c file_util: drop open file instance from PersistError
PersistError is basically a pair of io::Error and NamedTempFile instance. It's
unlikely that we would want to propagate the open file instance to the CLI
error handler, leaving the temporary file alive.
2023-12-17 08:20:07 +09:00
Yuya Nishihara
4d91e4c196 revset: simplify type constraints on combination iterators
Just a minor cleanup to remove lifetime parameter from the types. I tried to
reimplement them by using itertools, but I couldn't find a simple way to
encode short-circuiting at the end of either left or right iterator.
2023-12-16 07:50:04 +09:00
Yuya Nishihara
6d59156858 revset: parameterize candidates set of FilterRevset as well 2023-12-16 07:50:04 +09:00
Yuya Nishihara
a36368bb88 revset: make revset combinators generic over set types, merge UnionPredicate
UnionRevset and UnionPredicate are conceptually the same. Let's unify them.
2023-12-16 07:50:04 +09:00
Yuya Nishihara
af6047a655 lib: forbid unsafe_code at all 2023-12-15 16:10:28 +09:00
Yuya Nishihara
9990c41a90 repo: remove unsafe lifetime hack from change_id_index() 2023-12-15 16:10:28 +09:00
Yuya Nishihara
d9e8297059 index: add 'static version of evaluate_revset() to ReadonlyIndex
We'll probably need a better abstraction, but a separate method is good
enough to remove unsafe code from ReadonlyRepo.

I'm not sure if this is feasible for the other backends, but I guess there
would be fewer lifetimed variables than DefaultReadonlyIndex.
2023-12-15 16:10:28 +09:00
Yuya Nishihara
2ba50c76c7 revset: abstract evaluated RevsetImpl over owned/borrowed index types 2023-12-15 16:10:28 +09:00
Yuya Nishihara
72d9cd019b index: extract as_composite() to trait method
The revset engine will accept abstract AsCompositeIndex type, and the
evaluated revset can be 'static if the index is behind Arc<T>.
2023-12-15 16:10:28 +09:00
Yuya Nishihara
8fdf9db6e0 revset: remove 'index lifetime from InternalRevset 2023-12-15 14:58:12 +09:00
Yuya Nishihara
c426d34c11 revset: pass in index to PurePredicateFn as an argument to make it 'static 2023-12-15 14:58:12 +09:00
Yuya Nishihara
71070e85d7 revset: add helper that coerces closure to PurePredicateFn
Also renamed the boxed version to discriminate it from the cast helper.
2023-12-15 14:58:12 +09:00
Yuya Nishihara
a9a7de4a5e revset: store RevWalk factory function in RevWalkRevset
The returned iterator is boxed by caller due to the limitation of the type
system. There's a workaround, but it's super ugly.

https://users.rust-lang.org/t/hrtb-on-multiple-generics/34255/3
2023-12-15 14:58:12 +09:00
Yuya Nishihara
575d3dc7bf revset: store IndexPosition in EagerRevset to drop 'index lifetime
This adds overhead to re-look up IndexEntry, but I don't think that would
have significant impact on performance.
2023-12-15 14:58:12 +09:00
Yuya Nishihara
261bf848a9 revset: pass in index to InternalRevset as an argument
The idea is that InternalRevset will store a 'static boilerplate function that
borrows an 'index passed by function argument. This way, we can abstract the
index type over Arc<T> and &T without introducing too much generics.
2023-12-15 14:58:12 +09:00
Yuya Nishihara
e332d39375 revset: extract inner method that constructs IndexEntry iterator 2023-12-15 14:58:12 +09:00
Yuya Nishihara
b8f60c4dd6 cargo: bump gix to 0.56.0
I don't know why the dependabot didn't catch this, but there are things to
fix manually. EntryMode was changed to a u16 wrapper, and the enum was renamed
to EntryKind. Other than that, I don't find anything breaking our codebase.
2023-12-15 14:17:02 +09:00
Yuya Nishihara
95a0cceb97 index: use loaded readonly data without splitting into vecs
Since lookup data isn't typically small, .split_off() can take a few
milliseconds to memcpy().
2023-12-14 08:43:50 +09:00
Yuya Nishihara
5121e1f4e9 index: move IndexSegment trait to "composite" module
Perhaps, this is the most controversial part. It could be moved to new
"segment" module (or something like "common"), but I think IndexSegment can be
considered a trait that enables the CompositeIndex abstraction.
2023-12-14 08:43:40 +09:00
Yuya Nishihara
b89ae7c0b5 index: use IndexEntry::position() instead of direct field access 2023-12-14 08:43:40 +09:00
Yuya Nishihara
9fb0f00f2d index: add IndexEntry constructor instead of pub(super)-ing fields 2023-12-14 08:43:40 +09:00
Yuya Nishihara
771f447d99 index: split IndexEntry and related types to "entry" module
Added pub(super) or pub where needed. I won't implement accessor methods on
IndexPositionByGeneration and IndexPosition as they are purely value types,
and protecting the inner values wouldn't make sense.
2023-12-14 08:43:40 +09:00
Martin von Zweigbergk
60fae3114e transaction: take description at end instead of start
It seems better to have the caller pass the transaction description
when we finish the transaction than when we start it. That way we have
all the information we want to include more readily available.
2023-12-13 08:12:49 -08:00
Ilya Grigoriev
316ab8efb8 rewrite.rs: refactor new_parents to depend only on parent_mapping
Previously, the function relied on both the `self.parent_mapping` and
`self.rebased`. If `(A,B)` was in `parent_mapping` and `(B,C)` was in `rebased`,
`new_parents` would map `A` to `C`.

Now, `self.rebased` is ignored by `new_parents`. In the same situation,
DescendantRebaser is changed so that both `(A,B)` and `(B,C)` are in
`parent_mapping` before. `new_parents` now applies `parent_mapping` repeatedly,
and will map `A` to `C` in this situation.

## Cons

- The semantics are changed; `new_parents` now panics if `self.parent_mapping`
contains cycles. AFAICT, such cycles never happen in `jj` anyway, except for
one test that I had to fix. I think it's a sensible restriction to live with;
if you do want to swap children of two commits, you can call
`rebase_descendants` twice.

## Pros

- I find the new logic much easier to reason about. I plan to extract it into a
function, to be used in refactors for `jj rebase -r` and `jj new --after`. It
will make it much easier to have a correct implementation of `jj rebase -r
--after`, even when rebasing onto a descendant.

- The de-duplication is no longer O(n^2). I tried to keep the common case fast.

## Alternatives

- We could make `jj rebase` and `jj new` use a separate function with the
algorithm shown here, without changing DescendantRebaser. I believe that the new
algorithm makes DescendantRebaser easier to understand, though, and it feels more
elegant to reduce code duplication.

- The de-duplication optimization here is independent of other changes, and
could be used on its own.
2023-12-12 19:35:51 -08:00
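The "apply `parent_mapping` repeatedly" behavior described in 316ab8efb8 can be sketched as below. This is not the actual implementation: the string ids and the pass-counting cycle check are assumptions, and the de-duplication here is a plain HashSet pass rather than whatever optimization the commit uses.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-in for a commit id.
type CommitId = &'static str;

/// Follows `parent_mapping` until no id has a mapping left, de-duplicating
/// while keeping first-seen order. Panics if the mapping contains a cycle.
fn new_parents(
    parent_mapping: &HashMap<CommitId, Vec<CommitId>>,
    old_ids: &[CommitId],
) -> Vec<CommitId> {
    let mut current: Vec<CommitId> = old_ids.to_vec();
    // Without cycles, a chain has at most `parent_mapping.len()` steps.
    for _ in 0..=parent_mapping.len() {
        let mut seen = HashSet::new();
        current.retain(|id| seen.insert(*id));
        if !current.iter().any(|id| parent_mapping.contains_key(id)) {
            return current;
        }
        // Advance every id that still has a mapping by one step.
        current = current
            .iter()
            .flat_map(|id| match parent_mapping.get(id) {
                Some(new_ids) => new_ids.clone(),
                None => vec![*id],
            })
            .collect();
    }
    panic!("cycle detected in parent_mapping");
}
```

With `(A,B)` and `(B,C)` both in the mapping, `A` resolves to `C`, matching the situation described in the message.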
Yuya Nishihara
2abbb637e3 index: add wrapper functions to DefaultReadonlyIndex to remove pub(super) field 2023-12-13 08:09:48 +09:00
Yuya Nishihara
c0a12a7cbc index: add methods that provide commit/change_id_length
We could add Layout struct holding these parameters, but I don't think that's
needed just for two parameters.
2023-12-13 08:09:48 +09:00
Yuya Nishihara
3831ad423c index: use as_composite().num_commits() instead of direct field access 2023-12-13 08:09:48 +09:00
Yuya Nishihara
30984b1505 index: use name() instead of direct field access 2023-12-13 08:09:48 +09:00
Yuya Nishihara
e5c8252fb4 index: use segment_parent_file() instead of direct field access 2023-12-13 08:09:48 +09:00
Yuya Nishihara
402e36bab7 index: split readonly index types to "readonly" module
Added pub(super) where needed. There are a few pub(super) fields that look
suspicious, which will be fixed by the subsequent patches.
2023-12-13 08:09:48 +09:00
Yuya Nishihara
fbec16b49f index: add wrapper functions to DefaultMutableIndex to remove pub(super) field
into_segment() could be added instead of save_in(), but I decided to wrap
save_in(). save_in() may squash ancestor files, so it could be considered an
index-level operation.
2023-12-13 08:09:48 +09:00
Yuya Nishihara
5aeeb5f723 index: split mutable index types to "mutable" module
Added pub(super) where needed or makes sense.
2023-12-13 08:09:48 +09:00
Yuya Nishihara
ab2742f2c9 index: split RevWalk types to "rev_walk" module
Added pub(super) where needed.
2023-12-12 08:07:52 +09:00
Yuya Nishihara
caa1b99c24 index: add CompositeIndex constructor instead of pub(super)-ing field
This wouldn't matter, but seemed slightly better.
2023-12-12 08:07:52 +09:00
Yuya Nishihara
679518fdf2 index: split CompositeIndex and stats types to "composite" module
Added pub(super) where needed or where it makes sense.
2023-12-12 08:07:52 +09:00
Yuya Nishihara
2423558e68 index: split DefaultIndexStore and Load/StoreError types to "store" module
IndexLoadError isn't store-specific, but I think it's better to put I/O
stuff in the store module.
2023-12-12 08:07:52 +09:00
Yuya Nishihara
cdcd465c79 index: move default_index_store.rs to sub directory named default_index
default_index_store.rs is relatively big, and it contains types and impls in
arbitrary order. Let's split them into sub modules. After everything moved,
mod.rs will only contain tests.
2023-12-12 08:07:52 +09:00
Yuya Nishihara
f86b338681 revset: inline walk_ancestors() 2023-12-11 09:14:03 +09:00
Yuya Nishihara
cd0b24ef14 revset: inline walk_children()
There's only one caller, and we have common code at the call site.
2023-12-11 09:14:03 +09:00
Yuya Nishihara
d28bd8fa0f revset: inline collect_dag_range() 2023-12-11 09:14:03 +09:00
Yuya Nishihara
73fb922517 index: reimplement collect_dag_range() of revset engine as iterator
I'm going to remove 'index lifetime from InternalRevset so Revset<'static>
can be easily constructed from DefaultReadonlyIndex. As the first step, this
series removes some lifetime complexity from EvaluationContext methods.

We don't need a descendant iterator API, but it helps to add a separate function
to collect into HashSet<IndexPosition> instead of returning a pair of
an ordered vec and a set.
2023-12-11 09:14:03 +09:00
Yuya Nishihara
cbbe38ba7b index: rename MutableIndexImpl to MutableIndexSegment 2023-12-10 11:03:07 +09:00
Yuya Nishihara
c94e1de6d2 index: add DefaultMutableIndex wrapper, move Index impls to it
The wrapper type isn't needed for the mutable layer, but this mirrors the
readonly type structure. Test cases are also migrated to use the index
wrapper as long as we don't have to care about the nesting of the segment files.
2023-12-10 11:03:07 +09:00
Yuya Nishihara
ce312ae288 index: duplicate add_commit() to MutableIndexImpl 2023-12-10 11:03:07 +09:00
Yuya Nishihara
e0206a82f2 index: extract merge_in() function that works on segment types
Prepares for splitting MutableIndexImpl into segment and index wrapper types.
2023-12-10 11:03:07 +09:00
Yuya Nishihara
a110ec6d95 cli: print failed git export reason for each ref
Not all reasons are actionable, but we print a hint in common cryptic cases.
2023-12-09 23:37:00 +09:00
Yuya Nishihara
990edcefc9 index: impl Index for DefaultReadonlyIndex instead of ReadonlyIndexSegment
The idea is that the ReadonlyIndexSegment is a sub component of the index. The
Index trait could be implemented for any Segment type, but we don't need a
public interface to access sub segment as an index.
2023-12-09 15:18:36 +09:00
Yuya Nishihara
1cbd2ddb4b index: rename ReadonlyIndexImpl to ReadonlyIndexSegment
I'm going to split the internal Segment types and the public Index types
in order to clarify the layering concept. The public Index types will be
wrappers like DefaultReadonlyIndex.

Strictly speaking, ReadonlyIndexImpl is a segment + parent pointer pair,
but I think calling it a segment is pretty okay. It could be called a
ReadonlyIndexFile, but "File" can't apply to the mutable part.
2023-12-09 15:18:36 +09:00
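The layering described in 1cbd2ddb4b (internal segments that each point at a parent, wrapped by a public index type) might look roughly like the sketch below. All type, field, and method names here are simplified stand-ins chosen for illustration and should not be read as the actual jj definitions.

```rust
use std::sync::Arc;

/// Illustrative stand-in for an internal segment: one layer of index data
/// plus a pointer to the layer beneath it.
#[derive(Debug)]
pub struct Segment {
    parent: Option<Arc<Segment>>,
    local_commits: u32,
}

impl Segment {
    pub fn new(parent: Option<Arc<Segment>>, local_commits: u32) -> Self {
        Segment { parent, local_commits }
    }

    /// Total number of commits in this segment and all of its ancestors.
    pub fn num_commits(&self) -> u32 {
        let mut total = self.local_commits;
        let mut current = self.parent.as_deref();
        while let Some(segment) = current {
            total += segment.local_commits;
            current = segment.parent.as_deref();
        }
        total
    }
}

/// Illustrative stand-in for the public wrapper: callers go through
/// methods instead of reaching into the segment field directly.
#[derive(Debug, Clone)]
pub struct ReadonlyIndex(Arc<Segment>);

impl ReadonlyIndex {
    pub fn from_segment(segment: Arc<Segment>) -> Self {
        ReadonlyIndex(segment)
    }

    pub fn num_commits(&self) -> u32 {
        self.0.num_commits()
    }
}
```

Keeping the segment field private is what lets later refactors change the segment layout without touching callers.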
Yuya Nishihara
172043e968 index: make ReadonlyIndexImpl private
There are no external callers.
2023-12-09 15:18:36 +09:00
Yuya Nishihara
6c57ba7f21 index: rename ReadonlyIndexWrapper to DefaultReadonlyIndex
This matches the store naming: impl IndexStore for DefaultIndexStore. I also
added minimal doc comment and Debug.
2023-12-09 15:18:36 +09:00
Yuya Nishihara
cee69d1665 tests: remove index downcast helpers called only by as_<type>_composite()
I'm going to rename the impl types, and I don't want to think about the
names of these downcast functions.
2023-12-09 15:18:36 +09:00
Yuya Nishihara
5f6e28c8cf git: migrate export_refs() to gix::Repository
FailedToDelete/Set reasons are boxed because gix error types aren't small.
They could be cast to std::error::Error if needed.
2023-12-09 15:18:19 +09:00
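The boxing mentioned in 5f6e28c8cf is a common Rust technique for keeping an enum small when one variant carries a large error. The sketch below uses a made-up `LargeBackendError` in place of any real gix error type, and `ExportFailure` is a hypothetical enum rather than jj's `FailedRefExportReason`.

```rust
use std::error::Error;
use std::fmt;

/// Made-up stand-in for a large third-party error type.
#[derive(Debug)]
pub struct LargeBackendError {
    context: [u8; 256],
}

impl fmt::Display for LargeBackendError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "backend error ({} bytes of context)", self.context.len())
    }
}

impl Error for LargeBackendError {}

/// Hypothetical failure reason. Boxing the large payload keeps every value
/// of this enum small, including the variants that carry no data.
#[derive(Debug)]
pub enum ExportFailure {
    ConflictedOldState,
    FailedToSet(Box<LargeBackendError>),
}

impl ExportFailure {
    /// Converts the boxed concrete error into a boxed trait object.
    pub fn into_source(self) -> Option<Box<dyn Error + Send + Sync>> {
        match self {
            ExportFailure::ConflictedOldState => None,
            ExportFailure::FailedToSet(err) => Some(err),
        }
    }
}
```

The trade-off is one heap allocation on the failure path in exchange for a smaller value on every path.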
Yuya Nishihara
2d76907048 git: unimplement PartialEq on FailedRefExportReason
Gitoxide errors don't implement PartialEq. We could instead stringify the
errors, but there aren't many callers who expect FailedRefExportReason to
be comparable.
2023-12-09 15:18:19 +09:00
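When an error enum cannot derive `PartialEq` because it wraps an error type that does not implement it, callers can still check which variant they got by pattern matching. The sketch below is illustrative only: `RefExportFailure` is a hypothetical enum, and `std::io::Error` stands in for a gitoxide error because it also lacks `PartialEq`.

```rust
use std::io;

/// Hypothetical failure reason wrapping an error type without `PartialEq`,
/// so `#[derive(PartialEq)]` is not available here.
#[derive(Debug)]
pub enum RefExportFailure {
    InvalidName,
    FailedToSet(io::Error),
}

/// Checks the variant by pattern instead of comparing values with `==`.
pub fn is_failed_to_set(reason: &RefExportFailure) -> bool {
    matches!(reason, RefExportFailure::FailedToSet(_))
}

/// The alternative mentioned in the commit message: compare stringified
/// errors when a test really needs to look at the payload.
pub fn reason_message(reason: &RefExportFailure) -> String {
    match reason {
        RefExportFailure::InvalidName => "invalid name".to_string(),
        RefExportFailure::FailedToSet(err) => err.to_string(),
    }
}
```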
Yuya Nishihara
9f8831e825 git: unimplement PartialEq on GitExportError
Gitoxide errors don't implement PartialEq, and I don't think it makes sense
to test equality of InternalGitError objects.
2023-12-09 15:18:19 +09:00
Yuya Nishihara
a77eed648b git: have export_refs() obtain git2::Repository instance from store 2023-12-09 15:18:19 +09:00
Yuya Nishihara
0f37027646 index: remove unneeded Any trait bound from MutableIndex
We use .as_any() to downcast to the backend impl instead.
2023-12-08 23:30:35 +09:00
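The `.as_any()` approach referred to in 0f37027646 is a standard way to downcast a trait object without making `Any` a supertrait. A minimal sketch follows, with hypothetical names (`IndexLike`, `InMemoryIndex`, `OtherIndex`) that are not taken from jj.

```rust
use std::any::Any;

/// Hypothetical index trait. It has no `Any` supertrait; implementors
/// expose themselves as `&dyn Any` through a method instead.
pub trait IndexLike {
    fn as_any(&self) -> &dyn Any;
    fn num_commits(&self) -> usize;
}

pub struct InMemoryIndex {
    pub commits: Vec<String>,
}

impl IndexLike for InMemoryIndex {
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn num_commits(&self) -> usize {
        self.commits.len()
    }
}

pub struct OtherIndex;

impl IndexLike for OtherIndex {
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn num_commits(&self) -> usize {
        0
    }
}

/// Returns the concrete backend type if `index` is an `InMemoryIndex`.
pub fn as_in_memory(index: &dyn IndexLike) -> Option<&InMemoryIndex> {
    index.as_any().downcast_ref::<InMemoryIndex>()
}
```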
Yuya Nishihara
c197add39b git_backend: do not try to resolve git_target path as working directory path
The git_target path is normalized and managed by jj, so we don't need a
fallback mechanism. Let's make it stricter.
2023-12-07 08:43:49 +09:00
Yuya Nishihara
77c811163f tests: make sure to specify external git repository path including ".git" 2023-12-07 08:43:49 +09:00
Yuya Nishihara
25fcc3e403 workspace: consider .git symlink when generating relative git_target path
Before, an absolute path would be saved in the git_target file if .git is a
symlink. That's not wrong, but seemed a bit weird. Let's consolidate the
behavior across .git file types.
2023-12-05 14:23:59 -08:00
Yuya Nishihara
787fa1340b workspace: remove redundant cloning from init_external_git()
Apparently, I forgot to update it in 1db033504c "repo, workspace: remove
'static lifetime bound from initializer functions."
2023-12-05 14:23:59 -08:00
Yuya Nishihara
899c6375a0 git_backend: don't fully canonicalize .git symlink
Apparently, libgit2 doesn't deduce "core.bare" config from the directory name,
but gitoxide implements it correctly. So we shouldn't blindly canonicalize
the Git repository path. Fortunately, the saved git_target path isn't a fully-
canonicalized form (unless user explicitly specified "--git-repo ./.git"), so
we don't need a hack to remap git_target back to the symlink path.

is_colocated_git_workspace() is adjusted since the git_workdir is no longer
resolved from the fully-canonicalized repo path, at least in our code. Still we
have the ".git/.." fallback because test_init_git_colocated_symlink_gitlink()
would otherwise fail. I haven't figured out why, and the test might be actually
wrong compared to the git CLI behavior, but let's not change that for now.

Fixes #2668
2023-12-05 14:23:59 -08:00
Martin von Zweigbergk
1cc271441f gc: implement basic GC for Git backend
This adds an initial `jj util gc` command, which simply calls `git gc`
when using the Git backend. That should already be useful in
non-colocated repos because it's not obvious how to GC (repack) such
repos. In my own jj repo, it shrunk `.jj/repo/store/` from 2.4 GiB to
780 MiB, and `jj log --ignore-working-copy` was sped up from 157 ms to
86 ms.

I haven't added any tests because the functionality depends on having a
`git` binary on the PATH, which we don't yet depend on anywhere
else. I think we'll still be able to test much of the future parts of
garbage collection without a `git` binary because the interesting
parts are about manipulating the Git repo before calling `git gc` on
it.
2023-12-03 07:40:12 -08:00
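Shelling out to `git gc` for a repository whose Git directory lives in a non-standard place can be sketched with the standard library alone. This is an assumption-laden illustration, not the code from 1cc271441f: the helper names are hypothetical, and the exact arguments jj passes to `git` are not stated in the commit message.

```rust
use std::io;
use std::path::Path;
use std::process::Command;

/// Builds (but does not run) a `git gc` invocation for the given Git
/// directory. `--git-dir` is a standard `git` option; whether jj passes
/// it this way is an assumption.
pub fn git_gc_command(git_dir: &Path) -> Command {
    let mut command = Command::new("git");
    command.arg("--git-dir").arg(git_dir).arg("gc");
    command
}

/// Runs the command and reports whether `git` exited successfully.
/// Fails with an I/O error if no `git` binary is found on the PATH.
pub fn run_git_gc(git_dir: &Path) -> io::Result<bool> {
    Ok(git_gc_command(git_dir).status()?.success())
}
```

Separating command construction from execution keeps the argument list checkable without a `git` binary installed, which is the constraint the commit message raises about tests.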
Yuya Nishihara
35f718f212 merged_tree: remove canceling terms prior to resolving file-level conflict
I think this is a variant of the problem fixed by 7fda80fc22 "tree: simplify
conflict before resolving at hunk level." We need to simplify() the conflict
before and after extracting file ids because the source conflict values may
contain trees to be cancelled out, and the file values may differ only in exec
bits. Since the legacy tree passes a simplified conflict in to this function,
I made the merged tree do the same.

Fixes #2654
2023-12-03 07:44:58 +09:00
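The idea of cancelling terms before resolving a conflict, as in 35f718f212, can be shown with a deliberately simplified model. The `Conflict` type below is hypothetical and stores removed and added terms in two vectors. It does not reproduce jj's own conflict representation or the ordering guarantees of its `simplify()`.

```rust
/// Simplified model of a conflict: a set of removed terms and a set of
/// added terms, with one more add than removes.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Conflict<T> {
    pub removes: Vec<T>,
    pub adds: Vec<T>,
}

impl<T: PartialEq> Conflict<T> {
    /// Drops every remove that has an equal add, together with that add.
    pub fn simplify(mut self) -> Self {
        let mut i = 0;
        while i < self.removes.len() {
            if let Some(j) = self.adds.iter().position(|add| *add == self.removes[i]) {
                // The pair cancels out, so neither term affects the result.
                self.removes.remove(i);
                self.adds.remove(j);
            } else {
                i += 1;
            }
        }
        self
    }

    /// Returns the single remaining value if nothing is left in conflict.
    pub fn as_resolved(&self) -> Option<&T> {
        if self.removes.is_empty() && self.adds.len() == 1 {
            self.adds.first()
        } else {
            None
        }
    }
}
```

In this model, a conflict whose removes are `base` and `x` and whose adds are `base`, `x`, and `new` simplifies to the single value `new`.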
Yuya Nishihara
4ffbf40c82 merged_tree: do not propagate conflicting empty tree value to parent
Otherwise an empty subtree would be added to the parent tree.

If the stored tree contained an empty subtree, simplify() wouldn't work
against the new "absent" subtree representation. I don't know if there's
such a code path, but I believe it's very rare to encounter the problem.

#2654
2023-12-03 07:44:58 +09:00
Yuya Nishihara
1db033504c repo, workspace: remove 'static lifetime bound from initializer functions 2023-12-03 07:44:41 +09:00
Yuya Nishihara
d747879aee signing: pass SigningFn by reference
write_commit() doesn't need ownership of the signing function.
2023-12-01 22:55:04 +09:00
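Passing a callback by reference, as d747879aee describes, lets the caller reuse the same signing function across several writes. The sketch below is a hedged illustration: the `SigningFn` alias and `write_commit` signature are simplified guesses (for example, no error type), not the definitions from jj.

```rust
/// Simplified signing callback type: takes the data to sign and returns
/// the signature bytes.
pub type SigningFn<'a> = dyn FnMut(&[u8]) -> Vec<u8> + 'a;

/// Hypothetical writer: borrows the signing function instead of taking
/// ownership, and appends the signature to the contents when one is given.
pub fn write_commit(contents: &[u8], sign_with: Option<&mut SigningFn<'_>>) -> Vec<u8> {
    let mut written = contents.to_vec();
    if let Some(sign) = sign_with {
        written.extend(sign(contents));
    }
    written
}
```

Because only a mutable borrow is taken, a caller can pass `Some(&mut sign)` for one commit and then again for the next without rebuilding the closure.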
Anton Bulakh
eb1c0ab4a2 sign: Implement a test signing backend and add a few basic tests 2023-11-30 23:36:56 +02:00
Anton Bulakh
d7229a3f90 sign: Define signing backend API and integrate it
Finished everything except actual signing backend implementation(s) and
the UI.
2023-11-30 23:36:56 +02:00
Yuya Nishihara
076b49b610 merged_tree: use merged_tree_entry_diff() in stream version 2023-12-01 00:05:06 +09:00