Function parameters are processed as local symbols while substituting
alias expression. This isn't as efficient as Mercurial which caches
a tree of fully-expanded function template, but that wouldn't matter in
practice.
Let's acknowledge everyone's contributions by replacing "Google LLC"
in the copyright header by "The Jujutsu Authors". If I understand
correctly, it won't have any legal effect, but maybe it still helps
reduce concerns from contributors (though I haven't heard any
concerns).
Google employees can read about Google's policy at
go/releasing/contributions#copyright.
Follows up c5ed3e1477. Now change/commit ids are resolved at the same
precedence, which means there are at least three types of ambiguity.
I don't think we would need to discriminate these.
Because the use of the change id is recommended, any operation should abort
if a valid change id happens to match a commit id. We still try the commit
id lookup first as the change id lookup is more costly.
Ambiguous change/commit id is reported as AmbiguousCommitIdPrefix for now.
Maybe we can merge AmbiguousCommit/ChangeIdPrefix errors into one?
Closes#799
84b924946f switched to requiring both
SSH_AUTH_SOCK and SSH_AGENT_PID for an agent to be used. This doesn't
seem to be a typical situation, so perhaps it was not intended.
Since syntactic information like symbol or function name is lost after
parse(), alias substitution is inserted to the middle of the post-parsing
stage, not after the whole RevsetExpression tree is built. This is the main
difference from Mercurial. Mercurial also caches parsed aliases, but I don't
think that would have a measurable impact.
The CLI will load aliases from config, insert them one by one, and warn if
declaration part is invalid. That's why RevsetAliasesMap is a public struct
and needs to be instantiated by the caller.
I'll add aliases map, substitution stack (to detect recursion), and locals
(for function aliases) there. Fortunately, we can avoid shared mutables
so a copyable struct should be good.
parse_function_argument_to_string() doesn't need a workspace_ctx, but there
should be no reason to explicitly nullify it either.
To reduce conflicts between branches like `main` and `main/sub`, it's
better to first delete refs in git that have been deleted in jj, and
then add/update refs that have been added/updated in jj.
Since we now write a (partial) view object of the exported branches to
disk (since 7904474320), we can safely skip exporting some
branches. We already skip conflicted branches. This commit makes us
also skip branches that we fail to write to the backing Git repo,
instead of failing the whole operation (after possibly updating some
Git refs).
I made the `export_refs()` function return the branches that
failed. We should probably make that a struct later and have a
separate field for branches that we skipped due to conflicts.
Closes#493.
When skipping branches we fail to update in the backing Git repo, we
must also skip updating the `exported_view` object, so we don't trick
ourselves into thinking the branch was already updated in the Git repo
on the next export.
I'm going to make the export skip branches that we fail to update in
the Git repo. For that, we need to know the branch name while
interacting with the `git2::Repository` object. This little
refactoring prepares for that.
The comment says that we collect the changes to make before making
them, in order to reduce the risk of making some changes before
failing. However, there is nothing in the code that collects changes
that can fail, and it's all doing comparisons in memory, so it should
be very fast. It's been like that since I added it in 47b3abd0f7. We
still need to preserve the structure to avoid mutating `mut_repo`
while iterating over branches, however, so I just updated the comment.
This adds a test for attempting to export both a branch called `main`
and one called `main/sub` (#493), as well as for exporting a branch
with an empty string as name (reported directly to me by @lkorinth).
I'm thinking of adding alias expansion at this stage, and it would be a bit
tedious to pass around mutable context by function parameter. So let's reduce
the number of the intermediate functions.
This also produces a better error message.
It would be nice to be able to use snapshot testing and not have to
parse the output of `jj op log`. This patch lets us do that by
providing a new environment variable and config for overriding the
timestamps. Unlike `operation.hostname` and `operation.username`,
these are only meant for tests.
This makes the tests more hermetic, even though I don't think the
default values (taken from `whoami`) can break any tests (then we
would have already seen them break). Now we just need to make the
operation log's timestamps predictable and then we can start using
operation IDs in snapshot tests.
The expression 'x ~ empty()' is identical to 'x & file(".")', but more
intuitive.
Note that 'x ~ empty()' is slower than 'x & file(".")' since the negative
intersection isn't optimized right now. I think that can be handled as
follows: 'x ~ filter(f)' -> 'x & filter(!f)' -> 'filter(!f, x)'
There are no "non-normal" files, so "normal" is not needed. We have
symlinks and conflicts, but they are not files, so I think just "file"
is unambiguous.
I left `testutils::write_normal_file()` because there it's used to
mean "not executable file" (there's also a `write_executable_file()`).
I left `working_copy::FileType::Normal` since renaming `Normal` there
to `File` would also suggest we should rename `FileType`, and I don't
know what would be a better name for that type.
We currently get the hostname and username from the `whoami` crate. We
do that in lib crate, without giving the caller a way to override
them. That seems wrong since it might be used in a server and
performing operations on behalf of some other user. This commit makes
the hostname and username configurable, so the calling crate can pass
them in. If they have not been passed in, we still default to the
values from the `whoami` crate.
This migrates the native backend from Protobuf to Thrift since
Google's Protobuf team does let us import jj into Google's monorepo if
it uses a third-party Protobuf library.
Since the native backend is not supported, I didn't write any
migration code for it.
We can't remove `lib/src/protos/store.proto` yet, because it's also
used by the Git backend (only the `predecessors` and `change_id`
fields).
When we export branches to Git, we didn't update our own record of
Git's refs. This frequently led to spurious conflicts in these refs
(e.g. #463). This is typically what happened:
1. Import a branch pointing to commit A from Git
2. Modify the branch in jj to point to commit B
3. Export the branch to Git
4. Update the branch in Git to point to commit C
5. Import refs from Git
In step 3, we forgot to update our record of the branch in the repo
view's `git_refs` field. That led to the import in step 5 to think
that the branch moved from A to C in Git, which conflicts with the
internal branch target of B.
This commit fixes the bug by updating the refs in the `MutableRepo`.
Closes#463.
As I said in the previous patch, I don't know why I made the initial
export to Git a no-op. Exporting everything makes more sense to
(current-)me. It will make it slightly easier to skip exporting
conflicted branches (#463). It also lets us remove a `jj export` call
from `test_templater.rs`.
The first time we export to Git, we don't actually export
anything. That's a little weird and I don't know why I did it that
way. It seems more natural to export the current state. I'd like to
change it to do that. However, that currently means we'll detach the
current HEAD if it points to any of the branches we export. This patch
restructures the code a bit and skips the detach step if the target
branch already points to the right commit.
We could write the view object to the operation store instead, but
then we would need to make sure we don't GC it (once we add support
for GC of the operation store).
I'm going to add other error variants for when we fail to read/write
to disk, and I don't think we need to have a custom message for each
in the CLI crate. It's easier to pass along the message from the lib
crate.
(`ConflictedBranch`, on the other hand, is an expected error so I
think we should continue to let let the CLI crate define the error
message for it.)
In order to fix#463, I'm going to make us export to Git a little
earlier, before finishing the transation. That means we won't have an
operation yet, but we don't need that anyway.
To fix#463, I think we want to skip conflicted branches when we
export instead of erroring out. It seems we didn't have test case for
the current behavior, so let's add one.
This is a test case for #463. It's not exactly the same case, but I'm
confident that the root cause is the same (that the
`.jj/repo/git_export_operation_id` doesn't include the git refs we
just updated).
These calls often appear in expressions long enough that not having to
qualify it means that we can sometimes avoid wrapping a line. I
noticed because IntelliJ told me that `test_git.rs` had some
unnecessary qualificiations (the function was already imported there).
As mentioned in the previous commit, we need to remove the Protobuf
dependency in order to be allowed to import jj into Google's
repo. This commit makes `SimpleOpStore` store its data using Thrift
instead of Protobufs. It also adds automatic upgrade of existing
repos. The upgrade process took 18 s in my repo, which has 22k
operations. The upgraded storage uses practically the same amount of
space. `jj op log` (the full outut) in my repo slowed down from 1.2 s
to 3.4 s. Luckily that's an uncommon operation. I couldn't measure any
difference in `jj status` (loading a single operation).
In order to allow building jj inside of Google, our Protobuf team
doesn't want to us to use a Google-unsupported implementation. Since
there is no supported implementation in Rust, we have to migrate off
of Protobufs. I'm starting with the operation store. This commit moves
the current implementation to a separate file so it can easily be
disabled by a Caargo feature.
Decouples view/operation IDs from serialized forms, which are not
necessarily stable. Not breaking as these IDs are persistent, never
recomputed or used for integrity checking.
This should help us reason about the safety implication. New inner module
is added to encapsulate unsafe access.
DirtyCell provides .with_ref(callback) instead of .borrow(). This isn't
strictly needed, but should clarify the intent of the temporary reference.
This also allows us to rewrite DirtyCell without unsafe code, if needed,
by leveraging OnceCell<T> x RefCell<Option<T>> pair.
The `testutils` module should ideally not be part of the library
dependencies. Since they're used by the integration tests (and the CLI
tests), we need to move them to a separate crate to achieve that.
Unfortunately, TOML requires quotes around the argument. So, the
usage is `jj --config-toml ui.color=\"always\"` in bash. The plan is
to eventually have a `--config` option with simpler syntax for
simple cases.
As discussed in https://github.com/martinvonz/jj/discussions/688.
If I'm reading this attribute correctly, it says that if the
`map_first_last` feature is enabled, then we should enable the
`map_first_last` feature, which seems like it would not have any
effect. We started getting warnings from the nightly compiler about
this line because it tries to enable a feature that's stable in that
version.
The idea is to (ab)use pest::error::Error type to pretty-print error
message with code span. pest::error::Error will be constructed upfront
to erase lifetime from Span<'i>/Position<'i>.
Baby step towards embedding matcher in RevsetExpression. If we had a fileset
language or regex pattern, we would probably want to parse it at this stage
so the syntax error can be reported without evaluation.
If you remove all refs from the backing Git repo and then run `jj git
import`, we would see that all commits disappeared from the Git repo,
so we would remove them from the jj repo too. However, we do that by
doing a history walk from old heads to the new heads, which includes
the root commit when the new heads is an empty set. That means that we
mark the root commit as abandoned, which led to a crash in
`rewrite.rs` (when we try pick the root commit's first parent to use
as parent for rebased commits).
I was trying to create a reproduction script for #412, but the script
ran into another bug first. The script removed all the local and
remote branches from the backing Git repo. I noticed that we would
then try to abandon all commits. We should still count Git HEAD's
target as visible and not try to abandon it. This patch fixes that.
Sometimes a tagged commit is not in any branch on the remote, but we
should still consider them upstream and not include them in the
default log template.
This was reported by @colemickens but now that I think about it, I
remember seeing such commits in the Git core repo (v1.4.4.5 and a few
commits before it were never merged into main).
We don't have a good way of testing this because we don't have a
command for creating tags.
Closes#681
The function has only one caller since 25b922cd0b and it's pretty
small. Inlining also means we can reuse the joined paths created in
it, so I did that by extracting variables for them.
Since 'merges()' just filters the candidates set per item, it doesn't need
a candidates argument. Perhaps, 'merges(x)' could be a predicate to select
merge commits within a subgraph 'x', but I don't know if that would be
useful.
Since 'filter(expr)' is identical to 'expr & filter()', it can be rewritten
in order to minimize the candidates set to be scanned.
The implementation is somewhat similar to revsetlang.optimize() of Mercurial,
but I've split the tree rewriting logic to several steps. I think that's good
for maintainability and should help us conclude that a recursion will
eventually terminate.
This helps to match '(filter, _) | (_, filter)' to rewrite the expression
tree. Only one predicate is allowed for now, but I think it can be extended
to internalize 'f(c) & g(c)' as '(g*f)(c)' to eliminate redundant lookup
of commit object.
I had `init.defaultBranch = main` in my global config (just being
rolled out internally at Google, it seems), which made
`test_import_refs_reimport_head_removed()` and
`test_fetch_initial_commit()` fail. This fixes it.
Since d56ae79d3f, `WorkingCopy` no longer reads `.gitignores`
directly from `$HOME/.gitignore`, so we don't need the workaround to
prevent it in the tests.
More workspace-derived parameters will be added, and I don't think wrapping
with Option for each makes sense because all parameters should be available
if workspace exists.
It seems a bit invasive that RepoPath constructor processes an environment
like cwd, but we need an unmodified input string to build a readable error.
The error could be rewrapped at cli boundary, but I don't think it would
worth inserting indirection just for that.
I made s/file_path/fs_path/ change because there's already to_fs_path()
function, and "file path" in RepoPath context may be ambiguous.
The native backend is just a proof of concept and there's no real
reason to use it other than for testing, so let's reduce the risk of
accidentally creating repos using it.
Previously an expression 'foo-bar-' failed to parse because
1. try first rule: 'foo-bar-' matches (identifier_part+ ~ '-')+, but the
trailing '' doesn't match identifier_part+
2. fall back to second rule: 'foo' matches identifier_part+
=> (identifier 'foo')
Instead, we need to consume as much (identifier_part ~ '-' ~ ...) as possible
before falling back to the identifier_part rule.
I think the trailing + of identifier_part+ is redundant, so removed it as
well.
I had intended for the `features = ["toml"]` to enable *only* the
`toml` feature, but I forgot to disable the default features so it had
no effect. This shrinks the binary from 17.4 MiB to 16.8 MiB.
Let WorkspaceCommandHelper clone it. WorkspaceCommandHelper could return
workspace_id by reference, but doing that would introduce noisy .clone()
calls and lifetime mess.
For consistency with the tree_state handling. This isn't that simple and
concise compared to the tree_state one, so I'm fine to drop the series.
I've extracted {operation_id, workspace_id} pair so these values can be
safely initialized by OnceCell. The extracted struct is named after the
"checkout" file.
Here OnceCell<T> serves as RefCell<Option<T>>, but it doesn't require runtime
Ref/RefMut wrapper. This allows us to get rid of some .clone() calls needed to
hide Ref<_> from public interface.
I'll make TreeState::snapshot() return a boolean denoting whether tree_state
is updated or not. New tree_id can be obtained from TreeState, so let's
remove it from the return value.
While making tree_state() return RefMut<TreeState> instead of RefMut<Option<_>>,
I felt uncomfortable that tree_state(&self) returned a mutable reference. So
this patch splits it into tree_state() and tree_state_mut().
I was reading a draft of "Git Rev News: Edition 91" [1] where Peff
mentions some unfinished patches to allow negative timestamps in
Git. So I figured I should add support for that before I forget. I
haven't checked if libgit2 supports it, so it might be that our Git
backend still doesn't support it after this patch.
[1] https://github.com/git/git.github.io/blob/master/rev_news/drafts/edition-91.md
When the remote rejects a push, we now say something like this:
```
Remote rejected the update of some refs (do you have permission to push to ["refs/heads/main"]?)
```
That message could be formatted better, but this seems good enough for
now. We should probably have some more custom conversion from
`GitPushError` to `CommandError` in the CLI layer.
This changes `RepoLoader` to take a map of functions that load a
specific type of backend, keyed by the backend type. The backend type
is read from `.jj/repo/store/backend`.
We currently determine if the repo uses the Git backend or the local
backend by checking for presence of a `.jj/repo/store/git_target`
file. To make it easier to add out-of-tree backends, let's instead add
a file that indicates which backend to use.
The `ReadonlyRepo::init_*()` functions were unused or used only in
tests. Let's remove them, thereby making the repo less aware of
specific backend implementations.
By calling `repo.loader()` on the existing repo, we reuse the same
commit store etc. That should make it easier to add out-of-tree
backends by removing one place we'd otherwise need to pass in a way of
creating a backend (now we reuse the instance that was already created
for the existing `repo`).
In many of these places, we don't need an owned value, so using a
reference means we don't force the caller to clone the value. I really
doubt it will have any noticeable impact on performance (I think these
are all once-per-repo paths); it's just a little simpler this way.
I feel it doesn't make sense for a simple getter function to create an
owned vec after 0108673087 "backend: let each backend handle root commit
on write."
This moves the logic for handling the root commit when writing commits
from `CommitBuilder` into the individual backends. It always bothered
me a bit that the `commit::Commit` wrapper had a different idea of the
number of parents than the wrapped `backend::Commit` had.
With this change, the `LocalBackend` will now write the root commit in
the list of parents if it's there in the argument to
`write_commit()`. Note that root commit itself won't be written. The
main argument for not writing it is that we can then keep the fake
all-zeros hash for it. One argument for writing it, if we were to do
so, is that it would make the set of written objects consistent, so
any future processing of them (such as GC) doesn't have to know to
ignore the root commit in the list of parents.
We still treat the two backends the same, so the user won't be allowed
to create merges including the root commit even when using the
`LocalBackend`.
I had made the backends unaware of the virtual root commit because
they don't need to know about it, and we could avoid some duplicated
code by putting that in `Store` instead. However, as we saw in
b21a123bc8, the root commit being virtual has some user-visible
effects (they can't create a merge with the root and some other
commit). So I'm thinking that we may want to make the root commit an
actual commit, depending on which backend is used. Specificially, when
using the Git backend, we cannot record the root commit as an actual
parent since Git would fail when trying to look it up. Backends that
don't need compatibility can make the root commit an actual commit,
however.
This commit therefore makes the backends aware of the root commit. It
makes it remain a virtual commit in the Git backend, and makes it an
actual commit in the `LocalBackend`.
This commit breaks any existing repos using the `LocalBackend`, but
there shouldn't be any such repos other than for testing.
I feel the original -------/+++++++ pair is slightly confusing because
each half can be a separator by itself. I don't know what character other
than '-'/'+' is preferred, but let's pick '%' (for "mod") per @martinvonz
suggestion.
`wc_commit` seems clearer than `checkout` and not too much longer. I
considered `working_copy` but it was less clear (could be the path to
the working copy, or an instance of `WorkingCopy`). I also considered
`working_copy_commit`, but that seems a bit too long.
This will be a basic building block of 'jj log PATH'. The implementation
is naive, but works fine for small repos like jj. For mid-size repos,
there would be various areas which need to be optimized.