When doing things like testing snapshot performance differences,
this allows you to turn off the monitor, no matter what the enabled
user or repository configuration has, e.g.
jj st --config-toml='core.fsmonitor="none"'
Signed-off-by: Austin Seipp <aseipp@pobox.com>
I was a bit surprised to learn (or be reminded?) that checking out
symlinks on Windows leads to a panic. This patch fixes the crash by
materializing symlinks from the repo as regular files. It also updates
the snapshotting code so we preserve the symlink-ness of a path. The
user can update the symlink in the repo by updating the regular file
in the working copy. This seems to match Git's behavior on Windows
when symlinks are disabled.
If the operation corresponding to a workspace is missing for some reason
(the specific situation in the test in this commit is that an operation
was abandoned and garbage-collected from another workspace), currently,
jj fails with a 255 error code. Teach jj a way to recover from this
situation.
When jj detects such a situation, it prints a message and stops
operation, similar to when a workspace is stale. The message tells the
user what command to run.
When that command is run, jj loads the repo at the @ operation (instead
of the operation of the workspace), creates a new commit on the @
commit with an empty tree, and then proceeds as usual - in particular,
including the auto-snapshotting of the working tree, which creates
another commit that obsoletes the newly created commit.
There are several design points I considered.
1) Whether the recovery should be automatic, or (as in this commit)
manual in that the user should be prompted to run a command. The user
might prefer to recover in another way (e.g. by simply deleting the
workspace) and this situation is (hopefully) rare enough that I think
it's better to prompt the user.
2) Which command the user should be prompted to run (and thus, which
command should be taught to perform the recovery). I chose "workspace
update-stale" because the circumstances are very similar to it: it's
symptom is that the regular jj operation is blocked somewhere at the
beginning, and "workspace update-stale" already does some special work
before the blockage (this commit adds more of such special work). But it
might be better for something more explicitly named, or even a sequence
of commands (e.g. "create a new operation that becomes @ that no
workspace points to", "low-level command that makes a workspace point to
the operation @") but I can see how this can be unnecessarily confusing
for the user.
3) How we recover. I can think of several ways:
a) Always create a commit, and allow the automatic snapshotting to
create another commit that obsoletes this commit.
b) Create a commit but somehow teach the automatic snapshotting to
replace the created commit in-place (so it has no predecessor, as viewed
in "obslog").
c) Do either a) or b), with the added improvement that if there is no
diff between the newly created commit and the former @, to behave as if
no new commit was created (@ remains as the former @).
I chose a) since it was the simplest and most easily reasoned about,
which I think is the best way to go when recovering from a rare
situation.
Our virtual file system at Google (CitC) would like to know the commit
so it can scan backwards and find the closest mainline tree based on
it. Since we always record an operation id (which resolves to a
working-copy commit) when we write the working-copy state, it doesn't
seem like a restriction to require a commit.
Our internal working copy implementations at Google will need the
commit so they can walk history backwards until they get to a "public"
commit. They'll then use that to tell build tools and virtual file
systems to present that as a base.
I'm not sure if we'll need to update `reset()` too. It's currently
only used by `jj untrack`, which doesn't change the commit's parent,
so it wouldn't affect any history walks.
It seems pretty clear from the context. Turns out we only use the
function in a test case. Maybe we don't even need it. It's easy to
provide it, though.
It's about time we make the working copy a pluggable backend like we
have for the other storage. We will use it at Google for at least two
reasons:
* To support our virtual file system. That will be a completely
separate working copy backend, which will interact with the virtual
file system to update and snapshot the working copy.
* On local disk, we need to tell our build system where to find the
paths that are not in the sparse patterns. We plan to do that by
wrapping the standard local working copy backend (the one moved in
this commit), writing a symlink that points to the mainline commit
where the "background" files can be read from.
Let's start by renaming the exising implementation to
`local_working_copy`.
The `#[tokio::main]` annotation uses a multi-threaded runtime by
default. We don't need that for querying watchman. Switching to the
single-threaded runtime saves about 20 ms.
In `LockedWorkingCopy::drop()`, we panic if the caller had not called
`finish()`. IIRC, the idea was both to find bugs where we forgot to
call `finish()` and to prevent continuing with a modified
`WorkingCopy` instance. I don't think the former has been a problem in
practice. It has been a problem in practice to call `discard()` to
avoid the panic, though. To address that, we can make the `Drop`
implementation discard the changes (forcing a reload of the state if
the working copy is accessed again).
Many (most?) callers of `Store::empty_tree_id()` really want a
`MergedTreeId`, so let's create a helper for that. It returns the
`Legacy` variant, which is what all current callers used. That should
be all we need since the two variants compare equal these days, and
since trees built based on the legacy variant can get promoted to the
new variant on write if the config is enabled.
This introduces a `MergedTreeBuilder` type, which takes a set of base
trees and overrides. The idea is that it will be able to write
multiple trees or a legacy tree. For now, it's only able to write
legacy trees. To show that it works, the working copy's snaphotting
code has been updated to use it.
We were using `current_tree()` only for an assertion where we were
walking its entries. Now that `MergedTree` supports that, we can
replace `current_tree()` by `current_merged_tree()`.
There's more work needed before the working copy can fully work with
tree-level conflicts. We still need to be able to store multiple tree
ids in the `tree_state` file, and we need to be able to create
multiple trees instead of writing conflict objects to the backend.
To support tree-level conflicts, we're going to need to update the
working copy from one `MergedTree` to another. We're going need to
store multiple tree ids in the `tree_state` file. This patch gets us
closer to that by getting the diff from `MergedTree`s`, even though we
assume that they are legacy trees for now, so we can write to the
single-tree `tree_state` file.
When we do an update between two `MergedTree` instances, we'll get
diffs between two `Merge<Option<TreeValue>>`. This commit prepares for
that by changing the type of the `before` and `after` arguments we
pass into the closure in `update()`.
I think it's a little easier to follow if we don't update the stats in
the large callback. It also reduces the risk of forgetting to update
the stats in some case (like in the exec-bit-optimization case I just
removed).
When updating the working copy from one tree to another, if only the
executable bit has changed between the two trees, we set the
executable bit on the file without touching its contents. The
optimization probably gets used quite rarely. Maybe it's even so
rarely that it's a pessimization overall. Perhaps its value lies more
in that we avoid updating the file's mtime unnecessarily. Either way,
I'm about to change this code to use `Merge<Option<TreeValue>>` and
that will make this block more complex. I don't think it's worth the
complexity even it provides some small benefit sometimes.
When the main `TreeState::snapshot()` thread doesn't receive any
updated tree entries over the channel, it correctly doesn't write a
new tree. However, it also doesn't write the working copy state file
(`.jj/working_copy/tree_state`). This resulted in performance
regression in 3f97a6da78. From that commit, repeated snapshotting
would have to re-read all files from disk because it didn't remember
the updated mtime from the previous time.
This patch fixes the bug by also writing the file if there were any
new file states.
This doesn't seem to make any difference right now, but it will if we
write the state file when there are mtime-only changes, which we
currently don't do.
The code for getting the current tree object was repeated a few times
over. I'm going to soon make it return a `MergedTree` and I don't want
to repeat that code (it's more complicated than the current code).