It's faster to add only files matched by the Watchman matcher to the
set of deleted files than to add all files and then removed files not
matched. This speeds up `jj diff` with Watchman in the Linux repo from
~530 ms to ~460 ms.
The intention in this and some future commits is to have `update_file_state` accept `&self` instead of `&mut self` to make clear what data is updated by `update_file_state` and to ensure transactional safety of the `TreeState` contents.
It's currently a bit complicated to snapshot the working copy and
there's a lot of duplication in tests. This commit introduces a
function to simplify it. I made the function snapshot the working copy
and save the updated state. Some of the tests I changed previously
discarded the changes instead of saving them, but I think they all did
so because it was simpler. I left a few call sites unchanged because
they make concurrent changes.
This is breaking change. Old jj binary will panic if it sees a view saved by
new jj. Alternatively, we can store both new and legacy data for backward
compatibility.
This also lets us compare the resulting tree because the working copy
now exactly matches the tree (it used to be that the `.gitignore` file
wasn't initially snapshotted).
I guess we don't depend on `read_view()` ever returning `NotFound` the
way `read_operation()` does, but it seems like it still should return
`NotFound` when the view doesn't exist.
There's a comment saying that mutating the file's current state
simplifies later logic, but I don't think that's true. It might have
been true in the past, when we had `FileType::Conflict`.
We don't seem to have any tests that our protection from undetected
changes caused by writes happening right after checkout, so let's add
one. The test case loops 100 times and each iteration fails slightly
more than 80% of the time on my machine (if I remove the protection in
`TreeState::update_file_state()`), so it seems quite good at
triggering the race.
I don't see any reason for these tests to differ depending on backend,
so let's avoid running them twice (there are probably lots of other
tests that we're also running with different backends for little
reason).
When snapshotting a conflict, we read the contents and parse conflict
markers. If we determine that it's no longer a conflict, then we write
a normal file entry to the tree instead. When we do that, we re-read
the file from the working copy. Let's instead write the file contents
we already read.
When a file's mtime has changed on disk, we update our record of that
mtime, but we did so in a separate place for conflicts compared to
non-conflicts. Let's reuse it.
The working copy's current tree tracks whether a file is a
conflict. We also track that in the `TreeState` object. That allows us
to not read the trees object to decide if we should try to parse a
file as a conflict.
One disadvantage is that it's redundant information that needs to be
kept in sync with the tree object. Also, for Watchman, we would like
to completely ignore the persisted `FileState`.
This commit removes the `FileType::Conflict` variant and instead
checks in the tree object whether a given path was a conflict. This is
the change I mentioned in dc8a207737. We still skip the check
completely if the file's mtime etc. is unchanged, so it shouldn't have
much effect in the common case of a mostly unchanged working copy. I
measured a slowdown on `jj diff` by ~3% in the Linux repo with a clean
working copy with all mtimes bumped. I think the simpler code and
reduced risk of subtle bugs is worth the performance hit.
The idea is simple. New heads are ignored until the node dependency resolution
stuck. Then, only the first head that will unblock the visit will be queued.
Closes#242
The original idea was similar to Mercurial's "topo" sorting, but it was bad
at handling merge-heavy history. In order to render merges of topic branches
nicely, we need to prioritize branches at merge point, not at fork point.
OTOH, we do also want to place unmerged branches as close to the fork point
as possible. This commit implements the former requirement, and the latter
will be addressed by the next commit.
I think this is similar to Git's sorting logic described in the following blog
post. In our case, the in-degree walk can be dumb since topological order is
guaranteed by the index. We keep HashSet<CommitId> instead of an in-degree
integer value, which will be used in the next commit to resolve new heads as
late as possible.
https://github.blog/2022-08-30-gits-database-internals-ii-commit-history-queries/#topological-sorting
Compared to Sapling's beautify_graph(), this is lazy, and can roughly preserve
the index (or chronological) order. I tried beautify_graph() with prioritizing
the @ commit, but the result seemed too aggressively reordered. Perhaps, for
more complex history, beautify_graph() would produce a better result. For my
wip branches (~30 branches, a couple of commits per branch), this works pretty
well.
#242
This partially reverts 4c8f484278 "graphlog: key by commit id (not index
position)." As Martin pointed out, it made "log -r 'tags()' -T.." in git
repo super slow. Apparently, both clone() and hash map insertion/lookup costs
increased by that change. Since we don't need CommitId inside the graph
iterator, we can simply replace it with IndexPosition, and resolve it to
CommitId later.
Copied from MergedTree::has_conflict(). I feel it's slightly better since
RefTarget "is" always a Conflict-based type. It could be inverted to
RefTarget::is_resolved(), but refs are usually resolved, and all callers
have special case for !is_resolved() state.
We upgraded to clap 4.1 in 8f1dc490 (cargo: upgrade to clap 4.1,
2023-03-17), and the pipe to `sed` was removed in the README.md file in
that same commit. However the documentation was not updated in mod.rs,
which leads us to give the obsolete (now incorrect) advice when we run
`jj util completion --help`.
See #1393.