I plan to use this matcher for some future `jj add` command (for
#323). The idea is that we'll do a path-restricted walk of the working
copy based on the intersection of the sparse patterns and any patterns
specified by the user. However, I think it will be useful before that,
for @arxanas's fsmonitor feature (#362).
The new `Visit::Nothing` variant lets us more easily restructure
`DifferenceMatcher::visit()` to make it handle the case of not
removing anything. I think this makes the code a little clearer.
I didn't initially create a `Visit::Nothing` variant because I was
worried that the fact that there then are two ways of expressing this
value (there's also `Visit::Specific` with empty sets). However, the
value is quite useful for pattern matching, so I'm now thinking it's
worth the risk.
If a commit's author field has the placeholder user/email values
(i.e. "(no name configured)" and "(no email configured)"), and they
have now configured their email and username, they probably want us to
update the author field with the new information, so that's what this
patch does. Thanks to durin42@ for the suggestion on #322.
If the source commit becomes empty as a result of
`move/squash/unsquash`, we abandon it. However, perhaps we shouldn't
do that if the source commit is a working-copy commit because
working-copy commits are often work-in-progress commits.
The background for this change is that @arxanas had just started a new
change and had set a description on it, and then decided to make some
changes in the working copy that should be in the parent
commit. Running `jj squash` then abandoned the working-copy commit,
resuling in the description getting lost.
The regular `Display` format is (not surprisingly) more user-friendly,
as pointed out by @yuja.
I also switched to using format strings for these cases, and some
nearby strings for consistency.
We can easily make the `DirEntry` available here, so we can call
`.metadata()` on that instead of on the `Path`. I think that avoids
walking the path. I'm sure this has no significant impact on
performance, but it's also almost as readable.
When we have just written a file to disk on checkout, let's record the
size we expected instead of what we got from `fstat()`. This should
address a race where the file was modified between the time we wrote
it and the time we requested its stat. I just happened to notice this
while going through the code; it seems very unlikely to be noticed in
practice.
When we have just written a file or conflict, we can get metadata for
it via the open file descriptor instead of using the path. That
removes the risk of a race where the file got removed or replaced by
another file type (at least on Unix).
These assertions were there to catch bugs, but when the bugs happen,
the assertions can obsure the underlying error (as @tp-woven found out
on #258). Let's just print errors instead.
We don't even have any settings that affect the repo, so there's no
point in passing the settings. I think this was a leftover from before
we separated out the "workspace" concept; now we no longer create a
working-copy commit when we initialize a repo (we do that when we
attach the workspace).
We don't need the `config` crate's support for JSON etc., so let's
just enable the TOML feature. (Trying to import all the JSON, RON,
dependencies etc. into Google's source control was a pain.)
This adds a `--reversed` flag to `jj log` to show commits with later
commits further down. It works both with and without the graph.
Since the graph-drawing code is already independent of the
relationship between commits, it doesn't need any updating.
The request to show the log output with more recent commits at the
bottom comes up once in a while (among Mercurial users, and now also
for jj from @arxanas). It's pretty easy to implement by adding an
adapter to the current `RevsetGraphIterator`. It works by first
collecting all nodes and edges into a vector and then yielding them in
reverse order and with reversed edges. That means it's no longer lazy,
but that seems fine since the feature is optional. Also, it's only the
subset of nodes that are in the selected revset that will be
collected.
Making the CLI use the new iterator adapter will come in a later
patch.
It's much easier to update the tests with `insta`.
It also presents you with the bad output including real newlines (a
diff, actually), so we can remove the `println!()` calls we had in
order to get readable output without escaped newlines.
The biggest difference in the API is that fields are now public. The
exception from that is `oneof` fields, which still require setters and
getters.
I couldn't measure any difference in performance. I didn't expect any
difference either, but it's good that it didn't seem to regress. I
timed `jj debug operation <some hash prefix>`, which will read the
whole operation log (to check that the prefix is unambiguous).
Tree merges can currently fail because of a failure to look up an
object, or because of a failure to read its contents. Both results in
`BackendError` because of a `impl From<std::io::Error> for
BackendError`. That's kind of correct in this case, but it wasn't
intentional (that impl was from `local_backend`), and we need to
making errors more specific for better error handling.
I think I copied the name `write_tree()` from Git, but I find it quite
confusing, since it's not clear if it write a tree to the working copy
or reads the working copy and writes a tree to the store (it's the
former).
It seems to me that we have never created a Git index in order to
create a commit, not even in the earliest versions of the code (before
it was moved to Git).
Now that I'm using GitHub PRs instead of pushing directly to the main
branch, it's quite annoying to have to abandon the old commits after
GitHub rebases them. This patch makes it so we compare the remote's
previous heads to the new heads and abandons any commits that were
removed on the remote. As usual, that means that descendants get
rebased onto the closest remaining commit.
This is half of #241. The other half is to detect rewritten branches
and rebase on top.
Let's say we have a simple history like this:
```
B C D
\|/
A
```
Branch `main` initially points to commit B. Two concurrent operations
then move the branch to commits C and D. When the two concurrent
operations get merged, the branch will be recorded as pointing to
"C+D-B". If a subsequent operation now abandons commit B, we would
update the "removed" side of the branch conflict. That seems a little
dishonest. I think the reason I did it that way was in order to not
keep B visible back when having it present in the "removed" side would
keep it visible (before 33bf6ce1d5).
I noticed this issue while working on #241 because
`test_import_refs_reimport()` started failing. That test case is
pretty much exactly the case above.
When committing the working copy, we try to not visit ignored
directories (as e.g. `target/` often is), but we need to visit it if
there are already tracked files in it. I initially missed that in
c1060610bd and then fixed it in a028f33e3b. The fix works by
checking if the next path after the ignored path is inside the ignore
path (viewed as a directory). However, I forgot to handle the case
where there are no paths at all after the ignored path. So, for
example, if the `target/` directory should be ignored and it there
were no tracked paths after `target/` in alphabetical order, we would
still visit the directory. That's why the bug reproduced in the
`git-branchless` repo but not in the `jj` repo (because there are
files under `testing/` and `tests/` here).
Closes#247.
This patch makes room for sparse patterns in the `TreeState` proto
message. We also start setting that value to a list of just the
pattern `.` when we create new working copies. Old working copies
without the sparse patterns are also interpreted as having that single
pattern. Note that this absence of sparse patterns is different from a
present list of no patterns. The latter is a valid state and means
that no paths are included in the sparse checkout.
This adds a matcher that takes two input matchers and creates a new
matcher from them. The composite matcher matches paths matched by the
first matcher but not matched by the second matcher. I plan to use
this for sparse checkouts. They'll also be useful if we add support
for negative patterns to filter e.g. `jj files` by.
Knowing that a matchers matches everything recursively from a certain
directory is useful for various optimizations. For example, it lets
you avoid visiting a directory if you're using a matcher with a
negative condition (so you return what does *not* match).
The `DescendantRebaser` keeps a map of branches from the source
commit, so it gets efficient lookup of branches to update when a
commit has been rebased. This map was not kept up to date as we
rebased. That could lead to branches getting left on hidden
intermediate commits. Specifically, if a commit with a branch was
rewritten by some command, and an ancestor of it was also rewritten,
then we'd only update the branch only the first step and not update it
again when rebasing onto the rewritten ancestor.
When a directory is missing in one merge input (base or one side), we
would consider that a merge conflict. This patch changes that so we
instead merge trees by treating the missing tree as empty.
This introduces a `connected(x)` function, which is simply the same as
`x:x`. It's occasionally useful if `x` is a long expression. It's also
useful as a building block for `root(x)` (coming soon).
This release is mostly about the fix for #177, which looks pretty bad
even though I think it is actually harmless. It also has `jj log -p`
contributed by @yuja!
We do it for all the other kinds of objects already. It's useful to
have the path for backends that store objects by path (we don't have
any such backends yet). I think the reason I didn't do it from the
beginning was because we had separate `RepoPath` types for files and
directories back then.
We depend on comparing the workspace root with the Git repo's path to
know if we're sharing the working copy with it. For that to work
reliably, we need the paths to be canonicalized, so that's what this
patch tries to do.
There was a TODO about adding a test case for a delete/modify conflict
in a branch target that got resolved by abandoning a commit. The
resolution is to delete the branch. That case couldn't happend with
our old evolution-based mechanism for tracking rewrites (because we
couldn't un-prune a commit then).
This involved copying `UnresolvedHeadRepo::resolve()` into the CLI
crate (and modifying it a bit to print number of rebased commit),
which is unfortunate.
The function is now pretty simple, and there's only one caller, so
let's inline it. It probably makes sense to move the code out of
`repo.rs` at some point.
It's the transaction's job to create a new operation, and that's where
the knowledge of parent operations is. By moving the logic for merging
in another operation there, we can make it contiuously update its set
of parent operations. That removes the risk of forgetting to add the
merged-in operation as a parent. It also makes it easier to reuse the
function from the CLI so we can inform the user about the process
(which is what I was investigating when I noticed that this cleanup
was possible).
If we have recorded in `MutableRepo` that commits have been abandoned
or rewritten, we should always rebase descendants before committing
the transaction (otherwise there's no reason to record the
rewrites). That's not much of a risk in the CLI because we already
have that logic in a central place there (`finish_transaction()`), but
other users of the library crate could easily miss it. Perhaps we
should automatically do any necessary rebasing we commit the
transaction in the library crate instead, but for now let's just have
a check for that to catch such bugs.
Certain commands should never rewrite commits, or they take care of
rebasing descendants themselves. We have an optimization in
`commands.rs` for those commands, so they skip the usual automatic
rebasing before committing the transaction. That's risky to have to
remember and `MutableRepo` already knows if any commits have been
rewritten (that wasn't the case before, in the Evolution-based
code). So let's just have `MutableRepo` do the check instead.
It's useful for the UI layer to know that there's been concurrent
operations, so it can inform the user that that happened. It'll be
even more useful when we soon start making the resolution involve
rebasing commits, since that's even more important for the UI layer to
present to the user. This patch gets us a bit closer to that by moving
the resolution to the repo level.
We had a few lines of similar code where we added a new of the
operation log and then removed the old heads. By moving that code into
a new type, we prepare for further refactorings.
I want to make it so when we apply a repo-level change that removes a
head, we rebase descendants of that head onto the closes visible
ancestor, or onto its successor if the head has been rewritten (see
#111 for details). The view itself doesn't have enough information for
that; we need repo-level information (to figure out relationships
between commits).
The view doesn't have enough information to do the.
It's unusual for the current commit to have descendants, but it can
happen. In particular, it can easily happen when you run `jj new`. You
probably don't want to abandon it in those cases.
I forgot to bump the version to 0.3.2 before tagging and releasing it,
so the released 0.3.2 has version number 0.3.1 in the source code and
(therefore) reported from `jj --version`. I'm therefore bumping it
from 0.3.1 to 0.3.3 now, so there can be a matching 0.3.3 release.
I was able to build a working musl binary with this change, by running
this command:
```
cargo build --release --target x86_64-unknown-linux-musl
```
Thanks to @arxanas for the tip.
There's been *a lot* of changes since 0.2.0 almost a year ago. With
the attention the project has gotten recently, I feel like I should
cut a new release and start keeping a changelog. So let's start by
bumping the version to 0.3.0.
The library crate shouldn't look up the user's `$HOME` directory
(maybe the library is used by a server process), so let's have the
caller pass it into the library crate instead.
I'm not sure it'll be useful, but it seems nice to be able to set the
same values via config or environment variables. Perhap we should
simply use `config::Environment` to make everything configurable via
environment variables, but I'll leave that for later.
It's useful to have `signature()` live on `UserSettings` because that
will let us cache information (such as the timestamp) in the
instance. It will also make it easier to have the timestamp settable
via regular config files. I don't know that that will be useful, but
it seems like a clean way of implementing it if we can have
environment variables simply as an overlay of configs.
We don't really need a BTreeMap for keeping the unchanged ranges. The
only place it helps a bit is when refining a diff because we may then
insert some more unchanged ranges in the list. I think there has to be
very many unchanged ranges for that to matter, however. This patch
therefore replace the BTreeMap by a sorted Vec. `cargo bench` says
that a few tests got ~20% faster.
I'm looking into this code now because I'm thinking of copying some of
it for the "partial conflict resolution" tool I'm working on for
Mercurial.
I wanted to replace the BTreeMap by a Vec and noticed that we actually
sometimes end up having a `0..n` range followed by a `0..0` after
refinement. We currently compare those two as equal because I had not
thought that we could end up attempting to add two ranges with the
same start point. When trying to insert the second range (`0..0`), the
BTreeMap will keep the existing key (`0..n`) and replace the
value. That's probably works, but it's clearly not what I
intended. Let's fix by sorting by the end point if the start point is
equal. This actually improves some benchmarks by a few percent (maybe
because the subsequent compaction can then remove the `0..0` range).
When the backing Git repo is inside the workspace (typically directly
in `.git/`), let's point to it by a relative path so the whole
workspace can be moved without breaking the link.
Closes#72.
This patch introduces a `JJ_TIMESTAMP` environment variable that lets
us specify the timestamp to use in tests. It also updates the tests to
use it, which means we get to simplify the tests a lot now that that
the hashes are predictable.
Originally, I had thought that these warnings would only potentially show up in nightly because there was a feature which exposed these functions, and we would be able to enable that feature and conditionally not define the conflicting methods. But it looks like these warnings also show up in stable. I've just suppressed each of them individually. Other options would be to rename them and just make them wrapper methods, or to disable `unstable_name_collisions` warnings at a higher scope (possibly including at the crate level).
1b6efdc3f8 moved `.jj/git/` into `.jj/store/` for consistency with
the layout of native stores. It provided automatic format upgrades for
repos with the old format. It's been about four months now, so let's
remove the migration code.
We no longer need the commit ID, so we shouldn't make the callers pass
it. This lets us simplify several tests, because they no longer to
create commits just to check out a tree in the working copy.
We used to use the value to detect races, but we use the tree ID and
the operation ID these days, so we don't need the commit ID.
By changing this, we can avoid creating some commit IDs in tests,
which is why I tackled this issue now.
There are only two callers of `LockedWorkingCopy::check_out()`. One is
in `commands.rs`. That caller already checks after taking the lock
that the old commit ID is as expected. The other caller is
`WorkingCopy::check_out()`. We can simply move the check to that level
since it's the only caller that cares now.
We resolve checkouts in favor of the first-committed operation (which
is more likely to have managed to update the working copy). The test
case has been flaky on GitHub lately. I've run it 1000 times on my
machine without failure. I don't know if GitHub's machines are just
faster in some way (SSD, maybe) that makes them finish the two
operations in the test in the same millisecond. Let's add a
1-millisecond sleep to see if that helps. If it doesn't, then maybe
the issue is that the clock has lower precision (or their clocks can
go backwards?).
`LockedWorkingCopy::discard()` shouldn't result in changes to the
on-disk state, but `LockedWorkingCopy::check_out()` may have already
written a state file, which is surprising. The changes also remain in
memory, which is also surprising. Let's fix both of those issues.