It would be nice to be able to use snapshot testing and not have to
parse the output of `jj op log`. This patch lets us do that by
providing a new environment variable and config for overriding the
timestamps. Unlike `operation.hostname` and `operation.username`,
these are only meant for tests.
We currently get the hostname and username from the `whoami` crate. We
do that in lib crate, without giving the caller a way to override
them. That seems wrong since it might be used in a server and
performing operations on behalf of some other user. This commit makes
the hostname and username configurable, so the calling crate can pass
them in. If they have not been passed in, we still default to the
values from the `whoami` crate.
We had a recent bug where we used a repo reference from before we
started a transaction and modified the repo. While it's often safe and
correct to use such references, it isn't always. This patch removes
all such cases. I think it generally makes the code clearer, and
better prepared for #50, if we ever get around to that. I found these
by temporarily making `WorkspaceCommandHelper::start_transaction()`
take a mutable reference.
These assertions were there to catch bugs, but when the bugs happen,
the assertions can obsure the underlying error (as @tp-woven found out
on #258). Let's just print errors instead.
It's the transaction's job to create a new operation, and that's where
the knowledge of parent operations is. By moving the logic for merging
in another operation there, we can make it contiuously update its set
of parent operations. That removes the risk of forgetting to add the
merged-in operation as a parent. It also makes it easier to reuse the
function from the CLI so we can inform the user about the process
(which is what I was investigating when I noticed that this cleanup
was possible).
If we have recorded in `MutableRepo` that commits have been abandoned
or rewritten, we should always rebase descendants before committing
the transaction (otherwise there's no reason to record the
rewrites). That's not much of a risk in the CLI because we already
have that logic in a central place there (`finish_transaction()`), but
other users of the library crate could easily miss it. Perhaps we
should automatically do any necessary rebasing we commit the
transaction in the library crate instead, but for now let's just have
a check for that to catch such bugs.
We had a few lines of similar code where we added a new of the
operation log and then removed the old heads. By moving that code into
a new type, we prepare for further refactorings.
I think these remaining implementations of `Drop` are for types that
are infrequently dropped (unlike `Transaction`), so it be fine to be
more strict about them.
The `Repo` doesn't do anything with the `WorkingCopy` except keeping a
reference to it for its users to use. In fact, the entire lib crate
doesn't do antyhing with the `WorkingCopy`. It therefore seems simpler
to have the users of the crate manage the `WorkingCopy` instance. This
patch does that by letting `Workspace` own it. By not keeping an
instance in `Repo`, which is `Sync`, we can also drop the
`Arc<Mutex<>>` wrapping.
I left `Repo::working_copy()` for convenience for now, but now it
creates a new instance every time. It's only used in tests.
This further decoupling should help us add support for multiple
working copies (#13).
It's been a lot of work, but now we're finally able to remove the
`Evolution` state! `jj obslog` still works as before (it just walks
the predecessor pointers).
The removal of hidden heads was just there to help with the transition
away from evolution (#32). Now that we no longer depend on evolution
for removing old heads, we can remove the hack.
I recently made the CLI remove hidden heads when a transaction is
committed (38474a9). Let's move that to `Transaction::commit()`, so
the library crate becomes more similar to how the CLI behaves and more
similar to our evolution-less future (#32).
The two types have become very similar so it doesn't seem that there's
any point in having two types. We should probably do the same with
`ReadonlyEvolution` and `MutableEvolution`.
When using the command line interface (which is the only interface so
far), it seems more useful to see the exact command that was run than
a logical description of what it does. This patch makes the CLI record
that information in the operation metadata in a new key/value field. I
put it in a generic key/value field instead of a more specialized
field because the key/value field seems like a useful thing to have in
general. However, that means that we "have to" do shell-escaping when
saving the data instead of leaving the data unescaped and adding the
shell-escaping when presenting it. I added very simple shell-escaping
for now.
I've wanted the API to look like this for a while. It seems like a
good API to me. It means that the caller won't have to reload the repo
after committing. The cost seems relatively small. It involves copying
potentially a lot of data in memory (at least the View object), but it
shouldn't involve reading from disk or any other processing. To reduce
the amount of data to copy, it may be worth switching to persistent
data types. I've also wanted to do that for the copying we do when
start a transaction.
I couldn't measure any slowdown caused by this change.
I suspect that at least one reason that I didn't make
`MutableRepo::base_repo` by an `Arc<ReadonlyRepo>` before was that I
thought that that would mean that `start_transaction()` would need be
moved off of `ReadonlyRepo` so it can be given an
`&Arc<ReadonlyRepo>`, which would make it much less convenient to
use. It turns out that a `self` argument can actually be of type
`&Arc<ReadonlyRepo>`.
This adds `MutableRepo::merge()`, which applies the difference between
two `ReadonRepo`s to itself. That results in much simpler code than
the current code in `merge_op_heads()`. It also lets us write `undo`
using the new function. Finally -- and this is the actual reason I did
it now -- it prepares for using the index when enforcing view
invariants.
I think the `as_` prefix of `as_repo_mut()` makes it sound like it
returns a view of the `Transaction`, but the `MutableRepo` is actually
a part of it. Also, the convention seems to be to put the `mut_` in
the name first if the function returns a name with a matching name
(like `MutableRepo` does).
This is yet another step towards making the `View` types
simpler. Perhaps we eventually won't need to wrap the types returned
from the `OpStore` at all.
This continues the work to make the `View` types be only about the
state of the current view and not about operations in general (which
has been moved out `OpStore` and qOpHeadsStore`).
It can be useful to write an operation to the `OpStore` without also
making it visible when you load the repo. I had planned to add that
functionality at least for hooks, so the hooks can be run commands
with `jj --at-op=<operation>` and decide whether to publish the
operation. However, the immediate goal is to let us rewrite
`op_heads_store::merge_op_heads()` to use the usual `Transaction`
API. That needs to be able to just write the operation without
publishing it, since the publishing step takes a long, which
`op_heads_store::merge_op_heads()` (its caller, actually) has already
taken.
Most methods on `Transaction` only need the `MutableRepo`, so it makes
for that functionality to be on the latter. That will let us update
the methods to also update the index, which would otherwise have been
harder because it would require a mutable borrow of both the view and
the index. This patch makes most current methods on `Transaction` just
delegate to `MutableRepo`. We may want to remove some of these
delegating methods later.
With this change, we start writing the incremental index to disk, so
the next reader won't have to re-read the commits and create the
index.
As of this change, we simply write a new index file for each
transaction. That will clearly mean that the stack of files gets deep
pretty quickly. For now, the user will have to do `jj debug reindex`
when things get slow. I plan to change it so instead of writing an
incremental index file every time, we first check if the new index
file would have at least as many commits as the parent file, and if it
will, we write a combined one instead. That should apply recursively,
so we'd have O(log n) index files.
I don't know why I made the walk stop at heads instead of indexed
commits before. Perhaps I did it because it's cheap to check in the
set of head. However, it gets very expensive to walk all the way back
to the root if the parents are not in the set of heads.
With tons of groundwork done, wee can now finally keep the index up to
date within a transaction! That means that we can start relying on the
index to always be valid, so we can use it e.g. for finding common
ancestors within a transaction. That should help speed up `jj evolve`
immensely on large repos.
We still don't write the updated index to disk when the transaction
closes. That will come later.
`Transaction::add_head()` currently invalidates the whole evolution
state. We've had support for incrementally updating evolution since
4619942a57. We should start taking advantage of that. Let's add a
fast-path in `Transaction::add_head()` for the common case where we
add a single commit on top of an existing head. That cheap an simple
to check for. However, it won't cover the case of adding a child off
of a non-head. It's still a good start.
I want to keep the index updated within the transaction. I tried doing
that by adding a `trait Index`, implemented by `ReadonlyIndex` and
`MutableIndex`. However, `ReadonlyRepo::index` is of type
`Mutex<Option<Arc<IndexFile>>>` (because it is lazily initialized),
and we cannot get a `&dyn Index` that lives long enough to be returned
from a `Repo::index()` from that. It seems the best solution is to
instead create an `Index` enum (instead of a trait), with one readonly
and one mutable variant. This commit starts the migration to that
design by replacing the `Repo` trait by an enum. I never intended for
there there to be more implementations of `Repo` than `ReadonlyRepo`
and `MutableRepo` anyway.
When a transaction gets dropped without being committed or explicitly
discarded, we currently raise an assertion error. I added that check
because I kept forgetting to commit transactions. However, it's quite
normal to want to drop transactions in error cases. The current
assertion means that we panic and don't report the actual error to the
user in such cases. We should probably audit the code paths where we
commit transactions and decide for each if we simply want to to
discard the transaction or not. In some cases, we may want to commit
the transaction without integrating it in the operation log
(i.e. without creating a file entry in .jj/views/op_heads/). However,
we can do that later. For now, let's just make sure we don't panic
when dropping the transaction in release builds.