Evolution needs to have fast access to the predecessors. This change
adds that information to the commit index.
Evolution also needs fast access to the change id and the bit saying
whether a commit is pruned. We'll add those soon.
Some tests changed because they previously added commits with
predecessors that were not indexed, which is no longer allowed from
this change. (We'll probably eventually want to allow that again, so
that the user can prune predecessors they no longer care about from
the repo.)
We currently write a new incremental index file every time. That means
that the stack of index files quickly gets deep, which makes it slow
to read the index. This commit makes it so that we squash the new
index segment into its parent if the parent has fewer commits. That
means we'll limit the number of files to O(log n). Writes time will
also be O(log n) on average.
I've confused myself a few times already thinking that level 0 is the
root, so that's probably more intuitive. It also makes tests simpler
because the initial part of the list is unchanged when a new
transaction commits.
With this change, we start writing the incremental index to disk, so
the next reader won't have to re-read the commits and create the
index.
As of this change, we simply write a new index file for each
transaction. That will clearly mean that the stack of files gets deep
pretty quickly. For now, the user will have to do `jj debug reindex`
when things get slow. I plan to change it so instead of writing an
incremental index file every time, we first check if the new index
file would have at least as many commits as the parent file, and if it
will, we write a combined one instead. That should apply recursively,
so we'd have O(log n) index files.
The check for adding an existing commit to the index only checked if
the commit was already in the `MutableIndex`, not if it was already in
the parent `ReadonlyIndex`.
With tons of groundwork done, wee can now finally keep the index up to
date within a transaction! That means that we can start relying on the
index to always be valid, so we can use it e.g. for finding common
ancestors within a transaction. That should help speed up `jj evolve`
immensely on large repos.
We still don't write the updated index to disk when the transaction
closes. That will come later.
`Transaction::add_head()` currently invalidates the whole evolution
state. We've had support for incrementally updating evolution since
4619942a57. We should start taking advantage of that. Let's add a
fast-path in `Transaction::add_head()` for the common case where we
add a single commit on top of an existing head. That cheap an simple
to check for. However, it won't cover the case of adding a child off
of a non-head. It's still a good start.
I want to keep the index updated within the transaction. I tried doing
that by adding a `trait Index`, implemented by `ReadonlyIndex` and
`MutableIndex`. However, `ReadonlyRepo::index` is of type
`Mutex<Option<Arc<IndexFile>>>` (because it is lazily initialized),
and we cannot get a `&dyn Index` that lives long enough to be returned
from a `Repo::index()` from that. It seems the best solution is to
instead create an `Index` enum (instead of a trait), with one readonly
and one mutable variant. This commit starts the migration to that
design by replacing the `Repo` trait by an enum. I never intended for
there there to be more implementations of `Repo` than `ReadonlyRepo`
and `MutableRepo` anyway.
I've been confused twice that rebasing an open commit so it results in
conflicts doesn't show the conflicts in the log output. That's because
we create a successor instead if a commit with conflicts is open. I
guess I thought it would be expected that a child commit was not
created. Since it seems surprising in practice, let's change it and
we'll see if the new behavior is more or less surprising.
We want to be able to be able to do fast `.contains()` checks on the
result, so `Iterator` was a bad type. We probably should hide the
exact type (currently `HashSet` for both readonly and mutable views),
but we can do that later. I actually thought I'd want to use
`.contains()` for indiciting public-phase commits in the log output,
but of course want to also indicate ancestors as public. This still
seem like a step (mostly) in the right direction.
Mercurial's "phase" concept is important for evolution, and it's also
useful for filtering out uninteresting commits from log
output. Commits are typically marked "public" when they are pushed to
a remote. The CLI prevents public commits from being rewritten. Public
commits cannot be obsolete (even if they have a successor, they won't
be considered obsolete like non-public commits would).
This commits just makes space for tracking the public heads in the
View.
All commits in the view are supposed to be reachable from its
heads. If a head is removed and there are git refs pointing to
ancestors of it (or to the removed head itself), we should make that
ancestor a head.
I think it's better to let the caller decide if the parents should be
added. One use case for removing a head is when fetching from a Git
remote where a branch has been rewritten. In that case, it's probably
the best user experience to remove the old head. With the current
semantics of `View::remove_head()`, we would need to walk up the graph
to find a commit that's an ancestor and for each commit we remove as
head, its parents get temporarily added as heads. It's much easier for
callers that want to add the parents as heads to do that.
Git refs are important at least for understanding where the remote
branches are. This commit adds support for tracking them in the view
and makes `git::import_refs()` update them.
When merging views (either because of concurrent operations or when
undoing an earlier operation), there can be conflicts between git ref
changes. I ignored that for now and let the later operation win. That
will probably be good enough for a while. It's not hard to detect the
conflicts, but I haven't yet decided how to handle them. I'm leaning
towards representing the conflicting refs in the view just like how we
represent conflicting files in the tree.
This commits makes it so that running commands outside a repo results
in an error message instead of a panic.
We still don't look for a `.jj/` directory in ancestors of the current
directory.
I'm preparing to publish an early version before someone takes the
name(s) on crates.io. "jj" has been taken by a seemingly useless
project, but "jujube" and "jujube-lib" are still available, so let's
use those.
When you run e.g. `jj st` outside of a repo, it just
crashes. That'll probably give new users a bad impression, so I
was planning to improve error handling a bit. A good place to
start is by fixing the code I recently added (which obviously
should have been using `thiserror` from the beginning). That's
what this commit does.
Also, this is the first commit in this repo created with
Jujube! I've just started dogfooding it myself.
This adds `jj git fetch` for fetching from a git remote. There remote
has to be added in the underlying git repo if it doesn't already
exist. I think command will still be useful on typical small projects
with just a single remote on GitHub. With this and the `jj git push` I
added recently, I think I have enough for my most of my own
interaction with GitHub.
The fact that no commits from the underlying Git repo were imported
when creating a new Jujube repo from it was quite surprising. This
commit finally fixes that.
When using Git as a store, new commits created in the underlying Git
repo are only made visible by making changes on top of them (e.g by
checking them out, so a working copy commit is created on top). That's
especially confusing when creating a new repo backed by an existing
Git repo, because the commits from that repo don't show up.
This commit prepares for fixing that by adding a function for updating
heads based on git refs. Since we don't yet track git refs (or
anything similar), the function just makes sure the refs are visible
in the Jujube repo by making them (anonymous) heads.
`Transaction::add_head()` and others would let the caller add
non-heads to the set (i.e. ancestors of others heads) and the the
non-heads were filterd out when the transaction was committed. That's
a little surprising, so let's try to keep the set valid even within a
transaction. That will surely make commands that add many commits
noticeably slower in large repos. Hopefully we can improve that
later.
It's annoying to have to have the Git repo and Jujube repo in separate
directories. This commit adds `jj init --git`, which creates a new
Jujube repo with an empty, bare git repo in `.jj/git/`. Hopefully the
`jj git` subcommands will eventually provide enough functionality for
working with the Git repo that the user won't have to use Git commands
directly. If they still do, they can run them from inside `.jj/git/`,
or create a new worktree based on that bare repo.
The implementation is quite straight-forward. One thing to note is
that I made `.jj/store` support relative paths to the Git repo. That's
mostly so the Jujube repo can be moved around freely.
A pruned commit just indicates that its predecessors should be evolved
onto the pruned commit's parent instead of onto the pruned commit
itself. The pruned commit itself can be divergent. For example, if
there are several pruned sucessors of a commit, then it's unclear
where the predecessor's children should be rebased to.
Before this commit, when the `evolve()` evolved a stack of orphans, it
would use the evolve state from the beginning of the function to
calculate where they should go. That meant that only the bottom-most
orphan(s) would get evolved to their right place. This commit fixes
that by use the Transaction's evolution state.
The point of having the `modified` and `removed` files in the test was
to show that they don't get untracked, but I forgot to include them in
the `.gitignores`, so there was no reason they would have gotten
untracked anyway.
The project's source of truth is now in Git and I really miss support
for anonymous heads and evolution (compared to when the code was in
Mercurial). I'm therefore more motivated to make the tool useful for
day-to-day work on small repos, so I can use it myself. Until now, I
had been more focused on improving performance when it was used as a
read-only client for medium-to-large repos.
One important feature for my day-to-day work is support for
ignores. This commit adds simple and effective, but somewhat hacky
support for that. libgit2 requires a repo to check if a file should be
ignored (presumably so it can respect `.git/info/excludes`). To work
around that, we create a temporary git repo in `/tmp/` whenever the
working copy is committed. We set that temporary git repo's working
copy to be shared with our own working copy. Due to
https://github.com/libgit2/libgit2sharp/issues/1716 (which seems to
apply to the non-.NET version as well), this workaround unfortunately
leaves a .git file (pointing to the deleted temporary git repo) around
in every Jujube repo. That's always ignored by libgit2, so it's not
much of a problem.
This removes one level of indirection, which is nice because it was
visible to the callers. The `Index` struct is now empty. The next step
is obviously to delete it (and perhaps rename `IndexFile` to `Index`
or `ReadonlyIndex`).