2667 commits

Author SHA1 Message Date
Martin von Zweigbergk
88904e2b63 revsets: add support for function syntax
This adds `parents(foo)` and `ancestors(foo)` as alternative ways of
writing `:foo` and `*:foo`. 

I haven't added support for whitespace yet; the parsing is very
strict. The error messages will also need to be improved later.
2021-04-18 21:25:58 -07:00
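As a rough illustration of the commit above, the sketch below shows how the function spellings and the operator spellings could parse to the same expression. The `RevsetExpr` enum, `parse()` and `function_arg()` are hypothetical names for this sketch only; the actual implementation uses a pest grammar, which is not reproduced here.

```rust
/// Hypothetical AST for this sketch; not the crate's real types.
#[derive(Debug, PartialEq)]
enum RevsetExpr {
    Symbol(String),
    Parents(Box<RevsetExpr>),
    Ancestors(Box<RevsetExpr>),
}

/// Strict parser: no whitespace is accepted, mirroring the limitation
/// mentioned in the commit message.
fn parse(input: &str) -> Result<RevsetExpr, String> {
    if let Some(rest) = input.strip_prefix("*:") {
        Ok(RevsetExpr::Ancestors(Box::new(parse(rest)?)))
    } else if let Some(rest) = input.strip_prefix(':') {
        Ok(RevsetExpr::Parents(Box::new(parse(rest)?)))
    } else if let Some(arg) = function_arg(input, "parents") {
        Ok(RevsetExpr::Parents(Box::new(parse(arg)?)))
    } else if let Some(arg) = function_arg(input, "ancestors") {
        Ok(RevsetExpr::Ancestors(Box::new(parse(arg)?)))
    } else if !input.is_empty() && input.chars().all(|c| c.is_ascii_alphanumeric()) {
        Ok(RevsetExpr::Symbol(input.to_string()))
    } else {
        Err(format!("invalid revset: {:?}", input))
    }
}

/// Returns the text between `name(` and the final `)`, if `input` has that shape.
fn function_arg<'a>(input: &'a str, name: &str) -> Option<&'a str> {
    input.strip_prefix(name)?.strip_prefix('(')?.strip_suffix(')')
}
```

With this sketch, `parse("parents(foo)")` and `parse(":foo")` produce equal values, and `parse("parents( foo)")` is rejected because of the embedded space.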
Martin von Zweigbergk
2d6325b0f4 revsets: define grammar in pest 2021-04-18 21:25:58 -07:00
Martin von Zweigbergk
0d62a336af revsets: initial support for Mercurial-style revsets
This patch adds initial support for a DSL for specifying revisions
inspired by Mercurial's "revset" language. The initial support
includes prefix operators ":" (parents) and "*:" (ancestors) with
naive parsing of the revsets. Mercurial uses postfix operator "^" for
parent 1 just like Git does. It uses prefix operator "::" for
ancestors and the same operator as postfix operator for descendants. I
did it differently because I like the idea of using the same operator
as prefix/postfix depending on desired direction, so I wanted to apply
that to parents/children as well (and for
predecessors/successors). The "*" in the "*:" operator is copied from
regular expression syntax. Let's see how it works out. This is an
experimental VCS, after all.

I've updated the CLI to use the new revset support.

The implementation feels a little messy, but you have to start
somewhere...
2021-04-18 21:25:51 -07:00
Martin von Zweigbergk
7861968f64 index: make IndexRef::entry_by_id() etc return entry with repo's lifetime
It's useful to be able to know that given a `repo: RepoRef<'a>`, the
lifetime of `repo.index().entry_by_id()` will also be `'a`.
2021-04-15 07:00:04 -07:00
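The lifetime point in the commit above can be sketched as follows. The types here are simplified, hypothetical stand-ins (an index holding plain strings, looked up by position); only the shape of the signatures is meant to match the description.

```rust
// Simplified stand-ins for this sketch; not the crate's real definitions.
struct Index {
    entries: Vec<String>,
}

struct Repo {
    index: Index,
}

#[derive(Clone, Copy)]
struct RepoRef<'a> {
    repo: &'a Repo,
}

#[derive(Clone, Copy)]
struct IndexRef<'a> {
    index: &'a Index,
}

impl<'a> RepoRef<'a> {
    fn index(&self) -> IndexRef<'a> {
        IndexRef { index: &self.repo.index }
    }
}

impl<'a> IndexRef<'a> {
    // The result borrows from the repo (`'a`), not from `&self`.
    fn entry_by_pos(&self, pos: usize) -> Option<&'a str> {
        self.index.entries.get(pos).map(|s| s.as_str())
    }
}

fn first_entry<'a>(repo: RepoRef<'a>) -> Option<&'a str> {
    // `repo.index()` is a temporary, yet the returned entry outlives it.
    repo.index().entry_by_pos(0)
}
```

If `entry_by_pos()` instead returned a borrow tied to `&self`, `first_entry()` would not compile, because the temporary `IndexRef` is dropped at the end of the statement.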
Martin von Zweigbergk
4c3d73ff3b evolution: walk orphans using index
This actually seems to make it slightly slower, but it fixes an
important bug (we used to evolve only one topological branch per `jj
evolve` call). The slowdown seemed to be on the order of 5% when
evolving 100 commits on git.git's "what's cooking" branch.
2021-04-14 08:25:14 -07:00
Martin von Zweigbergk
783e1f6512 repo: make MutableRepo have an Arc<ReadonlyRepo> instead of a reference
I suspect that at least one reason that I didn't make
`MutableRepo::base_repo` be an `Arc<ReadonlyRepo>` before was that I
thought that that would mean that `start_transaction()` would need to be
moved off of `ReadonlyRepo` so it can be given an
`&Arc<ReadonlyRepo>`, which would make it much less convenient to
use. It turns out that a `self` argument can actually be of type
`&Arc<ReadonlyRepo>`.
2021-04-11 13:42:31 -07:00
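A minimal sketch of the `self: &Arc<Self>` receiver mentioned in the commit above. The struct fields are hypothetical and heavily simplified; only the receiver type is the point.

```rust
use std::sync::Arc;

// Simplified stand-ins for this sketch.
struct ReadonlyRepo {
    name: String,
}

struct Transaction {
    base_repo: Arc<ReadonlyRepo>,
}

impl ReadonlyRepo {
    // The method stays on `ReadonlyRepo`, but it can only be called through
    // an `Arc`, so it can cheaply clone the `Arc` into the transaction.
    fn start_transaction(self: &Arc<Self>) -> Transaction {
        Transaction { base_repo: Arc::clone(self) }
    }
}
```

A caller holding `repo: Arc<ReadonlyRepo>` still writes `repo.start_transaction()`, which is why the change does not make the API less convenient.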
Martin von Zweigbergk
ce855bccfa repo: make reload() and reload_at() return a new ReadonlyRepo
After this patch `ReadonlyRepo` is even closer to readonly. That makes
it easier to reason about. It will allow some further cleanups too.
2021-04-11 10:39:29 -07:00
Martin von Zweigbergk
e3ca27bf77 revsets: support git refs 2021-04-10 10:10:09 -07:00
Martin von Zweigbergk
40f75ec641 revsets: don't crash if given non-hex symbol 2021-04-10 10:08:47 -07:00
Martin von Zweigbergk
9e8a7e2ba6 revsets: move code for resolving symbol to commit to new module 2021-04-10 09:46:27 -07:00
Martin von Zweigbergk
102f7a0416 diff: also recurse into final region after unchanged regions
See test case for details.


Before:
test bench_diff_10k_lines_reversed  ... bench:  36,249,659 ns/iter (+/- 174,455)
test bench_diff_10k_modified_lines  ... bench:  37,258,890 ns/iter (+/- 803,963)
test bench_diff_10k_unchanged_lines ... bench:       4,252 ns/iter (+/- 69)
test bench_diff_1k_lines_reversed   ... bench:     982,834 ns/iter (+/- 6,467)
test bench_diff_1k_modified_lines   ... bench:   3,343,469 ns/iter (+/- 23,243)
test bench_diff_1k_unchanged_lines  ... bench:         231 ns/iter (+/- 2)
test bench_diff_git_git_read_tree_c ... bench:      95,559 ns/iter (+/- 816)


After:
test bench_diff_10k_lines_reversed  ... bench:  36,186,715 ns/iter (+/- 196,903)
test bench_diff_10k_modified_lines  ... bench:  37,511,000 ns/iter (+/- 1,370,476)
test bench_diff_10k_unchanged_lines ... bench:       3,099 ns/iter (+/- 8)
test bench_diff_1k_lines_reversed   ... bench:     986,010 ns/iter (+/- 11,565)
test bench_diff_1k_modified_lines   ... bench:   3,370,938 ns/iter (+/- 17,041)
test bench_diff_1k_unchanged_lines  ... bench:         230 ns/iter (+/- 2)
test bench_diff_git_git_read_tree_c ... bench:     102,189 ns/iter (+/- 1,052)


So this patch makes diffing even slower (but still easily fast enough
for all cases I've run into in real life). There's probably a lot that
can be done to make things faster, but the first priority is that the
diffs are correct and easy to read.
2021-04-08 23:54:54 -07:00
Martin von Zweigbergk
f4a41f3880 trees: make tree diff return an iterator instead of taking a callback
This is yet another step towards making it easy to propagate
`BrokenPipe` errors. The `jj diff` code (naturally) diffs two trees
and prints the diffs. If the printing fails, we shouldn't just crash
like we do today.

The new code is probably slower since it does more copying (the
callback got references to the `FileRepoPath` and `TreeValue`). I hope
that won't make a noticeable difference. At least `jj diff -r
334afbc76fbd --summary` didn't seem to get measurably slower.
2021-04-07 23:18:00 -07:00
Martin von Zweigbergk
8b2ce18254 trees: make diff_entries() return an iterator instead of taking a callback
The iterator version is easier to use and we get rid of the ugly type
parameter for the error type. I also simplified the code by using
`Peekable` iterators.
2021-04-07 15:48:11 -07:00
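To illustrate the `Peekable` approach described in the commit above, here is a sketch that walks two name-sorted entry lists in lockstep and yields only the differences. The function name matches the commit title, but the signature and the `(&str, u32)` entry type are assumptions made for this example.

```rust
use std::cmp::Ordering;

/// Yields `(name, before, after)` for every entry that was added, removed,
/// or modified. Both inputs must be sorted by name.
fn diff_entries<'a>(
    before: &'a [(&'a str, u32)],
    after: &'a [(&'a str, u32)],
) -> impl Iterator<Item = (&'a str, Option<u32>, Option<u32>)> + 'a {
    let mut left = before.iter().copied().peekable();
    let mut right = after.iter().copied().peekable();
    std::iter::from_fn(move || {
        loop {
            // Peek at both sides to decide which one to advance.
            let order = match (left.peek(), right.peek()) {
                (None, None) => return None,
                (Some(_), None) => Ordering::Less,
                (None, Some(_)) => Ordering::Greater,
                (Some((l, _)), Some((r, _))) => l.cmp(r),
            };
            match order {
                Ordering::Less => {
                    let (name, value) = left.next().unwrap();
                    return Some((name, Some(value), None));
                }
                Ordering::Greater => {
                    let (name, value) = right.next().unwrap();
                    return Some((name, None, Some(value)));
                }
                Ordering::Equal => {
                    let (name, old) = left.next().unwrap();
                    let (_, new) = right.next().unwrap();
                    if old != new {
                        return Some((name, Some(old), Some(new)));
                    }
                    // Unchanged entry: keep looping.
                }
            }
        }
    })
}
```

Because the result is an ordinary iterator, no error type parameter is needed; the caller decides how to handle failures while consuming it.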
Martin von Zweigbergk
5c10c93e64 diff: fix tests broken by the previous commit
Sorry, I forgot to run the automated tests again :(
2021-04-07 11:00:04 -07:00
Martin von Zweigbergk
0dd000d236 diff: do final refinement at byte-level for non-word bytes
This results in significantly more readable diffs on commits like
659393bec2 in this repo.


Before:
test bench_diff_10k_lines_reversed  ... bench:  38,122,998 ns/iter (+/- 557,688)
test bench_diff_10k_modified_lines  ... bench:  32,556,563 ns/iter (+/- 548,114)
test bench_diff_10k_unchanged_lines ... bench:       4,231 ns/iter (+/- 15)
test bench_diff_1k_lines_reversed   ... bench:     958,296 ns/iter (+/- 46,963)
test bench_diff_1k_modified_lines   ... bench:   3,014,723 ns/iter (+/- 15,830)
test bench_diff_1k_unchanged_lines  ... bench:         249 ns/iter (+/- 2)
test bench_diff_git_git_read_tree_c ... bench:      78,599 ns/iter (+/- 1,079)

After:
test bench_diff_10k_lines_reversed  ... bench:  38,289,493 ns/iter (+/- 413,712)
test bench_diff_10k_modified_lines  ... bench:  37,352,516 ns/iter (+/- 1,293,950)
test bench_diff_10k_unchanged_lines ... bench:       4,238 ns/iter (+/- 13)
test bench_diff_1k_lines_reversed   ... bench:     967,253 ns/iter (+/- 8,506)
test bench_diff_1k_modified_lines   ... bench:   3,358,028 ns/iter (+/- 37,154)
test bench_diff_1k_unchanged_lines  ... bench:         233 ns/iter (+/- 1)
test bench_diff_git_git_read_tree_c ... bench:      95,787 ns/iter (+/- 740)


So the biggest slowdown is when there are modified lines.
2021-04-07 10:27:17 -07:00
Martin von Zweigbergk
f634ff0e3f files: make diff() return an iterator instead of using a callback
Iterators are generally nicer to work with. My immediate goal is to be
able to propagate errors when failing to write to stdout.
2021-04-07 10:07:18 -07:00
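The error-propagation motivation in the commit above can be sketched like this. `changed_lines()` is a hypothetical stand-in that merely compares lines by position (it is not a real diff algorithm), and `ClosedPipe` is a test writer invented for the example.

```rust
use std::io::{self, Write};

/// Stand-in "diff": pairs lines by position and yields those that differ.
fn changed_lines<'a>(
    left: &'a [&'a str],
    right: &'a [&'a str],
) -> impl Iterator<Item = (&'a str, &'a str)> + 'a {
    left.iter().copied().zip(right.iter().copied()).filter(|(l, r)| l != r)
}

/// With an iterator, a failed write is propagated with `?` and the loop stops.
fn print_diff(out: &mut dyn Write, left: &[&str], right: &[&str]) -> io::Result<()> {
    for (old, new) in changed_lines(left, right) {
        writeln!(out, "-{}", old)?;
        writeln!(out, "+{}", new)?;
    }
    Ok(())
}

/// A writer that always fails, simulating a closed pipe.
struct ClosedPipe;

impl Write for ClosedPipe {
    fn write(&mut self, _buf: &[u8]) -> io::Result<usize> {
        Err(io::Error::from(io::ErrorKind::BrokenPipe))
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

Calling `print_diff(&mut ClosedPipe, ...)` returns the `BrokenPipe` error to the caller instead of panicking, which is harder to arrange when the diff code drives a callback.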
Martin von Zweigbergk
d7395cc34a diff: add copyright header 2021-04-06 21:26:37 -07:00
Martin von Zweigbergk
7e4e43f358 diff: first diff lines, then refine to words, producing better diffs
The new diff algorithm produces pretty bad diffs in some cases, such
as cc4b1e9230 in this repo (the parent of this commit). I think the
problem there is that many words are repeated over and over. Diffing
first at the line level and then refining the diff of the changed
ranges at the word level gives much better results. That's what this
patch does. After this patch, `jj diff -r cc4b1e923091` looks pretty
similar to the diff in GitHub's UI.

I hope to get around to doing the same for the merge code soon.

Impact on benchmarks:

Before:
test bench_diff_10k_lines_reversed  ... bench:  42,647,532 ns/iter (+/- 765,347)
test bench_diff_10k_modified_lines  ... bench:  21,407,980 ns/iter (+/- 126,366)
test bench_diff_10k_unchanged_lines ... bench:       4,235 ns/iter (+/- 16)
test bench_diff_1k_lines_reversed   ... bench:   1,190,483 ns/iter (+/- 7,192)
test bench_diff_1k_modified_lines   ... bench:   1,919,766 ns/iter (+/- 9,665)
test bench_diff_1k_unchanged_lines  ... bench:         231 ns/iter (+/- 1)
test bench_diff_git_git_read_tree_c ... bench:     174,702 ns/iter (+/- 1,199)

After:
test bench_diff_10k_lines_reversed  ... bench:  38,289,509 ns/iter (+/- 129,004)
test bench_diff_10k_modified_lines  ... bench:  33,140,659 ns/iter (+/- 3,989,339)
test bench_diff_10k_unchanged_lines ... bench:       3,099 ns/iter (+/- 14)
test bench_diff_1k_lines_reversed   ... bench:     973,551 ns/iter (+/- 94,895)
test bench_diff_1k_modified_lines   ... bench:   3,033,818 ns/iter (+/- 29,513)
test bench_diff_1k_unchanged_lines  ... bench:         230 ns/iter (+/- 1)
test bench_diff_git_git_read_tree_c ... bench:      79,100 ns/iter (+/- 963)


So most of them get slower, as expected. The last one, taken from a
real diff in the git.git repo, gets faster, however (which is also what
I would have expected).
2021-04-04 21:50:31 -07:00
Martin von Zweigbergk
cc4b1e9230 test: fix merge tests to expect line-based merging
I made a quite late change in a recent patch to make the merge code
merge based on lines instead of words. I forgot to update the tests
(and to even run them). Sorry :(
2021-04-01 08:27:27 -07:00
Martin von Zweigbergk
c071d412af diff: use new diff algorithm for content diff
The previous patch switched over the content-merge code to use the new
histogram diff code. This patch switches over the content-diff code to
use the histogram diff code. As before, the immediate goal is to speed
it up. `jj diff -r c28ded83fc` in the git.git repo is a good example
of a diff that's extremely slow to calculate with our current
LCS-based diff. With this patch, that drops from 35 s to 0.12 s.

The diff was slightly better before. I think that's mostly because of
our different definition of a "word" in the data. We can improve that
later. The speedup we get now is easily worth the slightly worse diff.
2021-03-31 22:22:59 -07:00
Martin von Zweigbergk
3c35dbace6 merge: use new diff algorithm for finding sync regions
With the histogram diff code from the previous patch, we can now start
using that for finding the "sync regions" in 3-way merge. That helps a
lot with the slow merging we had before this patch. `jj diff -r
9d540e9726` in the git.git repo drops from 22 s to 0.15 s with this
patch. (That commit is a rather arbitrary merge commit from around 5
years ago.)

With the new diff algorithm, the output of `jj diff -r 9d540e9726` in
git.git looks better if we find unchanged sync regions based on lines
than on words, so that's what I'm using in this patch. That's a change
compared to the LCS-based diff we used before this patch. I suspect
the reason that finding sync regions based on words works worse now is
not because of the change from LCS to histogram but because of the
change in how we define a word. My goal right now is mostly to make it
faster; I'll get back to refining the diff result later.
2021-03-31 22:16:19 -07:00
Martin von Zweigbergk
1e657c5331 diff: add a histogram(-like?) diff algorithm
The current diff algorithm does a full LCS on the words of the texts,
which is really slow. Diffing the working copy when e.g.
`src/commands.py` has changes far apart takes seconds. This patch adds
an implementation inspired by JGit's Histogram diff. I say "inspired"
because I just didn't quite understand it :P In particular, I didn't
understand what it does when it finds non-unique elements. I decided
to line up the leading common elements on both sides of the merge. I
don't know if that usually gives good enough results in practice.

I'm sure this can still be optimized a lot, but this seems good enough
as a start. There are also many things to improve about the quality of
the diffs.
2021-03-31 22:15:36 -07:00
Martin von Zweigbergk
998e23db3c index: add IndexEntry::parents() and predecessors() returning Vec<IndexEntry> 2021-03-31 14:48:03 -07:00
Martin von Zweigbergk
53d1757994 dag_walk: remove unused TopoIter 2021-03-18 16:42:30 -07:00
Martin von Zweigbergk
db4e8bc458 cargo: upgrade to protobuf 2.22.1 to avoid workaround for rustfmt::skip 2021-03-18 13:06:42 -07:00
Martin von Zweigbergk
07c2b2316f repo: remove obsolete part of a TODO (we use the index to filter out non-heads) 2021-03-17 08:28:21 -07:00
Martin von Zweigbergk
30cd94f842 dag_walk: rename unreachable() to heads() to match name we use in index module 2021-03-16 23:54:51 -07:00
Martin von Zweigbergk
5aec8b9d77 evolution: use index for filtering out ancestors of candidates in new_parent()
This speeds up `jj evolve` of 100 linear commits of the "what's
cooking" branch in the git.git repo further, from ~700 ms to ~400 ms.
2021-03-16 23:43:44 -07:00
Martin von Zweigbergk
73f20c8696 transaction: delete write_commit() and as_repo_ref() helpers
With this patch, the simple delegating helpers are gone from
`Transaction`.
2021-03-16 22:45:58 -07:00
Martin von Zweigbergk
f9873c49ec transaction: remove add_head(), remove_head(), and set_view() helpers 2021-03-16 22:31:28 -07:00
Martin von Zweigbergk
06df609482 transaction: delete check_out() and set_checkout() helpers 2021-03-16 22:31:28 -07:00
Martin von Zweigbergk
808d0af66d transaction: remove evolution() and store() helpers 2021-03-16 22:31:24 -07:00
Martin von Zweigbergk
16d97ef8c0 transaction: remove index() and view() helpers 2021-03-16 22:05:51 -07:00
Martin von Zweigbergk
5ed14185a0 git: take a MutableRepo instead of a Transaction 2021-03-16 22:05:51 -07:00
Martin von Zweigbergk
769f88bbae tests: rename test_transaction to test_mut_repo
The test doesn't test any logic in the `Transaction` type itself
anymore.
2021-03-16 22:05:51 -07:00
Martin von Zweigbergk
2c2b5fb3b7 evolution: take a MutableRepo instead of a Transaction 2021-03-16 22:05:51 -07:00
Martin von Zweigbergk
c3b9d1cd13 rewrite: take a MutableRepo instead of a Transaction 2021-03-16 22:05:51 -07:00
Martin von Zweigbergk
ee8423a69e MutableRepo: rename repo to base_repo to clarify its role 2021-03-16 22:05:50 -07:00
Martin von Zweigbergk
69de4698ac tests: set $HOME in a few tests to avoid depending on developer's ~/.gitignore
I just changed my `~/.gitignore` and some tests started failing
because the working copy respects the user's `~/.gitignore`. We should
probably not depend on `$HOME` in the library crate. For now, this
patch just makes sure we set it to an arbitrary directory in the tests
where it matters.
2021-03-16 22:05:36 -07:00
Martin von Zweigbergk
67e11e0fc3 git_store: wait 1 minute for lock on refs to help tests
`test_commit_parallel` was failing on Mac in the GitHub CI. I suspect
the reason was that it was timing out. The test runs in about 1 s on
my Linux desktop and in about 3 s on my Mac laptop. It failed after 31 s
in the GitHub CI. This patch increases the timeout to 1 minute to try
to make the test pass. It would be better to set the timeout to a
higher value only in tests, but this will be good enough for now. By
the way, it has turned out that git notes (at least libgit2's
implementation of them) are too slow, so we should probably eventually
create our own storage for the extra metadata instead.
2021-03-16 11:28:22 -07:00
Martin von Zweigbergk
81a0e0bd2a protobuf: upgrade to version 2.22.0
I only noticed that there was a newer version when running `cargo
install --path .`, which resulted in warnings about deprecated
functions. There's no other reason I'm aware of to upgrade now.
2021-03-15 17:09:29 -07:00
Martin von Zweigbergk
1ebdd4ecf0 MutableRepo: use index when enforcing view invariants
We can now finally use the commit index for filtering out ancestors
from the sets of heads.

I haven't timed the change from most of the recent work on
performance, but I did a measurement after this commit. I modified a
commit in the git.git repo's "what's cooking" branch (because that's
linear). Then I ran `jj evolve` so the 100 commits after it would get
evolved. That took ~700ms. `git rebase` of the same 100 commits took
~6s.

I also compared `jj op undo` of that `jj evolve` operation. With this
patch, that was sped up from ~6.8s to ~125ms.
2021-03-15 16:35:45 -07:00
Martin von Zweigbergk
3ecb4ec16b MutableRepo: in fast-path for adding head, simply remove parent heads 2021-03-15 15:38:09 -07:00
Martin von Zweigbergk
2c92fca75a MutableView: don't require whole Commit when CommitId is enough 2021-03-15 15:36:03 -07:00
Martin von Zweigbergk
b4b1de3ddc view: let MutableRepo enforce view invariants
`MutableRepo` has more information needed for taking fast-paths, and
it will have to make the same decision for doing incremental updates
of the evolution state anyway.
2021-03-15 15:17:36 -07:00
Martin von Zweigbergk
b9fe944e76 view: remove unnecessary removing of parents in add_head()
We call `enforce_invariants()` right after removing the parent
commits, and that will remove parents anyway.
2021-03-15 15:06:14 -07:00
Martin von Zweigbergk
12a47bd6ed MutableRepo: don't calculate evolution state only to update it 2021-03-15 15:03:50 -07:00
Martin von Zweigbergk
f0619c07ac MutableEvolution: make MutableRepo responsible for lazy calculation
This patch continues the work from the previous patch. From this
patch, we no longer calculate the evolution state just because a
transaction starts. We still unnecessarily calculate it when adding a
commit within the transaction, however. I'll fix that next.
2021-03-15 15:03:14 -07:00
Martin von Zweigbergk
61acee52f4 ReadonlyEvolution: make ReadonlyRepo responsible for lazy calculation
This patch changes it so that `ReadonlyEvolution` does not lazily
calculate its state and the caller, i.e. `ReadonlyRepo`, is instead
responsible for the laziness. That will allow the caller to make
decisions based on whether the state has been
calculated. Specifically, we don't want to calculate the evolution
state in order to update it incrementally if it hasn't already been
calculated. It's better to just leave it uncalculated in that case.

As a result of moving the laziness out of `ReadonlyEvolution`, we also
don't need the reference to `ReadonlyRepo` anymore, which
simplifies things a bunch. The next patch will continue by making the
corresponding change to `MutableEvolution`, which will let us simplify
even more.
2021-03-15 14:41:27 -07:00
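A sketch of the arrangement described in the commit above, where the repo owns the laziness and can therefore ask whether the state has been calculated yet. The field names, the `Mutex<Option<...>>` slot, and the placeholder `commit_count` state are all assumptions for illustration.

```rust
use std::sync::{Arc, Mutex};

/// Plain, eagerly calculated state; it knows nothing about laziness or the repo.
struct ReadonlyEvolution {
    commit_count: usize, // placeholder for the real evolution state
}

impl ReadonlyEvolution {
    fn calculate(commits: &[u32]) -> Self {
        ReadonlyEvolution { commit_count: commits.len() }
    }
}

struct ReadonlyRepo {
    commits: Vec<u32>,
    // `None` until someone actually asks for the evolution state.
    evolution: Mutex<Option<Arc<ReadonlyEvolution>>>,
}

impl ReadonlyRepo {
    /// Calculates the state on first use and caches it.
    fn evolution(&self) -> Arc<ReadonlyEvolution> {
        let mut slot = self.evolution.lock().unwrap();
        Arc::clone(
            slot.get_or_insert_with(|| Arc::new(ReadonlyEvolution::calculate(&self.commits))),
        )
    }

    /// Lets callers skip incremental updates when nothing has been calculated.
    fn evolution_if_calculated(&self) -> Option<Arc<ReadonlyEvolution>> {
        self.evolution.lock().unwrap().clone()
    }
}
```

A transaction could call `evolution_if_calculated()` and, on `None`, simply leave the state uncalculated rather than computing it only to update it.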
Martin von Zweigbergk
43315bc9d2 git: fix bad formatting from commit 1e9d428406 2021-03-14 22:28:12 -07:00
Martin von Zweigbergk
91117f36b6 cargo: work around warning in generated protobuf code with new nightly rustc 2021-03-14 22:25:43 -07:00
Martin von Zweigbergk
1e9d428406 git: skip tags pointing to GPG keys and similar when importing refs 2021-03-14 20:14:18 -07:00
Martin von Zweigbergk
429a1ad7ab git: set authentication callback on fetch as well
I guess I had not run `jj git fetch` from GitHub until I tried to
fetch the result of PR #6 just now.
2021-03-14 17:18:51 -07:00
Jun Wu
d1d502c062 tests: disable tests failing on Windows
This unblocks enabling GitHub CI. I took a quick look at
some failures but the causes do not seem obvious to me.
2021-03-14 15:51:32 -07:00
Jun Wu
935da3e13f lock: treat PermissionDenied on Windows as transient error
On Windows it can be PermissionDenied when creating the new file
exclusively. This change makes lock_concurrent test pass on Windows.
2021-03-14 15:51:32 -07:00
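The idea in the commit above could look roughly like the following retry loop. `is_transient()` and `lock()` are hypothetical names, and the 10 ms polling interval is an arbitrary choice for this sketch.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, ErrorKind};
use std::path::Path;
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Errors that mean "someone else holds the lock, try again".
fn is_transient(err: &io::Error) -> bool {
    match err.kind() {
        ErrorKind::AlreadyExists => true,
        // Per the commit message, exclusive creation can fail this way on Windows.
        ErrorKind::PermissionDenied => cfg!(windows),
        _ => false,
    }
}

/// Takes the lock by creating `path` exclusively, retrying until `timeout`.
fn lock(path: &Path, timeout: Duration) -> io::Result<File> {
    let deadline = Instant::now() + timeout;
    loop {
        match OpenOptions::new().write(true).create_new(true).open(path) {
            Ok(file) => return Ok(file),
            Err(err) if is_transient(&err) && Instant::now() < deadline => {
                sleep(Duration::from_millis(10));
            }
            Err(err) => return Err(err),
        }
    }
}
```

Releasing the lock would mean removing the file again; that part is left out to keep the sketch short.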
Jun Wu
eacab648b0 working_copy: clean up ".git" automatically
TreeState::write_tree leaves a ".git" file in the working copy. This is
undesirable but more problematic on Windows - The second time
TreeState::write_tree would panic because Repository::init_opts will fail
with a Permission Denied error.

This seems to be a libgit2 defect. But for now let's just remove ".git"
automatically. This makes `cargo test --test smoke_test` pass on Windows.
2021-03-14 15:49:42 -07:00
Jun Wu
4cd29a2130 working_copy: avoid std::os::unix on Windows
std::os::unix::fs::PermissionsExt::mode() does not exist on Windows.
Treat files on Windows as regular files.
2021-03-14 15:49:22 -07:00
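A small sketch of the pattern in the commit above: the Unix-only extension trait is imported inside a `#[cfg(unix)]` function, and other platforms get a fallback that treats every file as a regular file. The helper name `is_executable()` is an assumption for this example.

```rust
use std::fs::Metadata;

#[cfg(unix)]
fn is_executable(metadata: &Metadata) -> bool {
    // `PermissionsExt::mode()` only exists on Unix, so keep the import local.
    use std::os::unix::fs::PermissionsExt;
    (metadata.permissions().mode() & 0o111) != 0
}

#[cfg(not(unix))]
fn is_executable(_metadata: &Metadata) -> bool {
    // No mode bits available: treat the file as a regular, non-executable file.
    false
}
```

Exactly one of the two definitions is compiled on any given target, so callers can use `is_executable()` without their own `cfg` checks.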
Martin von Zweigbergk
5631e85502 view: don't enforce invariants in merge_views()
We now only call the function from `MutableRepo::merge()`. There we
pass the result to `MutableView::set_view()`, which already enforces
the invariants.
2021-03-14 11:07:34 -07:00
Martin von Zweigbergk
8048d9641e commands: rewrite jj op undo using new MutableRepo::merge() 2021-03-14 10:57:57 -07:00
Martin von Zweigbergk
a7f4f4cf5b rustfmt: configure to merge imports by module
Perhaps we should even set the config to "Item" to reduce merge conflicts.
2021-03-14 10:53:14 -07:00
Martin von Zweigbergk
4b8484e561 rustfmt: configure to group imports 2021-03-14 10:46:25 -07:00
Martin von Zweigbergk
ac9fb1832d OpHeadsStore: move logic for merging repos to MutableRepo
This adds `MutableRepo::merge()`, which applies the difference between
two `ReadonlyRepo`s to itself. That results in much simpler code than
the current code in `merge_op_heads()`. It also lets us write `undo`
using the new function. Finally -- and this is the actual reason I did
it now -- it prepares for using the index when enforcing view
invariants.
2021-03-14 10:43:39 -07:00
Martin von Zweigbergk
e9ddfdd8bc Repo: repurpose ReadonlyRepo::loader() to return loader for existing repo
It's sometimes useful to create a `RepoLoader` given an existing
`ReadonlyRepo`. We already do that in `ReadonlyRepo::reload()`. This
patch repurposes `ReadonlyRepo::loader()` for that.
2021-03-14 10:34:18 -07:00
Martin von Zweigbergk
82c683bf63 Transaction: rename as_repo_mut() to mut_repo()
I think the `as_` prefix of `as_repo_mut()` makes it sound like it
returns a view of the `Transaction`, but the `MutableRepo` is actually
a part of it. Also, the convention seems to be to put the `mut_` in
the name first if the function returns a type with a matching name
(like `MutableRepo` does).
2021-03-14 00:25:05 -08:00
Martin von Zweigbergk
7ea0c6a868 View: move op_id/base_op_id to Repo
This is yet another step towards making the `View` types
simpler. Perhaps we eventually won't need to wrap the types returned
from the `OpStore` at all.
2021-03-14 00:25:02 -08:00
Martin von Zweigbergk
c1de8b0f3a View: move creation of Operation to Transaction
This continues the work to make the `View` types be only about the
state of the current view and not about operations in general (which
has been moved out to `OpStore` and `OpHeadsStore`).
2021-03-14 00:16:21 -08:00
Martin von Zweigbergk
cf2baf58a7 OpHeadsStore: simplify by returning Operation from get_single_op_head() 2021-03-14 00:16:21 -08:00
Martin von Zweigbergk
f6488e2e9f OpHeadsStore: check for fast-forward merge before calling merge_op_heads()
This is another little refactoring to prepare for using the
`Transaction` API in `merge_op_heads()`.
2021-03-14 00:16:13 -08:00
Martin von Zweigbergk
9452d17b75 OpHeadsStore: pass around RepoLoader instead of various stores
This is to prepare for using the regular `Transaction` API for
creating the merge operation in `OpHeadsStore`.
2021-03-14 00:15:04 -08:00
Martin von Zweigbergk
eac0c9f579 OpHeadsStore: when merging ops, also remove ancestor op from disk early
This is more code, but I think it's clearer because the code for
removing the ancestors from the set of parents and from disk is now
close. I hope this will also help prepare for some further changes.
2021-03-14 00:13:04 -08:00
Martin von Zweigbergk
d4c39d399f OpHeadsStore: read operation objects before calling merge_op_heads()
This is just a little refactoring to prepare for filtering out
ancestors earlier.
2021-03-14 00:13:04 -08:00
Martin von Zweigbergk
27293829d6 Transaction: allow writing a transaction to the OpStore without publishing it
It can be useful to write an operation to the `OpStore` without also
making it visible when you load the repo. I had planned to add that
functionality at least for hooks, so the hooks can run commands
with `jj --at-op=<operation>` and decide whether to publish the
operation. However, the immediate goal is to let us rewrite
`op_heads_store::merge_op_heads()` to use the usual `Transaction`
API. That needs to be able to just write the operation without
publishing it, since the publishing step takes a lock, which
`op_heads_store::merge_op_heads()` (its caller, actually) has already
taken.
2021-03-14 00:12:57 -08:00
Martin von Zweigbergk
337b15c98d cleanup: replace #[cfg(not(windows))] by #[cfg(unix)]
I didn't realize that the `unix` configuration existed before.
2021-03-12 15:45:55 -08:00
Martin von Zweigbergk
f79874d612 view: let repo get current operation from OpHeadsStore and pass in
This makes the View types a lot simpler.
2021-03-11 22:23:02 -08:00
Martin von Zweigbergk
82a3ff6ef8 repo: make OpHeadsStore accessible directly on ReadonlyRepo
We can now get rid of `MutableView::update_op_heads()`.
2021-03-10 23:27:36 -08:00
Martin von Zweigbergk
212dd35d01 view: let repo create OpHeadsStore and pass in to view 2021-03-10 23:14:00 -08:00
Martin von Zweigbergk
ec07104126 view: move creation of initial operation to OpHeadsStore
This is a step towards getting rid of
`MutableView::update_op_heads()`.
2021-03-10 22:57:59 -08:00
Martin von Zweigbergk
2590e127f7 view: move get_single_op_head() onto OpHeadsStore 2021-03-10 21:59:32 -08:00
Martin von Zweigbergk
b4d4cd143a view: move locking of .jj/view/op_heads/ to OpHeadsStore 2021-03-10 21:51:08 -08:00
Martin von Zweigbergk
4bd121dab5 view: split out separate type for keeping track of op heads 2021-03-10 21:34:11 -08:00
Martin von Zweigbergk
2955bc4a29 repo: let repo types directly have an OpStore
I'd like to make `ReadonlyView` and `MutableView` focused on just the
state of the view (i.e. the set of heads, git refs, etc.). The
responsibility for managing the `.jj/view/op_heads/` directory should
be moved out of it. This prepares for that.
2021-03-10 20:55:56 -08:00
Martin von Zweigbergk
48d7903925 repo: simplify and clarify name of base_op_head_id() functions 2021-03-10 15:39:15 -08:00
Martin von Zweigbergk
9ee521d9d3 transaction: fix (mostly harmless) race where index can get re-calculated 2021-03-10 15:22:03 -08:00
Martin von Zweigbergk
47a7cf7101 view: extract function for updating operation heads
This will be used to address the race in `Transaction::commit()`.
2021-03-10 15:17:54 -08:00
Martin von Zweigbergk
fc73ef8d6e view: delete an incorrect comment about a race
Unlike in `Transaction::commit()`, in the `view` module, we actually
don't update the `.jj/view/op_heads/` directory until after we've
recorded the index associated with the operation, so there's no race
there.
2021-03-10 14:30:27 -08:00
Martin von Zweigbergk
a715fd0ae7 view: drop stale comment about resolving concurrent operations
The comment was from the time when we resolved divergent operations at
write time.
2021-03-08 23:18:26 -08:00
Martin von Zweigbergk
e6aa2402a6 view: drop redundant filtering of ancestors of public heads
I added `enforce_invariants()` in 1f593a4193 and then forgot to use
it in 4db3d8d3a6.
2021-03-08 23:18:26 -08:00
Martin von Zweigbergk
f755c3f740 cleanup: access integer types' MAX constants directly on the type
Using `std::u32::MAX` is deprecated.
2021-03-08 23:18:17 -08:00
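For reference, the cleanup in the commit above replaces the module-level constant (`std::u32::MAX`) with the associated constant on the type (`u32::MAX`). The helper below is a hypothetical example of the preferred spelling, not code from the repository.

```rust
/// Clamps a 64-bit position into `u32`, using the associated constant
/// `u32::MAX` rather than the legacy `std::u32::MAX`.
fn clamp_to_u32(pos: u64) -> u32 {
    if pos > u64::from(u32::MAX) {
        u32::MAX
    } else {
        pos as u32
    }
}
```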
Martin von Zweigbergk
02e6420606 repo: inline MutableRepo's {view,index,evolution}_mut() methods
The methods are now only called from within the type. Inlining means
that the borrow checker will let us borrow these separate fields
concurrently. We'll take advantage of that soon.
2021-03-08 23:17:29 -08:00
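The borrow-checker remark in the commit above can be illustrated with a reduced example. The structs are hypothetical stand-ins with a single field each; the point is only the difference between accessor methods and direct field access.

```rust
// Simplified stand-ins for this sketch.
struct MutableView {
    heads: Vec<u32>,
}

struct MutableIndex {
    commits: Vec<u32>,
}

struct MutableRepo {
    view: MutableView,
    index: MutableIndex,
}

impl MutableRepo {
    // With accessors such as `fn view_mut(&mut self) -> &mut MutableView`,
    // each call borrows all of `self`, so these two borrows could not coexist:
    //     let view = self.view_mut();
    //     let index = self.index_mut(); // error[E0499]
    fn add_head(&mut self, commit: u32) {
        // Direct field access: the borrow checker sees two disjoint fields.
        let view = &mut self.view;
        let index = &mut self.index;
        index.commits.push(commit);
        view.heads.push(commit);
    }
}
```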
Martin von Zweigbergk
9f7854f02c repo: stop wrapping view and index in Option in MutableRepo
I think the `Option<>` wrapping was from the time when `MutableView`
had a reference back to the repo (and `MutableIndex` was probably
wrapped out of habit).
2021-03-08 00:08:02 -08:00
Martin von Zweigbergk
ef16d102e2 transaction: move most functionality to MutableRepo
Most methods on `Transaction` only need the `MutableRepo`, so it makes
sense for that functionality to be on the latter. That will let us update
the methods to also update the index, which would otherwise have been
harder because it would require a mutable borrow of both the view and
the index. This patch makes most current methods on `Transaction` just
delegate to `MutableRepo`. We may want to remove some of these
delegating methods later.
2021-03-07 23:10:32 -08:00
Martin von Zweigbergk
1e623bd019 index: update in memory and on disk while resolving operation conflicts
Updating the index on disk means that readers won't have to calculate
the state. Updating it in memory means that we can take advantage of
it while resolving conflicts. We will do that soon.
2021-03-06 23:30:03 -08:00
Martin von Zweigbergk
779db67f8f index_store: avoid passing whole repo into get_index_at_op()
I want to be able to load the index at an operation before the repo
has been loaded.
2021-03-06 23:06:35 -08:00
Martin von Zweigbergk
d1509ffdd4 index: extract a function for adding all commits from a segment
I also restructured the loop in `maybe_squash_with_ancestors()` to
hopefully make it a little clearer.
2021-03-06 20:30:12 -08:00
Martin von Zweigbergk
3fc35288c0 index: remove dir field from ReadonlyIndex and MutableIndex 2021-03-06 10:02:19 -08:00
Martin von Zweigbergk
502ba895f5 index: move ReadonlyFile::associate_with_operation() to IndexStore
After this patch, the `index` module no longer knows about the
".jj/index/operations/" directory; that knowledge is now only in
`IndexStore`.
2021-03-06 10:00:30 -08:00
Martin von Zweigbergk
c4fe7aab10 index: move ReadonlyRepo::load_at_operation() to IndexStore 2021-03-06 09:52:44 -08:00
Martin von Zweigbergk
12bfbc489c index: move ReadonlyIndex::index() to IndexStore 2021-03-06 09:52:38 -08:00
Martin von Zweigbergk
2fdf9721c0 index: move load() to IndexStore 2021-03-06 09:52:27 -08:00
Martin von Zweigbergk
403e86c138 index: introduce IndexStore, which owns ReadonlyIndex files
This patch introduces a new `IndexStore` struct. The idea is that it
will know about the directory in which the index files are stored and the
associations with operations. It may also cache `Arc<ReadonlyIndex>`
instances so if multiple `ReadonlyIndex` instances are loaded, they
can be returned from the cache. That may be useful when merging
operations because the operations are likely to share a large parent
index file. For now, however, all the new type has is `init()`,
`load()`, and `reinit()`.
2021-03-06 09:52:16 -08:00
Martin von Zweigbergk
0a4ef1030f repo: add support for loading at given operation without loading head op first
The only way to load the repo at a given operation (as with
`--at-op`) is currently to first load it at the head operation and
then call `reload()` on the repo. This patch makes it so we can load
the repo directly at the requested operation.
2021-03-06 09:52:10 -08:00
Martin von Zweigbergk
df53871daf repo: extract a type for loading the repo in two stages
We'll want to be able to load the repo at a given operation without
first loading the head operation as we do today. This patch introduces
a struct for keeping the state of a half-loaded repo. In that
half-loaded state, the store and the op-store have been loaded, but
the view has not yet been loaded. That makes it possible for callers
to use the loaded op-store for looking up an operation to load the
view at.
2021-03-06 09:52:10 -08:00
Martin von Zweigbergk
5a32118af1 repo: move creation of OpStore out of View
We want to support loading the repo at a specific operation without
first loading the head operation (like we currently do). One reason
for that is of course efficiency. A possibly more important reason is
that the head operation may be conflicted, depending on how we decide
to deal with operation-level conflicts. In order to do that, it makes
sense to move the creation of the `OpStore` outside of the
`View`. That also suggests that the `.jj/view/op_store/` directory
should move to `.jj/op_store/`, so this patch also does that. That's
consistent with how `.jj/store/` is outside of `.jj/working_copy/`.
2021-03-06 09:52:00 -08:00
Martin von Zweigbergk
e2e9fe8f0d index: add stats for number of change ids and pruned commits 2021-03-06 09:50:22 -08:00
Martin von Zweigbergk
bc64cf02c7 index: don't use an all-0 change id in tests
It was weird to have the same change id for all commits. I think that
was a leftover from me just quickly getting tests to pass.
2021-03-06 09:30:52 -08:00
Martin von Zweigbergk
031a39ecba cleanup: fix lots of issues found in the lib crate by clippy
I had forgotten to pass `--workspace` to clippy all this time :P
2021-02-26 23:15:43 -08:00
Martin von Zweigbergk
d961f61623 evolution: calculate state using index
All the information needed for calculating the evolution state is now
in the index, so let's use it. This speeds up calculation of the
evolution state from 1.53s to 150ms in the git.git repo. In the Linux
repo, it was sped up from 28.9s to 3.07s. That's still unbearably slow
(and still pretty slow in the git.git repo too). We may need to keep a
persistent cache of the evolution state, but that will have to come
later; this improvement is good enough for now.
2021-02-26 21:19:18 -08:00
Martin von Zweigbergk
190899fe76 index: make rev_walk() iterate over IndexEntry instead of CommitId
We have the `IndexEntry` available in the iterator, so let's return it
so the caller doesn't need to look it up themselves.
2021-02-26 16:49:27 -08:00
Martin von Zweigbergk
8be84c345b index: also index change id 2021-02-26 10:33:34 -08:00
Martin von Zweigbergk
3a53a187ff index: add flag indicating pruned commit 2021-02-26 10:33:34 -08:00
Martin von Zweigbergk
d80903ce48 index: also index predecessors
Evolution needs to have fast access to the predecessors. This change
adds that information to the commit index.

Evolution also needs fast access to the change id and the bit saying
whether a commit is pruned. We'll add those soon.

Some tests changed because they previously added commits with
predecessors that were not indexed, which is no longer allowed from
this change. (We'll probably eventually want to allow that again, so
that the user can prune predecessors they no longer care about from
the repo.)
2021-02-26 10:33:34 -08:00
Martin von Zweigbergk
afc59a210a index: make tests more focused, and add tests of octopus merge 2021-02-26 10:33:32 -08:00
Martin von Zweigbergk
2a531832d6 rewrite: make merge_commit_trees() use index for finding common ancestors
The index is now always kept up to date and it has functionality for
finding common ancestors, so let's use it! This should make merging
commits a little faster if their common ancestor is far away (which is
rare). It's probably much more important that the index-based
algorithm is more correct. Also, it returns multiple common ancestors
in the criss-cross case, which lets us do a recursive merge like git
does. I'm leaving the recursive merge for later, though.
2021-02-23 20:49:18 -08:00
Martin von Zweigbergk
bb94516175 index: add support for finding common ancestors
We currently need to read the commit objects for finding common
ancestors. That can be very slow when the common ancestor is far back
in history. This patch adds a function for finding common ancestors
using the index instead.

Unlike the current algorithm, which only returns one common ancestor,
the new index-based one correctly handles criss-cross merges.

Here are some timings for finding the common ancestors in the git.git
repo:

                          |      Without index     |       With Index       |
                          | First run | Subsequent | First run | Subsequent |
v2.30.0-rc0 v2.30.0-rc1   |   5.68 ms |    5.94 us |   40.3 us |    4.77 us |
v2.25.4 v2.26.1           |   1.75 ms |    1.42 us |   13.8 ms |    4.29 ms |
v1.0.0 v2.0.0             |    492 ms |    2.79 ms |   23.4 ms |    6.41 ms |

Finding ancestors of v2.25.4 and v2.26.1 got much slower because the
new algorithm finds all common ancestors. Therefore, it also finds
v2.24.2, v2.23.2, v2.22.3, v2.21.2, v2.20.3, v2.19.4, v2.18.3, and
v2.17.4, which it then filters out because they're all ancestors of
v2.25.3.

Also note that the result was incorrect before, because the old
algorithm would return as soon as it had found a common ancestor, even
if it's not the latest common ancestor. For example, for the common
ancestor between v1.0.0 and v2.0.0, it returned an ancestor of v1.0.0
because it happened to get there by following some side branch that
led there more quickly.

The only place we currently need to find the common ancestor is when
merging trees, which we only do when the user runs `jj merge`, as well
as when operating on existing merge commits (e.g. to diff or rebase
them). That means that this change won't be very noticeable. However,
it's something we clearly want to do sooner or later, so we might as
well get it done.
2021-02-23 17:29:23 -08:00
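A rough sketch of why returning all common ancestors and then filtering matters in the criss-cross case. It is not the index-based algorithm from the commit: `ToyIndex` and its methods are hypothetical, and the walk is deliberately naive (it collects full ancestor sets). It only assumes, as an index would, that commits are addressed by position and that parents have lower positions than their children.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the commit index: `parents[pos]` lists the index
/// positions of the parents of the commit at position `pos`.
struct ToyIndex {
    parents: Vec<Vec<usize>>,
}

impl ToyIndex {
    /// All ancestors of `pos`, including `pos` itself.
    fn ancestors(&self, pos: usize) -> BTreeSet<usize> {
        let mut seen = BTreeSet::new();
        let mut stack = vec![pos];
        while let Some(p) = stack.pop() {
            if seen.insert(p) {
                stack.extend(self.parents[p].iter().copied());
            }
        }
        seen
    }

    /// Removes every position that is an ancestor of another position in the set.
    fn heads(&self, candidates: &BTreeSet<usize>) -> BTreeSet<usize> {
        let mut result = candidates.clone();
        for &c in candidates {
            for a in self.ancestors(c) {
                if a != c {
                    result.remove(&a);
                }
            }
        }
        result
    }

    /// Latest common ancestors of `a` and `b`; there can be more than one.
    fn common_ancestors(&self, a: usize, b: usize) -> BTreeSet<usize> {
        let ancestors_a = self.ancestors(a);
        let ancestors_b = self.ancestors(b);
        let common: BTreeSet<usize> = ancestors_a.intersection(&ancestors_b).copied().collect();
        self.heads(&common)
    }
}
```

With a root at position 0, two children at 1 and 2, and two merges of those children at 3 and 4, `common_ancestors(3, 4)` yields both 1 and 2 rather than stopping at whichever one is reached first.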
Martin von Zweigbergk
422d333d4b index: make heads() return result in index order instead of hash order
It's nice to have a non-random order for tests (we can revisit later
if it shows up in profiling). I'm changing the order to be the index
order so the future caller of `heads_pos()` (not `heads()`) will also
get consistent order.
2021-02-23 17:24:55 -08:00
Martin von Zweigbergk
1481935472 index: extract a function for removing ancestors of set based on positions
We already have the `heads()` function, which works on
`CommitId`s. This just extracts a function that works on
positions. I'll use it soon.
2021-02-21 22:28:44 -08:00
Martin von Zweigbergk
5aadbcf6fc evolve: pass Transaction to listener functions, so they see the updated state 2021-02-21 22:27:13 -08:00
Martin von Zweigbergk
62ce5782b5 index: when writing incremental index, squash into parent file if smaller
We currently write a new incremental index file every time. That means
that the stack of index files quickly gets deep, which makes it slow
to read the index. This commit makes it so that we squash the new
index segment into its parent if the parent has fewer commits. That
means we'll limit the number of files to O(log n). Write time will
also be O(log n) on average.
2021-02-16 23:47:43 -08:00
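A minimal sketch of the squashing rule, modelling each index file only by its commit count. The function name `push_segment` and the `Vec<usize>` representation are hypothetical, and treating a parent of equal size as squashable is an assumption of this sketch (without it, a run of single-commit transactions would never be squashed).

```rust
/// `stack[0]` is the root index file; the last element is the newest segment.
/// Each entry is the number of commits stored in that file.
fn push_segment(stack: &mut Vec<usize>, new_commits: usize) {
    if new_commits == 0 {
        // Nothing to write for an empty transaction.
        return;
    }
    let mut size = new_commits;
    // Squash into the parent for as long as the parent is no larger than
    // the segment we are about to write (ties included; see the lead-in).
    while let Some(&parent) = stack.last() {
        if parent <= size {
            size += parent;
            stack.pop();
        } else {
            break;
        }
    }
    stack.push(size);
}
```

Under this rule, every file is larger than the one stacked on top of it, so the stack behaves like a binary counter: seven single-commit transactions leave files of 4, 2, and 1 commits, and the eighth collapses them into a single file of 8.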
Martin von Zweigbergk
a51543b752 index: make first level in stats be the root index
I've confused myself a few times already thinking that level 0 is the
root, so that's probably more intuitive. It also makes tests simpler
because the initial part of the list is unchanged when a new
transaction commits.
2021-02-16 23:45:54 -08:00
Martin von Zweigbergk
b122f33312 index: don't write empty incremental index file 2021-02-16 23:45:52 -08:00
Martin von Zweigbergk
a7b6bcfd79 transaction: write incremental index on commit
With this change, we start writing the incremental index to disk, so
the next reader won't have to re-read the commits and create the
index.

As of this change, we simply write a new index file for each
transaction. That will clearly mean that the stack of files gets deep
pretty quickly. For now, the user will have to do `jj debug reindex`
when things get slow. I plan to change it so instead of writing an
incremental index file every time, we first check if the new index
file would have at least as many commits as the parent file, and if it
will, we write a combined one instead. That should apply recursively,
so we'd have O(log n) index files.
2021-02-15 11:03:41 -08:00
Martin von Zweigbergk
86915f0a6f index: fix check for adding existing commit to index
The check for adding an existing commit to the index only checked if
the commit was already in the `MutableIndex`, not if it was already in
the parent `ReadonlyIndex`.
2021-02-15 10:28:18 -08:00
Martin von Zweigbergk
37cf6a8395 transaction: don't walk to root when adding on top of non-head
I don't know why I made the walk stop at heads instead of indexed
commits before. Perhaps I did it because it's cheap to check in the
set of heads. However, it gets very expensive to walk all the way back
to the root if the parents are not in the set of heads.
2021-02-15 10:28:18 -08:00
Martin von Zweigbergk
0f56e014b7 tests: some fixups to test_transaction as a result of reordering commits 2021-02-15 10:28:07 -08:00
Martin von Zweigbergk
3c832cbbbe index: let index structs keep track of the index directory
This matches how it's done for the other structs (View, WorkingCopy).
2021-02-14 01:03:49 -08:00
Martin von Zweigbergk
b77740e58a index: move function for saving MutableIndex onto the struct 2021-02-14 01:03:49 -08:00
Martin von Zweigbergk
713d32d803 index: keep up to date within transaction
With tons of groundwork done, we can now finally keep the index up to
date within a transaction! That means that we can start relying on the
index to always be valid, so we can use it e.g. for finding common
ancestors within a transaction. That should help speed up `jj evolve`
immensely on large repos.

We still don't write the updated index to disk when the transaction
closes. That will come later.
2021-02-14 00:58:11 -08:00
Martin von Zweigbergk
e19a65cf14 transaction: make add_head() use incremental update of evolution in common case
`Transaction::add_head()` currently invalidates the whole evolution
state. We've had support for incrementally updating evolution since
4619942a57. We should start taking advantage of that. Let's add a
fast-path in `Transaction::add_head()` for the common case where we
add a single commit on top of an existing head. That's cheap and simple
to check for. However, it won't cover the case of adding a child off
of a non-head. It's still a good start.
2021-02-14 00:56:34 -08:00
Martin von Zweigbergk
f05a12d301 index: make CompositeIndex non-public and add new IndexRef enum instead
We're getting close to finally having a `RepoRef::index()` method.
2021-02-13 13:56:26 -08:00
Martin von Zweigbergk
face4d637f index: define methods from CompositeIndex directly on {Readonly,Mutable}Index
This is one step towards making `CompositeIndex` non-public (and maybe
deleting it). Next, we'll add an `IndexRef` enum similar to `RepoRef`
etc.
2021-02-13 13:46:58 -08:00
Martin von Zweigbergk
8dda4b05e4 index: add "segment_" prefix to methods in IndexSegment
I'm about to move the functions from `CompositeIndex` to a new
`Index` trait implemented by `ReadonlyIndex` and `MutableIndex`. Since
those types already implement `IndexSegment`, the names would conflict
and it would get annoying to have to disambiguate them. This commit
therefore prepares for that by adding a `segment_` prefix to the
functions in `IndexSegment`.
2021-02-13 13:45:31 -08:00
Martin von Zweigbergk
3066381d57 transaction: add accessors for view and evolution directly on transaction 2021-02-13 13:43:48 -08:00
Martin von Zweigbergk
72aebc9da3 view: replace View trait by enum with Readonly and Mutable variants 2021-02-13 08:31:41 -08:00
Martin von Zweigbergk
d1e5f46969 evolution: replace Evolution trait by enum with Readonly and Mutable variants 2021-02-13 08:31:41 -08:00
Martin von Zweigbergk
f1666375bd repo: replace Repo trait by enum with readonly and mutable variants
I want to keep the index updated within the transaction. I tried doing
that by adding a `trait Index`, implemented by `ReadonlyIndex` and
`MutableIndex`. However, `ReadonlyRepo::index` is of type
`Mutex<Option<Arc<IndexFile>>>` (because it is lazily initialized),
and we cannot get a `&dyn Index` that lives long enough to be returned
from a `Repo::index()` from that. It seems the best solution is to
instead create an `Index` enum (instead of a trait), with one readonly
and one mutable variant. This commit starts the migration to that
design by replacing the `Repo` trait by an enum. I never intended for
there to be more implementations of `Repo` than `ReadonlyRepo`
and `MutableRepo` anyway.
2021-02-13 08:31:23 -08:00
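A sketch of the enum-based design under simplified, hypothetical types (none of the fields or the `load_index()` helper are taken from the project). The point it illustrates: because the read-only variant can hand out an owned `Arc`, nothing has to be borrowed out of the lazily initialized `Mutex`, which is where a `&dyn Index` return type runs into lifetime trouble.

```rust
use std::sync::{Arc, Mutex};

// All types below are simplified, hypothetical stand-ins.
struct ReadonlyIndex {
    num_commits: usize,
}

struct MutableIndex {
    parent: Arc<ReadonlyIndex>,
    num_added: usize,
}

struct ReadonlyRepo {
    // Lazily initialized on first access.
    index: Mutex<Option<Arc<ReadonlyIndex>>>,
}

struct MutableRepo {
    index: MutableIndex,
}

enum IndexRef<'a> {
    Readonly(Arc<ReadonlyIndex>),
    Mutable(&'a MutableIndex),
}

impl IndexRef<'_> {
    fn num_commits(&self) -> usize {
        match self {
            IndexRef::Readonly(index) => index.num_commits,
            IndexRef::Mutable(index) => index.parent.num_commits + index.num_added,
        }
    }
}

#[derive(Clone, Copy)]
enum RepoRef<'a> {
    Readonly(&'a ReadonlyRepo),
    Mutable(&'a MutableRepo),
}

impl<'a> RepoRef<'a> {
    fn index(self) -> IndexRef<'a> {
        match self {
            RepoRef::Readonly(repo) => {
                let mut cached = repo.index.lock().unwrap();
                let index = cached.get_or_insert_with(|| Arc::new(load_index()));
                // Clone the Arc so the lock can be released before returning.
                IndexRef::Readonly(Arc::clone(index))
            }
            RepoRef::Mutable(repo) => IndexRef::Mutable(&repo.index),
        }
    }
}

/// Hypothetical loader; a real one would read index files from disk.
fn load_index() -> ReadonlyIndex {
    ReadonlyIndex { num_commits: 3 }
}
```

Callers match on or call methods of `IndexRef` without caring which kind of repo they were given, at the cost of one `match` per method instead of dynamic dispatch.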
Martin von Zweigbergk
a1983ebe96 git: add a ref to each commit we create
I just learned that attaching a git note is not enough to keep a
commit from being GC'd. I had read `git help gc` before but it was
quite misleading (I just sent a patch to clarify it). Since the git
note is not enough, we need to create some other reference. This patch
makes it so we write refs in `refs/jj/keep/` for every commit we
create. We will probably want to remove unnecessary refs (ancestors of
commits pointed to by other refs) once we have a `jj gc` command.
2021-02-13 08:16:18 -08:00
Martin von Zweigbergk
dd98f0564e git: remove git note pointing to conflicts
We store conflicts as blobs with JSON data and with a git note
pointing to them to prevent GC. These are stored in the git tree as
regular files. The only thing that distinguishes them is that their
filename ends with `.jjconflict`. Since they are referenced from the
tree, there's no need for the git note to prevent GC (which doesn't
work anyway, as I just learned), and we don't store any additional
data in the note either, so let's just remove it.
2021-02-13 08:15:12 -08:00
Martin von Zweigbergk
fa30cf768f index: rename UnsavedIndexData to MutableIndex 2021-02-07 23:35:37 -08:00
Martin von Zweigbergk
8170c06573 index: rename IndexFile to ReadonlyIndex 2021-02-07 23:35:22 -08:00
Martin von Zweigbergk
51373b75ff index: use correct per-level file name in stats (previously always top-level) 2021-02-07 23:34:57 -08:00
Martin von Zweigbergk
302c66825f working_copy: preserve executable bit on Windows
Windows doesn't support recording the executable bit in the file
system. Before this commit, the code for reading and writing the
executable wouldn't even compile on Windows. This commit at least
makes it so we preserve whatever bit has been recorded in the repo.
At least I hope that's what it does -- I don't have access to a
Windows machine right now.
2021-02-07 00:50:21 -08:00
Martin von Zweigbergk
d4aed83aa6 working_copy: correct comment about stat accuracy
We don't take a write lock when writing a file; other processes can
modify the file.
2021-02-07 00:48:50 -08:00
Martin von Zweigbergk
3d679de022 working_copy: print warning about ignored symlinks instead of failing build
The project doesn't currently build on Windows. One reason is because
we had a `unimplemented!()` when trying to write a symlink. Let's
print a warning instead, so the project can start building on
Windows. (The next patch will fix another build problem on Windows.)
2021-02-07 00:46:17 -08:00
Martin von Zweigbergk
e0112a4be0 transaction: report failure to close transaction only in debug builds
When a transaction gets dropped without being committed or explicitly
discarded, we currently raise an assertion error. I added that check
because I kept forgetting to commit transactions. However, it's quite
normal to want to drop transactions in error cases. The current
assertion means that we panic and don't report the actual error to the
user in such cases. We should probably audit the code paths where we
commit transactions and decide for each if we simply want to
discard the transaction or not. In some cases, we may want to commit
the transaction without integrating it in the operation log
(i.e. without creating a file entry in .jj/views/op_heads/). However,
we can do that later. For now, let's just make sure we don't panic
when dropping the transaction in release builds.
2021-02-07 00:11:42 -08:00
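One way to get "check in debug builds, stay quiet in release builds" is `debug_assert!` in a `Drop` implementation. The sketch below is an assumption about the shape of such a check, not the project's code: the `Transaction` type here has only a `closed` flag, and the `std::thread::panicking()` guard is an addition of this sketch to avoid a double panic during unwinding.

```rust
/// Hypothetical, stripped-down transaction that only tracks whether it was closed.
struct Transaction {
    closed: bool,
}

impl Transaction {
    fn new() -> Self {
        Transaction { closed: false }
    }

    /// Consumes the transaction and marks it as closed before it is dropped.
    fn commit(mut self) {
        self.closed = true;
    }

    /// Explicitly throws the transaction away.
    fn discard(mut self) {
        self.closed = true;
    }
}

impl Drop for Transaction {
    fn drop(&mut self) {
        // Skip the check while unwinding so an earlier error is not
        // replaced by an abort caused by a second panic.
        if !std::thread::panicking() {
            // Compiled out in release builds, so error paths can simply
            // drop the transaction there.
            debug_assert!(self.closed, "Transaction was dropped without being closed.");
        }
    }
}
```

In a debug build, forgetting to call `commit()` or `discard()` still fails loudly; in a release build, an early return with `?` just drops the transaction and the original error reaches the user.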
Martin von Zweigbergk
4ecbd89378 repo: move MutableRepo from transaction module to repo module 2021-01-31 18:15:32 -08:00
Martin von Zweigbergk
2d03b514fc transaction: move construction of MutableRepo out of Transaction::new()
I'm about to move `MutableRepo` to the `repo` module and it will make
more sense to have the construction of it there then.
2021-01-31 18:15:32 -08:00
Martin von Zweigbergk
bf53c6c506 transaction: add factory function to MutableRepo
This helps finish the encapsulation of the `evolution` field.
2021-01-31 18:15:32 -08:00
Martin von Zweigbergk
5604303954 transaction: avoid direct access to members of MutableRepo
I'm about to move `MutableRepo` to the `repo` module so it becomes
more important to encapsulate access. Besides, the new functions
introduced in this commit reduce some duplication.

There's still one access of `MutableRepo::evolution` in
`Transaction::new()`. I'll address that next by adding a factory
function to `MutableRepo`.
2021-01-31 18:15:32 -08:00
Martin von Zweigbergk
a28fe7b388 transaction: slightly simplify write_commit() by using store() 2021-01-31 18:15:22 -08:00
Martin von Zweigbergk
9ffd35caf8 transaction: when checking out open commit with conflicts, create child commit
I've been confused twice that rebasing an open commit so it results in
conflicts doesn't show the conflicts in the log output. That's because
we create a successor instead if a commit with conflicts is open. I
guess I thought it would be expected that a child commit was not
created. Since it seems surprising in practice, let's change it and
we'll see if the new behavior is more or less surprising.
2021-01-22 11:41:52 -08:00
Martin von Zweigbergk
bb730d8a2b merge: rewrite code for 3-way merge of files to handle not just trivial cases
The most annoying remaining bug is that 3-way merge frequently panics
with "unhandled merge case". This commit fixes that by rewriting the
merge code. The new code is based on the algorithm used in Mercurial
(which was in turn copied from Bazaar):

 1. Find "sync" regions, which are regions that are unchanged in
    the base and two sides. Note their start and end positions in each
    version.

 2. Produce the output by taking the sync regions and inserting the
    result of merging the regions between the sync regions. These
    regions can either be changed on only one side, in which case we
    use that version, or it can be changed on both sides, in which
    case we indicate a conflict in the output.

It's both more correct and much easier to follow.
2021-01-22 11:41:50 -08:00
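Step 2 of the algorithm can be sketched as follows, assuming the sync regions from step 1 are already known and passed in by the caller (finding them is the harder part and is omitted here). `SyncRegion`, `MergeHunk`, `merge`, and `push_resolved` are hypothetical names for this illustration, and inputs are pre-split into lines.

```rust
use std::ops::Range;

/// A region that is identical in all three versions, with its position in each.
struct SyncRegion {
    base: Range<usize>,
    left: Range<usize>,
    right: Range<usize>,
}

#[derive(Debug, PartialEq)]
enum MergeHunk<'a> {
    Resolved(Vec<&'a str>),
    Conflict {
        base: Vec<&'a str>,
        left: Vec<&'a str>,
        right: Vec<&'a str>,
    },
}

fn merge<'a>(
    base: &[&'a str],
    left: &[&'a str],
    right: &[&'a str],
    sync: &[SyncRegion],
) -> Vec<MergeHunk<'a>> {
    let mut out = Vec::new();
    let (mut b, mut l, mut r) = (0, 0, 0);
    // A zero-length sentinel region at the end makes the loop handle the
    // text after the last real sync region.
    let end = SyncRegion {
        base: base.len()..base.len(),
        left: left.len()..left.len(),
        right: right.len()..right.len(),
    };
    for region in sync.iter().chain(std::iter::once(&end)) {
        // The gap between the previous sync region and this one.
        let gap_base = &base[b..region.base.start];
        let gap_left = &left[l..region.left.start];
        let gap_right = &right[r..region.right.start];
        if gap_left == gap_base {
            // Left is unchanged here, so take whatever the right side has.
            push_resolved(&mut out, gap_right);
        } else if gap_right == gap_base {
            // Only the left side changed.
            push_resolved(&mut out, gap_left);
        } else {
            // Both sides changed the same region.
            out.push(MergeHunk::Conflict {
                base: gap_base.to_vec(),
                left: gap_left.to_vec(),
                right: gap_right.to_vec(),
            });
        }
        // The sync region itself is copied through unchanged.
        push_resolved(&mut out, &base[region.base.clone()]);
        b = region.base.end;
        l = region.left.end;
        r = region.right.end;
    }
    out
}

/// Appends lines to the output, extending the previous resolved hunk if there is one.
fn push_resolved<'a>(out: &mut Vec<MergeHunk<'a>>, lines: &[&'a str]) {
    if lines.is_empty() {
        return;
    }
    match out.last_mut() {
        Some(MergeHunk::Resolved(prev)) => prev.extend_from_slice(lines),
        _ => out.push(MergeHunk::Resolved(lines.to_vec())),
    }
}
```

Because every gap falls into exactly one of the three branches, there is no "unhandled merge case" left to panic on. This sketch reports a conflict even when both sides made the identical change; a fuller implementation might resolve that case.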
Martin von Zweigbergk
7957feca49 diff: make tokenization return slices instead of making copies 2021-01-21 22:42:55 -08:00
Martin von Zweigbergk
30939ca686 view: return &HashSet instead of Iterator
We want to be able to do fast `.contains()` checks on the
result, so `Iterator` was a bad type. We probably should hide the
exact type (currently `HashSet` for both readonly and mutable views),
but we can do that later. I actually thought I'd want to use
`.contains()` for indicating public-phase commits in the log output,
but of course I also want to indicate ancestors as public. This still
seems like a step (mostly) in the right direction.
2021-01-16 13:00:05 -08:00
Martin von Zweigbergk
79eecb6119 git: mark imported remote-tracking branches as public 2021-01-16 12:14:42 -08:00
Martin von Zweigbergk
4db3d8d3a6 view: add tracking of "public" heads (copying Mercurial's phase concept)
Mercurial's "phase" concept is important for evolution, and it's also
useful for filtering out uninteresting commits from log
output. Commits are typically marked "public" when they are pushed to
a remote. The CLI prevents public commits from being rewritten. Public
commits cannot be obsolete (even if they have a successor, they won't
be considered obsolete like non-public commits would).

This commit just makes space for tracking the public heads in the
View.
2021-01-16 11:48:35 -08:00
Martin von Zweigbergk
265f90185e tests: simplify transaction tests slightly by using testutils more 2021-01-16 11:31:57 -08:00
Martin von Zweigbergk
f43880381f view: make sure we don't leave a dangling git ref
All commits in the view are supposed to be reachable from its
heads. If a head is removed and there are git refs pointing to
ancestors of it (or to the removed head itself), we should make that
ancestor a head.
2021-01-16 11:05:32 -08:00
Martin von Zweigbergk
1f593a4193 view: create helper for enforcing view's invariants
The only invariant we currently enforce is that the set of heads does
not include any ancestors of other commits in the set. I'm about to
make sure that we don't end up with dangling git refs (pointing to
commits not reachable from the heads). It will be useful to have a
single place to enforce that since we'll need to do the same thing
after updating the view as after merging views.
2021-01-16 10:35:46 -08:00
Martin von Zweigbergk
1f27a78957 view: make remove_head() not add parents as heads
I think it's better to let the caller decide if the parents should be
added. One use case for removing a head is when fetching from a Git
remote where a branch has been rewritten. In that case, it's probably
the best user experience to remove the old head. With the current
semantics of `View::remove_head()`, we would need to walk up the graph
to find a commit that's an ancestor and for each commit we remove as
head, its parents get temporarily added as heads. It's much easier for
callers that want to add the parents as heads to do that.
2021-01-15 01:08:05 -08:00
Martin von Zweigbergk
315818260f git: slightly simplify a few tests 2021-01-11 00:34:04 -08:00
Martin von Zweigbergk
19b542b318 git: simplify error handling by passing git repo into git module functions 2021-01-11 00:25:39 -08:00
Martin von Zweigbergk
da0bbbe637 view: start tracking git refs
Git refs are important at least for understanding where the remote
branches are. This commit adds support for tracking them in the view
and makes `git::import_refs()` update them.

When merging views (either because of concurrent operations or when
undoing an earlier operation), there can be conflicts between git ref
changes. I ignored that for now and let the later operation win. That
will probably be good enough for a while. It's not hard to detect the
conflicts, but I haven't yet decided how to handle them. I'm leaning
towards representing the conflicting refs in the view just like how we
represent conflicting files in the tree.
2021-01-10 20:13:22 -08:00
Martin von Zweigbergk
3df6a92df6 view: merge concurrent operations ordered by transaction commit time
This will make it easier to test the result of concurrent operations
(just make sure the operations don't commit during the same
millisecond).
2021-01-10 19:34:52 -08:00
Martin von Zweigbergk
c4cd12e93e view: use the Operation wrapper type in merge_op_heads()
This is partly to prepare for merging the operations in order of
transaction-commit time (currently merged in order of operation id),
so we can get a predictable order in tests (assuming transactions are
not committed the same millisecond).
2021-01-10 19:34:52 -08:00
Martin von Zweigbergk
48e664c716 view: make the View types not store concrete OpStore type 2021-01-10 19:34:52 -08:00
Martin von Zweigbergk
a3de39a35a repo: inline init_cycles() now that it's very short
The function used to be larger when we had more reference cycles
between e.g. the `WorkingCopy` and `Repo`. Now there's only
`Evolution` left, so let's inline the function.
2021-01-04 11:18:16 -08:00
Martin von Zweigbergk
7494a03081 repo: return error when attempting to load repo where there is none
This commit makes it so that running commands outside a repo results
in an error message instead of a panic.

We still don't look for a `.jj/` directory in ancestors of the current
directory.
2021-01-04 09:18:09 -08:00
Martin von Zweigbergk
0137acd0a8 cargo: release 0.1.1
This release is mostly to fix the regressed binary name (back from
`jujube` to `jj`).
2021-01-03 23:06:46 -08:00
Martin von Zweigbergk
bb4b028b2b cargo: fill in crates.io metadata 2021-01-03 10:37:40 -08:00
Martin von Zweigbergk
abc9dc1733 cargo: rename crates to names available on crates.io
I'm preparing to publish an early version before someone takes the
name(s) on crates.io. "jj" has been taken by a seemingly useless
project, but "jujube" and "jujube-lib" are still available, so let's
use those.
2021-01-03 10:16:00 -08:00
Martin von Zweigbergk
d7b9bd55e8 git: remove unnecessary taking of reference (reported by clippy) 2021-01-02 19:38:18 -08:00
Martin von Zweigbergk
e2d6252766 git: on push, check that remote branch was actually updated
I had missed in `git2-rs`'s documentation that you need to check
in a callback if the remote ref(s) got updated by the push or
not. This adds such a check and a new error variant for rejected
branch updates.
2021-01-02 19:27:42 -08:00
Martin von Zweigbergk
7542c484a8 git: pass ssh credentials from ssh-agent on push
I tried to push a commit from my Jujube repo to GitHub using `jj
git push --branch main` and it became clear that we need to pass
SSH credentials. This commit hopefully fixes that. I've only made
it pass credentials for ssh-agent for now, because that seems to
be enough to make it work for me personally. If this commit
becomes visible on GitHub, it should mean that it worked.
2021-01-02 19:27:39 -08:00
Martin von Zweigbergk
14fe58e76a git: use thiserror for errors
When you run e.g. `jj st` outside of a repo, it just
crashes. That'll probably give new users a bad impression, so I
was planning to improve error handling a bit. A good place to
start is by fixing the code I recently added (which obviously
should have been using `thiserror` from the beginning). That's
what this commit does.

Also, this is the first commit in this repo created with
Jujube! I've just started dogfooding it myself.
2021-01-02 08:24:27 -08:00
Martin von Zweigbergk
5b8e10394d transaction: add a message to check for unclosed transaction
I've forgotten to close a transaction a few times and while the
message ('assertion failed: self.closed') is clear to me now, it
probably won't be clear to others or to me in the future.
2021-01-01 12:24:53 -08:00
Martin von Zweigbergk
e14db781b0 git: add subcommand for fetching from remote
This adds `jj git fetch` for fetching from a git remote. The remote
has to be added in the underlying git repo if it doesn't already
exist. I think the command will still be useful on typical small projects
with just a single remote on GitHub. With this and the `jj git push` I
added recently, I think I have enough for most of my own
interaction with GitHub.
2021-01-01 11:11:09 -08:00
Martin von Zweigbergk
7e65a3d589 git: restructure test a bit to make the functions more reusable 2020-12-31 23:28:02 -08:00
Martin von Zweigbergk
634a04e234 git: return error instead of panicking on unexpected libgit2 error
This is obviously what I meant to do in the commit that introduced the
code.
2020-12-31 09:44:58 -08:00
Martin von Zweigbergk
ff3b20c537 git: import git refs as anonymous heads when creating Git-backed repo
The fact that no commits from the underlying Git repo were imported
when creating a new Jujube repo from it was quite surprising. This
commit finally fixes that.
2020-12-29 23:59:35 -08:00
Martin von Zweigbergk
4235ea975d tests: clarify a test slightly by moving assertion out of helper 2020-12-29 23:37:09 -08:00
Martin von Zweigbergk
8377000fd9 git: add a function for updating heads from git refs
When using Git as a store, new commits created in the underlying Git
repo are only made visible by making changes on top of them (e.g. by
checking them out, so a working copy commit is created on top). That's
especially confusing when creating a new repo backed by an existing
Git repo, because the commits from that repo don't show up.

This commit prepares for fixing that by adding a function for updating
heads based on git refs. Since we don't yet track git refs (or
anything similar), the function just makes sure the refs are visible
in the Jujube repo by making them (anonymous) heads.
2020-12-29 23:30:34 -08:00
Martin von Zweigbergk
905a5c97d6 transaction: make sure set of heads has only heads
`Transaction::add_head()` and others would let the caller add
non-heads to the set (i.e. ancestors of other heads) and the
non-heads were filtered out when the transaction was committed. That's
a little surprising, so let's try to keep the set valid even within a
transaction. That will surely make commands that add many commits
noticeably slower in large repos. Hopefully we can improve that
later.
2020-12-29 20:44:17 -08:00
Martin von Zweigbergk
0d85850017 git: return a new repo instance from the store instead of the store's instance
Returning the store's internal `git2::Repository` instance wrapped in
a `Mutex` makes it easy to run into deadlocks. Let's return a freshly
loaded repo instance instead.
2020-12-28 23:38:20 -08:00
Martin von Zweigbergk
a8a9f7dedd init: add support for creating new repo backed by bare git repo in .jj/git/
It's annoying to have to have the Git repo and Jujube repo in separate
directories. This commit adds `jj init --git`, which creates a new
Jujube repo with an empty, bare git repo in `.jj/git/`. Hopefully the
`jj git` subcommands will eventually provide enough functionality for
working with the Git repo that the user won't have to use Git commands
directly. If they still do, they can run them from inside `.jj/git/`,
or create a new worktree based on that bare repo.

The implementation is quite straight-forward. One thing to note is
that I made `.jj/store` support relative paths to the Git repo. That's
mostly so the Jujube repo can be moved around freely.
2020-12-28 00:54:03 -08:00
Martin von Zweigbergk
e82197d981 git: extract function for pushing commit to remote branch, and test it 2020-12-28 00:53:41 -08:00
Martin von Zweigbergk
d481001271 commands: add a jj git push command
This commit starts adding support for working with a Jujube repo's
underlying Git repo (if there is one). It does so by adding a command
for pushing from the Git repo to a remote, so you can work with your
anonymous branches in Jujube and push to a remote Git repo without
having to switch repos and copy commit hashes.

For example, `jj git push origin main` will push to the "main" branch
on the remote called "origin". The remote name (such as "origin") is
resolved in that repo. Unlike most commands, it defaults to pushing
the working copy's parent, since it is probably a mistake to push a
working copy commit to a Git repo.

I plan to add more `jj git` subcommands later. There will probably be
at least a command (or several?) for making the Git repo's refs
available in the Jujube repo.
2020-12-27 00:59:05 -08:00
Martin von Zweigbergk
b820eddde3 commands: add an interactive mode for jj restore
This adds an interactive mode for `jj restore`. It works by first
creating two temporary directories with the contents of the subset of
files that differ between the two trees, and then letting the user
edit the directory representing the right/after side. This has some
advantages compared to the interactive modes in Git and Mercurial:

 * It lets the user edit the final state as opposed to the diff itself
   (depending on the diff tool, of course). I think most users find it
   easier to edit the file contents than to edit the patch
   format.

 * It delegates the hard work to a tool that is already written (this
   is a big advantage for an immature tool like Jujube, but it is not
   an advantage from the user's point of view).

Almost all of the work in this commit went into adding a function that
takes two trees, lets the user edit the diff, and returns a new tree
id. I plan to reuse that function for other interactive commands. One
planned command is `jj edit`, which will let the user edit the changes
in a commit. `jj edit -r abc123` will be mostly about providing a more
intuitive name for `jj restore --source abc123^ --destination abc123`,
plus it will be different for merge commits (it will edit only the
changes in the merge commit). I also plan to add `jj split` by letting
the user edit the full diff, leaving only the parts that should go
into the first commit. Perhaps there will also be commands for moving
part of a commit out of or into a parent commit.
2020-12-26 01:16:19 -08:00
Martin von Zweigbergk
fa44ef8d1b conflicts: add another helper for writing materialized conflict to store
This extracts a bit from `Transaction::check_out()` for taking a
Conflict, materializing it, and writing the resulting plain file to
the store. It will soon be reused.
2020-12-26 00:35:45 -08:00
Martin von Zweigbergk
9ade41078a working_copy: remove a working_copy_path argument I missed earlier
This should clearly have been removed in 4734eb6.
2020-12-26 00:35:45 -08:00
Martin von Zweigbergk
4ce2aed17f lock: use exponential backoff 2020-12-25 15:08:49 -08:00
Martin von Zweigbergk
9138de6ff2 git_store: prevent conflict data from being GC'd
Before this commit, running Git's GC in a Git repo backing a Jujube
repo would risk deleting the conflict data we store as blobs in the
Git repo. This commit fixes that by adding a Git note pointing to the
conflict blob.

I wasn't able to add a test case for this because libgit2 doesn't
support gc [1]. Just testing that the ref is there doesn't seem very
useful.

 [1] https://github.com/libgit2/libgit2/issues/3247
2020-12-25 00:52:09 -08:00
Martin von Zweigbergk
3613fe3f59 git_store: avoid confusing delegation from write_conflict() to write_file()
Very little code was saved by reusing `write_file()` and it made it
confusing (needed to provide an unused filename). Also, I'll soon want
access to the `locked_repo` variable in `write_conflict()`.
2020-12-25 00:26:28 -08:00
Martin von Zweigbergk
8cca56ee77 git_store: extract function for retrying note-writing
I'll add a second call to it very soon.
2020-12-25 00:23:27 -08:00
Martin von Zweigbergk
ddf8416d92 git_store: use exponential backoff when retrying note-writing
I have never run into this being a problem in practice, but this
change is a stepping stone for two things:

 1. Using exponential backoff for other locks (in particular the
    working copy).

 2. Making the Git store write a ref for conflict objects, so they
    don't get GC'd (I want to do that before I even start dogfooding).
2020-12-24 23:22:07 -08:00
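A minimal sketch of retrying with exponential backoff, assuming a generic fallible operation. The helper name `retry_with_backoff` and its parameters are hypothetical and are not taken from the commit; the real code retries note-writing against the Git repo.

```rust
use std::time::Duration;

/// Hypothetical helper: calls `op` until it succeeds or `max_attempts` calls
/// have been made, doubling the sleep between attempts up to `max_delay`.
pub fn retry_with_backoff<T, E>(
    max_attempts: u32,
    initial_delay: Duration,
    max_delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: report the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                std::thread::sleep(delay);
                // Double the delay, but never exceed the cap.
                delay = (delay * 2).min(max_delay);
                attempt += 1;
            }
        }
    }
}
```

Capping the delay keeps a long-held lock from pushing the wait time up without bound. A production version would probably also add random jitter, which this sketch leaves out.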
Martin von Zweigbergk
210405b21a cargo: update dependencies
For no reason other than to stay up to date.
2020-12-24 01:16:55 -08:00
Martin von Zweigbergk
9c8f13608f cargo: update blake2
For no reason other than to stay up to date.
2020-12-24 01:15:38 -08:00
Martin von Zweigbergk
c7ee24727a protobuf: generate code at build-time
I had tried to generate the protobuf code at build time many months
ago, but decided against it because it slowed down the build too
much. I didn't realize there was the
"cargo:rerun-if-changed=<filename>" feature at that time. Given that that
exists, it seems like an obvious win to generate the source code at
build time.

I put the generated sources in `$OUT_DIR` (where [1] says they should
be), then include them in the `protos` module by using the `include!`
macro. The biggest problem with that is that I couldn't get IntelliJ
to understand it, even after enabling the experimental features
described in [2].

 [1] https://doc.rust-lang.org/cargo/reference/build-script-examples.html#code-generation

 [2] https://github.com/intellij-rust/intellij-rust/issues/1908#issuecomment-592773865
2020-12-24 01:05:17 -08:00
Martin von Zweigbergk
4619942a57 evolution: add support for updating state incrementally
We currently recalculate the entire evolution state whenever a new
commit is added within a transaction. That's clearly wasteful. This
commit makes the state-update incremental.
2020-12-23 18:37:55 -08:00
Martin von Zweigbergk
ea7a6b9ce1 evolution: keep track of non-obsolete commits per change id
I'm about to make the evolution state updated incrementally. To be
able to tell if a new commit is divergent, we'll need to keep track of
already existing, non-obsolete commits in the change, even if they're
not divergent before the new commit is added.
2020-12-23 18:10:03 -08:00
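A minimal sketch of why the per-change bookkeeping helps, assuming that a change counts as divergent when it has more than one non-obsolete commit. The `State` layout, the string ids and the method names here are hypothetical simplifications, not the actual struct.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical simplified ids; the real ids are opaque byte strings.
pub type ChangeId = &'static str;
pub type CommitId = &'static str;

#[derive(Default)]
pub struct State {
    /// All non-obsolete commits seen so far, grouped by change id. This is
    /// tracked even for changes that are not (yet) divergent.
    non_obsoletes_by_change: HashMap<ChangeId, HashSet<CommitId>>,
}

impl State {
    /// Records a new non-obsolete commit and reports whether the change is
    /// divergent afterwards, without rescanning any other commits.
    pub fn add_commit(&mut self, change: ChangeId, commit: CommitId) -> bool {
        let commits = self.non_obsoletes_by_change.entry(change).or_default();
        commits.insert(commit);
        commits.len() > 1
    }

    /// Drops a commit from the set, e.g. once a successor obsoletes it.
    pub fn mark_obsolete(&mut self, change: ChangeId, commit: CommitId) {
        if let Some(commits) = self.non_obsoletes_by_change.get_mut(&change) {
            commits.remove(&commit);
        }
    }

    /// True if the change currently has more than one non-obsolete commit.
    pub fn is_divergent(&self, change: ChangeId) -> bool {
        self.non_obsoletes_by_change
            .get(&change)
            .map_or(false, |commits| commits.len() > 1)
    }
}
```

Because the first commit of a change is already recorded when the second one arrives, the divergence check is a set-size lookup instead of a full recalculation.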
Martin von Zweigbergk
6807407814 evolution: fix it so pruned commits can be divergent
A pruned commit just indicates that its predecessors should be evolved
onto the pruned commit's parent instead of onto the pruned commit
itself. The pruned commit itself can be divergent. For example, if
there are several pruned successors of a commit, then it's unclear
where the predecessor's children should be rebased to.
2020-12-23 18:01:01 -08:00
Martin von Zweigbergk
cc9008c6bb evolution: create State struct in place
I'm about to add more fields to the type and this will help to
slightly reduce the boilerplate for initializing the maps and sets
and creating the struct.
2020-12-23 17:32:32 -08:00
Martin von Zweigbergk
66ba74cf5a evolution: use updated state when resolving descendants of orphans
Before this commit, when `evolve()` evolved a stack of orphans, it
would use the evolve state from the beginning of the function to
calculate where they should go. That meant that only the bottom-most
orphan(s) would get evolved to their right place. This commit fixes
that by using the Transaction's evolution state.
2020-12-23 17:32:32 -08:00
Martin von Zweigbergk
0219dbb359 transaction: make evolution_mut() return a mutable reference
This includes fixing a lifetime bound on `MutableEvolution` that was
reversed.
2020-12-23 17:32:32 -08:00
Martin von Zweigbergk
110c083e78 evolution: don't wrap State in Mutex<Option<>> 2020-12-23 17:32:31 -08:00
Martin von Zweigbergk
fce5ec21f6 evolution: make MutableEvolution own its state
Before this commit, it could share its state with the
`ReadonlyEvolution`. That makes no sense when the state is modified,
and would crash if we tried to get a mutable reference to the
state. It only "worked" because the state is not yet updated within a
transaction (a known bug). I'm about to fix that bug, so we need to
fix the ownership, which this commit does.
2020-12-23 17:32:31 -08:00
Martin von Zweigbergk
648ec34a4c evolution: move shared implementation onto the State struct
There was some duplication between `ReadonlyEvolution` and
`MutableEvolution` that could be extracted. It will help to have this
shared code on the `State` object for the next few patches.
2020-12-23 17:32:31 -08:00
Martin von Zweigbergk
88e7f4a30c tests: start using the maplit crate 2020-12-23 17:32:31 -08:00
Martin von Zweigbergk
c41251eaff working_copy: fix test to show that already tracked files are not ignored
The point of having the `modified` and `removed` files in the test was
to show that they don't get untracked, but I forgot to include them in
the `.gitignores`, so there was no reason they would have gotten
untracked anyway.
2020-12-22 10:03:42 -08:00
Martin von Zweigbergk
3b326a942c working_copy: add support for .gitignore files
The project's source of truth is now in Git and I really miss support
for anonymous heads and evolution (compared to when the code was in
Mercurial). I'm therefore more motivated to make the tool useful for
day-to-day work on small repos, so I can use it myself. Until now, I
had been more focused on improving performance when it was used as a
read-only client for medium-to-large repos.

One important feature for my day-to-day work is support for
ignores. This commit adds simple and effective, but somewhat hacky
support for that. libgit2 requires a repo to check if a file should be
ignored (presumably so it can respect `.git/info/exclude`). To work
around that, we create a temporary git repo in `/tmp/` whenever the
working copy is committed. We set that temporary git repo's working
copy to be shared with our own working copy. Due to
https://github.com/libgit2/libgit2sharp/issues/1716 (which seems to
apply to the non-.NET version as well), this workaround unfortunately
leaves a .git file (pointing to the deleted temporary git repo) around
in every Jujube repo. That's always ignored by libgit2, so it's not
much of a problem.
2020-12-20 00:37:43 -08:00
Martin von Zweigbergk
9ad225b3b5 trees: make entries() function be the recursive one, since it's more common 2020-12-20 00:26:06 -08:00
Martin von Zweigbergk
8ec100713d tree: for walking tree, replace function with callback by iterator
Iterators are a lot easier to use.
2020-12-20 00:16:05 -08:00
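A minimal sketch of a recursive tree walk exposed as an iterator rather than a callback, assuming a toy in-memory tree. The `Entry` and `FileIter` types are hypothetical and much simpler than the real tree objects, which are read from the store.

```rust
use std::collections::btree_map;
use std::collections::BTreeMap;

/// Toy tree entry: either a file with contents or a nested directory.
pub enum Entry {
    File(String),
    Dir(BTreeMap<String, Entry>),
}

/// Depth-first iterator over all files below a directory, yielding
/// (full path, contents) pairs.
pub struct FileIter<'a> {
    /// One (path prefix, remaining entries) frame per directory level.
    stack: Vec<(String, btree_map::Iter<'a, String, Entry>)>,
}

impl<'a> FileIter<'a> {
    pub fn new(root: &'a BTreeMap<String, Entry>) -> Self {
        FileIter {
            stack: vec![(String::new(), root.iter())],
        }
    }
}

impl<'a> Iterator for FileIter<'a> {
    type Item = (String, &'a str);

    fn next(&mut self) -> Option<Self::Item> {
        while let Some((prefix, entries)) = self.stack.last_mut() {
            match entries.next() {
                // This directory is exhausted; resume its parent.
                None => {
                    self.stack.pop();
                }
                Some((name, entry)) => {
                    let path = format!("{}{}", prefix, name);
                    match entry {
                        Entry::File(contents) => return Some((path, contents.as_str())),
                        // Descend into the subdirectory on the next loop turn.
                        Entry::Dir(children) => self.stack.push((path + "/", children.iter())),
                    }
                }
            }
        }
        None
    }
}
```

With an iterator, callers can use `for` loops, `filter`, `collect` and early `break`, all of which are awkward to express through a callback.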
Martin von Zweigbergk
4734eb6493 working_copy: let WorkingCopy and TreeState have the working copy path
I don't know why I didn't do it this way from the beginning.
2020-12-18 23:56:32 -08:00
Martin von Zweigbergk
14e7df995a index: move static functions from Index to IndexFile and delete it
The Index struct no longer has any state, so it's not needed.
2020-12-18 16:12:45 -08:00
Martin von Zweigbergk
00fb670c9c index: make Index::load() return Arc<IndexFile> instead of Index
This removes one level of indirection, which is nice because it was
visible to the callers. The `Index` struct is now empty. The next step
is obviously to delete it (and perhaps rename `IndexFile` to `Index`
or `ReadonlyIndex`).
2020-12-18 16:12:45 -08:00
Martin von Zweigbergk
af1760b02e index: load index file eagerly in Index::load() now that Repo::index() is lazy 2020-12-18 16:12:45 -08:00
Martin von Zweigbergk
6721b576b7 releasing: add copyright header also to generated files
This doesn't seem right to me, but it seems the release tool (`cross`)
requires it.
2020-12-15 18:36:23 -08:00
Martin von Zweigbergk
6b1427cb46 import commit 0f15be02bf4012c116636913562691a0aaa7aed2 from my hg repo 2020-12-12 00:23:38 -08:00