On Windows, it seems that you can't rename a file if the target file
is open (Stebalien/tempfile#131). I think that's the reason for our
failing tests on Windows. This patch adds a simple wrapper around
`NamedTempFile::persist()` that returns the existing file instead of
failing, if there is one.
This is to address issue #8. I haven't added the optimization to avoid
walking all the files in `target/` yet. Even so, this patch still
speeds up `jj st` in this repo, with ~13k files in `target/`, from
~320 ms to ~100 ms (-5.1dB). The time actually checking if paths match
gitignores seems to go down from 116 ms to 6 ms. I think that's mostly
because libgit2 has to look for `.gitignore` files in every parent
directory every time we ask it about a file, while the rewritten code
looks for a `.gitignore` file only when visiting a new directory.
When rendering a non-contiguous subset of the commits, we want to
still show the connections between the commits in the graph, even
though they're not directly connected. This commit introduces an
adaptor for the revset iterators that also yield the edges to show in
such a simplified graph.
This has no measurable impact on `jj log -r ,,v2.0.0` in the git.git
repo.
The output of `jj log -r 'v1.0.0 | v2.0.0'` now looks like this:
```
o e156455ea491 e156455ea491 gitster@pobox.com 2014-05-28 11:04:19.000 -07:00 refs/tags/v2.0.0
:\ Git 2.0
: ~
o c2f3bf071ee9 c2f3bf071ee9 junkio@cox.net 2005-12-21 00:01:00.000 -08:00 refs/tags/v1.0.0
~ GIT 1.0.0
```
Before this commit, it looked like this:
```
o e156455ea491 e156455ea491 gitster@pobox.com 2014-05-28 11:04:19.000 -07:00 refs/tags/v2.0.0
| Git 2.0
| o c2f3bf071ee9 c2f3bf071ee9 junkio@cox.net 2005-12-21 00:01:00.000 -08:00 refs/tags/v1.0.0
| |\ GIT 1.0.0
```
The output of `jj log -r 'git_refs()'` in the git.git repo is still
completely useless (it's >350k lines and >500MB of data). I think
that's because we don't filter out edges to ancestors that we have
transitive edges to. Mercurial also doesn't filter out such edges, but
Git (with `--simplify-by-decoration`) seems to filter them out. I'll
change it soon so we filter them out.
The current diff algorithm does a full LCS on the words of the texts,
which is really slow. Diffing the working copy when e.g.
`src/commands.py` has changes far apart takes seconds. This patch adds
an implementation inspired by JGit's Histogram diff. I say "inspired"
because I just didn't quite understand it :P In particular, I didn't
understand what it does when it finds non-unique elements. I decided
to line up the leading common elements on both sides of the merge. I
don't know if that usually gives good enough results in practice.
I'm sure this can still be optimized a lot, but this seems good enough
as a start. There is also many things to improve about the quality of
the diffs.
This patch introduces a new `IndexStore` struct. The idea is that it
will know about the directory in which the index files are stored, the
associations with operations. It may also cache `Arc<ReadonlyIndex>`
instances so if multiple `ReadonlyIndex` instances are loaded, they
can be returned from the cache. That may be useful when merging
operations because the operations are likely to share a large parent
index file. For now, however, all the new type has is `init()`,
`load()`, and `reinit()`.
We currently need to read the commit objects for finding common
ancestors. That can be very slow when the common ancestor is far back
in history. This patch adds a function for finding common ancestors
using the index instead.
Unlike the current algorithm, which only returns one common ancestor,
the new index-based one correctly handles criss-cross merges.
Here are some timings for finding the common ancestors in the git.git
repo:
| Without index | With Index |
| First run | Subsequent | First run | Subsequent |
v2.30.0-rc0 v2.30.0-rc1 | 5.68 ms | 5.94 us | 40.3 us | 4.77 us |
v2.25.4 v2.26.1 | 1.75 ms | 1.42 us | 13.8 ms | 4.29 ms |
v1.0.0 v2.0.0 | 492 ms | 2.79 ms | 23.4 ms | 6.41 ms |
Finding ancestors of v2.25.4 and v2.26.1 got much slower because the
new algorithm finds all common ancestors. Therefore, it also finds
v2.24.2, v2.23.2, v2.22.3, v2.21.2, v2.20.3, v2.19.4, v2.18.3, and
v2.17.4, which it then filters out because they're all ancestors of
v2.25.3.
Also note that the result was incorrect before, because the old
algorithm would return as soon as it had found a common ancestor, even
if it's not the latest common ancestor. For example, for the common
ancestor between v1.0.0 and v2.0.0, it returned an ancestor of v1.0.0
because it happened to get there by following some side branch that
led there more quickly.
The only place we currently need to find the common ancestor is when
merging trees, which we only do when the user runs `jj merge`, as well
as when operating on existing merge commits (e.g. to diff or rebase
them). That means that this change won't be very noticeable. However,
it's something we clearly want to do sooner or later, so we might as
well get it done.
I had tried to generate the protobuf code at build time many months
ago, but decided against it because it slowed down the build too
much. I didn't realize there was the
"cargo:rerun-if-changed=<filename>" feature that time. Given that that
exists, it seems like an obvious win to generate the source code at
build time.
I put the generated sources in `$OUT_DIR` (where [1] says they should
be), then include them in the `protos` module by using the `include!`
macro. The biggest problem with that is that I couldn't get IntelliJ
to understand it, even after enabling the experimental features
described in [2].
[1] https://doc.rust-lang.org/cargo/reference/build-script-examples.html#code-generation
[2] https://github.com/intellij-rust/intellij-rust/issues/1908#issuecomment-592773865