Since the concurrent diff algorithm is significantly slower when using
the Git backend, I think we'll have to switch between the two
algorithms depending on backend. Even if the concurrent version always
performed as well as the sequential version, exactly how concurrent it
should be probably still depends on the backend. This commit therefore
adds a function to the `Backend` trait, so each backend can say how
much concurrency it deals well with. I then use that number for
choosing between the sequential and concurrent versions in
`MergedTree::diff_stream()`, and also to decide the number of
concurrent reads to do in the concurrent version.
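
As a rough illustration of the selection logic described above, here is a
minimal sketch. The trait is a simplified stand-in, and the method name
`concurrency()`, the two backend structs, the `DiffStrategy` enum, and the
numbers returned are all assumptions made for the example, not the actual
API.

```rust
/// Hypothetical, simplified stand-in for the real `Backend` trait.
trait Backend {
    /// How many concurrent requests this backend deals well with.
    /// (Method name assumed for illustration.)
    fn concurrency(&self) -> usize;
}

/// Stand-in for a local backend where concurrency does not pay off.
struct LocalBackend;
/// Stand-in for a backend that talks to remote storage.
struct RemoteBackend;

impl Backend for LocalBackend {
    fn concurrency(&self) -> usize {
        1 // assumed value
    }
}

impl Backend for RemoteBackend {
    fn concurrency(&self) -> usize {
        100 // assumed value
    }
}

#[derive(Debug, PartialEq)]
enum DiffStrategy {
    Sequential,
    Concurrent { max_reads: usize },
}

/// Picks the sequential algorithm for backends that report no useful
/// concurrency, and otherwise reuses the reported number as the cap on
/// concurrent reads.
fn choose_strategy(backend: &dyn Backend) -> DiffStrategy {
    match backend.concurrency() {
        0 | 1 => DiffStrategy::Sequential,
        n => DiffStrategy::Concurrent { max_reads: n },
    }
}
```

With these assumed values, `choose_strategy(&LocalBackend)` selects the
sequential version and `choose_strategy(&RemoteBackend)` selects the
concurrent one with up to 100 reads in flight.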
When diffing two trees, we currently start at the root and diff those
trees. Then we diff each subtree, one at a time, recursively. When
using a commit backend that uses remote storage, like our backend at
Google does, diffing the subtrees one at a time gets very slow. We
should be able to diff subtrees concurrently. That way, the number of
roundtrips to a server becomes determined by the depth of the deepest
difference instead of by the number of differing trees (times 2,
even). This patch implements such an algorithm behind a `Stream`
interface. It's not hooked in to `MergedTree::diff_stream()` yet; that
will happen in the next commit.
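
To make the roundtrip argument concrete, here is a toy model (not the
actual implementation). `DiffNode` is a hypothetical type that represents
one pair of differing trees, and the model assumes one roundtrip per tree
read and that all reads at the same depth can be issued together.

```rust
/// Toy model: one node per pair of trees that differ between the two sides.
struct DiffNode {
    children: Vec<DiffNode>,
}

/// Number of differing trees in the whole hierarchy.
fn count(node: &DiffNode) -> usize {
    1 + node.children.iter().map(count).sum::<usize>()
}

/// Depth of the deepest difference (the root counts as depth 1).
fn depth(node: &DiffNode) -> usize {
    1 + node.children.iter().map(depth).max().unwrap_or(0)
}

/// One at a time: every differing tree is read on both sides, in sequence.
fn sequential_roundtrips(root: &DiffNode) -> usize {
    2 * count(root)
}

/// Concurrently: all reads at the same depth are in flight together.
fn concurrent_roundtrips(root: &DiffNode) -> usize {
    depth(root)
}

/// A root with three differing subtrees, one of which has two more.
fn example_tree() -> DiffNode {
    let leaf = || DiffNode { children: vec![] };
    DiffNode {
        children: vec![
            leaf(),
            DiffNode { children: vec![leaf(), leaf()] },
            leaf(),
        ],
    }
}
```

For `example_tree()`, which has 6 differing trees and a deepest difference
at depth 3, the model gives 12 sequential roundtrips versus 3 concurrent
ones.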
I timed the new implementation by updating `jj diff -s` to use the new
diff stream and then ran it on the Linux repo with `jj diff
--ignore-working-copy -s --from v5.0 --to v6.0`. That slowed down by
~20%, from ~750 ms to ~900 ms. Maybe we can get some of that
performance back but I think it'll be hard to match
`MergedTree::diff()`. We can decide later if we're okay with the
difference (after hopefully reducing the gap a bit) or if we want to
keep both implementations.
I also timed the new implementation on our cloud-based repo at
Google. As expected, it made some diffs much faster (I'm not sure if
I'm allowed to share figures).
I'm about to add a few more checks for diffing with a matcher. I think
it will help make it readable and reduce the risk of mixing up
variables between each part of the test if we use some nested blocks.
I also removed some unnecessary `.clone()` calls while at it.
Originally, my motivation was to try again to get `mike` to not push empty
commits (which this should do). I'm now reconsidering this, since *not* pushing
empty commits will make the output of the CI job a little harder to read. If
this becomes an issue, I might even add `--allow-empty` to the `mike`
invocations later.
A more important motivation is that even for a 400-byte file, changing it for
every PR blows up the size of the repo eventually.
The cause for the changes to this file was that `gzip` stores a timestamp
inside the `.gz` file.
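
For reference, the timestamp lives in the fixed gzip header defined by
RFC 1952: bytes 4 to 7 hold MTIME as a little-endian 32-bit value. The
helper below is a hypothetical sketch for inspecting it and is not part of
this change.

```rust
/// Returns the MTIME field of a gzip header, or `None` if the input does
/// not start with a complete gzip header.
fn gzip_mtime(bytes: &[u8]) -> Option<u32> {
    // RFC 1952 layout: ID1 (0x1f), ID2 (0x8b), CM, FLG, MTIME (4 bytes,
    // little-endian), XFL, OS. The fixed part is 10 bytes long.
    if bytes.len() < 10 || bytes[0] != 0x1f || bytes[1] != 0x8b {
        return None;
    }
    Some(u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]))
}
```

Two archives of identical input that were compressed at different times
differ in these four bytes, which is enough for Git to record a change.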
I'm going to add a Merge method that removes a negative/positive term pair, and
swap_remove() is the easiest option. The order of the conflicted ref targets
doesn't matter.
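
A minimal sketch of why `swap_remove()` is convenient here. The function
name and the separate `removes`/`adds` vectors are assumptions for
illustration and do not mirror the real method.

```rust
/// Removes the first value that appears in both `removes` and `adds`.
/// Returns `true` if a pair was cancelled. The relative order of the
/// remaining terms may change.
fn cancel_one_pair<T: PartialEq>(removes: &mut Vec<T>, adds: &mut Vec<T>) -> bool {
    // Find the indices first, so no iterator is alive while we mutate.
    let found = removes.iter().enumerate().find_map(|(ri, r)| {
        adds.iter().position(|a| a == r).map(|ai| (ri, ai))
    });
    match found {
        Some((ri, ai)) => {
            // O(1) removal: the last element is moved into the hole.
            removes.swap_remove(ri);
            adds.swap_remove(ai);
            true
        }
        None => false,
    }
}
```

`Vec::swap_remove()` fills the gap with the last element instead of
shifting everything down, so it is cheap but reorders the vector, which is
acceptable only because the order of the terms does not matter.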
Many callers use interleaved iterators, and recently-added serialization code
is built on top of that, so I think it's better to store terms in that format.
map() functions no longer use MergeBuilder as we know the mapped values are
ordered properly. flatten() and simplify() are reimplemented to work with the
interleaved values. The other changes are trivial.
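
The idea of the interleaved layout can be sketched as follows. `Interleaved`
is a toy type, and the exact ordering (adds at even indices, removes at odd
indices) is an assumption made for the example.

```rust
/// Toy container storing terms as [add0, remove0, add1, ..., addN].
#[derive(Debug, PartialEq)]
struct Interleaved<T> {
    values: Vec<T>,
}

impl<T> Interleaved<T> {
    /// Builds the interleaved form from separate lists.
    /// Panics unless there is exactly one more add than removes.
    fn from_removes_adds(removes: Vec<T>, adds: Vec<T>) -> Self {
        assert_eq!(adds.len(), removes.len() + 1);
        let mut values = Vec::with_capacity(adds.len() + removes.len());
        let mut adds = adds.into_iter();
        // The assertion above guarantees at least one add.
        values.push(adds.next().unwrap());
        for (remove, add) in removes.into_iter().zip(adds) {
            values.push(remove);
            values.push(add);
        }
        Self { values }
    }

    /// Positive terms: every second value, starting at index 0.
    fn adds(&self) -> impl Iterator<Item = &T> {
        self.values.iter().step_by(2)
    }

    /// Negative terms: every second value, starting at index 1.
    fn removes(&self) -> impl Iterator<Item = &T> {
        self.values.iter().skip(1).step_by(2)
    }

    /// Mapping keeps every value in place, so no builder is needed to
    /// restore the ordering afterwards.
    fn map<U>(&self, f: impl FnMut(&T) -> U) -> Interleaved<U> {
        Interleaved {
            values: self.values.iter().map(f).collect(),
        }
    }
}
```

Because `map()` transforms the values position by position, the result is
already in interleaved order, which is the property the simplified `map()`
functions rely on.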
Summary: What was going to just be some minor touch-ups to the existing content
ended in another rework of the frontmatter, this time primarily the sales pitch
and basic feature explanation.
The motivation here is simple: you should not just encounter a three-word noun
that is a hyperlink to pages with 1,000 words explaining what the three-word
noun itself actually is. It's jarring!
Instead, the frontmatter is longer, expanding on each major selling point and
similarity to other tools. It actually *describes* the important, distinct
design decisions that tell you what the tool is and does, rather than just link
you around a bunch.
For example, one immediate thing is that calling jj a "DVCS" is actually kind
of odd when it later becomes apparent that you can have multiple data model and
commit backends; Google for example uses it in a more centralized manner than
others would via Piper/CitC. Calling it a "DVCS" is a bit strange in this sense
when *really* what we mean is that the Git data model allows independent copies
of the repo.
Overall I think this is *much* better for people who are just going to see the
README and may or may not bounce off immediately.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: I9f0f78e56157ef434ec239710e00f3bd
The motivation for this is so we can easily skip calling the function
if the user has opted out of the propagation of abandoned commits we
usually do (#2504). However, it seems like a good piece of code to
extract regardless of that feature.
One less git2 API use in CLI.
The function name GitBackend::init_colocated() is a bit odd, but we need to
specify the work-tree path, not the ".git" repo path. So we can't eliminate
the notion of the working copy path anyway.
Summary: Without `devShell` providing the needed Darwin-specific inputs, `cargo
build` does not work inside a `nix develop` or `direnv` environment; libgit2 in
particular fails to find the Security framework.
The actual `nix build` invocation however *does* work because we correctly
include those dependencies in the package `buildInputs`. So just factor them
out, and use them in both places.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: I484bf381ca31c29c4c39fb6d184bdd21
https://github.com/jimporter/mike/releases/tag/v2.0.0
The main immediate advantage of this is that `mike` will stop pushing empty
commits.
Also, we can consider switching to using symlinks instead of redirects for
mapping the "latest" version to "v0.11.0". This would make
`https://martinvonz.github.io/jj/latest/` have the same content as
`https://martinvonz.github.io/jj/v0.11.0/` (until the next version is out), but
the user would see `latest` in the URL.
For now, I set an option to keep using redirects.
I did a bit of non-exhaustive testing; it seems to work.
Summary: Apparently this was broken. Maybe I broke it. Maybe something upstream
changed and caused a regression. But without it, we get the stable `rustfmt` in
the `nix develop` shell environment, not the nightly version. Fix it.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: I602ed8e5691c4d48f8db575d62624955
This also adds `jobs`, the argument that reads the thread count to use, and `shell_command`.
While we're at it, make `execute` a no-op and teach `run` to resolve the passed revsets.
I also fixed my misunderstanding of `Clap`, so that
`jj run 'echo hello world' -r 'mine() & ~origin@remote' --jobs 4` now parses correctly.
Also contains a small fix in the `pre-commit` example for it.
Summary: This is currently used by `new.rs`, `workspace.rs`, and `rebase.rs`,
and may be useful for other commands and custom CLIs. So just go ahead and move
it into the parent module hierarchy.
Also rename the function to `resolve_all_revs`, as it isn't actually specific to
rebase at all.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: I0ea12afd8107f95a37a91340820221a0
Summary: A natural extension of the existing support, as suggested by Scott
Olson. Closes #2496.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: I91c9c8c377ad67ccde7945ed41af6c79
This adds two MkDocs extensions to make list handling more flexible.
It took some trial-and-error, but it seems this config works OK.
revsets.md: use saner formatting that is now possible.
sapling-comparison.md: this was the one case I saw made worse by the
new plugins. I changed the Markdown formatting; it still looks sane.
What makes rebase_to_dest_parent a good candidate for the jj_lib::rewrite module:
- It is used in both obslog and interdiff, a sign that it may belong in a lower layer.
- CommandError is only returned through an implicit conversion from TreeMergeError, never explicitly.
- It only uses jj_lib::rewrite functions.