Commit graph

4664 commits

Author SHA1 Message Date
Yuya Nishihara
09987c1d27 merge: micro-optimize allocation of Merge object for resolved value
It's super common that a Merge object holds a resolved value, so let's inline
up to 1 element. T of Merge<T> usually consists of a couple of pointer-sized
fields. I don't see any measurable speed up, but it's no worse than the
original.
2023-11-07 17:10:12 +09:00
Martin von Zweigbergk
1140295829 merged_tree: extract polling of tree futures into a function 2023-11-07 00:03:50 -08:00
Martin von Zweigbergk
c77417d4e4 merged_tree: drop outer loop in TreeDiffStreamImpl::poll_next()
As suggested by Yuya. I also added a comment and an assertion in the
case where return `Poll::Pending`.
2023-11-07 00:03:50 -08:00
Martin von Zweigbergk
d989d4093d merged_tree: let backend influence whether to use new diff algo
Since the concurrent diff algorithm is significantly slower when using
the Git backend, I think we'll have to use switch between the two
algorithms depending on backend. Even if the concurrent version always
performed as well as the sequential version, exactly how concurrent it
should be probably still depends on the backend. This commit therefore
adds a function to the `Backend` trait, so each backend can say how
much concurrency they deal well with. I then use that number for
choosing between the sequential and concurrent versions in
`MergedTree::diff_stream()`, and also to decide the number of
concurrent reads to do in the concurrent version.
2023-11-06 23:12:02 -08:00
Martin von Zweigbergk
f40adb84fc merged_tree: add a Stream for concurrent diff off trees
When diffing two trees, we currently start at the root and diff those
trees. Then we diff each subtree, one at a time, recursively. When
using a commit backend that uses remote storage, like our backend at
Google does, diffing the subtrees one at a time gets very slow. We
should be able to diff subtrees concurrently. That way, the number of
roundtrips to a server becomes determined by the depth of the deepest
difference instead of by the number of differing trees (times 2,
even). This patch implements such an algorithm behind a `Stream`
interface. It's not hooked in to `MergedTree::diff_stream()` yet; that
will happen in the next commit.

I timed the new implementation by updating `jj diff -s` to use the new
diff stream and then ran it on the Linux repo with `jj diff
--ignore-working-copy -s --from v5.0 --to v6.0`. That slowed down by
~20%, from ~750 ms to ~900 ms. Maybe we can get some of that
performance back but I think it'll be hard to match
`MergedTree::diff()`. We can decide later if we're okay with the
difference (after hopefully reducing the gap a bit) or if we want to
keep both implementations.

I also timed the new implementation on our cloud-based repo at
Google. As expected, it made some diffs much faster (I'm not sure if
I'm allowed to share figures).
2023-11-06 23:12:02 -08:00
Martin von Zweigbergk
9af09ec236 test_meregd_tree: test diffing with a matcher
We didn't have any tests at all for `MergedTree::diff()` with a
matcher other than `EverythingMatcher`. This patch adds a few.
2023-11-06 23:12:02 -08:00
Martin von Zweigbergk
16aa8e8f10 test_merged_tree: nest each part of test_diff_dir_file()
I'm about to add a few more checks for diffing with a matcher. I think
it will help make it readable and reduce the risk of mixing up
variables between each part of the test if we use some nested blocks.

I also removed some unnecessary `.clone()` calls while at it.
2023-11-06 23:12:02 -08:00
Martin von Zweigbergk
c9ce80a82a merged_tree: extract function for merged iterator of basenames in diff
I'm going to reuse this for stream/async diffing.
2023-11-06 23:12:02 -08:00
Martin von Zweigbergk
b72f04ba61 merged_tree: rename all_tree_conflict_names() since it's not about conflicts 2023-11-06 23:12:02 -08:00
Ilya Grigoriev
043f786fbf website: Stop mike from always changing sitemaps.xml.gz
Originally, my motivation was to try again to get `mike` to not push empty
commits (which this should do). I'm now reconsidering this, since *not* pushing
empty commits will make the output of the CI job a little harder to read. If
this becomes an issue, I  might even add `--allow-empty` to the `mike`
invocations later.

A more important motivation is that even for a 400-byte file, changing it for
every PR blows up the size of the repo eventually.

The cause for the changes to this file was that `gzip` stores a timestamp
inside the `.gz` file.
2023-11-06 17:10:27 -08:00
Ilya Grigoriev
ed245e234d poetry: Update pyproject.toml to use newer convention
The previous commit checks that Poetry down to version 1.3.2 (current Debian
stable version) support it.
2023-11-06 17:10:27 -08:00
Ilya Grigoriev
61cb38a512 poetry: create a CI with Debian stable's version of poetry.
This is mainly for our own information. It doesn't have to be a required check.
2023-11-06 17:10:27 -08:00
Ilya Grigoriev
745f5b7f0e poetry: Poetry 1.7 issues
1. Add --no-root to poetry invocations. Poetry 1.7 displays an error otherwise
(though things still work)

https://github.com/orgs/python-poetry/discussions/8622
https://github.com/python-poetry/poetry/issues/1132

2. Document https://github.com/python-poetry/poetry/issues/8623
2023-11-06 17:10:27 -08:00
Yuya Nishihara
3fddc31da8 merge: remove Merge::take() which is no longer used
Merge::take() is no longer a cheap function. We can add into_vec() if needed.
2023-11-07 06:52:35 +09:00
Yuya Nishihara
92dfe59ade refs: run non-trivial merge of ref targets without destructuring Merge object 2023-11-07 06:52:35 +09:00
Yuya Nishihara
93601541cb refs: use swap_remove() in non-trivial merge of ref targets
I'm going to add a Merge method that removes negative/positive terms pair, and
swap_remove() is the easiest option. The order of the conflicted ref targets
doesn't matter.
2023-11-07 06:52:35 +09:00
Yuya Nishihara
e4b0e629c9 merge_tools: borrow file contents without consuming Merge object 2023-11-07 06:52:35 +09:00
Yuya Nishihara
895bbce8c0 files: use borrowed Merge iterator in merge()
Since the underlying Merge data type is no longer (Vec<T>, Vec<T>), it doesn't
make sense to build removes/adds Vecs and concatenate them.
2023-11-07 06:52:35 +09:00
dependabot[bot]
f344651d2c cargo: bump the cargo-dependencies group with 1 update
Bumps the cargo-dependencies group with 1 update: [libc](https://github.com/rust-lang/libc).

- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.149...0.2.150)

---
updated-dependencies:
- dependency-name: libc
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: cargo-dependencies
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-11-06 08:49:54 -08:00
Antoine Cezar
afc2382833 cli: move description utils to description_util.rs 2023-11-06 17:49:02 +01:00
Yuya Nishihara
f1898a31b5 merge: simply print interleaved conflict values in debug output
We could apply that for the resolved case, but Resolved/Conflicted label
seems more useful than just printing Merge([value]).
2023-11-06 07:21:06 +09:00
Yuya Nishihara
b07b370ed3 merge: simply generate content hash from interleaved values 2023-11-06 07:21:06 +09:00
Yuya Nishihara
46ffb2f0b2 merge: store negative/positive terms internally in an interleaved Vec
Many callers use interleaved iterators, and recently-added serialization code
is built on top of that, so I think it's better to store terms in that format.

map() functions no longer use MergeBuilder as we know the mapped values are
ordered properly. flatten() and simplify() are reimplemented to work with the
interleaved values. The other changes are trivial.
2023-11-06 07:21:06 +09:00
Yuya Nishihara
287728fee7 merge: extract trivial_merge() that takes interleaved adds/removes iterator
The Merge type will store interleaved terms instead of separate adds/removes
vecs.
2023-11-06 07:21:06 +09:00
Yuya Nishihara
01523ba4f3 merge: rewrite bottom half of trivial_merge() for non-copyable types
The input values of trivial_merge() will be changed to Iterator<Item = T>
where T: Eq + Hash. It could be <Item = &'a T>, but it doesn't have to be.
2023-11-06 07:21:06 +09:00
Yuya Nishihara
d8a7822c38 merge_tools: do not follow or change permission of symlinks
Fixes #2525.
2023-11-06 07:20:52 +09:00
Austin Seipp
56ba801478 readme: improved frontmatter, part 2
Summary: What was going to just be some minor touch-ups to the existing content
ended in another rework of the frontmatter, this time primarily the sales pitch
and basic feature explanation.

The motivation here is simple: you should not just encounter a three-word noun
that is a hyperlink to pages with 1,000 words actually explaining the three-word
noun itself is. It's jarring!

Instead, the frontmatter is longer, expanding on each major selling point and
similarity to other tools. It actually *describes* the important, distinct
design decisions that tell you what the tool is and does, rather than just link
you around a bunch.

For example, one immediate thing is that calling jj a "DVCS" is actually kind
of odd when it later becomes apparent that you can have multiple data model and
commit backends; Google for example uses it in a more centralized manner than
others would via Piper/CitC. Calling it a "DVCS" is a bit strange in this sense
when *really* what we mean is that the Git data model allows independent copies
of the repo.

Overall I think this is *much* better for people who are just going to see the
README and may or may not bounce off immediately.

Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: I9f0f78e56157ef434ec239710e00f3bd
2023-11-05 12:34:21 -06:00
Martin von Zweigbergk
9a5c19a3c6 contributing.md: suggest making the gh-pages branch immutable 2023-11-05 08:07:02 -08:00
Martin von Zweigbergk
7c923514ee git: add config to disable abandoning of unreachable commits
Some users prefer to have commits not get abandoned when importing
refs. This adds a config option for that.

Closes #2504.
2023-11-05 06:10:54 -08:00
Martin von Zweigbergk
7bf8906f9c git: extract a function for abandoning unreachable commits
This motivation for this is so we can easily skip calling the function
if the user has opted out of the propagation of abandoned commits we
usually do (#2504). However, it seems like a good piece of code to
extract regardless of that feature.
2023-11-05 06:10:54 -08:00
Yuya Nishihara
d9fbf21794 merge: have Merge::adds()/removes() return iterator
The Merge type will be changed to store interleaved values internally.
2023-11-05 16:43:06 +09:00
Yuya Nishihara
1c6913d618 merge: use Merge::iter() instead of adds()/removes() where order doesn't matter
Merge::iter() will be a slice::Iter, and be more efficient than chaining adds
and removes.
2023-11-05 16:43:06 +09:00
Yuya Nishihara
99e6ff493a merge: fix copy-paste error in doc comment for adds() 2023-11-05 16:43:06 +09:00
Yuya Nishihara
f6d85c51cd merge: add non-optional Merge accessor to the zeroth value
We have a few callers which just need to obtain an object common among all
the merge values. Let's add a non-failing accessor for that purpose.
2023-11-05 16:43:06 +09:00
Yuya Nishihara
b12c688ea0 merge: add method for indexed adds/removes access
The current adds()/removes() will be changed to return iterators.
2023-11-05 16:43:06 +09:00
Martin von Zweigbergk
6a5615c933 rewrite: use MergedTree::diff_stream() when restoring from tree 2023-11-04 21:07:49 -07:00
Martin von Zweigbergk
5b02987012 merge tools: use MergedTree::diff_stream() 2023-11-04 21:07:49 -07:00
Yuya Nishihara
602b44258e workspace: add function that initializes colocated git repository
One less git2 API use in CLI.

The function name GitBackend::init_colocated() is a bit odd, but we need to
specify the work-tree path, not the ".git" repo path. So we can't eliminate
the notion of the working copy path anyway.
2023-11-05 08:48:35 +09:00
Yuya Nishihara
77e16243d6 tests: assert paths of initialized GitBackend 2023-11-05 08:48:35 +09:00
Yuya Nishihara
ce46c10c96 git_backend: extract inner function that initializes backend with open git repo 2023-11-05 08:48:35 +09:00
Yuya Nishihara
dce640aaf1 workspace: one less cloning of workspace_root in init_external_git()
Just a trivial code cleanup.
2023-11-05 08:48:35 +09:00
Yuya Nishihara
c866b4a42d workspace: fix repository path in init_internal_git() doc comment
Also rephrased "Git backend" as "Git repo" since the new backend storage will
be created.
2023-11-05 08:48:35 +09:00
Austin Seipp
2401bf9bcd nix: include darwin deps inside devShell, too
Summary: Without `devShell` providing the needed Darwin-specific inputs, `cargo
build` does not work inside a `nix develop` or `direnv` environment; libgit2 in
particular fails on being able to find the Security framework.

The actual `nix build` invocation however *does* work because we correctly
include those dependencies in the package `buildInputs`. So just factor them
out, and use them in both places.

Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: I484bf381ca31c29c4c39fb6d184bdd21
2023-11-04 14:35:39 -05:00
Ilya Grigoriev
5fc649cbee website: upgrade mike to version 2.0
https://github.com/jimporter/mike/releases/tag/v2.0.0

The main immediate advantage of this is that `mike` will stop pushing empty
commits.

Also, we can consider switching to using symlinks instead of redirects for
mapping the "latest" version to "v0.11.0". This would make
`https://martinvonz.github.io/jj/latest/` have the same content as
`https://martinvonz.github.io/jj/v0.11.0/` (until the next version is out), but
the user would see `latest` in the URL.

For now, I set an option to keep using redirects.

I did a bit of non-exhaustive testing; it seems to work.
2023-11-04 12:23:16 -07:00
Austin Seipp
4a6ebd0223 nix: fix the rustfmt version by using nightly
Summary: Apparently this was broken. Maybe I broke it. Maybe something upstream
changed and caused a regression. But without it, we get the stable `rustfmt` in
the `nix develop` shell environment, not the nightly version. Fix it.

Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: I602ed8e5691c4d48f8db575d62624955
2023-11-04 13:51:15 -05:00
Philip Metzger
8e4b6dfeaa run: Teach run to resolve revsets and about jobs.
This also adds `jobs`, the argument reading the thread count to use and `shell_command`.
While we're at it, make `execute` a no-op and teach `run` to resolve the passed revsets. 
I also fixed my misunderstanding of `Clap` which makes 
`jj run 'echo hello world' -r 'mine() & ~origin@remote' --jobs 4` parse correctly. 

Also contains a small fix in the `pre-commit` example for it.
2023-11-04 16:29:06 +01:00
Austin Seipp
17bcac6838 cli: move resolve_destination_revs to cli_utils and rename
Summary: This is currently used by `new.rs`, `workspace.rs`, and `rebase.rs`,
and may be useful for other commands and custom CLIs. So just go ahead and move
it into the parent module hierarchy.

Also rename the function to `resolve_all_revs`, as it isn't actually specific to
rebase at all.

Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: I0ea12afd8107f95a37a91340820221a0
2023-11-04 10:26:08 -05:00
Austin Seipp
e1193db4cf cli: support multiple --revision arguments to workspace add
Summary: A natural extension of the existing support, as suggested by Scott
Olson. Closes #2496.

Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: I91c9c8c377ad67ccde7945ed41af6c79
2023-11-04 10:26:08 -05:00
Ilya Grigoriev
e701b08f42 poetry: update mkdocs-material in pyproject.toml
Considering the previous commit, it's unclear how compatible we are
with previous mkdocs-material versions.
2023-11-03 19:15:37 -07:00
Ilya Grigoriev
f62293d218 poetry: run poetry update, remove warnings from new mkdocs-material 2023-11-03 19:15:37 -07:00