It helps to implement transparent conversion from &str to &Wrapped(str). We
could instead wrap the reference as Wrapped<'a>(&'a str), but it has various
drawbacks. Notably we can't implement Borrow and Deref because these traits
require a reference in return position.
Since the unsafe bits are pretty small, we can instead implement cast functions
without using the ref-cast crate. However, I believe we'll trust ref-cast more
than hand-crafted unsafe code.
https://crates.io/crates/ref-casthttps://docs.rs/ref-cast/1.0.20/ref_cast/attr.ref_cast_custom.html
This removes the last use of `ouroboros`. Since `TreeEntriesDirItem`
is only used in "legacy trees" (before tree-level conflicts), I didn't
bother to check the performance impact. I also didn't bother to check
the matcher before adding the entries to the list, instead leaving
that where it is in `Iterator::next()`.
During the transition to using more async code, I keep running into
https://github.com/rust-lang/futures-rs/issues/2090. Right now, I want
to convert `MergedTree::diff()` into a `Stream`. I don't want to
update all call sites at once, so instead I'm adding a
`MergedTree::diff_stream()` method, which just wraps
`MergedTree::diff()` in a `Stream. However, since the iterator is
synchronous, it needs to block on the async `Backend::read_tree()`
calls. If we then also block on the `Stream` in the CLI, we run into
the panic.
I've enabled the "index" component from the "basic" feature set, which would
be needed to implement colocated repo functionality. The doc suggests that
a library shouldn't activate "max-performance-safe", but our crate is also
an application so it would be okay to enable the feature. We'll need "parallel"
anyway to make GitBackend Sync.
https://docs.rs/gix/latest/gix/#feature-flags
The commit backend at Google is cloud-based (and so are the other
backends); it reads and writes commits from/to a server, which stores
them in a database. That makes latency much higher than for disk-based
backends. To reduce the latency, we have a local daemon process that
caches and prefetches objects. There are still many cases where
latency is high, such as when diffing two uncached commits. We can
improve that by changing some of our (jj's) algorithms to read many
objects concurrently from the backend. In the case of tree-diffing, we
can fetch one level (depth) of the tree at a time. There are several
ways of doing that:
* Make the backend methods `async`
* Use many threads for reading from the backend
* Add backend methods for batch reading
I don't think we typically need CPU parallelism, so it's wasteful to
have hundreds of threads running in order to fetch hundreds of objects
in parallel (especially when using a synchronous backend like the Git
backend). Batching would work well for the tree-diffing case, but it's
not as composable as `async`. For example, if we wanted to fetch some
commits at the same time as we were doing a diff, it's hard to see how
to do that with batching. Using async seems like our best bet.
I didn't make the backend interface's write functions async because
writes are already async with the daemon we have at Google. That
daemon will hash the object and immediately return, and then send the
object to the server in the background. I think any cloud-based
solution will need a similar daemon process. However, we may need to
reconsider this if/when jj gets used on a server with a custom backend
that writes directly to a database (i.e. no async daemon in between).
I've tried to measure the performance impact. That's the largest
difference I've been able to measure was on `jj diff
--ignore-working-copy -s --from v5.0 --to v6.0` in the Linux repo,
which increases from 749 ms to 773 ms (3.3%). In most cases I've
tested, there's no measurable difference. I've tried diffing from the
root commit, as well as `jj --ignore-working-copy log --no-graph -r
'::v3.0 & author(torvalds)' -T 'commit_id ++ "\n"'` (to test a
commit-heavy load).
If we're going to be able to replace most instances of `Tree` by
`MergedTree`, we'll need to be able to diff two `MergedTree`s. This
implements support for that. The implementation copies a lot from the
diff iterator we have for `Tree`. I suspect we should be able to reuse
some of the code by introducing some traits that can then be
implemented by both `Tree` and `MergedTree`. I've left a TODO about
that.
The VS Code "Better TOML" plugin (which I think most of our VS Code developers use?) doesn't support the `x.y = z` syntax at the top level, even though it's valid TOML.
This is also useful if we ever want to add additional properties in different sub-crates (although unlikely for the near future).
Summary: There's no need to go around specifying `rust-version` or `edition` or
`version` several times, now that we have a global workspace. Instead, inherit
workspace metadata from the top-level Cargo.toml file.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: Iaf905445978ed2b3377239dcdb8a6c32
Summary: This moves all dependencies across the jj-lib and jj-cli crates into
the top-level Cargo file; with that, we can change each crate instead to just
inherit the workspace version, with the toggled features enabled, by setting
a dependency such as:
dep.workspace = true
in the relevant Cargo.toml file.
This doesn't actually change any of the build semantics (from what I can tell)
nor the lockfile, and seems to respond normally. There are more cleanups that
can follow.
Two notes:
- Dependabot seems to work fine, based on what I've seen in other repos.
- `cargo add` doesn't seem to know how to add packages to a top-level
`workspace.dependencies` field; instead you can `cargo add -p jj-cli`
and move the entries, at least.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: I307827e5f15c0d8ea8e2a80ec793d3c7
This improves `jj status` time by a factor of ~2x on my machine (M1 Macbook Pro 2021 16-inch, uses an SSD):
```sh
$ hyperfine --parameter-list hash before,after --parameter-list repo nixpkgs,gecko-dev --setup 'git checkout {hash} && cargo build --profile release-with-debug' --warmup 3 './target/release-with-debug/jj -R ../{repo} st'
Benchmark 1: ./target/release-with-debug/jj -R ../nixpkgs st (hash = before)
Time (mean ± σ): 1.640 s ± 0.019 s [User: 0.580 s, System: 1.044 s]
Range (min … max): 1.621 s … 1.673 s 10 runs
Benchmark 2: ./target/release-with-debug/jj -R ../nixpkgs st (hash = after)
Time (mean ± σ): 760.0 ms ± 5.4 ms [User: 812.9 ms, System: 2214.6 ms]
Range (min … max): 751.4 ms … 768.7 ms 10 runs
Benchmark 3: ./target/release-with-debug/jj -R ../gecko-dev st (hash = before)
Time (mean ± σ): 11.403 s ± 0.648 s [User: 4.546 s, System: 5.932 s]
Range (min … max): 10.553 s … 12.718 s 10 runs
Benchmark 4: ./target/release-with-debug/jj -R ../gecko-dev st (hash = after)
Time (mean ± σ): 5.974 s ± 0.028 s [User: 5.387 s, System: 11.959 s]
Range (min … max): 5.937 s … 6.024 s 10 runs
$ hyperfine --parameter-list repo nixpkgs,gecko-dev --warmup 3 'git -C ../{repo} status'
Benchmark 1: git -C ../nixpkgs status
Time (mean ± σ): 865.4 ms ± 8.4 ms [User: 119.4 ms, System: 1401.2 ms]
Range (min … max): 852.8 ms … 879.1 ms 10 runs
Benchmark 2: git -C ../gecko-dev status
Time (mean ± σ): 2.892 s ± 0.029 s [User: 0.458 s, System: 14.244 s]
Range (min … max): 2.837 s … 2.934 s 10 runs
```
Conclusions:
- ~2x improvement from previous `jj status` time.
- Slightly faster than Git on nixpkgs.
- Still 2x slower than Git on gecko-dev, not sure why.
For reference, Git's default number of threads is defined in the `online_cpus` function: ee48e70a82/thread-utils.c (L21-L66). We are using whatever the Rayon default is.
Summary: Let's be more aggressive about tracking the latest stable Rust release.
There's little benefit to being conservative so early on, especially when no
users seem to have faced any issue with upgrading, or strictly required an old
Rust version.
Right now, just lagging Rust by 1 major release probably seems fine. We're
targeting 1.71.0 to get ahead of the curve, since 1.72.0 will likely release
sometime before the next `jj` release.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Change-Id: I4e691b6ba63b5b9023a75ae0a6917672
Almost everyone calls the project "jj", and there seeems to be
consensus that we should rename the crates. I originally wanted the
crates to be called `jj` and `jj-lib`, but `jj` was already
taken. `jj-cli` is probably at least as good for it anyway.
Once we've published a 0.8.0 under the new names, we'll release 0.7.1
versions under the old names with pointers to the new crates names.