ok/jj - ok.software

ok/jj

Author	SHA1	Message	Date
Anton Bulakh	e3a1e5b80e	sign: Implement storage for digital commit signatures Recognize signature metadata from git commit objects, implement a basic version of that for the native backend. Extract the signed data (a commit binary repr without the signature) to be verified later.	2023-11-12 03:37:13 +02:00
Yuya Nishihara	b42a69db6d	git_backend: configure committer (and author) of gix::Repository Otherwise, ref updates would fail if we port git::export_refs() to gitoxide. This change isn't strictly needed for the backend itself, but we'll reuse the gix::Repository instance created by the backend when importing and exporting Git refs.	2023-11-11 22:35:54 +09:00
Yuya Nishihara	ea32c0cb9e	git_backend: pass UserSettings to GitBackend constructors	2023-11-11 22:35:54 +09:00
Yuya Nishihara	e0c35684af	merge: rename Merge::new() to Merge::from_removes_adds() Since (removes, adds) pair is no longer the canonical representation of Merge, the name Merge::new() seems too generic. Let's give more verbose name.	2023-11-07 17:10:12 +09:00
Martin von Zweigbergk	d989d4093d	merged_tree: let backend influence whether to use new diff algo Since the concurrent diff algorithm is significantly slower when using the Git backend, I think we'll have to use switch between the two algorithms depending on backend. Even if the concurrent version always performed as well as the sequential version, exactly how concurrent it should be probably still depends on the backend. This commit therefore adds a function to the `Backend` trait, so each backend can say how much concurrency they deal well with. I then use that number for choosing between the sequential and concurrent versions in `MergedTree::diff_stream()`, and also to decide the number of concurrent reads to do in the concurrent version.	2023-11-06 23:12:02 -08:00
Yuya Nishihara	d9fbf21794	merge: have Merge::adds()/removes() return iterator The Merge type will be changed to store interleaved values internally.	2023-11-05 16:43:06 +09:00
Yuya Nishihara	b12c688ea0	merge: add method for indexed adds/removes access The current adds()/removes() will be changed to return iterators.	2023-11-05 16:43:06 +09:00
Yuya Nishihara	602b44258e	workspace: add function that initializes colocated git repository One less git2 API use in CLI. The function name GitBackend::init_colocated() is a bit odd, but we need to specify the work-tree path, not the ".git" repo path. So we can't eliminate the notion of the working copy path anyway.	2023-11-05 08:48:35 +09:00
Yuya Nishihara	ce46c10c96	git_backend: extract inner function that initializes backend with open git repo	2023-11-05 08:48:35 +09:00
Martin von Zweigbergk	24b706641f	async: switch to `pollster`'s `block_on()` During the transition to using more async code, I keep running into https://github.com/rust-lang/futures-rs/issues/2090. Right now, I want to convert `MergedTree::diff()` into a `Stream`. I don't want to update all call sites at once, so instead I'm adding a `MergedTree::diff_stream()` method, which just wraps `MergedTree::diff()` in a `Stream. However, since the iterator is synchronous, it needs to block on the async `Backend::read_tree()` calls. If we then also block on the `Stream` in the CLI, we run into the panic.	2023-11-03 08:15:10 -07:00
Yuya Nishihara	162dcd49b4	cli: rewrite base GitIgnoreFile lookup to use gitoxide instead of libgit2 Since gix::Repository::config_snapshot() borrows the repo instance, it has to be allocated in caller's stack. That's why GitBackend::git_config() is removed.	2023-11-02 19:33:06 +09:00
Yuya Nishihara	c88e69ad6f	git_backend: replace git2::Repository with gix::Repository My gut feeling is that gitoxide aims to be more transparent than libgit2. We'll need to know more about the underlying Git data model. Random comments on gix API: * gix::Repository provides API similar to git2::Repository, but has less "convenient" functions. For example, we need to use .find_object() + .try_to/into_<kind>() instead of .find_<kind>(). * gix::Object, Blob, etc. own raw data as bytes. gix::object and gix::objs types provide high-level views on such data. * Tree building is pretty low-level compared to git2. * gix leverages bstr (i.e. bytes) extensively. It's probably not difficult to migrate git::import/export_refs(). It might help eliminate the startup overhead of libssl initialization. The gix-based GitBackend appears to be a bit faster, but that wouldn't practically matter. #2316	2023-11-02 19:33:06 +09:00
Yuya Nishihara	f5a61dc2b7	git_backend: open just-initialized repo with canonicalized path Otherwise, the initialized repo could have a different work-dir path than the load()-ed one. libgit2 appears to do some normalization somewhere, but gix won't.	2023-11-02 19:33:06 +09:00
Yuya Nishihara	fd187d266f	git_backend: box GitBackendInit/LoadError up front These error enums will wrap gix error types, and will become bigger enough for clippy to complain.	2023-11-02 19:33:06 +09:00
Yuya Nishihara	1788b5014e	git_backend: remove redundant copy back of author timestamp Only the committer timestamp can be updated inside a loop.	2023-10-31 06:51:27 +09:00
Yuya Nishihara	f5aa739c70	git_backend: use .strip_suffix() instead of manual slicing	2023-10-31 06:51:27 +09:00
Yuya Nishihara	9bd84c55e0	git_backend: use file mode extensively in read_tree() Both filemode() and kind() are calculated from the same underlying data, and kind() is libgit2-specific API.	2023-10-31 06:51:27 +09:00
Yuya Nishihara	b3c9cab12d	git_backend: handle read_tree() lookup/encoding errors gracefully	2023-10-31 06:51:27 +09:00
Yuya Nishihara	847adc832f	git_backend: use lossy conversion to decode non-UTF-8 commit message If message() returned None, it doesn't mean the commit message is empty. I originally mapped it to an error, but that made import of linux repo fail. https://docs.rs/git2/latest/git2/struct.Commit.html#method.message	2023-10-31 06:51:27 +09:00
Yuya Nishihara	06c254e742	git_backend: use non-owned str::from_utf8() to decode symlink target Just for consistency with the other changes. str::Utf8Error is 2 words long, so I removed the boxing.	2023-10-31 06:51:27 +09:00
Yuya Nishihara	d1c71c05c9	git_backend: remove redundant error handling for invalid hash length The only error that could be returned by libgit2 is invalid hash length, and we check that explicitly. If we switch the backends to gitoxide, there will be panicking constructor. https://docs.rs/git2/latest/git2/struct.Oid.html#method.from_bytes	2023-10-31 06:51:27 +09:00
Martin von Zweigbergk	cfcdd71865	backend: make `read_conflict` synchronous again This avoids https://github.com/rust-lang/futures-rs/issues/2090. I don't think we need to worry about reading legacy conflicts asynchronously - async is really only useful for Google's backend right now, and we don't use the legacy format at Google. In particular, I don't want `MergedTree::value()` to have to be async.	2023-10-28 16:45:40 -07:00
Martin von Zweigbergk	578d61ec2e	git: use forward slashes in relative path to backing git repo Closes #2403.	2023-10-20 22:51:14 -05:00
Martin von Zweigbergk	f541f9f3a6	cleanup: import `futures::exectutor::block_on()` instead of qualifying It seems we'll end up using `block_on()` quite a bit, at least until we're done transitioning to async, and the function name doesn't conflict with anything else, so let's always import it when we need it.	2023-10-20 07:38:34 -07:00
Martin von Zweigbergk	f8be0b2030	backends: deduplicate definition of backend names I copied the example set by `DefaultSubmoduleStore`.	2023-10-14 06:38:35 -07:00
Martin von Zweigbergk	5174489959	backend: make read functions async The commit backend at Google is cloud-based (and so are the other backends); it reads and writes commits from/to a server, which stores them in a database. That makes latency much higher than for disk-based backends. To reduce the latency, we have a local daemon process that caches and prefetches objects. There are still many cases where latency is high, such as when diffing two uncached commits. We can improve that by changing some of our (jj's) algorithms to read many objects concurrently from the backend. In the case of tree-diffing, we can fetch one level (depth) of the tree at a time. There are several ways of doing that: * Make the backend methods `async` * Use many threads for reading from the backend * Add backend methods for batch reading I don't think we typically need CPU parallelism, so it's wasteful to have hundreds of threads running in order to fetch hundreds of objects in parallel (especially when using a synchronous backend like the Git backend). Batching would work well for the tree-diffing case, but it's not as composable as `async`. For example, if we wanted to fetch some commits at the same time as we were doing a diff, it's hard to see how to do that with batching. Using async seems like our best bet. I didn't make the backend interface's write functions async because writes are already async with the daemon we have at Google. That daemon will hash the object and immediately return, and then send the object to the server in the background. I think any cloud-based solution will need a similar daemon process. However, we may need to reconsider this if/when jj gets used on a server with a custom backend that writes directly to a database (i.e. no async daemon in between). I've tried to measure the performance impact. That's the largest difference I've been able to measure was on `jj diff --ignore-working-copy -s --from v5.0 --to v6.0` in the Linux repo, which increases from 749 ms to 773 ms (3.3%). In most cases I've tested, there's no measurable difference. I've tried diffing from the root commit, as well as `jj --ignore-working-copy log --no-graph -r '::v3.0 & author(torvalds)' -T 'commit_id ++ "\n"'` (to test a commit-heavy load).	2023-10-08 23:36:49 -07:00
Martin von Zweigbergk	bd5eef9c5e	git_backend: rename some `store` variables `backend` in tests This is to avoid confusion with instances of the `Store` type.	2023-10-08 23:36:49 -07:00
Yuya Nishihara	f0ad1f53ea	git_backend: on read_commit(), fall back to importing extras as needed One problematic scenario is that we have commits imported by old jj, and all of their descendant commits are created by jj. Therefore import_head_commits() wouldn't reach the old ancestor commits. This change might bury a real bug, but I don't have a better alternative. Maybe we can remove this hack after a couple of jj releases, and add a debug command that imports all reachable Git commits from all historical heads. Closes #2343	2023-10-07 02:08:36 +09:00
Yuya Nishihara	7c96cead34	git_backend: rename git_repo_clone() as it isn't just cloning, propagate error Since git2::Repository::open() will access to the filesystem, it can technically fail.	2023-10-04 00:04:24 +09:00
Yuya Nishihara	837dc4f47c	git_backend: rewrite remaining git_repo() callers, make it private While debugging git issues, I often ended up creating a deadlock by adding debug prints. It's also not obvious that git::export_refs() works even if the git_repo() has already been locked, whereas git::import_refs() wouldn't. Let's consolidate lock handling to the backend implementation.	2023-10-04 00:04:24 +09:00
Yuya Nishihara	0d63223dad	git_backend: proxy git2::Repository methods to manage lock scope internally Since git_repo() acquires Mutex, it's super easy to create a deadlock.	2023-10-04 00:04:24 +09:00
Martin von Zweigbergk	d575aaeca8	backend: move constant functions first `root_commit_id()`, `root_change_id()`, and `empty_tree_id()` were strangely ordered between `write_symlink()` and `read_tree().	2023-09-19 05:24:51 -07:00
Yuya Nishihara	d05aaf30b4	git: promote imported commits to tree-conflict format if configured	2023-09-08 21:54:43 +09:00
Yuya Nishihara	f744e5920b	git: create no-gc ref by GitBackend Since we now have an explicit method to import heads, it makes more sense to manage no-gc refs by the backend.	2023-09-08 21:54:43 +09:00
Yuya Nishihara	17f502c83a	git: do not import unknown commits by read_commit(), use explicit method call The main goal of this change is to enable tree-level conflict format, but it also allows us to bulk-import commits on clone/init. I think a separate method will help if we want to provide progress information, enable check for .jjconflict entries under certain condition, etc. Since git::import_refs() now depends on GitBackend type, it might be better to remove git_repo from the function arguments.	2023-09-08 21:54:43 +09:00
Martin von Zweigbergk	7e6930b56f	backend: remove last few instances of `MergedTreeId::as_legacy_tree_id()`	2023-08-30 06:17:21 -07:00
Martin von Zweigbergk	e4ba6a42fc	backends: store tree id conflicts as list with alternating signs Now that we have `Merge::iter()` and friends, it's simpler to store the tree ids in a single list.	2023-08-26 07:02:04 -07:00
Martin von Zweigbergk	fd4146d485	backend: use new enum for `Commit::root_tree` We currently represent the root tree id in a commit by `Merge<TreeId>` plus a boolean `uses_tree_conflict_format`. It's better to use an enum for that. That makes it harder to forget to check which type of tree it is, and it makes it impossible to store a legacy tree with multiple ids (as we could with `uses_tree_conflict_format=false`, `root_tree=Merge::new(...)`). Maybe more importantly, we're also going to want to pass around this information in most places where we currently pass a single `TreeId`, and passing two separate values would be annoying.	2023-08-26 07:02:04 -07:00
Martin von Zweigbergk	589e0db3c5	git_backend: remove unused proto field for resolved tree id We store resolved tree ids in the regular Git commit, so we we never ended up using the `resolved` variant in the `root_tree`.	2023-08-26 07:02:04 -07:00
Waleed Khan	134d85e635	backend: reduce `BackendError` size somewhat One of the error types that I later created embedded `BackendError`, but `clippy` complained that the size of the type was too large. This helps address that.	2023-08-23 21:11:15 -07:00
Emily Fox	2c88da02b4	git: teach backend to handle empty name and email strings	2023-08-18 17:22:59 -05:00
Martin von Zweigbergk	ef5f97f8d7	conflicts: move `Merge<T>` to `merge` module The `merge` module now seems like the obvious place for this type.	2023-08-06 22:08:09 +00:00
Martin von Zweigbergk	ecc030848d	conflicts: rename `Conflict<T>` to `Merge<T>` Since `Conflict<T>` can also represent a non-conflict state (a single term), `Merge<T>` seems like better name. Thanks to @ilyagr for the suggestion in https://github.com/martinvonz/jj/pull/1774#discussion_r1257547709 Sorry about the churn. It would have been better if I thought of this name before I introduced `Conflict<T>`.	2023-08-06 22:08:09 +00:00
Martin von Zweigbergk	006c764694	backend: learn to store tree-level conflicts Tree-level conflicts (#1624) will be stored as multiple trees associated with a single commit. This patch adds support for that in `backend::Commit` and in the backends. When the Git backend writes a tree conflict, it creates a special root tree for the commit. That tree has only the individual trees from the conflict as subtrees. That way we prevent the trees from getting GC'd. We also write the tree ids to the extra metadata table (i.e. outside of the Git repo) so we don't need to load the tree object to determine if there are conflicts. I also added new flag to `backend::Commit` indicating whether the commit is a new-style commit (with support for tree-level conflicts). That will help with the migration. We will remove it once we no longer care about old repos. When the flag is set, we know that a commit with a single tree cannot have conflicts. When the flag is not set, it's an old-style commit where we have to walk the whole tree to find conflicts.	2023-07-19 22:04:16 -07:00
Waleed Khan	54dba51a08	docs: warn about missing docs for `jj-lib` crate	2023-07-10 18:28:59 +03:00
Yuya Nishihara	5346bd734f	git_backend: translate io::Error of read_conflict() to ReadObject error This is the last place in Git backend where io::Error is magically converted to BackendError::Other.	2023-07-06 20:48:46 +09:00
Yuya Nishihara	4e4ca46998	git_backend: wrap TableStoreError to preserve source error object	2023-07-06 20:48:46 +09:00
Yuya Nishihara	cf8a0466c4	backend: introduce error types specific to init/load phases Errors that may occur while loading backend would vary per backends, and it's unlikely that these errors could be mapped to BackendError variants other than BackendError::Other. So let's extract Other(_) of that kind as a separate type to clarify there would be no other error variants. Perhaps, Backend/Error will be renamed to CommitBackend/Error or CommitStore/Error?, whereas I think BackendInit/LoadError can be shared among store factories.	2023-07-06 20:48:46 +09:00
Yuya Nishihara	e1e75daa8e	backend: make BackendError::Other preserve source error object	2023-07-06 20:48:46 +09:00
Yuya Nishihara	5b78fe75b1	git_backend: propagate load() error to caller #1794	2023-07-06 12:43:49 +09:00

1 2 3

118 commits