mirrors/jj

mirror of https://github.com/martinvonz/jj.git synced 2025-01-31 00:12:06 +00:00

Author	SHA1	Message	Date
Martin von Zweigbergk	2a531832d6	rewrite: make merge_commit_trees() use index for finding common ancestors The index is now always kept up to date and it has functionality for finding common ancestors, so let's use it! This should make merging commits a little faster if their common ancestor is far away (which is rare). It's probably much more important that the index-based algorithm is more correct. Also, it returns multiple common ancestors in the criss-cross case, which lets us do a recursive merge like git does. I'm leaving the recursive merge for later, though.	2021-02-23 20:49:18 -08:00
Martin von Zweigbergk	bb94516175	index: add support for finding common ancestors We currently need to read the commit objects for finding common ancestors. That can be very slow when the common ancestor is far back in history. This patch adds a function for finding common ancestors using the index instead. Unlike the current algorithm, which only returns one common ancestor, the new index-based one correctly handles criss-cross merges. Here are some timings for finding the common ancestors in the git.git repo:

                         |      Without index     |       With index       |
                         | First run | Subsequent | First run | Subsequent |
v2.30.0-rc0 v2.30.0-rc1  | 5.68 ms   | 5.94 us    | 40.3 us   | 4.77 us    |
v2.25.4 v2.26.1          | 1.75 ms   | 1.42 us    | 13.8 ms   | 4.29 ms    |
v1.0.0 v2.0.0            | 492 ms    | 2.79 ms    | 23.4 ms   | 6.41 ms    |

Finding ancestors of v2.25.4 and v2.26.1 got much slower because the new algorithm finds all common ancestors. Therefore, it also finds v2.24.2, v2.23.2, v2.22.3, v2.21.2, v2.20.3, v2.19.4, v2.18.3, and v2.17.4, which it then filters out because they're all ancestors of v2.25.3. Also note that the result was incorrect before, because the old algorithm would return as soon as it had found a common ancestor, even if it's not the latest common ancestor. For example, for the common ancestor between v1.0.0 and v2.0.0, it returned an ancestor of v1.0.0 because it happened to get there by following some side branch that led there more quickly. The only place we currently need to find the common ancestor is when merging trees, which we only do when the user runs `jj merge`, as well as when operating on existing merge commits (e.g. to diff or rebase them). That means that this change won't be very noticeable. However, it's something we clearly want to do sooner or later, so we might as well get it done.	2021-02-23 17:29:23 -08:00
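The behavior described in bb94516175 (find all common ancestors, then filter out those that are ancestors of other results) can be sketched as follows. This is only an illustration, not jj's implementation: `ToyIndex` and its methods are hypothetical names, and the sketch computes full ancestor sets for clarity instead of walking the index lazily. It assumes each commit is identified by a position and that parents always have smaller positions than their children.

```rust
use std::collections::BTreeSet;

/// Hypothetical in-memory index: entry `i` lists the positions of the
/// parents of commit `i`. Parents always precede their children.
pub struct ToyIndex {
    parents: Vec<Vec<usize>>,
}

impl ToyIndex {
    pub fn new(parents: Vec<Vec<usize>>) -> Self {
        for (pos, ps) in parents.iter().enumerate() {
            assert!(ps.iter().all(|&p| p < pos), "parents must precede children");
        }
        ToyIndex { parents }
    }

    /// All ancestors of `start`, including the start positions themselves.
    fn ancestors(&self, start: &[usize]) -> BTreeSet<usize> {
        let mut seen = BTreeSet::new();
        let mut todo: Vec<usize> = start.to_vec();
        while let Some(pos) = todo.pop() {
            if seen.insert(pos) {
                todo.extend(self.parents[pos].iter().copied());
            }
        }
        seen
    }

    /// Removes every position that is a proper ancestor of another
    /// position in `candidates`.
    pub fn heads(&self, candidates: &BTreeSet<usize>) -> BTreeSet<usize> {
        let mut heads = candidates.clone();
        for &pos in candidates {
            for ancestor in self.ancestors(&self.parents[pos]) {
                heads.remove(&ancestor);
            }
        }
        heads
    }

    /// The latest common ancestors of `a` and `b`. A criss-cross history
    /// yields more than one result.
    pub fn common_ancestors(&self, a: usize, b: usize) -> BTreeSet<usize> {
        let ancestors_a = self.ancestors(&[a]);
        let ancestors_b = self.ancestors(&[b]);
        let common: BTreeSet<usize> =
            ancestors_a.intersection(&ancestors_b).copied().collect();
        self.heads(&common)
    }
}
```

In a criss-cross history where commits 3 and 4 both merge commits 1 and 2, `common_ancestors(3, 4)` returns both 1 and 2 rather than stopping at whichever ancestor happens to be reached first.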
Martin von Zweigbergk	422d333d4b	index: make heads() return result in index order instead of hash order It's nice to have a non-random order for tests (we can revisit later if it shows up in profiling). I'm changing the order to be the index order so the future caller of `heads_pos()` (not `heads()`) will also get consistent order.	2021-02-23 17:24:55 -08:00
Martin von Zweigbergk	1481935472	index: extract a function for removing ancestors of set based on positions We already have the `heads()` function, which works on `CommitId`s. This just extracts a function that works on positions. I'll use it soon.	2021-02-21 22:28:44 -08:00
Martin von Zweigbergk	5aadbcf6fc	evolve: pass Transaction to listener functions, so they see the updated state	2021-02-21 22:27:13 -08:00
Martin von Zweigbergk	62ce5782b5	index: when writing incremental index, squash into parent file if smaller We currently write a new incremental index file every time. That means that the stack of index files quickly gets deep, which makes it slow to read the index. This commit makes it so that we squash the new index segment into its parent if the parent has fewer commits. That means we'll limit the number of files to O(log n). Write time will also be O(log n) on average.	2021-02-16 23:47:43 -08:00
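To see why squashing bounds the stack depth, here is a small model of the policy from 62ce5782b5. It is a sketch under assumptions, not the real code: `push_segment` is a hypothetical function, each segment is modelled only by its commit count, and the threshold used is the "at least as many commits as the parent" comparison from the a7b6bcfd79 message, applied repeatedly up the stack.

```rust
/// Adds a segment holding `new_commits` commits on top of `stack`.
/// `stack` lists the commit count of each segment, root first.
pub fn push_segment(stack: &mut Vec<usize>, new_commits: usize) {
    if new_commits == 0 {
        // Nothing to write, so the stack is left unchanged.
        return;
    }
    let mut size = new_commits;
    // Squash into the parent while the combined new segment has at least
    // as many commits as the parent (assumed threshold, see above).
    while let Some(&parent) = stack.last() {
        if parent <= size {
            stack.pop();
            size += parent;
        } else {
            break;
        }
    }
    stack.push(size);
}
```

With one commit per transaction, the segment sizes behave like the digits of a binary counter, so the number of segments stays logarithmic in the number of commits.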
Martin von Zweigbergk	a51543b752	index: make first level in stats be the root index I've confused myself a few times already thinking that level 0 is the root, so that's probably more intuitive. It also makes tests simpler because the initial part of the list is unchanged when a new transaction commits.	2021-02-16 23:45:54 -08:00
Martin von Zweigbergk	b122f33312	index: don't write empty incremental index file	2021-02-16 23:45:52 -08:00
Martin von Zweigbergk	a7b6bcfd79	transaction: write incremental index on commit With this change, we start writing the incremental index to disk, so the next reader won't have to re-read the commits and create the index. As of this change, we simply write a new index file for each transaction. That will clearly mean that the stack of files gets deep pretty quickly. For now, the user will have to do `jj debug reindex` when things get slow. I plan to change it so instead of writing an incremental index file every time, we first check if the new index file would have at least as many commits as the parent file, and if it will, we write a combined one instead. That should apply recursively, so we'd have O(log n) index files.	2021-02-15 11:03:41 -08:00
Martin von Zweigbergk	86915f0a6f	index: fix check for adding existing commit to index The check for adding an existing commit to the index only checked if the commit was already in the `MutableIndex`, not if it was already in the parent `ReadonlyIndex`.	2021-02-15 10:28:18 -08:00
Martin von Zweigbergk	37cf6a8395	transaction: don't walk to root when adding on top of non-head I don't know why I made the walk stop at heads instead of indexed commits before. Perhaps I did it because it's cheap to check in the set of heads. However, it gets very expensive to walk all the way back to the root if the parents are not in the set of heads.	2021-02-15 10:28:18 -08:00
Martin von Zweigbergk	0f56e014b7	tests: some fixups to test_transaction as a result of reordering commits	2021-02-15 10:28:07 -08:00
Martin von Zweigbergk	3c832cbbbe	index: let index structs keep track of the index directory This matches how it's done for the other struct (View, WorkingCopy).	2021-02-14 01:03:49 -08:00
Martin von Zweigbergk	b77740e58a	index: move function for saving MutableIndex onto the struct	2021-02-14 01:03:49 -08:00
Martin von Zweigbergk	713d32d803	index: keep up to date within transaction With tons of groundwork done, we can now finally keep the index up to date within a transaction! That means that we can start relying on the index to always be valid, so we can use it e.g. for finding common ancestors within a transaction. That should help speed up `jj evolve` immensely on large repos. We still don't write the updated index to disk when the transaction closes. That will come later.	2021-02-14 00:58:11 -08:00
Martin von Zweigbergk	e19a65cf14	transaction: make add_head() use incremental update of evolution in common case `Transaction::add_head()` currently invalidates the whole evolution state. We've had support for incrementally updating evolution since `4619942a57`. We should start taking advantage of that. Let's add a fast-path in `Transaction::add_head()` for the common case where we add a single commit on top of an existing head. That's cheap and simple to check for. However, it won't cover the case of adding a child off of a non-head. It's still a good start.	2021-02-14 00:56:34 -08:00
Martin von Zweigbergk	f05a12d301	index: make CompositeIndex non-public and add new IndexRef enum instead We're getting close to finally having a `RepoRef::index()` method.	2021-02-13 13:56:26 -08:00
Martin von Zweigbergk	face4d637f	index: define methods from CompositeIndex directly on {Readonly,Mutable}Index This is one step towards making `CompositeIndex` non-public (and maybe deleting it). Next, we'll add an `IndexRef` enum similar to `RepoRef` etc.	2021-02-13 13:46:58 -08:00
Martin von Zweigbergk	8dda4b05e4	index: add "segment_" prefix to methods in IndexSegment I'm about to move the functions from `CompositeIndex` to a new `Index` trait implemented by `ReadonlyIndex` and `MutableIndex`. Since those types already implement `IndexSegment`, the names would conflict and it would get annoying to have to disambiguate them. This commit therefore prepares for that by adding a `segment_` prefix to the functions in `IndexSegment`.	2021-02-13 13:45:31 -08:00
Martin von Zweigbergk	3066381d57	transaction: add accessors for view and evolution directly on transaction	2021-02-13 13:43:48 -08:00
Martin von Zweigbergk	72aebc9da3	view: replace View trait by enum with Readonly and Mutable variants	2021-02-13 08:31:41 -08:00
Martin von Zweigbergk	d1e5f46969	evolution: replace Evolution trait by enum with Readonly and Mutable variants	2021-02-13 08:31:41 -08:00
Martin von Zweigbergk	f1666375bd	repo: replace Repo trait by enum with readonly and mutable variants I want to keep the index updated within the transaction. I tried doing that by adding a `trait Index`, implemented by `ReadonlyIndex` and `MutableIndex`. However, `ReadonlyRepo::index` is of type `Mutex<Option<Arc<IndexFile>>>` (because it is lazily initialized), and we cannot get a `&dyn Index` that lives long enough to be returned from a `Repo::index()` from that. It seems the best solution is to instead create an `Index` enum (instead of a trait), with one readonly and one mutable variant. This commit starts the migration to that design by replacing the `Repo` trait by an enum. I never intended for there to be more implementations of `Repo` than `ReadonlyRepo` and `MutableRepo` anyway.	2021-02-13 08:31:23 -08:00
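The enum-instead-of-trait design from f1666375bd might look roughly like this for the index. The type names `ReadonlyIndex`, `MutableIndex` and `IndexRef` appear in the commit messages, but the fields and the `num_commits`/`has_id` methods below are assumptions made up for illustration and are much simpler than the real structs.

```rust
/// Toy stand-in for an index loaded from disk (fields are hypothetical).
pub struct ReadonlyIndex {
    pub commits: Vec<String>,
}

/// Toy stand-in for an index being built inside a transaction: a
/// readonly parent plus the commits added so far.
pub struct MutableIndex {
    pub parent: ReadonlyIndex,
    pub added: Vec<String>,
}

/// A cheap, copyable reference to either kind of index. Callers use one
/// concrete type and dispatch happens with `match`, not `dyn`.
#[derive(Clone, Copy)]
pub enum IndexRef<'a> {
    Readonly(&'a ReadonlyIndex),
    Mutable(&'a MutableIndex),
}

impl<'a> IndexRef<'a> {
    pub fn num_commits(&self) -> usize {
        match self {
            IndexRef::Readonly(index) => index.commits.len(),
            IndexRef::Mutable(index) => index.parent.commits.len() + index.added.len(),
        }
    }

    /// Looks in the mutable part and in the readonly parent.
    pub fn has_id(&self, id: &str) -> bool {
        match self {
            IndexRef::Readonly(index) => index.commits.iter().any(|c| c.as_str() == id),
            IndexRef::Mutable(index) => {
                index.added.iter().any(|c| c.as_str() == id)
                    || index.parent.commits.iter().any(|c| c.as_str() == id)
            }
        }
    }
}
```

Because the enum only wraps plain references, it can be built on demand from whatever the repo currently holds, which sidesteps the need for a long-lived `&dyn Index`.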
Martin von Zweigbergk	a1983ebe96	git: add a ref to each commit we create I just learned that attaching a git note is not enough to keep a commit from being GC'd. I had read `git help gc` before but it was quite misleading (I just sent a patch to clarify it). Since the git note is not enough, we need to create some other reference. This patch makes it so we write refs in `refs/jj/keep/` for every commit we create. We will probably want to remove unnecessary refs (ancestors of commits pointed to by other refs) once we have a `jj gc` command.	2021-02-13 08:16:18 -08:00
Martin von Zweigbergk	dd98f0564e	git: remove git note pointing to conflicts We store conflicts as blobs with JSON data and with a git note pointing to them to prevent GC. These are stored in the git tree as regular files. The only thing that distinguishes them is that their filename ends with `.jjconflict`. Since they are referenced from the tree, there's no need for the git note to prevent GC (which doesn't work anyway, as I just learned), and we don't store any additional data in the note either, so let's just remove it.	2021-02-13 08:15:12 -08:00
Martin von Zweigbergk	fa30cf768f	index: rename UnsavedIndexData to MutableIndex	2021-02-07 23:35:37 -08:00
Martin von Zweigbergk	8170c06573	index: rename IndexFile to ReadonlyIndex	2021-02-07 23:35:22 -08:00
Martin von Zweigbergk	51373b75ff	index: use correct per-level file name in stats (previously always top-level)	2021-02-07 23:34:57 -08:00
Martin von Zweigbergk	302c66825f	working_copy: preserve executable bit on Windows Windows doesn't support recording the executable bit in the file system. Before this commit, the code for reading and writing the executable wouldn't even compile on Windows. This commit at least makes it so we preserve whatever bit has been recorded in the repo. At least I hope that's what it does -- I don't have access to a Windows machine right now.	2021-02-07 00:50:21 -08:00
Martin von Zweigbergk	d4aed83aa6	working_copy: correct comment about stat accuracy We don't take a write lock when writing a file; other processes can modify the file.	2021-02-07 00:48:50 -08:00
Martin von Zweigbergk	3d679de022	working_copy: print warning about ignored symlinks instead of failing build The project doesn't currently build on Windows. One reason is because we had a `unimplemented!()` when trying to write a symlink. Let's print a warning instead, so the project can start building on Windows. (The next patch will fix another build problem on Windows.)	2021-02-07 00:46:17 -08:00
Martin von Zweigbergk	e0112a4be0	transaction: report failure to close transaction only in debug builds When a transaction gets dropped without being committed or explicitly discarded, we currently raise an assertion error. I added that check because I kept forgetting to commit transactions. However, it's quite normal to want to drop transactions in error cases. The current assertion means that we panic and don't report the actual error to the user in such cases. We should probably audit the code paths where we commit transactions and decide for each if we simply want to discard the transaction or not. In some cases, we may want to commit the transaction without integrating it in the operation log (i.e. without creating a file entry in .jj/views/op_heads/). However, we can do that later. For now, let's just make sure we don't panic when dropping the transaction in release builds.	2021-02-07 00:11:42 -08:00
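One way to get the "check only in debug builds" behavior from e0112a4be0 is a `debug_assert!` in a `Drop` implementation. The sketch below is a guess at the shape, not the actual code: the `Transaction` fields, the `closed` flag and the return value of `commit` are all hypothetical, and the `std::thread::panicking()` guard is an addition of this sketch to avoid a second panic while unwinding.

```rust
/// Toy transaction that remembers whether it was closed explicitly.
pub struct Transaction {
    description: String,
    closed: bool,
}

impl Transaction {
    pub fn new(description: &str) -> Self {
        Transaction { description: description.to_string(), closed: false }
    }

    /// Closes the transaction and reports what was committed.
    pub fn commit(mut self) -> String {
        self.closed = true;
        format!("committed: {}", self.description)
    }

    /// Closes the transaction without recording anything.
    pub fn discard(mut self) {
        self.closed = true;
    }
}

impl Drop for Transaction {
    fn drop(&mut self) {
        // `debug_assert!` is compiled out in release builds, so a
        // forgotten transaction only panics during development.
        if !std::thread::panicking() {
            debug_assert!(self.closed, "transaction dropped without commit or discard");
        }
    }
}
```

In a release build, an error path can simply let the transaction fall out of scope and report the real error to the user.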
Martin von Zweigbergk	4ecbd89378	repo: move MutableRepo from transaction module to repo module	2021-01-31 18:15:32 -08:00
Martin von Zweigbergk	2d03b514fc	transaction: move construction of MutableRepo out of Transaction::new() I'm about to move `MutableRepo` to the `repo` module and it will make more sense to have the construction of it there then.	2021-01-31 18:15:32 -08:00
Martin von Zweigbergk	bf53c6c506	transaction: add factory function to MutableRepo This helps finish the encapsulation of the `evolution` field.	2021-01-31 18:15:32 -08:00
Martin von Zweigbergk	5604303954	transaction: avoid direct access to members of MutableRepo I'm about to move `MutableRepo` to the `repo` module so it becomes more important to encapsulate access. Besides, the new functions introduced in this commit reduce some duplication. There's still one access of `MutableRepo::evolution` in `Transaction::new()`. I'll address that next by adding a factory function to `MutableRepo`.	2021-01-31 18:15:32 -08:00
Martin von Zweigbergk	a28fe7b388	transaction: slightly simplify write_commit() by using store()	2021-01-31 18:15:22 -08:00
Martin von Zweigbergk	9ffd35caf8	transaction: when checking out open commit with conflicts, create child commit I've been confused twice that rebasing an open commit so it results in conflicts doesn't show the conflicts in the log output. That's because we create a successor instead if a commit with conflicts is open. I guess I thought it would be expected that a child commit was not created. Since it seems surprising in practice, let's change it and we'll see if the new behavior is more or less surprising.	2021-01-22 11:41:52 -08:00
Martin von Zweigbergk	bb730d8a2b	merge: rewrite code for 3-way merge of files to handle not just trivial cases The most annoying remaining bug is that 3-way merge frequently panics with "unhandled merge case". This commit fixes that by rewriting the merge code. The new code is based on the algorithm used in Mercurial (which was in turn copied from Bazaar): 1. Find "sync" regions, which are regions that are unchanged in the base and two sides. Note their start and end positions in each version. 2. Produce the output by taking the sync regions and inserting the result of merging the regions between the sync regions. These regions can either be changed on only one side, in which case we use that version, or it can be changed on both sides, in which case we indicate a conflict in the output. It's both more correct and much easier to follow.	2021-01-22 11:41:50 -08:00
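The two steps in bb730d8a2b can be sketched on a line-by-line basis as below. This is an illustrative toy, not the jj, Mercurial or Bazaar code: `Hunk`, `lcs_matches`, `push_resolved` and `merge` are hypothetical names, and finding sync points through a longest common subsequence of lines is an assumption of this sketch (the commit message does not say how sync regions are located).

```rust
#[derive(Debug, PartialEq)]
pub enum Hunk<'a> {
    Resolved(Vec<&'a str>),
    Conflict { base: Vec<&'a str>, left: Vec<&'a str>, right: Vec<&'a str> },
}

/// For each line of `a`, the index of its partner in `b` under one longest
/// common subsequence (classic O(n*m) dynamic programming).
fn lcs_matches(a: &[&str], b: &[&str]) -> Vec<Option<usize>> {
    let (n, m) = (a.len(), b.len());
    // table[i][j] is the LCS length of a[i..] and b[j..].
    let mut table = vec![vec![0usize; m + 1]; n + 1];
    for i in (0..n).rev() {
        for j in (0..m).rev() {
            table[i][j] = if a[i] == b[j] {
                table[i + 1][j + 1] + 1
            } else {
                table[i + 1][j].max(table[i][j + 1])
            };
        }
    }
    let mut matches = vec![None; n];
    let (mut i, mut j) = (0, 0);
    while i < n && j < m {
        if a[i] == b[j] {
            matches[i] = Some(j);
            i += 1;
            j += 1;
        } else if table[i + 1][j] >= table[i][j + 1] {
            i += 1;
        } else {
            j += 1;
        }
    }
    matches
}

/// Appends lines to the output, extending the previous resolved hunk if any.
fn push_resolved<'a>(hunks: &mut Vec<Hunk<'a>>, lines: &[&'a str]) {
    if lines.is_empty() {
        return;
    }
    if let Some(Hunk::Resolved(last)) = hunks.last_mut() {
        last.extend_from_slice(lines);
    } else {
        hunks.push(Hunk::Resolved(lines.to_vec()));
    }
}

pub fn merge<'a>(base: &[&'a str], left: &[&'a str], right: &[&'a str]) -> Vec<Hunk<'a>> {
    let to_left = lcs_matches(base, left);
    let to_right = lcs_matches(base, right);
    // Step 1: sync points are base lines matched on both sides. A sentinel
    // at the end makes the trailing gap go through the same code path.
    let mut sync: Vec<(usize, usize, usize)> = (0..base.len())
        .filter_map(|b| Some((b, to_left[b]?, to_right[b]?)))
        .collect();
    sync.push((base.len(), left.len(), right.len()));

    // Step 2: merge the gap before each sync point, then emit the sync line.
    let mut hunks = Vec::new();
    let (mut b0, mut l0, mut r0) = (0, 0, 0);
    for (b1, l1, r1) in sync {
        let (b_gap, l_gap, r_gap) = (&base[b0..b1], &left[l0..l1], &right[r0..r1]);
        if l_gap == b_gap {
            push_resolved(&mut hunks, r_gap); // only the right side changed (or neither)
        } else if r_gap == b_gap || l_gap == r_gap {
            push_resolved(&mut hunks, l_gap); // only left changed, or both changed alike
        } else {
            hunks.push(Hunk::Conflict {
                base: b_gap.to_vec(),
                left: l_gap.to_vec(),
                right: r_gap.to_vec(),
            });
        }
        if b1 < base.len() {
            push_resolved(&mut hunks, &base[b1..b1 + 1]);
        }
        b0 = b1 + 1;
        l0 = l1 + 1;
        r0 = r1 + 1;
    }
    hunks
}
```

Changes on different lines merge cleanly into a single resolved hunk, while different edits to the same line produce a `Conflict` hunk surrounded by the unchanged sync lines.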
Martin von Zweigbergk	7957feca49	diff: make tokenization return slices instead of making copies	2021-01-21 22:42:55 -08:00
Martin von Zweigbergk	30939ca686	view: return &HashSet instead of Iterator We want to be able to do fast `.contains()` checks on the result, so `Iterator` was a bad type. We probably should hide the exact type (currently `HashSet` for both readonly and mutable views), but we can do that later. I actually thought I'd want to use `.contains()` for indicating public-phase commits in the log output, but of course we also want to indicate ancestors as public. This still seems like a step (mostly) in the right direction.	2021-01-16 13:00:05 -08:00
Martin von Zweigbergk	79eecb6119	git: mark imported remote-tracking branches as public	2021-01-16 12:14:42 -08:00
Martin von Zweigbergk	4db3d8d3a6	view: add tracking of "public" heads (copying Mercurial's phase concept) Mercurial's "phase" concept is important for evolution, and it's also useful for filtering out uninteresting commits from log output. Commits are typically marked "public" when they are pushed to a remote. The CLI prevents public commits from being rewritten. Public commits cannot be obsolete (even if they have a successor, they won't be considered obsolete like non-public commits would). This commit just makes space for tracking the public heads in the View.	2021-01-16 11:48:35 -08:00
Martin von Zweigbergk	265f90185e	tests: simplify transaction tests slightly by using testutils more	2021-01-16 11:31:57 -08:00
Martin von Zweigbergk	f43880381f	view: make sure we don't leave a dangling git ref All commits in the view are supposed to be reachable from its heads. If a head is removed and there are git refs pointing to ancestors of it (or to the removed head itself), we should make that ancestor a head.	2021-01-16 11:05:32 -08:00
Martin von Zweigbergk	1f593a4193	view: create helper for enforcing view's invariants The only invariant we currently enforce is that the set of heads does not include any ancestors of other commits in the set. I'm about to make sure that we don't end up with dangling git refs (pointing to commits not reachable from the heads). It will be useful to have a single place to enforce that since we'll need to do the same thing after updating the view as after merging views.	2021-01-16 10:35:46 -08:00
Martin von Zweigbergk	1f27a78957	view: make remove_head() not add parents as heads I think it's better to let the caller decide if the parents should be added. One use case for removing a head is when fetching from a Git remote where a branch has been rewritten. In that case, it's probably the best user experience to remove the old head. With the current semantics of `View::remove_head()`, we would need to walk up the graph to find a commit that's an ancestor and for each commit we remove as head, its parents get temporarily added as heads. It's much easier for callers that want to add the parents as heads to do that.	2021-01-15 01:08:05 -08:00
Martin von Zweigbergk	315818260f	git: slightly simplify a few tests	2021-01-11 00:34:04 -08:00
Martin von Zweigbergk	19b542b318	git: simplify error handling by passing git repo into git module functions	2021-01-11 00:25:39 -08:00
Martin von Zweigbergk	da0bbbe637	view: start tracking git refs Git refs are important at least for understanding where the remote branches are. This commit adds support for tracking them in the view and makes `git::import_refs()` update them. When merging views (either because of concurrent operations or when undoing an earlier operation), there can be conflicts between git ref changes. I ignored that for now and let the later operation win. That will probably be good enough for a while. It's not hard to detect the conflicts, but I haven't yet decided how to handle them. I'm leaning towards representing the conflicting refs in the view just like how we represent conflicting files in the tree.	2021-01-10 20:13:22 -08:00

105 commits