# Jujube


## Disclaimer

This is not a Google product. It is an experimental version-control system
(VCS). It is not ready for use. It was written by me, Martin von Zweigbergk
(martinvonz@google.com). It is my personal hobby project. It does not indicate
any commitment or direction from Google.


## Introduction

I started the project mostly in order to test the viability of some UX ideas in
practice. I continue to use it for that, but my short-term goal now is to make
it useful as an alternative CLI for Git repos.

The storage design is similar to Git's in that it stores commits, trees, and
blobs. However, the blobs are actually split into three types: normal files,
symlinks (Unicode paths), and conflicts (more about that later).

The command-line tool is called `jj` for now because it's easy to type and easy
to search and replace later (the string is rare in English). The project is called "Jujube" (a fruit) because
that's the first word I could think of that matched "jj".


## Features

The following subsections describe the current features. The text is aimed at
readers who are already familiar with other VCSs.

### Compatible with Git

The tool currently has two backends. One is called "local store" and is very
simple and inefficient. The other backend uses a Git repo as storage. The
commits are stored as regular Git commits. Commits can be read from and written
to an existing Git repo. This makes it possible to create a Jujube repo and use
it as an alternative interface for a Git repo (it will be backed by the Git repo
just like additional Git worktrees are).

### Written as a library

The project consists of two main parts: the lib crate and the main (CLI)
crate. Most of the code lives in the lib crate. The lib crate does not print
anything to the terminal. The separate lib crate should make it relatively
straightforward to add a GUI.

### Operations are performed repo-first

Almost all operations are done in the repo first and then possibly reflected in
the working copy. The only exception so far is when committing the working copy,
which naturally uses the working copy as input.

This makes operations faster because the working copy doesn't need to get updated. It
also means that the working copy won't see spurious changes e.g. during a rebase
operation. It makes it safe to update the working copy while some operation is
running.

### Supports Evolution

Jujube copies the Evolution feature from Mercurial. It keeps track of when a
commit gets rewritten. A commit has a list of predecessors in addition to the
usual list of parents. This lets the tool figure out where to rebase descendant
commits to when a commit has been rewritten (amended, rebased, etc.). See
https://www.mercurial-scm.org/wiki/ChangesetEvolution for more information.
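
As a rough illustration of how predecessor links can drive rebasing, here is a minimal Python sketch. It is not the actual implementation: the `Commit` class, the function names, and the way divergent rewrites are rejected are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Commit:
    """Hypothetical commit record: parents plus the commits it rewrote."""
    id: str
    parents: Tuple[str, ...] = ()
    predecessors: Tuple[str, ...] = ()


def successors_map(commits):
    """Invert predecessor links: old commit id -> ids of its rewrites."""
    result = {}
    for commit in commits:
        for old_id in commit.predecessors:
            result.setdefault(old_id, []).append(commit.id)
    return result


def new_parent_for(old_parent_id, successors):
    """Follow rewrites to the latest version of a commit.

    Returns the id unchanged if the commit was never rewritten.
    """
    current = old_parent_id
    while current in successors:
        candidates = successors[current]
        if len(candidates) != 1:
            # Assumption: this sketch does not handle divergent rewrites.
            raise ValueError(f"{current} was rewritten in several ways")
        current = candidates[0]
    return current


commits = [
    Commit("A"),
    Commit("B", parents=("A",)),
    Commit("C", parents=("B",)),
    # B2 is an amended version of B, so it lists B as a predecessor.
    Commit("B2", parents=("A",), predecessors=("B",)),
]
successors = successors_map(commits)
# C's parent B was rewritten, so C should be rebased onto B2.
target = new_parent_for("B", successors)
```

Here `target` ends up as `"B2"`, while a commit that was never rewritten (such as `A`) maps to itself.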

### The working copy is a commit

The working copy gets automatically committed when you interact with the
tool. This simplifies both implementation and UX. It also means that the working
copy is frequently backed up.

Any changes to the working copy stay in place when you check out another
commit. That is different from Git and Mercurial, but I think it's more
intuitive for new users. To replicate the default behavior of Git/Mercurial, use
`jj rebase -r @ -d <destination>` (`@` is a name for the working copy
commit). There is no need to stash/unstash.

Commands become more consistent because the same command can operate on the
working-copy commit or any other commit. For example, `jj log` includes the working copy (much like
`gitk` and other tools include a node for the working copy). `jj squash`
squashes a commit into its parent, including if it's the working copy (like `git
commit --amend`/`hg amend`).

A commit description can be added to the working copy before "commit". The same
command (`jj describe`) is used for changing the description of any commit.

### Commits can contain conflicts

When a merge conflict happens, it is recorded within the tree object as a
special conflict object (not a file object with conflict markers). Conflicts are
stored as one list of states to add and another list of states to remove. A
regular 3-way merge adds [B,C] and removes [A] (the common ancestor). A
modify/remove conflict adds [B] and removes [A]. An add/add conflict adds
[B,C]. An octopus merge of N commits adds N states and removes N-1 states. A
non-conflict state A is equivalent to a conflict state that just adds [A]. A
"state" here can be a normal file, a symlink, or a tree. This support for
in-tree conflicts has some interesting effects on both implementation and UX.
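
The add/remove representation can be modeled in a few lines of Python. This is only a sketch of the idea: the `Conflict` class, its field names, and the `octopus` helper are hypothetical and do not mirror the real data structures.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Conflict:
    """Hypothetical model: states to add and states to remove."""
    adds: Tuple[str, ...]
    removes: Tuple[str, ...] = ()

    def is_resolved(self) -> bool:
        # A lone added state with nothing removed is an ordinary value.
        return len(self.adds) == 1 and not self.removes


# The cases described above, with letters standing in for file states.
three_way = Conflict(adds=("B", "C"), removes=("A",))
modify_remove = Conflict(adds=("B",), removes=("A",))
add_add = Conflict(adds=("B", "C"))
plain = Conflict(adds=("A",))


def octopus(sides, bases):
    """An N-way merge adds N states and removes N-1 base states."""
    if len(bases) != len(sides) - 1:
        raise ValueError("expected one fewer base than sides")
    return Conflict(adds=tuple(sides), removes=tuple(bases))
```

Only `plain` reports itself as resolved. The other three are genuine conflicts even though they differ in how many states they add and remove.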

It means that there is a consistent way of resolving conflicts: check out a
commit that contains conflicts, resolve them, and amend the resolution into the
conflicted commit. Then evolve descendant commits.

It naturally enables collaborative conflict resolution.

The in-tree conflicts mean that there is no need for book-keeping in
rebase-like commands to support continue/abort operations. Instead, the rebase
can simply continue and create the desired new DAG shape.

Conflicts get simplified on rebase by removing pairs of matching states in the
"add" and "remove" lists. For example, let's say commit B is based on A and is
rebased to C, where it results in conflicts, which the user leaves
unresolved. If the commit is then rebased to D, it will be a regular 3-way merge
between B and D with A as base (no trace of C). This means that you can keep old
commits rebased to head without resolving conflicts, and you still won't have
messy recursive conflicts.
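
The example above can be traced with a small Python sketch. It assumes that rebasing a commit from one parent to another adds the new parent's state and removes the old parent's state before simplifying. The function names are made up for illustration, and letters stand in for file states.

```python
from collections import Counter


def simplify(adds, removes):
    """Cancel states that appear in both lists, one pair at a time."""
    remaining_removes = Counter(removes)
    kept_adds = []
    for state in adds:
        if remaining_removes[state] > 0:
            remaining_removes[state] -= 1  # cancels one matching pair
        else:
            kept_adds.append(state)
    return kept_adds, list(remaining_removes.elements())


def rebase(adds, removes, old_parent, new_parent):
    """Assumed rule: new content = new_parent + (content - old_parent)."""
    return simplify([new_parent] + adds, [old_parent] + removes)


# Commit B (non-conflicted) is based on A and gets rebased onto C.
on_c = rebase(["B"], [], old_parent="A", new_parent="C")
# The conflict is left unresolved and the commit is rebased again onto D.
on_d = rebase(*on_c, old_parent="C", new_parent="D")
```

After the first rebase the conflict adds `C` and `B` and removes `A`. After the second, `C` cancels out of both lists, leaving adds `D` and `B` with `A` removed: a plain 3-way merge with no trace of `C`.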

The conflict handling also results in some Darcs-/Pijul-like properties. For
example, if you rebase a commit and it results in conflicts, and you then back
out that commit, the conflict will go away. (I plan to make that work even if
there had been unrelated changes in the file, but I haven't gotten around to it
yet.)

The criss-cross merge case becomes simpler. In Git, the virtual ancestor may
have conflicts and you may get nested conflict markers in the working copy. In
Jujube, the result is a merge with multiple parts, which may even get simplified
to not be recursive.

The in-tree conflicts make it natural and easy to define the contents of a merge
commit to be the difference compared to the merged parents (the so-called "evil"
part of the merge), so that's what Jujube does. Rebasing merge commits therefore
works as you would expect (Git and Mercurial both handle rebasing of merge
commits poorly). It's even possible to change the number of parents while
rebasing, so if A is a non-merge commit, you can make it a merge commit with `jj
rebase -r A -d B -d C`. `jj diff -r <commit>` will show you the diff compared to
the merged parents.

I intend for commands that present the contents of a tree (such as listing
files) to use the "add" state(s) of the conflict, but that's not yet done.

### Operations are logged

Each write operation is logged to a content-addressed storage, much like the
commit storage. The Operation object has an associated View object, much like
the Commit object has a Tree object. The view object contains all the heads
currently in the repo, as well as the checked-out commit. It will also contain
the refs if I add support for that. The operation object can have multiple
parent operations, so it forms a DAG just like the commit graph does. There is
normally only one parent operation, but there can be multiple parents if
concurrent operations happened.
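
To make the Operation/View relationship concrete, here is a toy content-addressed store in Python. The choice of SHA-256 over a JSON encoding, the in-memory dictionary, and the field names are all assumptions for the sketch and say nothing about the actual on-disk format.

```python
import hashlib
import json


def content_id(obj):
    """Derive an id from the object's content (assumed: SHA-256 of JSON)."""
    data = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


class Store:
    """Toy in-memory content-addressed store."""

    def __init__(self):
        self.objects = {}

    def write(self, obj):
        object_id = content_id(obj)
        self.objects[object_id] = obj
        return object_id

    def read(self, object_id):
        return self.objects[object_id]


store = Store()

# A view records the heads and the checked-out commit.
view_id = store.write({"heads": ["c1", "c2"], "checkout": "c2"})

# An operation points at its view and at its parent operations.
root_op = store.write({"view": view_id, "parents": [], "description": "init"})
next_op = store.write(
    {"view": view_id, "parents": [root_op], "description": "no-op"}
)
```

Because ids are derived from content, writing the same view twice yields the same id, just as identical trees share an id in the commit storage.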

I added the operation log as a solution for the problem of making concurrent
repo edits safe. When the repo is loaded, it is loaded at a particular
operation, which provides an immutable view of the repo. For a caller of the
library to start making changes, they then have to start a transaction. Once
they are done making changes to the transaction, they commit the
transaction. The operation object is then created. This step cannot fail (except
if the file system runs out of space or such). Pointers to the heads of the
operation DAG are kept as files in a directory (the filename is the operation
id). When a new operation object has been created, its operation id is added to
the directory. The transaction's base operation id is then removed from that
directory. If concurrent operations happened, there would be multiple new
operation ids in the directory and only one base operation id would have been
removed. If a reader sees the repo in this state, it will attempt to merge the
views and create a new operation with multiple parents. If there are conflicts,
the user will have to resolve them (I haven't implemented that yet).
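
The directory protocol described above can be simulated with ordinary files. The following Python sketch uses a temporary directory and hypothetical helper names. It only tracks operation ids and leaves out the merging of views.

```python
import os
import tempfile


def add_head(heads_dir, op_id):
    # Each head is an empty file whose name is the operation id.
    open(os.path.join(heads_dir, op_id), "w").close()


def remove_head(heads_dir, op_id):
    try:
        os.remove(os.path.join(heads_dir, op_id))
    except FileNotFoundError:
        pass  # a concurrent transaction already removed this base


def commit_transaction(heads_dir, base_op_ids, new_op_id):
    # Add the new head first, then drop the base(s) it was built on.
    add_head(heads_dir, new_op_id)
    for base in base_op_ids:
        remove_head(heads_dir, base)


def current_heads(heads_dir):
    return sorted(os.listdir(heads_dir))


heads_dir = tempfile.mkdtemp()
add_head(heads_dir, "op0")

# Two transactions start from op0 and commit independently.
commit_transaction(heads_dir, ["op0"], "op1")
commit_transaction(heads_dir, ["op0"], "op2")
concurrent = current_heads(heads_dir)  # both op1 and op2 are heads

# A reader that sees several heads creates a merge operation on top of them.
commit_transaction(heads_dir, concurrent, "op3")
merged = current_heads(heads_dir)
```

Adding the new head before removing the base means a reader never sees an empty directory. At worst it sees an extra head, which is the state that triggers a merge.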

As a nice side-effect of adding the operation log to solve the concurrent-edits
problem, we get some very useful UX features. Many UX features come from mapping
commands that work on the commit graph onto the operation graph. For example, if
you map `git revert`/`hg backout` onto the operation graph, you get an operation
that undoes a previous operation (called `jj op undo`). Note that any operation
can be undone, not just the latest one. If you map `git restore`/`hg revert`
onto the operation graph, you get an operation that rewinds the repo state to an
earlier point (called `jj op restore`).
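
One way to picture the difference between the two commands is to treat a view as a set of head commits. The Python sketch below is an assumption about the semantics, not a description of the implementation: undoing an operation reverses only the change that operation made, while restoring replaces the whole view.

```python
def undo(current_heads, heads_before_op, heads_after_op):
    """Reverse one operation's effect on top of the current view."""
    added = heads_after_op - heads_before_op
    removed = heads_before_op - heads_after_op
    return (current_heads - added) | removed


def restore(target_heads):
    """Rewind to an earlier view, discarding everything after it."""
    return set(target_heads)


v0 = {"a"}              # initial view
v1 = {"a", "b"}         # after op1, which added head b
v2 = {"a", "b", "c"}    # after op2, which added head c

# Undo op1 even though it is not the latest operation: c survives.
after_undo = undo(v2, heads_before_op=v0, heads_after_op=v1)

# Restore to the view before op1: both b and c are gone.
after_restore = restore(v0)
```

With these inputs, `after_undo` is `{"a", "c"}` and `after_restore` is `{"a"}`.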

You can also see what the repo looked like at an earlier point with `jj
--at-op=<operation id> log`. As mentioned earlier, the checkout is also part of
the view, so that command will show you where the working copy was at that
operation. If you do `jj op restore -o <operation id>`, it will also update the
working copy accordingly. This is actually how the working copy is always
updated: we first commit a transaction with a pointer to the new checkout and
then the working copy is updated to reflect that.

## Future plans

TODO