docs: add technical doc about the conflict design

We used to have documentation about how conflicts are implemented, but I removed that a long time ago when I rewrote the README to target users rather than VCS hackers. Let's have a doc for the VCS hackers (and curious users) as well, though.
2024-12-24 12:48:55 +00:00 · 2021-12-12 00:19:39 -08:00 · 2021-12-12 00:19:39 -08:00 · b64ee147ae
commit b64ee147ae
parent 625f15cb3e
2 changed files with 54 additions and 1 deletion
--- a/docs/conflicts.md
+++ b/docs/conflicts.md
@@ -10,7 +10,8 @@ the rebased commit and the rebase operation will succeed. You can then resolve
 the conflict whenever you want. Conflicted states can be further rebased,
 merged, or backed out. Note that what's stored in the commit is a logical
 representation of the conflict, not conflict *markers*; rebasing a conflict
-doesn't result in a nested conflict markers.
+doesn't result in nested conflict markers (see the
+[technical doc](technical/conflicts.md) for how this works).


 ## Advantages
--- a/docs/technical/conflicts.md
+++ b/docs/technical/conflicts.md
@@ -0,0 +1,52 @@
+# First-class conflicts
+
+## Introduction
+
+Conflicts can happen when two changes are applied to some state. This document
+is about conflicts between changes to files (not about [conflicts between
+changes to branch targets](concurrency.md), for example).
+
+For example, if you merge two branches in a repo, there may be conflicting
+changes between the two branches. Most DVCSs require you to resolve those
+conflicts before you can finish the merge operation. Jujutsu instead records
+the conflicts in the commit and lets you resolve them whenever you feel like
+it.
+
+
+## Data model
+
+When a merge conflict happens, it is recorded within the tree object as a
+special conflict object (not a file object with conflict markers). Conflicts
+are stored as a list of states to add and another list of states to remove. A
+"state" here can be a normal file, a symlink, or a tree. These two lists
+together can be viewed as a simple algebraic expression of positive and
+negative terms. The order of terms is undefined.
+
+For example, a regular 3-way merge between B and C, with A as base, is `B+C-A`
+(`{ adds=[B,C], removes=[A] }`). A modify/remove conflict is `B-A`. An add/add
+conflict is `B+C`. An octopus merge of N commits usually has N positive terms
+and N-1 negative terms. A non-conflict state A is equivalent to a conflict state
+containing just the term `A`. An empty expression indicates absence of any
+content at that path. A conflict can thus encode a superset of what can be
+encoded in a regular path state.
+
+
+## Conflict simplification
+
+Remember that a 3-way merge can be written `B+C-A`. If one of those states is
+itself a conflict, then we simply insert the conflict expression there. Then we
+simplify by removing canceling terms.
+
+For example, let's say commit B is based on A and is rebased to C, where it
+results in conflicts (`B+C-A`), which the user leaves unresolved. If the commit
+is then rebased to D, the result will be `(B+C-A)+(D-C)` (`D-C` comes from
+changing the base from C to D). That expression can be simplified to `B+D-A`,
+which is a regular 3-way merge between B and D with A as base (no trace of C).
+This is what lets the user keep old commits rebased to head without resolving
+conflicts and still not get messy recursive conflicts.
+
+As another example, let's go through what happens when you back out a conflicted
+commit. Let's say we have the usual `B+C-A` conflict on top of non-conflict
+state C. We then back out that change. Backing out ("reverting" in Git-speak) a
+change means applying its reverse diff, so the result is `(B+C-A)+(A-(B+C-A))`,
+which we can simplify to just `A` (i.e. no conflict).
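
Note on the new doc: the data model and the two simplification examples can be checked with a small sketch. This is Python written purely for illustration, not the project's implementation; the `Conflict` class, the `state` helper, and the `simplify` method are hypothetical names, and states are stood in for by plain strings. It assumes simplification means cancelling each positive term against one equal negative term.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Conflict:
    """Hypothetical model: a path state as terms to add and terms to remove."""
    adds: tuple = ()
    removes: tuple = ()

    def __add__(self, other):
        # Inserting one expression into another just concatenates the terms.
        return Conflict(self.adds + other.adds, self.removes + other.removes)

    def __neg__(self):
        # Negating an expression swaps its positive and negative terms.
        return Conflict(self.removes, self.adds)

    def __sub__(self, other):
        return self + (-other)

    def simplify(self):
        # Cancel each positive term against one equal negative term.
        adds, removes = Counter(self.adds), Counter(self.removes)
        return Conflict(
            tuple(sorted((adds - removes).elements())),
            tuple(sorted((removes - adds).elements())),
        )


def state(name):
    """A non-conflict state is an expression with a single positive term."""
    return Conflict(adds=(name,))


A, B, C, D = (state(n) for n in "ABCD")

# Regular 3-way merge between B and C with A as base.
three_way = B + C - A

# Rebasing the conflicted commit from C to D adds the diff D-C.
rebased = (three_way + (D - C)).simplify()

# Backing out the conflicted change on top of C applies its reverse diff.
backed_out = (three_way + (A - three_way)).simplify()
```

Under these assumptions, `rebased` ends up with adds `B`, `D` and removes `A` (no trace of `C`), and `backed_out` equals the plain state `A`. Sorting the terms is only there to make comparisons deterministic, since the doc says term order is undefined.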