From 707d82245f326f82ec422c1ec42f8268d0c59ba0 Mon Sep 17 00:00:00 2001 From: Glen Choo Date: Mon, 20 Mar 2023 13:14:35 -0700 Subject: [PATCH] docs/technical: add aspirational Git submodules doc This document is meant to be a record of how we think about Git submodules (and not a _current_ implementation of submodules). We will fill it out incrementally as we get a clearer idea of what we want submodules to look like. As an initial version, I started with (IMO) the least controversial points: - We want to support most workflows Git submodules users are accustomed to. - A roadmap that allows us to incrementally roll out Git submodule functionality (instead of having to boil the ocean). --- docs/design/git-submodules.md | 152 ++++++++++++++++++++++++++++++++++ 1 file changed, 152 insertions(+) create mode 100644 docs/design/git-submodules.md diff --git a/docs/design/git-submodules.md b/docs/design/git-submodules.md new file mode 100644 index 000000000..d87614300 --- /dev/null +++ b/docs/design/git-submodules.md @@ -0,0 +1,152 @@ +# Git submodules + +This is an aspirational document that describes how jj _will_ support Git +submodules. Readers are assumed to have some familiarity with Git and Git +submodules. + +This document is a work in progress; submodules are a big feature, and relevant +details will be filled in incrementally. + +## Objective + +This proposal aims to replicate the workflows users are used to with Git +submodules, e.g.: + +- Cloning submodules +- Making new submodule commits and updating the superproject +- Fetching and pushing updates to the submodule's remote +- Viewing submodule history + +When it is convenient, this proposal will also aim to make submodules easier to +use than Git's implementation. + +### Non-goals + +- Non-Git 'submodules' (e.g. native jj submodules, other VCSes) +- Non-Git backends (e.g. Google internal backend) +- Changing how Git submodules are implemented in Git + +## Background + +We mainly want to support Git submodules for feature parity, since Git +submodules are a standard feature in Git and are popular enough that we have +received user requests for them. Secondarily (and distantly so), Git submodules +are notoriously difficult to use, so there is an opportunity to improve the UX +over Git's implementation. + +### Intro to Git Submodules + +[Git submodules](https://git-scm.com/docs/gitsubmodules) are a feature of Git +that allow a repository (submodule) to be embedded inside another repository +(the superproject). Notably, a submodule is a full repository, complete with its +own index, object store and ref store. It can be interacted with like any other +repository, regardless of the superproject. + +In a superproject commit, submodule information is captured in two places: + +- A `gitlink` entry in the commit's tree, where the value of the `gitlink` entry + is the submodule commit id. This tells Git what to populate in the working + tree. + +- A top level `.gitmodules` file. This file is in Git's config syntax and + entries take the form `submodule..*`. These include many + settings about the submodules, but most importantly: + + - `submodule.path` contains the path from the root of the tree + to the `gitlink` being described. + + - `submodule.url` contains the url to clone the submodule + from. + +In the working tree, Git notices the presence of a submodule by the `.git` entry +(signifying the root of a Git repository working tree). This is either the +submodule's actual Git directory (an "old-form" submodule), or a `.git` file +pointing to `/modules/`. The latter +is sometimes called the "absorbed form", and is Git's preferred mode of +operation. + +## Roadmap + +Git submodules should be implemented in an order that supports an increasing set +of workflows, with the goal of getting feedback early and often. When support is +incomplete, jj should not crash, but instead provide fallback behavior and warn +the user where needed. + +The goal is to land good support for pure Jujutsu repositories, while colocated +repositories will be supported when convenient. + +This section should be treated as a set of guidelines, not a strict order of +work. + +### Phase 1: Readonly submodules + +This includes work that inspects submodule contents but does not create new +objects in the submodule. This requires a way to store submodules in a jj +repository that supports readonly operations. + +#### Outcomes + +- Submodules can be cloned anew +- New submodule commits can be fetched +- Submodule history and branches can be viewed +- Submodule contents are populated in the working copy +- Superproject gitlink can be updated to an existing submodule commit +- Conflicts in the superproject gitlink can be resolved to an existing submodule + commit + +### Phase 2: Snapshotting new changes + +This allows a user to write new contents to a submodule and its remote. + +#### Outcomes + +- Changes in the working copy can be recorded in a submodule commit +- Submodule branches can be modified +- Submodules and their branches can be pushed to their remote + +### Phase 3: Merging/rebasing/conflicts + +This allows merging and rebasing of superproject commits in a content-aware way +(in contrast to Git, where only the gitlink commit ids are compared), as well as +workflows that make resolving conflicts easy and sensible. + +This can be done in tandem with Phase 2, but will likely require a significant +amount of design work on its own. + +#### Outcomes + +- Merged/rebased submodules result in merged/rebased working copy content +- Merged/rebased working copy content can be committed, possibly by creating + sensible merged/rebased submodule commits +- Merge/rebase between submodule and non-submodule gives a sensible result +- Merge/rebase between submodule A and submodule B gives a sensible result + +### Phase ?: An ideal world + +I.e. outcomes we would like to see if there were no constraints whatsoever. + +- Rewriting submodule commits rewrites descendants correctly and updates + superproject gitlinks. +- Submodule conflicts automatically resolve to the 'correct' submodule commits, + e.g. a merge between superproject commits creating a merge of the submodule + commits. +- Nested submodules are as easy to work with as non-nested submodules. +- The operation log captures changes in the submodule. + +## Design + +### Guiding principles + +TODO + +### On-disk representation + +TODO + +### Snapshotting new submodule changes + +TODO + +### Merging/rebasing with submodules + +TODO