docs/design: Move the run doc to github.
This ticks another box in #1869. Co-Authored-By: arxanas <me@waleedkhan.name> Co-Authored-By: hooper <hooper@google.com> Co-Authored-By: martinvonz <martinvonz@google.com>
parent 2c74fa8c7c
commit 5b729a90a9
2 changed files with 278 additions and 0 deletions
277
docs/design/run.md
Normal file
@ -0,0 +1,277 @@
# Introducing JJ run

Authors: [Philip Metzger](mailto:philipmetzger@bluewin.ch), [Martin von Zweigbergk](mailto:martinvonz@google.com), [Danny Hooper](mailto:hooper@google.com), [Waleed Khan](mailto:me@waleedkhan.name)

Initial Version, 10.12.2022 (view full history [here](https://docs.google.com/document/d/14BiAoEEy_e-BRPHYpXRFjvHMfgYVKh-pKWzzTDi-v-g/edit))

**Summary:** This document describes the design of a new `run` command for
Jujutsu, which will be used to integrate seamlessly with build systems, linters
and formatters. This is achieved by running a user-provided command or script
across multiple revisions. For more details, read the
[Use-Cases of jj run](#Use-Cases-of-jj-run).

## Preface

The goal of this Design Document is to specify the correct behavior of `jj run`.
The points we decide on here I (Philip Metzger) will try to implement. There
is some prior work in other DVCSs:

* `git test`: part of [git-branchless]. Similar to this proposal for `jj run`.
* `hg run`: Google's internal Mercurial extension. Similar to this proposal for
  `jj run`. Details not available.
* `hg fix`: Google's open source Mercurial extension: [source code][fix-src]. A
  more specialized approach to rewriting file content without full context of the
  working directory.
* `git rebase -x`: runs commands opportunistically as part of rebase.
* `git bisect run`: run a command to determine which commit introduced a bug.

## Context and Scope

The initial need for some kind of command runner integrated in the VCS surfaced
in a [github discussion][pre-commit]. In a [discussion on discord][hooks] about
the git-hook model, there was consensus about not repeating its mistakes.

For `jj run` there is prior art in Mercurial, git-branchless and Google's
internal Mercurial. Currently git-branchless's `git test` and `hg fix` implement
some kind of command runner, while the Google-internal `hg run` works in
conjunction with CitC (Clients in the Cloud), which allows it to lazily apply
the current command to any affected file. The base Jujutsu backend does not
have a fancy virtual filesystem supporting it, so we can't apply this
optimization.

## Goals and Non-Goals

### Goals

* We should be able to apply the command to any revision, published or unpublished.
* We should be able to parallelize running the actual command, while preserving
  good console output.
* The run command should be able to work in the working copy.
* There should exist some way to signal hard failure.
* The command should build enough infrastructure for `jj test`, `jj fix` and
  `jj format`.
* The main goal is to be good enough, as we can always expand the functionality
  in the future.

### Non-Goals

* While we should build a base for `jj test`, `jj format` and `jj fix`, we
  shouldn't mash their use-cases into `jj run`.
* The command shouldn't be too smart, as too many assumptions about workflows
  make the command confusing for users.
* Smart caching of outputs, as user-provided commands can be unpredictable.
* Fine-grained user-facing configuration, as it's unwarranted complexity.
* A `fix` subcommand, as it cuts too much design space.

## Use-Cases of jj run

**Linting and Formatting:**

- `jj run 'pre-commit run' -r $revset`
- `jj run 'cargo clippy' -r $revset`
- `jj run 'cargo +nightly fmt'`

**Large scale changes across repositories, local and remote:**

- `jj run 'sed s/some/test/' -r 'draft() & ~remote_branches()'`
- `jj run '$rewrite-tool' -r '$revset'`

**Build systems:**

- `jj run 'bazel build //some/target:somewhere'`
- `jj run 'ninja check-lld'`

Some of these use-cases should get a specialized command, as this allows
further optimization. A command could be `jj format`, which runs a list of
formatters over a subset of a file in a revision. Another command could be
`jj fix`, which runs a command like `rustfmt` or `cargo clippy --fix` over
a subset of a file in a revision.

## Design

### Base Design

All the work will be done in the `.jj/` directory. This allows us to hide all
complexity from the users, while preserving the user's current workspace.

We will copy the approach from git-branchless's `git test` of creating a
temporary working copy for each parallel command. The working copies will be
reused between `jj run` invocations. They will also be reused within a `jj run`
invocation if there are more commits to run on than there are parallel jobs.

We will leave ignored files in the temporary directory between runs. That
enables incremental builds (e.g. by letting cargo reuse its `target/` directory).
However, it also means that runs potentially become less reproducible. We will
provide a flag for removing ignored files from the temporary working copies to
address that.

Another problem with leaving ignored files in the temporary directories is that
they take up space. That is especially problematic in the case of cargo (the
`target/` directory often takes up tens of GBs). The same flag for cleaning up
ignored files can be used to address that. We may want to also have a flag for
cleaning up temporary working copies *after* running the command.

An early version of the command will directly use [Treestate] to manage the
temporary working copies. That means that running `jj` inside the temporary
working copies will not work. We can later extend that to use a full
[Workspace]. To prevent operations in the working copies from impacting the
repo, we can use a separate [OpHeadsStore] for it.

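The reuse of temporary working copies within one invocation can be pictured as
a fixed pool of slots. The following is only a sketch in Python, not the
planned implementation: the directory layout `.jj/run/slot-N` and the function
name `assign_slots` are invented for illustration, and the round-robin
assignment stands in for a real scheduler, which would hand out a slot as soon
as its previous command finishes.

```python
from collections import deque


def assign_slots(commits, jobs, root=".jj/run"):
    """Map each commit to one of `jobs` reusable working-copy directories.

    `root` and the `slot-N` naming are hypothetical; the design doc does not
    specify where the temporary working copies live inside `.jj/`.
    Returns a list of (commit, directory) pairs in execution order.
    """
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    slots = deque(f"{root}/slot-{i}" for i in range(jobs))
    plan = []
    for commit in commits:
        slot = slots.popleft()  # take the slot that has been idle the longest
        plan.append((commit, slot))
        slots.append(slot)      # the same directory is reused for a later commit
    return plan
```

With three commits and two jobs, the third commit reuses the first slot, so
ignored build output left there (such as cargo's `target/`) is available to it.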
### Modifying the Working Copy

Since the subprocesses will run in temporary working copies by default, they
won't interfere with the user's working copy. The user can therefore continue
to work in it while `jj run` is running.

We want subprocesses to be able to make changes to the repo by updating their
assigned working copy. Let's say the user runs `jj run` on just commits A and
B, where B's parent is A. Any changes made on top of A would be squashed into
A, forming A'. Similarly, B' would be formed by squashing the changes made on
top of B into B. We can then either do a normal rebase of B' onto A', or we can
simply update its parent to A'. The former is useful, e.g. when the subprocess
only makes a partial update of the tree based on the parent commit. In addition
to these two modes, we may want to have an option to ignore any changes made in
the subprocess's working copy.

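The difference between the rebase and reparent modes shows up when the
subprocess only touches part of the tree. The sketch below models trees as
plain `{path: content}` dictionaries. The names `three_way_merge` and
`finish_child` are hypothetical, and none of this is jj's actual tree or merge
code.

```python
def three_way_merge(base, ours, theirs):
    """Per-path three-way merge of trees modelled as {path: content} dicts."""
    merged = {}
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t or t == b:
            result = o  # both sides agree, or only "ours" changed the path
        elif o == b:
            result = t  # only "theirs" changed the path
        else:
            # jj records conflicts in the commit instead of failing;
            # raising keeps this sketch short.
            raise ValueError(f"conflict in {path}")
        if result is not None:  # None models a path that is absent or deleted
            merged[path] = result
    return merged


def finish_child(a, a_new, b_new, mode):
    """Compute the final tree of B' once its parent A has become A'."""
    if mode == "reparent":
        return dict(b_new)  # keep the tree as-is, only the parent changes
    if mode == "rebase":
        # Apply the diff A -> B' on top of A'.
        return three_way_merge(base=a, ours=a_new, theirs=b_new)
    raise ValueError(f"unknown mode: {mode}")
```

Suppose a formatter rewrote `lib.rs` in A but, when run on B, only touched the
file B added. Reparenting keeps B's unformatted copy of `lib.rs`, so B' would
appear to undo the formatting from A'. Rebasing carries the formatted version
forward.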
### Modifying the Repo

Once we give the subprocess access to a fork of the repo via a separate
[OpHeadsStore], it will be able to create new operations in its fork.
If the user runs `jj run -r foo` and the subprocess checks out another commit,
it's not clear what that should do. We should probably just verify that the
working-copy commit's parents are unchanged after the subprocess returns. Any
operations created by the subprocess will be ignored.

### Rewriting the revisions

We should handle public and private revisions differently. We choose to operate
on an immutable history by default.

### Public revisions

For published revisions, we will not allow `jj run` to modify them; it will
instead error out immediately, as published history should be immutable. We may
want to support a `--force` flag as an override, but it won't be available in
the first iteration of the command.

### Private/Draft revisions

For private/draft revisions, we just amend the changes, as Jujutsu usually does.
We also expose the actual behavior as a command option.

## Execution order/parallelism

It may be useful to execute commands in topological order. This helps, for
example, commands with costs proportional to incremental changes, like build
systems. There may also be other relevant heuristics, but topological order is
an easy and effective way to start.

Parallel execution of commands on different commits may choose to schedule
commits to still reduce incremental changes in the working copy used by each
execution slot/"thread". However, running the command on all commits
concurrently should be possible if desired.

Executing commands in topological order allows for more meaningful use of any
potential features that stop execution "at the first failure". For example,
when running tests on a chain of commits, it might be useful to proceed in
topological/chronological order and stop on the first failure, because a
failure might imply that the remaining executions will fail as well and are
therefore undesirable.

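Topological ordering of the selected commits can be sketched with Python's
standard `graphlib` module (Python 3.9+). The function name `execution_order`
and the `parents` mapping are assumptions made for this example. A real
implementation would walk jj's commit graph instead.

```python
from graphlib import TopologicalSorter


def execution_order(parents, selected):
    """Order `selected` so that every commit comes after its selected parents.

    `parents` maps a commit id to the list of its parent ids. Only direct
    parent edges inside the selection are considered, so a selection with
    gaps in the ancestry is treated as a set of unrelated commits.
    """
    selected = set(selected)
    graph = {
        commit: [p for p in parents.get(commit, []) if p in selected]
        for commit in sorted(selected)
    }
    # static_order() yields each node only after all of its predecessors.
    return list(TopologicalSorter(graph).static_order())
```

For a linear chain the result is the chain from oldest to newest, regardless of
the order in which the commits were selected, which is what a "stop at the
first failure" mode would want to iterate over.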
## Dealing with failure

It will be useful to have multiple strategies to deal with failures on a single
or multiple revisions. The reason for these strategies is to allow customized
conflict handling. These strategies can then be exposed in the UI with a
matching command.

**Continue:** If any subprocess fails, we will continue the work on child
revisions. Notify the user on exit about the failed revisions.

**Stop:** Signal a fatal failure and cancel any scheduled work that has not
yet started running, but let any already started subprocess finish. Notify the
user about the failed command and display the generated error from the
subprocess.

**Fatal:** Signal a fatal failure and immediately stop processing and kill any
running processes. Notify the user that we failed to apply the command to the
specific revision.

We will leave any affected commit in its current state, if any subprocess fails.
This allows us to provide a better user experience, as leaving revisions in an
undesirable state, e.g. partially formatted, may confuse users.

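The difference between continuing and stopping can be sketched with a
sequential runner. This is a simplification under stated assumptions: the
function `run_all` and its `run_one` callback are hypothetical, and with one
job at a time **Stop** and **Fatal** are indistinguishable, because there is
never another running subprocess to wait for or kill.

```python
def run_all(revisions, run_one, strategy="continue"):
    """Run `run_one(rev)` for each revision, honouring the error strategy.

    `run_one` returns True on success and False on failure.
    Returns (succeeded, failed, skipped) lists of revisions.
    """
    if strategy not in ("continue", "stop", "fatal"):
        raise ValueError(f"unknown strategy: {strategy}")
    revisions = list(revisions)
    succeeded, failed, skipped = [], [], []
    for index, rev in enumerate(revisions):
        if run_one(rev):
            succeeded.append(rev)
            continue
        # A failed revision is only recorded; its commit is left untouched.
        failed.append(rev)
        if strategy != "continue":
            skipped = revisions[index + 1:]  # scheduled but never started
            break
    return succeeded, failed, skipped
```

In both cases the caller has the list of failed revisions to report to the user
on exit.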
## Resource constraints

It will be useful to constrain the execution to prevent resource exhaustion.
Relevant resources could include:

- CPU and memory available on the machine running the commands. `jj run` can
  provide some simple mitigations like limiting parallelism to "number of CPUs"
  by default, and limiting parallelism by dividing "available memory" by some
  estimate or measurement of per-invocation memory use of the commands.
- External resources that are not immediately known to jj. For example,
  commands run in parallel may wish to limit the total number of connections
  to a server. We might choose to defer any handling of this to the
  implementation of the command being invoked, instead of trying to
  communicate that information to jj.

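The two mitigations from the first bullet combine into a simple default for the
number of parallel jobs. A minimal sketch, assuming the caller already has an
estimate of available memory and per-invocation memory use in bytes (how to
obtain those numbers is left open by this document). The name `default_jobs`
is invented for illustration.

```python
import os


def default_jobs(available_memory, per_job_memory, cpus=None):
    """Pick a job count bounded by both CPU count and available memory."""
    if cpus is None:
        cpus = os.cpu_count() or 1  # cpu_count() may return None
    if per_job_memory <= 0:
        return max(1, cpus)         # no usable estimate: fall back to CPUs only
    by_memory = available_memory // per_job_memory
    return max(1, min(cpus, by_memory))  # always allow at least one job
```

For example, with 8 CPUs, 16 GiB of available memory and an estimate of 3 GiB
per invocation, memory is the tighter bound and the result is 5 jobs.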
## Command Options

The base command of any jj command should be usable. By default, `jj run` works
on `@`, the current working copy.

* --command, explicit name of the first argument
* -x, for git compatibility (may alias another command)
* -j, --jobs, the amount of parallelism to use
* -k, --keep-going, continue on failure (may alias another command)
* --show, display the diff for an affected revision
* --dry-run, do the command execution without doing any work, logging all
  intended files and arguments
* --rebase, rebase all parents on the resulting diff (may alias another
  command)
* --reparent, change the parent of an affected revision to the new change
  (may alias another command)
* --clean, remove existing workspaces and remove the ignored files
* --readonly, ignore changes across multiple run invocations
* --error-strategy=`continue|stop|fatal`, see [Dealing with failure](#Dealing-with-failure)

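How some of these options could interact is sketched below with Python's
`argparse`. This is an illustration only and covers a subset of the list. It
rests on several assumptions that this document does not state: that `-k` is an
alias for `--error-strategy=continue`, that `stop` is the default strategy, and
that `--rebase`, `--reparent` and `--readonly` are mutually exclusive.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="jj run")
    parser.add_argument("command", help="command or script to run in each revision")
    parser.add_argument("-r", dest="revisions", default="@", help="revset to run on")
    parser.add_argument("-j", "--jobs", type=int, default=None)
    parser.add_argument("--error-strategy", choices=["continue", "stop", "fatal"],
                        default=None)
    parser.add_argument("-k", "--keep-going", action="store_true")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--clean", action="store_true")
    # Assumption: the three ways of treating subprocess changes exclude each other.
    mode = parser.add_mutually_exclusive_group()
    for name in ("rebase", "reparent", "readonly"):
        mode.add_argument(f"--{name}", dest="mode", action="store_const", const=name)
    return parser


def parse_args(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.keep_going:
        # Assumption: -k is shorthand for --error-strategy=continue.
        if args.error_strategy not in (None, "continue"):
            parser.error("--keep-going conflicts with --error-strategy")
        args.error_strategy = "continue"
    elif args.error_strategy is None:
        args.error_strategy = "stop"  # assumed default, not fixed by this doc
    return args
```

For instance, `parse_args(["cargo clippy", "-r", "foo", "-k"])` yields the
`continue` strategy for revset `foo`, while `parse_args(["true"])` falls back to
`@` and the assumed `stop` default.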
### Integrating with other commands

`jj log`: No special handling needed
`jj diff`: No special handling needed
`jj st`: For now reprint the final output of `jj run`
`jj op log`: No special handling needed, but awaits further discussion in
[#963][issue]
`jj undo/jj op undo`: No special handling needed

## Open Points

Should the command be backend specific?
How do we manage the processes which the command will spawn?
Configuration options, user and repository wide?

## Future possibilities

- We could rewrite the file in memory, which is a neat optimization
- Exposing some internal state, to allow more precise resource constraints
- Integration options for virtual filesystems, which allow them to cache the
  needed working copies.
- A Jujutsu-wide concept for a cached working copy, as they could be expensive
  to materialize.
- Customized failure messages; this may be useful for bots. It could be similar
  to Bazel's `select(..., no_match_error = "arch not supported for $project")`.
- Make `jj run` asynchronous by spawning a `main` process, directly returning to
  the user and incrementally updating the output of `jj st`.

[git-branchless]: https://github.com/arxanas/git-branchless
[issue]: https://github.com/martinvonz/jj/issues/963
[fix-src]: https://repo.mercurial-scm.org/hg/file/tip/hgext/fix.py
[hooks]: https://discord.com/channels/968932220549103686/969829516539228222/1047958933161119795
[OpHeadsStore]: https://github.com/martinvonz/jj/blob/main/lib/src/op_heads_store.rs
[pre-commit]: https://github.com/martinvonz/jj/issues/405
[Treestate]: https://github.com/martinvonz/jj/blob/af85f552b676d66ed0e9ae0d401cd0c4ffbbeb21/lib/src/working_copy.rs#L117
[Workspace]: https://github.com/martinvonz/jj/blob/af85f552b676d66ed0e9ae0d401cd0c4ffbbeb21/lib/src/workspace.rs#L54

@ -76,6 +76,7 @@ nav:
- 'Design docs':
    - 'git-submodules': 'design/git-submodules.md'
    - 'git-submodule-storage': 'design/git-submodule-storage.md'
    - 'JJ run': 'design/run.md'
    - 'Tracking branches': 'design/tracking-branches.md'