Commit graph

3259 commits

Author SHA1 Message Date
Yuya Nishihara
f166fd0726 revset: add at_operation(op, expression)
This can be used in order to refer old working-copy commit, for example. If
we find it's useful, maybe we can add an infix syntax later.

Closes 
2024-10-12 07:57:55 +09:00
Yuya Nishihara
303564ca2b repo: add generic Repo::base_repo() to access current operation from revset
This isn't fancy, but I couldn't find a better abstraction. Note that
MutableRepo::base_repo() isn't removed as it has to return &Arc<_>, whereas
<ReadonlyRepo as Repo>::base_repo() can't return &Arc<_>.
2024-10-12 07:57:55 +09:00
dploch
4f15ca41bf squash: move the core functionality to jj_lib
This allows 'squash' to be executed more easily from a programmatic context
2024-10-09 10:15:57 -04:00
Yuya Nishihara
09d91efea5 id_prefix: propagate error from disambiguation index
The id.shortest() template prints a warning and falls back to repo-global
resolution. This seems better than erroring out. There are a few edge cases
in which the short-prefixes resolution can fail unexpectedly. For example, the
trunk() revision might not exist in operations before "jj git clone".
2024-10-09 14:07:48 +09:00
Yuya Nishihara
3ff1f985f3 revset: pass separate repo to disambiguation index
The idea is that the disambiguation index can be loaded from a repo which is
different from the symbol resolution context.

Suppose we add at_operation(op, expr) revset, a symbol inside at_operation()
expression will have to be resolved within that operation, whereas the
disambiguation index is cached globally by WorkspaceCommandHelper. We could
build temporary disambiguation index for each at-op repo, but that would be
complicated implementation-wise, and wouldn't be useful. For example, a query
"x | at_operation(@-, x)" might be resolved to "xy | at_operation(@-, xz)"
if disambiguation index were reloaded for the @- operation. Instead, the
short change ID "x" can be disambiguated to "xy", then resolved to the
corresponding commit IDs at each operation.
2024-10-09 14:07:48 +09:00
Yuya Nishihara
3d41237efe id_prefix: add empty disambiguation index for convenience
Callers might want to fall back to an empty index if context.populate() failed.
2024-10-09 14:07:48 +09:00
Yuya Nishihara
bf6620d8d9 id_prefix: add explicit method that loads disambiguation index
This unblocks reuse of a symbol resolver instance for a different repo view
specified by at_operation() revset. See later commits for details. It's also
easier to handle error if there is a single function that can fail.
2024-10-09 14:07:48 +09:00
Yuya Nishihara
760134f9c1 revset: error out on unindexed commit ID instead of panicking
It no longer makes sense to handle missing root commit by the revset frontend,
but panicking wouldn't be good either. Let's make it error out.
2024-10-08 13:21:03 +09:00
Yuya Nishihara
ae62b5b946 repo: teach OpStore about the root commit id
This removes an invalid View state from the root operation.

Note that the root index will have to be reindexed in order to resolve "root()"
in the root operation. I don't think this would practically matter, so this
patch doesn't bump the index version to invalidate the existing indexes.

See also 48a9f9ef56 "repo: use Transaction for creating repo-init operation."
2024-10-08 13:21:03 +09:00
Yuya Nishihara
2df442299d repo: plumbing to initialize root view with root commit id
See the next patch for why. It might look odd that OpStore depends on the root
CommitId, but that seems okay because OpStore manages Views, and a View is
basically a set of CommitIds.
2024-10-08 13:21:03 +09:00
Yuya Nishihara
33f0472fcf repo: add convenient methods to load operation object
It's hosted by RepoLoader for now. I'm not sure if we'll need a higher-level
abstraction like Store.
2024-10-08 13:21:03 +09:00
Yuya Nishihara
30a348344b repo: pack common ReadonlyRepo fields into RepoLoader
I'll add a few helper methods to RepoLoader. It seems also nicer that
repo.loader() doesn't allocate new RepoLoader.
2024-10-08 13:21:03 +09:00
Yuya Nishihara
47ff6f18aa repo: fix type name in ReadonlyRepo debug output, show ellipsis 2024-10-08 13:21:03 +09:00
Matt Stark
a524f1f996 refactor: Allow the aliases map to map to arbitrary types.
For , we will have aliases such as:
```toml
'upload(revision)' = [
  ["fix", "-r", "$revision"],
  ["lint", "-r", "$revision"],
  ["git", "push", "-r", "$revision"],
]
```

Template aliases:
1) Start as Config::Value
2) Are converted to String
3) Are placed in the alias map
4) Expand to a TemplateExpression type via expand_defn.

However, command aliases:
1) Start as Config::Value
2) Are converted to Vec<Vec<String>>
3) Are placed in an alias map
4) Do not expand

Thus, AliasesMap will need to support non-string values.
2024-10-08 10:19:14 +11:00
Yuya Nishihara
24ff802f58 revset: use Self to refer to expressions of the same type
Just because it's shorter.
2024-10-06 23:03:38 +09:00
Yuya Nishihara
383cca4c4d diff: return matching hunk contents from all inputs
We're likely to use the right (or new) context lines in rendered diffs, but
it's odd that the hunks iterator choose which context hunk to return. We'll
also need both contents to calculate left/right line numbers.

Since the hunk content types are the same, I also split enum DiffHunk into
{ kind, contents } pair.
2024-10-06 09:45:27 +09:00
Yuya Nishihara
c5f926103a diff: use low-level HashTable in Histogram
This change made some diff benches slow, maybe because the generated code
becomes slightly worse due to the added abstraction? I'll revisit the
performance problem later. There are a couple of ways to mitigate it.

```
group                             new                     old
-----                             ---                     ---
bench_diff_git_git_read_tree_c    1.02     61.0±0.23µs    1.00     59.7±0.38µs
bench_diff_lines/modified/10k     1.00     41.6±0.24ms    1.02     42.3±0.22ms
bench_diff_lines/modified/1k      1.00      3.8±0.07ms    1.00      3.8±0.03ms
bench_diff_lines/reversed/10k     1.29     23.4±0.20ms    1.00     18.2±0.26ms
bench_diff_lines/reversed/1k      1.05    517.2±5.55µs    1.00   493.7±59.72µs
bench_diff_lines/unchanged/10k    1.00      3.9±0.10ms    1.08      4.2±0.10ms
bench_diff_lines/unchanged/1k     1.01    356.8±2.33µs    1.00    353.7±1.99µs
```
(I don't get stable results on my noisy machine, so the results would vary.)
2024-10-05 08:12:30 +09:00
Yuya Nishihara
de137c8f9a diff: implement some ignore-space rules
The added comparison functions correspond to --ignore-all-space and
--ignore-space-change. --ignore-space-at-eol can be combined with the other
flags, so it will have to be implemented as a preprocessing function.
--ignore-blank-lines will also require some change in the tokenizer function.
2024-10-05 08:12:30 +09:00
Yuya Nishihara
f672c92509 diff: add trait for bytes comparison
This could be implemented as a newtype `Wrapper<'a>(&'a [u8])`, but a lifetime
of the wrap function couldn't be specified correctly:

  fn diff(left: &[u8], right: &[u8], wrap_fn: F, ..)
  where
    F: for<'a> Fn(&'a [u8]) -> W<'a>, // F::Output<'a> can't be specified
    W: Copy + Eq + Hash

If the wrapper were of `&Wrapper([u8])` type, `Fn(&[u8]) -> &W` works. However,
it means we can no longer set comparison parameter (such as Regex) dynamically.

Another idea is to add some filter function of `Fn(&[u8]) -> Cow<'_, [u8]>`
type, but I don't think we would want to pay the allocation cost in
hashing/comparison code. `Fn(&[u8]) -> impl Iterator<Item = &[u8]>` might work,
but it would be equally complex.
2024-10-05 08:12:30 +09:00
Yuya Nishihara
dfaa52c88a cargo: add hashbrown dependency
We'll use low-level HashTable to customize Eq/Hash without implementing newtype
wrappers.

Unneeded default features are disabled for now. Note that the new default
hasher, foldhash, is released under the Zlib license, which isn't currently
included in the allow list.
2024-10-05 08:12:30 +09:00
Samuel Tardieu
43711de61c style: use .filter_map() instead of .flat_map() where appropriate 2024-10-04 22:29:13 +02:00
Samuel Tardieu
12f4d6d17b style: avoid using .to_owned()/.to_vec() on owned objects
`.clone()` is more explicit when we already have an object
of the right type.
2024-10-04 22:29:13 +02:00
Samuel Tardieu
3f2ef2ee04 style: add semicolon at the end of expressions used as statements 2024-10-04 22:29:13 +02:00
Samuel Tardieu
46e2723464 style: inline variables into format strings 2024-10-04 22:29:13 +02:00
Samuel Tardieu
62f582e6ab style: remove useless uses of .iter()
Most collection references implement `.into_iter()` or its mutable version,
so it is possible to iterate over the elements without using an explicit
method to do so.
2024-10-04 22:29:13 +02:00
Samuel Tardieu
3f0703ca2c cargo: inherit lints configuration from workspace 2024-10-04 22:29:13 +02:00
Samuel Tardieu
8ba33b7383 lib: add short method summary to its documentation 2024-10-04 17:09:54 +02:00
Samuel Tardieu
87840a5c2c testutils: add short method summary to its documentation 2024-10-04 17:09:54 +02:00
Samuel Tardieu
ac95a86584 proc-macros: add short method summary to its documentation 2024-10-04 17:09:54 +02:00
Matt Stark
6878b5047b Fix: Disallow revset function names starting with a number. 2024-10-04 15:25:11 +10:00
Yuya Nishihara
4f8ae69367 diff: keep absolute ranges to support unchanged regions of different lengths
We have two options to achieve "diff --ignore-*-space":
 a. preprocess contents to be diffed, then translate hunk ranges back
 b. add hooks to customize eq and hash functions
I originally thought (a) would be easier, but actually, there aren't many
changes needed to implement (b). And (b) should have a fewer logic errors.

This patch removes assumption that each unchanged region has the same content
length. It won't be true if whitespace characters are ignored.
2024-10-01 21:24:02 +09:00
Yuya Nishihara
e28fdac48e diff: extract helper that returns texts between two UnchangedRanges 2024-10-01 21:24:02 +09:00
Yuya Nishihara
055f15b6a8 diff: don't destructure UnchangedRange in a loop, rename base_range field
I'll replace offsets with absolute [Range, ..] to support unchanged regions of
different lengths. All fields in UnchangedRange will be ranges.
2024-10-01 21:24:02 +09:00
Yuya Nishihara
db226d9f64 revset: fix crash on "log --at-op 00000000 -r 'root()'"
Spotted while refactoring IdPrefixContext.
2024-10-01 21:23:47 +09:00
Yuya Nishihara
0ac6df7073 revset: make present(unknown@) recover from missing working copy error
Missing working-copy commit is similar situation to unknown ref, and should
be caught by present().
2024-10-01 20:04:06 +09:00
Yuya Nishihara
68176d965e diff: do not translate word-range indices by collect_unchanged_ranges()
Intersection of unchanged ranges becomes a simple merge-join loop, so I've
removed the existing tests. I also added a fast path for the common 2-way
diffs in which we don't have to build vec![(pos, vec![pos])].

One source of confusion introduced by this change is that WordPosition means
both global and local indices. This is covered by the added tests, but I might
add separate local/global types later.
2024-10-01 06:31:22 +09:00
Yuya Nishihara
483db9d7d2 diff: reuse result vec when recurse into unchanged_ranges()
It's silly that we build new Vec for each recursion stack and merge elements
back. I don't see a measurable performance difference in the diff bench, but
this change will help simplify the next patch. If a result vec were created for
each unchanged_ranges() invocation, it would probably make more sense to return
a list of "local" word positions. Then, callers would have to translate the
returned positions to the caller's local positions.
2024-10-01 06:31:22 +09:00
Yuya Nishihara
f6277bbdb8 diff: remove redundant check for empty ranges from unchanged_ranges() recursion
It's cheap to create an empty Vec, and I'm going to remove it anyway.
2024-10-01 06:31:22 +09:00
Yuya Nishihara
64e1ae277d diff: remove redundant hash map lookup of uncommon shared words 2024-09-28 07:49:28 +09:00
Yuya Nishihara
5c52b4ec13 diff: omit construction of count-to-words map for right-side histogram
This also allows us to borrow Vec<WordPositions> from &self.
2024-09-28 07:49:28 +09:00
Yuya Nishihara
493f610fd5 diff: build left-right index map without using (word, occurrence) hash keys
We can assign a unique integer to each (word, occurrence) pair instead. As a
bonus, HashMap can be replaced with Vec.

```
group                             new                     old
-----                             ---                     ---
bench_diff_git_git_read_tree_c    1.00     72.5±3.25µs    1.08     78.5±0.48µs
bench_diff_lines/modified/10k     1.00     45.1±1.18ms    1.10     49.8±1.85ms
bench_diff_lines/modified/1k      1.00      4.1±0.07ms    1.11      4.5±0.34ms
bench_diff_lines/reversed/10k     1.00     19.0±0.12ms    1.12     21.2±1.26ms
bench_diff_lines/reversed/1k      1.00   558.5±37.42µs    1.17   655.6±16.27µs
bench_diff_lines/unchanged/10k    1.00      5.3±0.78ms    1.33      7.0±0.89ms
bench_diff_lines/unchanged/1k     1.00   422.0±16.68µs    1.28   540.7±13.96µs
```
2024-09-28 07:49:28 +09:00
Yuya Nishihara
6cc76ba543 revset: fix copy-paste error in conflict() deprecation message 2024-09-25 16:26:49 +09:00
Yuya Nishihara
dd93e8f60b diff: introduce newtype that represents word-range index
There are usize text indices/ranges and word-range indices. Let's make them
somewhat distinct.
2024-09-25 07:39:41 +09:00
Yuya Nishihara
739a5d8617 diff: pack (text, ranges) pair in a struct
I'll add a few more helper methods there. It might also make sense to cache
precomputed hash values.

unchanged_ranges() is made private since there are no external callers, and
I'm going to add more private types.
2024-09-25 07:39:41 +09:00
Yuya Nishihara
1b469321e2 diff: sort word occurrences only by positions
Since uncommon_shared_words are unique, their occurrence positions should also
be unique.
2024-09-25 07:39:41 +09:00
Yuya Nishihara
5842267c73 diff: use iter::zip() instead of slice indexing 2024-09-25 07:39:41 +09:00
Essien Ita Essien
895d53f395 Rename conflict and file revsets to conflicts and files.
See discussion thread in linked issue.

With this PR, all revset functions in [BUILTIN_FUNCTION_MAP](8d166c7642/lib/src/revset.rs (L570))
that return multiple values are either named in plural or the naming is hard to misunderstand (e.g. `reachable`)

Fixes: 
2024-09-24 20:02:49 +01:00
Samuel Tardieu
90280ad2fd repo: introduce MutableRepo::reparent_descendants() 2024-09-24 09:30:28 +02:00
Samuel Tardieu
736163c8d3 repo: add MutableRepo::rebase_descendants documentation 2024-09-24 09:30:28 +02:00
Yuya Nishihara
49e45cc245 revset, templater: add deprecation warnings 2024-09-23 07:07:07 +09:00