ok/jj - ok.software


Author	SHA1	Message	Date
Yuya Nishihara	8268af9b4f	merge: add helper function to match Option<impl Borrow<TreeValue>> More callers will be added by the next commit.	2024-08-12 23:01:46 +09:00
Yuya Nishihara	accd1e337a	merge: add .cloned() method that maps inner Option<&T> to Option<T> MergedTreeVal::to_merge() will be replaced with this.	2024-08-12 23:01:46 +09:00
Benjamin Tan	e2ab6d4f42	rewrite: migrate `move_commits` function from `rebase` command	2024-08-12 21:48:17 +08:00
Benjamin Tan	9c1b627f9b	jj_lib: include `indexmap` as dependency This is in preparation for shifting of `move_commits` function to `jj_lib::rewrite`.	2024-08-12 21:48:17 +08:00
Yuya Nishihara	fd52efa0ba	merged_tree: leverage Merge<Tree> entries iterator in all_tree_entries()	2024-08-12 10:20:34 +09:00
Yuya Nishihara	88018e84fc	merged_tree: micro-optimize Merge<Tree> entries iterator to return &TreeValue try_resolve_file_conflict() is also updated. It could be a generic function, but there are only two callers, and the legacy tree one is used only in tests.	2024-08-12 10:20:34 +09:00
Yuya Nishihara	6d6f5990de	merged_tree: add merge-join iterator over Merge<Tree> entries For the same reason as `2cb7e91d` "merged_tree: do not re-look up non-conflicting tree values by name." This appears to bring a similar performance improvement. I assume this change is/will be covered by test_merged_tree.rs. I considered adding a few unit tests, but constructing Tree object isn't trivial, and the iterator implementation is relatively straightforward.	2024-08-12 10:20:34 +09:00
Matt Kulukundis	5911e5c9b2	copy-tracking: Add copy tracking as a post iteration step - force each diff command to explicitly enable copy tracking - enable copy tracking in diff_summary - post-process for diff iterator - post-process for diff stream - update changelog	2024-08-11 17:01:45 -04:00
Matt Kulukundis	0349d9ead3	copy-tracking: extract next_impl from next in diff iter/stream	2024-08-11 17:01:45 -04:00
Matt Kulukundis	34b0f87584	copy-tracking: plumb CopyRecordMap through diff method	2024-08-11 17:01:45 -04:00
Matt Kulukundis	6bae5eaf9d	copy-tracking: create a MaterializedTreeDiffEntry type	2024-08-11 17:01:45 -04:00
Matt Kulukundis	e123eb21b9	copy-tracking: add `source` field to `TreeDiffEntry` - add the field and make it compile, but don't use it yet	2024-08-11 17:01:45 -04:00
Matt Kulukundis	8e84c60157	copy-tracking: create an explicit TreeDiffEntry struct	2024-08-11 17:01:45 -04:00
Matt Kulukundis	ee6b922144	copy-tracking: create CopyRecordMap and add it to diff summaries	2024-08-11 17:01:45 -04:00
Matt Kulukundis	e667a2b403	copy-tracking: adjust backend signature - use a single commit instead of an array of them. This simplifies the implementation. A higher level api can wrap this when an array of commits is desired and those semantics are figured out. - since this API is directly 1-1 on parents, there are no conflicts - if we introduce a higher level API that handles lists of commits, we may need to restore the conflict/resolved distinction, but for now simplify	2024-08-11 17:01:45 -04:00
Yuya Nishihara	c9e147c425	merged_tree: allow to postpone resolution of intermediate trees This allows us to diff trees without fully resolving conflicts: let from_tree = merge_no_resolve(..); for (path, (from, to)) in from_tree.diff(to_tree, matcher) { let from = resolve_conflicts(from); if from == to { continue; // resolved file may be identical ... I originally considered adding a matcher argument to merge() functions, but the resulting API looked misleading. If merge() took a matcher, callers might expect unmatched trees and files were omitted, not left unresolved. It's also slower than diffing unresolved trees because merge(.., matcher) would have to write partially resolved trees to the store. Since "ancestor_tree" isn't resolved by itself, this patch has a subtle behavior change. For example, "jj diff -r9eaef582" in the "git" repository is no longer empty. I think the new behavior is also technically correct, but I'm not entirely sure.	2024-08-11 18:23:21 +09:00
Yuya Nishihara	5d141befc2	tests: evaluate file()/diff_contains() revset against merged parents These tests would fail if trees are compared without resolving file conflicts.	2024-08-11 18:23:21 +09:00
Yuya Nishihara	dac04960f0	rewrite: remove redundant commit_id.clone() from merge_commit_trees*()	2024-08-11 18:23:21 +09:00
Yuya Nishihara	ed1c07e73e	tree: fill in valid id to null tree, rename function to empty() If a null tree were written to the store, GitBackend would crash because of invalid hash length.	2024-08-11 18:23:21 +09:00
Yuya Nishihara	2cb7e91dc7	merged_tree: do not re-look up non-conflicting tree values by name While measuring file(path) query, I noticed BTreeMap lookup appears in perf. It actually has a measurable cost if the history is linear and parent trees don't have to be merged dynamically. For merge-heavy history, the cost of tree merges is more significant. I'll address that separately. ``` % hyperfine --sort command --warmup 3 --runs 50 -L bin jj-1,jj-2 \ 'target/release-with-debug/{bin} -R ~/mirrors/git --ignore-working-copy \ log -r "::trunk() & ~merges() & file(root:builtin)" --no-graph -n100' Benchmark 1: target/release-with-debug/jj-1 .. Time (mean ± σ): 239.7 ms ± 7.1 ms [User: 192.1 ms, System: 46.5 ms] Range (min … max): 222.2 ms … 249.7 ms 50 runs Benchmark 2: target/release-with-debug/jj-2 .. Time (mean ± σ): 201.7 ms ± 6.9 ms [User: 153.7 ms, System: 46.6 ms] Range (min … max): 184.2 ms … 211.1 ms 50 runs Relative speed comparison 1.19 ± 0.05 target/release-with-debug/jj-1 .. 1.00 target/release-with-debug/jj-2 .. ```	2024-08-09 00:17:37 +09:00
Yuya Nishihara	19b62d29ba	merged_tree: leverage .to_tree_merge() in TreeDiffIterator	2024-08-08 23:05:37 +09:00
Yuya Nishihara	6fc7cec4a5	merged_tree: make TreeDiffIterator accept trees as &Merge<Tree> For the same reason as the patch for TreeEntriesIterator. It's probably better to assume that MergedTree represents the root tree.	2024-08-08 23:05:37 +09:00
Yuya Nishihara	9378adedb7	merged_tree: hold store globally by TreeDiffIterator Since TreeDiffDirItem is now calculated eagerly, it doesn't make sense to keep MergedTree in it.	2024-08-08 23:05:37 +09:00
Yuya Nishihara	37c41d0eaf	tests: do not pass in commit objects loaded from different store Otherwise the assertion would fail in the next patch.	2024-08-08 23:05:37 +09:00
Yuya Nishihara	8b72dad095	merged_tree: replace explicit .is_tree() call in TreeEntriesIterator The value here shouldn't be absent, so .is_tree() is equivalent to .to_tree_merge().is_some().	2024-08-08 23:05:37 +09:00
Yuya Nishihara	12434b49b8	merged_tree: make TreeEntriesIterator accept trees as &Merge<Tree> Suppose we add copy information to MergedTree, a MergedTree can be considered a root tree representation plus global metadata. I think Merge<Tree> is a better type for sub trees.	2024-08-08 23:05:37 +09:00
Yuya Nishihara	8a3e4ad966	merged_tree: hold store globally by TreeEntriesIterator Since TreeEntriesDirItem is now calculated eagerly, it doesn't make sense to keep MergedTree in it.	2024-08-08 23:05:37 +09:00
Martin von Zweigbergk	ec7725064b	merged_tree: make `MergedTree` a struct I considered making `MergedTree` just a newtype (1-tuple) but I went with a struct instead because we may want to add copy information in a separate field in the future.	2024-08-08 05:32:16 -07:00
Martin von Zweigbergk	7596935285	merged_tree: make `ConflictIterator` a struct	2024-08-08 05:32:16 -07:00
Martin von Zweigbergk	109391f9c7	merged_tree: delete `MergedTree::Legacy`	2024-08-08 05:32:16 -07:00
Martin von Zweigbergk	10aab1bdc3	conflicts: always promote legacy trees to merged trees In order to remove the `MergedTree::Legacy` form, we need to stop creating such instances. This patch removes the last place we create them, which is in `Store::get_root_tree()`. The main practical consequence of this change is that loading legacy trees gets a lot slower on large repos. However, since the default log template includes the `conflict` keyword, we ended up scanning all paths in `jj log` anyway, so I'm not sure many people will notice.	2024-08-08 05:32:16 -07:00
Yuya Nishihara	202fb533f4	merged_tree: remove .diff() method in favor of .diff_stream() It's unlikely we'll need the iterator version of .diff() except for testing the stream implementation.	2024-08-08 10:45:59 +09:00
Yuya Nishihara	24b8934b14	tests: migrate .diff() callers to .diff_stream()	2024-08-08 10:45:59 +09:00
Yuya Nishihara	63e254d052	tests: use pollster instead of futures::executor::block_on() It doesn't matter in tests and I have no preference over these, but we tend to use .block_on().	2024-08-08 10:45:59 +09:00
Yuya Nishihara	26f744ab2d	revset: use .diff_stream() in file() evaluation, handle backend error This is the last .diff() caller in non-test code. Though it wouldn't be important to use async version here, this change helps remove .diff() API.	2024-08-08 10:45:59 +09:00
Yuya Nishihara	7bdb28f1fe	cli: make "op abandon" not fail with multiple op heads Since "op abandon" just rewrites DAG, it works no matter if the heads are merged or not. This change will help crash recovery. "op abandon --at-op=<one-of-the-heads>" can't be used because ancestor operations would be preserved by the other head.	2024-08-07 10:51:44 +09:00
Yuya Nishihara	399110b1fc	op_walk: allow to resolve operation expression from multiple heads I'll make "op abandon" work without merging op heads.	2024-08-07 10:51:44 +09:00
Yuya Nishihara	4e3a5de6e9	op_walk: sort current heads to stabilize multiple ops error message	2024-08-07 10:51:44 +09:00
Yuya Nishihara	f7836aa687	cli: obslog: show diffs from all predecessors, not first predecessor Suppose a squash node in obslog is analogous to a merge in revisions log, it makes sense to show diffs from auto-merge (or auto-squash) parents. This basically means a non-partial squash node no longer shows diffs. This also fixes missing diffs at the root predecessors if there were.	2024-08-07 10:51:23 +09:00
Yuya Nishihara	d061c3782f	merged_tree: remove .diff_summary() There are no non-test callers since `452fecb7c4` "cli: colorize diff summary and sort by path."	2024-08-06 10:15:44 +09:00
Yuya Nishihara	d435a8a793	tests: compare trees without using .diff_summary() I don't think modification types matter here. Testing paths should be good enough.	2024-08-06 10:15:44 +09:00
Yuya Nishihara	b290af8e29	op_walk: include operation ids in multiple match error	2024-08-03 09:22:26 +09:00
Stephen Jennings	6c41b1bef8	revset: add author_date and committer_date revset functions Author dates and committer dates can be filtered like so: committer_date(before:"1 hour ago") # more than 1 hour ago committer_date(after:"1 hour ago") # 1 hour ago or less A date range can be created by combining revsets. For example, to see any revisions committed yesterday: committer_date(after:"yesterday") & committer_date(before:"today")	2024-08-01 09:04:07 -07:00
Stephen Jennings	ff9e739798	revset: create DatePattern type Creates a DatePattern type that can be created by parsing a string in any format supported by the chrono-english crate, including: - 2024-03-25 - 2024-03-25T00:00:00 - 2024-03-25T00:00:00-08:00 - 2 weeks ago - 5 minutes ago - yesterday - yesterday 5pm - yesterday 10:30 - yesterday 15:30 - tomorrow A `kind` can be specified to indicate whether the pattern should match dates at or after (`after`) or strictly before (`before`) the given instant. chrono-english supports US and UK dialects to disambiguate mm/dd/yy from dd/mm/yy, but for now we default to US. This should probably be a config setting.	2024-08-01 09:04:07 -07:00
dploch	bfa1ce8936	workspace: make the constructor public This allows constructing a workspace in a custom environment where the standard filesystem API cannot be used	2024-07-31 19:45:37 -04:00
dploch	5f7e3883e8	repo: define a public constructor for RepoLoader This enables the creation of Repo objects in environments without standard filesystem support, by allowing the caller to load the store objects however they see fit. This confines interaction with the filesystem to the WorkingCopy abstractions.	2024-07-31 19:45:37 -04:00
Yuya Nishihara	d2f933eed3	commit_builder: remove unneeded &mut from .write_hidden() Since the backend::Commit has to be cloned, .write_hidden() doesn't mutate the self.commit object.	2024-07-25 22:39:00 +09:00
Martin von Zweigbergk	d740f1801b	conflicts: use non-legacy `MergedTreeId` for root commit This is part of migrating away from legacy trees (with path-level conflicts). I can't think of any practical impact (we already compare the tree ids equal).	2024-07-24 14:33:05 +02:00
Martin von Zweigbergk	352ca72314	tests: make helpers create non-legacy trees Extracted and modified from #3746 by @ilyagr.	2024-07-24 14:33:05 +02:00
Yuya Nishihara	bafb357209	git: on abandoning unreachable commits, don't count HEAD ref This basically reverts `20eb9ecec1` "git: don't abandon HEAD commit when it loses a branch." I think the new behavior is more consistent because the Git HEAD is equivalent to @- in jj, so it shouldn't be considered a named ref. Note that we've made old HEAD branch not considered at `92cfffd843` "git: on external HEAD move, do not abandon old branch." #4108	2024-07-24 21:22:26 +09:00
Yuya Nishihara	da221eb888	repo: load index eagerly to simplify error handling If readonly_index() and index() returned Result, it would propagate to many call sites. That seems bad for API ergonomics. Suppose most "repo" commands depend on an index, I think it's okay to load index eagerly: - "jj config" doesn't load repo (nor index) - "jj workspace root" doesn't load repo (nor index) - some other mutation commands load index when printing commit summary - many other commands load index when resolving revset	2024-07-23 18:26:16 +09:00
Yuya Nishihara	626aa90610	repo: use DetachedCommitBuilder constructors I think this makes it clear that the builder doesn't add any rewrite records to the mut_repo.	2024-07-23 18:22:40 +09:00
Yuya Nishihara	337dcef6ee	commit_builder: add public interface that writes temporary commit to store In order to render description template, we'll need a Commit object that represents the old state (with new tree and parents) before updating the commit description. The added functions will help generate an intermediate Commit object. Alternatively, we can create an in-memory Commit object with some fake CommitId. It should be lightweight, but might cause weird issues because the fake id wouldn't be found in the store. I think it's okay to write a temporary commit and rely on GC as we do for merge trees. However, I should note that temporary commits are more likely to be preserved as they are pinned by no-gc refs until "jj util gc".	2024-07-23 18:22:40 +09:00
Yuya Nishihara	b4bf1358a5	commit_builder: extract inner builder which isn't lifetimed by mut_repo This allows us to construct a builder, format description template with an intermediate commit, then write() a final commit object to the repo. I originally considered removing mut_repo from CommitBuilder at all, but rewriter APIs rely on that CommitBuilder has &mut_repo, and splitting them would make call sites uglier. The inner builder methods are based on &mut Self instead of Self, because it's easier to wrap, and users of the inner builder will bind it to a named variable anyway.	2024-07-23 18:22:40 +09:00
Yuya Nishihara	6516a40c19	commit_builder: extract free function that sets up signing and write commit I'll add another write() method that doesn't consume self, which will have to clone self.commit.	2024-07-23 18:22:40 +09:00
Yuya Nishihara	fab310f53f	commit_builder: keep Store internally I'm going to extract an inner builder that is free from &mut MutableRepo lifetime.	2024-07-23 18:22:40 +09:00
Yuya Nishihara	49d92a0480	commit_builder: remove redundant boxing from signing fn closure	2024-07-23 18:22:40 +09:00
Yuya Nishihara	209e076bfc	commit_builder: use .clone_from() to silence nightly clippy warning It's not wrong that String::clone_from() could potentially be cheaper.	2024-07-23 18:22:40 +09:00
Benjamin Tan	58a813cb18	repo: add `RepoLoader::merge_operations`	2024-07-22 19:16:42 +08:00
Benjamin Tan	87ea9102f0	repo: add `MutableRepo::merge_index`	2024-07-22 19:16:42 +08:00
Yuya Nishihara	ddc601fbf9	str_util: add regex pattern This patch adds minimal support for the regex pattern. We might have to add "regex-i:" for completeness, but it can be achieved by "regex:'(?i)..'".	2024-07-22 12:00:52 +09:00
Yuya Nishihara	845793a7ad	str_util: remove Eq + PartialEq from StringPattern I'm going to add regex support, and compiled Regex object isn't comparable.	2024-07-22 12:00:52 +09:00
Yuya Nishihara	5783631271	tests: use assert_matches!() to compare StringPattern	2024-07-22 12:00:52 +09:00
Yuya Nishihara	9d5eda107d	commit_builder: inline mut_repo.write_commit() As the doc comment says, it's called only from CommitBuilder. Let's clarify that. I'm also planning to extract a builder that only writes to the store (without mutably borrowing a mut_repo.) It will help implement description template.	2024-07-20 09:06:46 +09:00
Ilya Grigoriev	e2f12d91cc	conflicts: switch to multi-line regex, fix minor bug The multi-line regex will be used for other purposes soon.	2024-07-18 18:42:40 -07:00
Ilya Grigoriev	d095570718	conflicts: demo minor bug	2024-07-18 18:42:40 -07:00
Ilya Grigoriev	f3de66e603	conflicts: demo failure to materialize if conflicts don't end in a newline #3968	2024-07-18 18:42:40 -07:00
Matt Kulukundis	6ffe05290d	copy-tracking: move unit tests into backend specific file	2024-07-18 05:44:56 -04:00
Yuya Nishihara	5649ee4f45	fileset: parse glob characters as identifier It's inconvenient that we have to quote glob patterns as 'glob:"*.rs"'. Suppose filesets are usually specified in shell, it's better to allow unquoted strings if possible. This change also means we'll probably abandon #2101 "make the parsing of string arguments stricter." Note that we can no longer introduce ? operator or [] subscript syntax in filesets. Closes #4053	2024-07-18 13:49:10 +09:00
Yuya Nishihara	1a387489d9	files: relax requirement of merge() inputs Most callers have Merge<ContentHunk> or Merge<Vec<u8>>.	2024-07-18 11:34:43 +09:00
Yuya Nishihara	e5b49c7d52	files: extract pre-processing part from merge() I'll make the first half generic over T: AsRef<[u8]>.	2024-07-18 11:34:43 +09:00
Yuya Nishihara	895eead4b8	revset: add diff_contains(text[, files]) to search diffs The text pattern is applied prior to comparison as we do in Mercurial. This might affect hunk selection, but is much faster than computing diff of full file contents. For example, the following hunk wouldn't be caught by diff_contains("a") because the line "b\n" is filtered out: - a b + a Closes #2933	2024-07-18 01:01:16 +09:00
Yuya Nishihara	eabff4c0b4	revset: propagate BackendError from inner file() predicate function We should probably add error propagation path to Revset iterator, and predicate functions will return Result<bool, RevsetEvaluationError>.	2024-07-18 01:01:16 +09:00
Yuya Nishihara	a6a67fa8fd	revset: pass Commit object to inner file() predicate function Commit object extraction is common across predicate functions.	2024-07-18 01:01:16 +09:00
Yuya Nishihara	a9af8d21f8	diff: move materialized_diff_stream() to jj_lib::conflicts module New diff_contains() revset function will use this helper.	2024-07-18 01:01:16 +09:00
Matt Kulukundis	3043b83a8f	copy-tracking: add `get_copy_records` to `Store`	2024-07-16 13:18:49 -04:00
Anton Älgmyr	c7eac90200	Enable the new graph nodes by default. It's been tested in various places now, so this is probably mature enough to be the default.	2024-07-16 12:54:24 +02:00
Yuya Nishihara	a757fddcf1	revset: parse file() argument as fileset expression Since fileset and revset languages are syntactically close, we can reparse revset expression as a fileset. This might sound a bit scary, but helps eliminate nested quoting like file("~glob:'*.rs'"). One oddity exists in alias substitution, though. Another possible problem is that we'll need to add fake operator parsing rules if we introduce incompatibility in fileset, or want to embed revset expressions in a fileset. Since "file(x, y)" is equivalent to "file(x\|y)", the former will be deprecated. I'll probably add a mechanism to collect warnings during parsing.	2024-07-16 10:18:57 +09:00
Yuya Nishihara	c74cf2d80d	revset: use .ok_or_else() to handle missing workspace context FWIW, this might be changed to non-error so that "file()" revset can be used at server side.	2024-07-16 10:18:57 +09:00
Matt Kulukundis	df021083c9	diff: add unit tests for copy tracking in the git backend	2024-07-15 16:49:10 -04:00
Yuya Nishihara	6f6381d06e	diff: leverage BStr for better debug printing	2024-07-14 23:26:29 +09:00
Yuya Nishihara	502547d6a5	diff: add generic DiffHunk constructors For the same reason as the previous patch. I'm going to make DiffHunk leverage BStr wrapper instead of custom Debug impl. b"" literals in tests are changed to &str to get around type incompatibility between &[u8; N].	2024-07-14 23:26:29 +09:00
Yuya Nishihara	59daef2351	diff: accept diff inputs by generic iterator This helps migrate internal [u8] variables to BStr. b"" literals in tests are changed to &str to get around potential type incompatibility between &[u8; N].	2024-07-14 23:26:29 +09:00
Yuya Nishihara	2ca3bad0ee	diff: split non-generic part from Diff::for_tokenizer()	2024-07-14 23:26:29 +09:00
Yuya Nishihara	5601fb40f8	cargo: add "bstr" dependency I'm going to replace some Debug impls with BStr, and we already depend on "bstr" through "gix".	2024-07-14 23:26:29 +09:00
Yuya Nishihara	ac2bddbc3d	cargo: remove unused "bytes" dependency	2024-07-14 23:26:29 +09:00
Scott Taylor	2dd75b5c53	revset: add tracked/untracked_remote_branches() Adds support for revset functions `tracked_remote_branches()` and `untracked_remote_branches()`. I think this would be especially useful for configuring `immutable_heads()` because rewriting untracked remote branches usually wouldn't be desirable (since it wouldn't update the remote branch). It also makes it easy to hide branches that you don't care about from the log, since you could hide untracked branches and then only track branches that you care about.	2024-07-13 10:43:21 -05:00
Tim Janik	11f56800fa	test_gpg: fix warnings ending up on stdout Signed-off-by: Tim Janik <timj@gnu.org>	2024-07-12 10:32:13 +09:00
Tim Janik	219a63540f	local_working_copy: fix warnings ending up on stdout As suggested by @crackcomm on discord, use eprintln!() to print warnings to avoid messing up template output, e.g.: jj --no-pager --ignore-working-copy show --tool true -T change_id -r rv... rv...ignoring git submodule at "some/submodule" Signed-off-by: Tim Janik <timj@gnu.org>	2024-07-12 10:32:13 +09:00
Yuya Nishihara	ffd7b41d2b	revset: rename expect_literal_with() to expect_expression_with() The return type T doesn't have to be a literal, and I'm going to use this function to reparse fileset expression. We might also want to add another expect_literal_with() helper that parses enum-like string value.	2024-07-12 10:31:45 +09:00
Yuya Nishihara	415c831e30	revset: flatten union nodes in AST to save recursion stack Maybe it'll also be good to keep RevsetExpression::Union(_) flattened, but that's not needed to get around stack overflow. The constructed expression tree is balanced. test_expand_symbol_alias() is slightly adjusted since there are more than one representation for "a\|b\|c" now. Fixes #4031	2024-07-11 11:20:25 +09:00
Yuya Nishihara	f90b061808	fileset: flatten union nodes in AST to save recursion stack This is somewhat similar to templater where "x ++ y" operator is special cased.	2024-07-11 11:20:25 +09:00
Emily	93d76e5d8f	str_util: support case‐insensitive string patterns Partially resolve a 1.5‐year‐old TODO comment. Add opt‐in syntax for case‐insensitive matching, suffixing the pattern kind with `-i`. Not every context supports case‐insensitive patterns (e.g. Git branch fetch settings). It may make sense to make this the default in at least some contexts (e.g. the commit signature and description revsets), but it would require some thought to avoid more confusing context‐sensitivity. Make `mine()` match case‐insensitively unconditionally, since email addresses are conventionally case‐insensitive and it doesn’t take a pattern anyway. This currently only handles ASCII case folding, due to the complexities of case‐insensitive Unicode comparison and the `glob` crate’s lack of support for it. This is unlikely to matter for email addresses, which very rarely contain non‐ASCII characters, but is unfortunate for names and descriptions. However, the current matching behaviour is already seriously deficient for non‐ASCII text due to the lack of any normalization, so this hopefully shouldn’t be a blocker to adding the interface. An expository comment has been left in the code for anyone who wants to try and address this (perhaps a future version of myself).	2024-07-10 05:58:34 +01:00
Emily	a146145adb	revset: fix `mine()` test comments	2024-07-10 05:58:34 +01:00
Emily	3567cba8c9	revset: fix email matching tests The comments say “Can find a unique match by either name or email”, but these weren’t checking for an email match.	2024-07-10 05:58:34 +01:00
Emily	0802a1502b	revset: fix `RevsetFilterPredicate` comments These support more types of pattern than just substring matching.	2024-07-10 05:58:34 +01:00
Yuya Nishihara	6a1d9262a0	diff: add short for Diff::for_tokenizer(_, find_line_ranges) Line-by-line diff is common. Let's add a helper method for convenience.	2024-07-10 10:05:31 +09:00
Yuya Nishihara	dce3ec7320	diff: simplify trimming of leading/trailing ranges to not rely on recursion I don't think there's a possibility that uncommon_shared_words can become non-empty by trimming the same amount of lines from both sides. Well, there's an edge case regarding max_occurrences, but that shouldn't matter in practice.	2024-07-09 20:35:36 +09:00
Martin von Zweigbergk	fefe07b3c3	diff: consider uncommon words to match only if they have the same count Patience diff starts by lining up unique elements (e.g. lines) to find matching segments of the inputs. After that, it refines the non-matching segments by repeating the process. Histogram expands on that by not just considering unique elements but by continuing with elements of count 2, then 3, etc. Before this commit, when diffing "a b a b b" against "a b a b a b", we would match the two "a"s in the first input against the first two "a"s in the second input. After this patch, we ignore the "a"s because their counts differ, so we try to align the "b"s instead. I have had this commit lying around since I wrote the histogram diff implementation in `1e657c5331`. I vaguely remember thinking that the way I had implemented it (without this commit) was a bit weird, but I wasn't sure if this commit would be an improvement or not. The bug report from @chooglen today of a case where we behave differently from Git is enough to make me think that we should make this change after all. #761	2024-07-09 20:35:36 +09:00
Yuya Nishihara	831bbc0b11	diff: match up leading/trailing ranges if no match found by uncommon lcs This is adapted from Breezy/Python patiencediff. AFAICT, Git implementation is slightly different (and maybe more efficient?), but it's not super easy to integrate with our diff logic. I'm not sure which one is better overall, but I think the result is good so long as "uncommon LCS" matching is attempted first. `a9a3e4edc3/patiencediff/_patiencediff_py.py (L108)` This patch prevents some weird test changes that would otherwise be introduced by the next patch.	2024-07-09 20:35:36 +09:00
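
Commit fefe07b3c3 above describes the rule that uncommon words count as matches only when both inputs contain them the same number of times. The following is a minimal sketch of that selection step alone, assuming whitespace-split words as tokens. It is not jj's actual implementation, and the names `count_words` and `anchor_words` are hypothetical.

```rust
use std::collections::HashMap;

/// Counts how many times each word occurs in `words`.
fn count_words<'a>(words: &[&'a str]) -> HashMap<&'a str, usize> {
    let mut counts = HashMap::new();
    for &word in words {
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

/// Hypothetical helper: returns the words that could serve as alignment
/// anchors under the "same count on both sides" rule. Among the words whose
/// counts agree, only those with the lowest count are kept (count 1 first,
/// then 2, then 3, as in the histogram approach the commit describes).
fn anchor_words<'a>(left: &[&'a str], right: &[&'a str]) -> Vec<&'a str> {
    let left_counts = count_words(left);
    let right_counts = count_words(right);

    // Keep only words that occur equally often in both inputs.
    let mut shared: Vec<(&str, usize)> = left_counts
        .iter()
        .filter(|(word, count)| right_counts.get(*word) == Some(*count))
        .map(|(&word, &count)| (word, count))
        .collect();

    // No word has matching counts, so there is nothing to anchor on.
    let Some(min_count) = shared.iter().map(|&(_, count)| count).min() else {
        return Vec::new();
    };
    shared.retain(|&(_, count)| count == min_count);

    // Sort for a deterministic result; HashMap iteration order is unspecified.
    let mut words: Vec<&str> = shared.into_iter().map(|(word, _)| word).collect();
    words.sort_unstable();
    words
}

fn main() {
    // The example from the commit message.
    let left = ["a", "b", "a", "b", "b"];
    let right = ["a", "b", "a", "b", "a", "b"];
    // "a" occurs 2 vs. 3 times and is ignored; "b" occurs 3 times on both sides.
    println!("{:?}", anchor_words(&left, &right)); // prints ["b"]
}
```

A real diff would go on to compute a longest common subsequence over the positions of the selected words; this sketch stops at choosing which words are eligible.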


3152 commits