zed/crates/rope/Cargo.toml



2022-10-11 22:25:54 +00:00
[package]
name = "rope"
version = "0.1.0"
edition = "2021"
publish = false
license = "GPL-3.0-or-later"
[lints]
workspace = true
2022-10-11 22:25:54 +00:00
[lib]
path = "src/rope.rs"
[dependencies]
arrayvec = "0.7.1"
log.workspace = true
Speed up point translation in the Rope (#19913)

This pull request introduces an index of Unicode codepoints, newlines and UTF-16 codepoints.

Benchmarks worth a thousand words:

```
push/4096
    time:   [467.06 µs 470.07 µs 473.24 µs]
    thrpt:  [8.2543 MiB/s 8.3100 MiB/s 8.3635 MiB/s]
    change:
        time:   [-4.1462% -3.0990% -2.0527%] (p = 0.00 < 0.05)
        thrpt:  [+2.0957% +3.1981% +4.3255%]
    Performance has improved.
    Found 3 outliers among 100 measurements (3.00%)
        1 (1.00%) low mild
        2 (2.00%) high mild
push/65536
    time:   [1.4650 ms 1.4796 ms 1.4922 ms]
    thrpt:  [41.885 MiB/s 42.242 MiB/s 42.664 MiB/s]
    change:
        time:   [-3.2871% -2.3489% -1.4555%] (p = 0.00 < 0.05)
        thrpt:  [+1.4770% +2.4054% +3.3988%]
    Performance has improved.
    Found 6 outliers among 100 measurements (6.00%)
        3 (3.00%) low severe
        3 (3.00%) low mild
append/4096
    time:   [729.00 ns 730.57 ns 732.14 ns]
    thrpt:  [5.2103 GiB/s 5.2215 GiB/s 5.2327 GiB/s]
    change:
        time:   [-81.884% -81.836% -81.790%] (p = 0.00 < 0.05)
        thrpt:  [+449.16% +450.53% +452.01%]
    Performance has improved.
    Found 11 outliers among 100 measurements (11.00%)
        3 (3.00%) low mild
        6 (6.00%) high mild
        2 (2.00%) high severe
append/65536
    time:   [504.44 ns 505.58 ns 506.77 ns]
    thrpt:  [120.44 GiB/s 120.72 GiB/s 121.00 GiB/s]
    change:
        time:   [-94.833% -94.807% -94.782%] (p = 0.00 < 0.05)
        thrpt:  [+1816.3% +1825.8% +1835.5%]
    Performance has improved.
    Found 4 outliers among 100 measurements (4.00%)
        3 (3.00%) high mild
        1 (1.00%) high severe
slice/4096
    time:   [29.661 µs 29.733 µs 29.816 µs]
    thrpt:  [131.01 MiB/s 131.38 MiB/s 131.70 MiB/s]
    change:
        time:   [-48.833% -48.533% -48.230%] (p = 0.00 < 0.05)
        thrpt:  [+93.161% +94.298% +95.440%]
    Performance has improved.
slice/65536
    time:   [588.00 µs 590.22 µs 592.17 µs]
    thrpt:  [105.54 MiB/s 105.89 MiB/s 106.29 MiB/s]
    change:
        time:   [-45.599% -45.347% -45.099%] (p = 0.00 < 0.05)
        thrpt:  [+82.147% +82.971% +83.821%]
    Performance has improved.
    Found 2 outliers among 100 measurements (2.00%)
        1 (1.00%) low severe
        1 (1.00%) high mild
bytes_in_range/4096
    time:   [3.8630 µs 3.8811 µs 3.8994 µs]
    thrpt:  [1001.8 MiB/s 1006.5 MiB/s 1011.2 MiB/s]
    change:
        time:   [+0.0600% +0.6000% +1.1833%] (p = 0.03 < 0.05)
        thrpt:  [-1.1695% -0.5964% -0.0600%]
    Change within noise threshold.
bytes_in_range/65536
    time:   [98.178 µs 98.545 µs 98.931 µs]
    thrpt:  [631.75 MiB/s 634.23 MiB/s 636.60 MiB/s]
    change:
        time:   [-0.6513% +0.7537% +2.2265%] (p = 0.30 > 0.05)
        thrpt:  [-2.1780% -0.7481% +0.6555%]
    No change in performance detected.
    Found 11 outliers among 100 measurements (11.00%)
        8 (8.00%) high mild
        3 (3.00%) high severe
chars/4096
    time:   [878.91 ns 879.45 ns 880.06 ns]
    thrpt:  [4.3346 GiB/s 4.3376 GiB/s 4.3403 GiB/s]
    change:
        time:   [+9.1679% +9.4000% +9.6304%] (p = 0.00 < 0.05)
        thrpt:  [-8.7844% -8.5923% -8.3979%]
    Performance has regressed.
    Found 8 outliers among 100 measurements (8.00%)
        1 (1.00%) low severe
        1 (1.00%) low mild
        3 (3.00%) high mild
        3 (3.00%) high severe
chars/65536
    time:   [15.615 µs 15.691 µs 15.757 µs]
    thrpt:  [3.8735 GiB/s 3.8899 GiB/s 3.9087 GiB/s]
    change:
        time:   [+5.4902% +5.9345% +6.4044%] (p = 0.00 < 0.05)
        thrpt:  [-6.0190% -5.6021% -5.2045%]
    Performance has regressed.
    Found 2 outliers among 100 measurements (2.00%)
        2 (2.00%) low mild
clip_point/4096
    time:   [29.677 µs 29.835 µs 30.019 µs]
    thrpt:  [130.13 MiB/s 130.93 MiB/s 131.63 MiB/s]
    change:
        time:   [-46.306% -45.866% -45.436%] (p = 0.00 < 0.05)
        thrpt:  [+83.272% +84.728% +86.240%]
    Performance has improved.
    Found 11 outliers among 100 measurements (11.00%)
        3 (3.00%) high mild
        8 (8.00%) high severe
clip_point/65536
    time:   [1.5933 ms 1.6116 ms 1.6311 ms]
    thrpt:  [38.318 MiB/s 38.782 MiB/s 39.226 MiB/s]
    change:
        time:   [-30.388% -29.598% -28.717%] (p = 0.00 < 0.05)
        thrpt:  [+40.286% +42.040% +43.653%]
    Performance has improved.
    Found 3 outliers among 100 measurements (3.00%)
        3 (3.00%) high mild

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 7 filtered out; finished in 0.00s

point_to_offset/4096
    time:   [14.493 µs 14.591 µs 14.707 µs]
    thrpt:  [265.61 MiB/s 267.72 MiB/s 269.52 MiB/s]
    change:
        time:   [-71.990% -71.787% -71.588%] (p = 0.00 < 0.05)
        thrpt:  [+251.96% +254.45% +257.01%]
    Performance has improved.
    Found 9 outliers among 100 measurements (9.00%)
        5 (5.00%) high mild
        4 (4.00%) high severe
point_to_offset/65536
    time:   [700.72 µs 713.75 µs 727.26 µs]
    thrpt:  [85.939 MiB/s 87.566 MiB/s 89.194 MiB/s]
    change:
        time:   [-61.778% -61.015% -60.256%] (p = 0.00 < 0.05)
        thrpt:  [+151.61% +156.51% +161.63%]
    Performance has improved.
```

Calling `Rope::chars` got slightly slower but I don't think it's a big issue (we don't really call `chars` for an entire `Rope`).

In a future pull request, I want to use the tab index (which we're not yet using) and the char index to make `TabMap` a lot faster.

Release Notes:

- N/A
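
The commit message above says only that an index of codepoints and newlines was added; it does not show how the index is stored. As a rough illustration of why such an index can speed up `point_to_offset`, here is a minimal sketch that assumes one bitmask per chunk, with bit `i` describing byte `i`. The `ChunkIndex` type, its fields, and its methods are hypothetical and are not taken from the Zed source.

```rust
/// Hypothetical per-chunk index (an assumption, not Zed's actual layout).
/// Bit `i` of each mask describes byte `i` of the chunk, so the chunk is
/// limited to 64 bytes here to fit in a `u64`.
struct ChunkIndex {
    chars: u64,    // set where a byte starts a Unicode scalar value
    newlines: u64, // set where a byte is b'\n'
}

impl ChunkIndex {
    fn new(text: &str) -> Self {
        assert!(text.len() <= 64, "sketch only supports chunks up to 64 bytes");
        let mut chars = 0u64;
        let mut newlines = 0u64;
        for (i, c) in text.char_indices() {
            chars |= 1u64 << i;
            if c == '\n' {
                newlines |= 1u64 << i;
            }
        }
        Self { chars, newlines }
    }

    /// Byte offset at which `row` (0-based) starts, or `None` if the chunk
    /// contains fewer than `row` newlines.
    fn row_start(&self, row: u32) -> Option<usize> {
        if row == 0 {
            return Some(0);
        }
        let mut mask = self.newlines;
        // Drop the first `row - 1` newlines by clearing the lowest set bit.
        for _ in 1..row {
            if mask == 0 {
                return None;
            }
            mask &= mask - 1;
        }
        if mask == 0 {
            return None;
        }
        // The row starts one byte after its preceding newline.
        Some(mask.trailing_zeros() as usize + 1)
    }

    /// Translates a (row, byte column) point into a byte offset.
    fn point_to_offset(&self, row: u32, column: usize) -> Option<usize> {
        self.row_start(row).map(|start| start + column)
    }

    /// Number of Unicode scalar values that start before byte `offset`.
    fn chars_before(&self, offset: usize) -> u32 {
        let below = if offset >= 64 { u64::MAX } else { (1u64 << offset) - 1 };
        (self.chars & below).count_ones()
    }
}

fn main() {
    let index = ChunkIndex::new("ab\ncd\n\u{e9}");
    println!("{:?}", index.point_to_offset(1, 1));
    println!("{}", index.chars_before(6));
}
```

With masks like these, translating a point needs a few bit operations per chunk instead of a scan over every character, which is consistent with the direction of the `point_to_offset` and `clip_point` numbers reported above.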
2024-10-30 09:59:03 +00:00
rayon.workspace = true
smallvec.workspace = true
sum_tree.workspace = true
Fix caret movement issue for some special characters (#10198)

Currently in Zed, certain characters require pressing the key twice to move the caret through that character. For example: "❤️" and "y̆".

The reason for this is as follows: Currently, Zed uses `chars` to distinguish different characters, and calling `chars` on `y̆` will yield two `char` values: `y` and `\u{306}`, and calling `chars` on `❤️` will yield two `char` values: `❤` and `\u{fe0f}`.

Therefore, consider the following scenario (where ^ represents the caret):

- what we see: ❤️ ^
- the actual buffer: ❤ \u{fe0f} ^

After pressing the left arrow key once:

- what we see: ❤️ ^
- the actual buffer: ❤ ^ \u{fe0f}

After pressing the left arrow key again:

- what we see: ^ ❤️
- the actual buffer: ^ ❤ \u{fe0f}

Thus, two left arrow key presses are needed to move the caret, and this PR fixes this bug (or this is actually a feature?).

I have tried to keep the scope of code modifications as minimal as possible. In this PR, Zed handles such characters as follows:

- what we see: ❤️ ^
- the actual buffer: ❤ \u{fe0f} ^

After pressing the left arrow key once:

- what we see: ^ ❤️
- the actual buffer: ^ ❤ \u{fe0f}

Or after pressing the delete key:

- what we see: ^
- the actual buffer: ^

Please note that currently, different platforms and software handle these special characters differently, and even the same software may handle these characters differently in different situations. For example, in my testing on Chrome on macOS, GitHub treats `y̆` as a single character, just like in this PR; however, in Rust Playground, `y̆` is treated as two characters, and pressing the delete key does not delete the entire `y̆` character, but instead deletes `\u{306}` to yield the character `y`. And they both treat `❤️` as a single character, pressing the delete key will delete the entire `❤️` character.

This PR is based on the principle of making changes with the smallest impact on the code, and I think that deleting the entire character with the delete key is more intuitive.

Release Notes:

- Fix caret movement issue for some special characters

---------

Co-authored-by: Conrad Irwin <conrad.irwin@gmail.com>
Co-authored-by: Thorsten <thorsten@zed.dev>
Co-authored-by: Bennet <bennetbo@gmx.de>
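
The `chars` behavior described in this commit message can be reproduced with the standard library alone. The sketch below uses a hypothetical helper, `scalar_values`, to list the `char` values in each example string; it only demonstrates the problem and does not show the fix. The `unicode-segmentation` dependency that the blame attributes to this commit suggests the fix groups such sequences into grapheme clusters, but the exact calls are not visible in this file.

```rust
/// Hypothetical helper: collects the Unicode scalar values (`char`s) of `s`.
fn scalar_values(s: &str) -> Vec<char> {
    s.chars().collect()
}

fn main() {
    let breve = "y\u{306}"; // 'y' followed by a combining breve
    let heart = "\u{2764}\u{fe0f}"; // heavy black heart followed by a variation selector
    for s in [breve, heart] {
        let escaped: Vec<String> = scalar_values(s)
            .iter()
            .map(|c| c.escape_unicode().to_string())
            .collect();
        // Each string is displayed as one character but holds two `char`s,
        // so stepping the caret by `char` takes two key presses.
        println!("{s}: {} chars -> {}", escaped.len(), escaped.join(" "));
    }
}
```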
2024-04-10 19:01:25 +00:00
unicode-segmentation.workspace = true
util.workspace = true
2022-10-11 22:25:54 +00:00
[dev-dependencies]
ctor.workspace = true
env_logger.workspace = true
gpui = { workspace = true, features = ["test-support"] }
rand.workspace = true
util = { workspace = true, features = ["test-support"] }
criterion = { version = "0.5", features = ["html_reports"] }
Reduce memory usage to represent buffers by up to 50% (#10321)

This should help with some of the memory problems reported in https://github.com/zed-industries/zed/issues/8436, especially the ones related to large files (see: https://github.com/zed-industries/zed/issues/8436#issuecomment2037442695), by **reducing the memory required to represent a buffer in Zed by ~50%.**

### How?

Zed's memory consumption is dominated by the in-memory representation of buffer contents.

On the lowest level, the buffer is represented as a [Rope](https://en.wikipedia.org/wiki/Rope_(data_structure)) and that's where the most memory is used. The layers above (buffer, syntax map, fold map, display map, ...) basically use "no memory" compared to the Rope.

Zed's `Rope` data structure is itself implemented as [a `SumTree` of `Chunks`](https://github.com/zed-industries/zed/blob/8205c52d2bc204b8234f9306562d9000b1691857/crates/rope/src/rope.rs#L35-L38).

An important constant at play here is `CHUNK_BASE`: `CHUNK_BASE` is the maximum length of a single text `Chunk` in the `SumTree` underlying the `Rope`. In other words: it determines into how many pieces a given buffer is split up.

By changing `CHUNK_BASE` we can adjust the level of granularity with which we index a given piece of text. The theoretical maximum is the length of the text and the theoretical minimum is 1. The sweet spot is somewhere in between, where memory use and performance of write & read access are optimal.

We started with `16` as the `CHUNK_BASE`, but that wasn't the result of extensive benchmarks, more the first reasonable number that came to mind.

### What

This changes `CHUNK_BASE` from `16` to `64`. That reduces the memory usage, trading it for a slight reduction in performance in certain benchmarks.

### Benchmarks

I added a benchmark suite for `Rope` to determine whether we'd regress in performance as `CHUNK_BASE` goes up. I went from `16` to `32` and then to `64`. While `32` increased performance and reduced memory usage, `64` had one slight drop in performance, increases in other benchmarks and substantial memory savings.

| `CHUNK_BASE` from `16` to `32` | `CHUNK_BASE` from `16` to `64` |
|-------------------|--------------------|
| ![chunk_base_16_to_32](https://github.com/zed-industries/zed/assets/1185253/fcf1f9c6-4f43-4e44-8ef5-29c1e5d8e2b9) | ![chunk_base_16_to_64](https://github.com/zed-industries/zed/assets/1185253/d82a0478-eeef-43d0-9240-e0aa9df8d946) |

### Real World Results

We tested this by loading a 138 MB `*.tex` file (parsed as plain text) into Zed and measuring in `Instruments.app` the allocation.

#### standard allocator

Before, with `CHUNK_BASE: 16`, the memory usage was ~827MB after loading the buffer.

| `CHUNK_BASE: 16` |
|---------------------|
| ![memory_consumption_chunk_base_16_std_alloc](https://github.com/zed-industries/zed/assets/1185253/c1e04c34-7d1a-49fa-bb3c-6ad10aec6e26) |

After, with `CHUNK_BASE: 64`, the memory usage was ~396MB after loading the buffer.

| `CHUNK_BASE: 64` |
|---------------------|
| ![memory_consumption_chunk_base_64_std_alloc](https://github.com/zed-industries/zed/assets/1185253/c728e134-1846-467f-b20f-114a582c7b5a) |

#### `mimalloc`

Zed uses `MiMalloc` by default and that seems to be pretty aggressive when it comes to growing memory. Whereas the std allocator would go up to ~800MB, MiMalloc would jump straight to 1024MB.

I also can't get `MiMalloc` to work properly with `Instruments.app` (it always shows 15MB of memory usage) so I had to use these `Activity Monitor` screenshots:

| `CHUNK_BASE: 16` |
|---------------------|
| ![memory_consumption_chunk_base_16_mimalloc](https://github.com/zed-industries/zed/assets/1185253/1e6e05e9-80c2-4ec7-9b0e-8a6fa78836eb) |

| `CHUNK_BASE: 64` |
|---------------------|
| ![memory_consumption_chunk_base_64_mimalloc](https://github.com/zed-industries/zed/assets/1185253/8a47e982-a675-4db0-b690-d60f1ff9acc8) |

### Release Notes

- Reduced memory usage for files by up to 50%.

---------

Co-authored-by: Antonio <antonio@zed.dev>
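
To make the relationship between `CHUNK_BASE` and the number of pieces concrete, here is a minimal sketch. It assumes, as the commit message states, that `CHUNK_BASE` is the maximum length of a chunk, and it uses a hypothetical `split_into_chunks` helper that fills every chunk as far as it can. It is not Zed's `Rope` implementation, which may size chunks differently.

```rust
/// Hypothetical helper: splits `text` into pieces of at most `chunk_base`
/// bytes, cutting only on UTF-8 character boundaries.
fn split_into_chunks(text: &str, chunk_base: usize) -> Vec<&str> {
    // A `char` is at most 4 bytes in UTF-8, so this guarantees progress.
    assert!(chunk_base >= 4, "a chunk must be able to hold any single char");
    let mut chunks = Vec::new();
    let mut rest = text;
    while !rest.is_empty() {
        let mut end = rest.len().min(chunk_base);
        // Back up until the cut does not land inside a multi-byte char.
        while !rest.is_char_boundary(end) {
            end -= 1;
        }
        let (head, tail) = rest.split_at(end);
        chunks.push(head);
        rest = tail;
    }
    chunks
}

fn main() {
    let text = "x".repeat(1024);
    for chunk_base in [16, 64] {
        let chunks = split_into_chunks(&text, chunk_base);
        println!("CHUNK_BASE = {chunk_base}: {} chunks", chunks.len());
    }
}
```

Under these assumptions, quadrupling the chunk size cuts the number of chunks to a quarter, so any fixed per-chunk bookkeeping in the tree is paid a quarter as often. That is one plausible reading of where the reported memory savings come from, not a measured breakdown.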
2024-04-09 16:07:53 +00:00
[[bench]]
name = "rope_benchmark"
harness = false