Commit graph

17 commits

Author SHA1 Message Date
Kyle Kelley
49371b44cb
Semantic Index (#10329)
This introduces semantic indexing in Zed based on chunking text from
files in the developer's workspace and creating vector embeddings using
an embedding model. As part of this, we've created an embeddings
provider trait that allows us to work with OpenAI, a local Ollama model,
or a Zed hosted embedding.

The semantic index is built by breaking down text for known
(programming) languages into manageable chunks that are smaller than the
max token size. Each chunk is then fed to a language model to create a
high-dimensional vector, which is normalized to a unit vector so that it
can be compared quickly with other vectors using a simple dot product.
Alongside the vector, we store the path of the file and the range within
the document where the vector was sourced from.
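
A minimal sketch of the normalize-then-dot-product comparison described above, not the code from this PR. The names `IndexedChunk`, `normalize`, `dot`, and `top_k` are hypothetical and chosen for illustration; the embedding vectors are assumed to already exist as `Vec<f32>`.

```rust
use std::ops::Range;

/// Hypothetical index entry: the file path, the range within the document
/// the chunk came from, and the chunk's unit-length embedding.
#[derive(Debug, Clone)]
pub struct IndexedChunk {
    pub path: String,
    pub range: Range<usize>,
    pub embedding: Vec<f32>,
}

/// Scale a vector to unit length. A zero vector is returned unchanged.
pub fn normalize(mut v: Vec<f32>) -> Vec<f32> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in &mut v {
            *x /= norm;
        }
    }
    v
}

/// Dot product of two equal-length vectors. For unit vectors this equals
/// their cosine similarity, which is why normalizing up front pays off.
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "embeddings must have the same dimension");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Return the `k` chunks most similar to a (normalized) query vector,
/// highest score first.
pub fn top_k<'a>(
    query: &[f32],
    chunks: &'a [IndexedChunk],
    k: usize,
) -> Vec<(f32, &'a IndexedChunk)> {
    let mut scored: Vec<(f32, &IndexedChunk)> = chunks
        .iter()
        .map(|chunk| (dot(query, &chunk.embedding), chunk))
        .collect();
    scored.sort_by(|a, b| b.0.total_cmp(&a.0));
    scored.truncate(k);
    scored
}
```

In this sketch, a search would normalize the query embedding once and then call `top_k(&query, &chunks, k)`; each hit carries the path and range needed to locate the source text.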

Zed will soon grok contextual similarity across different text snippets,
allowing for natural language search beyond keyword matching. This is
being put together both for human-based search as well as providing
results to Large Language Models to allow them to refine how they help
developers.

Remaining todo:

* [x] Change `provider` to `model` within the Zed-hosted embeddings
database (as it's currently a combination of the provider and the model
in one name)


Release Notes:

- N/A

---------

Co-authored-by: Nathan Sobo <nathan@zed.dev>
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Conrad Irwin <conrad@zed.dev>
Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>
Co-authored-by: Antonio <antonio@zed.dev>
2024-04-12 11:40:59 -06:00
Kirill Bulatov
525ff6bf74 Remove zed -> ... -> semantic_index -> zed Cargo dependency cycle 2023-10-13 10:27:08 +03:00
Kirill Bulatov
4b15a2bd63 Rebase fixes 2023-10-11 12:56:29 +03:00
Mikayla Maki
591ec02cea
Add support for the experimental Next LS for Elixir (#3024)
This is a PR I built for a friend of a friend at StrangeLoop, who is
making a much better LSP for Elixir that Elixir folks want to experiment
with. This PR also improves our debug log viewer to handle LSP
restarts.

TODO:
- [ ] Make sure NextLS binary loading works.

Release Notes:

- Added support for the experimental Next LS for Elixir. To enable it,
add the following field to your settings:

```json
"elixir": {
    "next": "on"
}
```
2023-09-25 12:52:56 -05:00
KCaverly
68c37ca2a4 move embedding provider to ai crate 2023-09-22 09:33:59 -04:00
Mikayla
0cceb3fdf1
Get nextLS running 2023-09-20 06:55:24 -07:00
KCaverly
11b3bfdc99 fix warnings 2023-09-19 19:05:26 -04:00
KCaverly
25cb79e475 remove git2 dependency for repository cloning in semantic_index eval 2023-09-19 18:55:15 -04:00
KCaverly
d85acceeec move git2 to workspace dependency globally 2023-09-19 16:13:47 -04:00
KCaverly
25bd357426 add recall and precision to semantic index 2023-09-18 18:25:02 -04:00
KCaverly
566bb9f71b add map to evaluation suite for semantic_index 2023-09-18 09:57:52 -04:00
KCaverly
04bd107ada add ndcg@k to evaluate metrics 2023-09-15 10:36:21 -04:00
KCaverly
137dda3ee6 wip eval framework for semantic index 2023-09-14 09:30:19 -04:00
KCaverly
0c1b2e5aa6 cleaned up warnings 2023-09-13 20:04:53 -04:00
KCaverly
eff44f9aa4 semantic index eval, indexing appropriately 2023-09-13 20:02:15 -04:00
KCaverly
6f29582fb0 progress on eval 2023-09-13 10:32:36 -04:00
KCaverly
66c967da88 start work on eval script for semantic_index 2023-09-12 16:25:31 -04:00