Commit graph

43 commits

Author SHA1 Message Date
Kyle Kelley
49371b44cb
Semantic Index (#10329)
This introduces semantic indexing in Zed based on chunking text from
files in the developer's workspace and creating vector embeddings using
an embedding model. As part of this, we've created an embeddings
provider trait that allows us to work with OpenAI, a local Ollama model,
or a Zed hosted embedding.

The semantic index is built by breaking down text for known
(programming) languages into manageable chunks that are smaller than the
max token size. Each chunk is then fed to a language model to create a
high dimensional vector which is then normalized to a unit vector to
allow fast comparison with other vectors with a simple dot product.
Alongside the vector, we store the path of the file and the range within
the document where the vector was sourced from.

Zed will soon grok contextual similarity across different text snippets,
allowing for natural language search beyond keyword matching. This is
being put together both for human-based search as well as providing
results to Large Language Models to allow them to refine how they help
developers.

Remaining todo:

* [x] Change `provider` to `model` within the zed hosted embeddings
database (as its currently a combo of the provider and the model in one
name)


Release Notes:

- N/A

---------

Co-authored-by: Nathan Sobo <nathan@zed.dev>
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Conrad Irwin <conrad@zed.dev>
Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>
Co-authored-by: Antonio <antonio@zed.dev>
2024-04-12 11:40:59 -06:00
Nathan Sobo
8ae5a3b61a
Allow AI interactions to be proxied through Zed's server so you don't need an API key (#7367)
Co-authored-by: Antonio <antonio@zed.dev>

Resurrected this from some assistant work I did in Spring of 2023.
- [x] Resurrect streaming responses
- [x] Use streaming responses to enable AI via Zed's servers by default
(but preserve API key option for now)
- [x] Simplify protobuf
- [x] Proxy to OpenAI on zed.dev
- [x] Proxy to Gemini on zed.dev
- [x] Improve UX for switching between openAI and google models
- We current disallow cycling when setting a custom model, but we need a
better solution to keep OpenAI models available while testing the google
ones
- [x] Show remaining tokens correctly for Google models
- [x] Remove semantic index
- [x] Delete `ai` crate
- [x] Cloud front so we can ban abuse
- [x] Rate-limiting
- [x] Fix panic when using inline assistant
- [x] Double check the upgraded `AssistantSettings` are
backwards-compatible
- [x] Add hosted LLM interaction behind a `language-models` feature
flag.

Release Notes:

- We are temporarily removing the semantic index in order to redesign it
from scratch.

---------

Co-authored-by: Antonio <antonio@zed.dev>
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Thorsten <thorsten@zed.dev>
Co-authored-by: Max <max@zed.dev>
2024-03-19 19:22:26 +01:00
Marshall Bowers
22fe03913c
Move Clippy configuration to the workspace level (#8891)
This PR moves the Clippy configuration up to the workspace level.

We're using the [`lints`
table](https://doc.rust-lang.org/cargo/reference/workspaces.html#the-lints-table)
to configure the Clippy ruleset in the workspace's `Cargo.toml`.

Each crate in the workspace now has the following in their own
`Cargo.toml` to inherit the lints from the workspace:

```toml
[lints]
workspace = true
```

This allows for configuring rust-analyzer to show Clippy lints in the
editor by using the following configuration in your Zed `settings.json`:

```json
{
  "lsp": {
    "rust-analyzer": {
      "initialization_options": {
        "check": {
          "command": "clippy"
        }
      }
    }
  }
```

Release Notes:

- N/A
2024-03-05 12:01:17 -05:00
Dzmitry Malyshau
a44fc24445
Clean up many small dependencies (part 3) (#8425)
Follow-up to #8353

Release Notes:
- N/A
2024-02-26 11:08:57 +02:00
Piotr Osiewicz
743f9b345f
chore: Move workspace dependencies to workspace.dependencies (#7454)
We should prefer referring to local deps via `.workspace = true` from
now on.

Release Notes:

- N/A
2024-02-06 20:41:36 +01:00
Marshall Bowers
dbb5fad147
Fix some formatting issues in Cargo.toml files (#7127)
This PR fixes some formatting issues in some of the `Cargo.toml` files.

I tried to fix most of these in #7126, but there were a few that I
missed.

Release Notes:

- N/A
2024-01-30 22:01:35 -05:00
Marshall Bowers
e338f34097
Sort dependencies in Cargo.toml files (#7126)
This PR sorts the dependency lists in our `Cargo.toml` files so that
they are in alphabetical order.

This should make them easier to visually scan when looking for a
dependency.

Apologies in advance for any merge conflicts 🙈 

Release Notes:

- N/A
2024-01-30 21:41:29 -05:00
Piotr Osiewicz
e6ebe7974d
gpui: Add Global marker trait (#7095)
This should prevent a class of bugs where one queries the wrong type of
global, which results in oddities at runtime.

Release Notes:

- N/A

---------

Co-authored-by: Marshall <marshall@zed.dev>
Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>
2024-01-30 14:08:20 -05:00
Marshall Bowers
0cb8b0e451
Clean up Cargo.toml files (#7044)
This PR cleans up some inconsistencies in the `Cargo.toml` files that
were driving me crazy.

Release Notes:

- N/A
2024-01-29 23:47:20 -05:00
Piotr Osiewicz
0a0a866dd5
Licenses: change license fields in Cargo.toml to AGPL-3.0-or-later. (#5535)
Release Notes:
- N/A
2024-01-27 13:51:16 +01:00
Julian Braha
fd6f71d287 Replace tempdir crate with tempfile 2024-01-24 17:58:09 +00:00
Piotr Osiewicz
f2ff7fa4d5
chore: Change AGPL-licensed crates to GPL (except for collab) (#4231)
- [x] Fill in GPL license text.
- [x] live_kit_client depends on live_kit_server as non-dev dependency,
even though it seems to only be used for tests. Is that an issue?

Release Notes:
- N/A
2024-01-24 00:26:58 +01:00
Piotr Osiewicz
21e6b09361
Remove license-file from Cargo.toml as it is apparently redundant (#4218)
Release Notes:

- N/A
2024-01-23 17:40:30 +01:00
Piotr Osiewicz
678bdddd7d
chore: Add crate licenses. (#4158)
- GPUI and all dependencies: Apache 2
- Everything else: AGPL

Here's a script that I've generated for it:
https://gist.github.com/osiewicz/6afdd6626e517da24a2092807e6f0b6e

Release Notes:
- N/A

---------

Co-authored-by: David <david@zed.dev>
2024-01-23 16:56:22 +01:00
Piotr Osiewicz
1c77104050 chore: Enable asset compression
This reduces size of release binary by ~20% from 134MB to 107MB without noticeable slowdown on startup. Assets are decompressed granularly, on first access
2024-01-10 13:54:25 +01:00
Max Brunsfeld
f5ba22659b Remove 2 suffix from gpui
Co-authored-by: Mikayla <mikayla@zed.dev>
2024-01-03 12:59:39 -08:00
Max Brunsfeld
0cf65223ce Remove 2 suffix for collab, rope, settings, menu
Co-authored-by: Mikayla <mikayla@zed.dev>
2024-01-03 12:29:16 -08:00
Max Brunsfeld
5ddd298b4d Remove 2 suffix for fs, db, semantic_index, prettier
Co-authored-by: Mikayla <mikayla@zed.dev>
2024-01-03 12:09:42 -08:00
Joseph T. Lyons
516a8790b9 Add gpt-4-1106-preview model 2023-11-14 08:28:57 -05:00
KCaverly
3447a9478c updated authentication for embedding provider 2023-10-26 11:18:16 +02:00
Kirill Bulatov
525ff6bf74 Remove zed -> ... -> semantic_index -> zed Cargo dependency cycle 2023-10-13 10:27:08 +03:00
Mikayla
6007c8705c
Upgrade SeaORM to latest version, also upgrade sqlite bindings, rustqlite, and remove SeaQuery
co-authored-by: Max <max@zed.dev>
2023-10-03 12:16:53 -07:00
KCaverly
abefa2738b removed blas and increase batch size for vector search 2023-09-27 09:43:23 -04:00
KCaverly
e75f56a0f2 move to system blas 2023-09-26 12:39:22 -04:00
KCaverly
86ec0b1d9f implement new search strategy 2023-09-25 13:44:19 -04:00
KCaverly
68c37ca2a4 move embedding provider to ai crate 2023-09-22 09:33:59 -04:00
KCaverly
25cb79e475 remove git2 dependency for repository cloning in semantic_index eval 2023-09-19 18:55:15 -04:00
KCaverly
d85acceeec move git2 to workspace dependency globally 2023-09-19 16:13:47 -04:00
KCaverly
3a661c5977 catchup with main 2023-09-15 09:31:33 -04:00
Antonio Scandurra
ae85a520f2 Refactor semantic searching of modified buffers 2023-09-15 12:12:20 +02:00
KCaverly
eff44f9aa4 semantic index eval, indexing appropriately 2023-09-13 20:02:15 -04:00
KCaverly
66c967da88 start work on eval script for semantic_index 2023-09-12 16:25:31 -04:00
Antonio Scandurra
d4cff68475 🎨 2023-09-05 15:52:36 +02:00
KCaverly
4f8b95cf0d add proper handling for open ai rate limit delays 2023-08-29 15:44:51 -04:00
KCaverly
3d89cd10a4 added sha1 encoding for each document 2023-08-21 16:35:57 +02:00
KCaverly
599f674827 add php support for semantic search 2023-07-31 16:36:09 -04:00
KCaverly
ca4e21881e add ruby support for semantic search 2023-07-31 10:54:30 -04:00
KCaverly
a5dd8dd0a9 add lua embedding query for semantic search 2023-07-31 10:02:28 -04:00
KCaverly
e02d6bc0d4 add glob filtering functionality to semantic search 2023-07-20 13:46:27 -04:00
KCaverly
efe973ebe2 add embedding query for json with nested arrays and strings
Co-authored-by: maxbrunsfeld <max@zed.dev>
2023-07-19 16:52:44 -04:00
KCaverly
9809ec3d70 update treesitter parsing to accomodate for collapsed nested functions
Co-authored-by: maxbrunsfeld <max@zed.dev>
2023-07-19 15:47:05 -04:00
Max Brunsfeld
afc4c10ec1 Start work on exposing semantic search via project search view
Co-authored-by: Kyle <kyle@zed.dev>
2023-07-17 18:10:51 -07:00
KCaverly
8b42f5b1b3 rename vector_store crate to semantic_index 2023-07-17 17:06:10 -04:00
Renamed from crates/vector_store/Cargo.toml (Browse further)