This PR adds a bit of telemetry for Anthropic models, in order to
understand model health. With this logging, we can monitor and diagnose
dips in performance, for example due to model rollouts.
Release Notes:
- N/A
---------
Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com>
Before this change, we'd see a ton of requests from the Ollama provider
trying to fetch models:
```
[2024-10-28T15:00:52+01:00 DEBUG reqwest::connect] starting new connection: http://localhost:11434/
[2024-10-28T15:00:52+01:00 DEBUG reqwest::connect] starting new connection: http://localhost:11434/
[2024-10-28T15:00:52+01:00 DEBUG reqwest::connect] starting new connection: http://localhost:11434/
[2024-10-28T15:00:52+01:00 DEBUG reqwest::connect] starting new connection: http://localhost:11434/
[2024-10-28T15:00:52+01:00 DEBUG reqwest::connect] starting new connection: http://localhost:11434/
[2024-10-28T15:00:52+01:00 DEBUG reqwest::connect] starting new connection: http://localhost:11434/
[2024-10-28T15:00:52+01:00 DEBUG reqwest::connect] starting new connection: http://localhost:11434/
[2024-10-28T15:00:52+01:00 DEBUG reqwest::connect] starting new connection: http://localhost:11434/
[2024-10-28T15:00:52+01:00 DEBUG reqwest::connect] starting new connection: http://localhost:11434/
[2024-10-28T15:00:52+01:00 DEBUG reqwest::connect] starting new connection: https://api.zed.dev/
[2024-10-28T15:00:52+01:00 DEBUG reqwest::connect] starting new connection: http://localhost:11434/
[2024-10-28T15:00:52+01:00 DEBUG reqwest::connect] starting new connection: http://localhost:11434/
[2024-10-28T15:00:52+01:00 DEBUG reqwest::connect] starting new connection: http://localhost:11434/
[2024-10-28T15:00:52+01:00 DEBUG reqwest::connect] starting new connection: http://localhost:11434/
[2024-10-28T15:00:52+01:00 DEBUG reqwest::connect] starting new connection: http://localhost:11434/
[2024-10-28T15:00:52+01:00 DEBUG reqwest::connect] starting new connection: http://localhost:11434/
[2024-10-28T15:00:52+01:00 DEBUG reqwest::connect] starting new connection: http://localhost:11434/
```
It turns out we'd send a request on *every* change to settings.
With this change, we only send a single request.
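A minimal sketch of the shape of the fix, with illustrative names (the real provider fetches asynchronously): only refetch when the settings we actually depend on have changed.
```rust
/// Illustrative subset of the Ollama provider's settings.
#[derive(Clone, Default, PartialEq)]
struct OllamaSettings {
    api_url: String,
}

struct OllamaProvider {
    settings: OllamaSettings,
}

impl OllamaProvider {
    /// Previously this refetched on *every* settings event; now we only hit
    /// the server when the relevant settings have actually changed.
    fn settings_changed(&mut self, new_settings: OllamaSettings) {
        if self.settings != new_settings {
            self.settings = new_settings;
            self.fetch_models();
        }
    }

    fn fetch_models(&self) {
        // GET {api_url}/api/tags, elided here.
    }
}
```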
Release Notes:
- N/A
Co-authored-by: Bennet <bennet@zed.dev>
This PR depends on #19547
This PR adds support for tools from context servers. Context servers are
free to expose tools that Zed can pass to models. When the model calls a
tool, Zed forwards the request to the context server. This allows for
some interesting techniques: context servers can easily expose tools for
querying local databases, reading or writing local files, or reading
resources over authenticated APIs (e.g. Kubernetes, Asana, etc.).
This is currently experimental.
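As a rough sketch of the forwarding path, with illustrative names (the real protocol call is asynchronous and richer than this): tools are registered per server, and a tool use from the model is routed to whichever server exposed that tool.
```rust
use std::collections::HashMap;

use anyhow::{anyhow, Result};
use serde_json::Value;

/// Illustrative stand-in for a context-server connection.
pub trait ContextServer {
    fn call_tool(&self, tool: &str, input: Value) -> Result<String>;
}

/// When the model emits a tool use, forward it to the server that exposed
/// that tool and return the result so it can be sent back to the model.
pub fn forward_tool_use(
    registry: &HashMap<String, Box<dyn ContextServer>>,
    tool: &str,
    input: Value,
) -> Result<String> {
    let server = registry
        .get(tool)
        .ok_or_else(|| anyhow!("no context server exposes tool `{tool}`"))?;
    server.call_tool(tool, input)
}
```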
Things to discuss
* I still want to add a confirmation dialog asking whether a server is
allowed to use a tool. Should we do this, or just call the tool and
assume context servers are trustworthy?
* Can we put tool use behind a local settings flag?
Release Notes:
- N/A
---------
Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>
- Closes: https://github.com/zed-industries/zed/issues/19609
Switches us to using `-latest` tags with Anthropic models instead of
pinning to a specific date version.
See: [Anthropic Model
Docs](https://docs.anthropic.com/en/docs/about-claude/models)
This is a no-op for:
- Claude 3 Opus (`claude-3-opus-20240229`)
- Claude 3 Sonnet (`claude-3-sonnet-20240229`)
- Claude 3 Haiku (`claude-3-haiku-20240307`)
For Claude 3.5 Sonnet this will update us from
`claude-3-5-sonnet-20240620` to `claude-3-5-sonnet-20241022`. We will
also pickup any subsequent model updates automatically when Anthropic
updates the `latest` tag.
This matches the behavior for OpenAI, where we use `gpt-4o` as the
model name and not `gpt-4o-2024-08-06`.
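As a hedged sketch of what this looks like in the model enum (variant names illustrative, other models elided):
```rust
/// Illustrative slice of the Anthropic model enum.
pub enum Model {
    Claude3_5Sonnet,
    // Other variants elided.
}

impl Model {
    pub fn id(&self) -> &str {
        match self {
            // Was "claude-3-5-sonnet-20240620"; the `-latest` tag currently
            // resolves to the 20241022 release and will track future updates.
            Model::Claude3_5Sonnet => "claude-3-5-sonnet-latest",
        }
    }
}
```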
This PR adds a new `zed_urls` module to the `client` crate.
This module contains functions for constructing URLs to Zed properties,
such as zed.dev.
The URLs produced by this module will respect the server URL set via
settings or the `ZED_SERVER_URL` environment variable. This allows them
to correctly reflect the current environment (such as when testing Zed
against a local collab/zed.dev).
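A minimal sketch of the pattern, assuming a hypothetical `account_url` helper; the real module reads the settings-configured server URL through the app context rather than only the environment variable.
```rust
pub mod zed_urls {
    /// The real functions consult settings via the app context; this sketch
    /// falls back from `ZED_SERVER_URL` straight to production.
    fn server_url() -> String {
        std::env::var("ZED_SERVER_URL").unwrap_or_else(|_| "https://zed.dev".to_string())
    }

    /// Example URL builder; the concrete set of functions is elided here.
    pub fn account_url() -> String {
        format!("{}/account", server_url())
    }
}
```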
Release Notes:
- N/A
This PR adds support to the assistant for display billing-related
errors.
Pulling this out of #19081 to make it easier to cherry-pick.
Release Notes:
- N/A
Co-authored-by: Antonio <antonio@zed.dev>
Co-authored-by: Richard <richard@zed.dev>
This PR adds a bit more metadata for assistant logging.
Release Notes:
- Assistant: Added `language_name` and `model_provider` fields to
telemetry events.
---------
Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>
Co-authored-by: Max <max@zed.dev>
Users of our `http_client` crate knew they were interacting with isahc,
since they set isahc's extensions on the request. This change adds our
own equivalents for those APIs in preparation for changing the default
HTTP client.
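A hedged sketch of the shape this takes, with an illustrative `RedirectPolicy`: callers set our own extension type on the request, and whichever client backend is active interprets it.
```rust
/// Our own redirect policy, replacing direct use of isahc's extension types
/// (shape is illustrative, not the crate's exact API).
#[derive(Clone, Debug, Default)]
pub enum RedirectPolicy {
    #[default]
    NoFollow,
    FollowLimit(u32),
    FollowAll,
}

/// Extension trait so callers configure requests without naming isahc.
pub trait HttpRequestExt {
    fn follow_redirects(self, policy: RedirectPolicy) -> Self;
}

impl HttpRequestExt for http::request::Builder {
    fn follow_redirects(self, policy: RedirectPolicy) -> Self {
        // Stored as a request extension; the active backend reads it out.
        self.extension(policy)
    }
}
```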
Release Notes:
- N/A
Release Notes:
- Allow Anthropic custom models to override "temperature"
This also centralizes the defaulting of "temperature" inside each
model's `into_x` call instead of sprinkling it around the code.
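A sketch of the resolution order, with illustrative names and a stand-in default of 1.0: the per-model override from settings wins, then the request's value, then the provider default.
```rust
/// Resolve the effective temperature in one place, inside the provider's
/// `into_*` conversion (1.0 here is a stand-in default, not the real value).
fn resolve_temperature(
    request_temperature: Option<f32>,
    model_override: Option<f32>,
) -> f32 {
    model_override.or(request_temperature).unwrap_or(1.0)
}
```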
Release Notes:
- Added a new `assistant.inline_alternatives` setting to configure
additional models that will be used to perform inline assists in
parallel.
---------
Co-authored-by: Nathan <nathan@zed.dev>
Co-authored-by: Roy <roy@anthropic.com>
Co-authored-by: Adam <wolffiex@anthropic.com>
This PR touches up the copywriting in the assistant configuration UI.
Friends reported to me that some of the writing could be clearer, and
hopefully this goes in that direction!
Release Notes:
- N/A
Release Notes:
- Added support for OpenAI o1-mini and o1-preview models.
---------
Co-authored-by: Jason Mancuso <7891333+jvmncs@users.noreply.github.com>
Co-authored-by: Bennet <bennet@zed.dev>
This is a barebones modification of the OpenAI provider code to
accommodate non-streaming completions, specifically for the o1 models,
which do not support streaming. I tested that this works by running a
`/workflow` with the following (arbitrarily chosen) settings:
```json
{
  "language_models": {
    "openai": {
      "version": "1",
      "available_models": [
        {
          "name": "o1-preview",
          "display_name": "o1-preview",
          "max_tokens": 128000,
          "max_completion_tokens": 30000
        },
        {
          "name": "o1-mini",
          "display_name": "o1-mini",
          "max_tokens": 128000,
          "max_completion_tokens": 20000
        }
      ]
    }
  }
}
```
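A minimal sketch of the non-streaming accommodation, with trimmed stand-in types: issue one plain request, then wrap the single response in a one-item stream so the rest of the pipeline can keep consuming a stream.
```rust
use anyhow::Result;
use futures::{stream, Stream};

/// Minimal stand-ins for the OpenAI response shape (illustrative, not the
/// crate's real types).
struct ResponseMessage { content: Option<String> }
struct Choice { message: ResponseMessage }
struct Response { choices: Vec<Choice> }

/// o1 models reject `"stream": true`, so after one plain HTTP request we
/// adapt the single response into a stream for downstream consumers.
fn into_single_item_stream(response: Response) -> impl Stream<Item = Result<String>> {
    let text: String = response
        .choices
        .into_iter()
        .filter_map(|choice| choice.message.content)
        .collect();
    stream::once(async move { Ok(text) })
}
```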
Release Notes:
- Changed `low_speed_timeout_in_seconds` option to `600` for OpenAI
provider to accommodate recent o1 model release.
---------
Co-authored-by: Peter <peter@zed.dev>
Co-authored-by: Bennet <bennet@zed.dev>
Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>
Add `/auto` behind a feature flag that's disabled for now, even for
staff.
We've decided on a different design for context inference, but there are
parts of /auto that will be useful for that, so we want them in the code
base even if they're unused for now.
Release Notes:
- N/A
---------
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>
This PR updates the message content for an LLM request to allow it to
contain tool uses.
We need to send the tool uses back to the model in order for it to
recognize the subsequent tool results.
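A sketch of the turn order this enables, with illustrative types and the `get_weather` example from the Anthropic docs: the assistant message echoes the tool use the model produced, and the following user message carries the matching result.
```rust
use serde_json::{json, Value};

/// Illustrative content blocks (not Zed's exact types).
enum Content {
    ToolUse { id: String, name: String, input: Value },
    ToolResult { tool_use_id: String, content: String },
}

struct Message {
    role: &'static str,
    content: Vec<Content>,
}

/// The tool use must be sent back in the assistant turn so the model can
/// match it to the tool result in the next user turn.
fn tool_turns() -> Vec<Message> {
    vec![
        Message {
            role: "assistant",
            content: vec![Content::ToolUse {
                id: "toolu_123".into(),
                name: "get_weather".into(),
                input: json!({ "location": "Boston" }),
            }],
        },
        Message {
            role: "user",
            content: vec![Content::ToolResult {
                tool_use_id: "toolu_123".into(),
                content: "72°F and sunny".into(),
            }],
        },
    ]
}
```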
Release Notes:
- N/A
This PR makes it so we propagate the `stop_reason` from Anthropic up to
the Assistant so that we can take action based on it.
The `extract_content_from_events` function was moved from the
`anthropic` crate to the `anthropic` module in `language_model`, since
it is more useful when it can name the `LanguageModelCompletionEvent`
type; otherwise we'd need an additional layer of plumbing.
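A hedged sketch of the propagation, with illustrative names (Anthropic's documented stop reasons include `end_turn`, `max_tokens`, `stop_sequence`, and `tool_use`):
```rust
/// Why the model stopped emitting tokens (mirrors Anthropic's `stop_reason`).
#[derive(Debug, Clone, Copy)]
pub enum StopReason {
    EndTurn,
    MaxTokens,
    ToolUse,
}

/// Events surfaced to the Assistant from a streamed completion.
pub enum LanguageModelCompletionEvent {
    Text(String),
    Stop(StopReason),
}

fn map_stop_reason(reason: &str) -> StopReason {
    match reason {
        "max_tokens" => StopReason::MaxTokens,
        "tool_use" => StopReason::ToolUse,
        // "end_turn", "stop_sequence", and anything unknown fall through.
        _ => StopReason::EndTurn,
    }
}
```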
Release Notes:
- N/A
This PR adjusts the approach we use to encoding tool uses in the
completion response to use a structured format rather than simply
injecting it into the response stream as text.
In #17170 we encoded the tool uses as XML and inserted them as text,
which then required re-parsing the tool uses out of the buffer in order
to use them.
The approach taken in this PR is to make `stream_completion` return a
stream of `LanguageModelCompletionEvent`s. Each of these events can be
either text, or a tool use.
A new `stream_completion_text` method has been added to `LanguageModel`
for scenarios where we only care about textual content (currently,
everywhere that isn't the Assistant context editor).
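A minimal sketch of the split, with a trimmed event type: text-only callers filter the event stream down to its text chunks.
```rust
use anyhow::Result;
use futures::{Stream, StreamExt};

/// Illustrative event type; the real enum carries more variants.
pub enum LanguageModelCompletionEvent {
    Text(String),
    ToolUse { id: String, name: String, input: serde_json::Value },
}

/// Sketch of `stream_completion_text`: keep only the textual chunks.
pub fn stream_completion_text(
    events: impl Stream<Item = Result<LanguageModelCompletionEvent>>,
) -> impl Stream<Item = Result<String>> {
    events.filter_map(|event| async move {
        match event {
            Ok(LanguageModelCompletionEvent::Text(text)) => Some(Ok(text)),
            Ok(_) => None, // tool uses are dropped for text-only consumers
            Err(error) => Some(Err(error)),
        }
    })
}
```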
Release Notes:
- N/A
This PR updates the Assistant with support for receiving tool uses from
Anthropic models and capturing them as text in the context editor.
This is just laying the foundation for tool use. We don't fulfill the
tool uses yet, nor define any tools for the model to use.
Here's an example of what it looks like using the example `get_weather`
tool from the Anthropic docs:
<img width="644" alt="Screenshot 2024-08-30 at 1 51 13 PM"
src="https://github.com/user-attachments/assets/3614f953-0689-423c-8955-b146729ea638">
Release Notes:
- N/A
This PR splits the `Content` type for Anthropic into two new types:
`RequestContent` and `ResponseContent`.
Going through the Anthropic API docs, it seems there are different
types of content that can be sent in requests versus what can be
returned in responses.
Using a separate type for each case tells the story a bit better and
makes it easier to understand, IMO.
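A hedged sketch of the split, trimmed to the common block types (the real request content also covers images):
```rust
use serde::{Deserialize, Serialize};
use serde_json::Value;

/// Content blocks we may *send* to the Anthropic API.
#[derive(Serialize)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum RequestContent {
    Text { text: String },
    ToolUse { id: String, name: String, input: Value },
    ToolResult { tool_use_id: String, content: String },
}

/// Content blocks the API may *return* in a response.
#[derive(Deserialize)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum ResponseContent {
    Text { text: String },
    ToolUse { id: String, name: String, input: Value },
}
```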
Release Notes:
- N/A
This PR makes it so the model's cache configuration gets passed through
from the base model when using the Zed provider.
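A minimal sketch of the pass-through, with illustrative types: the Zed-hosted wrapper defers to the base model it proxies instead of reporting no cache support.
```rust
/// Illustrative cache settings (not Zed's exact type).
#[derive(Clone)]
pub struct CacheConfiguration {
    pub max_cache_anchors: usize,
    pub min_total_token: usize,
}

pub struct BaseModel {
    cache_config: Option<CacheConfiguration>,
}

/// The Zed-hosted wrapper around a base model.
pub struct CloudModel {
    base: BaseModel,
}

impl CloudModel {
    /// Previously this effectively returned `None`; now it defers to the
    /// underlying model's configuration.
    pub fn cache_configuration(&self) -> Option<CacheConfiguration> {
        self.base.cache_config.clone()
    }
}
```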
Release Notes:
- Fixed caching for Anthropic models when using the Zed provider.
Introduce `max_output_tokens` Field for OpenAI Models
https://platform.deepseek.com/api-docs/news/news0725/#4-8k-max_tokens-betarelease-longer-possibilities
### Description
This commit introduces a new field `max_output_tokens` to the OpenAI
models, which allows specifying the maximum number of tokens that can be
generated in the output. This field is now integrated into the request
handling across multiple crates, ensuring that the output token limit is
respected during language model completions.
Changes include:
- Adding `max_output_tokens` to the `Custom` variant of the
`open_ai::Model` enum.
- Updating the `into_open_ai` method in `LanguageModelRequest` to accept
and use `max_output_tokens`.
- Modifying the `OpenAiLanguageModel` and `CloudLanguageModel`
implementations to pass `max_output_tokens` when converting requests.
- Ensuring that the `max_output_tokens` field is correctly serialized
and deserialized in relevant structures.
This enhancement provides more control over the output length of OpenAI
model responses, improving the flexibility and accuracy of language
model interactions.
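As a hedged sketch with trimmed, illustrative types, the plumbing looks roughly like this:
```rust
use serde::Serialize;

/// Trimmed, illustrative request types; the real ones carry many more fields.
pub struct LanguageModelRequest {
    pub prompt: String,
}

#[derive(Serialize)]
pub struct OpenAiRequest {
    pub model: String,
    pub prompt: String,
    /// Cap on generated tokens; omitted from the body when unset.
    #[serde(skip_serializing_if = "Option::is_none")]
    pub max_tokens: Option<u32>,
}

impl LanguageModelRequest {
    /// `into_open_ai` now accepts the configured `max_output_tokens` and
    /// threads it into the serialized request.
    pub fn into_open_ai(self, model: String, max_output_tokens: Option<u32>) -> OpenAiRequest {
        OpenAiRequest {
            model,
            prompt: self.prompt,
            max_tokens: max_output_tokens,
        }
    }
}
```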
### Related Issue
https://github.com/zed-industries/zed/pull/16358
### Screenshots / Media
N/A
### Checklist
- [x] Code compiles correctly.
- [x] All tests pass.
- [ ] Documentation has been updated accordingly.
- [ ] Additional tests have been added to cover new functionality.
### Release Notes
- Added `max_output_tokens` field to OpenAI models for controlling
output token length.
This makes it easier to debug why resetting a key doesn't work. We now
show when the key is set via an environment variable and, if so,
disable the reset-key button and give instructions instead.
![screenshot-2024-08-19-11 22
05@2x](https://github.com/user-attachments/assets/6c75dc82-cb61-4661-9647-f77fca8fdf41)
Release Notes:
- N/A
Co-authored-by: Bennet <bennet@zed.dev>
This commit adds a custom icon for Anthropic hosted models.
![CleanShot 2024-08-18 at 15 40
38@2x](https://github.com/user-attachments/assets/d467ccab-9628-4258-89fc-782e0d4a48d4)
![CleanShot 2024-08-18 at 15 40
34@2x](https://github.com/user-attachments/assets/7efaff9c-6a58-47ba-87ea-e0fe0586fedc)
- Adding a new SVG icon for Anthropic hosted models.
- The new icon is located at: `assets/icons/ai_anthropic_hosted.svg`
- Updating the LanguageModel trait to include an optional icon method
- Implementing the icon method for CloudModel to return the custom icon
for Anthropic hosted models
- Updating the UI components to use the model-specific icon when
available
- Adding a new IconName variant for the Anthropic hosted icon
We should change the non-hosted icon in some small way to distinguish it
from the hosted version. I duplicated the path for now so we can
hopefully add it for the next release.
Release Notes:
- N/A
Release Notes:
- Added support for prompt caching in Anthropic models. For models that
support it, this can dramatically lower cost while improving performance.
This PR is just a refactor, to pave the way toward adding a view for
workflow step resolution. The entity carries the state of the tool
call's streaming output.
Release Notes:
- N/A
This PR is a refactor to pave the way for allowing the user to view and
edit workflow step resolutions. I've made tool calls work more like
normal streaming completions for all providers. The `use_any_tool`
method returns a stream of strings (which contain chunks of JSON). I've
also done some minor cleanup of language model providers in general,
removing the duplication around handling streaming responses.
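A sketch of how a consumer assembles the streamed chunks, assuming a hypothetical `collect_tool_input` helper (the real `use_any_tool` also takes the request and the tool's name, description, and schema):
```rust
use anyhow::Result;
use futures::{stream::BoxStream, StreamExt};

/// Accumulate the streamed JSON chunks from `use_any_tool`; the chunks only
/// form valid JSON once the stream completes, so parse at the end.
async fn collect_tool_input(
    mut chunks: BoxStream<'_, Result<String>>,
) -> Result<serde_json::Value> {
    let mut json = String::new();
    while let Some(chunk) = chunks.next().await {
        json.push_str(&chunk?);
    }
    Ok(serde_json::from_str(&json)?)
}
```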
Release Notes:
- N/A