This pull request introduces several improvements to the semantic search
experience. We're still missing collaboration and searching modified
buffers, which we'll tackle after we take a detour into reducing the
number of tokens used to generate embeddings.
Release Notes:
- Fixed a bug that could prevent semantic search from working when a
deploy happened right after opening a project.
- Fixed a panic that could sometimes occur when using semantic search
while simultaneously changing a file.
- Fixed a bug that prevented semantic search from including worktrees
that were newly added to a project.
Prior to this PR, we assumed that all call events needed a room_id, but
we now have call-based actions that don't require one - for instance,
you can right-click a channel and view its notes while not in a call. In
that case, there is no room_id. We want to be able to track these
events, which means removing the restriction that every call event must
have a room_id.
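Roughly the shape of the change, as a minimal sketch - the type and
field names here are hypothetical, not our actual telemetry schema:

```rust
// Hypothetical sketch: the room association becomes optional so that
// call-based actions taken outside a call can still be tracked.

#[derive(Debug)]
enum EventKind {
    JoinedCall,
    OpenedChannelNotes,
}

#[derive(Debug)]
struct CallEvent {
    kind: EventKind,
    // Previously a required id; an Option lets us record events like
    // opening channel notes from a right-click, where no room exists.
    room_id: Option<u64>,
}

fn main() {
    let events = [
        CallEvent { kind: EventKind::JoinedCall, room_id: Some(42) },
        CallEvent { kind: EventKind::OpenedChannelNotes, room_id: None },
    ];
    for event in &events {
        println!("{:?}", event);
    }
}
```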
Release Notes:
- N/A
* [x] Re-send operations that weren't sent while disconnected (see the sketch after this list)
* [x] Apply other clients' operations that were missed while
disconnected
* [x] Update collaborators that joined / left while disconnected
* [x] Inform current collaborators that your peer id has changed
* [x] Refresh channel buffer collaborators on server restart
* [x] Randomized test
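For the re-send step, here's a hedged sketch of the idea - not the
actual implementation, and `Operation`, `ChannelBuffer`, and
`resend_pending` are hypothetical names. The client holds on to every
local operation until the server acknowledges it, and on reconnect
replays whatever is still unacknowledged:

```rust
use std::collections::BTreeMap;

struct Operation {
    id: u64,
    payload: String,
}

struct ChannelBuffer {
    // Locally produced operations the server has not yet acknowledged,
    // keyed by id so acks can remove them out of order.
    unacked: BTreeMap<u64, Operation>,
}

impl ChannelBuffer {
    fn apply_local(&mut self, op: Operation) {
        self.unacked.insert(op.id, op);
    }

    fn acked(&mut self, id: u64) {
        self.unacked.remove(&id);
    }

    // Called after the connection is re-established: everything still
    // unacked is re-sent, in order thanks to the BTreeMap.
    fn resend_pending(&self) -> Vec<&Operation> {
        self.unacked.values().collect()
    }
}

fn main() {
    let mut buffer = ChannelBuffer { unacked: BTreeMap::new() };
    buffer.apply_local(Operation { id: 1, payload: "insert 'a'".into() });
    buffer.apply_local(Operation { id: 2, payload: "insert 'b'".into() });
    buffer.acked(1); // acknowledged before the disconnect
    for op in buffer.resend_pending() {
        println!("re-sending op {}: {}", op.id, op.payload);
    }
}
```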
This PR updates our client code to connect to preview whenever we're not
on stable. This will make it more likely that we'll be able to
collaborate on a dev build, but obviously won't work if there's a
protocol change on main that hasn't made its way to preview yet.
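As a minimal sketch of the selection logic - the enum and URLs below
are placeholders, not the actual client code:

```rust
// Hypothetical sketch of choosing the collab server by release channel.

#[derive(Debug, Clone, Copy)]
enum ReleaseChannel {
    Stable,
    Preview,
    Dev,
}

fn collab_server_url(channel: ReleaseChannel) -> &'static str {
    match channel {
        // Stable builds keep talking to the production server...
        ReleaseChannel::Stable => "https://collab.example.com",
        // ...while preview and dev builds share the preview server, so a
        // dev build can collaborate with preview users as long as the
        // protocol hasn't diverged on main.
        ReleaseChannel::Preview | ReleaseChannel::Dev => {
            "https://collab-preview.example.com"
        }
    }
}

fn main() {
    for channel in [
        ReleaseChannel::Stable,
        ReleaseChannel::Preview,
        ReleaseChannel::Dev,
    ] {
        println!("{:?} -> {}", channel, collab_server_url(channel));
    }
}
```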
This PR ships a series of optimizations for the semantic search engine,
focused mostly on removing invalid states, optimizing requests to
OpenAI, and reducing token usage.
Release Notes (Preview-Only):
- Added eager incremental indexing in the background on a debounce.
- Added a local embeddings cache to reduce redundant calls to OpenAI
(see the cache sketch after these notes).
- Moved to an embeddings queue model that ensures optimal batch sizes
at the token level, as well as atomic file & document writes (see the
batching sketch below).
- Adjusted OpenAI embedding API requests to honor the backoff delays
the API provides when rate limiting (see the backoff sketch below).
- Removed flush races between the file-parsing and embedding-queue
steps.
- Moved truncation to the parsing step, reducing the probability that
OpenAI encounters bad data.
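For the embeddings cache, a hedged sketch of the idea - the real cache
persists locally, while the names and the in-memory `HashMap` here are
stand-ins. Identical text hashes to the same key, so re-indexing
unchanged content never re-hits OpenAI:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

struct EmbeddingsCache {
    by_digest: HashMap<u64, Vec<f32>>,
}

impl EmbeddingsCache {
    // Return the cached embedding for `text`, calling `embed` only on a
    // cache miss.
    fn get_or_insert_with(
        &mut self,
        text: &str,
        embed: impl FnOnce(&str) -> Vec<f32>,
    ) -> &Vec<f32> {
        let mut hasher = DefaultHasher::new();
        text.hash(&mut hasher);
        let digest = hasher.finish();
        self.by_digest.entry(digest).or_insert_with(|| embed(text))
    }
}

fn main() {
    let mut cache = EmbeddingsCache { by_digest: HashMap::new() };
    let mut api_calls = 0;
    for _ in 0..3 {
        // Only the first lookup triggers the (stubbed) API call.
        cache.get_or_insert_with("fn main() {}", |_| {
            api_calls += 1;
            vec![0.1, 0.2, 0.3]
        });
    }
    println!("API called {api_calls} time(s)"); // prints 1
}
```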
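For the embeddings queue, a minimal sketch of token-level batching -
`Span`, `batch_spans`, and the budget constant are hypothetical, not
the actual queue. A batch is flushed whenever adding the next span
would push it past the model's token budget:

```rust
// Hypothetical per-request token budget.
const MAX_TOKENS_PER_BATCH: usize = 8191;

// One parsed document span waiting to be embedded; `token_count` would
// come from a tokenizer pass during parsing.
struct Span {
    text: String,
    token_count: usize,
}

// Group spans into batches that each stay under the token budget.
fn batch_spans(spans: Vec<Span>) -> Vec<Vec<Span>> {
    let mut batches: Vec<Vec<Span>> = Vec::new();
    let mut current: Vec<Span> = Vec::new();
    let mut current_tokens = 0;

    for span in spans {
        if !current.is_empty()
            && current_tokens + span.token_count > MAX_TOKENS_PER_BATCH
        {
            batches.push(std::mem::take(&mut current));
            current_tokens = 0;
        }
        current_tokens += span.token_count;
        current.push(span);
    }
    if !current.is_empty() {
        batches.push(current);
    }
    batches
}

fn main() {
    let spans: Vec<Span> = (0..5)
        .map(|i| Span {
            text: format!("fn example_{i}() {{}}"),
            token_count: 3000,
        })
        .collect();
    for (i, batch) in batch_spans(spans).iter().enumerate() {
        let tokens: usize = batch.iter().map(|s| s.token_count).sum();
        println!(
            "batch {i}: {} spans ({tokens} tokens), first = {:?}",
            batch.len(),
            batch[0].text
        );
    }
}
```

Filling each batch as close to the budget as possible keeps the number
of requests down without ever exceeding the limit.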
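And for rate limiting, a sketch of honoring the API-provided delay
instead of a fixed retry interval - the request function is a stub and
all names are hypothetical:

```rust
use std::{thread, time::Duration};

enum ApiResponse {
    Ok(Vec<f32>),
    // Carries the delay suggested by the server, e.g. from a
    // Retry-After header.
    RateLimited { retry_after: Duration },
}

// Stand-in for the real embedding request; pretends the first attempt
// is rate-limited.
fn fake_embed_request(attempt: u32) -> ApiResponse {
    if attempt == 0 {
        ApiResponse::RateLimited { retry_after: Duration::from_millis(50) }
    } else {
        ApiResponse::Ok(vec![0.0; 4])
    }
}

fn embed_with_backoff(max_attempts: u32) -> Option<Vec<f32>> {
    for attempt in 0..max_attempts {
        match fake_embed_request(attempt) {
            ApiResponse::Ok(embedding) => return Some(embedding),
            ApiResponse::RateLimited { retry_after } => {
                // Honor the delay the API provides rather than guessing.
                thread::sleep(retry_after);
            }
        }
    }
    None
}

fn main() {
    let embedding = embed_with_backoff(3).expect("embedding after retries");
    println!("got embedding of length {}", embedding.len());
}
```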