Improve content generation prompt to reduce over-generation (#16333)

I focused on cases where we're inserting doc comments or annotations
above symbols.

I added 5 new examples to the content generation prompt, covering
various scenarios:

1. Inserting documentation for a Rust struct
2. Writing docstrings for a Python class
3. Adding comments to a TypeScript method
4. Adding a derive attribute to a Rust struct
5. Adding a decorator to a Python class

These examples demonstrate how to handle different languages and common
tasks like adding documentation, attributes, and decorators.

To improve context integration, I've made the following changes:

1. Added a `transform_context_range` that includes 3 lines before and
after the transform range
2. Introduced `rewrite_section_prefix` and `rewrite_section_suffix` to
provide more context around the section being rewritten
3. Updated the prompt template to include this additional context in a
separate code snippet
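
The context expansion described above can be illustrated with a minimal sketch. This is not the code in this commit: it assumes a plain `&str` with byte offsets instead of the editor's buffer snapshot and point types, and `expand_to_context` and `split_context` are hypothetical names chosen for the illustration.

```rust
use std::ops::Range;

/// Expand a byte range to whole lines, adding up to `context_lines`
/// lines before and after it, clamped to the bounds of `text`.
/// (Hypothetical helper for illustration only.)
fn expand_to_context(text: &str, range: Range<usize>, context_lines: usize) -> Range<usize> {
    // Byte offset at which each line starts.
    let line_starts: Vec<usize> = std::iter::once(0)
        .chain(text.match_indices('\n').map(|(i, _)| i + 1))
        .collect();
    // Row containing a given byte offset.
    let row_of = |offset: usize| line_starts.partition_point(|&s| s <= offset) - 1;

    // Move the start up, stopping at the first line.
    let start_row = row_of(range.start).saturating_sub(context_lines);
    // Move the end down, stopping at the last line.
    let end_row = (row_of(range.end) + context_lines).min(line_starts.len() - 1);

    let start = line_starts[start_row];
    // End of `end_row`, excluding its trailing newline.
    let end = match line_starts.get(end_row + 1) {
        Some(&next_start) => next_start - 1,
        None => text.len(),
    };
    start..end
}

/// Split the context into the text before and after the rewritten range.
/// (Hypothetical helper for illustration only.)
fn split_context(text: &str, range: Range<usize>, context: Range<usize>) -> (&str, &str) {
    (&text[context.start..range.start], &text[range.end..context.end])
}
```

In this sketch, the two strings returned by `split_context` play the role of `rewrite_section_prefix` and `rewrite_section_suffix`: they surround the rewritten section in the snippet shown to the model without being part of what it is asked to rewrite.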

Release Notes:

- Reduced instances of over-generation when inserting docs or
annotations above a symbol.
Nathan Sobo 2024-08-15 22:20:11 -06:00 committed by GitHub
parent bac39d7743
commit ad44b459cd
3 changed files with 434 additions and 33 deletions


@ -1,52 +1,426 @@
Here's a text file that I'm going to ask you to make an edit to.
You are an expert developer assistant working in an AI-enabled text editor.
Your task is to rewrite a specific section of the provided document based on a user-provided prompt.
{{#if language_name}}
The file is in {{language_name}}.
{{/if}}
You need to rewrite a portion of it.
The section you'll need to edit is marked with <rewrite_this></rewrite_this> tags.
<guidelines>
1. Scope: Modify only content within <rewrite_this> tags. Do not alter anything outside these boundaries.
2. Precision: Make changes strictly necessary to fulfill the given prompt. Preserve all other content as-is.
3. Seamless integration: Ensure rewritten sections flow naturally with surrounding text and maintain document structure.
4. Tag exclusion: Never include <rewrite_this>, </rewrite_this>, <edit_here>, or <insert_here> tags in the output.
5. Indentation: Maintain the original indentation level of the file in rewritten sections.
6. Completeness: Rewrite the entire tagged section, even if only partial changes are needed. Avoid omissions or elisions.
7. Insertions: Replace <insert_here></insert_here> tags with appropriate content as specified by the prompt.
8. Code integrity: Respect existing code structure and functionality when making changes.
9. Consistency: Maintain a uniform style and tone throughout the rewritten text.
</guidelines>
<examples>
<example>
<input>
<document>
{{{document_content}}}
use std::cell::Cell;
use std::collections::HashMap;
use std::cmp;
<rewrite_this>
<insert_here></insert_here>
</rewrite_this>
pub struct LruCache<K, V> {
/// The maximum number of items the cache can hold.
capacity: usize,
/// The map storing the cached items.
items: HashMap<K, V>,
}
// The rest of the implementation...
</document>
<prompt>
doc this
</prompt>
</input>
<incorrect_output failure="Over-generation. The text starting with `pub struct AabbTree<T> {` is *after* the rewrite_this tag">
/// Represents an Axis-Aligned Bounding Box (AABB) tree data structure.
///
/// This structure is used for efficient spatial queries and collision detection.
/// It organizes objects in a hierarchical tree structure based on their bounding boxes.
///
/// # Type Parameters
///
/// * `T`: The type of data associated with each node in the tree.
pub struct AabbTree<T> {
root: Option<usize>,
</incorrect_output>
<corrected_output improvement="Generation stops before repeating content after the rewrite_this section">
/// Represents an Axis-Aligned Bounding Box (AABB) tree data structure.
///
/// This structure is used for efficient spatial queries and collision detection.
/// It organizes objects in a hierarchical tree structure based on their bounding boxes.
///
/// # Type Parameters
///
/// * `T`: The type of data associated with each node in the tree.
</corrected_output>
</example>
<example>
<input>
<document>
import math
def calculate_circle_area(radius):
"""Calculate the area of a circle given its radius."""
return math.pi * radius ** 2
<rewrite_this>
<insert_here></insert_here>
</rewrite_this>
class Circle:
def __init__(self, radius):
self.radius = radius
def area(self):
return math.pi * self.radius ** 2
def circumference(self):
return 2 * math.pi * self.radius
# Usage example
circle = Circle(5)
print(f"Area: {circle.area():.2f}")
print(f"Circumference: {circle.circumference():.2f}")
</document>
<prompt>
write docs
</prompt>
</input>
<incorrect_output failure="Over-generation. The text starting with `class Circle:` is *after* the rewrite_this tag">
"""
Represents a circle with methods to calculate its area and circumference.
This class provides a simple way to work with circles in a geometric context.
It allows for the creation of Circle objects with a specified radius and
offers methods to compute the circle's area and circumference.
Attributes:
radius (float): The radius of the circle.
Methods:
area(): Calculates and returns the area of the circle.
circumference(): Calculates and returns the circumference of the circle.
"""
class Circle:
</incorrect_output>
<corrected_output improvement="Generation stops before repeating content after the rewrite_this section">
"""
Represents a circle with methods to calculate its area and circumference.
This class provides a simple way to work with circles in a geometric context.
It allows for the creation of Circle objects with a specified radius and
offers methods to compute the circle's area and circumference.
Attributes:
radius (float): The radius of the circle.
Methods:
area(): Calculates and returns the area of the circle.
circumference(): Calculates and returns the circumference of the circle.
"""
</corrected_output>
</example>
<example>
<input>
<document>
class BankAccount {
private balance: number;
constructor(initialBalance: number) {
this.balance = initialBalance;
}
<rewrite_this>
<insert_here></insert_here>
</rewrite_this>
deposit(amount: number): void {
if (amount > 0) {
this.balance += amount;
}
}
withdraw(amount: number): boolean {
if (amount > 0 && this.balance >= amount) {
this.balance -= amount;
return true;
}
return false;
}
getBalance(): number {
return this.balance;
}
}
// Usage
const account = new BankAccount(1000);
account.deposit(500);
console.log(account.getBalance()); // 1500
account.withdraw(200);
console.log(account.getBalance()); // 1300
</document>
<prompt>
//
</prompt>
</input>
<incorrect_output failure="Over-generation. The text starting with `deposit(amount: number): void {` is *after* the rewrite_this tag">
/**
* Deposits the specified amount into the bank account.
*
* @param amount The amount to deposit. Must be a positive number.
* @throws Error if the amount is not positive.
*/
deposit(amount: number): void {
if (amount > 0) {
this.balance += amount;
} else {
throw new Error("Deposit amount must be positive");
}
}
</incorrect_output>
<corrected_output improvement="Generation stops before repeating content after the rewrite_this section">
/**
* Deposits the specified amount into the bank account.
*
* @param amount The amount to deposit. Must be a positive number.
* @throws Error if the amount is not positive.
*/
</corrected_output>
</example>
<example>
<input>
<document>
use std::collections::VecDeque;
pub struct BinaryTree<T> {
root: Option<Node<T>>,
}
<rewrite_this>
<insert_here></insert_here>
</rewrite_this>
struct Node<T> {
value: T,
left: Option<Box<Node<T>>>,
right: Option<Box<Node<T>>>,
}
</document>
<prompt>
derive clone
</prompt>
</input>
<incorrect_output failure="Over-generation below the rewrite_this tags. Extra space between derive annotation and struct definition.">
#[derive(Clone)]
struct Node<T> {
value: T,
left: Option<Box<Node<T>>>,
right: Option<Box<Node<T>>>,
}
</incorrect_output>
<incorrect_output failure="Over-generation above the rewrite_this tags">
pub struct BinaryTree<T> {
root: Option<Node<T>>,
}
#[derive(Clone)]
</incorrect_output>
<incorrect_output failure="Over-generation below the rewrite_this tags">
#[derive(Clone)]
struct Node<T> {
value: T,
left: Option<Box<Node<T>>>,
right: Option<Box<Node<T>>>,
}
impl<T> Node<T> {
fn new(value: T) -> Self {
Node {
value,
left: None,
right: None,
}
}
}
</incorrect_output>
<corrected_output improvement="Only includes the new content within the rewrite_this tags">
#[derive(Clone)]
</corrected_output>
</example>
<example>
<input>
<document>
import math
def calculate_circle_area(radius):
"""Calculate the area of a circle given its radius."""
return math.pi * radius ** 2
<rewrite_this>
<insert_here></insert_here>
</rewrite_this>
class Circle:
def __init__(self, radius):
self.radius = radius
def area(self):
return math.pi * self.radius ** 2
def circumference(self):
return 2 * math.pi * self.radius
# Usage example
circle = Circle(5)
print(f"Area: {circle.area():.2f}")
print(f"Circumference: {circle.circumference():.2f}")
</document>
<prompt>
add dataclass decorator
</prompt>
</input>
<incorrect_output failure="Over-generation. The text starting with `class Circle:` is *after* the rewrite_this tag">
@dataclass
class Circle:
radius: float
def __init__(self, radius):
self.radius = radius
def area(self):
return math.pi * self.radius ** 2
</incorrect_output>
<corrected_output improvement="Generation stops before repeating content after the rewrite_this section">
@dataclass
</corrected_output>
</example>
<example>
<input>
<document>
interface ShoppingCart {
items: string[];
total: number;
}
<rewrite_this>
<insert_here></insert_here>class ShoppingCartManager {
</rewrite_this>
private cart: ShoppingCart;
constructor() {
this.cart = { items: [], total: 0 };
}
addItem(item: string, price: number): void {
this.cart.items.push(item);
this.cart.total += price;
}
getTotal(): number {
return this.cart.total;
}
}
// Usage
const manager = new ShoppingCartManager();
manager.addItem("Book", 15.99);
console.log(manager.getTotal()); // 15.99
</document>
<prompt>
add readonly modifier
</prompt>
</input>
<incorrect_output failure="Over-generation. The line starting with ` items: string[];` is *after* the rewrite_this tag">
readonly interface ShoppingCart {
items: string[];
total: number;
}
class ShoppingCartManager {
private readonly cart: ShoppingCart;
constructor() {
this.cart = { items: [], total: 0 };
}
</incorrect_output>
<corrected_output improvement="Only includes the new content within the rewrite_this tags and integrates cleanly into surrounding code">
readonly interface ShoppingCart {
</corrected_output>
</example>
</examples>
With these examples in mind, edit the following file:
<document language="{{ language_name }}">
{{{ document_content }}}
</document>
{{#if is_truncated}}
The context around the relevant section has been truncated (possibly in the middle of a line) for brevity.
The provided document has been truncated (potentially mid-line) for brevity.
{{/if}}
Rewrite the section of {{content_type}} in <rewrite_this></rewrite_this> tags based on the following prompt:
<prompt>
{{{user_prompt}}}
</prompt>
Here's the section to edit based on that prompt again for reference:
<rewrite_this>
{{{rewrite_section}}}
</rewrite_this>
You'll rewrite this entire section, but you will only make changes within certain subsections.
<instructions>
{{#if has_insertion}}
Insert text anywhere you see it marked with with <insert_here></insert_here> tags. Do not include <insert_here> tags in your output.
Insert text anywhere you see marked with <insert_here></insert_here> tags. It's CRITICAL that you DO NOT include <insert_here> tags in your output.
{{/if}}
{{#if has_replacement}}
Edit edit text that you see surrounded with <edit_here></edit_here> tags. Do not include <edit_here> tags in your output.
Edit text that you see surrounded with <edit_here>...</edit_here> tags. It's CRITICAL that you DO NOT include <edit_here> tags in your output.
{{/if}}
Make no changes to the rewritten content outside these tags.
<snippet language="Rust" annotated="true">
{{{ rewrite_section_prefix }}}
<rewrite_this>
{{{rewrite_section_with_selections}}}
{{{ rewrite_section_with_edits }}}
</rewrite_this>
{{{ rewrite_section_suffix }}}
</snippet>
Only make changes that are necessary to fulfill the prompt, leave everything else as-is. All surrounding {{content_type}} will be preserved. Do not output the <rewrite_this></rewrite this> tags or anything outside of them.
Rewrite the lines enclosed within the <rewrite_this></rewrite_this> tags in accordance with the provided instructions and the prompt below.
Start at the indentation level in the original file in the rewritten {{content_type}}. Don't stop until you've rewritten the entire section, even if you have no more changes to make. Always write out the whole section with no unnecessary elisions.
<prompt>
{{{ user_prompt }}}
</prompt>
Do not include <insert_here> or <edit_here> annotations in your output. Here is a clean copy of the snippet without annotations for your reference.
<snippet>
{{{ rewrite_section_prefix }}}
{{{ rewrite_section }}}
{{{ rewrite_section_suffix }}}
</snippet>
</instructions>
<guidelines_reminder>
1. Focus on necessary changes: Modify only what's required to fulfill the prompt.
2. Preserve context: Maintain all surrounding content as-is, ensuring the rewritten section seamlessly integrates with the existing document structure and flow.
3. Exclude annotation tags: Do not output <rewrite_this>, </rewrite_this>, <edit_here>, or <insert_here> tags.
4. Maintain indentation: Begin at the original file's indentation level.
5. Complete rewrite: Continue until the entire section is rewritten, even if no further changes are needed.
6. Avoid elisions: Always write out the full section without unnecessary omissions. NEVER say `// ...` or `// ...existing code` in your output.
7. Respect content boundaries: Preserve code integrity.
</guidelines_reminder>
Immediately start with the following format with no remarks:
```
\{{REWRITTEN_CODE}}
{{REWRITTEN_CODE}}
```


@ -45,6 +45,7 @@ use std::{
task::{self, Poll},
time::{Duration, Instant},
};
use text::OffsetRangeExt as _;
use theme::ThemeSettings;
use ui::{prelude::*, CheckboxWithLabel, IconButtonShape, Popover, Tooltip};
use util::{RangeExt, ResultExt};
@ -2354,6 +2355,15 @@ impl Codegen {
return Err(anyhow::anyhow!("invalid transformation range"));
};
let mut transform_context_range = transform_range.to_point(&transform_buffer);
transform_context_range.start.row = transform_context_range.start.row.saturating_sub(3);
transform_context_range.start.column = 0;
transform_context_range.end =
(transform_context_range.end + Point::new(3, 0)).min(transform_buffer.max_point());
transform_context_range.end.column =
transform_buffer.line_len(transform_context_range.end.row);
let transform_context_range = transform_context_range.to_offset(&transform_buffer);
let selected_ranges = self
.selected_ranges
.iter()
@ -2376,6 +2386,7 @@ impl Codegen {
transform_buffer,
transform_range,
selected_ranges,
transform_context_range,
)
.map_err(|e| anyhow::anyhow!("Failed to generate content prompt: {}", e))?;


@ -16,7 +16,9 @@ pub struct ContentPromptContext {
pub document_content: String,
pub user_prompt: String,
pub rewrite_section: String,
pub rewrite_section_with_selections: String,
pub rewrite_section_prefix: String,
pub rewrite_section_suffix: String,
pub rewrite_section_with_edits: String,
pub has_insertion: bool,
pub has_replacement: bool,
}
@ -173,6 +175,7 @@ impl PromptBuilder {
buffer: BufferSnapshot,
transform_range: Range<usize>,
selected_ranges: Vec<Range<usize>>,
transform_context_range: Range<usize>,
) -> Result<String, RenderError> {
let content_type = match language_name {
None | Some("Markdown" | "Plain Text") => "text",
@ -202,6 +205,7 @@ impl PromptBuilder {
for chunk in buffer.text_for_range(truncated_before) {
document_content.push_str(chunk);
}
document_content.push_str("<rewrite_this>\n");
for chunk in buffer.text_for_range(transform_range.clone()) {
document_content.push_str(chunk);
@ -217,7 +221,17 @@ impl PromptBuilder {
rewrite_section.push_str(chunk);
}
let rewrite_section_with_selections = {
let mut rewrite_section_prefix = String::new();
for chunk in buffer.text_for_range(transform_context_range.start..transform_range.start) {
rewrite_section_prefix.push_str(chunk);
}
let mut rewrite_section_suffix = String::new();
for chunk in buffer.text_for_range(transform_range.end..transform_context_range.end) {
rewrite_section_suffix.push_str(chunk);
}
let rewrite_section_with_edits = {
let mut section_with_selections = String::new();
let mut last_end = 0;
for selected_range in &selected_ranges {
@ -254,7 +268,9 @@ impl PromptBuilder {
document_content,
user_prompt,
rewrite_section,
rewrite_section_with_selections,
rewrite_section_prefix,
rewrite_section_suffix,
rewrite_section_with_edits,
has_insertion,
has_replacement,
};