ProTeGi Optimizer

A guide to the ProTeGi (Prompt Optimization with Textual Gradients) optimizer.

ProTeGi (Prompt optimization with Textual Gradients) systematically improves prompts by identifying failure patterns, generating targeted critiques, and applying specific fixes. It uses beam search to maintain multiple candidate prompts and progressively refines them.


When to Use ProTeGi

✅ Best For

  • Debugging specific failure modes
  • Systematic error correction
  • Tasks with clear failure patterns
  • Iterative refinement workflows

❌ Not Ideal For

  • Quick experiments (multi-stage process)
  • Tasks where failures are random
  • Very small datasets
  • Budget-constrained projects

How It Works

ProTeGi follows a structured expansion and selection process:

Identify Failures

Run current prompts and identify examples with low scores

Generate Critiques

Teacher model analyzes failures and generates multiple specific critiques (“gradients”)

Apply Improvements

For each critique, generate improved prompt variations

Beam Selection

Evaluate all candidates and keep top N prompts

Iterate

Repeat expansion from the best-performing prompts

Note

ProTeGi maintains a “beam” of candidate prompts throughout optimization, preventing premature convergence to local optima.
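The five steps above can be sketched as a single expand-and-select round. This is a minimal illustration, not the library's implementation; `score`, `critique`, and `apply_fix` are hypothetical stand-ins for the evaluator and teacher-model calls:

```python
def protegi_round(beam, score, critique, apply_fix,
                  num_gradients=4, prompts_per_gradient=1, beam_size=4):
    """One ProTeGi expansion/selection round over a beam of candidate prompts."""
    candidates = list(beam)  # current prompts compete with their own children
    for prompt in beam:
        # Steps 1-2: the teacher inspects failures and returns critiques ("gradients")
        for gradient in critique(prompt, num_gradients):
            # Step 3: generate improved prompt variations for each critique
            candidates += [apply_fix(prompt, gradient)
                           for _ in range(prompts_per_gradient)]
    # Step 4: beam selection -- score all candidates, keep the top N
    candidates.sort(key=score, reverse=True)
    return candidates[:beam_size]

# Step 5 is repeated application:
# beam = ["Answer the question: {question}"]
# for _ in range(num_rounds):
#     beam = protegi_round(beam, score, critique, apply_fix)
```

Because the incumbent beam is included among the candidates, a round can never make the best prompt worse under the evaluation metric.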


Basic Usage

from fi.opt.optimizers import ProTeGi
from fi.opt.generators import LiteLLMGenerator
from fi.opt.datamappers import BasicDataMapper
from fi.opt.base.evaluator import Evaluator

# Setup teacher model
teacher = LiteLLMGenerator(
    model="gpt-4o",
    prompt_template="{prompt}"
)

# Setup evaluator
evaluator = Evaluator(
    eval_template="context_relevance",
    eval_model_name="turing_flash",
    fi_api_key="your_key",
    fi_secret_key="your_secret"
)

# Setup data mapper
data_mapper = BasicDataMapper(
    key_map={"input": "question", "output": "generated_output"}
)

# Create optimizer
optimizer = ProTeGi(
    teacher_generator=teacher,
    num_gradients=4,
    errors_per_gradient=4,
    prompts_per_gradient=1,
    beam_size=4
)

# Run optimization
result = optimizer.optimize(
    evaluator=evaluator,
    data_mapper=data_mapper,
    dataset=dataset,
    initial_prompts=["Answer the question: {question}"],
    num_rounds=3,
    eval_subset_size=32
)
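For reference, the key_map above tells the data mapper which dataset fields feed which evaluator inputs. A toy illustration of that mapping (the direction, evaluator field filled from dataset column, is an assumption; this is not the library's internal code):

```python
# Hypothetical sketch of what BasicDataMapper's key_map does (not library internals):
# each evaluator field name is filled from the named dataset column.
key_map = {"input": "question", "output": "generated_output"}

dataset_row = {
    "question": "What is beam search?",
    "generated_output": "A search strategy that keeps the top-N candidates.",
}

mapped = {eval_field: dataset_row[data_field]
          for eval_field, data_field in key_map.items()}
```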

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| teacher_generator | LiteLLMGenerator | required | Model used to generate critiques and improved prompts (e.g. gpt-4o) |
| num_gradients | int | 4 | Critiques generated per prompt |
| errors_per_gradient | int | 4 | Failed examples shown to the teacher per critique |
| prompts_per_gradient | int | 1 | New prompts generated per critique (2–3 for more exploration) |
| beam_size | int | 4 | Top prompts kept each round |
| num_rounds | int | 3 | Optimization rounds (passed to optimize()) |
| eval_subset_size | int | None | Examples evaluated per round; None = full dataset |
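These parameters determine per-round cost. The sketch below is a rough budget estimate implied by the algorithm description, assuming each beam prompt is expanded exactly as described; the library's actual call counts may differ (e.g. through batching or caching):

```python
def round_budget(beam_size=4, num_gradients=4, prompts_per_gradient=1,
                 eval_subset_size=32):
    """Rough per-round call counts implied by the algorithm description."""
    new_prompts = beam_size * num_gradients * prompts_per_gradient
    # one teacher call per critique, plus one per rewritten prompt
    teacher_calls = beam_size * num_gradients + new_prompts
    # every incumbent and new candidate is scored on the evaluation subset
    eval_calls = (beam_size + new_prompts) * eval_subset_size
    return {"new_prompts": new_prompts,
            "teacher_calls": teacher_calls,
            "eval_calls": eval_calls}

# With the defaults: 16 new prompts, 32 teacher calls,
# and 640 evaluator calls per round.
```

This is why the tips below suggest eval_subset_size or beam_size as the first levers when optimization is slow: evaluator calls scale with their product.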

Tips:

  • Use a strong teacher model; a beam_size of 3–4 is a good default.
  • If scores plateau, increase beam_size or num_gradients.
  • If optimization is slow, set eval_subset_size=20 or reduce beam_size.


Underlying Research

ProTeGi was introduced in Pryzant et al., “Automatic Prompt Optimization with ‘Gradient Descent’ and Beam Search” (EMNLP 2023). It adapts concepts from numerical optimization to natural language: textual critiques play the role of gradients, and beam search replaces gradient descent's single update trajectory.

