This guide provides a comprehensive walkthrough of how to use the agent-opt library to automate the improvement of your workflows. You’ll learn how to set up the necessary components, choose the right optimization strategy, run the process, and analyze the results.

1. Installation

First, install the agent-opt library using pip:
pip install agent-opt
You will also need to have your API keys for the desired language models set as environment variables.
export FI_API_KEY="your_api_key"
export FI_SECRET_KEY="your_secret_key"
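If you prefer to fail fast, a quick standard-library check (nothing agent-opt-specific) confirms the keys are visible to your Python process before you start a run:

import os

for key in ("FI_API_KEY", "FI_SECRET_KEY"):
    if not os.getenv(key):
        raise RuntimeError(f"Missing environment variable: {key}")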

2. Core Concepts

The library is built around a few key components that work together (a condensed sketch of how they connect appears after these descriptions):

Optimizer

The engine that drives the improvement process. You choose an optimizer based on your specific task (e.g., BayesianSearchOptimizer for few-shot tasks or GEPAOptimizer for complex reasoning).

Evaluator

The component responsible for scoring the quality of prompt outputs. It uses a specified model and an evaluation template to judge how well a prompt is performing.

DataMapper

A utility that maps the fields from your dataset to the keys expected by the optimizer and evaluator, ensuring the data flows correctly through the system.

Dataset

A simple list of dictionaries that serves as the ground truth for your optimization. Each item in the list represents a data point for evaluation.
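Before the detailed walkthrough, here is a condensed sketch of how the four components connect. It uses the same class names and arguments as Section 3 below, so treat it as orientation rather than a standalone recipe:

from fi.opt.base.evaluator import Evaluator
from fi.opt.datamappers import BasicDataMapper
from fi.opt.optimizers import MetaPromptOptimizer
from fi.opt.generators import LiteLLMGenerator

# Dataset: ground-truth records the optimizer evaluates against
dataset = [{"article": "...", "target_summary": "..."}]

# Evaluator: scores each generated output
evaluator = Evaluator(eval_template="summary_quality", eval_model_name="turing_flash",
                      fi_api_key="your_key", fi_secret_key="your_secret")

# DataMapper: routes dataset fields to the keys the optimizer expects
data_mapper = BasicDataMapper(key_map={"input": "article", "output": "generated_output"})

# Optimizer: drives the refinement loop, guided by a teacher model
optimizer = MetaPromptOptimizer(
    teacher_generator=LiteLLMGenerator(model="gpt-4o", prompt_template="{prompt}"))

result = optimizer.optimize(evaluator=evaluator, data_mapper=data_mapper,
                            dataset=dataset, initial_prompts=["Summarize: {article}"])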

3. Step-by-Step Guide to Optimization

Let’s walk through a complete example of optimizing a summarization workflow.

Step 1: Prepare Your Dataset

Your dataset is a standard Python list of dictionaries. Each dictionary should contain the necessary fields for your task. For a summarization task, you might have an article and a target_summary.
dataset = [
    {
        "article": "The James Webb Space Telescope has captured stunning new images of the Pillars of Creation...",
        "target_summary": "The JWST has taken new pictures of the Pillars of Creation."
    },
    {
        "article": "Researchers have discovered a new enzyme that can break down plastics at record speed...",
        "target_summary": "A new enzyme that rapidly breaks down plastics has been found."
    },
    # ... more data points
]

Step 2: Configure the Evaluator

The Evaluator scores the outputs generated by your prompts. You need to provide it with an evaluation template and the model to use for scoring.
from fi.opt.base.evaluator import Evaluator

evaluator = Evaluator(
    eval_template="summary_quality",  # A built-in template for summarization
    eval_model_name="turing_flash",   # The model to perform the evaluation
    fi_api_key="your_key",
    fi_secret_key="your_secret"
)

Step 3: Configure the DataMapper

The DataMapper tells the optimizer how to find the input and output values within your dataset.
from fi.opt.datamappers import BasicDataMapper

data_mapper = BasicDataMapper(
    key_map={
        "input": "article",          # Maps the 'input' to the 'article' field in the dataset
        "output": "generated_output" # The key for the model's generated text
    }
)

Step 4: Choose and Initialize an Optimizer

Select an optimizer that fits your use case. For general-purpose refinement, MetaPromptOptimizer is a great choice.
Not sure which optimizer to use? Check out our Optimizers Overview for a detailed comparison.
from fi.opt.optimizers import MetaPromptOptimizer
from fi.opt.generators import LiteLLMGenerator

# The teacher model is a powerful LLM that guides the optimization
teacher = LiteLLMGenerator(model="gpt-4o", prompt_template="{prompt}")

optimizer = MetaPromptOptimizer(
    teacher_generator=teacher,
    num_rounds=5  # Number of refinement iterations
)

Step 5: Run the Optimization

Now, pass all the components to the optimize method.
initial_prompt = "Summarize the following article: {article}"

result = optimizer.optimize(
    evaluator=evaluator,
    data_mapper=data_mapper,
    dataset=dataset,
    initial_prompts=[initial_prompt],
    task_description="Generate a concise, one-sentence summary of the article.",
    eval_subset_size=10  # Use a subset of the data for faster evaluation per round
)

Step 6: Analyze the Results

The result object contains everything you need to understand the outcome.
# Print the final score and the best prompt found
print(f"Final Score: {result.final_score:.4f}")
print(f"Best Prompt:\n{result.best_generator.get_prompt_template()}")

# Review the history of the optimization
for i, iteration in enumerate(result.history):
    print(f"\n--- Round {i+1} ---")
    print(f"Score: {iteration.average_score:.4f}")
    print(f"Prompt: {iteration.prompt}")

4. Examples for Different Optimizers

Different tasks benefit from different optimization strategies.

Bayesian Search for Few-Shot Optimization

If your task benefits from few-shot examples (e.g., classification, structured data extraction), BayesianSearchOptimizer is the ideal choice. It intelligently finds the best number and combination of examples.
from fi.opt.optimizers import BayesianSearchOptimizer

# Dataset with examples for a classification task
dataset = [
    {"text": "This movie was fantastic!", "label": "Positive"},
    {"text": "I would not recommend this product.", "label": "Negative"},
    # ... more examples
]

# Initialize the optimizer to search for 2 to 5 few-shot examples
bayesian_optimizer = BayesianSearchOptimizer(
    inference_model_name="gpt-4o-mini",
    n_trials=20,          # Number of configurations to test
    min_examples=2,
    max_examples=5,
    example_template="Text: {text}\nSentiment: {label}" # How to format each example
)

# Run the optimization
result = bayesian_optimizer.optimize(
    evaluator=evaluator,
    data_mapper=BasicDataMapper(key_map={"input": "text", "output": "generated_output"}),
    dataset=dataset,
    initial_prompts=["Classify the sentiment of the following text:"]
)

print(f"Best few-shot prompt:\n{result.best_generator.get_prompt_template()}")

ProTeGi for Systematic Error Correction

If you have a prompt that fails in specific, identifiable ways, ProTeGi can systematically debug it. It generates critiques (“textual gradients”) of the failures and applies targeted fixes.
from fi.opt.optimizers import ProTeGi

protegi_optimizer = ProTeGi(
    teacher_generator=LiteLLMGenerator(model="gpt-4o", prompt_template="{prompt}"),
    num_gradients=4,       # Number of critiques to generate per failure
    beam_size=4,           # Number of candidate prompts to maintain
    num_rounds=3
)

# Run the optimization
result = protegi_optimizer.optimize(
    evaluator=evaluator,
    data_mapper=data_mapper,
    dataset=dataset,
    initial_prompts=["Your initial prompt with known issues."],
    eval_subset_size=20
)

print(f"Refined prompt after error correction:\n{result.best_generator.get_prompt_template()}")

5. Next Steps

For a detailed comparison of all available strategies, including GEPAOptimizer for complex reasoning tasks, see the Optimizers Overview.