Using the Python SDK
Run prompt optimization from code with the agent-opt Python library.
What it is
Using the Python SDK means running prompt optimization programmatically with the agent-opt library (pip install agent-opt). You write Python that defines a dataset (list of dicts), an Evaluator (eval template + model for scoring), a DataMapper (dataset keys → eval inputs), and an Optimizer (e.g. Random Search, Meta-Prompt, ProTeGi, GEPA, Bayesian Search, PromptWizard). You call optimizer.optimize(...) and get back the best prompt and its scores. The SDK gives you full control over which optimizer and parameters to use, so you can automate runs, plug into CI, or try multiple strategies in code. Unlike the platform UI, everything lives in your script.
Use cases
- Automation — Run optimization from scripts or CI; no UI.
- Choice of optimizer — Use Random Search, Bayesian, Meta-Prompt, ProTeGi, GEPA, or PromptWizard and tune their parameters in code.
- Custom data — Keep your dataset in code (list of dicts) or load it from your own storage.
- Reproducibility — Version your optimization config and dataset with your repo.
- Advanced config — Set eval subset size, initial prompts, task description, and optimizer-specific options.
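As a small illustration of the reproducibility point above, the run configuration can be captured as plain data and committed alongside the dataset. The field names in this sketch are our own, chosen to mirror the options shown later on this page; they are not an agent-opt API:

```python
import json

# Illustrative run config -- field names are our own, not an agent-opt API.
run_config = {
    "optimizer": "RandomSearchOptimizer",
    "teacher_model": "gpt-4o",
    "num_variations": 5,
    "eval_template": "summary_quality",
    "eval_subset_size": 10,
}

# Serialize deterministically so diffs stay readable under version control.
serialized = json.dumps(run_config, indent=2, sort_keys=True)
print(serialized)
```

Checking this file into the same repo as your dataset means any past run can be re-created from a single commit.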
Core concepts
The library is built around four components that work together:
| Component | Role |
|---|---|
| Optimizer | Drives the improvement process. You pick one (e.g. RandomSearchOptimizer, MetaPromptOptimizer, GEPAOptimizer) based on your task. |
| Evaluator | Scores prompt outputs using a specified eval template and model (e.g. Future AGI’s turing_flash). |
| DataMapper | Maps your dataset fields to the keys the optimizer and evaluator expect (e.g. input → article, output → generated_output). |
| Dataset | A list of dicts; each item is one example (e.g. {"article": "...", "target_summary": "..."}). |
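To make the DataMapper row concrete, here is a plain-Python sketch of the remapping it performs. This mimics the key_map idea used by BasicDataMapper later on this page; it is not the library's implementation:

```python
def remap(example, key_map):
    """Rename one dataset row's fields to the keys the evaluator expects.

    key_map: {eval_key: dataset_key}, mirroring the key_map passed to
    BasicDataMapper (a simplified stand-in, not the library's code).
    """
    return {eval_key: example[dataset_key]
            for eval_key, dataset_key in key_map.items()}

example = {"article": "JWST captured new images...",
           "generated_output": "JWST took new pictures."}
mapped = remap(example, {"input": "article", "output": "generated_output"})
print(mapped)  # {'input': 'JWST captured new images...', 'output': 'JWST took new pictures.'}
```

The same idea applied per row is what lets one eval template score datasets with arbitrary column names.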
How to
Install and set API keys
Install the library and set environment variables so the evaluator can call Future AGI (and so your generator can call your LLM provider if needed).
pip install agent-opt

export FI_API_KEY="your_api_key"
export FI_SECRET_KEY="your_secret_key"

You can also pass fi_api_key and fi_secret_key into the Evaluator instead of using env vars.
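If you rely on environment variables, a quick pre-flight check avoids failing mid-run. This helper is our own sketch (not part of agent-opt), using only the variable names documented above:

```python
import os

# Required Future AGI credential variables (names from the docs above).
REQUIRED = ("FI_API_KEY", "FI_SECRET_KEY")

def missing_keys(env=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED if not env.get(name)]

# Example: with an empty environment, both keys are reported missing.
print(missing_keys(env={}))  # → ['FI_API_KEY', 'FI_SECRET_KEY']
```

Call missing_keys() at the top of your script and raise early if it returns anything.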
Prepare your dataset
Build a list of dicts. Each dict is one example; keys should match what your prompt and DataMapper use (e.g. article, target_summary for summarization).
dataset = [
{"article": "The James Webb Space Telescope has captured...", "target_summary": "The JWST has taken new pictures."},
{"article": "Researchers have discovered a new enzyme...", "target_summary": "A new enzyme that rapidly breaks down plastics has been found."},
# ... more rows
]

Configure the Evaluator and DataMapper
The Evaluator scores each prompt’s outputs. The DataMapper maps your dataset keys to the eval’s expected keys (input, output, etc.).
from fi.opt.base.evaluator import Evaluator
from fi.opt.datamappers import BasicDataMapper
evaluator = Evaluator(
eval_template="summary_quality",
eval_model_name="turing_flash",
fi_api_key="your_key", # or rely on env FI_API_KEY
fi_secret_key="your_secret" # or rely on env FI_SECRET_KEY
)
data_mapper = BasicDataMapper(
key_map={"input": "article", "output": "generated_output"}
)

Choose and initialize an optimizer
Pick an optimizer that fits your task (e.g. Random Search for a quick baseline, Meta-Prompt for deep refinement, GEPA for production-grade results). Some optimizers need a generator or teacher model; others take model names and config.
Example: Random Search (simple baseline)
from fi.opt.optimizers import RandomSearchOptimizer
from fi.opt.generators import LiteLLMGenerator
initial_generator = LiteLLMGenerator(
model="gpt-4o-mini",
prompt_template="Summarize this: {article}"
)
optimizer = RandomSearchOptimizer(
generator=initial_generator,
teacher_model="gpt-4o",
num_variations=5
)

Example: Meta-Prompt (deep reasoning)
from fi.opt.optimizers import MetaPromptOptimizer
from fi.opt.generators import LiteLLMGenerator
teacher = LiteLLMGenerator(model="gpt-4o", prompt_template="{prompt}")
optimizer = MetaPromptOptimizer(teacher_generator=teacher, num_rounds=5)

For a detailed comparison, see the Optimizers overview.
Run the optimization
Call optimizer.optimize() with the evaluator, data mapper, dataset, and any optimizer-specific options (e.g. initial_prompts, eval_subset_size, task_description).
initial_prompt = "Summarize the following article: {article}"
result = optimizer.optimize(
evaluator=evaluator,
data_mapper=data_mapper,
dataset=dataset,
initial_prompts=[initial_prompt],
task_description="Generate a concise, one-sentence summary of the article.",
eval_subset_size=10 # Use a subset of the data for faster evaluation per round
)

Analyze the results
Use the returned result object: result.final_score, result.best_generator.get_prompt_template(), and result.history for each round or variation.
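For instance, the history can be scanned to find the highest-scoring round. The dataclass below is only a stand-in that mimics the average_score and prompt attributes mentioned above; it is not the SDK's result type:

```python
from dataclasses import dataclass

@dataclass
class Round:
    """Stand-in for one history entry (attribute names from the docs above)."""
    prompt: str
    average_score: float

history = [
    Round("Summarize: {article}", 0.71),
    Round("Write a one-sentence summary of: {article}", 0.84),
    Round("Summarize the article briefly: {article}", 0.79),
]

# Pick the round with the highest average score.
best = max(history, key=lambda r: r.average_score)
print(best.prompt)  # → Write a one-sentence summary of: {article}
```

The same max-by-score pattern works on result.history from a real run.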
# Print the final score and the best prompt found
print(f"Final Score: {result.final_score:.4f}")
print(f"Best Prompt:\n{result.best_generator.get_prompt_template()}")
# Review the history of the optimization
for i, iteration in enumerate(result.history):
    print(f"\n--- Round {i+1} ---")
    print(f"Score: {iteration.average_score:.4f}")
    print(f"Prompt: {iteration.prompt}")

What you can do next
Optimize your first prompt
Minimal end-to-end example with Random Search.
Optimizers overview
Compare algorithms and choose the right one.
Using the platform
Run optimization from the UI instead of code.
Optimization overview
Platform vs SDK and when to use which.
agent-opt on GitHub
Source code, advanced features, and contributing.