Bayesian Search Optimizer
Use Bayesian optimization for few-shot prompt tuning: the optimizer learns from each trial to pick better example sets and configurations.
Bayesian Search uses Bayesian optimization (via Optuna) to explore the space of few-shot prompt configurations. It learns from each trial to choose which examples and configurations to try next, so it converges faster than random search.
When to Use Bayesian Search
✅ Best For
- Few-shot learning tasks
- Structured Q&A or classification
- Limited evaluation budget
❌ Not Ideal For
- Tasks without examples in dataset
- Purely zero-shot or very creative tasks
- Tiny datasets (< 10 examples)
How It Works
1. **Define the search space**: set the range of few-shot examples (e.g. 2–8) and optional formatting.
2. **Sample a configuration**: the optimizer suggests how many examples to use and which ones.
3. **Build the prompt and evaluate**: the selected examples are formatted with the base prompt, and outputs are scored on an evaluation subset.
4. **Update and repeat**: results feed the next suggestion until the trial budget is exhausted.
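The loop above can be sketched in plain Python. This is a minimal illustration only: simple random sampling stands in for the Bayesian (Optuna) sampler, and all names here are illustrative, not part of the library's API.

```python
import random

def build_prompt(base_prompt, examples, separator="\n", position="append"):
    """Join formatted examples into a few-shot block around the base prompt."""
    block = separator.join(examples)
    return f"{base_prompt}\n{block}" if position == "append" else f"{block}\n{base_prompt}"

def optimize(base_prompt, pool, score_fn, n_trials=10, min_k=2, max_k=4, seed=42):
    rng = random.Random(seed)
    best_score, best_prompt = float("-inf"), None
    for _ in range(n_trials):
        k = rng.randint(min_k, min(max_k, len(pool)))  # sample a configuration
        examples = rng.sample(pool, k)                 # pick which examples to use
        prompt = build_prompt(base_prompt, examples)   # build the prompt
        score = score_fn(prompt)                       # evaluate
        if score > best_score:                         # update and repeat
            best_score, best_prompt = score, prompt
    return best_prompt, best_score

# Toy scorer: prefers prompts containing more example text.
pool = ["Q: 2+2?\nA: 4", "Q: Capital of France?\nA: Paris", "Q: 3*3?\nA: 9"]
prompt, score = optimize("Answer the question:", pool, score_fn=len, n_trials=5)
```

A real run replaces `score_fn` with model inference plus evaluation, and the random sampler with one that conditions each suggestion on past trial scores.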
Basic Usage
```python
from fi.opt.optimizers import BayesianSearchOptimizer
from fi.opt.datamappers import BasicDataMapper
from fi.opt.base.evaluator import Evaluator

# Set up the evaluator
evaluator = Evaluator(
    eval_template="summary_quality",
    eval_model_name="turing_flash",
    fi_api_key="your_key",
    fi_secret_key="your_secret",
)

# Set up the data mapper
data_mapper = BasicDataMapper(
    key_map={"input": "text", "output": "generated_output"}
)

# Create the optimizer
optimizer = BayesianSearchOptimizer(
    inference_model_name="gpt-4o-mini",
    n_trials=20,
    min_examples=2,
    max_examples=8,
)

# Run the optimization
result = optimizer.optimize(
    evaluator=evaluator,
    data_mapper=data_mapper,
    dataset=dataset,
    initial_prompts=["Summarize: {text}"],
)
```
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `min_examples` | int | `2` | Minimum few-shot examples per trial |
| `max_examples` | int | `8` | Maximum few-shot examples per trial |
| `allow_repeats` | bool | `False` | Allow the same example to appear multiple times in the few-shot block |
| `fixed_example_indices` | List[int] | `[]` | Example indices always included (e.g. `[0, 5]`) |
| `n_trials` | int | `10` | Number of configurations to try |
| `seed` | int | `42` | Random seed |
| `direction` | str | `"maximize"` | `"maximize"` for scores, `"minimize"` for loss |
| `inference_model_name` | str | `"gpt-4o-mini"` | Model used to generate outputs |
| `example_template` | str | `None` | Template per example, e.g. `"Q: {question}\nA: {answer}"` |
| `example_separator` | str | `"\n"` | String placed between examples in the few-shot block |
| `few_shot_position` | str | `"append"` | `"append"` or `"prepend"` relative to the base prompt |
| `infer_example_template_via_teacher` | bool | `False` | Use a teacher model to infer the example format (adds API cost) |
| `teacher_model_name` | str | `"gpt-5"` | Model used for template inference when enabled |
| `eval_subset_size` | int | `None` | Examples evaluated per trial; `None` = full dataset |
| `eval_subset_strategy` | str | `"random"` | `"random"`, `"first"`, or `"all"` |
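The formatting parameters compose in a straightforward way. The snippet below illustrates the documented semantics of `example_template`, `example_separator`, and `few_shot_position` in plain Python; it makes no library calls and the data is made up.

```python
# Two dataset rows with the fields referenced by the template.
rows = [
    {"question": "2+2?", "answer": "4"},
    {"question": "Capital of France?", "answer": "Paris"},
]
example_template = "Q: {question}\nA: {answer}"
example_separator = "\n\n"

# Each row is rendered with example_template, then joined with example_separator.
few_shot_block = example_separator.join(example_template.format(**r) for r in rows)

# few_shot_position="append" places the block after the base prompt.
base_prompt = "Answer concisely: {text}"
prompt = base_prompt + "\n\n" + few_shot_block
```

Note that every placeholder in `example_template` must exist as a field in your dataset rows, or formatting will fail.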
Key Concepts
- **Result and history**: `result.final_score` and `result.best_generator.get_prompt_template()` give the best run. `result.history` holds per-trial scores and prompts for analysis.
- **Template inference**: Set `infer_example_template_via_teacher=True` when you're unsure how to format examples; the teacher model proposes a format. You can reuse that format in later runs with `example_template` to save cost.
- **Fixed examples**: Use `fixed_example_indices=[0, 5]` to always include specific examples while the optimizer varies the rest.
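A typical use of per-trial history is finding which configurations scored well. The sketch below assumes `result.history` behaves like a list of per-trial records with score and prompt fields; the exact record shape may differ, so the data here is mocked for illustration.

```python
# Mocked stand-in for result.history (illustrative shape, not the real API).
history = [
    {"trial": 0, "score": 0.61, "prompt": "Summarize: {text}\n<2 examples>"},
    {"trial": 1, "score": 0.74, "prompt": "Summarize: {text}\n<5 examples>"},
    {"trial": 2, "score": 0.69, "prompt": "Summarize: {text}\n<3 examples>"},
]

# Best trial overall, and all trials sorted by score for inspection.
best = max(history, key=lambda t: t["score"])
ranked = sorted(history, key=lambda t: t["score"], reverse=True)
```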
Tips
- Start with `n_trials=10`; use 20–30 trials for production runs.
- Use `eval_subset_size=20` on large datasets to keep trials cheap.
- Template errors: ensure every field referenced in `example_template` exists in your data.
- If scores plateau, try `infer_example_template_via_teacher=True` or increase `max_examples`.
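The effect of `eval_subset_size` and `eval_subset_strategy` can be sketched as a simple selection step. This is an illustrative stand-in for the documented behavior, not the library's implementation; the function name is hypothetical.

```python
import random

def select_eval_subset(dataset, size=None, strategy="random", seed=42):
    """Pick the rows scored on each trial (mirrors eval_subset_size/strategy)."""
    if size is None or strategy == "all" or size >= len(dataset):
        return list(dataset)                              # evaluate everything
    if strategy == "first":
        return list(dataset[:size])                       # cheapest, but biased
    return random.Random(seed).sample(list(dataset), size)  # "random"

data = list(range(100))
subset = select_eval_subset(data, size=20, strategy="random")
```

A fixed seed keeps the subset stable across trials, so scores are comparable between configurations.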
Underlying Research
Bayesian Search builds on established principles of Bayesian optimization, adapted for the unique challenges of prompt engineering.
- Core Concept: The method is detailed in papers like “A Bayesian approach for prompt optimization in pre-trained models”, which explores mapping discrete prompts to continuous embeddings for more efficient searching.
- Few-Shot Learning: Its application in few-shot scenarios is highlighted by tools like Comet's Opik, which features a "Few-Shot Bayesian Optimizer".
- Advanced Implementations: Recent research, such as "Searching for Optimal Solutions with LLMs via Bayesian Optimization (BOPRO)", investigates using Bayesian optimization to navigate complex LLM search spaces. The popular `BayesianOptimization` library on GitHub provides the foundational Gaussian-process-based modeling.
This approach is noted for its efficiency in prominent frameworks like DSPy and is recognized in surveys for its effectiveness in few-shot learning contexts.