When to Use Bayesian Search
✅ Best For
- Few-shot learning tasks
- Efficient exploration
- Structured Q&A or classification
- Limited evaluation budget
❌ Not Ideal For
- Tasks without examples in dataset
- Purely zero-shot scenarios
- Very creative/open-ended tasks
- Tiny datasets (< 10 examples)
How It Works
- Few-Shot Selection: Intelligently samples different numbers and combinations of examples from your dataset
- Template Optimization: Can automatically infer the best way to format examples (optional)
- Bayesian Learning: Uses previous trial results to guide future selections
- Efficient Search: Converges faster than random search by learning from history
1. Initialize Search Space: define the range of few-shot examples (e.g., 2-8 examples) and other configuration options
2. Sample Configuration: the Bayesian optimizer suggests how many examples to use and which ones
3. Build Prompt: format the selected examples and combine them with the base prompt
4. Evaluate: generate outputs and score them on the eval subset
5. Update & Repeat: the optimizer learns from the results and suggests the next configuration
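The loop above can be sketched in plain Python. This is a deliberately simplified stand-in: a real Bayesian optimizer fits a surrogate model (e.g. a Gaussian process or TPE) over trial history, whereas this sketch merely biases sampling toward few-shot sizes that have scored well. The dataset, scorer, and all names are illustrative, not part of any library API.

```python
# Simplified sketch of the sample -> build -> evaluate -> update loop.
import random

random.seed(0)
DATASET = [{"q": f"question {i}", "a": f"answer {i}"} for i in range(20)]

def score_prompt(prompt: str) -> float:
    # Stand-in for "generate outputs and score them on the eval subset".
    return random.random()

def build_prompt(examples):
    shots = "\n\n".join(f"Q: {ex['q']}\nA: {ex['a']}" for ex in examples)
    return shots + "\n\nQ: {question}\nA:"

history = []  # one (n_examples, indices, score) record per trial
for trial in range(10):
    if history and random.random() < 0.5:
        # Exploit: reuse the best-scoring few-shot size seen so far ...
        n = max(history, key=lambda h: h[2])[0]
    else:
        # ... or explore a new size from the search space (steps 1-2).
        n = random.randint(2, 8)
    idxs = random.sample(range(len(DATASET)), n)       # which examples to use
    prompt = build_prompt([DATASET[i] for i in idxs])  # step 3: build prompt
    history.append((n, idxs, score_prompt(prompt)))    # step 4: evaluate
# Step 5 happens implicitly: each iteration reads the updated history.

best_n, best_idxs, best_score = max(history, key=lambda h: h[2])
```

A real implementation would replace the exploit/explore coin flip with a proper acquisition function over the surrogate model, which is what makes the search converge faster than random sampling.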
Basic Usage
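As a rough orientation, a run is configured through the parameters documented in the sections below. The dict here is a hypothetical sketch: the key names mirror parameters named later on this page, not a confirmed API.

```python
# Hypothetical configuration sketch; key names follow the parameters
# documented below, the surrounding API is not shown on this page.
config = {
    "min_examples": 2,        # search space: fewest shots to try
    "max_examples": 8,        # search space: most shots to try
    "n_trials": 20,           # configurations to evaluate
    "seed": 42,               # reproducibility
    "direction": "maximize",  # or "minimize" for loss/error rates
    "eval_subset_size": 20,   # examples scored per trial
}
```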
Configuration Parameters
Search Space
- Minimum number of few-shot examples to try (min_examples)
- Maximum number of few-shot examples to try (max_examples)
- Whether the same example can be used multiple times in the few-shot block
- Specific example indices that must always be included
Optimization Control
- Number of different configurations to try (n_trials). More trials = better results but higher cost.
- Random seed for reproducibility
- Optimization direction: use "maximize" for scores, "minimize" for loss/error rates
Model Configuration
- Model used to generate outputs during optimization
- Additional arguments passed to the inference model
Example Formatting
- Template string for formatting examples using Python .format() syntax (example_template)
- List of fields to include when no template is provided (example_template_fields)
- Custom labels for fields in examples
- String used to separate multiple examples in the few-shot block
- Where to place few-shot examples: "append" (after the base prompt) or "prepend" (before it)
- Optional title/header for the few-shot examples section
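For instance, example formatting with Python's .format() syntax works like this; the template string and field names are invented for illustration.

```python
# Render each example through the template, then join with the separator.
examples = [
    {"question": "What is 2 + 2?", "answer": "4"},
    {"question": "What is the capital of France?", "answer": "Paris"},
]
example_template = "Q: {question}\nA: {answer}"
separator = "\n\n"

few_shot_block = separator.join(example_template.format(**ex) for ex in examples)
prompt = few_shot_block + "\n\nQ: {user_question}\nA:"  # "append" placement
```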
Teacher-Guided Template Inference
Use a teacher model to automatically infer the best example format from your data.
- Powerful model used for template inference
- Arguments for the teacher model
- Number of dataset examples to show the teacher for template inference
Template inference is powerful but costs extra API calls. Use it when you’re unsure how to format examples.
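Under the hood, template inference presumably amounts to showing the teacher a few dataset rows and asking it to propose a format string. A hypothetical sketch of building such a teacher prompt follows; the wording, function name, and dataset are all invented.

```python
# Hypothetical teacher prompt for template inference; the actual prompt the
# library sends is not documented here.
import json

def build_teacher_prompt(dataset, n_shown=3):
    rows = "\n".join(json.dumps(ex) for ex in dataset[:n_shown])
    return (
        "Here are sample rows from a dataset:\n"
        f"{rows}\n"
        "Propose a Python .format() template that renders one row as a "
        "clear few-shot example. Reply with the template only."
    )

dataset = [{"question": f"q{i}", "answer": f"a{i}"} for i in range(10)]
teacher_prompt = build_teacher_prompt(dataset, n_shown=3)
# teacher_prompt would be sent to the teacher model; its reply becomes the
# example_template used for the rest of the optimization run.
```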
Evaluation Controls
- Number of examples to evaluate per trial, for speed (eval_subset_size). If None, uses the entire dataset.
- How to select the eval subset: "random", "first", or "all"
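One plausible implementation of these subset modes, to make the semantics concrete; the function name and exact behavior are assumptions, not the library's code.

```python
# Assumed semantics: "all" (or size=None) returns everything, "first" takes
# a prefix, "random" draws a seeded sample.
import random

def select_eval_subset(dataset, size=None, mode="random", seed=0):
    if size is None or mode == "all":
        return list(dataset)
    if mode == "first":
        return list(dataset[:size])
    if mode == "random":
        return random.Random(seed).sample(list(dataset), min(size, len(dataset)))
    raise ValueError(f"unknown mode: {mode}")

data = list(range(100))
subset = select_eval_subset(data, size=20, mode="random")
```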
Underlying Research
Bayesian Search builds on established principles of Bayesian optimization, adapted for the unique challenges of prompt engineering.
- Core Concept: The method is detailed in papers like “A Bayesian approach for prompt optimization in pre-trained models”, which explores mapping discrete prompts to continuous embeddings for more efficient searching.
- Few-Shot Learning: Its application in few-shot scenarios is highlighted by tools like Comet’s Opik, which features a “Few-Shot Bayesian Optimizer”.
- Advanced Implementations: Recent research, such as “Searching for Optimal Solutions with LLMs via Bayesian Optimization (BOPRO)”, investigates using Bayesian optimization to navigate complex LLM search spaces. The popular BayesianOptimization library on GitHub provides the foundational Gaussian process-based modeling.
Advanced Examples
With Automatic Template Inference
Let the teacher model determine the best example format.
With Custom Example Formatting
Full control over example formatting.
With Custom Prompt Builder
Control how few-shot examples integrate with the base prompt.
With Fixed Examples
Always include certain critical examples.
Understanding the Results
Analyzing Optimization History
Extracting Best Configuration
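A pure-Python sketch covering both tasks above, assuming the optimizer records one (configuration, score) record per trial. The record layout is an assumption for illustration, not the library's actual result object.

```python
# Illustrative trial history; a real run would produce this, not hand-write it.
history = [
    {"n_examples": 2, "example_ids": [3, 7], "score": 0.61},
    {"n_examples": 5, "example_ids": [1, 4, 7, 9, 12], "score": 0.78},
    {"n_examples": 4, "example_ids": [0, 4, 7, 11], "score": 0.74},
]

# Analyzing optimization history: per-trial scores and the best-so-far curve,
# which shows whether the search is still improving or has plateaued.
scores = [t["score"] for t in history]
best_so_far = [max(scores[: i + 1]) for i in range(len(scores))]

# Extracting the best configuration: the highest-scoring trial.
best = max(history, key=lambda t: t["score"])
```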
Performance Tips
Start with fewer trials
Begin with n_trials=10 to validate your setup, then increase to 20-30 for production.
Use eval subsets for large datasets
Set eval_subset_size=20 when you have 50+ examples to speed up optimization significantly.
Adjust example range based on task
- Classification: min_examples=2, max_examples=5
- Complex reasoning: min_examples=3, max_examples=8
- Creative tasks: min_examples=1, max_examples=4
Let teacher infer template first
Run a quick optimization with infer_example_template_via_teacher=True, save the inferred template, then use it explicitly in future runs to save costs.
Common Patterns
Question Answering with Context
Text Classification
Data Extraction
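As one illustration of these patterns, a few-shot block for the data-extraction case might pair raw text with structured JSON output. Every field name and example below is invented for this sketch.

```python
# Each example maps free text to a JSON record, so the model learns the
# output structure from the shots themselves.
import json

examples = [
    {"text": "Order #1234 shipped to Berlin on May 2.",
     "record": {"order_id": "1234", "city": "Berlin", "date": "May 2"}},
    {"text": "Order #88 shipped to Lyon on June 9.",
     "record": {"order_id": "88", "city": "Lyon", "date": "June 9"}},
]
example_template = "Input: {text}\nOutput: {record_json}"
few_shot = "\n\n".join(
    example_template.format(text=ex["text"], record_json=json.dumps(ex["record"]))
    for ex in examples
)
```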
Troubleshooting
Template formatting errors
Problem: KeyError when formatting examples.
Solution: Ensure all fields in example_template exist in your dataset examples. Use example_template_fields to explicitly list available fields.
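A minimal reproduction of the error and its fix, using plain str.format; the example data is invented.

```python
# The template references a "source" field that the example does not have,
# so .format(**example) raises KeyError('source').
example = {"question": "What is 2 + 2?", "answer": "4"}
template = "Q: {question}\nA: {answer}\nSource: {source}"

try:
    template.format(**example)
except KeyError as err:
    missing = err.args[0]  # name of the missing field

# Fix: only reference fields that every example actually has.
safe_template = "Q: {question}\nA: {answer}"
rendered = safe_template.format(**example)
```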
Optimization plateaus quickly
Problem: Scores stop improving after a few trials.
Solution:
- Increase max_examples to explore larger few-shot sizes
- Try infer_example_template_via_teacher=True
- Check if your dataset has sufficient diversity
Very slow optimization
Problem: Each trial takes too long.
Solution:
- Set eval_subset_size=10 or smaller
- Use a faster inference model
- Reduce max_examples
Few-shot examples don't help
Problem: Adding examples doesn’t improve scores.
Solution:
- Verify examples are high-quality and diverse
- Check that example_template formats them clearly
- Your task might not benefit from few-shot (try Meta-Prompt instead)