After you run simulations and analyze your agent’s performance, Future AGI provides intelligent optimization suggestions to systematically improve the agent’s quality, reduce failures, and enhance overall performance. The platform leverages advanced optimization algorithms to refine your agent’s prompts and configurations.

Overview

Agent optimization in Future AGI is a data-driven approach to improving your AI agent’s behavior based on actual simulation results. Instead of manually tweaking prompts through trial and error, the platform:
  • Analyzes simulation performance metrics and call data
  • Identifies specific issues and failure patterns
  • Suggests targeted improvements with priority levels
  • Optimizes agent prompts using advanced algorithms
  • Validates improvements through iterative refinement
This process combines the power of simulation testing with state-of-the-art prompt optimization techniques to deliver measurable improvements in agent performance.

Accessing Optimization Suggestions

After running a simulation, you can access optimization insights directly from the execution results page.

Step 1: Navigate to Simulation Results

Once your simulation run completes, you’ll see the execution details page with performance metrics including:
  • Call Details: Total calls, connected calls, connection rate
  • System Metrics: CSAT scores, agent latency, WPM (Words Per Minute)
  • Evaluation Metrics: Custom evaluation results

Step 2: Open Optimization Panel

Click the “Optimize My Agent” button in the top-right corner of the execution page. This opens a side panel showing:
  • All Suggestions: Total number of optimization recommendations
  • Priority Levels: High, Medium, or Low priority for each suggestion
  • Issue Categories: Specific problems identified (latency, response brevity, detection tuning)
  • Affected Calls: Number of calls impacted by each issue
  • Last Updated: Timestamp of the latest analysis
Suggestions are automatically generated by analyzing your simulation results. The system identifies patterns, edge cases, and failure modes that can be addressed through optimization.

Understanding Suggestions

Each suggestion provides:
  1. Issue Description: Clear explanation of the identified problem
  2. Recommended Fix: Specific action to address the issue
  3. Priority Level: Urgency of the fix (High/Medium/Low)
  4. Affected Calls: Which calls exhibited this issue
  5. View Issue Button: Deep-dive into specific call examples
Example Suggestions:
  • Aggressively Reduce Pipeline Latency - Reduce LLM time-to-first-token (TTFT) by switching to a faster model
  • Enforce Strict Response Brevity - Implement a hard token limit to enforce concise responses
  • Tune End-of-Speech Detection - Adjust VAD parameters for better conversation flow
Start with High Priority suggestions that affect the most calls. These typically have the greatest impact on overall agent performance.
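The example suggestions above usually translate into concrete configuration changes. Below is a minimal, hypothetical sketch of what those changes might look like for a voice agent; the field names are illustrative only and are not Future AGI’s schema, so map them onto whatever your agent framework actually exposes.

# Hypothetical configuration changes corresponding to the example suggestions.
# Field names are illustrative, not Future AGI's schema.
agent_config_updates = {
    # Aggressively Reduce Pipeline Latency: switch to a faster model to cut
    # LLM time-to-first-token (TTFT).
    "llm_model": "gpt-4o-mini",            # assumed faster alternative
    # Enforce Strict Response Brevity: hard cap on generated tokens.
    "max_response_tokens": 120,
    # Tune End-of-Speech Detection: shorten the silence the VAD waits for
    # before treating the caller's turn as finished.
    "vad_end_of_speech_silence_ms": 500,
}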

Running Agent Optimization

Once you’ve reviewed the suggestions, you can run an optimization process to systematically improve your agent’s prompts.

Step 3: Configure Optimization

Click the “Optimize My Agent” button to open the optimization configuration dialog.

Required Configuration:

1. Name Your Optimization Run
  • Enter a descriptive name (e.g., “opt1”, “latency-optimization-v2”)
  • This helps track multiple optimization experiments
2. Choose Optimizer
Select from Future AGI’s advanced optimization algorithms:
Meta-Prompt
Best for: Complex reasoning tasks requiring deep analysis
How it works: Analyzes failed examples, formulates hypotheses, and rewrites the entire prompt through deep reasoning.
Characteristics:
  • ⚡⚡ Medium speed
  • ⭐⭐⭐⭐ High quality
  • 💰💰💰 Higher cost
  • Ideal for: 20-40 examples
Use when: Your agent handles complex reasoning tasks or you need holistic prompt redesign.
ProTeGi
Best for: Identifying and fixing specific error patterns
How it works: Generates critiques of failures and applies targeted improvements using beam search to maintain multiple candidates.
Characteristics:
  • ⚡ Slower execution
  • ⭐⭐⭐⭐ High quality
  • 💰💰💰 Higher cost
  • Ideal for: 20-50 examples
Use when: You have clear failure patterns and want systematic error fixing.
PromptWizard
Best for: Creative exploration and diverse prompt variations
How it works: Combines mutation with different “thinking styles”, then critiques and refines top performers.
Characteristics:
  • ⚡ Slower execution
  • ⭐⭐⭐⭐ High quality
  • 💰💰💰 Higher cost
  • Ideal for: 15-40 examples
Use when: You want creative exploration or diverse conversational approaches.
GEPA
Best for: Production deployments requiring state-of-the-art performance
How it works: Uses evolutionary algorithms with reflective learning and mutation strategies inspired by natural selection.
Characteristics:
  • ⚡ Slower execution
  • ⭐⭐⭐⭐⭐ Excellent quality
  • 💰💰💰💰 Highest cost
  • Ideal for: 30-100 examples
Use when: You need production-grade optimization with robust results and have sufficient evaluation budget.
3. Select Language Model
Choose the model that will be used for the optimization process. Available models include:
  • gpt-5 series (gpt-5, gpt-5-mini, gpt-5-nano, gpt-5-chat-latest)
  • gpt-4 series (gpt-4, gpt-4.1, gpt-4o, gpt-4o-audio-preview)
  • Other supported models from your configuration
For optimization, using a more powerful model (like gpt-4 or gpt-5) as the teacher model often yields better prompt improvements, even if your production agent uses a smaller model.
4. Add Parameters
Configure optimizer-specific parameters:
  • Number Variations: How many prompt variations to generate and test
    • Start with 3-5 for quick iterations
    • Use 10-20 for thorough optimization
    • Consider cost vs. quality tradeoff
Each optimizer may have additional parameters. The platform shows recommended defaults that balance speed and quality.
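If it helps to think of the dialog fields as a single configuration object, the sketch below mirrors the four required settings. The structure is purely illustrative; the platform collects these values through the UI, and nothing here is a documented API.

# Illustrative optimization-run configuration mirroring the dialog fields above.
# Keys are hypothetical, not a documented Future AGI API.
optimization_run = {
    "name": "latency-optimization-v2",   # descriptive run name
    "optimizer": "ProTeGi",              # one of the algorithms described above
    "model": "gpt-4o",                   # teacher model used during optimization
    "parameters": {
        "num_variations": 5,             # start with 3-5, raise to 10-20 later
    },
}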

Step 4: Start Optimization

Click “Start Optimizing your agent” to begin the process. The optimization engine will:
  1. Analyze your simulation data and identified issues
  2. Generate prompt variations using the selected algorithm
  3. Evaluate each variation against your test scenarios
  4. Score performance improvements
  5. Select the best-performing optimized prompt
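Conceptually, every optimizer follows the same outer loop: propose candidate prompts, score them against your scenarios, and keep the best one. The sketch below shows that loop in its simplest (random-search-like) form; propose_variation and evaluate stand in for the algorithm-specific steps and are assumptions, not platform APIs.

from typing import Callable

def optimize_prompt(
    base_prompt: str,
    scenarios: list[dict],
    propose_variation: Callable[[str], str],       # algorithm-specific rewrite step (hypothetical)
    evaluate: Callable[[str, list[dict]], float],  # fitness from your evaluation metrics (hypothetical)
    num_variations: int = 5,
) -> tuple[str, float]:
    """Generate candidate prompts, score each one, and keep the best."""
    best_prompt, best_score = base_prompt, evaluate(base_prompt, scenarios)
    for _ in range(num_variations):
        candidate = propose_variation(best_prompt)
        score = evaluate(candidate, scenarios)
        if score > best_score:
            best_prompt, best_score = candidate, score
    return best_prompt, best_score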

Optimization Algorithms Explained

Future AGI’s optimization uses advanced prompt refinement techniques. Understanding how each algorithm works helps you choose the right strategy for your use case.

Quick Selection Guide

Your Goal | Recommended Algorithm | Why
Quick improvement baseline | Random Search | Fast, simple, establishes performance floor
Reduce latency issues | Bayesian Search | Efficiently explores configuration space
Fix conversation logic errors | ProTeGi or Meta-Prompt | Targets specific failure patterns
Improve complex reasoning | Meta-Prompt | Deep analysis and systematic refinement
Optimize for production | GEPA | State-of-the-art evolutionary optimization
Explore creative approaches | PromptWizard | Diverse variations with structured refinement

Algorithm Comparison

Algorithm | Speed | Quality | Cost | Best Dataset Size
Random Search | ⚡⚡⚡ | ⭐⭐ | 💰 | 10-30 examples
Bayesian Search | ⚡⚡ | ⭐⭐⭐⭐ | 💰💰 | 15-50 examples
Meta-Prompt | ⚡⚡ | ⭐⭐⭐⭐ | 💰💰💰 | 20-40 examples
ProTeGi | ⚡ | ⭐⭐⭐⭐ | 💰💰💰 | 20-50 examples
PromptWizard | ⚡ | ⭐⭐⭐⭐ | 💰💰💰 | 15-40 examples
GEPA | ⚡ | ⭐⭐⭐⭐⭐ | 💰💰💰💰 | 30-100 examples
  • Speed: ⚡ = Slow, ⚡⚡ = Medium, ⚡⚡⚡ = Fast
  • Quality: ⭐ = Basic, ⭐⭐⭐⭐⭐ = Excellent
  • Cost: 💰 = Low, 💰💰💰💰 = High (based on API calls)

Decision Tree

Do you need production-grade optimization?
├─ Yes → Use GEPA
└─ No

   Do you have clear error patterns to fix?
   ├─ Yes → Use ProTeGi
   └─ No

      Is your task reasoning-heavy or complex?
      ├─ Yes → Use Meta-Prompt
      └─ No

         Do you need few-shot learning optimization?
         ├─ Yes → Use Bayesian Search
         └─ No

            Do you want creative exploration?
            ├─ Yes → Use PromptWizard
            └─ No → Use Random Search (baseline)
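For reference, the same decision logic expressed as a small helper function (purely illustrative; the flags mirror the questions in the tree above):

def pick_optimizer(
    production_grade: bool = False,
    clear_error_patterns: bool = False,
    reasoning_heavy: bool = False,
    few_shot_focus: bool = False,
    creative_exploration: bool = False,
) -> str:
    """Mirror of the decision tree above; returns the suggested algorithm."""
    if production_grade:
        return "GEPA"
    if clear_error_patterns:
        return "ProTeGi"
    if reasoning_heavy:
        return "Meta-Prompt"
    if few_shot_focus:
        return "Bayesian Search"
    if creative_exploration:
        return "PromptWizard"
    return "Random Search"  # baseline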

Viewing Optimization Results

After optimization completes, you can view the results in the Optimization Runs tab on your simulation execution page.

Analyzing Results

The optimization results show:
  1. Performance Comparison
    • Original prompt baseline scores
    • Optimized prompt scores
    • Improvement percentage
  2. Best Prompt
    • The highest-performing optimized prompt
    • Changes made from the original
    • Evaluation scores across metrics
  3. Optimization History
    • All variations tested
    • Performance trajectory
    • Iteration details

Deploying Optimized Prompts

Once you’ve identified an improved prompt:
  1. Review the optimized prompt carefully
  2. Test it with additional scenarios if needed
  3. Update your agent definition with the new prompt
  4. Re-run simulations to validate improvements
  5. Monitor performance in production
Always validate optimized prompts with additional test cases before deploying to production. Optimization algorithms can sometimes overfit to the evaluation dataset.

Best Practices

1. Run Multiple Optimization Iterations

Don’t stop after one optimization run:
  • Start with Random Search to establish a baseline
  • Use ProTeGi or Meta-Prompt to fix identified issues
  • Apply GEPA for final production-grade refinement

2. Use Sufficient Test Data

Optimization quality depends on your simulation data:
  • Run at least 20-50 simulation scenarios before optimizing
  • Ensure scenarios cover diverse situations and edge cases
  • Include examples of both successful and failed interactions

3. Choose the Right Optimizer

Match the algorithm to your problem:
  • Latency issues: Bayesian Search (efficient parameter tuning)
  • Conversation logic errors: ProTeGi (targeted error fixing)
  • Complex reasoning: Meta-Prompt (deep analysis)
  • Production deployment: GEPA (robust evolutionary search)

4. Balance Cost and Quality

Optimization uses API calls:
  • Start with fewer variations (3-5) for quick iterations
  • Increase variations (10-20) when you’re close to deployment
  • Use faster algorithms (Random Search, Bayesian Search) for experimentation
  • Reserve expensive algorithms (GEPA, Meta-Prompt) for critical optimizations
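As a rough back-of-the-envelope check before launching a run, you can assume evaluation cost grows with variations × scenarios; the real accounting depends on the optimizer (rewrite and critique calls add overhead) and is not documented here.

# Assumed cost model: each variation is evaluated once per scenario.
# Real optimizers add rewrite/critique calls on top of this.
scenarios = 30
for variations in (5, 20):
    eval_calls = variations * scenarios
    print(f"{variations} variations x {scenarios} scenarios ~= {eval_calls} evaluation calls")
# 5 variations x 30 scenarios ~= 150 evaluation calls
# 20 variations x 30 scenarios ~= 600 evaluation calls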

5. Validate Improvements

Always verify optimization results:
  • Run new simulations with the optimized prompt
  • Compare metrics against the baseline
  • Test on scenarios not included in the optimization dataset
  • Monitor for overfitting or unexpected behaviors
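A simple guard against overfitting is to hold some scenarios out of the optimization dataset entirely and use them only for the post-optimization check. A minimal sketch, assuming your scenarios are available as a list:

import random

def split_scenarios(scenarios: list[dict], holdout_fraction: float = 0.2, seed: int = 42):
    """Reserve a holdout set that the optimizer never sees."""
    shuffled = scenarios[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - holdout_fraction))
    return shuffled[:cut], shuffled[cut:]  # (optimization set, validation set)

# Example: 50 scenarios -> 40 used for optimization, 10 kept back for validation.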

6. Track Optimization Experiments

Maintain good experiment hygiene:
  • Use descriptive names for optimization runs
  • Document which suggestions you’re addressing
  • Keep notes on what worked and what didn’t
  • Version your prompts alongside optimization results

Optimization Workflow Example

Here’s a complete workflow for optimizing an insurance sales agent:

Initial State

  • Agent has 40% call connection rate
  • High latency (1470ms response time)
  • Mixed sentiment scores

Step 1: Run Comprehensive Simulations

- Create 50 diverse scenarios covering:
  ✓ Different customer types
  ✓ Various objection patterns  
  ✓ Edge cases and difficult situations
- Run simulation and analyze results

Step 2: Review Optimization Suggestions

Suggestions identified:
- [High Priority] Reduce Pipeline Latency (8 calls affected)
- [High Priority] Enforce Response Brevity (8 calls affected)  
- [Medium Priority] Tune End-of-Speech Detection (8 calls affected)

Step 3: First Optimization - Quick Baseline

- Name: "insurance-agent-baseline-v1"
- Optimizer: Random Search
- Model: gpt-4o
- Variations: 5
- Focus: Establish performance baseline

Step 4: Targeted Optimization - Fix Latency

- Name: "insurance-agent-latency-fix"
- Optimizer: ProTeGi
- Model: gpt-4o
- Variations: 10
- Focus: Address high-priority latency issues

Step 5: Advanced Optimization - Production Ready

- Name: "insurance-agent-production-v1"
- Optimizer: GEPA
- Model: gpt-4o
- Variations: 15
- Focus: Production-grade optimization

Step 6: Validation

- Run new simulation with optimized prompt
- Compare results:
  Before: 40% connection rate, 1470ms latency
  After: 65% connection rate, 850ms latency
  Improvement: +62.5% connection rate, -42% latency
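The improvement figures above are relative changes against the baseline; the snippet below reproduces them.

def relative_change(before: float, after: float) -> float:
    return (after - before) / before * 100

print(f"Connection rate: {relative_change(40, 65):+.1f}%")    # +62.5%
print(f"Latency:         {relative_change(1470, 850):+.1f}%") # -42.2%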

Troubleshooting

No Suggestions Appearing

Possible causes:
  • Not enough simulation data (need 20+ calls)
  • Agent performed perfectly (no issues detected)
  • Evaluation metrics not configured
Solutions:
  • Run more comprehensive simulations
  • Add diverse scenarios including edge cases
  • Configure custom evaluation metrics

Optimization Not Improving Performance

Possible causes:
  • Insufficient training data
  • Wrong optimizer for the problem type
  • Too few variations tested
  • Overfitting to evaluation set
Solutions:
  • Increase simulation scenario count
  • Try a different optimization algorithm
  • Increase number of variations (10-20)
  • Validate on held-out test scenarios

Optimization Taking Too Long

Possible causes:
  • Using slow optimizer (GEPA, ProTeGi)
  • Too many variations configured
  • Large dataset size
Solutions:
  • Start with Random Search or Bayesian Search
  • Reduce number of variations to 3-5
  • Use a smaller sample of representative scenarios

Advanced Topics

Combining Optimization with Manual Refinement

You can mix automated optimization with manual improvements:
  1. Run automated optimization to get AI-generated suggestions
  2. Review the optimized prompt for insights
  3. Manually refine based on domain expertise
  4. Run another optimization starting from your manual refinement
  5. Compare results to see which approach performs better

Custom Evaluation Metrics

For optimization to be most effective, configure evaluation metrics that match your business goals:
  • Conversion Rate: Did the agent successfully convert the customer?
  • Compliance: Did the agent follow regulatory requirements?
  • Customer Satisfaction: Sentiment and CSAT scores
  • Efficiency: Response latency, call duration, token usage
The optimization algorithms use your evaluation metrics as the fitness function. Better evaluation metrics lead to better optimization results.
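One common way to turn several business metrics into a single fitness score is a weighted sum of normalized metrics. The sketch below is illustrative only: the metric names and weights are hypothetical and should be replaced with whatever your evaluations actually report.

# Illustrative composite fitness: metrics normalized to 0-1, higher is better.
# Weights are hypothetical; set them to reflect your business priorities.
WEIGHTS = {
    "conversion_rate": 0.4,
    "compliance": 0.3,
    "csat": 0.2,
    "efficiency": 0.1,   # e.g. 1 - normalized latency
}

def fitness(metrics: dict[str, float]) -> float:
    """Weighted sum of normalized evaluation metrics."""
    return sum(weight * metrics.get(name, 0.0) for name, weight in WEIGHTS.items())

# Example: fitness({"conversion_rate": 0.5, "compliance": 1.0, "csat": 0.8, "efficiency": 0.7}) ≈ 0.73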

Optimization for Different Agent Types

Different agent types benefit from different optimization strategies:
Voice Agents:
  • Focus on: Latency, brevity, natural conversation flow
  • Best optimizers: Bayesian Search (parameter tuning), ProTeGi (error fixing)
Chat Agents:
  • Focus on: Response quality, accuracy, helpfulness
  • Best optimizers: Meta-Prompt (reasoning), PromptWizard (diverse styles)
Sales Agents:
  • Focus on: Conversion rate, objection handling, compliance
  • Best optimizers: GEPA (production-grade), Meta-Prompt (complex logic)
Support Agents:
  • Focus on: Problem resolution, empathy, efficiency
  • Best optimizers: ProTeGi (error patterns), Bayesian Search (few-shot examples)

Next Steps