After you run simulations and analyze your agent’s performance, Future AGI provides intelligent optimization suggestions to systematically improve the agent’s quality, reduce failures, and enhance overall performance. The platform leverages advanced optimization algorithms to refine your agent’s prompts and configurations.

Overview

Agent optimization in Future AGI is a data-driven approach to improving your AI agent’s behavior based on actual simulation results. Instead of manually tweaking prompts through trial and error, the platform:
  • Analyzes simulation performance metrics and call data
  • Identifies specific issues and failure patterns
  • Suggests targeted improvements with priority levels
  • Optimizes agent prompts using advanced algorithms
  • Validates improvements through iterative refinement
This process combines the power of simulation testing with state-of-the-art prompt optimization techniques to deliver measurable improvements in agent performance.

Accessing Optimization Suggestions

After running a simulation, you can access optimization insights directly from the execution results page.

Step 1: Navigate to Simulation Results

Once your simulation run completes, you’ll see the execution details page with performance metrics including:
  • Call Details: Total calls, connected calls, connection rate
  • System Metrics: CSAT scores, agent latency, WPM (Words Per Minute)
  • Evaluation Metrics: Custom evaluation results

Step 2: Open Optimization Panel

Click the “Optimize My Agent” button in the top-right corner of the execution page. This opens a side panel showing:
  • All Suggestions: Total number of optimization recommendations
  • Priority Levels: High, Medium, or Low priority for each suggestion
  • Issue Categories: Specific problems identified (latency, response brevity, detection tuning)
  • Affected Calls: Number of calls impacted by each issue
  • Last Updated: Timestamp of the latest analysis
Suggestions are automatically generated by analyzing your simulation results. The system identifies patterns, edge cases, and failure modes that can be addressed through optimization.

Understanding Suggestions

Each suggestion provides:
  1. Issue Description: Clear explanation of the identified problem
  2. Recommended Fix: Specific action to address the issue
  3. Priority Level: Urgency of the fix (High/Medium/Low)
  4. Affected Calls: Which calls exhibited this issue
  5. View Issue Button: Deep-dive into specific call examples
Example Suggestions:
  • Aggressively Reduce Pipeline Latency - Reduce LLM time-to-first-token (TTFT) by switching to a faster model
  • Enforce Strict Response Brevity - Implement a hard token limit to enforce concise responses
  • Tune End-of-Speech Detection - Adjust VAD parameters for better conversation flow
Start with High Priority suggestions that affect the most calls. These typically have the greatest impact on overall agent performance.
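The example suggestions above usually translate into concrete configuration changes. Below is a minimal, hypothetical sketch of what those changes might look like for a voice agent; the field names are illustrative only and are not Future AGI’s schema, so map them onto whatever your agent framework actually exposes.

# Hypothetical configuration changes corresponding to the example suggestions.
# Field names are illustrative, not Future AGI's schema.
agent_config_updates = {
    # Aggressively Reduce Pipeline Latency: switch to a faster model to cut
    # LLM time-to-first-token (TTFT).
    "llm_model": "gpt-4o-mini",            # assumed faster alternative
    # Enforce Strict Response Brevity: hard cap on generated tokens.
    "max_response_tokens": 120,
    # Tune End-of-Speech Detection: shorten the silence the VAD waits for
    # before treating the caller's turn as finished.
    "vad_end_of_speech_silence_ms": 500,
}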

Running Agent Optimization

Once you’ve reviewed the suggestions, you can run an optimization process to systematically improve your agent’s prompts.

Step 3: Configure Optimization

Click the “Optimize My Agent” button to open the optimization configuration dialog.

Required Configuration:

1. Name Your Optimization Run
  • Enter a descriptive name (e.g., “opt1”, “latency-optimization-v2”)
  • This helps track multiple optimization experiments
2. Choose Optimizer
Select from Future AGI’s advanced optimization algorithms:
Meta-Prompt
Best for: Complex reasoning tasks requiring deep analysis
How it works: Analyzes failed examples, formulates hypotheses, and rewrites the entire prompt through deep reasoning.
Characteristics:
  • ⚡⚡ Medium speed
  • ⭐⭐⭐⭐ High quality
  • 💰💰💰 Higher cost
  • Ideal for: 20-40 examples
Use when: Your agent handles complex reasoning tasks or you need holistic prompt redesign.
ProTeGi
Best for: Identifying and fixing specific error patterns
How it works: Generates critiques of failures and applies targeted improvements using beam search to maintain multiple candidates.
Characteristics:
  • ⚡ Slower execution
  • ⭐⭐⭐⭐ High quality
  • 💰💰💰 Higher cost
  • Ideal for: 20-50 examples
Use when: You have clear failure patterns and want systematic error fixing.
PromptWizard
Best for: Creative exploration and diverse prompt variations
How it works: Combines mutation with different “thinking styles”, then critiques and refines top performers.
Characteristics:
  • ⚡ Slower execution
  • ⭐⭐⭐⭐ High quality
  • 💰💰💰 Higher cost
  • Ideal for: 15-40 examples
Use when: You want creative exploration or diverse conversational approaches.
GEPA
Best for: Production deployments requiring state-of-the-art performance
How it works: Uses evolutionary algorithms with reflective learning and mutation strategies inspired by natural selection.
Characteristics:
  • ⚡ Slower execution
  • ⭐⭐⭐⭐⭐ Excellent quality
  • 💰💰💰💰 Highest cost
  • Ideal for: 30-100 examples
Use when: You need production-grade optimization with robust results and have sufficient evaluation budget.
3. Select Language Model
Choose the model that will be used for the optimization process. Available models include:
  • gpt-5 series (gpt-5, gpt-5-mini, gpt-5-nano, gpt-5-chat-latest)
  • gpt-4 series (gpt-4, gpt-4.1, gpt-4o, gpt-4o-audio-preview)
  • Other supported models from your configuration
For optimization, using a more powerful model (like gpt-4 or gpt-5) as the teacher model often yields better prompt improvements, even if your production agent uses a smaller model.
4. Add Parameters
Configure optimizer-specific parameters:
  • Number Variations: How many prompt variations to generate and test
    • Start with 3-5 for quick iterations
    • Use 10-20 for thorough optimization
    • Consider cost vs. quality tradeoff
Each optimizer may have additional parameters. The platform shows recommended defaults that balance speed and quality.
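If it helps to think of the dialog fields as a single configuration object, the sketch below mirrors the four required settings. The structure is purely illustrative; the platform collects these values through the UI, and nothing here is a documented API.

# Illustrative optimization-run configuration mirroring the dialog fields above.
# Keys are hypothetical, not a documented Future AGI API.
optimization_run = {
    "name": "latency-optimization-v2",   # descriptive run name
    "optimizer": "ProTeGi",              # one of the algorithms described above
    "model": "gpt-4o",                   # teacher model used during optimization
    "parameters": {
        "num_variations": 5,             # start with 3-5, raise to 10-20 later
    },
}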

Step 4: Start Optimization

Click “Start Optimizing your agent” to begin the process. The optimization engine will:
  1. Analyze your simulation data and identified issues
  2. Generate prompt variations using the selected algorithm
  3. Evaluate each variation against your test scenarios
  4. Score performance improvements
  5. Select the best-performing optimized prompt
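Conceptually, every optimizer follows the same outer loop: propose candidate prompts, score them against your scenarios, and keep the best one. The sketch below shows that loop in its simplest (random-search-like) form; propose_variation and evaluate stand in for the algorithm-specific steps and are assumptions, not platform APIs.

from typing import Callable

def optimize_prompt(
    base_prompt: str,
    scenarios: list[dict],
    propose_variation: Callable[[str], str],       # algorithm-specific rewrite step (hypothetical)
    evaluate: Callable[[str, list[dict]], float],  # fitness from your evaluation metrics (hypothetical)
    num_variations: int = 5,
) -> tuple[str, float]:
    """Generate candidate prompts, score each one, and keep the best."""
    best_prompt, best_score = base_prompt, evaluate(base_prompt, scenarios)
    for _ in range(num_variations):
        candidate = propose_variation(best_prompt)
        score = evaluate(candidate, scenarios)
        if score > best_score:
            best_prompt, best_score = candidate, score
    return best_prompt, best_score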

Optimization Algorithms Explained

Future AGI’s optimization uses advanced prompt refinement techniques. Understanding how each algorithm works helps you choose the right strategy for your use case.

Quick Selection Guide

Your Goal | Recommended Algorithm | Why
Quick improvement baseline | Random Search | Fast, simple, establishes performance floor
Reduce latency issues | Bayesian Search | Efficiently explores configuration space
Fix conversation logic errors | ProTeGi or Meta-Prompt | Targets specific failure patterns
Improve complex reasoning | Meta-Prompt | Deep analysis and systematic refinement
Optimize for production | GEPA | State-of-the-art evolutionary optimization
Explore creative approaches | PromptWizard | Diverse variations with structured refinement

Algorithm Comparison

Algorithm | Speed | Quality | Cost | Best Dataset Size
Random Search | ⚡⚡⚡ | ⭐⭐ | 💰 | 10-30 examples
Bayesian Search | ⚡⚡ | ⭐⭐⭐⭐ | 💰💰 | 15-50 examples
Meta-Prompt | ⚡⚡ | ⭐⭐⭐⭐ | 💰💰💰 | 20-40 examples
ProTeGi | ⚡ | ⭐⭐⭐⭐ | 💰💰💰 | 20-50 examples
PromptWizard | ⚡ | ⭐⭐⭐⭐ | 💰💰💰 | 15-40 examples
GEPA | ⚡ | ⭐⭐⭐⭐⭐ | 💰💰💰💰 | 30-100 examples
  • Speed: ⚡ = Slow, ⚡⚡ = Medium, ⚡⚡⚡ = Fast
  • Quality: ⭐ = Basic, ⭐⭐⭐⭐⭐ = Excellent
  • Cost: 💰 = Low, 💰💰💰💰 = High (based on API calls)

Decision Tree

Do you need production-grade optimization?
├─ Yes → Use GEPA
└─ No

   Do you have clear error patterns to fix?
   ├─ Yes → Use ProTeGi
   └─ No

      Is your task reasoning-heavy or complex?
      ├─ Yes → Use Meta-Prompt
      └─ No

         Do you need few-shot learning optimization?
         ├─ Yes → Use Bayesian Search
         └─ No

            Do you want creative exploration?
            ├─ Yes → Use PromptWizard
            └─ No → Use Random Search (baseline)
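For reference, the same decision logic expressed as a small helper function (purely illustrative; the flags mirror the questions in the tree above):

def pick_optimizer(
    production_grade: bool = False,
    clear_error_patterns: bool = False,
    reasoning_heavy: bool = False,
    few_shot_focus: bool = False,
    creative_exploration: bool = False,
) -> str:
    """Mirror of the decision tree above; returns the suggested algorithm."""
    if production_grade:
        return "GEPA"
    if clear_error_patterns:
        return "ProTeGi"
    if reasoning_heavy:
        return "Meta-Prompt"
    if few_shot_focus:
        return "Bayesian Search"
    if creative_exploration:
        return "PromptWizard"
    return "Random Search"  # baseline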

Viewing Optimization Results

After optimization completes, you can view the results in the Optimization Runs tab on your simulation execution page.

Analyzing Results

The optimization results show:
  1. Performance Comparison
    • Original prompt baseline scores
    • Optimized prompt scores
    • Improvement percentage
  2. Best Prompt
    • The highest-performing optimized prompt
    • Changes made from the original
    • Evaluation scores across metrics
  3. Optimization History
    • All variations tested
    • Performance trajectory
    • Iteration details

Deploying Optimized Prompts

Once you’ve identified an improved prompt:
  1. Review the optimized prompt carefully
  2. Test it with additional scenarios if needed
  3. Update your agent definition with the new prompt
  4. Re-run simulations to validate improvements
  5. Monitor performance in production
Always validate optimized prompts with additional test cases before deploying to production. Optimization algorithms can sometimes overfit to the evaluation dataset.

Best Practices

1. Run Multiple Optimization Iterations

Don’t stop after one optimization run:
  • Start with Random Search to establish a baseline
  • Use ProTeGi or Meta-Prompt to fix identified issues
  • Apply GEPA for final production-grade refinement

2. Use Sufficient Test Data

Optimization quality depends on your simulation data:
  • Run at least 20-50 simulation scenarios before optimizing
  • Ensure scenarios cover diverse situations and edge cases
  • Include examples of both successful and failed interactions

3. Choose the Right Optimizer

Match the algorithm to your problem:
  • Latency issues: Bayesian Search (efficient parameter tuning)
  • Conversation logic errors: ProTeGi (targeted error fixing)
  • Complex reasoning: Meta-Prompt (deep analysis)
  • Production deployment: GEPA (robust evolutionary search)

4. Balance Cost and Quality

Optimization uses API calls:
  • Start with fewer variations (3-5) for quick iterations
  • Increase variations (10-20) when you’re close to deployment
  • Use faster algorithms (Random Search, Bayesian Search) for experimentation
  • Reserve expensive algorithms (GEPA, Meta-Prompt) for critical optimizations
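As a rough back-of-the-envelope check before launching a run, you can assume evaluation cost grows with variations × scenarios; the real accounting depends on the optimizer (rewrite and critique calls add overhead) and is not documented here.

# Assumed cost model: each variation is evaluated once per scenario.
# Real optimizers add rewrite/critique calls on top of this.
scenarios = 30
for variations in (5, 20):
    eval_calls = variations * scenarios
    print(f"{variations} variations x {scenarios} scenarios ~= {eval_calls} evaluation calls")
# 5 variations x 30 scenarios ~= 150 evaluation calls
# 20 variations x 30 scenarios ~= 600 evaluation calls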

5. Validate Improvements

Always verify optimization results:
  • Run new simulations with the optimized prompt
  • Compare metrics against the baseline
  • Test on scenarios not included in the optimization dataset
  • Monitor for overfitting or unexpected behaviors
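A simple guard against overfitting is to hold some scenarios out of the optimization dataset entirely and use them only for the post-optimization check. A minimal sketch, assuming your scenarios are available as a list:

import random

def split_scenarios(scenarios: list[dict], holdout_fraction: float = 0.2, seed: int = 42):
    """Reserve a holdout set that the optimizer never sees."""
    shuffled = scenarios[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - holdout_fraction))
    return shuffled[:cut], shuffled[cut:]  # (optimization set, validation set)

# Example: 50 scenarios -> 40 used for optimization, 10 kept back for validation.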

6. Track Optimization Experiments

Maintain good experiment hygiene:
  • Use descriptive names for optimization runs
  • Document which suggestions you’re addressing
  • Keep notes on what worked and what didn’t
  • Version your prompts alongside optimization results

Optimization Workflow Example

Here’s a complete workflow for optimizing an insurance sales agent:

Initial State

  • Agent has 40% call connection rate
  • High latency (1470ms response time)
  • Mixed sentiment scores

Step 1: Run Comprehensive Simulations

- Create 50 diverse scenarios covering:
  ✓ Different customer types
  ✓ Various objection patterns  
  ✓ Edge cases and difficult situations
- Run simulation and analyze results

Step 2: Review Optimization Suggestions

Suggestions identified:
- [High Priority] Reduce Pipeline Latency (8 calls affected)
- [High Priority] Enforce Response Brevity (8 calls affected)  
- [Medium Priority] Tune End-of-Speech Detection (8 calls affected)

Step 3: First Optimization - Quick Baseline

- Name: "insurance-agent-baseline-v1"
- Optimizer: Random Search
- Model: gpt-4o
- Variations: 5
- Focus: Establish performance baseline

Step 4: Targeted Optimization - Fix Latency

- Name: "insurance-agent-latency-fix"
- Optimizer: ProTeGi
- Model: gpt-4o
- Variations: 10
- Focus: Address high-priority latency issues

Step 5: Advanced Optimization - Production Ready

- Name: "insurance-agent-production-v1"
- Optimizer: GEPA
- Model: gpt-4o
- Variations: 15
- Focus: Production-grade optimization

Step 6: Validation

- Run new simulation with optimized prompt
- Compare results:
  Before: 40% connection rate, 1470ms latency
  After: 65% connection rate, 850ms latency
  Improvement: +62.5% connection rate, -42% latency
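The improvement figures above are relative changes against the baseline; the snippet below reproduces them.

def relative_change(before: float, after: float) -> float:
    return (after - before) / before * 100

print(f"Connection rate: {relative_change(40, 65):+.1f}%")    # +62.5%
print(f"Latency:         {relative_change(1470, 850):+.1f}%") # -42.2%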

Troubleshooting

No Suggestions Appearing

Possible causes:
  • Not enough simulation data (need 20+ calls)
  • Agent performed perfectly (no issues detected)
  • Evaluation metrics not configured
Solutions:
  • Run more comprehensive simulations
  • Add diverse scenarios including edge cases
  • Configure custom evaluation metrics

Optimization Not Improving Performance

Possible causes:
  • Insufficient training data
  • Wrong optimizer for the problem type
  • Too few variations tested
  • Overfitting to evaluation set
Solutions:
  • Increase simulation scenario count
  • Try a different optimization algorithm
  • Increase number of variations (10-20)
  • Validate on held-out test scenarios

Optimization Taking Too Long

Possible causes:
  • Using slow optimizer (GEPA, ProTeGi)
  • Too many variations configured
  • Large dataset size
Solutions:
  • Start with Random Search or Bayesian Search
  • Reduce number of variations to 3-5
  • Use a smaller sample of representative scenarios

Advanced Topics

Combining Optimization with Manual Refinement

You can mix automated optimization with manual improvements:
  1. Run automated optimization to get AI-generated suggestions
  2. Review the optimized prompt for insights
  3. Manually refine based on domain expertise
  4. Run another optimization starting from your manual refinement
  5. Compare results to see which approach performs better

Custom Evaluation Metrics

For optimization to be most effective, configure evaluation metrics that match your business goals:
  • Conversion Rate: Did the agent successfully convert the customer?
  • Compliance: Did the agent follow regulatory requirements?
  • Customer Satisfaction: Sentiment and CSAT scores
  • Efficiency: Response latency, call duration, token usage
The optimization algorithms use your evaluation metrics as the fitness function. Better evaluation metrics lead to better optimization results.
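One common way to turn several business metrics into a single fitness score is a weighted sum of normalized metrics. The sketch below is illustrative only: the metric names and weights are hypothetical and should be replaced with whatever your evaluations actually report.

# Illustrative composite fitness: metrics normalized to 0-1, higher is better.
# Weights are hypothetical; set them to reflect your business priorities.
WEIGHTS = {
    "conversion_rate": 0.4,
    "compliance": 0.3,
    "csat": 0.2,
    "efficiency": 0.1,   # e.g. 1 - normalized latency
}

def fitness(metrics: dict[str, float]) -> float:
    """Weighted sum of normalized evaluation metrics."""
    return sum(weight * metrics.get(name, 0.0) for name, weight in WEIGHTS.items())

# Example: fitness({"conversion_rate": 0.5, "compliance": 1.0, "csat": 0.8, "efficiency": 0.7}) ≈ 0.73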

Optimization for Different Agent Types

Different agent types benefit from different optimization strategies:
Voice Agents:
  • Focus on: Latency, brevity, natural conversation flow
  • Best optimizers: Bayesian Search (parameter tuning), ProTeGi (error fixing)
Chat Agents:
  • Focus on: Response quality, accuracy, helpfulness
  • Best optimizers: Meta-Prompt (reasoning), PromptWizard (diverse styles)
Sales Agents:
  • Focus on: Conversion rate, objection handling, compliance
  • Best optimizers: GEPA (production-grade), Meta-Prompt (complex logic)
Support Agents:
  • Focus on: Problem resolution, empathy, efficiency
  • Best optimizers: ProTeGi (error patterns), Bayesian Search (few-shot examples)

Next Steps