Chat Simulation with Fix My Agent
Simulate AI chat agents at scale and get instant AI-powered diagnostics to improve performance
This cookbook shows you how to test and improve your AI chat agents using Future AGI’s simulation platform. You’ll learn how to:
- Run Chat Simulations - Test your agent across multiple scenarios simultaneously
- Analyze Performance - Get comprehensive metrics and evaluation results
- Use Fix My Agent - Receive AI-powered diagnostics and actionable improvement suggestions
By the end of this guide, you’ll be able to simulate conversations at scale, identify issues automatically, and implement fixes to optimize your agent’s performance.
Note
Prerequisites: Before running this cookbook, make sure you have:
- Created an agent definition in the Future AGI platform
- Created scenarios for chat-type simulations (not voice type)
- Created a Run Test configuration with evaluations and requirements
New to simulations? Check out our Simulation Overview first.
1. Installation
First, let’s install the required dependencies for chat simulation.
pip install agent-simulate litellm futureagi
These packages provide:
- agent-simulate: The core SDK for simulating conversations with AI agents
- litellm: A unified interface for calling multiple LLM providers
- futureagi: The Future AGI platform SDK for managing prompts and evaluations
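To confirm the install succeeded before going further, a quick import check can help. This is an illustrative sketch; the module names (`litellm`, and the `fi` package shipped by the Future AGI SDK) are taken from the imports used later in this cookbook:

```python
import importlib.util

def find_missing(module_names):
    """Return the subset of module names that cannot be imported."""
    return [n for n in module_names if importlib.util.find_spec(n) is None]

# Module names assumed from the imports used later in this cookbook:
# `litellm`, and the Future AGI SDK's `fi` package.
missing = find_missing(["litellm", "fi"])
if missing:
    print(f"Missing packages: {missing} — re-run the pip install above.")
else:
    print("All required packages are importable.")
```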
2. Import Required Libraries
Import all the necessary modules for the simulation:
from fi.simulate import TestRunner, AgentInput, AgentResponse
from fi.prompt.client import Prompt
import litellm
import os
from typing import Union
from getpass import getpass
3. Setup API Keys
Configure your API keys to connect to the AI services. You’ll need:
- Future AGI API keys for accessing the platform
- LLM provider API key (e.g., OpenAI, Gemini, Anthropic) for the agent’s model
Note
Uncomment the provider you’ll be using. For example, if using GPT models, uncomment the OPENAI_API_KEY line.
# Setup your API keys
os.environ["FI_API_KEY"] = getpass("Enter your Future AGI API key: ")
os.environ["FI_SECRET_KEY"] = getpass("Enter your Future AGI Secret key: ")
os.environ["GEMINI_API_KEY"] = getpass("Enter your GEMINI API key: ")
# os.environ["OPENAI_API_KEY"] = getpass("Enter your OpenAI API key (optional): ")
# os.environ["ANTHROPIC_API_KEY"] = getpass("Enter your Anthropic API key (optional): ")
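If you run this outside a notebook (for example in CI), prompting with `getpass` for every key is inconvenient. A hedged alternative is to fall back to existing environment variables and only prompt when a key is absent. The helper below is our own convenience function, not part of any SDK:

```python
import os
from getpass import getpass

def ensure_key(name: str) -> str:
    """Return the env var if already set; otherwise prompt for it once and cache it."""
    value = os.environ.get(name)
    if not value:
        value = getpass(f"Enter {name}: ")
        os.environ[name] = value
    return value

# ensure_key("FI_API_KEY")
# ensure_key("FI_SECRET_KEY")
```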
4. Define Prompt Template and Run Test
Before running the simulation, you need to define:
- Prompt Template: The system prompt and configuration for your chat agent
- Run Test Name: The test configuration created in the Future AGI platform
Create a Prompt Template
Navigate to the Prompt Workbench and:
- Click on “Create Prompt”
- Choose a label (production, staging, or development)
- Name your template (e.g., “Customer_support_agent”)
Tip
Pro Tip: Use labels to organize different versions of your prompts and easily deploy them to production.
5. Configure and Fetch Agent
Now let’s set up an interactive configuration to fetch your agent’s prompt and create the simulation agent.
import ipywidgets as widgets
from IPython.display import display, clear_output
import asyncio
# --- 1. UI Setup (Widgets) ---
style = {'description_width': '150px'}
layout = widgets.Layout(width='500px')
header = widgets.HTML("<h3>🚀 Configure Simulation</h3>")
w_template_name = widgets.Text(
    value="Customer_support_agent",
    description="Prompt Template Name:",
    placeholder="e.g., Deliverysupportagent",
    style=style, layout=layout
)
w_label = widgets.Dropdown(
    options=["production", "staging", "development"],
    value="production",
    description="Environment Label:",
    style=style, layout=layout
)
w_run_name = widgets.Text(
    value="Chat test",
    description="Run Name:",
    style=style, layout=layout
)
w_concurrency = widgets.BoundedIntText(
    value=5,
    min=1, max=50,
    description="Concurrency:",
    style=style, layout=layout
)
btn_load = widgets.Button(
    description="Fetch Prompt & Create Agent",
    button_style='primary',
    layout=widgets.Layout(width='500px', margin='20px 0px 0px 0px'),
    icon='cloud-download'
)
out_log = widgets.Output(layout={'border': '1px solid #ddd', 'padding': '10px', 'margin': '20px 0px 0px 0px'})
Create the Agent Function
Define a function that creates your AI agent using LiteLLM:
def create_litellm_agent(system_prompt: str = None, model: str = "gpt-4o-mini"):
    """Creates the AI agent function using LiteLLM."""
    async def agent_function(input_data) -> str:
        messages = []

        # Add system prompt
        if system_prompt:
            messages.append({"role": "system", "content": system_prompt})

        # Add conversation history
        if hasattr(input_data, 'messages'):
            for msg in input_data.messages:
                content = msg.get("content", "")
                if not content:
                    continue
                role = msg.get("role", "user")
                if role not in ["user", "assistant", "system"]:
                    role = "user"
                messages.append({"role": role, "content": content})

        # Add new message
        if hasattr(input_data, 'new_message') and input_data.new_message:
            content = input_data.new_message.get("content", "")
            if content:
                messages.append({"role": "user", "content": content})

        # Call LiteLLM
        try:
            response = await litellm.acompletion(
                model=model,
                messages=messages,
                temperature=0.2,
            )
            if response and response.choices:
                return response.choices[0].message.content or ""
        except Exception as e:
            return f"Error generating response: {str(e)}"
        return ""

    return agent_function
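The message-assembly logic inside agent_function can be factored into a pure helper and unit-tested without making any LLM call. The sketch below mirrors the role-normalization rules above; the helper name `build_messages` is ours, not part of the SDK:

```python
def build_messages(system_prompt, history, new_message):
    """Assemble a chat-completion message list, mirroring agent_function above."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    for msg in history or []:
        content = msg.get("content", "")
        if not content:
            continue  # skip empty turns
        role = msg.get("role", "user")
        if role not in ("user", "assistant", "system"):
            role = "user"  # normalize unknown roles, as agent_function does
        messages.append({"role": role, "content": content})
    if new_message and new_message.get("content"):
        messages.append({"role": "user", "content": new_message["content"]})
    return messages
```

Testing this helper in isolation lets you verify, for example, that unknown roles are coerced to "user" and empty turns are dropped before wiring it into a live agent.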
Fetch Prompt and Configure Agent
def on_load_click(b):
    with out_log:
        clear_output()
        print("⏳ Connecting to Future AGI platform...")

        # Make variables available to other cells
        global agent_callback, concurrency, run_test_name

        # Update global config variables from widgets
        concurrency = w_concurrency.value
        run_test_name = w_run_name.value
        current_template = w_template_name.value
        current_label = w_label.value

        try:
            # 1. Fetch Prompt
            if current_label:
                prompt_obj = Prompt.get_template_by_name(current_template, label=current_label)
            else:
                prompt_obj = Prompt.get_template_by_name(current_template)
            print(f"✅ Successfully fetched: '{current_template}' ({current_label})")
            prompt_template = prompt_obj.template

            # 2. Extract Model
            model_name = "gpt-4o-mini"  # Default
            if hasattr(prompt_template, 'model_configuration') and prompt_template.model_configuration:
                if hasattr(prompt_template.model_configuration, 'model_name'):
                    model_name = prompt_template.model_configuration.model_name
            print(f"   ⚙️ Model: {model_name}")

            # 3. Extract System Prompt
            system_prompt = None

            # Check messages list
            if hasattr(prompt_template, 'messages') and prompt_template.messages:
                for msg in prompt_template.messages:
                    # Handle dict or object
                    role = msg.get('role') if isinstance(msg, dict) else getattr(msg, 'role', '')
                    content = msg.get('content') if isinstance(msg, dict) else getattr(msg, 'content', '')
                    if role == 'system':
                        system_prompt = content
                        break

            # Fallback: Try compiling
            if not system_prompt:
                try:
                    client = Prompt(template=prompt_template)
                    compiled = client.compile()
                    if compiled and isinstance(compiled, list):
                        for msg in compiled:
                            if isinstance(msg, dict) and msg.get('role') == 'system':
                                system_prompt = msg.get('content', '')
                                break
                except Exception:
                    pass

            if not system_prompt:
                system_prompt = ""
                print("   ℹ️ No system prompt found (using empty).")
            else:
                preview = system_prompt[:50] + "..." if len(system_prompt) > 50 else system_prompt
                print(f"   📝 System Prompt loaded: \"{preview}\"")

            # 4. Create Agent
            agent_callback = create_litellm_agent(
                system_prompt=system_prompt,
                model=model_name
            )

            print("\n🎉 Agent created successfully! You can now run the simulation.")
            print("---------------------------------------------------------------")

        except NameError:
            print("❌ Error: 'Prompt' or 'litellm' library not defined. Please ensure previous setup cells were run.")
        except Exception as e:
            print(f"❌ Error fetching prompt: {e}")
            print("   Please check your API keys and Prompt Name.")
# --- 3. Display ---
btn_load.on_click(on_load_click)
ui = widgets.VBox([
    header,
    w_template_name,
    w_label,
    w_run_name,
    w_concurrency,
    btn_load,
    out_log
])
display(ui)
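The system-prompt extraction step above deliberately tolerates both dict and object messages. That same pattern, isolated for testing (a sketch of our own; `extract_system_prompt` is not an SDK API):

```python
def extract_system_prompt(messages):
    """Return the first system message's content from dict or object messages, else None."""
    for msg in messages or []:
        role = msg.get("role") if isinstance(msg, dict) else getattr(msg, "role", "")
        content = msg.get("content") if isinstance(msg, dict) else getattr(msg, "content", "")
        if role == "system":
            return content
    return None
```

Keeping this logic in a small helper makes it easy to verify the dict/object duck-typing against whatever shape your prompt templates actually return.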
6. Run the Simulation
Now run the simulation with your configured agent and test scenarios:
print(f"\n🚀 Starting simulation: '{run_test_name}'")
print(f" Concurrency: {concurrency} conversations at a time")
print(f" This may take a few minutes...\n")
# Initialize the test runner
runner = TestRunner(
    api_key=os.environ["FI_API_KEY"],
    secret_key=os.environ["FI_SECRET_KEY"],
)

# Run the simulation
report = await runner.run_test(
    run_test_name=run_test_name,
    agent_callback=agent_callback,
    concurrency=concurrency,
)
print("\n✅ Simulation completed!")
print(f" Total conversations: {len(report.results) if hasattr(report, 'results') else 'N/A'}")
print(f"\n📊 View detailed results in your Future AGI dashboard:")
print(f" https://app.futureagi.com")
Understanding the Results
The simulation will:
- Execute multiple test conversations concurrently
- Test your agent against predefined scenarios
- Generate a comprehensive report with metrics
- Upload results to your Future AGI dashboard
Note
What’s Next? Now that you have simulation results, it’s time to analyze them and improve your agent. Instead of manually reviewing hundreds of data points, let AI do the heavy lifting with Fix My Agent.
7. Fix My Agent - Get Instant Diagnostics
Once your simulation completes, you’ll see a comprehensive dashboard with performance metrics and evaluation results. But here’s where it gets powerful: instead of manually analyzing data and debugging issues yourself, click the Fix My Agent button to get AI-powered diagnostics and actionable recommendations in seconds.
How Fix My Agent Works
After analyzing your simulation results, Fix My Agent:
- Analyzes: Reviews all conversations against your evaluation criteria and performance metrics
- Identifies: Pinpoints specific issues like latency bottlenecks, response quality problems, or conversation flow issues
- Prioritizes: Ranks suggestions by impact (High/Medium/Low priority)
- Recommends: Provides clear, actionable fixes you can implement immediately
- Generates: Optionally creates optimized system prompts you can copy directly into your setup
Tip
Most teams see significant improvements by simply implementing the high-priority suggestions from Fix My Agent. It’s like having an AI expert review your agent’s performance and tell you exactly what to fix.
Key Features
Concurrent Testing
Run multiple conversations simultaneously to test at scale
Scenario-Based Testing
Test against predefined scenarios and edge cases
Automatic Evaluation
Get instant feedback on agent performance metrics
Fix My Agent
AI-powered diagnostics and actionable improvement recommendations
Best Practices
- Start Small: Begin with a low concurrency value (e.g., 5) and increase gradually
- Diverse Scenarios: Create test scenarios covering various user intents and edge cases
- Use Fix My Agent: After each simulation, check Fix My Agent for improvement suggestions
- Iterative Testing: Implement fixes, then re-run simulations to track improvements
- Monitor Metrics: Pay attention to evaluation metrics like task completion, tone, and response quality
- Use Labels: Leverage environment labels (dev, staging, production) to manage prompt versions
Troubleshooting
Connection Errors
Ensure all API keys are correctly set and have proper permissions. Check your internet connection and firewall settings.
Prompt Not Found
Verify the prompt template name and label exist in your Future AGI dashboard. Names are case-sensitive.
Simulation Timeout
Reduce the concurrency value or check if your agent is taking too long to respond. Consider optimizing your prompt or model selection.
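If individual conversations are hanging rather than merely slow, you can also wrap your agent callback with a per-call timeout before passing it to TestRunner. This is a generic asyncio sketch; the 30-second default and the fallback message are our choices, not SDK behavior:

```python
import asyncio

def with_timeout(agent_fn, seconds: float = 30.0):
    """Wrap an async agent callback so slow responses fail fast with a fallback reply."""
    async def wrapped(input_data) -> str:
        try:
            return await asyncio.wait_for(agent_fn(input_data), timeout=seconds)
        except asyncio.TimeoutError:
            return "Sorry, the response timed out. Please try again."
    return wrapped

# Usage: agent_callback = with_timeout(agent_callback, seconds=30)
```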
Model Errors
Ensure the LLM provider API key is valid and the model name is correct. Some models may require specific API access.
Next Steps
Fix My Agent Guide
Deep dive into Fix My Agent features and optimization
Voice Simulation
Learn how to simulate voice conversations
Advanced Evaluations
Master advanced evaluation techniques
Simulation Documentation
Read the detailed simulation documentation
Conclusion
You’ve now learned how to simulate and improve your AI chat agents using the Future AGI platform. This powerful workflow helps you:
- Test at Scale: Run multiple concurrent simulations across diverse scenarios
- Get Instant Diagnostics: Use Fix My Agent to identify issues automatically
- Implement Fixes Fast: Follow actionable recommendations to improve quality
- Iterate Confidently: Validate improvements before deploying to production
- Maintain Quality: Continuously monitor and optimize agent performance
The combination of simulation testing and AI-powered diagnostics ensures your agents deliver high-quality interactions in production.
For more information, visit the Future AGI Documentation or join our community forum.