Simulate AI chat agents at scale and get instant AI-powered diagnostics to improve performance
This cookbook shows you how to test and improve your AI chat agents using Future AGI’s simulation platform. You’ll learn how to:
Run Chat Simulations - Test your agent across multiple scenarios simultaneously
Analyze Performance - Get comprehensive metrics and evaluation results
Use Fix My Agent - Receive AI-powered diagnostics and actionable improvement suggestions
By the end of this guide, you’ll be able to simulate conversations at scale, identify issues automatically, and implement fixes to optimize your agent’s performance.
Prerequisites: Before running this cookbook, make sure you have:
Created an agent definition in the Future AGI platform
Created scenarios for chat-type simulations (not voice type)
Created a Run Test configuration with evaluations and requirements
Configure your API keys to connect to the AI services. You’ll need:
Future AGI API keys for accessing the platform
LLM provider API key (e.g., OpenAI, Gemini, Anthropic) for the agent’s model
Uncomment the provider you’ll be using. For example, if using GPT models, uncomment the OPENAI_API_KEY line.
import os
from getpass import getpass

# Set up your API keys
os.environ["FI_API_KEY"] = getpass("Enter your Future AGI API key: ")
os.environ["FI_SECRET_KEY"] = getpass("Enter your Future AGI Secret key: ")
os.environ["GEMINI_API_KEY"] = getpass("Enter your Gemini API key: ")
# os.environ["OPENAI_API_KEY"] = getpass("Enter your OpenAI API key (optional): ")
# os.environ["ANTHROPIC_API_KEY"] = getpass("Enter your Anthropic API key (optional): ")
Define a function that creates your AI agent using LiteLLM:
def create_litellm_agent(system_prompt: str = None, model: str = "gpt-4o-mini"): """Creates the AI agent function using LiteLLM.""" async def agent_function(input_data) -> str: messages = [] # Add system prompt if system_prompt: messages.append({"role": "system", "content": system_prompt}) # Add conversation history if hasattr(input_data, 'messages'): for msg in input_data.messages: content = msg.get("content", "") if not content: continue role = msg.get("role", "user") if role not in ["user", "assistant", "system"]: role = "user" messages.append({"role": role, "content": content}) # Add new message if hasattr(input_data, 'new_message') and input_data.new_message: content = input_data.new_message.get("content", "") if content: messages.append({"role": "user", "content": content}) # Call LiteLLM try: response = await litellm.acompletion( model=model, messages=messages, temperature=0.2, ) if response and response.choices: return response.choices[0].message.content or "" except Exception as e: return f"Error generating response: {str(e)}" return "" return agent_function
Now run the simulation with your configured agent and test scenarios:
print(f"\n🚀 Starting simulation: '{run_test_name}'")print(f" Concurrency: {concurrency} conversations at a time")print(f" This may take a few minutes...\n")# Initialize the test runnerrunner = TestRunner( api_key=os.environ["FI_API_KEY"], secret_key=os.environ["FI_SECRET_KEY"],)# Run the simulationreport = await runner.run_test( run_test_name=run_test_name, agent_callback=agent_callback, concurrency=concurrency,)print("\n✅ Simulation completed!")print(f" Total conversations: {len(report.results) if hasattr(report, 'results') else 'N/A'}")print(f"\n📊 View detailed results in your Future AGI dashboard:")print(f" https://app.futureagi.com")
What’s Next? Now that you have simulation results, it’s time to analyze them and improve your agent. Instead of manually reviewing hundreds of data points, let AI do the heavy lifting with Fix My Agent.
Once your simulation completes, you’ll see a comprehensive dashboard with performance metrics and evaluation results. But here’s where it gets powerful: instead of manually analyzing data and debugging issues yourself, click the Fix My Agent button to get AI-powered diagnostics and actionable recommendations in seconds.
When you run it on a completed simulation, Fix My Agent:
Analyzes: Reviews all conversations against your evaluation criteria and performance metrics
Identifies: Pinpoints specific issues like latency bottlenecks, response quality problems, or conversation flow issues
Prioritizes: Ranks suggestions by impact (High/Medium/Low priority)
Recommends: Provides clear, actionable fixes you can implement immediately
Generates: Optionally creates optimized system prompts you can copy directly into your setup
Most teams see significant improvements by simply implementing the high-priority suggestions from Fix My Agent. It’s like having an AI expert review your agent’s performance and tell you exactly what to fix.
You’ve now learned how to simulate and improve your AI chat agents using the Future AGI platform. This powerful workflow helps you:
Test at Scale: Run multiple concurrent simulations across diverse scenarios
Get Instant Diagnostics: Use Fix My Agent to identify issues automatically
Implement Fixes Fast: Follow actionable recommendations to improve quality
Iterate Confidently: Validate improvements before deploying to production
Maintain Quality: Continuously monitor and optimize agent performance
The combination of simulation testing and AI-powered diagnostics helps your agents deliver high-quality interactions in production. For more information, visit the Future AGI Documentation or join our community forum.