Chat Simulation: Run Multi-Persona Conversations via SDK

Use FutureAGI's Chat Simulation feature to define personas, generate scenarios, execute multi-turn conversations via the SDK, and diagnose failures with Fix My Agent.

📝 TL;DR

Chat Simulation lets you define agent profiles, create diverse personas, auto-generate test scenarios, run multi-turn conversations via the SDK, and diagnose failures with Fix My Agent.

| Time | Difficulty | Package |
| --- | --- | --- |
| 25 min | Intermediate | agent-simulate |
Prerequisites

Install

pip install agent-simulate openai
export FI_API_KEY="your-api-key"
export FI_SECRET_KEY="your-secret-key"
export OPENAI_API_KEY="your-openai-api-key"
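Before running any simulation code, it can save a confusing auth error to confirm all three credentials are actually set. A small illustrative sanity check (not part of the SDK):

```python
import os

REQUIRED_VARS = ("FI_API_KEY", "FI_SECRET_KEY", "OPENAI_API_KEY")

def missing_env_vars(env=os.environ, required=REQUIRED_VARS):
    """Return the names of required credentials that are unset or empty."""
    return [name for name in required if not env.get(name)]
```

Call `missing_env_vars()` at the top of your script and fail fast if it returns anything.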

Key concepts

  • Agent Definition — A versioned profile of your agent: its type (Chat/Voice), system prompt, model, and optional knowledge base. Each version gets a commit message for tracking.
  • Persona — A simulated user with configurable personality, communication style, tone, and quirks (typos, slang, verbosity). Personas stress-test your agent from different user perspectives.
  • Scenario — A test case describing a situation the persona will act out (e.g., “customer wants to return a laptop”). Scenarios are auto-generated from your agent definition and persona set.
  • Simulation — A run that pairs your agent definition with scenarios and evaluations, then executes multi-turn conversations via the SDK.
  • Fix My Agent — A diagnostic tool that analyzes simulation results and surfaces actionable recommendations to improve your agent’s prompt and behavior.

Tutorial

Create an agent definition

Go to app.futureagi.com → Simulate → Agent Definition → Create agent definition.

The creation wizard has three steps:

Step 1: Basic Info

| Field | Value |
| --- | --- |
| Agent type | Chat |
| Agent name | customer-support-bot |
| Select language | English |

Step 2: Configuration

For Chat agents, the only field is Model Used: select your LLM (e.g. gpt-4o-mini). This step is optional.

Step 3: Behaviour

| Field | Value |
| --- | --- |
| Prompt / Chains | You are a helpful customer support agent for TechStore. You assist customers with orders, returns, and product questions. Always be professional and solution-oriented. |
| Knowledge Base | (optional) Select a KB if you want grounded responses |
| Commit Message | Initial support agent prompt |

Click Create to save the agent definition as v1.

Tip

To iterate on your agent’s prompt later, open the agent definition and click Create new version. Each version gets a commit message for tracking. You can select which version to use when running simulations.

Create personas

Go to Simulate → Personas → Create your own persona.

Each persona has sections for Basic Info, Behavioural Settings, Chat Settings, Custom Properties, and Additional Instructions.

Create these three personas (select type Chat for each):

cooperative-customer

| Section | Field | Value |
| --- | --- | --- |
| Basic Info | Name | cooperative-customer |
| Basic Info | Description | A patient, friendly customer who provides clear information and follows instructions. |
| Behavioural | Personality | Friendly and cooperative |
| Behavioural | Communication Style | Direct and concise |
| Chat Settings | Tone | neutral |
| Chat Settings | Verbosity | balanced |
| Chat Settings | Typo Level | none |

frustrated-customer

| Section | Field | Value |
| --- | --- | --- |
| Basic Info | Name | frustrated-customer |
| Basic Info | Description | An impatient customer who has already contacted support once. Wants a fast resolution. |
| Behavioural | Personality | Impatient and direct |
| Behavioural | Communication Style | Assertive |
| Chat Settings | Tone | casual |
| Chat Settings | Verbosity | brief |
| Chat Settings | Typo Level | occasional |

confused-customer

| Section | Field | Value |
| --- | --- | --- |
| Basic Info | Name | confused-customer |
| Basic Info | Description | A non-technical user unsure what information to provide. Needs guidance. |
| Behavioural | Personality | Anxious |
| Behavioural | Communication Style | Questioning |
| Chat Settings | Tone | casual |
| Chat Settings | Verbosity | detailed |
| Chat Settings | Typo Level | rare |

Tip

All persona options:

  • Personality (12 options): Friendly and cooperative, Professional and formal, Cautious and skeptical, Impatient and direct, Detail-oriented, Easy-going, Anxious, Confident, Analytical, Emotional, Reserved, Talkative
  • Communication Style (10 options): Direct and concise, Detailed and elaborate, Casual and friendly, Formal and polite, Technical, Simple and clear, Questioning, Assertive, Passive, Collaborative
  • Chat Settings: Tone (formal / neutral / casual), Verbosity (brief / balanced / detailed), Regional Mix (none / light / moderate / heavy), Slang Level (none / light / moderate / heavy), Typo Level (none / rare / occasional / frequent), Punctuation Style (clean / minimal / expressive / erratic), Emoji Frequency (never / light / regular / heavy)

You can also set Custom Properties (key-value pairs) and Additional Instructions (free text) for more nuanced behavior.
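If you keep persona definitions under version control alongside your agent code, it helps to encode the Chat Settings enumerations above so out-of-range values are caught before you paste them into the UI. A hypothetical validation helper (not part of the agent-simulate SDK; the setting keys are illustrative snake_case names for the fields above):

```python
# Hypothetical helper encoding the Chat Settings option lists from this
# guide; not part of the agent-simulate SDK.
CHAT_SETTING_OPTIONS = {
    "tone": {"formal", "neutral", "casual"},
    "verbosity": {"brief", "balanced", "detailed"},
    "regional_mix": {"none", "light", "moderate", "heavy"},
    "slang_level": {"none", "light", "moderate", "heavy"},
    "typo_level": {"none", "rare", "occasional", "frequent"},
    "punctuation_style": {"clean", "minimal", "expressive", "erratic"},
    "emoji_frequency": {"never", "light", "regular", "heavy"},
}

def validate_chat_settings(settings):
    """Return human-readable errors for unknown keys or out-of-range values."""
    errors = []
    for key, value in settings.items():
        allowed = CHAT_SETTING_OPTIONS.get(key)
        if allowed is None:
            errors.append(f"unknown setting: {key}")
        elif value not in allowed:
            errors.append(f"{key}={value!r} not in {sorted(allowed)}")
    return errors
```

For example, `validate_chat_settings({"tone": "neutral", "typo_level": "none"})` returns an empty list, while a typo like `"tone": "nuetral"` produces an error message.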

Create a scenario

Go to Simulate → Scenarios → Create New Scenario.

Scenarios define the test cases your personas will run against your agent. There are four scenario types:

| Type | Use case |
| --- | --- |
| Workflow builder | Auto-generate or manually build conversation flows |
| Import datasets | Use structured data (CSV, JSON, Excel) as test cases |
| Upload script | Import existing conversation scripts |
| Call / Chat SOP | Define standard operating procedures for testing |
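If you go the Import datasets route instead, your test cases come from a structured file. A minimal sketch of generating such a CSV with the standard library; the column names here (`scenario`, `order_id`, `category`) are illustrative assumptions, not a schema mandated by the platform:

```python
import csv
import io

# Illustrative test cases for an "Import datasets" scenario. The column
# names are assumptions for this sketch, not a required platform schema.
rows = [
    {"scenario": "Return a cracked laptop", "order_id": "ORD-1001", "category": "returns"},
    {"scenario": "Ask about warranty coverage", "order_id": "ORD-1002", "category": "product"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["scenario", "order_id", "category"])
writer.writeheader()
writer.writerows(rows)
csv_text = buf.getvalue()  # upload this file's contents in the wizard
```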

For this guide, select Workflow builder and fill in:

| Field | Value |
| --- | --- |
| Scenario Name | order-return-request |
| Description | Customer wants to return a laptop with a cracked screen. Has order number but hasn’t initiated a return yet. |
| Choose source | Select customer-support-bot (Agent Definition) |
| Choose version | v1 |
| No. of scenarios | 20 |

Attach personas: In the Persona section, leave the Add by default toggle on — this auto-adds all active personas to your scenarios. Alternatively, turn the toggle off and click Add persona to manually select specific personas.

Click Create.

Tip

You can also add Columns (custom inputs like order IDs, product names, or issue categories) to generate more varied scenario data. Use the Custom Instructions toggle to provide extra context for scenario generation beyond the agent definition.

Configure the simulation

Go to Simulate → Run Simulation → Create a Simulation.

The creation wizard has four steps:

Step 1: Add simulation details

| Field | Value |
| --- | --- |
| Simulation name | return-flow-test |
| Choose Agent definition | customer-support-bot |
| Choose version | v1 |
| Description | Testing return flow with 3 customer personas |

Step 2: Choose Scenario(s)

Select the order-return-request scenario from the list. You can search and select multiple scenarios.

Step 3: Select Evaluations

Click Add Evaluations and under Groups, select Conversational agent evaluation for broad coverage. This group includes 10 built-in evals:

  • customer_agent_loop_detection
  • customer_agent_context_retention
  • customer_agent_query_handling
  • customer_agent_termination_handling
  • customer_agent_conversation_quality
  • customer_agent_objection_handling
  • customer_agent_language_handling
  • customer_agent_human_escalation
  • customer_agent_clarification_seeking
  • customer_agent_prompt_conformance

If your agent uses tool calling, toggle Enable tool call evaluation. The platform will automatically evaluate every tool invocation made during the simulation and show Pass/Fail results as additional columns in the results grid (e.g., “check_order_status #1”) with reasoning; no extra code needed.

Step 4: Summary

Review your simulation configuration — agent definition, scenarios, and evaluations — then click Run Simulation to create the simulation.

After the simulation is created, the platform shows SDK instructions with a code snippet to run the simulation. Chat simulations are executed via the SDK — copy the code and proceed to the next step.

Run the simulation via SDK

Chat simulations require the SDK to execute. The platform generates a code snippet after you create the simulation; replace the placeholder agent with your real agent logic.

import asyncio
import os
import openai
from fi.simulate import TestRunner, AgentInput

openai_client = openai.AsyncOpenAI()

SYSTEM_PROMPT = """You are a helpful customer support agent for TechStore.
You assist customers with orders, returns, and product questions.
Always be professional, empathetic, and solution-oriented.
If you cannot resolve an issue, offer to escalate to a human agent."""


async def agent_callback(input: AgentInput) -> str:
    # Build the full conversation history for context
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    messages.extend(input.messages)

    response = await openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        temperature=0.2,
    )
    return response.choices[0].message.content or ""


async def main():
    runner = TestRunner(
        api_key=os.environ["FI_API_KEY"],
        secret_key=os.environ["FI_SECRET_KEY"],
    )

    report = await runner.run_test(
        run_test_name="return-flow-test",
        agent_callback=agent_callback,
    )

    print(f"Simulation finished! Processed {len(report.results)} test cases")


asyncio.run(main())

Expected output:

🔍 Fetching Run Test ID for name: return-flow-test
✓ Found Run Test ID: <uuid>
Starting Simulation for Run ID: <uuid>
✓ Test Execution Started: <uuid>
🔄 Fetching batch of scenarios...
📥 Received batch: 20 calls
▶️ Processing Call: <uuid>
✓ Call Finished: <uuid> (6 turns)
...
✅ Cloud Simulation Completed.
Simulation finished! Processed 20 test cases

Warning

The run_test_name value must exactly match the simulation name you entered in Step 1 of the simulation wizard (e.g. return-flow-test). A mismatch returns a 404 from the platform.

Tip

Your agent_callback receives an AgentInput with thread_id, messages (full history), and new_message (latest turn). Return a plain str or an AgentResponse for tool-calling scenarios. Pre-built wrappers are available: OpenAIAgentWrapper, LangChainAgentWrapper, GeminiAgentWrapper, AnthropicAgentWrapper.
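Because input.messages carries the full history, the prompt you send to your model grows with every turn of a long simulation. A minimal sketch of bounding context by keeping only the most recent turns (an illustrative helper, not part of the SDK):

```python
def trim_history(messages, max_turns=10):
    """Return the most recent `max_turns` messages from the history.

    `messages` is the list of {"role": ..., "content": ...} dicts from
    AgentInput.messages. The system prompt is prepended separately in the
    callback above, so it is never trimmed here.
    """
    if max_turns <= 0:
        return []
    return messages[-max_turns:]
```

In agent_callback you would then write `messages.extend(trim_history(input.messages))` instead of appending the full history.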

Review results and Fix My Agent

Once the simulation completes, go to Simulate → Run Simulation → open return-flow-test. The results page shows three tabs:

  • Chat Details: per-conversation transcripts, CSAT scores, and evaluation scores
  • Analytics: evaluation score distributions and trends
  • Optimization Runs: results from prompt optimization runs

Fix My Agent: Click the Fix My Agent button (top-right) to open the diagnostic drawer. The platform analyzes your simulation traces and surfaces two categories of recommendations, plus an overall synthesis:

  • Fixable Recommendations: organized into two tabs:
    • Agent Level: prompt and behavior improvements you can apply directly (e.g. missing empathy phrases, unclear escalation paths)
    • Branch Level: domain-specific issues grouped by conversation topic or flow (e.g. return policy gaps, billing confusion). Each recommendation highlights which specific calls are affected, so you can trace issues back to exact conversations.
  • Non-Fixable Recommendations: system-level issues that require infrastructure changes (e.g. missing integrations, data access limitations), plus a human comparison summary showing where a human agent would have handled the situation differently.
  • Overall Insights: a synthesis of patterns across all calls.

Optimize My Agent: Inside the Fix My Agent drawer, click Optimize My Agent to generate improved prompt variants automatically:

  1. Enter a Name for the optimization run
  2. Choose Optimizer — select from available optimizers (e.g. Bayesian Search, MetaPrompt, ProTeGi, GEPA, PromptWizard, Random Search)
  3. Language Model — select the model for optimization
  4. Click Start Optimizing your agent

Optimization results appear in the Optimization Runs tab. Review the generated prompt variants and their scores to decide which version to promote.

Tip

For reliable Fix My Agent suggestions, run at least 15 conversations and include as many evaluations as practical (minimum: 1).

What you built

You can now simulate multi-persona chat conversations, evaluate agent quality with built-in metrics, and diagnose failures with Fix My Agent.

  • Created a chat agent definition via the 3-step wizard (Basic Info → Configuration → Behaviour), with version tracking
  • Built 3 distinct personas with personality, communication style, and chat-specific settings
  • Generated test scenarios using the Workflow builder with auto-attached personas
  • Configured a simulation with the Conversational agent evaluation group (10 built-in evals)
  • Ran the simulation via SDK with TestRunner and a custom agent_callback
  • Reviewed results across Chat Details, Analytics, and Optimization Runs tabs
  • Used Fix My Agent to surface failure patterns and Optimize My Agent to generate improved prompts
