What it does
- Runs an existing Run Test (configured in the Future AGI UI) in chat mode
- For each conversation, the simulator sends chat messages and calls your agent callback to get responses
- Stores transcripts + results in your Future AGI dashboard
Before you start (UI setup)
Chat simulation uses the same high-level building blocks as voice simulation, but some fields are chat-specific.
- Agent Definition (Chat): Create your agent definition as chat. Voice-only fields like phone number aren’t required for chat tests. See Agent Definition.
- Personas (Chat): Persona “voice” settings (accent, background noise, speaking speed) are voice-only; for chat, focus on tone, behavior, and custom properties. See Personas.
- Scenarios (Chat): Create scenarios that represent chat conversations (dataset/workflow/script/SOP). See Scenarios.
- Run Tests: Create a Run Test that links your chat agent + scenarios. You’ll reference the Run Test name from the SDK. See Run Tests.
Requirements
- Python 3.10+
- FI_API_KEY and FI_SECRET_KEY from Future AGI
- A created Run Test (chat) in the Future AGI UI
- If your callback uses an LLM provider: the relevant provider key (e.g. OPENAI_API_KEY, ANTHROPIC_API_KEY, GOOGLE_API_KEY, etc.)
Colab example
You can run the full notebook here: Chat Simulate Testing.ipynb
Install
```bash
pip install agent-simulate litellm
```
Quick start (cloud chat simulation)
To run a chat simulation, you need to:
- Define an agent_callback (your chat agent)
- Call run_test for an existing Run Test you created in the UI
```python
from fi.simulate import TestRunner, AgentInput, AgentResponse
import litellm
import os
from typing import Union
import asyncio

# ---- Auth (Future AGI) ----
# You can also set these as environment variables in your shell.
FI_API_KEY = os.environ.get("FI_API_KEY", "<YOUR_FI_API_KEY>")
FI_SECRET_KEY = os.environ.get("FI_SECRET_KEY", "<YOUR_FI_SECRET_KEY>")

# If you use a provider model via LiteLLM, set the relevant key:
# os.environ["OPENAI_API_KEY"] = "..."
# os.environ["ANTHROPIC_API_KEY"] = "..."
# os.environ["GOOGLE_API_KEY"] = "..."

# ---- Configure ----
run_test_name = "Chat test"  # must match your Run Test name in the UI
concurrency = 5

# ---- Your agent callback ----
# Replace this with your real agent (LangChain, LlamaIndex, custom app, etc.)
async def agent_callback(input: AgentInput) -> Union[str, AgentResponse]:
    user_text = (input.new_message or {}).get("content", "") or ""
    # Example using LiteLLM (works with OpenAI/Anthropic/Gemini/etc.)
    resp = await litellm.acompletion(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": user_text}],
        temperature=0.2,
    )
    return resp.choices[0].message.content or ""

async def main():
    print(f"\n🚀 Starting simulation: '{run_test_name}'")
    print(f"Concurrency: {concurrency} conversations at a time")

    runner = TestRunner(api_key=FI_API_KEY, secret_key=FI_SECRET_KEY)
    await runner.run_test(
        run_test_name=run_test_name,
        agent_callback=agent_callback,
        concurrency=concurrency,
    )

    print("\n✅ Simulation completed!")
    print("View results in the dashboard: https://app.futureagi.com")

asyncio.run(main())
```
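Note: in a notebook environment (such as the Colab example above), the kernel usually already runs an event loop, so asyncio.run(main()) can fail with a "cannot be called from a running event loop" error. In that case, await the coroutine directly in a cell:

```python
# In Colab / Jupyter, replace asyncio.run(main()) with a top-level await:
await main()
```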
If you already have your own chat agent (LangChain, LlamaIndex, custom app, etc.), keep it unchanged: just wrap it in agent_callback so the simulator can call it turn-by-turn.
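As a rough sketch (the my_agent object and its arespond method below are placeholders for your own stack), the callback can simply forward the conversation history and return your agent's reply:

```python
from typing import Union
from fi.simulate import AgentInput, AgentResponse

from my_app.agent import my_agent  # placeholder: your existing chat agent

async def agent_callback(input: AgentInput) -> Union[str, AgentResponse]:
    # Pass the full history so your agent keeps context across turns;
    # thread_id can double as a session/conversation id if your agent tracks one.
    reply = await my_agent.arespond(   # placeholder method name
        messages=input.messages,
        session_id=input.thread_id,
    )
    return reply or ""
```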
Callback contract (what the SDK sends to you)
- input.new_message: the latest simulator message you should respond to (treat it like “the user message”)
- input.messages: the conversation history so far (including that last simulator message)
- input.thread_id / input.execution_id: IDs you can use for logging / correlation (see the sketch below)
- AgentInput: what the simulator sends to your code each turn (history + latest message).
- AgentResponse: optional structured return type (content + tool calls/results). You can also just return a plain string.
- AgentWrapper: an abstract class that provides a clean pattern if you don’t want to pass a raw function as agent_callback.
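A minimal sketch of a callback that reads new_message, messages, and the correlation IDs (the logger setup is only illustrative):

```python
import logging
from fi.simulate import AgentInput

logger = logging.getLogger("chat-sim")

async def agent_callback(input: AgentInput) -> str:
    user_text = (input.new_message or {}).get("content", "") or ""
    # thread_id / execution_id let you correlate simulator turns with your own logs.
    logger.info(
        "thread=%s execution=%s history_len=%d",
        input.thread_id, input.execution_id, len(input.messages),
    )
    return f"Echo: {user_text}"
```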
SDK class reference:
```python
class AgentInput(BaseModel):
    thread_id: str
    messages: List[Dict[str, str]]
    new_message: Optional[Dict[str, str]] = None
    execution_id: Optional[str] = None

class AgentResponse(BaseModel):
    content: str
    tool_calls: Optional[List[Dict[str, Any]]] = None
    tool_responses: Optional[List[Dict[str, Any]]] = None
    metadata: Optional[Dict[str, Any]] = None

class AgentWrapper(ABC):
    @abstractmethod
    async def call(self, input: AgentInput) -> Union[str, AgentResponse]:
        pass
```
Example wrapper:
```python
from fi.simulate import AgentWrapper, AgentInput, AgentResponse
from typing import Union

class MyAgent(AgentWrapper):
    async def call(self, input: AgentInput) -> Union[str, AgentResponse]:
        user_text = (input.new_message or {}).get("content", "") or ""
        return f"You said: {user_text}"

# Usage:
# await runner.run_test(run_test_name=..., agent_callback=MyAgent(), concurrency=...)
```
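The class-based pattern is handy when your agent needs one-time setup or shared state (clients, caches, a system prompt): construct the wrapper once and pass the instance as agent_callback, as shown in the usage comment above.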
If your agent uses tools/functions, return an AgentResponse (instead of a plain string):
```python
from fi.simulate import AgentInput, AgentResponse

async def agent_callback(input: AgentInput) -> AgentResponse:
    # Example shape only; generate these from your tool-calling stack.
    return AgentResponse(
        content="Let me look that up for you.",
        tool_calls=[
            {
                "id": "call_1",
                "type": "function",
                "function": {"name": "lookup_order", "arguments": "{\"order_id\": \"123\"}"},
            }
        ],
        tool_responses=[
            {"role": "tool", "tool_call_id": "call_1", "content": "{\"status\": \"shipped\"}"},
        ],
    )
```
If you want to mock tools during a real simulation run (so you can see how your agent behaves end-to-end without calling external systems), you can stub tool outputs inside your agent_callback:

```python
import os
import json
from unittest.mock import MagicMock

from fi.simulate import AgentInput, AgentResponse

MOCK_TOOLS = os.getenv("MOCK_TOOLS", "false").lower() in ("1", "true", "yes")

async def agent_callback(input: AgentInput) -> AgentResponse:
    # 1) Ask your model to decide whether to call tools (tool_calls)
    tool_calls = [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "lookup_order", "arguments": "{\"order_id\": \"123\"}"},
        }
    ]

    # 2) In mock mode, stub tool execution via a registry (no hardcoded if/else)
    tool_responses = []
    if MOCK_TOOLS:
        # Tool registry: tool name -> callable.
        # In real mode, this would map to your actual tool implementations.
        # In mock mode, replace them with MagicMock(...) to return deterministic outputs.
        tool_registry = {
            "lookup_order": MagicMock(return_value={"status": "shipped", "order_id": "123"}),
        }
        for tc in tool_calls:
            fn = (tc.get("function") or {}).get("name")
            args = (tc.get("function") or {}).get("arguments", "{}")
            args_dict = json.loads(args) if isinstance(args, str) else (args or {})
            tool_fn = tool_registry.get(fn)
            output = tool_fn(**args_dict) if tool_fn else {"error": f"Unknown tool: {fn}"}
            tool_responses.append(
                {"role": "tool", "tool_call_id": tc["id"], "content": json.dumps(output)}
            )

    # 3) Return both the tool_calls and (mocked) tool_responses as an AgentResponse
    return AgentResponse(
        content="Let me check that for you.",
        tool_calls=tool_calls,
        tool_responses=tool_responses or None,
    )
```
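To exercise the mocked path, set the MOCK_TOOLS environment variable to true (or 1/yes) before starting the run; with it unset, tool_responses stays empty, so point the registry (or your own execution path) at real tool implementations for production-like runs.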
Where results show up
Cloud chat simulation writes results to your Future AGI dashboard. The SDK call is mainly used to:
- orchestrate runs
- call your agent_callback
- stream messages back to the simulator
Troubleshooting
- ReadError / timeouts: try increasing timeout:

```python
await runner.run_test(
    run_test_name=run_test_name,
    agent_callback=agent_callback,
    concurrency=concurrency,
    timeout=180.0,
)
```
- “Invalid status. Valid choices are …”: statuses are lowercase (pending, queued, ongoing, completed, failed, analyzing, cancelled). If you see this, it’s a backend validation message surfaced in logs and you can ignore it unless runs are stuck.
Pro tip: reuse a prompt from Future AGI
If you maintain your system prompt in Future AGI, you can fetch it and use it inside your callback. For more on prompt templates and compiling variables, see Prompt Workbench Using SDK.

```python
from fi.prompt.client import Prompt

prompt = Prompt.get_template_by_name("customer-support-agent", label="production")
prompt_template = prompt.template
```
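You can then feed the fetched template into your callback, for example as the system message. This is a minimal sketch assuming prompt_template is a plain system-prompt string (see Prompt Workbench Using SDK if your template has variables to compile); the gpt-4o-mini model name is just a placeholder:

```python
import litellm
from fi.simulate import AgentInput

async def agent_callback(input: AgentInput) -> str:
    user_text = (input.new_message or {}).get("content", "") or ""
    resp = await litellm.acompletion(
        model="gpt-4o-mini",  # placeholder; use the model your agent runs on
        messages=[
            {"role": "system", "content": prompt_template},  # fetched above
            {"role": "user", "content": user_text},
        ],
    )
    return resp.choices[0].message.content or ""
```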
Next steps
- Review the transcripts and scores in Run Tests
- Iterate on your agent callback to improve the agent’s performance