What it does

  • Runs an existing Run Test (configured in the Future AGI UI) in chat mode
  • For each conversation, the simulator sends chat messages and calls your agent callback to get responses
  • Stores transcripts + results in your Future AGI dashboard

Before you start (UI setup)

Chat simulation uses the same high-level building blocks as voice simulation, but some fields are chat-specific.
  • Agent Definition (Chat): Create your agent definition as chat. Voice-only fields like phone number aren’t required for chat tests. See Agent Definition.
  • Personas (Chat): Persona “voice” settings (accent, background noise, speaking speed) are voice-only; for chat, focus on tone, behavior, and custom properties. See Personas.
  • Scenarios (Chat): Create scenarios that represent chat conversations (dataset/workflow/script/SOP). See Scenarios.
  • Run Tests: Create a Run Test that links your chat agent + scenarios. You’ll reference the Run Test name from the SDK. See Run Tests.

Requirements

  • Python 3.10+
  • FI_API_KEY and FI_SECRET_KEY from Future AGI
  • A created Run Test (chat) in the Future AGI UI
  • If your callback uses an LLM provider: the relevant provider key (e.g. OPENAI_API_KEY, ANTHROPIC_API_KEY, GOOGLE_API_KEY, etc.)

Colab example

You can run the full notebook here: Chat Simulate Testing.ipynb

Install

pip install agent-simulate litellm
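
To sanity-check the install, a quick import of the modules used in the examples below is enough:

# Minimal import check: these are the same imports the quick start below uses
from fi.simulate import TestRunner, AgentInput, AgentResponse
import litellm

print("agent-simulate and litellm are importable")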

Quick start (cloud chat simulation)

To run a chat simulation, you need to:
  1. Define an agent_callback (your chat agent)
  2. Call run_test for an existing Run Test you created in the UI
from fi.simulate import TestRunner, AgentInput, AgentResponse
import litellm
import os
from typing import Union
import asyncio

# ---- Auth (Future AGI) ----
# You can also set these as environment variables in your shell.
FI_API_KEY = os.environ.get("FI_API_KEY", "<YOUR_FI_API_KEY>")
FI_SECRET_KEY = os.environ.get("FI_SECRET_KEY", "<YOUR_FI_SECRET_KEY>")

# If you use a provider model via LiteLLM, set the relevant key:
# os.environ["OPENAI_API_KEY"] = "..."
# os.environ["ANTHROPIC_API_KEY"] = "..."
# os.environ["GOOGLE_API_KEY"] = "..."

# ---- Configure ----
run_test_name = "Chat test"  # must match your Run Test name in the UI
concurrency = 5

# ---- Your agent callback ----
# Replace this with your real agent (LangChain, LlamaIndex, custom app, etc.)
async def agent_callback(input: AgentInput) -> Union[str, AgentResponse]:
    user_text = (input.new_message or {}).get("content", "") or ""
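    # Note: this minimal example only answers the latest message. If your agent
    # needs earlier turns, build the provider messages from input.messages
    # (the full conversation history; see "Callback contract" below).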

    # Example using LiteLLM (works with OpenAI/Anthropic/Gemini/etc.)
    resp = await litellm.acompletion(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": user_text}],
        temperature=0.2,
    )
    return resp.choices[0].message.content or ""

async def main():
    print(f"\n🚀 Starting simulation: '{run_test_name}'")
    print(f"Concurrency: {concurrency} conversations at a time")

    runner = TestRunner(api_key=FI_API_KEY, secret_key=FI_SECRET_KEY)

    await runner.run_test(
        run_test_name=run_test_name,
        agent_callback=agent_callback,
        concurrency=concurrency,
    )

    print("\n✅ Simulation completed!")
    print("View results in the dashboard: https://app.futureagi.com")

asyncio.run(main())

If you already have your own chat agent (LangChain, LlamaIndex, custom app, etc.), keep it unchanged: just wrap it in agent_callback so the simulator can call it turn-by-turn.
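
For example, a thin wrapper around an existing agent can stay very small (a minimal sketch; MyExistingAgent and its ainvoke method are placeholders for your own code):

from fi.simulate import AgentInput, AgentResponse
from typing import Union

class MyExistingAgent:
    """Placeholder for your real agent (LangChain, LlamaIndex, custom app, ...)."""

    async def ainvoke(self, history, message: str) -> str:
        # Your existing logic goes here; this stub just echoes the message.
        return f"(agent reply to) {message}"

my_agent = MyExistingAgent()

async def agent_callback(input: AgentInput) -> Union[str, AgentResponse]:
    # Translate the simulator turn into whatever your agent already expects,
    # then return its reply as-is.
    user_text = (input.new_message or {}).get("content", "") or ""
    return await my_agent.ainvoke(history=input.messages, message=user_text)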

Callback contract (what the SDK sends to you)

  • input.new_message: the latest simulator message you should respond to (treat it like “the user message”)
  • input.messages: the conversation history so far (including that last simulator message)
  • input.thread_id / input.execution_id: IDs you can use for logging / correlation
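
Putting that together, a history-aware variant of the quick-start callback could forward the whole transcript to the model (a sketch that assumes each history entry carries role/content keys, matching the AgentInput shape below):

from fi.simulate import AgentInput, AgentResponse
from typing import Union
import litellm

async def agent_callback(input: AgentInput) -> Union[str, AgentResponse]:
    # input.messages already includes the latest simulator message, so the model
    # sees the full conversation rather than just the last turn.
    history = [
        {"role": m.get("role", "user"), "content": m.get("content", "")}
        for m in (input.messages or [])
    ]

    resp = await litellm.acompletion(
        model="gpt-4o-mini",
        messages=history,
        temperature=0.2,
    )

    # thread_id / execution_id are useful for correlating your own logs with a run
    print(f"[{input.thread_id}] responding to turn {len(history)}")
    return resp.choices[0].message.content or ""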

The 3 core SDK types (AgentInput, AgentResponse, AgentWrapper)

  • AgentInput: what the simulator sends to your code each turn (history + latest message).
  • AgentResponse: optional structured return type (content + tool calls/results). You can also just return a plain string.
  • AgentWrapper: an abstract class that provides a clean pattern if you don’t want to pass a raw function as agent_callback.
SDK class reference:

class AgentInput(BaseModel):
    thread_id: str
    messages: List[Dict[str, str]]
    new_message: Optional[Dict[str, str]] = None
    execution_id: Optional[str] = None

class AgentResponse(BaseModel):
    content: str
    tool_calls: Optional[List[Dict[str, Any]]] = None
    tool_responses: Optional[List[Dict[str, Any]]] = None
    metadata: Optional[Dict[str, Any]] = None

class AgentWrapper(ABC):
    @abstractmethod
    async def call(self, input: AgentInput) -> Union[str, AgentResponse]:
        pass

Example wrapper:
from fi.simulate import AgentWrapper, AgentInput, AgentResponse
from typing import Union

class MyAgent(AgentWrapper):
    async def call(self, input: AgentInput) -> Union[str, AgentResponse]:
        user_text = (input.new_message or {}).get("content", "") or ""
        return f"You said: {user_text}"

# Usage:
# await runner.run_test(run_test_name=..., agent_callback=MyAgent(), concurrency=...)

Optional: tool calling with AgentResponse

If your agent uses tools/functions, return an AgentResponse (instead of a plain string):
from fi.simulate import AgentInput, AgentResponse

async def agent_callback(input: AgentInput) -> AgentResponse:
    # Example shape only — generate these from your tool-calling stack.
    return AgentResponse(
        content="Let me look that up for you.",
        tool_calls=[
            {
                "id": "call_1",
                "type": "function",
                "function": {"name": "lookup_order", "arguments": "{\"order_id\": \"123\"}"},
            }
        ],
        tool_responses=[
            {"role": "tool", "tool_call_id": "call_1", "content": "{\"status\": \"shipped\"}"},
        ],
    )

If you want to mock tools during a real simulation run (so you can see how your agent behaves end-to-end without calling external systems), you can stub tool outputs inside your agent_callback:
import os
import json
from fi.simulate import AgentInput, AgentResponse

MOCK_TOOLS = os.getenv("MOCK_TOOLS", "false").lower() in ("1", "true", "yes")

async def agent_callback(input: AgentInput) -> AgentResponse:
    # 1) Ask your model to decide whether to call tools (tool_calls)
    tool_calls = [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "lookup_order", "arguments": "{\"order_id\": \"123\"}"},
        }
    ]

    # 2) In mock mode, stub tool execution via a registry (no hardcoded if/else)
    tool_responses = []
    if MOCK_TOOLS:
        from unittest.mock import MagicMock

        # Tool registry: tool name -> callable
        # In real mode, this would map to your actual tool implementations.
        # In mock mode, replace them with MagicMock(...) to return deterministic outputs.
        tool_registry = {
            "lookup_order": MagicMock(return_value={"status": "shipped", "order_id": "123"}),
        }

        for tc in tool_calls:
            fn = (tc.get("function") or {}).get("name")
            args = (tc.get("function") or {}).get("arguments", "{}")
            args_dict = json.loads(args) if isinstance(args, str) else (args or {})

            tool_fn = tool_registry.get(fn)
            output = tool_fn(**args_dict) if tool_fn else {"error": f"Unknown tool: {fn}"}

            tool_responses.append(
                {"role": "tool", "tool_call_id": tc["id"], "content": json.dumps(output)}
            )

    # 3) Return both the tool_calls and (mocked) tool_responses as an AgentResponse
    return AgentResponse(
        content="Let me check that for you.",
        tool_calls=tool_calls,
        tool_responses=tool_responses or None,
    )

Where results show up

Cloud chat simulation writes results to your Future AGI dashboard. The SDK call is mainly used to:
  • orchestrate runs
  • call your agent_callback
  • stream messages back to the simulator

Troubleshooting

  • ReadError / timeouts: try increasing timeout:
await runner.run_test(
    run_test_name=run_test_name,
    agent_callback=agent_callback,
    concurrency=concurrency,
    timeout=180.0,
)
  • “Invalid status. Valid choices are …”: statuses are lowercase (pending, queued, ongoing, completed, failed, analyzing, cancelled). If you see this, it’s a backend validation message surfaced in logs and you can ignore it unless runs are stuck.

Pro tip: reuse a prompt from Future AGI

If you maintain your system prompt in Future AGI, you can fetch it and use it inside your callback. For more on prompt templates and compiling variables, see Prompt Workbench Using SDK.
from fi.prompt.client import Prompt

prompt = Prompt.get_template_by_name("customer-support-agent", label="production")
prompt_template = prompt.template
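
You can then drop it into your callback as the system message (a minimal sketch; it assumes prompt.template resolves to a plain string and reuses the prompt_template variable fetched above):

import litellm
from fi.simulate import AgentInput, AgentResponse
from typing import Union

async def agent_callback(input: AgentInput) -> Union[str, AgentResponse]:
    user_text = (input.new_message or {}).get("content", "") or ""
    resp = await litellm.acompletion(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": prompt_template},  # fetched above
            {"role": "user", "content": user_text},
        ],
    )
    return resp.choices[0].message.content or ""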

Next steps

  • Review the transcripts and scores in Run Tests
  • Iterate on your agent callback to improve the agent’s performance