Simulate from the Prompt Workbench
Launch multi-turn chat simulations against any saved prompt version directly from the FutureAGI Prompts workbench — no SDK, no code, and no separate agent definition required.
| Time | Difficulty | Package |
|---|---|---|
| 10 min | Beginner | UI only |
Prerequisites
- FutureAGI account → app.futureagi.com
- At least one saved prompt version in the Prompts workbench (see Prompt Versioning if you need to create one)
- At least one chat scenario under Simulate → Scenarios (see Scenarios if you need to create one)
What is Prompt Workbench Simulation?
The Prompt Workbench has four tabs: Playground, Evaluation, Metrics, and Simulation. The Simulation tab lets you run multi-turn chat simulations where your saved prompt acts directly as the agent. The platform uses your prompt’s system message, model, and parameters to drive the conversation. You do not need a separate agent definition or any SDK code. Each scenario defines a simulated user persona and conversation goal; the platform runs one conversation per scenario row, up to 10 turns each.
Tutorial
Open your prompt in the workbench
Go to app.futureagi.com → Prompts (left sidebar under BUILD) → click the prompt template you want to test.
The workbench opens showing the Playground tab by default.
Navigate to the Simulation tab
Inside the prompt workbench, click the Simulation tab in the top tab bar (next to Playground, Evaluation, and Metrics).
Note
The Simulation tab is only clickable after the prompt has at least one saved version. If the tab shows a tooltip “Save your prompt to run simulations”, go back to the Playground tab and click Run Prompt — this executes the prompt and automatically saves it as a version.
Create a simulation
On the Simulation tab, click Create Simulation. A dialog opens — “Create Chat Simulation”.
Fill in the dialog:
- Simulation Name: Auto-populated as `Simulation - {Date} at {Time}`. Edit it to something descriptive, for example: `support-prompt-v2-test`.
- Prompt Version: Select which saved version of your prompt to test. The default version is pre-selected. Use the dropdown to switch versions.
- Description (optional): Notes about what you are testing, for example: `Testing revised tone instructions against return-request scenario.`
- Select Scenarios: Check one or more scenarios from the list. Each checked scenario produces one simulated conversation when the simulation runs.
Tip
If you have no scenarios yet, click Create New Chat Scenario at the top of the scenario list — it opens the scenario creation page in a new tab. After saving, return to this dialog and click the refresh icon to reload the list.
Click Create Simulation. The dialog closes and the simulation detail view opens automatically.
Review and adjust the simulation configuration
The simulation detail view shows the simulation name and a run count chip. The header toolbar includes three controls on the right: Version, Scenarios, and Evals.
Version dropdown: Use this to switch which prompt version the next run uses without recreating the simulation. Changing it updates the simulation immediately.
Scenarios button: Click to open a popover where you can add or remove scenarios. The count badge shows how many are currently attached.
Evals button: Click to open the evaluations drawer. You can add evaluations that will run automatically on each completed conversation. Click Add Evaluation inside the drawer to configure one.
Tip
Adding evaluations before running is recommended. Evaluations like Task Completion, Tone, and the Conversational agent evaluation group give you structured quality scores on top of raw CSAT. You can also add evaluations after the run and re-run them on completed conversations.
Run the simulation
Click Run Simulation in the top-right corner of the simulation detail header.
A success notification confirms execution has started. The simulation creates one chat conversation per attached scenario row. Each conversation runs up to 10 turns between your prompt (acting as the agent) and the simulated customer.
The executions grid below the header updates in real time. Each row is one conversation. You can search runs using the search bar above the grid.
View execution results
Once conversations complete, click any row in the executions grid to open the execution detail page at /dashboard/simulate/test/{simulationId}/{executionId}.
The execution detail page has three tabs: Simulated runs, Logs, and Analytics.
Simulated runs tab
Shows the full conversation transcript — every turn between the simulated user and your prompt. Review the dialogue to see how the prompt handled the scenario.
Analytics tab
Shows aggregate performance metrics across executions in this simulation:
| Metric group | What it shows |
|---|---|
| Chat Details | Total chats, completed count, completion percentage |
| System Metrics | Avg total tokens, avg input tokens, avg output tokens, avg chat latency (ms) |
| Evaluation Metrics | Average score per configured evaluation (e.g., Task Completion, Tone) |
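To make the aggregates concrete, here is a minimal sketch of how the Chat Details and System Metrics figures can be derived from per-conversation results. The record shape and field names (`status`, `total_tokens`, `latency_ms`) are hypothetical, invented for illustration — they are not the platform's export schema.

```python
# Hypothetical per-conversation records; field names are illustrative only,
# not the platform's actual data schema.
conversations = [
    {"status": "Completed", "total_tokens": 1800, "latency_ms": 950},
    {"status": "Completed", "total_tokens": 2200, "latency_ms": 1100},
    {"status": "Failed",    "total_tokens": 400,  "latency_ms": 700},
]

completed = [c for c in conversations if c["status"] == "Completed"]

# Chat Details: total chats, completed count, completion percentage
total_chats = len(conversations)
completed_count = len(completed)
completion_pct = 100 * completed_count / total_chats

# System Metrics: averages taken over all conversations
avg_total_tokens = sum(c["total_tokens"] for c in conversations) / total_chats
avg_latency_ms = sum(c["latency_ms"] for c in conversations) / total_chats
```

With the sample records above, completion percentage is about 66.7% and average latency is about 917 ms; the Analytics tab presents the same kind of aggregates without any of this manual work.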
Reading the executions grid
Back on the simulation detail view, the grid shows one row per conversation with these columns:
| Column | Description |
|---|---|
| Status | Completed, In Progress, or Failed |
| CSAT | Customer satisfaction score with color indicator |
| Total Tokens | Total tokens used in the conversation |
| Input Tokens | Prompt tokens |
| Output Tokens | Completion tokens |
| Average Latency (ms) | Average response time per turn |
| Turn Count | Number of back-and-forth turns |
| Evaluation Metrics | Per-eval results as colored tags |
Iterate — swap versions and re-run
Use the Version dropdown in the simulation header to switch to a different prompt version, then click Run Simulation again. Each run appends new rows to the executions grid — all previous runs are preserved. Compare CSAT and evaluation scores across runs to measure whether prompt changes improved results.
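To make "compare across runs" concrete, here is a minimal sketch of the kind of comparison the executions grid supports. The run labels and CSAT values are invented for illustration; in practice you would read the scores directly from the grid.

```python
# Hypothetical CSAT scores, one per scenario conversation in each run.
# Labels and values are invented for illustration.
runs = {
    "v1-run": [3.8, 4.0, 3.5],
    "v2-run": [4.4, 4.6, 4.1],
}

# Average CSAT per run, then identify the stronger prompt version.
averages = {name: sum(scores) / len(scores) for name, scores in runs.items()}
best = max(averages, key=averages.get)

for name, avg in averages.items():
    print(f"{name}: avg CSAT {avg:.2f}")
print(f"Higher average CSAT: {best}")
```

Because every run is preserved in the grid, this comparison stays available as you accumulate more versions — each re-run simply adds another set of rows to weigh against the earlier ones.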
What you built
You can now run multi-turn chat simulations against any prompt version, review CSAT scores and evaluation results, and iterate on prompt quality without writing any code.
- Opened a saved prompt in the Prompts workbench and navigated to the Simulation tab
- Created a chat simulation by selecting a prompt version and attaching scenarios
- Configured evaluations to score each completed conversation automatically
- Ran the simulation and reviewed per-conversation CSAT scores and transcripts in the execution detail view
- Iterated by switching prompt versions and re-running without leaving the workbench