POST /simulate/run-tests/create
Create a New Test Run
curl --request POST \
  --url https://api.futureagi.com/simulate/run-tests/create/ \
  --header 'Content-Type: application/json' \
  --header 'X-Api-Key: <api-key>' \
  --header 'X-Secret-Key: <secret-key>' \
  --data @- <<EOF
{
  "name": "new-run-test",
  "description": "",
  "scenarioIds": [
    "fae7d086-6466-4b40-b21f-13bb7e1d83fe"
  ],
  "agentDefinitionId": "87a193df-12a6-46e1-860d-d18ddb4a00cf",
  "agentVersion": "117efec9-5e9b-4e9e-9272-cf171b6e4af1",
  "evalConfigIds": [],
  "evaluationsConfig": [
    {
      "name": "task_completion",
      "templateId": "5419b2e4-f155-4f0f-846f-0a3f848a74be",
      "templateName": "task_completion",
      "mapping": {
        "input": "transcript",
        "output": "transcript"
      },
      "config": {
        "mapping": {
          "input": "transcript",
          "output": "transcript"
        },
        "config": {},
        "reasonColumn": true
      },
      "description": "Measures whether the model fulfilled the user's request accurately and completely.",
      "type": "futureagi_built",
      "requiredKeys": [
        "input",
        "output"
      ],
      "tags": [
        "TEXT",
        "FUTURE_EVALS",
        "AUDIO"
      ],
      "errorLocalizer": true,
      "model": "turing_small",
      "eval_group": "10a3037b-5893-4997-a5d5-9d058aae10d1"
    },
    {
      "name": "is_polite",
      "templateId": "122a4e83-4c5e-4a17-bcfc-1d29affba6f9",
      "templateName": "is_polite",
      "mapping": {
        "output": "transcript"
      },
      "config": {
        "mapping": {
          "output": "transcript"
        },
        "config": {},
        "reasonColumn": true
      },
      "description": "Ensures that the output maintains a respectful, kind, and non-aggressive tone.",
      "type": "futureagi_built",
      "requiredKeys": [
        "output"
      ],
      "tags": [
        "TEXT",
        "FUTURE_EVALS",
        "AUDIO"
      ],
      "errorLocalizer": true,
      "model": "turing_small",
      "eval_group": "10a3037b-5893-4997-a5d5-9d058aae10d1"
    }
  ],
  "datasetRowIds": [],
  "enableToolEvaluation": true
}
EOF

Example response
{
  "id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "name": "<string>",
  "description": "<string>",
  "status": "<string>",
  "scenarios": [
    {}
  ],
  "agent_definition": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "agent_version": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "evaluations": [
    {}
  ],
  "created_at": "2023-11-07T05:31:56Z",
  "updated_at": "2023-11-07T05:31:56Z"
}
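
The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library, not an official client: the function names `build_create_request` and `create_test_run` and the 30-second timeout are assumptions made for illustration, while the URL and header names are taken from the curl example.

```python
import json
import urllib.request

# Endpoint URL as it appears in the curl example.
CREATE_URL = "https://api.futureagi.com/simulate/run-tests/create/"


def build_create_request(api_key, secret_key, body):
    """Build (but do not send) the POST request shown in the curl example."""
    return urllib.request.Request(
        CREATE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Api-Key": api_key,
            "X-Secret-Key": secret_key,
        },
        method="POST",
    )


def create_test_run(api_key, secret_key, body, timeout=30):
    """Send the request and return the decoded JSON response body.

    The timeout value is an assumption of this sketch, not a documented limit.
    """
    request = build_create_request(api_key, secret_key, body)
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect the method, URL, and headers without touching the network.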

Authorizations

X-Api-Key
string
header
required

API Key for authentication.

X-Secret-Key
string
header
required

Secret Key for authentication.
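
Both headers are required on every call, so it can help to load them in one place rather than hard-coding them. The sketch below reads the credentials from environment variables; the variable names `FUTUREAGI_API_KEY` and `FUTUREAGI_SECRET_KEY` and the helper `build_auth_headers` are hypothetical and chosen only for this example.

```python
import os

# Hypothetical environment variable names; this page does not define them.
API_KEY_VAR = "FUTUREAGI_API_KEY"
SECRET_KEY_VAR = "FUTUREAGI_SECRET_KEY"


def build_auth_headers(env=None):
    """Return the two required auth headers plus the JSON content type.

    `env` defaults to os.environ; passing a plain dict makes the helper
    easy to exercise without real credentials.
    """
    env = os.environ if env is None else env
    api_key = env.get(API_KEY_VAR)
    secret_key = env.get(SECRET_KEY_VAR)
    # Report every missing variable at once instead of failing on the first.
    missing = [
        name
        for name, value in ((API_KEY_VAR, api_key), (SECRET_KEY_VAR, secret_key))
        if not value
    ]
    if missing:
        raise ValueError("missing credentials: " + ", ".join(missing))
    return {
        "Content-Type": "application/json",
        "X-Api-Key": api_key,
        "X-Secret-Key": secret_key,
    }
```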

Body

application/json
name
string
required

A unique name for the test run.

scenarioIds
string<uuid>[]
required

A list of scenario UUIDs to be included in this test run.

agentDefinitionId
string<uuid>
required

The UUID of the agent definition to be tested.

description
string

An optional description for the test run.

agentVersion
string<uuid> | null

The specific UUID of the agent version to be tested. If not provided, the active version will be used.

evalConfigIds
string<uuid>[]

A list of existing evaluation configuration UUIDs to associate with this test run.

evaluationsConfig
object[]

A list of new, detailed evaluation configurations to create and associate with this test run.

datasetRowIds
string<uuid>[]

A list of specific dataset row UUIDs to test against.

enableToolEvaluation
boolean
default: false

Flag to enable tool evaluation for this test run.
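
The body fields above can be assembled with a small helper that fills in the three required fields, omits optional fields that were not supplied, and checks UUID-typed values before anything is sent. This is a sketch under assumptions: `build_test_run_body` and `_as_uuid` are hypothetical names, and rejecting an empty `name` client-side is a choice of this example rather than documented server behavior.

```python
import uuid


def _as_uuid(value, field):
    """Return value as a canonical UUID string, or raise ValueError naming the field."""
    try:
        return str(uuid.UUID(value))
    except (ValueError, AttributeError, TypeError) as exc:
        raise ValueError(f"{field} must be a UUID, got {value!r}") from exc


def build_test_run_body(name, scenario_ids, agent_definition_id, *,
                        description=None, agent_version=None,
                        eval_config_ids=None, evaluations_config=None,
                        dataset_row_ids=None, enable_tool_evaluation=False):
    """Assemble the JSON body using the field names documented above."""
    # Assumption of this sketch: an empty name is rejected before sending.
    if not isinstance(name, str) or not name:
        raise ValueError("name is required")
    body = {
        "name": name,
        "scenarioIds": [_as_uuid(s, "scenarioIds") for s in scenario_ids],
        "agentDefinitionId": _as_uuid(agent_definition_id, "agentDefinitionId"),
        # Documented default is false.
        "enableToolEvaluation": bool(enable_tool_evaluation),
    }
    # Optional fields are only included when the caller supplies them.
    if description is not None:
        body["description"] = description
    if agent_version is not None:
        body["agentVersion"] = _as_uuid(agent_version, "agentVersion")
    if eval_config_ids is not None:
        body["evalConfigIds"] = [_as_uuid(e, "evalConfigIds") for e in eval_config_ids]
    if evaluations_config is not None:
        # Passed through unchanged; entries are expected to be dicts.
        body["evaluationsConfig"] = list(evaluations_config)
    if dataset_row_ids is not None:
        body["datasetRowIds"] = [_as_uuid(r, "datasetRowIds") for r in dataset_row_ids]
    return body
```

Calling `build_test_run_body("new-run-test", ["fae7d086-6466-4b40-b21f-13bb7e1d83fe"], "87a193df-12a6-46e1-860d-d18ddb4a00cf")` yields a body with only the required fields and `enableToolEvaluation`, leaving `agentVersion` out so that the active version is used.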

Response

The test run was created successfully.

id
string<uuid>

name
string

description
string

status
string

scenarios
object[]

agent_definition
string<uuid>

agent_version
string<uuid>

evaluations
object[]

created_at
string<date-time>

updated_at
string<date-time>
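
The response fields listed above map naturally onto a small typed record. The sketch below is one way to parse the response; the class name `CreatedTestRun` and the helpers `parse_test_run` and `_parse_timestamp` are hypothetical, and treating every field as possibly absent is an assumption, since the page does not mark response fields as required.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class CreatedTestRun:
    id: Optional[str]
    name: Optional[str]
    description: Optional[str]
    status: Optional[str]
    scenarios: list
    agent_definition: Optional[str]
    agent_version: Optional[str]
    evaluations: list
    created_at: Optional[datetime]
    updated_at: Optional[datetime]


def _parse_timestamp(text):
    """Parse a timestamp such as 2023-11-07T05:31:56Z into an aware datetime."""
    if text is None:
        return None
    # fromisoformat() only accepts a trailing "Z" from Python 3.11 onward,
    # so replace it with an explicit UTC offset first.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)


def parse_test_run(raw):
    """Convert a JSON string, bytes, or already-decoded dict into a CreatedTestRun."""
    data = json.loads(raw) if isinstance(raw, (str, bytes)) else dict(raw)
    return CreatedTestRun(
        id=data.get("id"),
        name=data.get("name"),
        description=data.get("description"),
        status=data.get("status"),
        scenarios=data.get("scenarios") or [],
        agent_definition=data.get("agent_definition"),
        agent_version=data.get("agent_version"),
        evaluations=data.get("evaluations") or [],
        created_at=_parse_timestamp(data.get("created_at")),
        updated_at=_parse_timestamp(data.get("updated_at")),
    )
```

Note that the response uses snake_case keys (`agent_definition`, `created_at`) while the request body uses camelCase (`agentDefinitionId`), so the two shapes should not share a single mapping.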