Compare Eval Summaries

Compares evaluation summaries side by side across multiple test executions of a single test run. The endpoint accepts a JSON-encoded array of execution UUIDs and returns evaluation metrics for each execution, keyed by execution ID.

GET https://api.futureagi.com/simulate/run-tests/{run_test_id}/eval-summary-comparison/

Authentication

X-Api-Key API Key Required

Your Future AGI API key used to authenticate requests. You can find and manage your API keys in the Dashboard under Settings.

X-Secret-Key Secret Key Required

Your Future AGI secret key, used alongside the API key for request authentication. This is generated when you create an API key in the Dashboard.
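
As a minimal sketch of how the two headers might be assembled, the helper below reads both keys from environment variables. The variable names FUTUREAGI_API_KEY and FUTUREAGI_SECRET_KEY and the function name auth_headers are assumptions made for illustration; only the header names X-Api-Key and X-Secret-Key come from this page.

```python
import os


def auth_headers(env=os.environ):
    """Build the two authentication headers described above.

    The environment variable names are hypothetical; substitute whatever
    your own configuration uses to store the keys.
    """
    api_key = env.get("FUTUREAGI_API_KEY")
    secret_key = env.get("FUTUREAGI_SECRET_KEY")
    if not api_key or not secret_key:
        # Both headers are marked Required, so fail early if either is absent.
        raise ValueError("both the API key and the secret key are required")
    return {"X-Api-Key": api_key, "X-Secret-Key": secret_key}
```

Keeping the keys out of source code and passing the mapping in as a parameter makes the helper easy to exercise without touching the real environment.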

Path parameters

run_test_id UUID Required

UUID of the test run containing the executions to compare.

Query parameters

execution_ids string Required

JSON-encoded array of test execution UUIDs to compare. The JSON string must be URL-encoded before it is placed in the query string. Example (before URL-encoding): ["uuid1","uuid2"].
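
The two encoding steps (JSON first, then URL-encoding) can be sketched with the Python standard library as follows. The function name comparison_url is an assumption for illustration; the URL shape and the execution_ids parameter name come from this page.

```python
import json
from urllib.parse import quote, urlencode

BASE_URL = "https://api.futureagi.com/simulate/run-tests"


def comparison_url(run_test_id, execution_ids):
    """Return the full request URL for the eval summary comparison endpoint."""
    if not execution_ids:
        # Mirrors the documented 400 response for an empty list.
        raise ValueError("execution_ids must contain at least one UUID")
    # Step 1: serialize the list of UUIDs as a compact JSON array.
    encoded_ids = json.dumps([str(e) for e in execution_ids], separators=(",", ":"))
    # Step 2: URL-encode the JSON string as the execution_ids query parameter.
    query = urlencode({"execution_ids": encoded_ids})
    path_id = quote(str(run_test_id), safe="")
    return f"{BASE_URL}/{path_id}/eval-summary-comparison/?{query}"
```

For example, comparison_url("rt", ["uuid1", "uuid2"]) yields a URL ending in ?execution_ids=%5B%22uuid1%22%2C%22uuid2%22%5D, which is the URL-encoded form of ["uuid1","uuid2"].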

Response

200 OK
(execution_id) object

The response body is an object keyed by execution UUID. Each value is an array of evaluation summary objects for that execution, with the following fields:

name string
Name of the evaluation configuration.
average_score number
Average score across all evaluated calls.
total_runs integer
Total evaluation runs for this config.
passed integer
Number of passing evaluations.
failed integer
Number of failing evaluations.
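
To illustrate how the response shape lends itself to a side-by-side view, the sketch below pivots a parsed response into one row per evaluation name. The sample payload is made up for illustration, the function names are hypothetical, and the pass-rate denominator (passed + failed) is an assumption, since this page does not state how total_runs relates to those two counts.

```python
# Made-up payload in the documented shape: execution ID -> list of summaries.
SAMPLE_RESPONSE = {
    "exec-a": [
        {"name": "tone", "average_score": 0.8, "total_runs": 10, "passed": 8, "failed": 2},
    ],
    "exec-b": [
        {"name": "tone", "average_score": 0.9, "total_runs": 10, "passed": 9, "failed": 1},
    ],
}


def average_scores_by_eval(comparison):
    """Pivot the response into {eval name: {execution_id: average_score}}."""
    table = {}
    for execution_id, summaries in comparison.items():
        for summary in summaries:
            table.setdefault(summary["name"], {})[execution_id] = summary["average_score"]
    return table


def pass_rate(summary):
    """Fraction of passing evaluations, or None when nothing passed or failed.

    Assumes passed + failed is the right denominator; adjust if total_runs
    turns out to include other outcomes.
    """
    decided = summary["passed"] + summary["failed"]
    return summary["passed"] / decided if decided else None
```

Pivoting by evaluation name puts the scores for the same evaluation next to each other, which is usually the comparison a reader wants from this endpoint.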

Errors

400 Bad Request Optional

Missing, malformed, or empty execution_ids parameter.

{"execution_ids": ["execution_ids must be valid JSON"]}

Or, when the array is empty:

{"execution_ids": ["execution_ids list is required"]}

401 Unauthorized Optional

Missing or invalid X-Api-Key or X-Secret-Key headers.

404 Not Found Optional

No test run found with the specified run_test_id.

{"error": "RunTest not found."}

500 Internal Server Error Optional

Unexpected server error.

{"error": "Unable to fetch eval summary"}
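
A client might map the error responses above to readable messages along these lines. This is a sketch under the assumption that error bodies follow the examples shown on this page; the function name describe_error is hypothetical, and no body format is documented for 401, so that case does not read the body.

```python
import json


def describe_error(status, body):
    """Turn a non-200 status code and raw response body into a short message."""
    try:
        payload = json.loads(body) if body else {}
    except json.JSONDecodeError:
        payload = {}
    if not isinstance(payload, dict):
        payload = {}

    if status == 400:
        # Validation errors arrive as a list under the execution_ids key.
        messages = payload.get("execution_ids") or ["invalid execution_ids"]
        if isinstance(messages, str):
            messages = [messages]
        return "Bad request: " + "; ".join(str(m) for m in messages)
    if status == 401:
        return "Unauthorized: check the X-Api-Key and X-Secret-Key headers"
    if status in (404, 500):
        # Both use an {"error": "..."} body in the examples above.
        return str(payload.get("error", f"HTTP {status}"))
    return f"Unexpected HTTP {status}"
```

Falling back to a generic message when the body is missing or not valid JSON keeps the helper from raising while it is reporting another failure.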