Evaluate via Platform & SDK
Run evaluations via the Future AGI platform UI or the Python SDK.
What it is
Evaluate via Platform & SDK is the primary way to run evaluations on Future AGI — either through the platform UI on a dataset, or programmatically via the Python SDK. It supports built-in and custom eval templates, sync and async execution, and returns a score, pass/fail result, or reason for each evaluated input.
Use cases
- Quick quality check — Run a single eval (e.g. tone) on one input to verify the pipeline before scaling.
- Try built-in templates — Use Future AGI templates (e.g. tone) or your own custom template from the UI or SDK.
- Automate evals — Call the SDK from scripts or CI to run evals programmatically (sync or async).
- Run evals on a dataset — From the UI, open a dataset, add an evaluation, map columns, and run so every row is evaluated.
How to
Choose UI or SDK below; each tab walks through the process step by step.
Select a dataset
You need a dataset to run evals from the UI. If you don’t have one, add a dataset first. See Dataset overview.

Open the evaluation panel
Open your dataset, then click Evaluate in the top-right. The evaluation configuration panel opens.

Add and run an eval
Click Add Evaluation. You’re taken to the evaluation list: choose a built-in template (e.g. tone) or Create your own eval. For a template:
- Click the template and give the evaluation a name.
- In config, select the dataset column(s) to use as input (and output, if the template requires it).
- Optionally enable Error Localization so that when a row fails, the platform can localize the error in the dataset.
- Choose a model if the template requires one (many built-in evals do).
Click Add & Run to run the eval on the dataset.

Add dataset eval (API): The request includes name, template_id, config (column mapping), optional error_localizer, optional model, and run: true to run immediately.
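The request body described above can be sketched as a Python dict. The field names come from the description; the specific values (evaluation name, column name, model) are placeholders, and the exact schema and endpoint are not shown here, so treat this as an illustration rather than the definitive API contract:

```python
# Hypothetical "add dataset eval" request payload; values are placeholders.
payload = {
    "name": "tone-check",            # display name for this evaluation
    "template_id": "tone",           # built-in or custom eval template
    "config": {                      # column mapping: template key -> dataset column
        "input": "user_message_column",
    },
    "error_localizer": True,         # optional: localize errors in failing rows
    "model": "turing_flash",         # optional: required by many built-in templates
    "run": True,                     # run the eval immediately
}
```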
Optional: Create your own eval
From the Add Evaluation flow, click Create your own eval to define a custom template (name, model, rule prompt, output type, and optional settings). After you save it, the new eval appears in the evaluation list and you can add it to your dataset as in the step above. For full details on creating and configuring custom evals, see Create custom evals.

Install and initialise
Install the package ai-evaluation and create an Evaluator with your Future AGI API key and secret. Prefer setting FI_API_KEY and FI_SECRET_KEY in the environment instead of passing them in code. See Accessing API keys.
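If you go the environment-variable route, a minimal sketch looks like this (the values are placeholders; in practice, export the variables in your shell or CI secrets store rather than setting them in code):

```python
import os

# Placeholder values -- set these outside your code in real use.
os.environ["FI_API_KEY"] = "your_api_key"
os.environ["FI_SECRET_KEY"] = "your_secret_key"

# With the variables set, the Evaluator can be constructed without
# passing credentials explicitly:
# from fi.evals import Evaluator
# evaluator = Evaluator()
```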
pip install ai-evaluation

from fi.evals import Evaluator

evaluator = Evaluator(
    fi_api_key="your_api_key",
    fi_secret_key="your_secret_key",
)

Run a sync eval
Call evaluate with the eval template name (e.g. tone), inputs (dict with the keys the template expects, e.g. "input"), and model_name. Many built-in (system) templates require a model.
result = evaluator.evaluate(
    eval_templates="tone",
    inputs={
        "input": "Dear Sir, I hope this email finds you well. I look forward to any insights or advice you might have whenever you have a free moment"
    },
    model_name="turing_flash",
)
print(result.eval_results[0].output)
print(result.eval_results[0].reason)

Optional: Run async eval
For long-running or large runs, set is_async=True. The call returns immediately with an eval_id; the evaluation runs in the background.
result = evaluator.evaluate(
    eval_templates="tone",
    inputs={"input": "Your text here"},
    model_name="turing_flash",
    is_async=True,
)
eval_id = result.eval_results[0].eval_id

Retrieve async results
Use get_eval_result(eval_id) to fetch the result when the evaluation has finished. The same method works for both sync and async runs (e.g. to re-fetch a result).
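If you need to block until an async run finishes, a generic polling wrapper can be layered on top of get_eval_result. This is a sketch with a stand-in fetch callable and a hypothetical "result is ready" signal (a non-None return); adapt the readiness check to whatever get_eval_result actually returns for an in-progress run:

```python
import time

def poll_until_ready(fetch, eval_id, timeout=300.0, interval=5.0):
    """Call fetch(eval_id) every `interval` seconds until it returns a
    non-None result or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = fetch(eval_id)
        if result is not None:  # hypothetical "finished" signal
            return result
        time.sleep(interval)
    raise TimeoutError(f"eval {eval_id!r} did not finish within {timeout}s")

# With the SDK, usage would look like:
# result = poll_until_ready(evaluator.get_eval_result, eval_id)
```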
result = evaluator.get_eval_result(eval_id)
print(result.eval_results[0].output)
print(result.eval_results[0].reason)

Use a custom template
To use a template you created in the UI, pass its name as eval_templates and supply the inputs dict with the keys your template’s required_keys expect (e.g. "input", "output"). Use the same template name you see in the evaluation list.
from fi.evals import evaluate

result = evaluate(
    eval_templates="name-of-your-eval",
    inputs={
        "input": "your_input_text",
        "output": "your_output_text"
    },
    model_name="model_name"
)
print(result.eval_results[0].output)
print(result.eval_results[0].reason)

Note
For system (built-in) eval templates, model_name is required and must be one of the models listed for that template. The backend validates required input keys from the template’s config.
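The server-side check on required input keys can be mirrored client-side before submitting, which gives a clearer error than a failed request. A minimal sketch, assuming you know the template's required keys (the helper name here is illustrative, not part of the SDK):

```python
def check_required_keys(inputs, required_keys):
    """Return the template keys missing from the inputs dict."""
    return [key for key in required_keys if key not in inputs]

# A template requiring "input" and "output", called with only "input":
missing = check_required_keys({"input": "some text"}, ["input", "output"])
# missing == ["output"] -> fill in the gap before calling evaluate
```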
Tip
For more eval templates and Future AGI models, see Built-in evals and Future AGI models.
What you can do next
Create custom evals
Define your own eval rules and criteria.
Eval groups
Run multiple evals together as a group.
Use custom models
Bring your own model for evaluations.
Future AGI models
Built-in models available for evals.
CI/CD pipeline
Run evals automatically in your pipeline.
Evaluation overview
How evaluation fits into the platform.