In most AI applications, predictable behaviour is essential for maintaining reliability, consistency, and system integrity. Deterministic evaluations help verify that AI models operate within expected constraints, minimising inconsistencies and unintended variability.

Future AGI provides Deterministic Eval, which evaluates outputs by comparing them to predefined rules or expected patterns. It checks whether:
The output consistently adheres to a given rule prompt.
It meets the structure or format defined by the evaluation (e.g., multiple-choice validation).
Variability in output is minimised when the same input is provided.
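To make the three checks concrete, the following is a minimal sketch in plain Python. It is not the Future AGI SDK: `check_determinism`, the `generate` callable, and the `allowed` label set are hypothetical names introduced for illustration only. The sketch sends the same input several times, checks each output against a fixed set of allowed answers, and reports how often the runs agree.

```python
from collections import Counter
from typing import Callable, Iterable


def check_determinism(
    generate: Callable[[str], str],  # hypothetical: any function mapping a prompt to an output
    prompt: str,
    allowed: Iterable[str],          # hypothetical: the answers the rule accepts
    runs: int = 5,
) -> dict:
    """Call `generate` repeatedly with the same prompt and summarise the results."""
    if runs < 1:
        raise ValueError("runs must be at least 1")

    allowed_set = set(allowed)
    outputs = [generate(prompt).strip() for _ in range(runs)]
    counts = Counter(outputs)
    _, top_count = counts.most_common(1)[0]

    return {
        # Format check: every output is one of the allowed answers.
        "all_valid": all(output in allowed_set for output in outputs),
        # Variability check: every run produced the same output.
        "consistent": len(counts) == 1,
        # Share of runs that produced the most frequent output.
        "agreement": top_count / runs,
        "outputs": outputs,
    }
```

With a stub such as `lambda prompt: "B"` standing in for a model call, `check_determinism(stub, "Pick A, B or C", ["A", "B", "C"])` reports `all_valid` and `consistent` as true. A real model call would replace the stub, and the agreement threshold that counts as acceptable is a choice left to the evaluator.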
Key Features of Deterministic Eval:
Rule-Based Validation: Uses a customisable rule prompt to define the criteria for evaluating outputs.
Multiple Choice Handling: Supports evaluation for tasks that involve multiple-choice questions.
Customisable Inputs: Works with flexible input types and can be adapted to various AI-generated outputs, such as text, conversations, or images.
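The features above can be pictured as a small configuration object plus a validator. The sketch below is an assumption-laden illustration, not Future AGI's actual configuration schema: `RuleConfig`, `render_rule`, `validate_choice`, the `$name` placeholder syntax, and the comma-separated answer format are all hypothetical. Inputs are treated as strings here, so an image would be passed as, for example, a URL or file path.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class RuleConfig:
    """Hypothetical container for a rule-based, multiple-choice evaluation."""
    rule_prompt: str            # rule text with $name placeholders for the inputs
    choices: tuple[str, ...]    # labels the evaluation is allowed to return
    multi_choice: bool = False  # whether more than one label may be selected


def render_rule(config: RuleConfig, inputs: dict[str, str]) -> str:
    """Fill the rule prompt with the supplied inputs (KeyError if one is missing)."""
    return Template(config.rule_prompt).substitute(inputs)


def validate_choice(config: RuleConfig, raw_output: str) -> bool:
    """Check that the output names only allowed labels, with no duplicates."""
    selected = [part.strip() for part in raw_output.split(",") if part.strip()]
    if not selected:
        return False
    if not config.multi_choice and len(selected) != 1:
        return False
    return len(set(selected)) == len(selected) and all(
        label in config.choices for label in selected
    )
```

For example, a config with `choices=("Yes", "No")` accepts the output `"Yes"` and rejects `"Maybe"`. Setting `multi_choice=True` additionally allows answers such as `"Yes, No"` when the task permits selecting several labels.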