Custom Evaluation Using Deterministic Eval
In most AI applications, predictable behaviour is essential for maintaining reliability, consistency, and system integrity. Deterministic evaluations help verify that AI models operate within expected constraints, minimising inconsistencies and unintended variability.
Future AGI provides Deterministic Eval, which evaluates outputs by comparing them to predefined rules or expected patterns. It checks whether:
- The output consistently adheres to a given rule prompt.
- It meets the structure or format defined by the evaluation (e.g., multiple-choice validation).
- Variability in output is minimised when the same input is provided.
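To make the three checks concrete, here is a minimal local sketch in plain Python. It is not Future AGI's implementation: the function names are hypothetical, and the rule is modelled as a simple predicate, whereas Deterministic Eval takes a natural-language rule prompt.

```python
from typing import Callable, Sequence


def adheres_to_rule(output: str, rule: Callable[[str], bool]) -> bool:
    """Check 1: the output satisfies a rule (modelled here as a predicate)."""
    return rule(output)


def matches_choice_format(output: str, choices: Sequence[str]) -> bool:
    """Check 2: the output is exactly one of the allowed choices."""
    return output.strip() in choices


def is_consistent(outputs: Sequence[str]) -> bool:
    """Check 3: repeated runs on the same input gave one distinct output."""
    return len({o.strip() for o in outputs}) == 1


# Hypothetical outputs from three runs on the same input.
choices = ["Yes", "No"]
runs = ["Yes", "Yes", "Yes "]

print(all(matches_choice_format(r, choices) for r in runs))  # True
print(is_consistent(runs))                                   # True
print(adheres_to_rule("Maybe", lambda o: o in choices))      # False
```

Surrounding whitespace is stripped before comparison, so "Yes " and "Yes" count as the same answer. Whether that is appropriate depends on how strict the format rule needs to be.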
Key Features of Deterministic Eval:
- Rule-Based Validation: Uses a customisable rule prompt to define the criteria for evaluating outputs.
- Multiple Choice Handling: Supports evaluation for tasks that involve multiple-choice questions.
- Customisable Inputs: Works with flexible input types and can be adapted to various AI-generated outputs, such as text, conversations, or images.
See the eval definition of Deterministic Eval for more details.
a. Using Interface
Required Parameters
- Input: The input to be evaluated.
- Choices: A set of predefined options for multiple-choice questions (if applicable).
- Rule Prompt: A rule or set of conditions that the output must meet (e.g., “Output must include X if input includes Y”).
- MultiChoice: A Boolean value indicating whether the output should be treated as a multiple-choice question.
The result is the choice, or set of choices, selected from those provided by the user, indicating the output’s adherence to the deterministic criteria.
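As a rough illustration of how these four fields fit together, the sketch below models them as a small Python data class and checks that a result is a well-formed selection from the choices. Everything here is hypothetical (the class, the field names, the helper, and the sample values) and is not part of any Future AGI SDK. It also assumes that enabling MultiChoice allows more than one choice to be selected, which the parameter description above does not state explicitly.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DeterministicEvalParams:
    """Hypothetical container mirroring the four interface fields."""
    input: str
    rule_prompt: str
    choices: List[str] = field(default_factory=list)
    multi_choice: bool = False


def validate_result(params: DeterministicEvalParams, result: List[str]) -> bool:
    """Return True if `result` is a well-formed selection from params.choices."""
    if not result:
        return False
    if any(choice not in params.choices for choice in result):
        return False
    # Assumption: without multi-choice, exactly one choice may be selected.
    return params.multi_choice or len(result) == 1


# Illustrative values only.
params = DeterministicEvalParams(
    input="The refund was processed within 2 days.",
    rule_prompt="Classify the sentiment of the input as one of the choices.",
    choices=["Positive", "Negative", "Neutral"],
)

print(validate_result(params, ["Positive"]))             # True
print(validate_result(params, ["Positive", "Neutral"]))  # False
print(validate_result(params, ["Happy"]))                # False
```

A check like this can be useful when post-processing exported results, since any value outside the configured choices signals a misconfigured rule prompt or choice list.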