Deterministic evaluation is an evaluation method whose output is always drawn from a fixed set of predefined choices. Because the result is restricted to that set of valid responses, every outcome is predictable and easy to compare or aggregate across test cases.

Client Setup

Initialize the evaluation client with your API credentials:

from fi.evals import EvalClient

evaluator = EvalClient(fi_api_key="your_api_key", fi_secret_key="your_secret_key")
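To keep credentials out of source code, the keys can be read from the environment before the client is created. The following is a minimal sketch, not part of the SDK: the helper `load_credentials` and the variable names `FI_API_KEY` and `FI_SECRET_KEY` are assumptions chosen for illustration.

```python
import os


def load_credentials(env=os.environ):
    """Read the API credentials from environment variables.

    FI_API_KEY and FI_SECRET_KEY are hypothetical variable names used
    for this sketch; use whatever names your deployment defines.
    """
    names = ("FI_API_KEY", "FI_SECRET_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Keys of the returned dict match the EvalClient keyword arguments shown above.
    return {"fi_api_key": env["FI_API_KEY"], "fi_secret_key": env["FI_SECRET_KEY"]}


# With the fi package installed, the client could then be created as:
# evaluator = EvalClient(**load_credentials())
```

Failing early with a clear error avoids sending a request with empty credentials.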

Configuration

The evaluation accepts the following configuration parameters:

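| Parameter    | Description                                  | Required | Default |
|--------------|----------------------------------------------|----------|---------|
| multi_choice | Whether to allow multiple choices in output  | Yes      | false   |
| choices      | List of valid choices/outputs                | Yes      | []      |
| rule_prompt  | Custom prompt for evaluation rules           | Yes      | ""      |
| input        | Input strings to test                        | Yes      | {}      |

from fi.evals import Deterministic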

# Initialize the deterministic evaluator
deterministic_eval = Deterministic(config={
    "multi_choice": False,
    "choices": ["Yes", "No"],
    "rule_prompt": "Evaluate if the {{input_key1}} is consistently the same as {{input_key2}}",
    "input": {
        "input_key1": "response",
        "input_key2": "expected_response"
    }
})
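The effect of `choices` and `multi_choice` can be pictured as a membership check on the evaluator's output. The sketch below only illustrates that idea. It is not the library's implementation, and `validate_output` is a hypothetical helper.

```python
def validate_output(output, choices, multi_choice=False):
    """Check that an output respects the configured choices.

    Illustrative only: this mirrors the documented meaning of
    `choices` and `multi_choice`, not the SDK's internal logic.
    """
    is_collection = isinstance(output, (list, tuple, set))
    if multi_choice:
        # Several selections are allowed, but each must be a valid choice.
        selected = list(output) if is_collection else [output]
        invalid = [item for item in selected if item not in choices]
        if not selected or invalid:
            raise ValueError(f"Invalid selection(s): {invalid or 'none given'}")
        return selected
    # Single-choice mode: exactly one value, and it must be a valid choice.
    if is_collection or output not in choices:
        raise ValueError(f"Expected exactly one of {choices}, got {output!r}")
    return output


print(validate_output("Yes", ["Yes", "No"]))
print(validate_output(["Yes", "No"], ["Yes", "No", "Maybe"], multi_choice=True))
```

Any value outside the configured list raises a `ValueError`, which is the sense in which the output is restricted to the predefined choices.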

Test Case Setup

The evaluation requires a test case whose field names match the values mapped in the "input" configuration (here, response and expected_response):

from fi.testcases import TestCase

# Define a test case class with one field per mapped input
class DeterministicTestCase(TestCase):
    response: str
    expected_response: str

test_case = DeterministicTestCase(
    response="4",
    expected_response="4"
)
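The relationship between the `rule_prompt` placeholders, the `input` mapping, and the test case fields can be sketched as a two-step lookup: placeholder name to field name, then field name to value. Whether the SDK substitutes values in exactly this way is an assumption; `render_rule_prompt` is a hypothetical helper written for illustration, and the test case is represented as a plain dict so the example runs without the `fi` package.

```python
import re


def render_rule_prompt(rule_prompt, input_mapping, test_case_fields):
    """Fill {{placeholder}} markers with values from the test case.

    input_mapping maps placeholder names to test case field names;
    test_case_fields maps field names to their values.
    """
    def substitute(match):
        placeholder = match.group(1)
        field_name = input_mapping[placeholder]  # KeyError if the placeholder is unmapped
        return str(test_case_fields[field_name])  # KeyError if the field is missing

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, rule_prompt)


prompt = render_rule_prompt(
    "Evaluate if {{input_key1}} and {{input_key2}} are equal",
    {"input_key1": "response", "input_key2": "expected_response"},
    {"response": "4", "expected_response": "4"},
)
print(prompt)  # Evaluate if 4 and 4 are equal
```

A missing mapping or field surfaces as a `KeyError`, which is a useful reminder that every placeholder in the prompt needs a matching entry in `input` and a matching field on the test case.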

Complete Example

from fi.evals import Deterministic, EvalClient
from fi.testcases import TestCase

# Initialize the deterministic evaluator
deterministic_eval = Deterministic(config={
    "multi_choice": False,
    "choices": ["Yes", "No"],
    "rule_prompt": "Evaluate if {{input_key1}} and {{input_key2}} are equal",
    "input": {
        "input_key1": "response",
        "input_key2": "expected_response"
    }
})

# Define a test case class with one field per mapped input

class DeterministicTestCase(TestCase):
    response: str
    expected_response: str

# Create a test case
test_case = DeterministicTestCase(response="90", expected_response="90")

# Initialize the evaluation client
evaluator = EvalClient(
    fi_api_key="your_api_key",
    fi_secret_key="your_secret_key"
)

# Run the evaluation
result = evaluator.evaluate(deterministic_eval, test_case)
print(result)  # The result is expected to be one of the configured choices: Yes or No