Custom Code

Evaluation Using Interface

Input:

Configuration Parameters:

code: A string containing the custom Python code to execute. The code must define a function main(**kwargs); kwargs is populated with the values from the current dataset row, keyed by column name. The function should return the evaluation result (for example, a numeric score or a boolean).

Example Code Structure:

def main(**kwargs):
    # Access column 'input_col' via kwargs['input_col']
    # Access column 'output_col' via kwargs['output_col']
    input_val = kwargs.get('input_col', '')
    output_val = kwargs.get('output_col', '')

    # Implement custom logic
    if 'expected pattern' in output_val and len(input_val) > 10:
        return 1.0 # Represents Pass or high score
    else:
        return 0.0 # Represents Fail or low score

Output: The value returned by the custom main function.
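To try the function before saving it, it can help to call it locally with a few sample rows. The sketch below is a minimal illustration, not the platform's own runner: it assumes each dataset row reaches main as keyword arguments keyed by column name, and the rows list and its values are hypothetical test data.

```python
def main(**kwargs):
    # Same logic as the example structure above.
    input_val = kwargs.get('input_col', '')
    output_val = kwargs.get('output_col', '')
    if 'expected pattern' in output_val and len(input_val) > 10:
        return 1.0
    return 0.0


# Hypothetical rows; each dict stands in for one dataset row (assumption).
rows = [
    {'input_col': 'a sufficiently long input',
     'output_col': 'contains the expected pattern here'},
    {'input_col': 'short',
     'output_col': 'contains the expected pattern here'},
]

# Unpack each row into keyword arguments, one call per row.
scores = [main(**row) for row in rows]
print(scores)
```

The first row satisfies both conditions and the second fails the length check, so the two rows receive different scores.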

What to do when Custom Code Eval Fails

Start by reviewing the code: check for syntax errors, verify that main is implemented correctly, and confirm that every required dependency is available in the execution environment. Then validate the inputs: make sure each argument the code needs is actually read from kwargs, and that the data types and formats of the input values match what the code expects.
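These checks can be run locally before re-running the eval. The helper below is a rough sketch using only the Python standard library; check_custom_code and its messages are hypothetical names introduced for illustration and are not part of the product. It assumes the code string is loaded with exec and that main receives one row as keyword arguments.

```python
def check_custom_code(code, sample_row):
    """Return a list of problems found; an empty list means the checks passed."""
    # 1. Syntax: compile() raises SyntaxError without running anything.
    try:
        compiled = compile(code, "<custom_eval>", "exec")
    except SyntaxError as exc:
        return [f"syntax error on line {exc.lineno}: {exc.msg}"]

    # 2. Dependencies: top-level imports run when the code is loaded.
    namespace = {}
    try:
        exec(compiled, namespace)
    except ImportError as exc:
        return [f"missing dependency: {exc}"]
    except Exception as exc:
        return [f"error while loading code: {exc!r}"]

    # 3. Entry point: the code must define a callable named main.
    main = namespace.get("main")
    if not callable(main):
        return ["code does not define a callable main(**kwargs)"]

    # 4. Inputs: run main on one sample row to surface type/format errors.
    try:
        main(**sample_row)
    except Exception as exc:
        return [f"main raised {type(exc).__name__}: {exc}"]
    return []


good_code = "def main(**kwargs):\n    return 1.0 if kwargs.get('output_col') else 0.0\n"
bad_syntax = "def main(**kwargs)\n    return 1.0\n"  # missing colon
bad_input = "def main(**kwargs):\n    return len(kwargs['input_col']) > 10\n"

print(check_custom_code(good_code, {'output_col': 'x'}))
print(check_custom_code(bad_syntax, {}))
print(check_custom_code(bad_input, {'input_col': None}))
```

Only run this on code you trust, since exec executes the string as-is. Using kwargs.get with a default, as in the example structure above, avoids the KeyError that a missing column would otherwise raise.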

Differentiating Custom Code Eval with Deterministic Eval

Deterministic Evals and Custom Code Eval are both flexible and customisable: each supports tailored evaluation logic and can be configured for different types of outputs. They differ in how that logic is expressed. Deterministic Evals use rule prompts to guide the evaluation and rely on structured, rule-based methods, while Custom Code Eval executes actual Python code, which enables dynamic computations and logic.
