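# `evaluator` is assumed to be an already-initialized evaluation client.
result = evaluator.evaluate(
    eval_templates="is_helpful",
    inputs={
        # The user's query and the model's response to evaluate
        "input": "Why doesn't honey go bad?",
        "output": "Honey doesn't spoil because its low moisture and high acidity prevent the growth of bacteria and other microbes."
    },
    model_name="turing_flash"
)

# Result (Passed or Failed) and the explanation behind it
print(result.eval_results[0].output)
print(result.eval_results[0].reason)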
Input
| Required Input | Type | Description |
| --- | --- | --- |
| input | string | User query to the model |
| output | string | Model’s response to the user query |
Output
| Field | Description |
| --- | --- |
| Result | Returns Passed if the response is helpful, or Failed if it is not |
| Reason | Provides a detailed explanation of the evaluation |
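
The Result field can be turned into a boolean for use in tests or pipelines. The following is a minimal sketch, assuming the result arrives as the string Passed or Failed as described above. The helper names `is_passed` and `summarize` are hypothetical and not part of any SDK.

```python
def is_passed(output) -> bool:
    """Return True when the evaluation output reads 'Passed' (case-insensitive)."""
    # Assumption: the output is the string "Passed" or "Failed".
    return str(output).strip().lower() == "passed"


def summarize(output, reason) -> str:
    """Build a one-line summary from the Result and Reason fields."""
    status = "helpful" if is_passed(output) else "not helpful"
    return f"Response judged {status}: {reason}"
```

With the example earlier on this page, this could be called as `summarize(result.eval_results[0].output, result.eval_results[0].reason)`.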

Troubleshooting

If you encounter issues with this evaluation:
  • Ensure that both the input (user query) and output (AI response) parameters are provided
  • The helpfulness evaluation works best when the context of the request is clear
  • If evaluating complex responses, make sure the entire response is included
  • Consider combining with other evaluations like completeness or factual-accuracy for more comprehensive assessment

Related Evaluations

  • completeness: Determines if the response addresses all aspects of the query
  • task-completion: Checks if a specific requested task was accomplished
  • instruction-adherence: Evaluates if the response follows specific instructions
  • is-concise: Assesses whether the response avoids unnecessary verbosity
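
The first troubleshooting point, that both `input` and `output` must be provided, can be checked locally before the evaluation is called. The following is a rough sketch and not part of the evaluation library. `validate_eval_inputs` is a hypothetical helper, and it assumes both fields must be non-empty strings, as the Input table suggests.

```python
REQUIRED_KEYS = ("input", "output")


def validate_eval_inputs(inputs: dict) -> list:
    """Return a list of problems with the inputs; an empty list means none were found."""
    problems = []
    for key in REQUIRED_KEYS:
        value = inputs.get(key)
        if not isinstance(value, str):
            problems.append(f"'{key}' is missing or not a string")
        elif not value.strip():
            problems.append(f"'{key}' is empty")
    return problems
```

A caller might run this on the `inputs` dictionary and skip the evaluation, or raise an error, when the returned list is not empty.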