# Assumes `evaluator` is an already-initialized evaluator client.
result = evaluator.evaluate(
    eval_templates="is_factually_consistent",
    inputs={
        "input": "Why doesn't honey go bad?",
        "output": "Because its low moisture and high acidity prevent the growth of bacteria and other microbes.",
        "context": "Honey never spoils because it has low moisture content and high acidity, creating an environment that resists bacteria and microorganisms. Archaeologists have even found pots of honey in ancient Egyptian tombs that are still perfectly edible."
    },
    model_name="turing_flash"
)

print(result.eval_results[0].metrics[0].value)  # evaluation result (see Output below)
print(result.eval_results[0].reason)            # explanation for the verdict
Input

Required Input
  • output (string): The response to be evaluated for factual consistency.
  • context (string): The context provided to the model.

Optional Input
  • input (string): The source material, context, or question.
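
Because input is optional, a call can supply only the required fields. The sketch below reuses the evaluator client and template from the example above; the strings are illustrative.

# Only the required inputs are provided; the optional `input` field is omitted.
result = evaluator.evaluate(
    eval_templates="is_factually_consistent",
    inputs={
        "output": "Honey resists spoilage thanks to its low moisture and high acidity.",
        "context": "Honey never spoils because it has low moisture content and high acidity."
    },
    model_name="turing_flash"
)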
Output
  • Result: Returns Passed if the output is factually consistent with the input, or Failed if it contains inconsistencies.
  • Reason: Provides a detailed explanation of why the response was deemed factually consistent or inconsistent.
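
In code, these fields map onto the result object used in the example above (a sketch assuming the metric value carries the Passed/Failed verdict):

eval_result = result.eval_results[0]
verdict = eval_result.metrics[0].value   # "Result": Passed or Failed (assumed representation)
if verdict != "Passed":
    # "Reason" explains which claims were judged inconsistent with the context.
    print(eval_result.reason)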

What to Do If You Get Undesired Results

If the content is evaluated as factually inconsistent (Failed) and you want to improve it:
  • Verify all facts against reliable sources or the provided context
  • Remove any claims or details not supported by the source material
  • Correct any inaccuracies, contradictions, or misrepresentations
  • Ensure numbers, dates, names, and specific details align with the source
  • Avoid extrapolating beyond what is explicitly stated in the source
  • Use qualifying language (like “may,” “could,” or “suggests”) when appropriate
  • Cite specific parts of the source material when providing information
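
After revising the output along these lines, re-running the same evaluation confirms whether the fix resolved the inconsistency (a minimal sketch reusing the evaluator call from the example above; the corrected answer text is illustrative):

# Re-check a revised answer against the same question and context.
corrected = evaluator.evaluate(
    eval_templates="is_factually_consistent",
    inputs={
        "input": "Why doesn't honey go bad?",
        "output": "Honey's low moisture content and high acidity create an environment that resists bacteria and other microorganisms.",
        "context": "Honey never spoils because it has low moisture content and high acidity, creating an environment that resists bacteria and microorganisms."
    },
    model_name="turing_flash"
)
print(corrected.eval_results[0].metrics[0].value)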

Comparing Is Factually Consistent with Similar Evals

  • Factual Accuracy: While Is Factually Consistent focuses on consistency with the provided input or context, Factual Accuracy might verify claims against broader world knowledge.
  • Groundedness: Is Factually Consistent evaluates whether output contradicts the source, while Groundedness measures how well the output is supported by the source.