Ground Truth Match

Evaluates whether the model-generated output matches the provided ground-truth expected output.

Python

# Assumes `evaluator` is an instantiated Evaluator from the Future AGI evals SDK
result = evaluator.evaluate(
    eval_templates="ground_truth_match",
    inputs={
        "generated_value": "The capital of France is Paris.",
        "expected_value": "Paris is the capital of France."
    },
    model_name="turing_flash"
)

print(result.eval_results[0].output)
print(result.eval_results[0].reason)
TypeScript

import { Evaluator } from "@future-agi/ai-evaluation";

const evaluator = new Evaluator();

const result = await evaluator.evaluate(
  "ground_truth_match",
  {
    generated_value: "The capital of France is Paris.",
    expected_value: "Paris is the capital of France."
  },
  {
    modelName: "turing_flash",
  }
);

console.log(result);
Input

| Input | Type | Required | Description |
| --- | --- | --- | --- |
| generated_value | string | Yes | The model-generated output to be evaluated |
| expected_value | string | Yes | The ground-truth reference output |
Output

| Field | Description |
| --- | --- |
| Result | Returns Passed if the generated output matches or is semantically equivalent to the expected ground truth, Failed if they differ in meaning, correctness, or format |
| Reason | Provides a detailed explanation of the match assessment |

What to Do When Ground Truth Match Fails

  • Review the generated output for factual errors or missing information
  • Check if the format of the generated output matches what was expected
  • Ensure the model has access to the correct context to produce the right answer
  • Consider whether the expected value allows for paraphrasing or requires exact match

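The last point above matters in practice: an exact or normalized string comparison can reject a perfectly correct paraphrase. The sketch below (plain Python, independent of the evals SDK) shows why, using the same example pair from the snippet above:

```python
import string

def normalize(text: str) -> str:
    """Lowercase and strip punctuation for a more lenient comparison."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(text.lower().translate(table).split())

generated = "The capital of France is Paris."
expected = "Paris is the capital of France."

# Exact match fails even though both statements assert the same fact...
print(generated == expected)  # False

# ...and normalization does not help with reordered paraphrases either,
# which is why a semantic (LLM-based) check is needed here.
print(normalize(generated) == normalize(expected))  # False
```

If your expected values are paraphrase-tolerant, Ground Truth Match is the right tool; if they demand byte-for-byte equality, a plain string comparison is cheaper and deterministic.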
Comparing Ground Truth Match with Similar Evals

  • Fuzzy Match: Ground Truth Match evaluates semantic equivalence using an LLM, while Fuzzy Match uses approximate string matching without LLM reasoning.
  • Embedding Similarity: Ground Truth Match gives a Pass/Fail verdict on correctness, while Embedding Similarity returns a continuous similarity score based on vector distance.
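To make the Fuzzy Match contrast concrete, here is a rough sketch using Python's standard-library `difflib` as a stand-in for approximate string matching (the actual Fuzzy Match eval may use a different algorithm):

```python
from difflib import SequenceMatcher

generated = "The capital of France is Paris."
expected = "Paris is the capital of France."

# Fuzzy-match style: approximate character-level similarity, no reasoning
# about meaning. Reordered words lower the score substantially.
ratio = SequenceMatcher(None, generated.lower(), expected.lower()).ratio()
print(f"string similarity: {ratio:.2f}")

# A typical fuzzy threshold (e.g. 0.9) would flag this pair as a mismatch
# even though both sentences state the same fact -- the gap that an
# LLM-based semantic check like Ground Truth Match is meant to close.
print(ratio >= 0.9)  # False
```

Embedding Similarity sits between the two: it captures semantic closeness as a continuous score, but leaves the pass/fail decision (and the threshold) to you.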