Input

Required Input | Type | Description
---|---|---
reference | string | The reference containing the information to be captured.
hypothesis | string | The content to be evaluated for recall against the reference.

Output

Field | Description
---|---
Result | Returns a score representing the recall of the hypothesis against the reference, where higher values indicate better recall.
Reason | Provides a detailed explanation of the recall evaluation.

Overview
Recall Score measures how completely a hypothesis text captures the information present in a reference text. Unlike metrics that focus on exact wording, Recall Score evaluates whether the essential information is preserved, regardless of how it’s phrased. A high recall score indicates that the hypothesis contains most or all of the information from the reference, while a low score suggests significant information has been omitted.
What to Do If You Get Undesired Results
If the recall score is lower than expected:
- Ensure that all key facts, entities, and relationships from the reference are included in the hypothesis
- Check for missing details, numbers, dates, or proper nouns that might be important
- Verify that important contextual information isn’t omitted
- Consider that paraphrasing may preserve recall as long as the core information is included
- For summaries, focus on including the most critical information from the reference
- Keep in mind that recall does not penalize additional information in the hypothesis (that’s measured by precision)
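
The points above can be made concrete with a toy calculation. The sketch below is an illustration only, not the evaluator's actual method: it assumes each text has already been reduced to a set of facts, and the helper names `recall` and `precision` and the sample facts are made up for this example.

```python
from typing import Set


def recall(reference_facts: Set[str], hypothesis_facts: Set[str]) -> float:
    """Fraction of reference facts that also appear in the hypothesis."""
    if not reference_facts:
        return 1.0  # nothing to recall; a convention chosen for this sketch
    return len(reference_facts & hypothesis_facts) / len(reference_facts)


def precision(reference_facts: Set[str], hypothesis_facts: Set[str]) -> float:
    """Fraction of hypothesis facts that are supported by the reference."""
    if not hypothesis_facts:
        return 0.0
    return len(reference_facts & hypothesis_facts) / len(hypothesis_facts)


# Hypothetical facts, invented for illustration.
reference = {"launched in 2019", "based in Berlin", "has 40 employees"}
complete = reference | {"founded by two engineers"}  # everything plus an extra fact
partial = {"launched in 2019", "based in Berlin"}    # one fact omitted

print(recall(reference, complete))     # 1.0: extra information does not lower recall
print(precision(reference, complete))  # 0.75: the extra fact lowers precision instead
print(recall(reference, partial))      # about 0.67: the omitted fact lowers recall
```

In this simplified view, adding the missing fact to the hypothesis is the only way to raise recall; trimming unrelated content changes precision but leaves recall untouched.
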
Comparing Recall Score with Similar Evals
- ROUGE Score: Recall Score focuses on information coverage regardless of wording, while ROUGE Score measures n-gram overlap between the hypothesis and the reference.
- BLEU Score: Recall Score measures how much reference information is captured, while BLEU Score emphasizes precision by measuring how much of the hypothesis matches the reference.
- Completeness: Recall Score measures information coverage from a reference text, whereas Completeness evaluates whether a response fully answers a given query.
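
To see why overlap-based metrics can diverge from information coverage, the sketch below computes simplified unigram recall and unigram precision. It is a rough stand-in for the ideas behind ROUGE and BLEU, not an implementation of either (no stemming, no higher-order n-grams, no brevity penalty); the function names and sample sentences are assumptions made for this example.

```python
from collections import Counter
from typing import List


def tokens(text: str) -> List[str]:
    """Lowercase whitespace tokenization; deliberately naive."""
    return text.lower().split()


def unigram_overlap(reference: str, hypothesis: str) -> int:
    """Number of tokens shared by both texts, respecting repeat counts."""
    ref, hyp = Counter(tokens(reference)), Counter(tokens(hypothesis))
    return sum((ref & hyp).values())


def unigram_recall(reference: str, hypothesis: str) -> float:
    """Share of reference tokens found in the hypothesis (ROUGE-like idea)."""
    ref_tokens = tokens(reference)
    if not ref_tokens:
        return 0.0
    return unigram_overlap(reference, hypothesis) / len(ref_tokens)


def unigram_precision(reference: str, hypothesis: str) -> float:
    """Share of hypothesis tokens found in the reference (BLEU-like idea)."""
    hyp_tokens = tokens(hypothesis)
    if not hyp_tokens:
        return 0.0
    return unigram_overlap(reference, hypothesis) / len(hyp_tokens)


reference = "the meeting was moved to friday"
paraphrase = "they rescheduled the meeting for friday"
padded = "the meeting was moved to friday at noon in room four"

# The paraphrase keeps the key information but shares only half the words.
print(unigram_recall(reference, paraphrase))  # 0.5

# The padded hypothesis covers every reference word, so overlap recall is 1.0,
# while overlap precision drops because of the extra words.
print(unigram_recall(reference, padded))      # 1.0
print(unigram_precision(reference, padded))
```

A paraphrase like the one above is the kind of case where an information-focused recall evaluation and a word-overlap score would be expected to disagree.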