Evaluates how closely the provided context matches the expected context. This check ensures that the context used to generate responses aligns with what is anticipated or required, supporting accurate and relevant outputs.
Input: context, response, comparator (Cosine Similarity, Jaccard Similarity, Normalised Levenshtein Similarity, Jaro-Winkler Similarity, Sorensen-Dice Similarity)
Output: Score between 0 and 1
Interpretation: Higher scores indicate greater similarity between the context and the response; lower scores indicate less similarity.
Click here to learn how to set up evaluation using the Python SDK.
| Input | Parameter | Type | Description |
|---|---|---|---|
| Required Inputs | context | string | The context provided to the model. |
| | response | string | The response generated by the model. |
| Configuration Parameters | Comparator | string | The method to use for comparison (e.g., Cosine Similarity). Class names are listed in the table below. |
| | Failure Threshold | float | The threshold below which the evaluation fails (e.g., 0.7). |
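As a rough illustration of how these inputs and configuration parameters fit together, the sketch below defines a stand-in evaluator. All names here are hypothetical and chosen for illustration; they are not the SDK's API. A simple word-level Jaccard comparator is used as the pluggable similarity function.

```python
from typing import Callable

def evaluate_context_similarity(
    context: str,
    response: str,
    comparator: Callable[[str, str], float],
    failure_threshold: float = 0.7,
) -> dict:
    """Score the context/response pair and apply the failure threshold."""
    score = comparator(context, response)
    return {"score": score, "passed": score >= failure_threshold}

def jaccard_similarity(a: str, b: str) -> float:
    """Word-level Jaccard similarity: |A ∩ B| / |A ∪ B|."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)

result = evaluate_context_similarity(
    context="The Eiffel Tower is located in Paris, France.",
    response="The Eiffel Tower stands in Paris.",
    comparator=jaccard_similarity,
    failure_threshold=0.3,
)
print(result)  # e.g. {'score': 0.4, 'passed': True}
```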
| Comparator Name | Class Name |
|---|---|
| Cosine Similarity | Comparator.COSINE.value |
| Jaccard Similarity | Comparator.JACCARD.value |
| Normalised Levenshtein Similarity | Comparator.NORMALISED_LEVENSHTEIN.value |
| Jaro-Winkler Similarity | Comparator.JARO_WINKLER.value |
| Sorensen-Dice Similarity | Comparator.SORENSEN_DICE.value |
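The class names in this table are what you pass as the Comparator configuration. Purely as an illustration of what a few of these measures compute, the simplified stand-ins below show plain-Python versions; they are not the library's implementations.

```python
import math
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity over word-count vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

def sorensen_dice_similarity(a: str, b: str) -> float:
    """Sorensen-Dice coefficient over word sets: 2|A ∩ B| / (|A| + |B|)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return 2 * len(sa & sb) / (len(sa) + len(sb))

def normalised_levenshtein_similarity(a: str, b: str) -> float:
    """1 - (character edit distance / length of the longer string)."""
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                  # deletion
                            curr[j - 1] + 1,              # insertion
                            prev[j - 1] + (ca != cb)))    # substitution
        prev = curr
    return 1.0 - prev[-1] / max(len(a), len(b))
```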
| Output | Type | Description |
|---|---|---|
| Score | float | Returns a score between 0 and 1. Higher scores indicate greater similarity between the context and the response; lower scores indicate less similarity. |
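As a small illustration of how the returned score might be interpreted against the failure threshold (0.7 here is just the example value from the configuration table, not a required setting):

```python
def interpret(score: float, failure_threshold: float = 0.7) -> str:
    """Map the 0-1 similarity score to a pass/fail verdict."""
    verdict = "pass" if score >= failure_threshold else "fail"
    return f"score={score:.2f} -> {verdict}"

print(interpret(0.91))  # high similarity between context and response -> pass
print(interpret(0.42))  # low similarity -> fail
```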
1. Identify discrepancies: determine which elements of the provided context do not align with the expected context, and note any missing or extraneous information that affects similarity (see the sketch after these steps).
2. Enhance context alignment: adjust the provided context to better match the expected context, adding missing relevant details and removing irrelevant content.
3. Implement system adjustments: ensure context retrieval processes prioritise similarity with the expected context, and refine context processing to better align with anticipated requirements.
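For the first step, a rough way to surface such discrepancies is to diff the provided context against the expected context. This is a word-level sketch with example strings, not a semantic comparison:

```python
provided = "The Eiffel Tower is in Paris and was completed in 1889."
expected = "The Eiffel Tower is in Paris, France, and was designed by Gustave Eiffel."

def words(text: str) -> set:
    """Lowercase word set with basic punctuation stripped."""
    return {w.strip(".,") for w in text.lower().split()}

missing = words(expected) - words(provided)      # expected but absent from the provided context
extraneous = words(provided) - words(expected)   # present but not anticipated

print("Missing:", sorted(missing))
print("Extraneous:", sorted(extraneous))
```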