Evaluates how closely the provided context matches the expected context. This evaluation is crucial for ensuring that the context used in generating responses aligns with what is anticipated or required, thereby supporting accurate and relevant outputs.
The comparison can use one of the following methods: Cosine Similarity, Jaccard Similarity, Normalised Levenshtein Similarity, Jaro Winkler Similarity, or Sorensen Dice Similarity.

Click here to learn how to set up evaluation using the Python SDK.
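To make the scoring concrete, here is a minimal sketch of what a bag-of-words Cosine Similarity comparison between two texts looks like. This is illustrative only; the SDK's actual comparator may tokenise differently or operate on embeddings rather than raw token counts.

```python
from collections import Counter
import math

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Bag-of-words cosine similarity between two texts, in [0, 1]."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[tok] * b[tok] for tok in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

print(cosine_similarity("the cat sat on the mat", "a cat sat on a mat"))  # 0.5
```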
| Input | Parameter | Type | Description |
|---|---|---|---|
| Required Inputs | context | string | The context provided to the model. |
| | response | string | The response generated by the model. |
| Configuration Parameters | Comparator | string | The method to use for comparison (Cosine Similarity, etc.). Class names are listed in the table below. |
| | Failure Threshold | float | The threshold below which the evaluation fails (e.g., 0.7). |
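The sketch below shows how these inputs and configuration parameters fit together conceptually. The function name `evaluate_context_similarity` and the return shape are assumptions made for illustration, not the SDK's actual API.

```python
from typing import Callable

def evaluate_context_similarity(
    context: str,
    response: str,
    comparator: Callable[[str, str], float],
    failure_threshold: float = 0.7,
) -> dict:
    """Score the response against the context and apply the failure threshold."""
    score = comparator(context, response)
    return {"score": score, "passed": score >= failure_threshold}
```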
| Comparator Name | Class Name |
|---|---|
| Cosine Similarity | Comparator.COSINE.value |
| Jaccard Similarity | Comparator.JACCARD.value |
| Normalised Levenshtein Similarity | Comparator.NORMALISED_LEVENSHTEIN.value |
| Jaro Winkler Similarity | Comparator.JARO_WINKLER.value |
| Sorensen Dice Similarity | Comparator.SORENSEN_DICE.value |
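For reference, the sketch below implements two of the comparators above (Jaccard and Sorensen Dice) using their standard set-based formulas over whitespace tokens; the SDK's implementations may tokenise or normalise differently.

```python
def jaccard_similarity(text_a: str, text_b: str) -> float:
    """|A intersect B| / |A union B| over the two texts' token sets."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    return len(a & b) / len(a | b) if a | b else 1.0

def sorensen_dice_similarity(text_a: str, text_b: str) -> float:
    """2 * |A intersect B| / (|A| + |B|) over the two texts' token sets."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    return 2 * len(a & b) / (len(a) + len(b)) if a or b else 1.0
```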
| Output | Type | Description |
|---|---|---|
| Score | float | Returns a score between 0 and 1. Higher scores indicate greater similarity between the context and the response; lower scores indicate less similarity. |
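Continuing the hypothetical sketches above, the score is compared against the Failure Threshold to decide whether the evaluation passes:

```python
# Uses the cosine_similarity and evaluate_context_similarity sketches above.
result = evaluate_context_similarity(
    context="the eiffel tower is in paris",
    response="the eiffel tower stands in paris",
    comparator=cosine_similarity,  # any comparator with the same signature works
    failure_threshold=0.7,
)
print(result)  # {'score': 0.833..., 'passed': True}
```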