Assesses the similarity between an expected response and an actual response. This evaluation uses various comparison methods to determine how closely the actual response matches the expected one.
The evaluation passes when the response is sufficiently similar to the expected_response based on the chosen Comparator, and fails when the response deviates significantly from the expected_response.

Click here to learn how to set up evaluation using the Python SDK.
| Input Type | Parameter | Type | Description |
|---|---|---|---|
| Required Inputs | expected_response | string | The reference answer. |
| Required Inputs | response | string | The generated answer. |
| Configuration Parameters | comparator | string | The comparison method to use (e.g., Comparator.COSINE.value). |
| Configuration Parameters | failure_threshold | float | The score threshold below which the evaluation fails (e.g., 0.7). |
| Comparator Name | Class Name |
|---|---|
| Cosine Similarity | Comparator.COSINE.value |
| Jaccard Similarity | Comparator.JACCARD.value |
| Normalised Levenshtein Similarity | Comparator.NORMALISED_LEVENSHTEIN.value |
| Jaro-Winkler Similarity | Comparator.JARO_WINKLER.value |
| Sørensen-Dice Similarity | Comparator.SORENSEN_DICE.value |
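To make the comparator choices concrete, here is a minimal sketch of two of the listed metrics, cosine similarity (over token-count vectors) and Jaccard similarity (over token sets). These are illustrative textbook formulas, not the SDK's exact implementation; the tokenization (lowercased whitespace split) is an assumption for the example.

```python
import math
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    # Cosine similarity over bag-of-words count vectors (illustrative only).
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in set(va) & set(vb))
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

def jaccard_similarity(a: str, b: str) -> float:
    # Jaccard similarity over token sets: |A ∩ B| / |A ∪ B|.
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0
```

Both functions return a value in [0, 1], matching the score range described in the output table below.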
| Output | Type | Description |
|---|---|---|
| Score | float | A score between 0 and 1. Values ≥ failure_threshold indicate sufficient similarity. |
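Putting the pieces together, the sketch below shows how a score is turned into a pass/fail outcome against failure_threshold. The evaluate helper and its Jaccard comparator are hypothetical stand-ins for illustration, not the SDK's API; the pass rule (score ≥ failure_threshold) follows the output table above.

```python
def jaccard(a: str, b: str) -> float:
    # Token-set Jaccard similarity (illustrative comparator).
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def evaluate(expected_response: str, response: str, failure_threshold: float = 0.7) -> dict:
    # Hypothetical helper: scores the pair and applies the threshold rule,
    # passing when score >= failure_threshold and failing otherwise.
    score = jaccard(expected_response, response)
    return {"score": round(score, 3), "passed": score >= failure_threshold}
```

For example, an exact-match pair scores 1.0 and passes, while a pair with no shared tokens scores 0.0 and fails at the default 0.7 threshold.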