Eval Definition
Levenshtein Similarity
Measures text similarity based on the minimum number of single-character edits required to transform one text into another.
Evaluation Using Interface
Input:
- Required Inputs:
- expected_text: The reference text against which to compare.
- response: The text to be evaluated.
Output:
- Score: A numeric score between 0 and 1, where 1 represents perfect similarity.
- Reason: A detailed explanation of the similarity assessment.
Evaluation Using Python SDK
Click here to learn how to set up evaluation using the Python SDK.
Input:
- Required Inputs:
- expected_text: string - The reference text against which to compare.
- response: string - The text to be evaluated.
Output:
- Score: Returns a float value between 0 and 1, where higher values indicate greater similarity.
- Reason: Provides a detailed explanation of the similarity assessment.
Example Output:
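The exact shape of the returned result depends on the SDK, so the following is only an illustrative sketch with made-up values, showing the score and reason pair described above:

```
Score: 0.92
Reason: The response differs from the expected text by a small number of character-level edits.
```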
Overview
Levenshtein Similarity is a character-level metric that quantifies how similar two text sequences are by calculating the minimum number of operations needed to transform one sequence into the other. The output is normalized to a score between 0 and 1, where 1 indicates an exact match and 0 indicates maximum dissimilarity. This metric is useful for use-cases in spelling correction, OCR, and deterministic text matching.
Edit Operations
- The edit operations allowed in the Levenshtein calculation are:
- Insertion: Add a character (e.g., `kitten` -> `kitteng`)
- Deletion: Remove a character (e.g., `kitten` -> `kiten`)
- Substitution: Replace one character with another (e.g., `kitten` -> `sitten`)
- Each operation has a cost of 1. The Levenshtein distance is the minimum number of such operations needed to transform one string into the other.
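For readers who want to see the mechanics, here is a minimal sketch of the classic dynamic-programming computation of the distance. It is a reference implementation for illustration, not the SDK's internal code, and the function name `levenshtein_distance` is just a placeholder:

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `a` into `b`."""
    # prev[j] = distance between the current prefix of `a` and b[:j]
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into an empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1      # substitution cost (0 on match)
            curr.append(min(
                prev[j] + 1,                 # deletion of ca
                curr[j - 1] + 1,             # insertion of cb
                prev[j - 1] + cost,          # substitution / match
            ))
        prev = curr
    return prev[-1]

print(levenshtein_distance("kitten", "sitting"))  # 3
```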
Normalized Levenshtein Score
- A score of 1 means the two strings are identical.
- A score of 0 means the strings are maximally dissimilar: the number of edits required equals the length of the longer string.
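The page does not state the exact normalization formula, but a common convention, and the one assumed in this sketch, is to divide the distance by the length of the longer string and subtract the result from 1 (reusing `levenshtein_distance` from the sketch above):

```python
def levenshtein_similarity(expected_text: str, response: str) -> float:
    """Normalized score in [0, 1]: 1.0 for identical strings."""
    if not expected_text and not response:
        return 1.0  # two empty strings are identical
    distance = levenshtein_distance(expected_text, response)
    return 1.0 - distance / max(len(expected_text), len(response))

# "kitten" -> "sitting" needs 3 edits and the longer string has 7 characters,
# so the score is 1 - 3/7 ≈ 0.571.
print(round(levenshtein_similarity("kitten", "sitting"), 3))  # 0.571
```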
What to Do If You Get Undesired Results
If the Levenshtein similarity score is lower than expected:
- Consider case sensitivity - the comparison is typically case-sensitive
- Check for whitespace and punctuation differences, which count as edits; a light preprocessing pass can remove both (see the sketch after this list)
- For meaning-based comparison rather than exact character matching, consider using semantic similarity metrics
- For texts with similar meaning but different wording, consider metrics like ROUGE, BLEU, or embedding similarity
- Remember that this metric measures character-level similarity, not semantic similarity
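If casing or incidental whitespace is pulling the score down, normalizing both texts before scoring often helps. The sketch below is illustrative only (the `normalize` helper is not part of the SDK) and reuses `levenshtein_similarity` from the sketch above:

```python
import re

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so casing and spacing
    differences no longer count as edits."""
    return re.sub(r"\s+", " ", text.strip().lower())

expected = "The quick brown Fox"
response = "the quick  brown fox"

print(round(levenshtein_similarity(expected, response), 3))               # penalized by case and spacing
print(levenshtein_similarity(normalize(expected), normalize(response)))   # 1.0 after normalization
```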
Comparing Levenshtein Similarity with Similar Evals
- Fuzzy Match: While Levenshtein Similarity focuses on character-level edits, Fuzzy Match may use different algorithms for approximate string matching.
- Embedding Similarity: Levenshtein Similarity measures character-level edits, whereas Embedding Similarity captures semantic similarity through vector representations.
- BLEU Score: Levenshtein operates at character level, while BLEU focuses on n-gram precision between the candidate and reference texts.