Evaluation Using Interface

Input:

  • Required Inputs:
    • expected_text: The reference text against which to compare.
    • response: The text to be evaluated.

Output:

  • Score: A numeric score between 0 and 1, where 1 represents perfect similarity.
  • Reason: A detailed explanation of the similarity assessment.

Evaluation Using Python SDK

Click here to learn how to set up evaluation using the Python SDK.

Input:

  • Required Inputs:
    • expected_text: string - The reference text against which to compare.
    • response: string - The text to be evaluated.

Output:

  • Score: Returns a float value between 0 and 1, where higher values indicate greater similarity.
  • Reason: Provides a detailed explanation of the similarity assessment.

# `evaluator` is assumed to be an Evaluator instance initialized during SDK setup
result = evaluator.evaluate(
    eval_templates="levenshtein_similarity", 
    inputs={
        "expected_text": "The Eiffel Tower is a famous landmark in Paris, built in 1889 for the World's Fair. It stands 324 meters tall.",
        "response": "The Eiffel Tower, located in Paris, was built in 1889 and is 324 meters high."
    },
    model_name="turing_flash"
)

print(result.eval_results[0].metrics[0].value)
print(result.eval_results[0].reason)

Example Output:

[Output would show the similarity score and detailed reason]

Overview

Levenshtein Similarity is a character-level metric that quantifies how similar two text sequences are by calculating the minimum number of edit operations needed to transform one sequence into the other. The output is normalized to a score between 0 and 1, where 1 indicates an exact match and 0 indicates maximum dissimilarity. This metric is useful for use cases such as spelling correction, OCR post-processing, and deterministic text matching.

Edit Operations

  • Three edit operations are allowed in the Levenshtein calculation:
    • Insertion: Add a character (e.g., kitten -> kitteng)
    • Deletion: Remove a character (e.g., kitten -> kiten)
    • Substitution: Replace one character with another (e.g., kitten -> sitten)
  • Each operation has a cost of 1. The final distance is the sum of all such operations needed to match the two strings.
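The SDK's internal implementation is not shown in this guide; the sketch below is a minimal version of the standard Wagner-Fischer dynamic-programming algorithm for the edit distance described above, using unit cost for each operation:

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions
    needed to transform string a into string b (each costs 1)."""
    # prev[j] holds the distance between a[:i-1] and b[:j] for the previous row.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution (or match)
        prev = curr
    return prev[-1]

print(levenshtein_distance("kitten", "sitting"))  # → 3
```

The classic example: "kitten" → "sitting" requires two substitutions (k→s, e→i) and one insertion (g), for a distance of 3.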

Normalized Levenshtein Score

Score = 1 - (Levenshtein Distance / max(Length of Prediction, Length of Reference))
  • Score of 1 means the two strings are identical.
  • Score of 0 means the edit distance equals the length of the longer string, i.e., complete dissimilarity.
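The normalization step can be sketched as follows (a self-contained illustration, not the SDK's own code; the distance helper is the same dynamic-programming routine):

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Standard unit-cost edit distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]

def levenshtein_similarity(expected: str, response: str) -> float:
    """Score = 1 - distance / max(len(expected), len(response))."""
    if not expected and not response:
        return 1.0  # two empty strings count as identical
    distance = levenshtein_distance(expected, response)
    return 1 - distance / max(len(expected), len(response))

print(levenshtein_similarity("kitten", "kitten"))  # → 1.0
print(levenshtein_similarity("abc", "xyz"))        # → 0.0
```

Identical strings have distance 0, so the score is 1; "abc" vs "xyz" needs 3 substitutions over a maximum length of 3, so the score is 0.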

What to Do If You Get Undesired Results

If the Levenshtein similarity score is lower than expected:

  • Check case sensitivity: the comparison is typically case-sensitive
  • Check for whitespace and punctuation differences, which count as edits
  • For meaning-based comparison rather than exact character matching, consider using semantic similarity metrics
  • For texts with similar meaning but different wording, consider metrics like ROUGE, BLEU, or embedding similarity
  • Remember that this metric measures character-level similarity, not semantic similarity

Comparing Levenshtein Similarity with Similar Evals

  • Fuzzy Match: While Levenshtein Similarity focuses on character-level edits, Fuzzy Match may use different algorithms for approximate string matching.
  • Embedding Similarity: Levenshtein Similarity measures character-level edits, whereas Embedding Similarity captures semantic similarity through vector representations.
  • BLEU Score: Levenshtein operates at character level, while BLEU focuses on n-gram precision between the candidate and reference texts.
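To illustrate the first comparison concretely: Python's standard-library `difflib.SequenceMatcher` implements a Ratcliff/Obershelp-style fuzzy ratio, a different approximate-matching algorithm than Levenshtein edit distance. The two can disagree on the same pair of strings (the Levenshtein helper here is a minimal sketch, not the SDK's implementation):

```python
from difflib import SequenceMatcher

def levenshtein_distance(a: str, b: str) -> int:
    """Standard unit-cost edit distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]

expected = "The Eiffel Tower is in Paris"
response = "The Eiffel Tower, located in Paris"

lev_score = 1 - levenshtein_distance(expected, response) / max(len(expected), len(response))
fuzzy_score = SequenceMatcher(None, expected, response).ratio()
print(f"Levenshtein: {lev_score:.2f}, fuzzy ratio: {fuzzy_score:.2f}")
```

Both scores fall between 0 and 1 here, but they are computed differently: Levenshtein counts individual character edits, while the fuzzy ratio rewards the longest matching subsequences.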