Overview

Embedding Similarity evaluates how similar two texts are in meaning by comparing their vector embeddings using distance-based similarity measures. Traditional metrics such as BLEU or ROUGE rely on word overlap and can fail when the generated output is a valid paraphrase with no lexical match.

How Is Similarity Calculated?

Once both texts are encoded into high-dimensional vector representations, the similarity between the two vectors u and v is computed using one of the following methods (a minimal sketch of all three appears after the list):

  1. Cosine Similarity: Measures the cosine of the angle between vectors (the corresponding cosine distance is one minus this value).

$$\text{Cosine Similarity} = \frac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{u}\| \|\mathbf{v}\|}$$
  2. Euclidean Distance: Measures the straight-line distance between vectors (L2 norm).

$$\text{Euclidean Distance} = \sqrt{\sum_{i=1}^{n} (u_i - v_i)^2}$$
  3. Manhattan Distance: Measures the sum of absolute differences between vectors (L1 norm).

$$\text{Manhattan Distance} = \sum_{i=1}^{n} |u_i - v_i|$$
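
For intuition, here is a minimal NumPy sketch (not part of the SDK; the vectors are toy values standing in for real embeddings) computing all three measures:

import numpy as np

# Toy embeddings standing in for encoded texts (real embeddings are high-dimensional)
u = np.array([0.2, 0.1, 0.4, 0.3])
v = np.array([0.1, 0.2, 0.4, 0.3])

cosine_similarity = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
euclidean_distance = np.linalg.norm(u - v)   # L2 norm of the difference
manhattan_distance = np.sum(np.abs(u - v))   # L1 norm of the difference

print(cosine_similarity, euclidean_distance, manhattan_distance)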

Embedding Similarity Eval using Future AGI’s Python SDK

Click here to learn how to set up evaluation using the Python SDK.

Input & Configuration:

| | Parameter | Type | Description |
|---|---|---|---|
| Required Inputs | response | str | Model-generated output to be evaluated. |
| | expected_text | str or List[str] | One or more reference texts for comparison. |
| Optional Config | similarity_method | str | Distance function used to compare embedding vectors. Options: "cosine" (default), "euclidean", "manhattan". |
| | normalize | bool | Whether to normalize embedding vectors before computing similarity. Default is True. |
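
To illustrate what normalization typically means here (a sketch of common practice, not a description of the SDK's internals), each embedding is scaled to unit L2 norm before the chosen distance is computed, so magnitude differences do not dominate the comparison:

import numpy as np

def l2_normalize(vec):
    # Scale a vector to unit L2 norm; distances between unit vectors are on a comparable scale
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

u = l2_normalize(np.array([3.0, 4.0]))   # -> array([0.6, 0.8]), unit length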

Parameter Options:

| similarity_method | Description |
|---|---|
| cosine | Measures the cosine of the angle between two vectors |
| euclidean | Computes the straight-line (L2) distance between vectors |
| manhattan | Computes the L1 (absolute) distance between vectors |

Output:

| Output Field | Type | Description |
|---|---|---|
| score | float | Value between 0 and 1 representing semantic similarity. Higher values indicate stronger similarity. |

Example:

from fi.evals.metrics import EmbeddingSimilarity
from fi.testcases import TestCase

# Pair the model-generated response with the reference text
test_case = TestCase(
    response="The quick brown fox jumps over the lazy dog",
    expected_text="The fast brown fox leaps over the sleepy dog"
)

# Compare normalized embeddings with cosine similarity
evaluator = EmbeddingSimilarity(config={
    "similarity_method": "cosine",
    "normalize": True
})

# Run the eval and print the similarity score
result = evaluator.evaluate([test_case])
print(f"{result.eval_results[0].metrics[0].value:.4f}")

Output:

0.8835
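
The same pattern applies to the other documented options. As an illustrative sketch (how the SDK aggregates the score over multiple references is not specified here), the following compares a response against a list of reference texts using the Euclidean method:

from fi.evals.metrics import EmbeddingSimilarity
from fi.testcases import TestCase

# expected_text also accepts a list of reference texts
test_case = TestCase(
    response="The meeting was moved to Friday afternoon",
    expected_text=[
        "The meeting has been rescheduled to Friday afternoon",
        "They postponed the meeting until Friday"
    ]
)

# Use the Euclidean (L2) distance on normalized embeddings
evaluator = EmbeddingSimilarity(config={
    "similarity_method": "euclidean",
    "normalize": True
})

result = evaluator.evaluate([test_case])
print(f"{result.eval_results[0].metrics[0].value:.4f}")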