Answer similarity validation evaluates text output from an LLM by measuring how similar it is to a given reference answer. It supports several similarity measures for comparing the generated text against the reference.

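As a conceptual illustration only (not the library's internal implementation), embedding-based similarity can be pictured as the cosine of the angle between the two texts' embedding vectors. The embed() call referenced in the comment is a hypothetical placeholder for whatever embedding model is used.

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine of the angle between two embedding vectors, in [-1, 1].
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# embed() is a hypothetical placeholder for an embedding model:
# score = cosine_similarity(embed(response), embed(expected_response))
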
Required Parameters

Parameter          Description
expected_response  The expected correct response
response           The actual response to be evaluated

Optional Configuration

Parameter          Description
comparator         The method to use for comparison
failure_threshold  The threshold below which the evaluation fails
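
A rough sketch of how failure_threshold is applied, assuming a score at or above the threshold counts as a pass (the exact boundary behaviour is up to the SDK):

def passes(similarity_score: float, failure_threshold: float = 0.8) -> bool:
    # Assumption: scores at or above the threshold pass; below it, they fail.
    return similarity_score >= failure_threshold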

Available Comparators

Comparator                Description
Comparator.COSINE         Measures similarity based on vector angle between text embeddings
Comparator.LEVENSHTEIN    Calculates edit distance between strings, normalized to [0,1]
Comparator.JARO_WINKLER   String similarity that favors strings matching from the beginning
Comparator.JACCARD        Measures overlap between word sets using intersection over union
Comparator.SORENSEN_DICE  Similar to Jaccard but gives more weight to overlapping terms
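
For intuition, the set-based comparators above can be sketched on whitespace-tokenized word sets. This is only an illustration under that tokenization assumption, not the library's implementation, which may normalize text differently.

def jaccard(a: str, b: str) -> float:
    # Intersection over union of the two word sets.
    x, y = set(a.lower().split()), set(b.lower().split())
    return len(x & y) / len(x | y) if (x | y) else 1.0

def sorensen_dice(a: str, b: str) -> float:
    # Twice the overlap divided by the combined set sizes,
    # which weights shared terms more heavily than Jaccard.
    x, y = set(a.lower().split()), set(b.lower().split())
    return 2 * len(x & y) / (len(x) + len(y)) if (x or y) else 1.0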

Example Usage

from fi.evals import AnswerSimilarity, EvalClient
from fi.evals.types import Comparator
from fi.testcases import LLMTestCase

# Initialize the evaluation client
evaluator = EvalClient(
    fi_api_key="your_api_key", 
    fi_secret_key="your_secret_key"
)

# Create a test case with required parameters
test_case = LLMTestCase(
    expected_response="Paris is the capital city of France.",
    response="The capital of France is Paris."
)

# Initialize the answer similarity evaluator (with optional configuration)
answer_similarity = AnswerSimilarity(
    comparator=Comparator.COSINE.value,
    failure_threshold=0.8
)

# Run the evaluation
result = evaluator.evaluate(answer_similarity, test_case)
print(result)  # Passes if the similarity score meets or exceeds failure_threshold
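
The same client and test case can be reused with a different comparator. The sketch below swaps in Levenshtein edit distance; the 0.7 threshold is only an illustrative choice, not a recommended value.

# Re-run the same test case with string-level edit distance instead of embeddings
levenshtein_similarity = AnswerSimilarity(
    comparator=Comparator.LEVENSHTEIN.value,
    failure_threshold=0.7  # illustrative threshold; tune for your use case
)

result = evaluator.evaluate(levenshtein_similarity, test_case)
print(result)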