Evaluation Using Interface

Input:

  • Required Inputs:
    • context: The context column provided to the model.
    • response: The response column generated by the model.
  • Configuration Parameters:
    • Comparator: The method to use for comparison (Cosine Similarity, Jaccard Similarity, Normalised Levenshtein Similarity, Jaro-Winkler Similarity, Sorensen-Dice Similarity); see the sketch at the end of this section for how these metrics behave.
    • Failure Threshold: The threshold below which the evaluation fails (e.g., 0.7)

Output:

  • Score: A percentage score between 0 and 100

Interpretation:

  • Higher scores: Indicate that the provided context closely matches the context used to generate the response.
  • Lower scores: Indicate that the provided context diverges from the context used to generate the response.
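
The comparator selected above determines how the two texts are scored. As an illustrative sketch only, and not the platform's internal implementation, the standard formulas behind these metrics can be computed on plain Python strings as shown below; every function name here is hypothetical.

from collections import Counter
from math import sqrt

# Illustrative formulas only -- the platform computes these metrics internally.

def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity over token-count vectors."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

def jaccard_similarity(a: str, b: str) -> float:
    """Intersection over union of the two token sets."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def sorensen_dice_similarity(a: str, b: str) -> float:
    """2 * |intersection| / (|A| + |B|) over the two token sets."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return 2 * len(sa & sb) / (len(sa) + len(sb)) if sa or sb else 0.0

def normalised_levenshtein_similarity(a: str, b: str) -> float:
    """1 - edit_distance / max_length, computed character by character."""
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ch_a != ch_b)))
        prev = curr
    return 1 - prev[-1] / max(len(a), len(b))

context = "The Earth orbits around the Sun in an elliptical path."
response = "The Earth's orbit around the Sun is not perfectly circular but elliptical."
print(round(cosine_similarity(context, response), 2))
print(round(jaccard_similarity(context, response), 2))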

Evaluation Using Python SDK

Click here to learn how to set up evaluation using the Python SDK.


| Input | Parameter | Type | Description |
|---|---|---|---|
| Required Inputs | context | string | The context provided to the model. |
| Required Inputs | response | string | The response generated by the model. |
| Configuration Parameters | Comparator | string | The method to use for comparison (Cosine Similarity, etc.). Class names are listed in the table below. |
| Configuration Parameters | Failure Threshold | float | The threshold below which the evaluation fails (e.g., 0.7). |

| Comparator Name | Class Name |
|---|---|
| Cosine Similarity | Comparator.COSINE.value |
| Jaccard Similarity | Comparator.JACCARD.value |
| Normalised Levenshtein Similarity | Comparator.NORMALISED_LEVENSHTEIN.value |
| Jaro-Winkler Similarity | Comparator.JARO_WINKLER.value |
| Sorensen-Dice Similarity | Comparator.SORENSEN_DICE.value |
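
Any of these class names can be passed as the comparator value in the template config. For example, to score with Jaro-Winkler instead of cosine similarity (same pattern as the full example below):

template = ContextSimilarity(
    config={
        "comparator": Comparator.JARO_WINKLER.value,  # any value from the table above
        "failure_threshold": 0.7
    }
)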

| Output | Type | Description |
|---|---|---|
| Score | float | Returns a score between 0 and 1. Higher scores indicate more similarity between context and response; lower scores indicate less similarity. |

from fi.testcases import TestCase
from fi.evals.types import Comparator
from fi.evals.templates import ContextSimilarity

# Configure the Context Similarity eval with a comparator and a failure threshold
template = ContextSimilarity(
    config={
        "comparator": Comparator.COSINE.value,
        "failure_threshold": 0.7
    }
)

# Build a test case with the context given to the model and the model's response
test_case = TestCase(
    context="The Earth orbits around the Sun in an elliptical path.",
    response="The Earth's orbit around the Sun is not perfectly circular but elliptical."
)

# `evaluator` is the evaluation client initialised as described in the SDK setup guide linked above
result = evaluator.evaluate(eval_templates=[template], inputs=[test_case])

# Extract the similarity score (a float between 0 and 1)
score = result.eval_results[0].metrics[0].value
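
Because the returned score is a plain float between 0 and 1, it can also be checked directly against the configured threshold in your own code; the pass/fail print below is an illustrative sketch, not an SDK feature.

FAILURE_THRESHOLD = 0.7  # same value passed in the template config

if score >= FAILURE_THRESHOLD:
    print(f"Context similarity passed with score {score:.2f}")
else:
    print(f"Context similarity failed: {score:.2f} is below {FAILURE_THRESHOLD}")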


What to do when Context Similarity is Low

First, identify discrepancies: determine which elements of the provided context do not align with the expected context, and note any missing or extraneous information that affects similarity.

Next, enhance context alignment by adjusting the provided context to better match the expected context, adding missing relevant details, and removing irrelevant content.

Finally, implement system adjustments to ensure context retrieval processes prioritise similarity with the expected context, refining context processing to better align with anticipated requirements.
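
As a sketch of that last step, the snippet below re-ranks candidate context passages by token-overlap (Jaccard) similarity to a reference context before they are passed to the model. The rerank_contexts helper, the candidate passages, and the reference text are all hypothetical and not part of the SDK.

def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two texts, from 0 to 1."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def rerank_contexts(candidates: list[str], reference_context: str, top_k: int = 3) -> list[str]:
    """Keep the top_k candidate passages most similar to the reference context."""
    ranked = sorted(candidates, key=lambda c: jaccard(c, reference_context), reverse=True)
    return ranked[:top_k]

# Hypothetical usage: retrieved chunks re-ranked before being sent to the model
candidates = [
    "The Earth orbits around the Sun in an elliptical path.",
    "Mars has two small moons, Phobos and Deimos.",
    "Kepler described planetary orbits as ellipses with the Sun at one focus.",
]
reference = "The Earth's orbit around the Sun is elliptical rather than circular."
print(rerank_contexts(candidates, reference, top_k=2))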


Differentiating Context Similarity from Similar Evals

  1. Context Relevance: Assesses whether the context is sufficient and appropriate for answering the query, while Context Similarity focuses on how closely the provided context matches the expected context.
  2. Context Adherence: Measures how well responses stay within the provided context, whereas Context Similarity evaluates the alignment between provided and expected context.