Definition

Context Similarity evaluates how closely the provided context matches the expected context. This check helps ensure that the context used to generate responses aligns with what is required, which in turn supports accurate and relevant outputs.


Calculation

The evaluation process begins by analysing both the provided and expected contexts to identify key elements, concepts, and relationships.

A similarity metric is then applied to compare the two contexts in a structured, quantifiable way. Metrics such as Cosine Similarity, Normalised Levenshtein Similarity, Jaro-Winkler Similarity, Jaccard Similarity, and Sørensen-Dice Similarity measure the degree of similarity between them.
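To make three of these metrics concrete, here is a minimal Python sketch of Jaccard, Sørensen-Dice, and Cosine Similarity over word tokens. The lowercase whitespace tokenizer and the function names are assumptions made for illustration; the source does not say how the evaluator tokenizes or represents the contexts.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Assumption: lowercase whitespace tokenization. A real evaluator may
    # use a different tokenizer or an embedding model instead.
    return text.lower().split()


def jaccard(a: str, b: str) -> float:
    # Size of the shared vocabulary divided by the size of the combined vocabulary.
    sa, sb = set(tokenize(a)), set(tokenize(b))
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def sorensen_dice(a: str, b: str) -> float:
    # Twice the shared vocabulary divided by the sum of both vocabulary sizes.
    sa, sb = set(tokenize(a)), set(tokenize(b))
    if not sa and not sb:
        return 1.0
    return 2 * len(sa & sb) / (len(sa) + len(sb))


def cosine(a: str, b: str) -> float:
    # Cosine of the angle between the two bag-of-words count vectors.
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


provided = "the cat sat on the mat"
expected = "the cat lay on the rug"

print(round(jaccard(provided, expected), 3))        # 0.429
print(round(sorensen_dice(provided, expected), 3))  # 0.6
print(round(cosine(provided, expected), 3))         # 0.75
```

The same pair of contexts yields three different scores, so a threshold chosen for one metric should not be reused for another without checking.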

Based on the selected metric, a numerical similarity score is computed, giving a clear and interpretable measure of how closely the provided context aligns with the expected context.
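As an illustration of how one metric turns into a single score, the sketch below computes a Normalised Levenshtein Similarity at the character level. It assumes the common normalisation of one minus the edit distance divided by the longer string's length; the source does not state which normalisation the evaluator uses, and the function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance, keeping one row at a time.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep when equal)
            ))
        previous = current
    return previous[-1]


def normalised_levenshtein_similarity(a: str, b: str) -> float:
    # Assumed normalisation: 1 - distance / length of the longer string,
    # so identical strings score 1.0 and fully different strings score 0.0.
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


print(levenshtein("kitten", "sitting"))                                  # 3
print(round(normalised_levenshtein_similarity("kitten", "sitting"), 3))  # 0.571
```

Under this normalisation the score always falls between 0 and 1, which makes it comparable across context pairs of different lengths.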


What to do when Context Similarity is Low

First, identify discrepancies: determine which elements of the provided context do not align with the expected context, and note any missing or extraneous information that lowers the similarity.
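A rough way to surface such discrepancies is to compare the vocabularies of the two contexts. The following Python sketch is only a starting point: the helper name context_discrepancies is hypothetical, and it assumes that word-level differences are a useful proxy for missing or extraneous information.

```python
def context_discrepancies(provided: str, expected: str) -> dict[str, list[str]]:
    # Assumption: lowercase whitespace tokens stand in for "elements" of a context.
    provided_terms = set(provided.lower().split())
    expected_terms = set(expected.lower().split())
    return {
        # Terms the expected context has but the provided context lacks.
        "missing": sorted(expected_terms - provided_terms),
        # Terms the provided context has that the expected context does not.
        "extraneous": sorted(provided_terms - expected_terms),
    }


report = context_discrepancies(
    "Paris is the capital of France and hosts the Louvre",
    "Paris is the capital of France with about two million residents",
)
print(report["missing"])     # ['about', 'million', 'residents', 'two', 'with']
print(report["extraneous"])  # ['and', 'hosts', 'louvre']
```

Here the report suggests that the provided context drifted towards landmarks while the expected context covers population, which points to what should be added or removed.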

Next, enhance context alignment by adjusting the provided context to better match the expected context, adding missing relevant details, and removing irrelevant content.

Finally, adjust the system so that context retrieval prioritises similarity with the expected context, and refine context processing to better match anticipated requirements.
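One way to explore such an adjustment during offline evaluation, where an expected context is available, is to rank candidate chunks by their similarity to it and keep the best ones. The sketch below is an assumption-laden illustration: rank_by_similarity and top_k are hypothetical names, and Jaccard over whitespace tokens is used only because it is simple to show.

```python
def jaccard(a: str, b: str) -> float:
    # Shared vocabulary over combined vocabulary, on lowercase whitespace tokens.
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def rank_by_similarity(candidates: list[str], expected: str, top_k: int = 2) -> list[tuple[float, str]]:
    # Score every candidate against the expected context, highest score first.
    scored = [(jaccard(candidate, expected), candidate) for candidate in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


expected = "refunds are issued within 14 days of purchase"
candidates = [
    "our office is closed on public holidays",
    "shipping takes 3 to 5 business days",
    "refunds are issued within 14 days",
]

ranked = rank_by_similarity(candidates, expected)
for score, text in ranked:
    print(f"{score:.2f}  {text}")
```

If the chunk a retriever actually returns sits far down this ranking, that is a hint that the retrieval or chunking settings need tuning.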


Differentiating Context Similarity from Similar Evals

  1. Context Relevance: Assesses whether the context is sufficient and appropriate for answering the query, while Context Similarity focuses on how closely the provided context matches the expected context.
  2. Context Adherence: Measures how well responses stay within the provided context, whereas Context Similarity evaluates the alignment between provided and expected context.