Evaluation Using Interface

Input:

  • Required Inputs:
    • context: The context column provided to the model
    • input: The input column provided to the model
  • Configuration parameters:
    • Check Internet: Whether to check the internet for relevant information

Output:

  • Score: Percentage score between 0 and 100

Interpretation:

  • Higher scores: Indicate that the context is more relevant to the query.
  • Lower scores: Suggest that the context is less relevant to the query.

Evaluation Using Python SDK

Click here to learn how to set up evaluation using the Python SDK.

Input:

  • Required Inputs:
    • context: string - The context column provided to the model
    • input: string - The input column provided to the model
  • Configuration parameters:
    • check_internet: bool - Whether to check the internet for relevant information

Output:

  • Score: float - A relevance score between 0 and 1

Interpretation:

  • Higher scores: Indicate that the context is more relevant to the query.
  • Lower scores: Suggest that the context is less relevant to the query.
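
The example below runs the Context Relevance eval against a single test case:
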
from fi.evals import EvalClient
from fi.testcases import TestCase
from fi.evals.templates import ContextRelevance

# Initialise the client (assumes API credentials are already
# configured, e.g. via environment variables)
evaluator = EvalClient()

relevance_eval = ContextRelevance(config={
    "check_internet": False
})

test_case = TestCase(
    context="The current temperature is 72°F with partly cloudy skies.",
    input="What is the weather like?",
)

result = evaluator.evaluate(eval_templates=[relevance_eval], inputs=[test_case])
relevance_score = result.eval_results[0].metrics[0].value


What to do when Context Relevance is Low

When context relevance is low, the first step is to identify which parts of the context are either irrelevant or insufficient to address the query effectively.

If critical information is missing, additional details should be incorporated to ensure completeness. At the same time, any irrelevant content should be removed or refined to improve focus and alignment with the query.

Implementing mechanisms to enhance context-query alignment can further strengthen relevance, ensuring that only pertinent information is considered. Additionally, optimising context retrieval processes can help prioritise relevant details, improving overall response accuracy and coherence.
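
One way to act on this programmatically is to use the eval itself as a retrieval gate, keeping only context chunks that clear a relevance threshold. The sketch below reuses the evaluator and relevance_eval objects from the earlier example; the filter_relevant_chunks helper, the 0.5 threshold, and the candidate chunk list are illustrative assumptions, not part of the SDK.

def filter_relevant_chunks(query, chunks, threshold=0.5):
    # Keep only candidate context chunks whose relevance to the
    # query clears the (illustrative) threshold.
    kept = []
    for chunk in chunks:
        test_case = TestCase(context=chunk, input=query)
        result = evaluator.evaluate(
            eval_templates=[relevance_eval], inputs=[test_case]
        )
        score = result.eval_results[0].metrics[0].value
        if score >= threshold:
            kept.append(chunk)
    return kept

relevant = filter_relevant_chunks(
    "What is the weather like?",
    ["The current temperature is 72°F.", "Our store opens at 9am."],
)

The surviving chunks can then be re-ranked or supplemented with a second retrieval pass before being sent to the model.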


Differentiating Context Relevance with Similar Evals

  1. Context Adherence: Measures how well responses stay within the provided context, while Context Relevance evaluates the sufficiency and appropriateness of the context itself.
  2. Completeness: Evaluates whether the response fully answers the query, while Context Relevance focuses on whether the context can support a complete response.
  3. Context Similarity: Measures how closely the provided context matches the expected context, while Context Relevance assesses whether the context is sufficient and appropriate for the query.