Context Adherence
Evaluates how well a response stays within the provided context by checking whether the output contains information that is not present in that context. This evaluation is important for ensuring factual consistency and preventing hallucinations in responses.
Evaluation Using Interface
Input:
- Required Inputs:
- output: The output column generated by the model.
- context: The context column provided to the model.
Output:
- Score: Percentage score between 0 and 100
Interpretation:
- Higher scores: Indicate that the output is more contextually consistent.
- Lower scores: Suggest that the output is less contextually consistent.
Evaluation Using Python SDK
Click here to learn how to set up evaluation using the Python SDK.
Input:
- Required Inputs:
- output (string): The output column generated by the model.
- context (string): The context column provided to the model.
Output:
- Score (float): Returns a score between 0 and 1.
Interpretation:
- Higher scores: Indicate that the output is more contextually consistent.
- Lower scores: Suggest that the output is less contextually consistent.
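The interface reports a percentage between 0 and 100, while the SDK returns a float between 0 and 1. The sketch below shows one way to put SDK scores on the interface scale and to pick out low-scoring rows. It assumes the two scales are linearly related, which this page does not state explicitly. The helper names and the 0.5 threshold are hypothetical and are not part of the SDK.

```python
from typing import List, Sequence


def to_percentage(sdk_score: float) -> float:
    """Convert a 0-1 score to a 0-100 percentage (assumes a linear mapping)."""
    if not 0.0 <= sdk_score <= 1.0:
        raise ValueError(f"expected a score in [0, 1], got {sdk_score}")
    return round(sdk_score * 100, 2)


def low_adherence_rows(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Return the indices of rows whose score is below the threshold.

    The default threshold is an arbitrary illustration, not a recommended value.
    """
    return [i for i, score in enumerate(scores) if score < threshold]


if __name__ == "__main__":
    scores = [0.9, 0.3, 0.5, 0.1]  # made-up example scores
    print([to_percentage(s) for s in scores])
    print(low_adherence_rows(scores))
```

Pick a threshold that fits your own data. Rows it flags are candidates for the review steps described in the next section.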
What to do when Context Adherence is Low
When context adherence is low, start by identifying statements that are not supported by the provided context and checking for implicit versus explicit information to assess potential misinterpretations.
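As a rough first pass at spotting unsupported statements, the output can be split into sentences and each sentence checked for how many of its words also appear in the context. The sketch below is a simple lexical heuristic for manual triage. It is not how the Context Adherence score is computed, and the function name and the 0.6 overlap threshold are illustrative assumptions.

```python
import re
from typing import List, Set, Tuple


def _tokens(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def unsupported_sentences(
    output: str, context: str, min_overlap: float = 0.6
) -> List[Tuple[str, float]]:
    """Flag output sentences whose word overlap with the context is low.

    Returns (sentence, overlap) pairs for sentences below min_overlap.
    Lexical overlap is only a proxy: paraphrases can be flagged wrongly,
    and false statements that reuse context words can slip through.
    """
    context_tokens = _tokens(context)
    flagged = []
    # Naive sentence split on ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", output.strip()):
        words = _tokens(sentence)
        if not words:
            continue
        overlap = len(words & context_tokens) / len(words)
        if overlap < min_overlap:
            flagged.append((sentence, round(overlap, 2)))
    return flagged


if __name__ == "__main__":
    context = "The Eiffel Tower is in Paris. It was completed in 1889."
    output = (
        "The Eiffel Tower is in Paris. "
        "It was designed by a famous architect from Lyon."
    )
    for sentence, overlap in unsupported_sentences(output, context):
        print(f"{overlap:.2f}  {sentence}")
```

Flagged sentences still need a manual read, since a low overlap may simply mean the output paraphrased the context.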
Reviewing how the context is processed can help pinpoint inconsistencies. If necessary, expand context coverage to fill in gaps, clarify ambiguous details, and add missing relevant information.
To improve adherence, implement stricter context binding, integrate fact-checking mechanisms, and enhance overall context processing.
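One common way to apply stricter context binding is at the prompt level: tell the model to answer only from the supplied context and give it an explicit fallback when the context is insufficient. The sketch below is a minimal illustration. The function name, the instruction wording, and the presence of a separate question input are assumptions, not something this page prescribes.

```python
def build_grounded_prompt(context: str, question: str) -> str:
    """Build a prompt that restricts the model to the supplied context.

    The instruction wording is illustrative; adjust it to your model and task.
    """
    return (
        "Answer using only the information in the context below. "
        "If the context does not contain the answer, reply exactly: "
        '"I don\'t know based on the provided context."\n\n'
        f"Context:\n{context.strip()}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


if __name__ == "__main__":
    prompt = build_grounded_prompt(
        context="The Eiffel Tower was completed in 1889.",
        question="When was the Eiffel Tower completed?",
    )
    print(prompt)
```

After changing the prompt, re-run the evaluation on the same rows to check whether the score actually improves.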
Comparing Context Adherence with Similar Evals
- Context Relevance: While Context Adherence focuses on whether the output stays within the bounds of the context, Context Relevance evaluates whether the provided context is sufficient and appropriate for the query.
- Prompt/Instruction Adherence: Context Adherence measures factual consistency with the context, while Prompt/Instruction Adherence evaluates whether the output follows the given instructions and format requirements.