Evaluation Using Interface
Input:
- Required Inputs:
  - output: The output column generated by the model.
  - context: The context column provided to the model.
Output:
- Score: Percentage score between 0 and 100.
- Higher scores: Indicate that the output is more contextually consistent.
- Lower scores: Suggest that the output is less contextually consistent.
Evaluation Using SDK
Click here to learn how to set up evaluation using the SDK.
Input:
- Required Inputs:
  - output: string - The output column generated by the model.
  - context: string - The context column provided to the model.
Output:
- Score: float - Returns a score between 0 and 1.
- Higher scores: Indicate that the output is more contextually consistent.
- Lower scores: Suggest that the output is less contextually consistent.
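As a concrete illustration, here is a minimal sketch of what calling this eval from code might look like. The `EvalClient` class, its `evaluate` method, and the eval name `"context_adherence"` are hypothetical placeholders standing in for the real SDK client; only the input columns (`output`, `context`) and the 0-1 float score come from the spec above.

```python
# A minimal sketch of invoking the eval from code. EvalClient,
# evaluate(), and the eval name "context_adherence" are hypothetical
# placeholders, not the SDK's confirmed API.

from typing import Dict


class EvalClient:
    """Stand-in for the real SDK client class."""

    def evaluate(self, eval_name: str, inputs: Dict[str, str]) -> float:
        # A real client would call the evaluation service here and
        # return its score; a fixed dummy value keeps the sketch runnable.
        return 0.87


client = EvalClient()
score = client.evaluate(
    eval_name="context_adherence",  # hypothetical eval identifier
    inputs={
        "output": "The Eiffel Tower is 330 metres tall.",  # model output column
        "context": "The Eiffel Tower stands 330 m high.",  # context column
    },
)

# Per the spec above, the score is a float between 0 and 1;
# higher values mean the output is more contextually consistent.
print(f"context adherence: {score:.2f}")
```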
What to Do When Context Adherence Is Low
When context adherence is low, start by identifying statements that are not supported by the provided context and checking for implicit versus explicit information to assess potential misinterpretations. Reviewing how the context is processed can help pinpoint inconsistencies. If necessary, expand context coverage to fill gaps, clarify ambiguous details, and add missing relevant information. To improve adherence, implement stricter context binding, integrate fact-checking mechanisms, and enhance overall context processing. A lightweight first pass at the first step is sketched below.
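The sketch below flags output sentences with low lexical overlap against the context, as a starting point for finding unsupported statements. This is an illustrative heuristic, not the eval's actual scoring method; the function name and threshold are arbitrary, and pure lexical overlap will mis-flag faithful paraphrases, so treat hits as candidates for manual review.

```python
# Illustrative heuristic (not this eval's scoring method): flag output
# sentences with little lexical overlap against the context, as a
# first pass at spotting statements the context does not support.

import re


def flag_unsupported(output: str, context: str, threshold: float = 0.5) -> list[str]:
    """Return output sentences whose token overlap with the context
    falls below `threshold`. Purely lexical, so faithful paraphrases
    may be flagged too; treat results as review candidates."""
    context_tokens = set(re.findall(r"\w+", context.lower()))
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", output.strip()):
        tokens = set(re.findall(r"\w+", sentence.lower()))
        if not tokens:
            continue
        overlap = len(tokens & context_tokens) / len(tokens)
        if overlap < threshold:
            flagged.append(sentence)
    return flagged


context = "The report covers Q3 revenue, which grew 12% year over year."
output = "Q3 revenue grew 12% year over year. The CEO resigned in October."
for sentence in flag_unsupported(output, context):
    print("Review:", sentence)  # prints the unsupported CEO claim
```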
Comparing Context Adherence with Similar Evals
- Context Relevance: While Context Adherence focuses on staying within context bounds, Context Relevance evaluates whether the provided context is sufficient and appropriate for the query.
- Prompt/Instruction Adherence: Context Adherence measures factual consistency with context, while Prompt Adherence evaluates following instructions and format requirements.