Evaluation Using Interface

Input:
  • Required Inputs:
    • output: The output column generated by the model.
    • context: The context column provided to the model.
Output:
  • Score: A percentage score between 0 and 100.
Interpretation:
  • Higher scores: Indicate that the output is more contextually consistent.
  • Lower scores: Suggest that the output is less contextually consistent.

Evaluation Using SDK

Click here to learn how to set up evaluation using the SDK.
Input:
  • Required Inputs:
    • output: string - The output column generated by the model.
    • context: string - The context column provided to the model.
Output:
  • Score: float - Returns a score between 0 and 1.
Interpretation:
  • Higher scores: Indicate that the output is more contextually consistent.
  • Lower scores: Suggest that the output is less contextually consistent.
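
The snippet below scores a single context/output pair against the context_adherence template. It assumes `evaluator` is an evaluator client that has already been initialized as described in the SDK setup guide linked above.
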
# `evaluator` is assumed to be initialized per the SDK setup guide linked above
result = evaluator.evaluate(
    eval_templates="context_adherence",
    inputs={
        "context": "Honey never spoils because it has low moisture content and high acidity, creating an environment that resists bacteria and microorganisms. Archaeologists have even found pots of honey in ancient Egyptian tombs that are still perfectly edible.",
        "output": "Honey doesn't spoil because its low moisture and high acidity prevent the growth of bacteria and other microbes."
    },
    model_name="turing_flash"
)

print(result.eval_results[0].output)  # the 0-1 adherence score
print(result.eval_results[0].reason)  # the evaluator's explanation for the score
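
Note that the SDK returns the score on a 0-1 scale, while the interface reports a percentage between 0 and 100. Assuming `output` holds the numeric score, as the output description above indicates, converting between the two is a single multiplication:

score = result.eval_results[0].output            # float in [0, 1]
print(f"Context adherence: {score * 100:.0f}%")  # same scale as the interface score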

What to do when Context Adherence is Low

When context adherence is low, start by identifying statements in the output that are not supported by the provided context, and check whether implicit information was treated as explicit, a common source of misinterpretation. Reviewing how the context is processed can help pinpoint where inconsistencies creep in. If the context itself is incomplete, expand its coverage to fill gaps, clarify ambiguous details, and add missing relevant information. To improve adherence, bind generation more strictly to the supplied context, integrate fact-checking mechanisms, and strengthen the context-processing pipeline.
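
For batch evaluations, a small triage helper can surface failing cases and their explanations for manual review. Below is a minimal sketch reusing the evaluate call from the example above; the `flag_low_adherence` helper and the 0.7 threshold are hypothetical, and it assumes `output` on each eval result holds the 0-1 score:

ADHERENCE_THRESHOLD = 0.7  # hypothetical cutoff; tune for your use case

def flag_low_adherence(evaluator, rows, threshold=ADHERENCE_THRESHOLD):
    """Collect rows scoring below the threshold, with the evaluator's reasons."""
    flagged = []
    for row in rows:
        result = evaluator.evaluate(
            eval_templates="context_adherence",
            inputs={"context": row["context"], "output": row["output"]},
            model_name="turing_flash",
        )
        res = result.eval_results[0]
        if res.output < threshold:  # assumes `output` holds the 0-1 score
            flagged.append({"row": row, "score": res.output, "reason": res.reason})
    return flagged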

Comparing Context Adherence with Similar Evals

  1. Context Relevance: While Context Adherence focuses on staying within context bounds, Context Relevance evaluates whether the provided context is sufficient and appropriate for the query.
  2. Prompt/Instruction Adherence: Context Adherence measures factual consistency with context, while Prompt Adherence evaluates following instructions and format requirements.
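
Because the two evals answer different questions, they also take different inputs: Context Adherence judges the output against the context, while Context Relevance judges the context against the query. The sketch below runs them side by side; the "context_relevance" template name and its "input" field are assumptions for illustration, not confirmed parts of the API:

context = "Honey never spoils because it has low moisture content and high acidity..."  # as above
output = "Honey doesn't spoil because its low moisture and high acidity prevent microbial growth."
query = "Why doesn't honey spoil?"  # hypothetical user query

adherence = evaluator.evaluate(
    eval_templates="context_adherence",
    inputs={"context": context, "output": output},  # output judged against context
    model_name="turing_flash",
)
relevance = evaluator.evaluate(
    eval_templates="context_relevance",  # assumed template name
    inputs={"context": context, "input": query},  # field name assumed
    model_name="turing_flash",
)
print(adherence.eval_results[0].output, relevance.eval_results[0].output)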