Evaluation Using Interface
Input:- Required Inputs:
- output: column containing conversation history between the user and the model
- Score: percentage score between 0 and 100
- Higher scores: Indicate that the conversation is more coherent.
- Lower scores: Suggest that the conversation is less coherent.
Evaluation Using SDK
Click here to learn how to setup evaluation using SDK.Input:
- Required Inputs:
- output:
string
- conversation history between the user and the model provided as query and response pairs
- output:
- Score:
float
- returns score between 0 and 1
- Higher scores: Indicate that the conversation is more coherent.
- Lower scores: Suggest that the conversation is less coherent.
What to do when Conversation Coherence is Low
- Review conversation history to identify where context breaks occurred
- Implement context window management to ensure important information is retained
- Consider reducing the length of conversation threads if context loss is persistent
Comparing Conversation Coherence with Similar Evals
- Conversation Resolution: While Coherence focuses on the flow and context maintenance throughout the conversation, Resolution evaluates whether the conversation reaches a satisfactory conclusion.
- Context Adherence: Coherence differs from Context Adherence as it evaluates the internal consistency of the conversation rather than adherence to external context.
- Completeness: Coherence focuses on the logical flow between messages, while Completeness evaluates whether individual responses fully address their queries.