Conversation Resolution: Query Completion Metric
Evaluates whether each user query receives a complete response, checking that questions are answered and conversations reach satisfactory conclusions.
# Python. Assumes `evaluator` is an already-initialized Evaluator instance.
result = evaluator.evaluate(
    eval_templates="conversation_resolution",
    inputs={
        "conversation": '''
User: My Wi-Fi keeps disconnecting every few minutes.
Assistant: You can try restarting your router and updating your network drivers.
User: I restarted the router and it's stable now. Thanks!
Assistant: Glad to hear that! Let me know if you need anything else.
'''
    },
    model_name="turing_flash"
)

# Score and explanation for the first (and only) evaluation result
print(result.eval_results[0].output)
print(result.eval_results[0].reason)
// TypeScript
import { Evaluator, Templates } from "@future-agi/ai-evaluation";

const evaluator = new Evaluator();

const result = await evaluator.evaluate(
  "conversation_resolution",
  {
    conversation: "User: My Wi-Fi keeps disconnecting every few minutes. Assistant: You can try restarting your router and updating your network drivers. User: I restarted the router and it's stable now. Thanks! Assistant: Glad to hear that! Let me know if you need anything else."
  },
  {
    modelName: "turing_flash",
  }
);
console.log(result);

| Input | | |
|---|---|---|
| Required Input | Type | Description |
| conversation | string | Conversation history between the user and the model, provided as query and response pairs |
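The example on this page passes the conversation as a single string with `User:` and `Assistant:` prefixes. When the history is stored as query and response pairs, a small helper can build that string. This is a minimal sketch and not part of the SDK: `format_conversation` is a hypothetical name, and the `User:`/`Assistant:` labels are copied from the example rather than taken from a documented requirement.

```python
from typing import Iterable, Tuple


def format_conversation(pairs: Iterable[Tuple[str, str]]) -> str:
    """Join (query, response) pairs into one transcript string.

    Hypothetical helper. The "User:"/"Assistant:" labels mirror the example
    on this page and are an assumption, not a documented requirement.
    """
    lines = []
    for query, response in pairs:
        lines.append(f"User: {query.strip()}")
        lines.append(f"Assistant: {response.strip()}")
    return "\n".join(lines)


pairs = [
    ("My Wi-Fi keeps disconnecting every few minutes.",
     "You can try restarting your router and updating your network drivers."),
    ("I restarted the router and it's stable now. Thanks!",
     "Glad to hear that! Let me know if you need anything else."),
]
conversation = format_conversation(pairs)
print(conversation)
```

The resulting string could then be supplied as the `conversation` input.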
| Output | |
|---|---|
| Field | Description |
| Result | Returns a score, where higher scores indicate a more fully resolved conversation |
| Reason | Provides a detailed explanation of the conversation resolution assessment |
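The Result and Reason fields can feed a simple pass/fail check. The sketch below assumes the score can be converted to a number and uses 0.5 as an arbitrary cutoff; neither the numeric range nor the threshold is specified on this page. `ResolutionCheck` and `check_resolution` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ResolutionCheck:
    score: float
    reason: str
    resolved: bool


def check_resolution(output, reason: str, threshold: float = 0.5) -> ResolutionCheck:
    """Label a conversation as resolved when its score meets the threshold.

    Assumptions: `output` converts to a float, higher means more resolved
    (as the Output table states), and 0.5 is an arbitrary cutoff.
    """
    score = float(output)
    return ResolutionCheck(score=score, reason=reason, resolved=score >= threshold)


# Illustrative values, not real evaluator output
check = check_resolution(0.9, "The user confirmed the connection is stable.")
print(check.resolved)
```

In practice the threshold would be tuned against conversations that have been reviewed by hand.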
What to do when Conversation Resolution is Low
- Add confirmation mechanisms to verify user satisfaction
- Develop fallback responses for unclear or complex queries
- Track common patterns in unresolved queries for improvement
- Consider implementing a clarification system for ambiguous requests
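Tracking patterns in unresolved queries can start with a tally of topics among low-scoring conversations. This is a minimal sketch under stated assumptions: each record already carries a `topic` label and a numeric `score`, the 0.5 threshold is arbitrary, and the sample records are invented for illustration. `unresolved_topic_counts` is a hypothetical helper.

```python
from collections import Counter
from typing import Dict, List


def unresolved_topic_counts(records: List[Dict], threshold: float = 0.5) -> Counter:
    """Count topics among conversations that score below the threshold."""
    return Counter(r["topic"] for r in records if r["score"] < threshold)


# Invented sample data; real records would come from stored evaluation results
records = [
    {"topic": "billing", "score": 0.2},
    {"topic": "wifi", "score": 0.9},
    {"topic": "billing", "score": 0.3},
    {"topic": "login", "score": 0.4},
]
counts = unresolved_topic_counts(records)
for topic, n in counts.most_common():
    print(topic, n)
```

Topics that appear most often are natural candidates for new fallback responses or clarification prompts.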
Comparing Conversation Resolution with Similar Evals
- Conversation Coherence: While Resolution focuses on addressing user needs, Coherence evaluates the logical flow and context maintenance. A conversation can be perfectly coherent but fail to resolve user queries, or vice versa.
- Completeness: Resolution differs from Completeness as it focuses on satisfactory conclusion rather than comprehensive coverage. A response can be complete but not resolve the user’s actual need.
- Context Relevance: Resolution evaluates whether queries are answered, while Context Relevance assesses if the provided context is sufficient for generating responses. A response can use relevant context but still fail to resolve the user’s query.