1. Conversation Resolution
This evaluation examines the series of messages exchanged between the user and the AI system. It uses an LLM to analyse the interaction and determine whether the conversation reached a satisfactory resolution, returning a score that represents the level of resolution achieved. A higher score indicates that the conversation was resolved effectively; a lower score points to an incomplete, unclear, or unresolved conversation. (A minimal sketch of this LLM-as-judge approach appears after the list below.) The Conversation Resolution evaluation enables developers to:
- Spot unresolved interactions and improve system responses.
- Benchmark different conversational models to find the best-performing one.
- Continuously improve the quality of AI interactions to meet user expectations.
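To make the mechanism concrete, here is a minimal LLM-as-judge sketch. It assumes the OpenAI Python client, the model name gpt-4o-mini, and a hypothetical helper judge_conversation with an invented rubric; the platform's actual prompt and scoring rubric are not public, so treat this as an illustration of the approach, not the implementation.

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def judge_conversation(messages: list[dict], rubric: str) -> float:
    """Ask an LLM to score a conversation against a rubric; returns 0.0-1.0."""
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in messages)
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed judge model; any capable model works
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": rubric},
            {"role": "user", "content": transcript},
        ],
    )
    return float(json.loads(response.choices[0].message.content)["score"])

# Hypothetical rubric; the real evaluation's prompt may differ.
RESOLUTION_RUBRIC = (
    'Rate how fully the user\'s issue was resolved by the end of this '
    'conversation. Respond with JSON: {"score": <number between 0 and 1>}.'
)
```

Calling judge_conversation(messages, RESOLUTION_RUBRIC) then yields a resolution score between 0 and 1.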
a. Using Interface
Required Parameters
- output: An array of conversation messages between the user and the assistant (required).
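For illustration, the output payload for a fully resolved conversation might look like the following; the role/content field names are an assumption based on the common chat-message format, not a confirmed schema.

```python
# Hypothetical example of the `output` parameter (field names assumed).
output = [
    {"role": "user", "content": "My invoice shows a duplicate charge."},
    {"role": "assistant", "content": "I can help. What is the invoice number?"},
    {"role": "user", "content": "It's INV-1042."},
    {"role": "assistant", "content": "Thanks. I've refunded the duplicate charge."},
    {"role": "user", "content": "Great, that solves it. Thank you!"},
]
```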
b. Using SDK
Export your API key and Secret key into your environment variables.
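As a sketch, assuming the SDK reads credentials from variables named FI_API_KEY and FI_SECRET_KEY (hypothetical names; check your SDK reference for the exact ones), the keys can be set from Python like this:

```python
import os

# Hypothetical variable names; substitute the names your SDK expects.
os.environ["FI_API_KEY"] = "your-api-key"
os.environ["FI_SECRET_KEY"] = "your-secret-key"
```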
2. Conversation Coherence
This eval determines whether a conversation flows logically, stays consistent with the context, and avoids abrupt or irrelevant shifts. It ensures that context is maintained, responses are logical, and the user experience is enhanced. Without coherence, conversations can become confusing or nonsensical, leading to user frustration and disengagement. (The judge sketch from the previous section can be reused with a coherence rubric, as shown after the list below.) This evaluation is useful for:
- Testing the logical flow of chatbot or virtual assistant interactions.
- Benchmarking different conversational models for their ability to handle complex, multi-turn dialogues.
- Identifying breakdowns in context management or logical progression during AI interactions.
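The hedged judge_conversation sketch from the Conversation Resolution section can be pointed at a coherence rubric instead; again, the rubric wording is an invented assumption, not the platform's actual prompt.

```python
# Hypothetical rubric for coherence judging (assumption, not the real prompt).
COHERENCE_RUBRIC = (
    "Rate how coherent this conversation is: does each turn follow "
    "logically from the previous ones, stay on topic, and preserve "
    'context? Respond with JSON: {"score": <number between 0 and 1>}.'
)

coherence_score = judge_conversation(output, COHERENCE_RUBRIC)
```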
a. Using Interface
Required Inputs
- output: Column that contains the complete conversation, represented as an array of user and assistant messages.
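For instance, a conversation like the following hypothetical exchange should receive a low coherence score, since the assistant's final reply abruptly abandons the established context:

```python
# Hypothetical incoherent conversation: the last reply ignores the topic.
incoherent = [
    {"role": "user", "content": "Can you help me reset my password?"},
    {"role": "assistant", "content": "Sure, I'll send a reset link to your email."},
    {"role": "user", "content": "I didn't receive the email."},
    {"role": "assistant", "content": "Our premium plan starts at $20 per month."},
]
```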