Customer Agent Conversation Quality Evaluation Metric

A conversation-level quality metric that assesses the overall user experience, including clarity, helpfulness, responsiveness, tone, and user satisfaction.

# Assumes `evaluator` is an already-initialized Evaluator client (setup not shown here).
result = evaluator.evaluate(
    eval_templates="customer_agent_conversation_quality",
    inputs={
        "conversation": "User: Hi, I need help resetting my password.\nAgent: Of course! I'll send a reset link to your registered email. Is there anything else I can help you with?\nUser: That's all, thanks!\nAgent: You're welcome! Have a great day."
    },
    model_name="turing_flash"
)

print(result.eval_results[0].output)  # quality score from 1 to 5
print(result.eval_results[0].reason)  # explanation of the assessment

import { Evaluator, Templates } from "@future-agi/ai-evaluation";

const evaluator = new Evaluator();

const result = await evaluator.evaluate(
  "customer_agent_conversation_quality",
  {
    conversation: "User: Hi, I need help resetting my password.\nAgent: Of course! I'll send a reset link to your registered email. Is there anything else I can help you with?\nUser: That's all, thanks!\nAgent: You're welcome! Have a great day."
  },
  {
    modelName: "turing_flash",
  }
);

console.log(result);

Input
	Required Input	Type	Description
	`conversation`	`string`	The full conversation history between the customer and agent
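
The `conversation` value in the examples above is a single string with one `User:` or `Agent:` line per turn. If transcripts are stored as structured turns, a small helper along these lines could build that string. This is a sketch only: `format_conversation` and the `(speaker, text)` tuple shape are hypothetical, and the speaker-prefixed layout is assumed from the example rather than documented as a requirement.

```python
from typing import Iterable, Tuple


def format_conversation(turns: Iterable[Tuple[str, str]]) -> str:
    """Join (speaker, text) turns into a newline-separated transcript.

    Hypothetical helper: mirrors the speaker-prefixed layout used in the
    example above, which is assumed (not confirmed) to be the expected format.
    """
    lines = []
    for speaker, text in turns:
        # Collapse runs of whitespace (including newlines) so each turn
        # stays on a single line.
        cleaned = " ".join(text.split())
        lines.append(f"{speaker}: {cleaned}")
    return "\n".join(lines)


turns = [
    ("User", "Hi, I need help resetting my password."),
    ("Agent", "Of course! I'll send a reset link to your registered email."),
    ("User", "That's all, thanks!"),
]
conversation = format_conversation(turns)
print(conversation)
```

The resulting string can then be passed as the `conversation` input.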

Output
	Field	Description
	Result	Returns an integer score from `1` to `5`, where `1` is very poor and `5` is excellent overall conversation quality
	Reason	Provides a detailed explanation of the conversation quality assessment
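
Because the result is a rating from 1 to 5, downstream code often needs to validate it and map it to a readable band. The sketch below assumes the raw value may arrive as an int, a float, or a numeric string, since the exact type is not stated here. `parse_quality_score` and the labels for scores 2 to 4 are illustrative; only "very poor" and "excellent" come from the table above.

```python
def parse_quality_score(raw) -> int:
    """Coerce a raw result value to an integer score in the range 1-5.

    Hypothetical helper: raises ValueError for non-numeric, fractional,
    or out-of-range values instead of guessing.
    """
    value = float(str(raw).strip())  # raises ValueError if not numeric
    if not value.is_integer() or not 1 <= value <= 5:
        raise ValueError(f"expected an integer score from 1 to 5, got {raw!r}")
    return int(value)


# Labels for 2-4 are assumptions; 1 and 5 follow the Output table.
QUALITY_BANDS = {
    1: "very poor",
    2: "poor",
    3: "fair",
    4: "good",
    5: "excellent",
}

score = parse_quality_score("4")
print(score, QUALITY_BANDS[score])
```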

What to Do When Conversation Quality Score is Low

Review the full conversation for clarity, tone, and helpfulness
Identify specific turns where the agent failed to meet user expectations
Improve response templates for common customer scenarios
Combine with other customer agent evals to pinpoint specific weaknesses
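
The review steps above can be partly automated by collecting scores and reasons across many conversations and surfacing the lowest-rated ones first. The following is a sketch under assumptions: `ScoredConversation`, `flag_for_review`, and the threshold of 2 are hypothetical, the sample records are made up for illustration, and the code works on scores you have already collected rather than on the SDK's result object.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredConversation:
    conversation_id: str
    score: int   # 1 (very poor) to 5 (excellent)
    reason: str  # explanation returned alongside the score


def flag_for_review(
    results: List[ScoredConversation], threshold: int = 2
) -> List[ScoredConversation]:
    """Return conversations scoring at or below `threshold`, worst first."""
    flagged = [r for r in results if r.score <= threshold]
    # sorted() is stable, so ties keep their original order.
    return sorted(flagged, key=lambda r: r.score)


# Illustrative records only, not real evaluation output.
results = [
    ScoredConversation("c1", 2, "Agent did not confirm the fix worked."),
    ScoredConversation("c2", 5, "Clear, polite, and fully helpful."),
    ScoredConversation("c3", 1, "Agent ignored the user's question."),
]

for item in flag_for_review(results):
    print(f"{item.conversation_id}: score={item.score} reason={item.reason}")
```

The `reason` text for each flagged conversation is a natural starting point for locating the turns where the agent fell short.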

Comparing Conversation Quality with Similar Evals

Conversation Resolution: Conversation Quality provides a holistic quality rating, while Conversation Resolution focuses specifically on whether the user’s query was fully addressed.
Is Helpful: Conversation Quality rates the overall interaction experience, while Is Helpful evaluates individual response helpfulness.
