Customer Agent Language Handling Evaluation Metric

Verifies that the agent correctly detects the customer's language or dialect and responds in it, including when the customer switches languages mid-conversation.

# Assumes `evaluator` is an already-initialized Evaluator client from the Python SDK.
result = evaluator.evaluate(
    eval_templates="customer_agent_language_handling",
    inputs={
        "conversation": "User: Hola, necesito ayuda con mi cuenta.\nAgent: ¡Claro! Estoy aquí para ayudarte. ¿Cuál es tu problema con la cuenta?"
    },
    model_name="turing_flash"
)

print(result.eval_results[0].output)
print(result.eval_results[0].reason)

import { Evaluator } from "@future-agi/ai-evaluation";

const evaluator = new Evaluator();

const result = await evaluator.evaluate(
  "customer_agent_language_handling",
  {
    conversation: "User: Hola, necesito ayuda con mi cuenta.\nAgent: ¡Claro! Estoy aquí para ayudarte. ¿Cuál es tu problema con la cuenta?"
  },
  {
    modelName: "turing_flash",
  }
);

console.log(result);

Input
	Required Input	Type	Description
	`conversation`	`string`	The full conversation history between the customer and agent

Output
	Field	Description
	Result	Returns a numeric score from 0 to 100, where higher values indicate better language and dialect handling
	Reason	Provides a detailed explanation of the language handling assessment
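
One way to act on the score and reason is to flag low-scoring conversations for review. The following is a minimal sketch, not part of the SDK: the `triage_language_handling` helper, the `LanguageHandlingVerdict` type, and the cut-off of 70 are all hypothetical, and the score is assumed to arrive as a number in the documented 0 to 100 range.

```python
from dataclasses import dataclass


@dataclass
class LanguageHandlingVerdict:
    """Hypothetical container pairing the eval output with a review flag."""
    score: float
    reason: str
    needs_review: bool


def triage_language_handling(score, reason, threshold=70.0):
    """Flag a conversation for review when its score falls below `threshold`.

    `threshold` is an illustrative cut-off, not a documented default.
    """
    value = float(score)
    # The Output table documents a 0-100 range; reject anything outside it.
    if not 0.0 <= value <= 100.0:
        raise ValueError(f"score {value} is outside the documented 0-100 range")
    return LanguageHandlingVerdict(
        score=value,
        reason=reason,
        needs_review=value < threshold,
    )


verdict = triage_language_handling(
    45, "Agent replied in English to a Spanish-speaking customer."
)
print(verdict.needs_review)
```

In practice the two arguments would come from `result.eval_results[0].output` and `result.eval_results[0].reason`, as in the Python example above; the right threshold depends on your own quality bar.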

What to Do When the Language Handling Score Is Low

Verify the agent supports the languages detected in failing conversations
Implement language detection at the start of each session
Add mid-conversation language switching capability if required
Test with regional dialects and code-switching scenarios
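
The testing steps above could be supported by a small set of scripted conversations. This is a rough sketch under stated assumptions: `format_conversation` and `TEST_SCENARIOS` are hypothetical names, the scenarios are invented for illustration, and the transcript format simply mirrors the `User: ...\nAgent: ...` layout of the example `conversation` input.

```python
def format_conversation(turns):
    """Join (speaker, text) pairs into a newline-separated transcript."""
    lines = []
    for speaker, text in turns:
        # Only the two speaker labels seen in the example input are accepted.
        if speaker not in ("User", "Agent"):
            raise ValueError(f"unexpected speaker label: {speaker!r}")
        lines.append(f"{speaker}: {text.strip()}")
    return "\n".join(lines)


# Hypothetical scenarios; replace them with languages your agent should support.
TEST_SCENARIOS = {
    # The customer starts in English and switches to Spanish partway through.
    "mid_conversation_switch": [
        ("User", "Hi, I need help with my account."),
        ("Agent", "Of course! What seems to be the problem?"),
        ("User", "Perdón, prefiero continuar en español. No puedo iniciar sesión."),
        ("Agent", "Sin problema. ¿Qué mensaje de error aparece al iniciar sesión?"),
    ],
    # The customer mixes English terms into a Spanish sentence.
    "code_switching": [
        ("User", "Necesito hacer un reset de mi password, por favor."),
        ("Agent", "Claro, te envío un enlace para restablecer tu contraseña."),
    ],
}

conversations = {
    name: format_conversation(turns) for name, turns in TEST_SCENARIOS.items()
}
print(conversations["code_switching"])
```

Each resulting string can then be passed as the `conversation` input to the evaluator, so that scores can be compared across scenarios.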

Comparing Language Handling with Similar Evals

Customer Agent: Conversation Quality: Language Handling focuses specifically on language detection and appropriateness, while Conversation Quality evaluates the overall interaction experience.
Translation Accuracy: Language Handling assesses the agent’s ability to respond in the correct language, while Translation Accuracy evaluates the quality of an explicit translation task.
