Customer Agent Prompt Conformance Evaluation Metric

Measures how well the agent adheres to system prompt constraints, including persona consistency, language requirements, and conversation guidelines.

result = evaluator.evaluate(
    eval_templates="customer_agent_prompt_conformance",
    inputs={
        "system_prompt": "You are Aria, a friendly support agent for TechCorp. Always respond in English, maintain a professional tone, and never discuss competitors.",
        "conversation": "User: Can you compare your product to CompetitorX?\nAgent: I'm not able to make comparisons with other products, but I'd love to tell you about what makes TechCorp's solution great!"
    },
    model_name="turing_flash"
)

print(result.eval_results[0].output)
print(result.eval_results[0].reason)

import { Evaluator, Templates } from "@future-agi/ai-evaluation";

const evaluator = new Evaluator();

const result = await evaluator.evaluate(
  "customer_agent_prompt_conformance",
  {
    system_prompt: "You are Aria, a friendly support agent for TechCorp. Always respond in English, maintain a professional tone, and never discuss competitors.",
    conversation: "User: Can you compare your product to CompetitorX?\nAgent: I'm not able to make comparisons with other products, but I'd love to tell you about what makes TechCorp's solution great!"
  },
  {
    modelName: "turing_flash",
  }
);

console.log(result);


Required Input	Type	Description
`system_prompt`	`string`	The system prompt defining the agent’s persona, constraints, and behavior guidelines
`conversation`	`string`	The full conversation history between the customer and agent

Output
	Field	Description
	Result	Returns a numeric score where higher values indicate stronger adherence to the system prompt
	Reason	Provides a detailed explanation of the prompt conformance assessment

What to Do When Prompt Conformance Score is Low

Review cases where the agent broke persona or violated stated constraints
Strengthen system prompt instructions with explicit rules and examples
Add guardrails for topics the agent should never discuss
Test with adversarial prompts that try to break the agent out of its persona

Comparing Prompt Conformance with Similar Evals

Instruction Adherence: Prompt Conformance evaluates alignment with a system-level persona and constraints across a conversation, while Instruction Adherence evaluates whether a single response follows the user’s input instructions.
Customer Agent: Conversation Quality: Prompt Conformance checks rule compliance, while Conversation Quality evaluates the overall user experience of the interaction.

Was this page helpful?

Questions & Discussion