No LLM Reference
Evaluates whether a model response contains references to any LLM provider (e.g., OpenAI, Anthropic, Meta) or to a model name/version (e.g., GPT-4, Claude 3, Llama 3).
```python
result = evaluator.evaluate(
    eval_templates="no_llm_reference",
    inputs={
        "output": "Dear Sir, I hope this email finds you well. I look forward to any insights or advice you might have whenever you have a free moment"
    },
    model_name="turing_flash"
)
print(result.eval_results[0].output)
print(result.eval_results[0].reason)
```

```typescript
import { Evaluator, Templates } from "@future-agi/ai-evaluation";

const evaluator = new Evaluator();
const result = await evaluator.evaluate(
  "no_llm_reference",
  {
    output: "Dear Sir, I hope this email finds you well. I look forward to any insights or advice you might have whenever you have a free moment"
  },
  {
    modelName: "turing_flash",
  }
);
console.log(result);
```

Input

| Required Input | Type | Description |
|---|---|---|
| output | string | Content to evaluate for LLM references. |
Output

| Field | Description |
|---|---|
| Result | Returns Passed if no LLM reference is detected in the model's output, or Failed if one is detected. |
| Reason | Provides a detailed explanation of why the content was classified as containing or not containing an LLM reference. |
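A minimal sketch of acting on the result, assuming (as the table above states) that the eval output is the string "Passed" or "Failed"; the guard function name here is hypothetical, not part of the SDK:

```python
def is_compliant(eval_output: str) -> bool:
    """True when the no_llm_reference eval passed, i.e. no provider/model mention."""
    return eval_output == "Passed"

# e.g. gate a response before sending it to the user
if not is_compliant("Failed"):
    print("Blocked: response mentions an LLM provider or model.")
```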
What to Do When No LLM Reference Score is Low
- This evaluation detects both explicit mentions (e.g., “OpenAI”, “ChatGPT”, “Claude”, “Llama”) and implicit self-identification (“As an AI language model…”)
- It covers references to all major LLM providers (OpenAI, Anthropic, Meta, Mistral, DeepSeek, etc.), their products, and model names/versions
- If your content legitimately needs to discuss LLM providers as subject matter, consider using a different evaluation
- For comprehensive brand compliance, combine with other brand-specific evaluations
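To see the kinds of explicit and implicit references described above, here is a lightweight keyword pre-filter sketch. The pattern list is an illustrative assumption; the actual template uses an LLM judge, not this rule set:

```python
import re

# Assumed, illustrative patterns: explicit provider/model names plus an
# implicit self-identification phrase. Not the template's real detection logic.
LLM_PATTERNS = [
    r"\bOpenAI\b", r"\bChatGPT\b", r"\bGPT-?\d", r"\bAnthropic\b",
    r"\bClaude\b", r"\bLlama\b", r"\bMistral\b", r"\bDeepSeek\b",
    r"as an AI language model",
]

def contains_llm_reference(text: str) -> bool:
    """Return True if the text names a known provider/model or self-identifies as an AI."""
    return any(re.search(p, text, flags=re.IGNORECASE) for p in LLM_PATTERNS)

print(contains_llm_reference("As an AI language model, I cannot say."))   # True
print(contains_llm_reference("Dear Sir, I hope this email finds you well."))  # False
```

A pre-filter like this can cheaply flag obvious mentions before (or alongside) running the full evaluation.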