Image Instruction Adherence

Measures how well generated images adhere to a given text instruction across subject, style, and composition.

Python

# Assumes `evaluator` is an initialized Evaluator client from the
# Python SDK (see the SDK setup docs for configuring API keys).
result = evaluator.evaluate(
    eval_templates="image_instruction_adherence",
    inputs={
        "instruction": "A photorealistic image of a red sports car on a mountain road at sunset",
        "images": ["https://example.com/generated-car.jpg"]
    },
    model_name="turing_flash"
)

print(result.eval_results[0].output)
print(result.eval_results[0].reason)
TypeScript

import { Evaluator } from "@future-agi/ai-evaluation";

const evaluator = new Evaluator();

const result = await evaluator.evaluate(
  "image_instruction_adherence",
  {
    instruction: "A photorealistic image of a red sports car on a mountain road at sunset",
    images: ["https://example.com/generated-car.jpg"]
  },
  {
    modelName: "turing_flash",
  }
);

console.log(result);
Input

  • instruction (string, required): The text instruction describing what the image should contain or depict
  • images (string or list[string], required): The generated image(s) to be evaluated against the instruction
Output

  • Result: A numeric score; higher values indicate closer adherence to the instruction
  • Reason: A detailed explanation of how well the image matches the instruction
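A common pattern is to use the adherence score as a quality gate before accepting a generated image. The sketch below is a minimal illustration, assuming the `output`/`reason` fields shown in the examples above; the `EvalResult` dataclass is a hypothetical stand-in for the SDK's result object, and the 0.7 threshold is an arbitrary assumption, not an SDK default.

```python
from dataclasses import dataclass

# Hypothetical stand-in mirroring the `output` and `reason` fields
# accessed via result.eval_results[0] in the examples above.
@dataclass
class EvalResult:
    output: float  # numeric adherence score (higher = closer adherence)
    reason: str    # judge's explanation of the score

def passes_adherence_gate(result: EvalResult, threshold: float = 0.7) -> bool:
    """Accept a generated image only if its adherence score meets the
    threshold. The 0.7 default is an assumption for illustration."""
    if result.output < threshold:
        # Surface the judge's reasoning so the prompt can be revised.
        print(f"Low adherence ({result.output:.2f}): {result.reason}")
        return False
    return True
```

In a generation loop, a failed gate would typically trigger a prompt revision and a re-generation rather than shipping the image.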

What to Do When Image Instruction Adherence Score is Low

  • Review the instruction for ambiguity and make it more specific
  • Check that all key elements mentioned in the instruction are present in the image
  • Verify that style, composition, and color requirements are reflected
  • Consider iterating on the generation prompt to better guide the model
  • Break complex instructions into simpler, more focused prompts
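The last tip, breaking a complex instruction into focused prompts, can be sketched as follows. The splitting heuristic is illustrative only; in practice you would decompose the instruction manually or with an LLM, then evaluate each sub-instruction separately to localize which element the image fails on.

```python
def split_instruction(instruction: str) -> list[str]:
    """Naively split a compound instruction into focused sub-instructions
    on commas and 'and'. Illustrative heuristic, not an SDK feature."""
    normalized = instruction.replace(" and ", ", ")
    parts = [part.strip() for part in normalized.split(",")]
    return [part for part in parts if part]

# Each sub-instruction can then be scored independently with
# eval_templates="image_instruction_adherence" to find the weak element.
subtasks = split_instruction("a red sports car, a mountain road and a sunset")
```

Evaluating per element turns one low aggregate score into an actionable signal about which part of the prompt the model ignored.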

Comparing Image Instruction Adherence with Similar Evals

  • CLIP Score: Image Instruction Adherence uses an LLM to reason about detailed instruction compliance, while CLIP Score computes a statistical alignment metric between image and text embeddings.
  • Caption Hallucination: Image Instruction Adherence evaluates whether a generated image matches its instruction, while Caption Hallucination checks whether a text caption accurately describes what is visible in an image.
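To make the CLIP Score contrast concrete: CLIP reduces image-text alignment to a single cosine similarity between embedding vectors, with no reasoning trace. The sketch below shows only the similarity computation on dummy vectors; real CLIP Score would embed the image and text with a CLIP model (e.g. via a vision-language library), which is outside the scope of this snippet.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors -- the core of a
    CLIP-style alignment score. Vectors here are placeholders; a real
    CLIP Score would use image and text embeddings from a CLIP model."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

The single scalar this yields explains why CLIP Score is cheap but opaque, whereas the LLM-based template above also returns a Reason field describing which instruction elements matched.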