Caption Hallucination
Evaluates whether an image caption contains fabricated information not actually visible in the image.
```python
# NOTE: the import path below is assumed from the Future AGI Python SDK;
# API credentials are expected to be configured (e.g. as environment variables).
from fi.evals import Evaluator

evaluator = Evaluator()

result = evaluator.evaluate(
    eval_templates="caption_hallucination",
    inputs={
        "image": "https://www.esparklearning.com/app/uploads/2024/04/Albert-Einstein-generated-by-AI-1024x683.webp",
        "caption": "old man"
    },
    model_name="turing_flash"
)

print(result.eval_results[0].output)
print(result.eval_results[0].reason)
```
```typescript
import { Evaluator, Templates } from "@future-agi/ai-evaluation";

const evaluator = new Evaluator();

const result = await evaluator.evaluate(
  "caption_hallucination",
  {
    image: "https://www.esparklearning.com/app/uploads/2024/04/Albert-Einstein-generated-by-AI-1024x683.webp",
    caption: "old man"
  },
  {
    modelName: "turing_flash",
  }
);

console.log(result);
```

Input

| Required Input | Type | Description |
|---|---|---|
| image | string | URL or file path to the image being captioned |
| caption | string | The caption text to evaluate |
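Per the table above, the `image` input accepts a local file path as well as a URL. A minimal sketch, reusing the `evaluator` from the Python example above; the file name is hypothetical:

```python
# "./portrait.jpg" is a hypothetical local image path.
result = evaluator.evaluate(
    eval_templates="caption_hallucination",
    inputs={
        "image": "./portrait.jpg",  # file path instead of a URL
        "caption": "a person standing indoors"
    },
    model_name="turing_flash"
)
print(result.eval_results[0].output)
```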
Output

| Field | Description |
|---|---|
| Result | Returns Passed or Failed, where Passed indicates the caption accurately represents what’s in the image without hallucination and Failed indicates the caption contains hallucinated elements |
| Reason | Provides a detailed explanation of the evaluation |
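In code, Result surfaces as the `output` field of each eval result. A minimal sketch, assuming the `result` object from the Python example above and that `output` is the string "Passed" or "Failed":

```python
verdict = result.eval_results[0].output  # "Passed" or "Failed"
reason = result.eval_results[0].reason

if verdict == "Failed":
    # The caption likely includes details not visible in the image;
    # surface the explanation so the caption can be revised.
    print(f"Caption flagged for hallucination: {reason}")
else:
    print("Caption is grounded in the image.")
```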
What to Do If You Get Undesired Results
If the caption is evaluated as containing hallucinations (Failed) and you want to improve it, apply the guidelines below; a re-evaluation sketch follows the list:
- Stick strictly to describing what is visibly present in the image
- Avoid making assumptions about:
  - People’s identities (unless clearly labeled or universally recognizable)
  - The location or setting (unless clearly identifiable)
  - Time periods or dates
  - Actions occurring before or after the captured moment
  - Emotions or thoughts of subjects
  - Objects that are partially obscured or ambiguous
- Use qualifying language (like “appears to be,” “what looks like”) when uncertain
- Focus on concrete visual elements rather than interpretations
- For generic descriptions, stay high-level and avoid specifics that aren’t clearly visible
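For example, a caption that uses qualifying language and sticks to visible details can be re-scored with the same call. A minimal sketch, reusing the `evaluator` from the Python example; the revised caption text is illustrative:

```python
# Re-run the eval with a caption limited to visible, hedged details.
revised_caption = "An older man with white hair who appears to be smiling"

result = evaluator.evaluate(
    eval_templates="caption_hallucination",
    inputs={
        "image": "https://www.esparklearning.com/app/uploads/2024/04/Albert-Einstein-generated-by-AI-1024x683.webp",
        "caption": revised_caption
    },
    model_name="turing_flash"
)
print(result.eval_results[0].output, "-", result.eval_results[0].reason)
```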
Comparing Caption Hallucination with Similar Evals
- Is AI Generated Image: Caption Hallucination evaluates the accuracy of image descriptions, while Is AI Generated Image determines if the image itself was created by AI.
- Detect Hallucination: Caption Hallucination specifically evaluates image descriptions, whereas Detect Hallucination evaluates factual fabrication in text content more broadly.
- Groundedness: Caption Hallucination focuses on whether descriptions match what’s visible in images, while Groundedness ensures text responses adhere strictly to provided context without adding external information.