Evaluation Using Interface
Input:
- Required Inputs:
- input: URL or file path to the image being captioned.
- output: The caption text to evaluate.
- Result: Returns ‘Passed’ if the caption accurately represents what’s in the image without hallucination, or ‘Failed’ if the caption contains hallucinated elements.
- Reason: A detailed explanation of why the caption was classified as containing or not containing hallucinations.
Evaluation Using SDK
Click here to learn how to set up evaluation using SDK.
Input:
- Required Inputs:
- input: string - URL or file path to the image being captioned.
- output: string - The caption text to evaluate.
- Result: Returns a list containing ‘Passed’ if the caption accurately represents what’s in the image without hallucination, or ‘Failed’ if the caption contains hallucinated elements.
- Reason: Provides a detailed explanation of the evaluation.
- The image does indeed show an elderly male figure with characteristic features of advanced age (white/gray hair, wrinkles, aged appearance).
- The caption is minimalist but factually correct, avoiding any specific claims about identity, activity, setting, or other details that might constitute hallucination.
- While the caption doesn’t capture the specific identity of the person (who appears to be Albert Einstein or an Einstein-like figure), simply describing the subject as an “old man” remains factually accurate without overreaching.
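The input and result shapes described above can be sketched in Python. This is only an illustration under stated assumptions: `CaptionEvalInput` and `caption_passed` are hypothetical helper names, not part of the SDK, the image URL is a placeholder, and `mock_result` stands in for whatever the actual SDK call returns. The sketch shows how the two required string inputs might be validated before submission and how the returned list might be read.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CaptionEvalInput:
    """Hypothetical container for the two required inputs (not an SDK class)."""
    input: str   # URL or file path to the image being captioned
    output: str  # caption text to evaluate

    def validate(self) -> None:
        # Both fields are required, so reject empty or whitespace-only values.
        if not self.input.strip():
            raise ValueError("input (image URL or file path) must be non-empty")
        if not self.output.strip():
            raise ValueError("output (caption text) must be non-empty")


def caption_passed(result: List[str]) -> bool:
    """Read the documented result list: 'Passed' means no hallucination found."""
    if "Passed" in result:
        return True
    if "Failed" in result:
        return False
    raise ValueError(f"unexpected result: {result!r}")


# Placeholder URL and caption, chosen to mirror the 'old man' example above.
case = CaptionEvalInput(input="https://example.com/portrait.jpg",
                        output="an old man")
case.validate()

# Assumption: stand-in for the list the SDK evaluation would return.
mock_result = ["Passed"]
print(caption_passed(mock_result))  # prints True
```

Validating both fields up front keeps an empty caption or missing image reference from being mistaken for a hallucination failure later on.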