Score Eval
Scores the linkage between instructions, input images, and output images. This evaluation ensures that the output images accurately reflect the instructions and input images, adhering to the defined evaluation criteria. A high score indicates strong alignment and coherence, while a low score suggests discrepancies or misalignment.
Evaluation Using Interface
Input:
- Required Inputs:
- input: The text or instruction column that serves as the reference for evaluation.
- rule_prompt: A guideline or rule column used to measure the linkage. This can include dynamic placeholders.
- Optional Inputs:
- None listed. Although the definition above mentions input and output images, the configurable parameters cover only the text/instruction and the rule prompt.
Output:
- Score: Percentage score between 0 and 100
Interpretation:
- Higher scores: Indicate strong alignment and coherence between the input/instruction and the rule prompt.
- Lower scores: Suggest discrepancies or misalignment.
Evaluation Using Python SDK
Click here to learn how to set up evaluation using the Python SDK.
Input Type | Parameter | Type | Description |
---|---|---|---|
Required Inputs | input | string | The text or instruction that serves as the reference for evaluation. |
Required Inputs | rule_prompt | string | A guideline or rule used to measure the linkage. |
Optional Inputs | (none listed) | | Image parameters may apply if they are supported via the SDK. |
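As a minimal sketch of the input contract in the table above, the snippet below checks and packages the two required string parameters before they are passed to an evaluation. `build_score_eval_inputs` is a hypothetical helper introduced for illustration, not part of the SDK; the SDK call itself is left out because its signature is not shown on this page.

```python
def build_score_eval_inputs(input_text: str, rule_prompt: str) -> dict:
    """Hypothetical helper: validate and package the required inputs.

    Both `input` and `rule_prompt` are documented as required strings.
    """
    inputs = {"input": input_text, "rule_prompt": rule_prompt}
    for name, value in inputs.items():
        if not isinstance(value, str):
            raise TypeError(f"{name} must be a string, got {type(value).__name__}")
        if not value.strip():
            raise ValueError(f"{name} must not be empty")
    return inputs


payload = build_score_eval_inputs(
    "Replace the background with a beach scene.",
    "Score how closely the output follows the instruction.",
)
```

Validating up front keeps a missing or empty `rule_prompt` from surfacing later as a confusing low score.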
Output | Type | Description |
---|---|---|
Score | float | Returns a score between 0 and 1, where higher values indicate better alignment/coherence. |
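The interface reports the score as a percentage between 0 and 100, while the SDK output is a float between 0 and 1. Assuming the two scales are linearly related (an assumption, not something this page confirms), a small conversion helper lets results from both be compared side by side. `to_percentage` is a hypothetical name used only for this sketch.

```python
def to_percentage(sdk_score: float) -> float:
    """Convert an SDK-style score in [0, 1] to an interface-style percentage.

    Assumes a linear mapping between the two scales.
    """
    if not 0.0 <= sdk_score <= 1.0:
        raise ValueError(f"expected a score between 0 and 1, got {sdk_score}")
    return round(sdk_score * 100, 2)


print(to_percentage(0.5))  # 50.0
```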
What to Do If Score Eval Gives a Low Score
The evaluation criteria should be reassessed to ensure they are clearly defined and aligned with the intended evaluation goals. Adjustments may be necessary to make the criteria more comprehensive and relevant.
Additionally, examining the output images for alignment with instructions and input images can help identify discrepancies or misalignments.
Refining the instructions or improving the image generation process can enhance the overall evaluation outcome.
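When reviewing many results at once, it can help to surface the weakest ones first so their criteria, instructions, and output images can be examined. The sketch below flags SDK-style scores (0 to 1) that fall below a cut-off. The threshold of 0.5 and the name `find_low_scores` are assumptions made for illustration; this page does not define what counts as a low score.

```python
LOW_SCORE_THRESHOLD = 0.5  # assumed cut-off; adjust to your own criteria


def find_low_scores(scores, threshold=LOW_SCORE_THRESHOLD):
    """Return (row_index, score) pairs below the threshold, lowest score first."""
    flagged = [(i, s) for i, s in enumerate(scores) if s < threshold]
    return sorted(flagged, key=lambda pair: pair[1])


print(find_low_scores([0.9, 0.2, 0.45, 0.7]))  # [(1, 0.2), (2, 0.45)]
```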
Differentiating Score Eval from Eval Image Instruction
Eval Image Instruction focuses specifically on the alignment between textual instructions and the generated image, checking that the image accurately represents the given instructions. In contrast, Score Eval has a broader scope: it evaluates coherence and alignment across multiple inputs and outputs, including both text and images.
Eval Image Instruction is ideal when precise image representation is the main concern, while Score Eval is better suited to complex scenarios involving multiple modalities, where overall alignment and adherence to instructions matter.