Chunk Attribution
Definition
Chunk Attribution evaluates whether a language model uses the provided context chunks when generating a response. The metric checks whether the output incorporates information from the context, which indicates whether the model can draw on relevant data to produce a coherent and contextually appropriate response.
A successful evaluation results in a “Passed” score, indicating that the model effectively used the context. Conversely, a “Failed” score suggests that the output did not adequately reflect the provided context, highlighting potential gaps in the model’s understanding or application of the information.
Calculation
The evaluation process begins by configuring the necessary inputs, which include the input prompt, the context provided, and the output generated by the model. The system checks if the output effectively uses the information from the context.
The scoring is straightforward:
- If the output demonstrates clear utilization of the context, it receives a “Passed” score.
- If the output fails to incorporate the context, it receives a “Failed” score.
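As a rough illustration of the three inputs and the binary result, the sketch below uses a simple word-overlap heuristic. This is an assumption made for illustration only, not the evaluator's actual method, which is not described here. The function name `chunk_attribution`, the `min_shared_words` threshold, and the stopword list are all hypothetical.

```python
import re

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "and", "was", "are", "for", "with", "that", "this", "from", "when"}


def _content_words(text):
    """Lowercased alphanumeric tokens longer than two characters, minus stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if len(t) > 2 and t not in STOPWORDS}


def chunk_attribution(input_prompt, context_chunks, output, min_shared_words=2):
    """Return "Passed" if the output shares enough content words with any chunk.

    Words that already appear in the prompt are ignored, so merely repeating
    the question does not count as using the context (a design assumption).
    """
    new_words = _content_words(output) - _content_words(input_prompt)
    for chunk in context_chunks:
        if len(new_words & _content_words(chunk)) >= min_shared_words:
            return "Passed"
    return "Failed"


prompt = "When was the Eiffel Tower completed?"
chunks = [
    "The Eiffel Tower was completed in 1889 in Paris.",
    "Bananas are rich in potassium.",
]

grounded = chunk_attribution(prompt, chunks, "It was completed in 1889 in Paris.")
ungrounded = chunk_attribution(prompt, chunks, "The Eiffel Tower is a famous landmark.")
print(grounded, ungrounded)
```

A lexical check like this misses paraphrases, so a real evaluator would more plausibly rely on a semantic or model-based judgment. The sketch only shows how the inputs map to a “Passed” or “Failed” result.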
What to Do When Chunk Attribution Fails
If the evaluation results in a “Failed” score, indicating that the model did not effectively use the context, consider the following actions:
- Ensure that the context provided is relevant and sufficiently detailed for the model to use effectively.
- Modify the input prompt to better guide the model in using the context. Clearer instructions may help the model understand how to incorporate the context into its response.
- If the model consistently fails to use context, it may require retraining or fine-tuning with more examples that emphasize the importance of context utilization.
By addressing these factors, developers can enhance the model’s ability to leverage context effectively, leading to more accurate and relevant outputs.
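One way to act on the prompt-related suggestion above is to state explicitly that the answer must come from the context. The helper below is a minimal sketch; the name `build_grounded_prompt` and the instruction wording are hypothetical and should be adapted to the model in use.

```python
def build_grounded_prompt(question, context_chunks):
    """Assemble a prompt that tells the model to answer from numbered chunks."""
    # Number the chunks so the model can point to the ones it relied on.
    numbered = "\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(context_chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "Cite the chunk numbers you relied on, and say so if the context "
        "does not contain the answer.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


example_prompt = build_grounded_prompt(
    "How long is the battery warranty?",
    ["The warranty covers battery defects for two years.",
     "Screen damage is covered for one year."],
)
print(example_prompt)
```

Numbering the chunks also makes it easier to inspect afterwards which parts of the context the response drew on.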
Differentiating Chunk Attribution from Chunk Utilization
Chunk Attribution verifies whether the model references the provided context at all. It produces a binary outcome: either the context is used (“Passed”) or it is not (“Failed”). In contrast, Chunk Utilization measures how effectively the model integrates the context into its response, assigning a score that reflects the degree of reliance on the provided information. Attribution confirms whether the context is considered; Utilization evaluates how much of it contributes to a well-informed response.
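The contrast can be sketched with a toy example in which both metrics are computed from word overlap. This is an assumption for illustration, not how either metric is actually implemented. The function names and the 0 to 1 score range for utilization are hypothetical.

```python
import re


def content_words(text):
    """Lowercased alphanumeric tokens longer than three characters."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if len(t) > 3}


def attribution(output, context):
    """Binary: did the output draw on the context at all?"""
    return "Passed" if content_words(output) & content_words(context) else "Failed"


def utilization(output, context):
    """Graded: fraction of the context's content words that appear in the output."""
    ctx = content_words(context)
    if not ctx:
        return 0.0
    return len(ctx & content_words(output)) / len(ctx)


context = ("The warranty covers battery defects for two years "
           "and screen damage for one year.")
partial = "The warranty covers battery defects for two years."
unrelated = "Please contact support."

print(attribution(partial, context), utilization(partial, context))
print(attribution(unrelated, context), utilization(unrelated, context))
```

In this toy setup the first response passes attribution while using only part of the context, which is the distinction the two metrics are meant to capture.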