Chunk Utilization
Definition
Chunk Utilization evaluates how effectively a language model uses the provided context chunks in its responses. The metric assesses whether the output incorporates that context, which indicates how well the model leverages relevant information to produce coherent and contextually appropriate responses.
The evaluation returns a score that reflects the extent to which the context has been utilized. Higher scores indicate better utilization of context, while lower scores suggest that the model did not effectively incorporate the relevant information.
Calculation
The evaluation process begins by configuring the necessary inputs, which include the input prompt, the context provided, and the output generated by the model. The system checks if the output effectively uses the information from the context.
The scoring is based on the following criteria:
- If the output demonstrates clear utilization of the context, it receives a higher score.
- If the output fails to incorporate the context, it receives a lower score.
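The criteria above do not spell out how the score is computed, so the following is only a minimal sketch of the idea, not the evaluator's actual method. It assumes a simple lexical proxy: for each context chunk, take the fraction of its longer words that reappear in the output, then average across chunks. The function names `content_words` and `chunk_utilization_proxy` are hypothetical.

```python
import re


def content_words(text):
    """Lowercased word tokens longer than three characters (a crude stop-word filter)."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) > 3}


def chunk_utilization_proxy(chunks, output):
    """Assumed stand-in score in [0, 1]: mean share of each chunk's words found in the output."""
    if not chunks:
        return 0.0
    out_words = content_words(output)
    per_chunk = []
    for chunk in chunks:
        chunk_words = content_words(chunk)
        # A chunk with no content words cannot be "used", so it scores 0.
        share = len(chunk_words & out_words) / len(chunk_words) if chunk_words else 0.0
        per_chunk.append(share)
    return sum(per_chunk) / len(per_chunk)


chunks = [
    "Paris is the capital of France",
    "The Eiffel Tower was completed in 1889",
]
output = "The capital of France is Paris."
print(chunk_utilization_proxy(chunks, output))  # 0.5: first chunk fully used, second unused
```

A word-overlap proxy like this rewards copying and misses paraphrase, so it is useful mainly for building intuition about what a higher or lower score means.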
What to Do When Chunk Utilization Scores Low
If the evaluation results in a low score, indicating that the model did not effectively use the context, consider the following actions:
- Ensure that the context provided is relevant and sufficiently detailed for the model to utilize effectively.
- Modify the input prompt to better guide the model in using the context. Clearer instructions may help the model understand how to incorporate the context into its response.
- If the model consistently fails to use context, it may require retraining or fine-tuning with more examples that emphasize the importance of context utilization.
By addressing these factors, developers can enhance the model’s ability to leverage context effectively, leading to more accurate and relevant outputs.
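As one illustration of the prompt-modification step, the sketch below builds a prompt that numbers each context chunk and explicitly instructs the model to rely on them. The helper name `build_prompt` and the instruction wording are assumptions chosen for illustration, not a prescribed template.

```python
def build_prompt(question, chunks):
    """Assemble a prompt that labels each chunk and asks the model to use it."""
    numbered = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the context below. "
        "Cite the chunk numbers you relied on.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )


prompt = build_prompt(
    "What is the refund window?",
    ["Refunds are accepted within 30 days.", "Shipping takes five business days."],
)
print(prompt)
```

Numbering the chunks also makes it easier to see afterwards which parts of the context the response drew on.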
Differentiating Chunk Utilization from Chunk Attribution
Chunk Attribution verifies whether the model references the provided context at all, focusing on its ability to acknowledge and use relevant information. It produces a binary outcome: either the context is used (Pass) or it is not (Fail). In contrast, Chunk Utilization measures how effectively the model integrates the context into its response, assigning a score that reflects the degree of reliance on the provided information. While Attribution confirms whether context is considered, Utilization evaluates how much of it contributes to generating a well-informed response.
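The contrast between the two metrics can be sketched with a toy word-overlap check. This is an assumed simplification for illustration only, since the actual evaluators are not described here: attribution becomes a Pass/Fail on whether any content word from the context appears in the output, and utilization becomes the share of context words that appear. The names `key_words` and `attribution_and_utilization` are hypothetical.

```python
import re


def key_words(text):
    """Lowercased word tokens longer than three characters."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) > 3}


def attribution_and_utilization(context, output):
    """Return (Pass/Fail attribution, utilization share in [0, 1]) under the toy overlap rule."""
    ctx, out = key_words(context), key_words(output)
    shared = ctx & out
    attribution = "Pass" if shared else "Fail"
    utilization = len(shared) / len(ctx) if ctx else 0.0
    return attribution, utilization


context = "Refunds are accepted within 30 days. Shipping takes five business days."

# Uses half of the context: attribution passes, utilization is partial.
print(attribution_and_utilization(context, "Refunds are accepted within 30 days."))  # ('Pass', 0.5)

# Ignores the context entirely: attribution fails and utilization is zero.
print(attribution_and_utilization(context, "I cannot help with that."))  # ('Fail', 0.0)
```

The first response passes attribution yet scores only 0.5 on utilization, which is the gap the two metrics are meant to separate.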