Evaluation Using Interface
Input:
- Required:
  - context: The contextual information provided to the model.
  - output: The response generated by the language model.
- Optional:
  - input: The original query or instruction given to the model.
Result: Passed / Failed
Interpretation:
- Passed: signifies that the model acknowledged the context, which is a prerequisite for generating contextually grounded responses.
- Failed: indicates a potential issue, such as the model ignoring the context, the context being entirely irrelevant, or the prompt not adequately instructing the model to use the context. This often points to problems in the retrieval or generation step of a RAG system.
Evaluation Using SDK
Click here to learn how to set up evaluation using the SDK.
| Input | Parameter | Type | Description |
|---|---|---|---|
| Required | context | string or list[string] | The contextual information provided to the model. |
| Required | output | string | The response generated by the language model. |
| Optional | input | string | The original query or instruction given to the model. |

| Output | Type | Description |
|---|---|---|
| Result | string | Passed / Failed |
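To make the input/output contract above concrete, here is a minimal sketch of a Passed/Failed check with the same parameters. This is not the vendor SDK: `evaluate_chunk_attribution`, its word-overlap heuristic, and the `min_overlap` threshold are all illustrative assumptions standing in for the real evaluator.

```python
from typing import List, Optional, Union

def evaluate_chunk_attribution(
    context: Union[str, List[str]],
    output: str,
    input: Optional[str] = None,   # optional original query; unused by this heuristic
    min_overlap: float = 0.2,      # assumed threshold; tune for your data
) -> str:
    """Return "Passed" if the output shares enough content words with the
    context, "Failed" otherwise. A crude stand-in for the real evaluator."""
    if isinstance(context, list):
        context = " ".join(context)
    context_words = set(context.lower().split())
    output_words = set(output.lower().split())
    if not output_words:
        return "Failed"
    # Fraction of output words that also appear in the context.
    overlap = len(context_words & output_words) / len(output_words)
    return "Passed" if overlap >= min_overlap else "Failed"
```

For example, an answer that restates facts from the context ("The capital of France is Paris." against "Paris is the capital of France.") passes, while an unrelated answer fails. A production evaluator would use an LLM judge or semantic similarity rather than word overlap.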
What to Do When Chunk Attribution Fails
- Ensure that the context provided is relevant and sufficiently detailed for the model to use effectively. Irrelevant context may be ignored.
- Modify the input prompt to explicitly guide the model to use the context. Clearer instructions (e.g., “Using the provided documents, answer…”) can help.
- Check the retrieval mechanism: Is the correct context being retrieved and passed to the generation model?
- If the model consistently fails to use context despite relevant information and clear prompts, it may require fine-tuning with examples that emphasize context utilization.
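The second remediation above, rewriting the prompt to explicitly direct the model at the context, can be sketched as a simple template. The function name and wording are hypothetical, not part of any SDK:

```python
from typing import List

def build_grounded_prompt(context_chunks: List[str], question: str) -> str:
    """Assemble a prompt that instructs the model to answer only from the
    retrieved context chunks (numbered so answers can cite them)."""
    numbered = "\n".join(
        f"[{i + 1}] {chunk}" for i, chunk in enumerate(context_chunks)
    )
    return (
        "Using ONLY the documents below, answer the question. "
        "If the documents do not contain the answer, say so.\n\n"
        f"Documents:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Explicit instructions like these make it more likely the model attends to the retrieved chunks, which in turn improves Chunk Attribution results.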