Input

Required Input | Type | Description
---|---|---
input | string | User request or question to the model.
output | string | Response of the model based on the input.

Output

Field | Description
---|---
Result | Returns Passed if the response successfully completes the requested task, or Failed if it doesn’t.
Reason | Provides a detailed explanation of why the response was classified as successfully completing the task or not.
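
The shape of the inputs and outputs described above can be sketched as plain Python data classes. This is only an illustration: the class names `TaskCompletionInput` and `TaskCompletionResult`, the `passed` helper, and the sample values are assumptions made for this example and are not part of any documented SDK.

```python
from dataclasses import dataclass
from typing import Literal


@dataclass
class TaskCompletionInput:
    """Hypothetical container mirroring the two required inputs."""
    input: str   # user request or question to the model
    output: str  # response of the model based on the input


@dataclass
class TaskCompletionResult:
    """Hypothetical container mirroring the two output fields."""
    result: Literal["Passed", "Failed"]  # the Result field
    reason: str                          # the Reason field

    @property
    def passed(self) -> bool:
        # Convenience check so callers do not compare strings by hand.
        return self.result == "Passed"


example = TaskCompletionInput(
    input="List three prime numbers.",
    output="2, 3, and 5 are prime numbers.",
)

# Hand-written verdict for illustration; not produced by a real evaluator.
verdict = TaskCompletionResult(
    result="Passed",
    reason="The response lists three primes, as requested.",
)
```

Keeping Result as a two-valued field and Reason as free text matches the tables above; anything beyond those two fields would be an extension of this sketch.
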
What to Do If You Get Undesired Results
If the response is evaluated as not completing the task (Failed) and you want to improve it:
- Make sure the response directly addresses the specific task or question asked
- Ensure all parts of multi-part questions or requests are addressed
- Provide complete information without assuming prior knowledge
- For how-to requests, include clear, actionable steps
- For questions seeking explanations, provide the reasoning or mechanisms behind the answer
- Consider whether the task requires specific formatting, calculations, or output types
- Verify that the response is accurate and relevant to the specific task
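
One way to act on a Failed result is to feed the Reason back into a revision step and evaluate again. The sketch below assumes two caller-supplied functions, `evaluate` and `revise`, whose names and signatures are hypothetical; the toy versions exist only to make the example runnable and do not reflect how the real evaluator decides.

```python
def improve_until_passed(request, response, evaluate, revise, max_rounds=3):
    """Re-evaluate and revise a response until it passes or rounds run out.

    evaluate(request, response) -> (result, reason)   # hypothetical signature
    revise(request, response, reason) -> new_response # hypothetical signature
    """
    for _ in range(max_rounds):
        result, reason = evaluate(request, response)
        if result == "Passed":
            return response, True
        # Use the evaluator's explanation to guide the next revision.
        response = revise(request, response, reason)
    result, _ = evaluate(request, response)
    return response, result == "Passed"


def toy_evaluate(request, response):
    # Stand-in evaluator: passes only if the response contains a numbered step.
    if "1." in response:
        return "Passed", "Response includes clear, actionable steps."
    return "Failed", "Response lacks clear, actionable steps."


def toy_revise(request, response, reason):
    # Stand-in reviser: appends explicit steps to the response.
    return response + " 1. Open Settings. 2. Select Reset."


final, ok = improve_until_passed(
    "How do I reset the device?",
    "You can reset it.",
    toy_evaluate,
    toy_revise,
)
```

Capping the number of rounds keeps the loop from running indefinitely when a response cannot be repaired automatically.
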
Comparing Task Completion with Similar Evals
- Completeness: While Task Completion evaluates whether a response successfully accomplishes a requested task, Completeness focuses specifically on whether all required information is included.
- Instruction Adherence: Task Completion evaluates whether a response accomplishes the requested task, whereas Instruction Adherence measures how well the response follows specific instructions.
- Is Helpful: Task Completion focuses on successful completion of a task, while Is Helpful evaluates the overall usefulness of a response.