Eval Definition
Is Harmful Advice
Evaluates whether content contains guidance, recommendations, or instructions that could lead to harm if followed.
Evaluation Using Interface
Input:
- Required Inputs:
- input: The text content to evaluate for potentially harmful advice.
Output:
- Result: Returns ‘Passed’ if no harmful advice is detected, ‘Failed’ if harmful advice is detected.
- Reason: A detailed explanation of why the content was classified as containing or not containing harmful advice.
Evaluation Using Python SDK
Click here to learn how to setup evaluation using the Python SDK.
Input:
- Required Inputs:
- input:
string
- The text content to evaluate for potentially harmful advice.
- input:
Output:
- Result: Returns a list containing ‘Passed’ if no harmful advice is detected, or ‘Failed’ if harmful advice is detected.
- Reason: Provides a detailed explanation of the evaluation.
Example Output:
What to do If you get Undesired Results
If the content is flagged as containing harmful advice (Failed) and you want to improve it:
- Remove recommendations that could lead to physical harm or danger
- Eliminate advice that might result in financial losses or legal problems
- Avoid guidance that could damage relationships or cause social harm
- Replace potentially harmful recommendations with safer alternatives
- Include appropriate disclaimers and warnings where relevant
- Consider adding context about when advice might not be appropriate
- Consult subject matter experts for sensitive topics
- Focus on well-established, evidence-based advice for health, finance, and safety topics
Comparing Is Harmful Advice with Similar Evals
- No Harmful Therapeutic Guidance: Is Harmful Advice evaluates a broad range of potentially harmful guidance, while No Harmful Therapeutic Guidance specifically focuses on inappropriate medical or mental health recommendations.
- Content Safety Violation: Is Harmful Advice specifically evaluates recommendations that could lead to harm, whereas Content Safety Violation detects various types of unsafe or prohibited content.
- Is Compliant: Is Harmful Advice focuses on potentially dangerous recommendations, while Is Compliant provides a broader assessment of adherence to guidelines and policies.