Is Harmful Advice: Harmful Guidance Detection Metric

Evaluates whether content contains guidance or instructions that could cause harm if followed, flagging dangerous, actionable advice.

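# Assumes `evaluator` is an already-initialized Evaluator client from the Future AGI Python SDK.
result = evaluator.evaluate(
    eval_templates="is_harmful_advice",
    inputs={
        "output": "It's a good idea to create a monthly budget to track your spending and save more effectively."
    },
    model_name="turing_flash"
)

# Result ("Passed" or "Failed"), followed by the explanation for that classification.
print(result.eval_results[0].output)
print(result.eval_results[0].reason)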


import { Evaluator } from "@future-agi/ai-evaluation";

const evaluator = new Evaluator();

const result = await evaluator.evaluate(
  "is_harmful_advice",
  {
    output: "It's a good idea to create a monthly budget to track your spending and save more effectively."
  },
  {
    modelName: "turing_flash",
  }
);

console.log(result);

Input
	Required Input	Type	Description
	`output`	`string`	Content to be evaluated for potentially harmful advice.

Output
	Field	Description
	Result	Returns Passed if no harmful advice is detected, or Failed if harmful advice is detected.
	Reason	Provides a detailed explanation of why the content was classified as containing or not containing harmful advice.
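
The Result field can serve as a simple gate in a content pipeline. The following is a minimal sketch, assuming the result arrives as the string `Passed` or `Failed` as described in the table above. `EvalOutcome` and `is_safe_to_publish` are hypothetical names used for illustration and are not part of the SDK.

```python
from dataclasses import dataclass


@dataclass
class EvalOutcome:
    """Hypothetical stand-in for one entry of result.eval_results."""
    output: str  # assumed to be "Passed" or "Failed"
    reason: str


def is_safe_to_publish(outcome: EvalOutcome) -> bool:
    """Return True only when the eval reports Passed (no harmful advice detected)."""
    # Normalize case and whitespace; anything other than "passed" is treated as unsafe.
    return outcome.output.strip().lower() == "passed"


outcome = EvalOutcome(output="Failed", reason="Recommends skipping prescribed medication.")
if not is_safe_to_publish(outcome):
    print(f"Blocked: {outcome.reason}")
```

Treating every value other than `Passed` as unsafe means that an empty or unexpected result blocks the content rather than letting it through.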

What to Do If You Get Undesired Results

If the content is flagged as containing harmful advice (Failed) and you want to improve it:

Remove recommendations that could lead to physical harm or danger
Eliminate advice that might result in financial losses or legal problems
Avoid guidance that could damage relationships or cause social harm
Replace potentially harmful recommendations with safer alternatives
Include appropriate disclaimers and warnings where relevant
Consider adding context about when advice might not be appropriate
Consult subject matter experts for sensitive topics
Focus on well-established, evidence-based advice for health, finance, and safety topics
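
The steps above amount to a revise-and-recheck loop: edit the content, run the eval again, and stop once it passes. The following is a minimal sketch under stated assumptions. `check` and `revise` are hypothetical callables that you supply: `check` would wrap the `is_harmful_advice` eval and return the result string with its reason, and `revise` would rewrite the text, whether by hand or with a model. The demo functions are stubs, not real evaluator behavior.

```python
from typing import Callable, Tuple

# check returns (result, reason); result is assumed to be "Passed" or "Failed".
CheckFn = Callable[[str], Tuple[str, str]]
# revise takes the current text and the failure reason and returns new text.
ReviseFn = Callable[[str, str], str]


def revise_until_passed(
    text: str, check: CheckFn, revise: ReviseFn, max_rounds: int = 3
) -> Tuple[str, bool]:
    """Re-run the eval after each revision; stop on Passed or after max_rounds revisions."""
    for _ in range(max_rounds):
        result, reason = check(text)
        if result == "Passed":
            return text, True
        text = revise(text, reason)
    # Check the last revision once more before giving up.
    result, _ = check(text)
    return text, result == "Passed"


def demo_check(text: str) -> Tuple[str, str]:
    # Stub standing in for a real is_harmful_advice evaluation.
    if "skip your medication" in text:
        return "Failed", "Advises stopping medication without medical supervision."
    return "Passed", "No harmful advice detected."


def demo_revise(text: str, reason: str) -> str:
    # Stub revision: replace the harmful recommendation with a safer alternative.
    return text.replace(
        "skip your medication", "talk to your doctor before changing medication"
    )


final_text, passed = revise_until_passed(
    "To save money, skip your medication.", demo_check, demo_revise
)
print(passed, final_text)
```

Capping the number of rounds keeps the loop from running indefinitely when revisions never satisfy the eval; in that case the function returns the last revision together with `False`, so the caller can escalate to a human reviewer.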

Comparing Is Harmful Advice with Similar Evals

No Harmful Therapeutic Guidance: Is Harmful Advice evaluates a broad range of potentially harmful guidance, while No Harmful Therapeutic Guidance specifically focuses on inappropriate medical or mental health recommendations.
Toxicity: Is Harmful Advice specifically evaluates recommendations that could lead to harm, while Toxicity detects harmful or offensive language more broadly.
Data Privacy Compliance: Is Harmful Advice focuses on potentially dangerous recommendations, while Data Privacy Compliance checks output for adherence to privacy regulations such as GDPR and HIPAA.
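
Because these evals cover different risks, one way to use them is to run several against the same output and collect the failures. The following is a minimal sketch. Only `is_harmful_advice` is a template ID taken from this page; the other IDs are assumed spellings of the evals named above, so verify them before use. `evaluate_one` is a hypothetical callable that would wrap the evaluator and return `Passed` or `Failed`.

```python
from typing import Callable, Dict, Iterable, List

# Only "is_harmful_advice" appears on this page; the rest are ASSUMED template IDs.
TEMPLATES = [
    "is_harmful_advice",
    "no_harmful_therapeutic_guidance",
    "toxicity",
    "data_privacy_compliance",
]


def run_checks(
    text: str, templates: Iterable[str], evaluate_one: Callable[[str, str], str]
) -> Dict[str, str]:
    """Run each template against the same text and map template name to its result."""
    return {name: evaluate_one(name, text) for name in templates}


def failed_checks(results: Dict[str, str]) -> List[str]:
    """Return, in sorted order, the templates that did not report Passed."""
    return sorted(name for name, result in results.items() if result != "Passed")


def demo_evaluate(template: str, text: str) -> str:
    # Stub standing in for real evaluator calls: fails only the harmful-advice check.
    if template == "is_harmful_advice" and "without a helmet" in text:
        return "Failed"
    return "Passed"


results = run_checks("Ride your motorcycle without a helmet.", TEMPLATES, demo_evaluate)
print(failed_checks(results))
```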
