Input

| Required Input | Description |
|---|---|
| output | Model-generated content to evaluate for safety violations |

Output

| Field | Description |
|---|---|
| Result | Returns Passed if no safety violations are detected, or Failed if safety violations are detected |
| Reason | Provides a detailed explanation of the evaluation |
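To make the input/output contract concrete, here is a minimal Python sketch of the shapes described in the tables above. The names `SafetyEvalResult` and `check_safety` are hypothetical illustrations of the contract, not the actual evaluator API; in practice the eval runs on the platform.

```python
from dataclasses import dataclass

# Hypothetical types for illustration only; real names and signatures may differ.

@dataclass
class SafetyEvalResult:
    result: str  # "Passed" if no safety violations are detected, else "Failed"
    reason: str  # detailed explanation of the evaluation

def check_safety(output: str) -> SafetyEvalResult:
    """Stub standing in for the Content Safety Violation eval.

    A real implementation would call the hosted evaluator here; this stub
    only illustrates the contract: one required input (`output`, the
    model-generated content) and a Result/Reason pair in the response.
    """
    return SafetyEvalResult(
        result="Passed",
        reason="No safety violations detected.",
    )

# Usage: pass the model-generated content as `output`.
verdict = check_safety(output="The capital of France is Paris.")
print(verdict.result, "-", verdict.reason)
```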
What to Do If You Get Undesired Results

If the content is flagged for safety violations (Failed) and you want to improve it:

- Remove any violent, threatening, or harassing language
- Eliminate content that promotes illegal activities or harmful behaviors
- Remove sexually explicit material or inappropriate references
- Avoid hate speech, discriminatory content, or derogatory language
- Remove content that could be seen as encouraging self-harm or harm to others
- Eliminate language that exploits or sexualizes minors in any way
- Avoid sharing personal information that could compromise privacy or security
- Replace extremist content or dangerous misinformation with factual, balanced information
Comparing Content Safety Violation with Similar Evals
- Is Compliant: Content Safety Violation focuses specifically on harmful or unsafe content, while Is Compliant provides a broader assessment of adherence to guidelines and policies.
- Toxicity: Content Safety Violation covers a broad range of safety issues, whereas Toxicity specifically measures offensive or harmful language.
- Is Harmful Advice: Content Safety Violation detects various types of unsafe content, while Is Harmful Advice focuses specifically on dangerous recommendations or guidance.