content_safety_violation is deprecated; avoid it in new integrations and use toxicity instead.
```python
# `evaluator` is assumed to be an already-initialized evaluator client.
result = evaluator.evaluate(
    eval_templates="toxicity",
    inputs={
        "output": "Dear Sir, I hope this email finds you well. I look forward to any insights or advice you might have whenever you have a free moment"
    },
    model_name="turing_flash"
)

print(result.eval_results[0].output)  # "Passed" or "Failed"
print(result.eval_results[0].reason)  # explanation of the verdict
```
Input

| Required Input | Description |
|---|---|
| output | Model-generated content to evaluate for safety violations |
Output

| Field | Description |
|---|---|
| Result | Returns Passed if no safety violations are detected, or Failed if safety violations are detected |
| Reason | A detailed explanation of the evaluation verdict |
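The Result field can be branched on directly. A minimal sketch, assuming the verdict is exposed as the string "Passed" or "Failed" on `eval_results[0].output`, as in the example above:

```python
verdict = result.eval_results[0].output

if verdict == "Failed":
    # Surface the explanation so the content can be revised.
    print(f"Safety check failed: {result.eval_results[0].reason}")
else:
    print("Content passed the safety check.")
```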
What to Do If You Get Undesired Results
If the content is flagged for safety violations (Failed) and you want to improve it, try the following (a re-evaluation sketch follows this list):
- Remove any violent, threatening, or harassing language
- Eliminate content that promotes illegal activities or harmful behaviors
- Remove sexually explicit material or inappropriate references
- Avoid hate speech, discriminatory content, or derogatory language
- Remove content that could be seen as encouraging self-harm or harm to others
- Eliminate language that exploits or sexualizes minors in any way
- Avoid sharing personal information that could compromise privacy or security
- Replace extremist content or dangerous misinformation with factual, balanced information
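After applying these edits, re-running the same eval confirms whether the revision now passes. A minimal sketch, assuming the same evaluator client as above; `revised_output` is a hypothetical variable holding the edited content:

```python
# `revised_output` is a hypothetical variable holding the edited content.
result = evaluator.evaluate(
    eval_templates="toxicity",
    inputs={"output": revised_output},
    model_name="turing_flash"
)

if result.eval_results[0].output == "Passed":
    print("Revision passed the safety check.")
else:
    # Still failing: inspect the reason and revise again.
    print(f"Still flagged: {result.eval_results[0].reason}")
```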
Comparing Content Safety Violation with Similar Evals
- Is Compliant: Content Safety Violation focuses specifically on harmful or unsafe content, while Is Compliant provides a broader assessment of adherence to guidelines and policies.
- Toxicity: Content Safety Violation covers a broad range of safety issues, whereas Toxicity specifically measures offensive or harmful language.
- Is Harmful Advice: Content Safety Violation detects various types of unsafe content, while Is Harmful Advice focuses specifically on dangerous recommendations or guidance.
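When it is unclear which eval best fits a use case, the candidates can be run side by side over the same output. A minimal sketch, assuming each eval can be invoked by a snake_case template name in the same way as toxicity (`is_compliant` and `is_harmful_advice` are assumed identifiers, not confirmed here):

```python
# Template names other than "toxicity" are assumptions for illustration.
candidate_templates = ["toxicity", "is_compliant", "is_harmful_advice"]

for template in candidate_templates:
    result = evaluator.evaluate(
        eval_templates=template,
        inputs={"output": "Dear Sir, I hope this email finds you well."},
        model_name="turing_flash",
    )
    print(f"{template}: {result.eval_results[0].output} - {result.eval_results[0].reason}")
```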