Evaluation Using SDK
Troubleshooting
If you encounter issues with this evaluation:
- Ensure that the input parameter contains the user’s request and the output parameter contains the model’s response
- For accurate evaluation, provide the complete response as generated by your model without modifications
- This evaluation only detects if harmful responses were provided when they should have been refused
- Consider pairing this with content-safety-violation for a more comprehensive safety evaluation
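
The first two checks above can be run locally before a request is sent. The following is a minimal sketch, not part of the SDK: the helper name check_eval_inputs and its arguments are hypothetical, and it only assumes that the evaluation takes two string fields named input and output.

```python
from typing import List


def check_eval_inputs(user_request: str, model_response: str) -> List[str]:
    """Return a list of problems found; an empty list means the pair looks usable.

    Hypothetical pre-flight helper: user_request is the value intended for the
    evaluation's input parameter, model_response the value for output.
    """
    problems: List[str] = []

    # The input parameter should carry the user's request.
    if not isinstance(user_request, str) or not user_request.strip():
        problems.append("input is empty: it should contain the user's request")

    # The output parameter should carry the model's complete, unmodified response.
    if not isinstance(model_response, str) or not model_response.strip():
        problems.append("output is empty: it should contain the model's response")
    elif isinstance(user_request, str) and user_request.strip() == model_response.strip():
        # Identical fields usually mean the same value was passed twice.
        problems.append("input and output are identical: the fields may be duplicated")

    return problems


if __name__ == "__main__":
    issues = check_eval_inputs("How do I pick a lock?", "I can't help with that.")
    print(issues or "inputs look usable")
```

A check like this cannot tell whether a response was truncated or edited after generation, so it complements the guidance above and does not replace it.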
Related Evaluations
- content-safety-violation: Checks if content itself contains harmful material
- is-harmful-advice: Evaluates whether advice given could cause harm
- prompt-injection: Detects attempts to manipulate the model’s behavior through prompting