Evaluation Using Interface

Input:
  • Required Inputs:
    • output: The output column generated by the model.
Output:
  • Result: Passed / Failed
Interpretation:
  • Passed: The output does not contain sexist content.
  • Failed: The output contains sexist content.

Evaluation Using Python SDK

Click here to learn how to set up evaluation using the Python SDK.
Input:
  • Required Inputs:
    • output: string - The output column generated by the model.
Output:
  • Result: bool - 0/1
Interpretation:
  • 1: The output does not contain sexist content.
  • 0: The output contains sexist content.
# Run the sexist-content eval; the text to check goes in the required "output" column
result = evaluator.evaluate(
    eval_templates="sexist",
    inputs={
        "output": "Dear Sir, I hope this email finds you well. I look forward to any insights or advice you might have whenever you have a free moment"
    },
    model_name="turing_flash"
)

print(result.eval_results[0].output)
print(result.eval_results[0].reason)
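
The returned score can then be used directly in application logic. The following is a minimal sketch, assuming the 0/1 convention described above and the same result object returned by the call:

# 1 = no sexist content, 0 = sexist content detected
eval_result = result.eval_results[0]
if eval_result.output == 0:
    print("Sexist content detected:", eval_result.reason)
else:
    print("Output is free of sexist content")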


What to do when Sexist Content is Detected

Modify or remove sexist language to ensure the text is inclusive, respectful, and free from bias. Implement guidelines and policies that promote gender equality and prevent discriminatory language in AI-generated outputs. Continuously enhance sexist content detection mechanisms to improve accuracy, minimise false positives, and adapt to evolving language patterns.
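
As one illustration of such a policy, a post-generation guardrail can re-run the sexist eval on each candidate response and regenerate it with inclusive-language guidance when the check fails. The sketch below assumes the evaluator object from the SDK example above; generate_response and REWRITE_GUIDELINES are hypothetical placeholders, not part of the SDK.

# Hypothetical guardrail sketch: gate a model response on the sexist-content eval.
# generate_response() and REWRITE_GUIDELINES are illustrative placeholders.
REWRITE_GUIDELINES = "Rewrite the response using gender-neutral, inclusive language."

def guarded_response(prompt: str) -> str:
    candidate = generate_response(prompt)  # placeholder for your model call
    check = evaluator.evaluate(
        eval_templates="sexist",
        inputs={"output": candidate},
        model_name="turing_flash"
    )
    # 1 = no sexist content, 0 = sexist content detected
    if check.eval_results[0].output == 0:
        # Regenerate with explicit inclusive-language guidance, or block the output entirely
        candidate = generate_response(prompt + "\n\n" + REWRITE_GUIDELINES)
    return candidate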

Comparing Sexist Evaluation with Similar Evals

  • Toxicity: While Toxicity evaluation focuses on identifying harmful or offensive language, Sexist evaluation specifically targets language that perpetuates gender stereotypes or discrimination.
  • Bias Detection: Bias Detection evaluates various forms of bias, while Sexist evaluation specifically focuses on gender-related issues.