Evaluation Using Interface

Input:

  • Required Inputs:
    • output: The output column generated by the model.

Output:

  • Result: Passed / Failed

Interpretation:

  • Passed: The output does not contain sexist content.
  • Failed: The output contains sexist content.

Evaluation Using Python SDK

Click here to learn how to set up evaluation using the Python SDK.

Input:

  • Required Inputs:
    • output: string - The output column generated by the model.

Output:

  • Result: bool - 0/1

Interpretation:

  • 1: The output does not contain sexist content.
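  • 0: The output contains sexist content.

from fi.testcases import TestCase
from fi.evals.templates import Sexist

# `evaluator` is assumed to be the evaluation client created during SDK setup
# (see the setup guide linked above); it is not defined in this snippet.
sexist_eval = Sexist()

# The text to check is passed in the required `output` field listed above.
test_case = TestCase(
    output="Your text to evaluate for sexist content"
)

result = evaluator.evaluate(eval_templates=[sexist_eval], inputs=[test_case])

# Result for the first (and only) test case.
sexist_result = result.eval_results[0].data[0]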
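
The 0/1 result from the SDK corresponds to the Passed / Failed labels shown in the interface. A minimal sketch of that mapping follows. The helper name `interpret_sexist_result` is hypothetical and not part of the SDK, and the sketch assumes the result arrives as a plain 0/1 or boolean value.

```python
def interpret_sexist_result(value):
    """Map a 0/1 Sexist eval result to the interface labels.

    Hypothetical helper, not part of the SDK. Assumes `value` is a
    plain number or bool, as described in the Output section.
    """
    if value in (1, True):
        return "Passed"  # no sexist content detected
    if value in (0, False):
        return "Failed"  # sexist content detected
    raise ValueError(f"Unexpected Sexist eval result: {value!r}")


if __name__ == "__main__":
    print(interpret_sexist_result(1))
    print(interpret_sexist_result(0))
```

Raising on unexpected values, rather than defaulting to Passed, keeps a malformed result from being mistaken for a clean output.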


What to do when Sexist Content is Detected

Modify or remove sexist language to ensure the text is inclusive, respectful, and free from bias. Implement guidelines and policies that promote gender equality and prevent discriminatory language in AI-generated outputs.

Continuously enhance sexist content detection mechanisms to improve accuracy, minimise false positives, and adapt to evolving language patterns.
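
One way to act on a failed evaluation is to withhold the flagged output and queue it for human review, which also gives you examples for checking false positives. The sketch below is illustrative only: `gate_output`, `FALLBACK_MESSAGE`, and the in-memory `review_queue` list are hypothetical names, and the 0/1 value is assumed to come from the Sexist evaluation described above.

```python
# Hypothetical placeholder text shown instead of a flagged output.
FALLBACK_MESSAGE = "This response was withheld pending review."


def gate_output(output_text, sexist_result, review_queue):
    """Return text that is safe to show; queue flagged outputs for review.

    `sexist_result` follows the convention above: 1 means no sexist
    content was detected, 0 means sexist content was detected.
    """
    if sexist_result == 1:
        return output_text
    # Keep the flagged text so a reviewer can rewrite it, remove it,
    # or mark it as a false positive.
    review_queue.append(output_text)
    return FALLBACK_MESSAGE


if __name__ == "__main__":
    queue = []
    print(gate_output("A neutral model output", 1, queue))
    print(gate_output("A flagged model output", 0, queue))
    print(len(queue))
```

In practice the queue would likely be a database table or ticketing system instead of a list; the list only keeps the sketch self-contained.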


Comparing Sexist Evaluation with Similar Evals

  • Toxicity: While Toxicity evaluation focuses on identifying harmful or offensive language, Sexist evaluation specifically targets language that perpetuates gender stereotypes or discrimination.
  • Bias Detection: Bias Detection evaluates various forms of bias, while Sexist evaluation specifically focuses on gender-related issues.