As AI models become increasingly integrated into everyday applications, ensuring their outputs are safe, inclusive, and aligned with ethical standards is essential. AI-generated content must avoid explicit harm while fostering trust, fairness, and cultural awareness to remain suitable for diverse audiences.

Safety in AI extends beyond preventing offensive content: it encompasses bias mitigation, respect for societal norms, and adherence to ethical and legal standards. A well-moderated AI system enhances user confidence by minimising the risks of discrimination, misinformation, and inappropriate language, while ensuring AI remains reliable, socially responsible, and widely accessible.

To achieve this, the following evaluations systematically assess AI-generated content. These assessments create a structured framework to ensure that AI-driven communication is ethical, respectful, and suitable for all users:


1. Tone

Evaluates the sentiment of content to ensure it’s appropriate for the given context.

Click here to read the eval definition of Tone

a. Using Interface

Required Parameters

  • Input: The text content to evaluate for tone

Output: Returns a tag, such as “neutral” or “joy”, indicating the dominant emotional tone detected in the content

b. Using SDK

from fi.evals import EvalClient
from fi.testcases import TestCase
from fi.evals.templates import Tone

evaluator = EvalClient(
    fi_api_key="your_api_key",
    fi_secret_key="your_secret_key",
    fi_base_url="https://api.futureagi.com"
)

tone_eval = Tone()

test_case = TestCase(
    input='''
    Dear Sir, I hope this email finds you well. 
    I look forward to any insights or advice you might have 
    whenever you have a free moment'''
)

result = evaluator.evaluate(eval_templates=[tone_eval], inputs=[test_case])
tone_result = result.eval_results[0].data[0]
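
The detected tag can then drive a simple downstream check. The sketch below assumes tone_result holds a tag string as described above; the allowed_tones set and the flagging logic are illustrative, not part of the SDK.

allowed_tones = {"neutral", "joy"}  # illustrative set of acceptable tones

if tone_result not in allowed_tones:
    print(f"Review needed: detected tone '{tone_result}' is outside the allowed set")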

2. Sexist

Identifies content with gender bias or sexist language, checking for stereotypes, discriminatory language, imbalanced representation, and assumptions based on gender.

Click here to read the eval definition of Sexist

a. Using Interface

Required Parameters

  • Input: The text content to check for sexist content

Output: Returns either “Passed” or “Failed”, where “Passed” indicates no sexist content was detected and “Failed” indicates the presence of gender bias or discriminatory language

b. Using SDK

from fi.evals import EvalClient
from fi.testcases import TestCase
from fi.evals.templates import Sexist

evaluator = EvalClient(
    fi_api_key="your_api_key",
    fi_secret_key="your_secret_key",
    fi_base_url="https://api.futureagi.com"
)

sexist_eval = Sexist()

test_case = TestCase(
    input="Your text to evaluate for sexist content"
)

result = evaluator.evaluate(eval_templates=[sexist_eval], inputs=[test_case])
sexist_result = result.eval_results[0].data[0]
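
A minimal sketch of acting on the verdict, assuming sexist_result is the “Passed”/“Failed” string described above; the is_sexist helper is illustrative.

def is_sexist(verdict: str) -> bool:
    # "Failed" indicates gender bias or discriminatory language was detected
    return verdict == "Failed"

if is_sexist(sexist_result):
    print("Content flagged for gender bias; withholding response")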

3. Toxicity

Evaluates content for toxic, harmful, or aggressive language, such as profanity, threats, or abuse that could damage user relationships or escalate conflicts.

Click here to read the eval definition of Toxicity

a. Using Interface

Required Parameters

  • Input: The text content to analyse for toxic content

Output: Returns either “Passed” or “Failed”, where “Passed” indicates non-toxic content and “Failed” indicates the presence of harmful or aggressive language

b. Using SDK

from fi.evals import EvalClient
from fi.testcases import TestCase
from fi.evals.templates import Toxicity

evaluator = EvalClient(
    fi_api_key="your_api_key",
    fi_secret_key="your_secret_key",
    fi_base_url="https://api.futureagi.com"
)

toxicity_eval = Toxicity()

test_case = TestCase(
    input="Hello! Hope you're having a wonderful day!"
)

result = evaluator.evaluate(eval_templates=[toxicity_eval], inputs=[test_case])
toxicity_result = result.eval_results[0].data[0]
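
To screen several candidate responses, the documented single-case call can simply be repeated in a loop; the candidates list below is illustrative.

candidates = [
    "Hello! Hope you're having a wonderful day!",
    "Another draft response to screen",
]

flagged = []
for text in candidates:
    case = TestCase(input=text)
    res = evaluator.evaluate(eval_templates=[toxicity_eval], inputs=[case])
    if res.eval_results[0].data[0] == "Failed":  # "Failed" means toxic language was detected
        flagged.append(text)

print(f"{len(flagged)} of {len(candidates)} candidates flagged as toxic")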

4. Content Moderation

Evaluates content safety using OpenAI’s content moderation system to detect and flag potentially harmful, inappropriate, or unsafe content.

Click here to read the eval definition of Content Moderation

a. Using Interface

Required Parameters

  • Text: The text content to moderate

Output: Returns a float between 0 and 1. Higher values indicate safer content; lower values indicate potentially inappropriate content.

b. Using SDK

from fi.evals import EvalClient
from fi.testcases import TestCase
from fi.evals.templates import ContentModeration

evaluator = EvalClient(
    fi_api_key="your_api_key",
    fi_secret_key="your_secret_key",
    fi_base_url="https://api.futureagi.com"
)

moderation_eval = ContentModeration()

test_case = TestCase(
    text="This is a sample text to check for content moderation."
)

result = evaluator.evaluate(eval_templates=[moderation_eval], inputs=[test_case])
moderation_result = result.eval_results[0].metrics[0].value
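
Since higher scores mean safer content, a threshold check is a natural next step; the 0.5 cutoff below is an illustrative choice, not an SDK default.

SAFETY_THRESHOLD = 0.5  # illustrative cutoff; tune for your use case

if moderation_result < SAFETY_THRESHOLD:
    print(f"Score {moderation_result:.2f} is below the threshold; route content for manual review")
else:
    print(f"Score {moderation_result:.2f} suggests the content is safe")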

5. Bias Detection

Identifies biases in the output, including gender, racial, cultural, or ideological biases. An ideal AI-generated response uses neutral language without favouring or discriminating against any group.

Click here to read the eval definition of Bias Detection

a. Using Interface

Required Parameters

  • Input: The text content to analyse for bias

Output: Returns either “Passed” or “Failed”, where “Passed” indicates neutral content and “Failed” indicates the presence of bias.

b. Using SDK

from fi.evals import EvalClient
from fi.testcases import TestCase
from fi.evals.templates import BiasDetection

evaluator = EvalClient(
    fi_api_key="your_api_key",
    fi_secret_key="your_secret_key",
    fi_base_url="https://api.futureagi.com"
)

bias_eval = BiasDetection()

test_case = TestCase(
    input="This is a sample text to check for bias detection"
)

result = evaluator.evaluate(eval_templates=[bias_eval], inputs=[test_case])
bias_result = result.eval_results[0].data[0]
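
One way to apply this eval is as a regression check in a test suite. The sketch below assumes the “Passed”/“Failed” output described above and uses a plain assert so it can run under pytest or any other test runner.

def test_response_is_unbiased():
    case = TestCase(input="This is a sample text to check for bias detection")
    res = evaluator.evaluate(eval_templates=[bias_eval], inputs=[case])
    # "Passed" means no gender, racial, cultural, or ideological bias was detected
    assert res.eval_results[0].data[0] == "Passed"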

6. Cultural Sensitivity

Analyses the output for cultural appropriateness, inclusive language, and awareness of cultural nuances.

Click here to read the eval definition of Cultural Sensitivity

a. Using Interface

Required Parameters

  • Input: The text content to analyse for cultural appropriateness

Output: Returns either “Passed” or “Failed”, where “Passed” indicates culturally appropriate content and “Failed” indicates potential cultural insensitivity.

b. Using SDK

from fi.evals import EvalClient
from fi.testcases import TestCase
from fi.evals.templates import CulturalSensitivity

evaluator = EvalClient(
    fi_api_key="your_api_key",
    fi_secret_key="your_secret_key",
    fi_base_url="https://api.futureagi.com"
)

cultural_eval = CulturalSensitivity()

test_case = TestCase(
    input="This is a sample text to check for cultural sensitivity"
)

result = evaluator.evaluate(eval_templates=[cultural_eval], inputs=[test_case])
sensitivity_result = result.eval_results[0].data[0]  # Returns "Passed" or "Failed"
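
The same pattern extends to checking several localized variants of a message before release; the variants dictionary below is illustrative.

variants = {
    "en-US": "This is a sample text to check for cultural sensitivity",
    "en-IN": "A localized variant of the same message",
}

for locale, text in variants.items():
    res = evaluator.evaluate(eval_templates=[cultural_eval], inputs=[TestCase(input=text)])
    print(f"{locale}: {res.eval_results[0].data[0]}")  # "Passed" or "Failed" per variant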

7. Safe for Work Text

Ensures the text is appropriate for professional environments by checking that the AI response is free of explicit, offensive, or overly personal content.

Click here to read the eval definition of Safe for Work Text

a. Using Interface

Required Parameters

  • Response: The text content to evaluate for workplace appropriateness

Output: Returns either “Passed” or “Failed”, where “Passed” indicates the text is safe for work and “Failed” indicates it is not.

b. Using SDK

from fi.evals import EvalClient
from fi.testcases import TestCase
from fi.evals.templates import SafeForWorkText

evaluator = EvalClient(
    fi_api_key="your_api_key",
    fi_secret_key="your_secret_key",
    fi_base_url="https://api.futureagi.com"
)

sfw_eval = SafeForWorkText()

test_case = TestCase(
    response="This is a sample text to check for safe for work text"
)

result = evaluator.evaluate(eval_templates=[sfw_eval], inputs=[test_case])
sfw_result = result.eval_results[0].data[0]  # Returns "Passed" or "Failed"
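
A minimal gating sketch, assuming sfw_result is the “Passed”/“Failed” string described above; send_to_workplace_channel is a hypothetical stand-in for your delivery function.

def send_to_workplace_channel(text: str) -> None:
    # hypothetical delivery function; replace with your own integration
    print(f"Sending: {text}")

if sfw_result == "Passed":
    send_to_workplace_channel("This is a sample text to check for safe for work text")
else:
    print("Response withheld: flagged as not safe for work")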

8. Not Gibberish Text

Validates that the text is coherent and meaningful, with no nonsensical or garbled content and a logical, readable structure.

Click here to read the eval definition of Not Gibberish Text

a. Using Interface

Required Parameters

  • Response: The text content to evaluate for coherence

Output: Returns a float between 0 and 1. Higher values indicate more coherent and meaningful content.

b. Using SDK

from fi.evals import EvalClient
from fi.testcases import TestCase
from fi.evals.templates import NotGibberishText

evaluator = EvalClient(
    fi_api_key="your_api_key",
    fi_secret_key="your_secret_key",
    fi_base_url="https://api.futureagi.com"
)

gibberish_eval = NotGibberishText()

test_case = TestCase(
    response="This is a sample text to check for gibberish text"
)

result = evaluator.evaluate(eval_templates=[gibberish_eval], inputs=[test_case])
gibberish_result = result.eval_results[0].metrics[0].value  # Returns a float between 0 and 1
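
Because higher scores mean more coherent text, the score can serve as a regeneration signal; the 0.7 cutoff below is illustrative, not an SDK default.

COHERENCE_THRESHOLD = 0.7  # illustrative cutoff; tune for your use case

if gibberish_result < COHERENCE_THRESHOLD:
    print(f"Coherence score {gibberish_result:.2f} is low; consider regenerating the response")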

By integrating these evaluation methods, AI systems can consistently produce responsible, reliable, and socially aware outputs that enhance user trust and engagement.