With Protect, you can:

  • Define custom guardrail criteria
  • Enforce dynamic content filtering in production
  • Instantly respond to violations based on metrics like Toxicity, Sexism, Prompt Injection, Data Privacy, and more

Future AGI’s Protect integrates natively into your application, ensuring your AI is not just tested for safety but continuously shielded from emerging threats and evolving compliance standards. Adaptive guardrails let you update criteria as policies or risks change, keeping your systems resilient and aligned.


QuickStart

Use the Protect module from the FutureAGI SDK to evaluate and filter AI-generated content based on safety metrics like toxicity.

Step 1: Install the SDK

pip install futureagi

Step 2: Set Your API Keys

Make sure to set your API keys as environment variables:

export FI_API_KEY=xxxx123xxxx
export FI_SECRET_KEY=xxxx12341xxxx
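
If you prefer to configure credentials inside your script, you can set the same variables programmatically before using the SDK. A minimal sketch using Python's standard os module (the key values below are placeholders):

import os

# Set the same environment variables from code (placeholder values shown)
os.environ["FI_API_KEY"] = "xxxx123xxxx"
os.environ["FI_SECRET_KEY"] = "xxxx12341xxxx"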

Step 3: Use Protect

from fi.evals import protect

# Example AI-generated response
response = "AI Generated Message"

# Define a fallback action if the content is flagged
fallback_action = "I'm sorry, I can't help you with that."

# Specify which rules to check against
rules = [
    {'metric': 'Toxicity'}
]

# Run protection check
protected_response = protect(
    response,
    protect_rules=rules,
    action=fallback_action,
    reason=True,     # Include reason for flagging
    timeout=25       # Optional timeout in seconds
)

print(protected_response)
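
In a live application you would typically run protect on model output before it reaches the user. The sketch below assumes protect returns the original text when every rule passes and the fallback action string when a rule is violated; generate_reply is a hypothetical stand-in for your own model call:

from fi.evals import protect

def generate_reply(user_message: str) -> str:
    # Placeholder for your own LLM call
    return "AI Generated Message"

def safe_reply(user_message: str) -> str:
    # Filter the model output against the Toxicity rule before returning it
    return protect(
        generate_reply(user_message),
        protect_rules=[{'metric': 'Toxicity'}],
        action="I'm sorry, I can't help you with that.",
        timeout=25,
    )

print(safe_reply("Hello!"))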

To dive deeper into configuring Protect for your specific workflows, check out the How to Configure Protect guide.


By enabling intelligent, real-time decisions on what passes through your model, Protect helps maintain trust, ensure safety, and strengthen the integrity of your AI in the real world.

Protect – Supported Evaluations

Protect gives you fast, reliable access to a subset of Future AGI’s safety metrics, helping you secure your AI application in real-time production environments.

Toxicity

Detects harmful or offensive language in the input text, flagging content that may be toxic or inappropriate. Learn more about Toxicity Evaluation on Future AGI.

Tone

Analyses the emotional tone and sentiment of the content, classifying it into categories such as neutral, angry, or positive.

Sexism

Checks for gender-biased or sexist language to promote fairness and inclusivity in your AI output.

Prompt Injection

Identifies patterns or keywords that may indicate attempts to manipulate downstream systems through malicious or unintended instructions.

Data Privacy

Evaluates content for compliance with data privacy standards (e.g., GDPR, HIPAA). Flags potential exposure of sensitive data and checks adherence to privacy best practices.
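
To screen a response against several of these metrics at once, you can pass multiple rules in a single call. A minimal sketch, assuming each rule takes the metric name exactly as listed above:

from fi.evals import protect

# Check the response against several safety metrics in one pass
rules = [
    {'metric': 'Toxicity'},
    {'metric': 'Prompt Injection'},
    {'metric': 'Data Privacy'},
]

protected_response = protect(
    "AI Generated Message",
    protect_rules=rules,
    action="I'm sorry, I can't help you with that.",
    reason=True,   # Include the reason when content is flagged
    timeout=25,
)

print(protected_response)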