
Unlike traditional offline checks, Protect enables live monitoring and screening of every model input and output, blocking or flagging harmful content before it reaches end users. With Protect, you can:
  • Define custom guardrail criteria across four critical safety dimensions
  • Enforce dynamic content filtering in production for text, image, and audio inputs
  • Instantly respond to violations with real-time detection across Content Moderation, Bias Detection, Security, and Data Privacy Compliance
Protect is your front-line defense between production AI and the public. Built on Google’s Gemma 3n foundation, our guardrailing system combines specialized fine-tuned adapters with multi-modal capabilities to deliver enterprise-grade safety. Operating natively across text, image, and audio modalities, Protect ensures comprehensive protection whether users interact through chat, voice assistants, or visual content, without requiring separate preprocessing pipelines.

Future AGI’s Protect integrates natively into your application, ensuring your AI is not just tested for safety but continuously shielded from emerging threats and evolving compliance standards. Adaptive guardrails let you update criteria as policies or risks change, keeping your systems resilient and aligned. By enabling intelligent, real-time decisions on what passes through your model, Protect helps maintain trust, ensure safety, and strengthen the integrity of your AI in the real world.

QuickStart

Use the Protect module from the FutureAGI SDK to evaluate and filter AI-generated content based on safety metrics like toxicity.

Step 1: Install the SDK

pip install ai-evaluation

Step 2: Set Your API Keys

Make sure to set your API keys as environment variables:
export FI_API_KEY=xxxx123xxxx
export FI_SECRET_KEY=xxxx12341xxxx
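
If you prefer setting the keys from Python (for example, in a notebook), here is a minimal equivalent sketch using the standard library, assuming the same environment variable names:

import os

# Equivalent to the shell exports above; set these before constructing Protect()
os.environ["FI_API_KEY"] = "xxxx123xxxx"
os.environ["FI_SECRET_KEY"] = "xxxx12341xxxx"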

Step 3: Use Protect

from fi.evals import Protect

# Initialize (reads FI_API_KEY / FI_SECRET_KEY from env if not passed)
protector = Protect()

rules = [{'metric': 'content_moderation'}]

protected_response = protector.protect(
    "AI Generated Message",
    protect_rules=rules,
    action="I'm sorry, I can't help you with that.",
    reason=True,        # include reasons list
    timeout=25000       # milliseconds (25s)
)

print(protected_response)
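
In an application, you would typically gate model output through Protect before returning it to the user. Here is a minimal sketch, assuming protect() returns the safe text (or the fallback action message when a rule fires); generate_reply is a hypothetical stand-in for your own model call, and the exact return shape should be checked against the SDK reference:

def safe_reply(user_prompt: str) -> str:
    # generate_reply is a hypothetical placeholder for your own LLM call
    candidate = generate_reply(user_prompt)
    # Screen the candidate before it reaches the end user;
    # the `action` message is used as the fallback if a rule is violated
    return protector.protect(
        candidate,
        protect_rules=rules,
        action="I'm sorry, I can't help you with that.",
    )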
To dive deeper into configuring Protect for your specific workflows, check out the How to Configure Protect guide.

Protect – Supported Evaluations

Protect provides fast, reliable safety checks across four critical dimensions, helping you secure your AI applications in real-time production environments.

Content Moderation

Detects harmful or offensive language, including hate speech, threats, harassment, and toxic content. Evaluates context and meaning rather than isolated keywords to minimize false positives while catching genuine violations.

Bias Detection

Identifies gender-based discrimination, stereotyping, and sexist language. Goes beyond surface-level pattern matching to recognize subtle forms of bias and unfair characterization, promoting fairness and inclusivity in your AI outputs.

Security

Identifies adversarial attempts to manipulate AI systems through prompt injection attacks. Detects instruction override attempts, unauthorized role assumption, safety guideline bypass, and deceptive commands that could compromise your system’s integrity.

Data Privacy Compliance

Evaluates content for personally identifiable information (PII), including names, email addresses, phone numbers, financial data, and health records. Ensures compliance with data privacy standards such as GDPR and HIPAA by detecting potential exposure of sensitive information.
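
You can enforce several of these dimensions in a single protect() call by listing multiple rules. The sketch below illustrates this; only content_moderation appears in the QuickStart above, so the remaining metric identifiers are assumptions and should be verified against the How to Configure Protect guide:

from fi.evals import Protect

protector = Protect()

# 'content_moderation' comes from the QuickStart; the other metric names
# are illustrative assumptions and may differ in your SDK version.
rules = [
    {'metric': 'content_moderation'},   # Content Moderation
    {'metric': 'bias_detection'},       # Bias Detection (assumed name)
    {'metric': 'security'},             # Security / prompt injection (assumed name)
    {'metric': 'data_privacy'},         # Data Privacy Compliance (assumed name)
]

protected_response = protector.protect(
    "AI Generated Message",
    protect_rules=rules,
    action="I'm sorry, I can't help you with that.",
    reason=True,
    timeout=25000,
)
print(protected_response)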