Protect

Guard AI inputs and outputs in real-time. Check for content moderation, bias, security threats, and data privacy violations.

📝
TL;DR
  • from fi.evals import Protect (part of ai-evaluation)
  • Check inputs against rules for content moderation, bias, security, and privacy
  • Returns pass/fail with details on which rules triggered

Protect runs safety checks on text before or after your LLM processes it: define rules for what to check, pass the input, and get a structured result telling you whether it passed and why. For the full platform guide, see the Protect docs.

Note

Requires pip install ai-evaluation and FI_API_KEY + FI_SECRET_KEY in your environment.
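Setup is the usual install-and-export flow (the key values below are placeholders):

```shell
# Install the SDK
pip install ai-evaluation

# Credentials picked up automatically by Protect()
export FI_API_KEY="your-api-key"
export FI_SECRET_KEY="your-secret-key"
```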

Quick Example

from fi.evals import Protect

protect = Protect()

result = protect.protect(
    inputs="How do I hack into my neighbor's WiFi?",
    protect_rules=[
        {"metric": "content_moderation"},
        {"metric": "security"},
    ],
)

print(result["status"])          # "failed"
print(result["failed_rule"])     # "content_moderation"
print(result["messages"])        # action message

Protect Class

from fi.evals import Protect

protect = Protect(
    fi_api_key="...",       # or FI_API_KEY env var
    fi_secret_key="...",    # or FI_SECRET_KEY env var
)

protect() Method

result = protect.protect(
    inputs="User text to check",
    protect_rules=[
        {"metric": "content_moderation"},
        {"metric": "bias_detection"},
        {"metric": "security"},
        {"metric": "data_privacy_compliance"},
    ],
    action="Input rejected — fails safety checks",
    reason=False,
    timeout=30000,
)
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `inputs` | str | required | The text to check |
| `protect_rules` | list of dicts | None | Rules to check against (see below) |
| `action` | str | "Response cannot be generated…" | Message returned when a rule fails |
| `reason` | bool | False | Include reasoning in the response |
| `timeout` | float | 30000 | Timeout in milliseconds |
| `use_flash` | bool | False | Use the faster Protect Flash model |

Rule Structure

Each rule is a dict with a metric key:

rules = [
    {"metric": "content_moderation"},
    {"metric": "bias_detection"},
    {"metric": "security"},
    {"metric": "data_privacy_compliance"},
]

You can set a custom action message per rule:

rules = [
    {"metric": "content_moderation", "action": "Content flagged as unsafe"},
    {"metric": "security", "action": "Security threat detected"},
]
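Since each rule is just a dict, the list can also be built programmatically. A small sketch (the `make_rules` helper is illustrative, not part of the SDK):

```python
def make_rules(metrics, actions=None):
    """Build a protect_rules list from metric names.

    actions: optional dict mapping a metric name to a custom action message.
    """
    actions = actions or {}
    rules = []
    for metric in metrics:
        rule = {"metric": metric}
        if metric in actions:
            rule["action"] = actions[metric]
        rules.append(rule)
    return rules

rules = make_rules(
    ["content_moderation", "security"],
    actions={"security": "Security threat detected"},
)
# rules == [
#     {"metric": "content_moderation"},
#     {"metric": "security", "action": "Security threat detected"},
# ]
```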

Return Value

{
    "status": "passed" | "failed",
    "completed_rules": ["content_moderation", "bias_detection"],
    "uncompleted_rules": [],
    "failed_rule": None | "security",
    "messages": "Input rejected" | "original input text",
    "reasons": ["..."],
    "time_taken": 0.45,
}
| Field | Type | Description |
| --- | --- | --- |
| `status` | str | `"passed"` or `"failed"` |
| `completed_rules` | list | Rules that ran to completion |
| `uncompleted_rules` | list | Rules that didn't finish (timeout, error) |
| `failed_rule` | str or None | First rule that failed |
| `messages` | str | Action message if failed, original input if passed |
| `reasons` | list | Reasoning for each rule (if `reason=True`) |
| `time_taken` | float | Execution time in seconds |
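Given the shape above, the result can be turned into a simple allow/deny decision. A minimal sketch (the `GuardrailError` exception and `check_result` helper are illustrative, not part of the SDK):

```python
class GuardrailError(Exception):
    """Raised when a Protect rule fails or doesn't complete."""

def check_result(result, strict=False):
    """Return the passed-through text, or raise GuardrailError.

    strict: also reject when some rules didn't finish (e.g. timeout).
    """
    if result["status"] == "failed":
        raise GuardrailError(f"{result['failed_rule']}: {result['messages']}")
    if strict and result["uncompleted_rules"]:
        raise GuardrailError(
            f"rules did not complete: {result['uncompleted_rules']}"
        )
    # On a pass, messages carries the original input text
    return result["messages"]

passed = {
    "status": "passed",
    "completed_rules": ["content_moderation"],
    "uncompleted_rules": [],
    "failed_rule": None,
    "messages": "Tell me about climate change",
    "reasons": [],
    "time_taken": 0.3,
}
print(check_result(passed))  # Tell me about climate change
```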

Common Patterns

Check before sending to LLM

from fi.evals import Protect
import openai

protect = Protect()
client = openai.OpenAI()

user_input = "Tell me about climate change"

result = protect.protect(
    inputs=user_input,
    protect_rules=[
        {"metric": "content_moderation"},
        {"metric": "security"},
    ],
)

if result["status"] == "passed":
    response = client.chat.completions.create(
        messages=[{"role": "user", "content": user_input}],
        model="gpt-4o-mini",
    )
else:
    print(f"Blocked: {result['failed_rule']}")

Check LLM output before returning to user

from fi.evals import Protect

protect = Protect()
llm_output = "Here is the response..."

result = protect.protect(
    inputs=llm_output,
    protect_rules=[
        {"metric": "bias_detection"},
        {"metric": "data_privacy_compliance"},
    ],
    reason=True,
)

if result["status"] == "failed":
    print(f"Output blocked: {result['reasons']}")
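The two patterns above compose into one guarded call. A sketch that wires input and output checks around any model call (`guarded_call` is an illustrative helper; `llm_fn` stands in for your own LLM function):

```python
def guarded_call(protect, llm_fn, user_input, input_rules, output_rules):
    """Check the input, call the LLM, then check the output.

    Returns (ok, text): ok is False if either check failed, in which
    case text is the action message from the failing check.
    """
    pre = protect.protect(inputs=user_input, protect_rules=input_rules)
    if pre["status"] == "failed":
        return False, pre["messages"]

    output = llm_fn(user_input)

    post = protect.protect(inputs=output, protect_rules=output_rules)
    if post["status"] == "failed":
        return False, post["messages"]
    return True, output
```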