Platform Integration
How Prism AI Gateway connects to the broader Future AGI platform — observability, evaluation, protection, and experimentation.
About
Prism is not a standalone gateway. It’s the data collection and enforcement layer of the Future AGI platform. Every request through Prism generates signals that flow into Observe, Evaluate, Protect, and Experiment — closing the loop between production traffic and model quality.
How the platform fits together
```
Your application
       │
       ▼
┌─────────┐   traces, costs, latency   ┌─────────┐
│  Prism  │ ─────────────────────────▶ │ Observe │
│ Gateway │                            └─────────┘
│         │   guardrail scores         ┌──────────┐
│         │ ─────────────────────────▶ │ Evaluate │
│         │                            └──────────┘
│         │   shadow results           ┌────────────┐
│         │ ─────────────────────────▶ │ Experiment │
└─────────┘                            └────────────┘
```
Prism → Observe
Every request through Prism generates an execution trace — request, response, latency, token counts, cost, provider used, routing decision, and guardrail outcomes. These traces feed directly into the Observe product.
From Observe you can:
- View per-request traces with full metadata
- Monitor latency percentiles (p50, p95, p99) per model and provider
- Track cost breakdown by model, provider, team, or custom metadata dimension
- See provider health trends and error rate history
- Drill into sessions (`x-prism-session-id`) to trace conversation-level patterns
How to tag requests for attribution:
```python
from prism import Prism

client = Prism(
    api_key="sk-prism-...",
    base_url="https://gateway.futureagi.com",
    metadata={"team": "search", "feature": "query-expansion", "env": "production"},
)
```
These metadata fields appear as filterable dimensions in Observe dashboards.
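Observe's session drill-down groups requests that share a session ID. A minimal sketch of tagging a conversation, assuming the gateway accepts these per-request headers (the header names come from this page; JSON-encoding the `x-prism-metadata` value is an assumption):

```python
import json
import uuid

# One session ID per conversation; every turn reuses it so Observe can
# reconstruct the full multi-turn interaction.
session_id = str(uuid.uuid4())

headers = {
    "Authorization": "Bearer sk-prism-...",
    "x-prism-session-id": session_id,
    # Assumption: per-request metadata is sent as a JSON-encoded header value.
    "x-prism-metadata": json.dumps({"team": "search", "env": "production"}),
}
```

Send `headers` with any HTTP client (or your SDK's per-request header option) on every call that belongs to the conversation.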
Prism → Evaluate
Prism’s guardrails are backed by the Future AGI evaluation engine. When you configure a Future AGI Evaluation guardrail, Prism sends each request/response pair to the evaluation engine in real time. The engine runs model-level checks — not just regex — to detect hallucinations, quality regressions, and policy violations.
This is the key differentiator from guardrail products that rely on pattern matching: evaluation guardrails score outputs using the same models and metrics you use in offline eval.
The `futureagi` guardrail type connects Prism to Evaluate:
```python
config = client.guardrails.configs.create(
    name="Production quality gate",
    rules=[
        {
            "name": "futureagi",  # Future AGI evaluation engine
            "stage": "post",      # score the response after the model returns
            "mode": "sync",
            "action": "warn",
            "threshold": 0.7,
        }
    ],
)
```
Guardrail scores and decisions are logged in both Prism (for traffic analysis) and Evaluate (for quality trend tracking).
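The `threshold` in the config above acts as a gate on the evaluation score. A minimal sketch of the decision logic (the function and return values are illustrative, not the SDK's API):

```python
def guardrail_decision(score: float, threshold: float = 0.7, action: str = "warn") -> str:
    """Illustrative sketch of a post-stage threshold gate.

    A score at or above the threshold passes; a score below it triggers
    the configured action ("warn" logs but lets the response through,
    while a blocking action would reject it).
    """
    if score >= threshold:
        return "pass"
    return action

print(guardrail_decision(0.85))  # "pass"
print(guardrail_decision(0.55))  # "warn"
```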
Prism → Experiment
Shadow experiments in Prism generate comparison data that feeds directly into Experiment pipelines.
When you configure traffic mirroring, Prism collects:
- Production model responses
- Shadow model responses
- Latency and token deltas for each request pair
These paired results appear in the Experiment product where you can:
- Run automated scoring on response pairs using evaluation metrics
- Calculate win rates across hundreds or thousands of production requests
- Make evidence-based migration decisions before switching providers
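The win-rate calculation above can be sketched offline from the paired records. The record shape here is illustrative; Experiment performs this aggregation for you:

```python
# Each pair holds an evaluation score for the production response and
# the shadow response to the same request (illustrative data).
pairs = [
    {"prod_score": 0.72, "shadow_score": 0.81},
    {"prod_score": 0.90, "shadow_score": 0.85},
    {"prod_score": 0.60, "shadow_score": 0.77},
]

shadow_wins = sum(p["shadow_score"] > p["prod_score"] for p in pairs)
win_rate = shadow_wins / len(pairs)
print(f"shadow win rate: {win_rate:.0%}")  # shadow win rate: 67%
```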
Enabling shadow experiments:
```python
from prism import Prism, GatewayConfig, TrafficMirrorConfig

client = Prism(
    api_key="sk-prism-...",
    base_url="https://gateway.futureagi.com",
    config=GatewayConfig(
        mirror=TrafficMirrorConfig(
            target_model="claude-sonnet-4-20250514",
            target_provider="anthropic",
            sample_rate=0.1,  # mirror ~10% of production traffic
        )
    ),
)
```
Shadow results are automatically synced to the Experiment product for analysis.
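A sketch of what `sample_rate=0.1` implies statistically: roughly one in ten production requests is duplicated to the shadow target. Prism's actual sampling mechanism is internal; this only illustrates the expected volume:

```python
import random

def should_mirror(sample_rate: float = 0.1) -> bool:
    # Bernoulli sampling: mirror this request with probability sample_rate.
    return random.random() < sample_rate

random.seed(0)  # fixed seed so the simulation is reproducible
mirrored = sum(should_mirror() for _ in range(10_000))
print(mirrored)  # roughly 1,000 of 10,000 requests
```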
Metadata as the connective tissue
The `x-prism-metadata` header (or the `metadata=` parameter in the SDK) is how you connect Prism data to your application's dimensions. Tags set on requests flow through to all connected products:
| Tag | Use in Observe | Use in Evaluate | Use in Experiment |
|---|---|---|---|
| `metadata.team` | Cost breakdown by team | Quality trends per team | Experiment scoping by team |
| `metadata.feature` | Latency per feature | Regression alerts per feature | A/B test segmentation |
| `metadata.user_id` | Per-user cost | User-level quality flags | User cohort experiments |
| `metadata.env` | Separate prod/staging metrics | Different quality thresholds | Shadow test isolation |
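As an illustration of how a tag like `metadata.team` drives the Observe column above, a sketch of cost attribution over trace records (the record shape is illustrative; Observe does this aggregation for you from the tags you set):

```python
from collections import defaultdict

# Illustrative trace records as Observe might collect them per request.
traces = [
    {"metadata": {"team": "search"}, "cost_usd": 0.012},
    {"metadata": {"team": "search"}, "cost_usd": 0.008},
    {"metadata": {"team": "chat"},   "cost_usd": 0.020},
]

cost_by_team = defaultdict(float)
for t in traces:
    cost_by_team[t["metadata"]["team"]] += t["cost_usd"]

print(dict(cost_by_team))
```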