Observability

Monitor Prism Gateway with logs, metrics, and distributed tracing.

About

Prism logs every request and response, exports metrics to Prometheus and OpenTelemetry, and propagates trace IDs for distributed tracing. No additional setup is needed for basic logging; it's on by default.


Request logging

Every request through Prism is logged with:

  • Request ID, trace ID, session ID
  • Model requested and model actually used
  • Provider that handled the request
  • Input/output token counts
  • Cost
  • Latency
  • Cache status (hit/miss/skip)
  • Guardrail results
  • Any errors or fallback events

Logs sync to the Future AGI dashboard automatically. View them in Prism > Logs.

Log levels

Level   What's logged
-----   -------------
error   Failed requests, provider errors, guardrail blocks
warn    Fallbacks, slow requests, budget warnings
info    Every request (default)
debug   Full request/response bodies, header details

For self-hosted deployments, set the log level in config.yaml:

logging:
  level: info

Distributed tracing

Prism propagates trace IDs across the request lifecycle. Set x-prism-trace-id on incoming requests and the same ID appears in all downstream provider calls and logs.

Using the Python SDK, which exposes the trace ID as a request parameter:

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    trace_id="trace-from-my-app-abc123",
    user_id="user-42",
)

print(response.prism.trace_id)    # trace-from-my-app-abc123
print(response.prism.provider)    # openai
print(response.prism.latency_ms)  # 342
print(response.prism.cost)        # 0.00015

Or by setting the header directly and reading it back from the raw response:

raw = client.chat.completions.with_raw_response.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    extra_headers={
        "x-prism-trace-id": "trace-from-my-app-abc123",
        "x-prism-user-id": "user-42",
    },
)
print(raw.headers.get("x-prism-trace-id"))
print(raw.headers.get("x-prism-cost"))

Or with curl:

curl -i https://gateway.futureagi.com/v1/chat/completions \
  -H "Authorization: Bearer sk-prism-your-key" \
  -H "x-prism-trace-id: trace-from-my-app-abc123" \
  -H "x-prism-user-id: user-42" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'
# Look for x-prism-trace-id in response headers

If you don’t set a trace ID, Prism generates one automatically. Use it for correlating gateway logs with your application logs.
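For correlation, a small helper can mint one trace ID, record it in the application's own logs, and forward it via the x-prism-trace-id header shown above (an illustrative sketch; traced_completion is not part of any SDK):

```python
import logging
import uuid

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("my-app")

def traced_completion(client, **kwargs):
    # Mint one trace ID, log it locally, and forward it to Prism so the
    # gateway's log entry carries the same ID as the application's logs.
    trace_id = f"trace-{uuid.uuid4().hex}"
    log.info("sending request trace_id=%s", trace_id)
    response = client.chat.completions.create(
        extra_headers={"x-prism-trace-id": trace_id},
        **kwargs,
    )
    log.info("request completed trace_id=%s", trace_id)
    return trace_id, response
```

Searching either log store for the same trace_id then surfaces both sides of the request.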

OpenTelemetry integration

Self-hosted deployments can export traces to any OpenTelemetry-compatible backend:

telemetry:
  traces:
    enabled: true
    exporter: otlp
    endpoint: "http://otel-collector:4317"
    service_name: "prism-gateway"
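For reference, a minimal OpenTelemetry Collector configuration that would accept these traces on the endpoint above (illustrative only; swap the debug exporter for your actual backend, such as Jaeger or Tempo):

```yaml
# otel-collector config (sketch)
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: "0.0.0.0:4317"

exporters:
  debug:
    verbosity: basic

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [debug]
```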

Metrics

Prism exports Prometheus metrics on the /-/metrics endpoint.

Available metrics

Metric                            Type       Description
------                            ----       -----------
prism_requests_total              Counter    Total requests by model, provider, status code
prism_request_duration_seconds    Histogram  Request latency distribution
prism_tokens_total                Counter    Total tokens (input + output) by model
prism_cost_total                  Counter    Total cost in USD by model and provider
prism_cache_hits_total            Counter    Cache hits by strategy (exact/semantic)
prism_cache_misses_total          Counter    Cache misses
prism_provider_errors_total       Counter    Provider errors by provider and error code
prism_circuit_breaker_state       Gauge      Circuit breaker state (0=closed, 1=open, 2=half-open)
prism_rate_limit_exceeded_total   Counter    Rate limit rejections by key
prism_guardrail_triggered_total   Counter    Guardrail triggers by guardrail name and action
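These metrics combine into the usual PromQL expressions; a few example queries (sketches only — label names such as status_code and provider are assumptions to check against your actual metric labels):

```promql
# Error rate over 5 minutes
sum(rate(prism_requests_total{status_code=~"5.."}[5m]))
  / sum(rate(prism_requests_total[5m]))

# P95 request latency by provider
histogram_quantile(0.95,
  sum by (le, provider) (rate(prism_request_duration_seconds_bucket[5m])))

# Cache hit ratio
sum(rate(prism_cache_hits_total[5m]))
  / (sum(rate(prism_cache_hits_total[5m])) + sum(rate(prism_cache_misses_total[5m])))
```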

Scrape configuration

# prometheus.yml
scrape_configs:
  - job_name: "prism-gateway"
    scrape_interval: 15s
    metrics_path: "/-/metrics"
    static_configs:
      - targets: ["prism-gateway:8080"]

Self-hosted metrics config

telemetry:
  metrics:
    enabled: true
    prometheus:
      enabled: true
      path: "/-/metrics"

Session tracking

Group related requests into sessions for conversation-level analytics. Set x-prism-session-id on each request in a conversation:

session_id = "user-123-conversation-456"
messages = []

# Each turn in the conversation shares the same session_id
messages.append({"role": "user", "content": "What's the capital of France?"})
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    session_id=session_id,
    user_id="user-123",
)
messages.append({"role": "assistant", "content": response.choices[0].message.content})

messages.append({"role": "user", "content": "What's its population?"})
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    session_id=session_id,
    user_id="user-123",
)
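The per-turn bookkeeping above can be wrapped in a small helper (an illustrative sketch, not part of the Prism SDK; it relies only on the session_id and user_id parameters shown above):

```python
class Conversation:
    """Keeps the message history and stamps every request with the same
    session_id and user_id so all turns group into one Prism session."""

    def __init__(self, client, session_id, user_id, model="gpt-4o"):
        self.client = client
        self.session_id = session_id
        self.user_id = user_id
        self.model = model
        self.messages = []

    def ask(self, content):
        self.messages.append({"role": "user", "content": content})
        response = self.client.chat.completions.create(
            model=self.model,
            messages=self.messages,
            session_id=self.session_id,
            user_id=self.user_id,
        )
        reply = response.choices[0].message.content
        self.messages.append({"role": "assistant", "content": reply})
        return reply
```

Each call to ask() sends the full history and appends the assistant's reply, so the two-turn example above collapses to two ask() calls on one Conversation.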

Sessions appear in the dashboard under Prism > Sessions and show:

  • Total requests in the session
  • Cumulative cost
  • Models and providers used
  • Timeline of requests

Alerting

Configure alerts to get notified about issues. See Cost tracking > Budget alerts for alert configuration.

Event                      When it fires
-----                      -------------
Budget threshold crossed   Spend exceeds configured percentage
Error rate spike           Error rate exceeds threshold over a time window
Latency spike              P95 latency exceeds threshold
Guardrail triggered        A guardrail blocks or flags a request
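If you run your own Prometheus against the self-hosted metrics endpoint, the error-rate and latency events can also be approximated with standard alerting rules (a sketch; thresholds and label names such as status_code are assumptions to adapt):

```yaml
# prism-alerts.yml (Prometheus alerting rules, sketch)
groups:
  - name: prism-gateway
    rules:
      - alert: PrismErrorRateSpike
        expr: |
          sum(rate(prism_requests_total{status_code=~"5.."}[5m]))
            / sum(rate(prism_requests_total[5m])) > 0.05
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Prism error rate above 5% for 10 minutes"

      - alert: PrismLatencyP95High
        expr: |
          histogram_quantile(0.95,
            sum by (le) (rate(prism_request_duration_seconds_bucket[5m]))) > 2
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Prism P95 latency above 2s for 10 minutes"
```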
