Error handling

Error response format, HTTP status codes, and retry strategies for the Prism Gateway.

About

All Prism errors follow a consistent JSON format with machine-readable codes. This page covers the error structure, HTTP status codes, and retry strategies.

Error format

All errors from Prism follow the same JSON structure:

{
  "error": {
    "message": "Human-readable description of what went wrong",
    "type": "error_category",
    "param": null,
    "code": "machine_readable_code"
  }
}

The type field groups errors into categories. The code field identifies the specific error. Use code for programmatic error handling.
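As a rough illustration of branching on code rather than on message, the sketch below pulls error.code out of a response body and maps it to an action. The code values come from the status tables on this page; the helper names (`error_code`, `action_for`) and the action labels are hypothetical and not part of any Prism SDK.

```python
import json

# Codes are taken from the status tables on this page. The action labels
# ("retry", "fix_request", ...) are hypothetical and only for illustration.
ACTIONS = {
    "rate_limit_exceeded": "retry",
    "timeout": "retry",
    "service_unavailable": "retry",
    "unauthorized": "fix_credentials",
    "invalid_json": "fix_request",
    "missing_model": "fix_request",
    "missing_messages": "fix_request",
    "model_not_found": "fix_request",
}


def error_code(body: str):
    """Return error.code from an error body, or None if absent or malformed."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    error = payload.get("error")
    if not isinstance(error, dict):
        return None
    code = error.get("code")
    return code if isinstance(code, str) else None


def action_for(body: str) -> str:
    """Map an error body to an action label; unknown codes fall back to 'fail'."""
    return ACTIONS.get(error_code(body), "fail")
```

Matching on code keeps the logic stable if the wording of message changes.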

HTTP status codes

Client errors (4xx)

Status	Code	Meaning
400	`invalid_json`	Request body is not valid JSON
400	`missing_model`	`model` field is missing from the request
400	`missing_messages`	`messages` field is missing or empty
400	`invalid_request_error`	Other request validation failures
401	`unauthorized`	API key is missing or invalid
403	`content_blocked`	A guardrail blocked the request (enforce mode)
404	`model_not_found`	Model not configured for any provider. Check `model_map` or use `provider/model` format.
429	`rate_limit_exceeded`	Per-key or per-org rate limit exceeded
429	`budget_exceeded`	Organization budget limit reached

Server errors (5xx)

Status	Code	Meaning
500	`internal_error`	Unexpected gateway error
501	`not_supported`	Provider doesn’t support this endpoint (e.g. embeddings on a chat-only provider)
502	`provider_error`	Provider returned an error
502	`provider_404`	Provider returned 404 (usually wrong API key or model access)
502	`upstream_error`	Generic upstream provider failure
503	`service_unavailable`	Gateway is overloaded or shutting down
504	`timeout`	Request timed out waiting for provider response

Common errors and fixes

Model not found (404)

{
  "error": {
    "message": "model \"gpt-4o\" not found in any configured provider. Configure model_map or use 'provider/model' format.",
    "type": "not_found",
    "code": "model_not_found"
  }
}

Causes:

The model isn’t enabled for your organization’s providers
Typo in the model name
Using a model alias without configuring model_map

Fixes:

Check available models: GET /v1/models
Configure a model map
Use the provider/model format: "openai/gpt-4o"

Rate limit exceeded (429)

{
  "error": {
    "message": "Rate limit exceeded. Please retry after the window resets.",
    "type": "rate_limit_error",
    "param": null,
    "code": "rate_limit_exceeded"
  }
}

Check the x-ratelimit-remaining-requests and x-ratelimit-reset-requests response headers to know when to retry. See retry strategies below.
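One way to turn the reset header into a wait time is sketched below. This page does not specify the format of x-ratelimit-reset-requests, so the parser is an assumption: it accepts either plain seconds ("2") or a duration string such as "1m30s" or "250ms", and falls back to a default otherwise. The function names are hypothetical.

```python
import math
import re
from typing import Mapping, Optional

# Assumed formats: plain seconds ("2", "0.5") or a duration string
# ("1m30s", "250ms"). Neither is confirmed by this page.
_DURATION_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|s|m|h)")
_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def parse_reset(value: str) -> Optional[float]:
    """Parse a reset header value into seconds, or None if unrecognized."""
    value = value.strip()
    try:
        seconds = float(value)
    except ValueError:
        seconds = None
    if seconds is not None:
        return seconds if math.isfinite(seconds) and seconds >= 0 else None

    parts = _DURATION_PART.findall(value)
    # Reject values that contain anything besides number+unit pairs.
    if not parts or "".join(number + unit for number, unit in parts) != value:
        return None
    return sum(float(number) * _UNIT_SECONDS[unit] for number, unit in parts)


def seconds_until_retry(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Return how long to wait before retrying, based on the reset header."""
    raw = headers.get("x-ratelimit-reset-requests")
    parsed = parse_reset(raw) if raw is not None else None
    return default if parsed is None else parsed
```

With the requests library, response.headers can be passed directly because its lookups are case-insensitive; a plain dict needs lowercase keys.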

Budget exceeded (429)

{
  "error": {
    "message": "Organization monthly budget of $500.00 exceeded",
    "type": "budget_error",
    "param": null,
    "code": "budget_exceeded"
  }
}

Budget resets at the start of the next period (daily/weekly/monthly). Increase the budget in Rate limiting & budgets or wait for the reset.

Guardrail blocked (403)

{
  "error": {
    "type": "guardrail_triggered",
    "code": "content_blocked",
    "message": "Request blocked by guardrail: pii-detector"
  }
}

The request or response triggered a guardrail in enforce mode. Check the x-prism-guardrail-triggered response header. See Guardrails for configuration.

Provider error (502)

{
  "error": {
    "message": "provider error (HTTP 404): ",
    "type": "upstream_error",
    "code": "provider_404"
  }
}

The gateway reached the provider but got an error back. Common causes:

Provider API key is invalid or expired
Project-scoped key doesn’t have model access enabled
Provider is experiencing an outage

Configure failover to automatically route to backup providers when this happens.

Retry strategies

Exponential backoff

The standard pattern for handling transient errors (429, 500, 502, 503, 504) is to retry with exponentially increasing delays.

The Prism SDK retries automatically when you configure RetryConfig:

from prism import Prism, GatewayConfig, RetryConfig

client = Prism(
    api_key="sk-prism-your-key",
    base_url="https://gateway.futureagi.com",
    config=GatewayConfig(
        retry=RetryConfig(
            max_retries=3,
            on_status_codes=[429, 500, 502, 503, 504],
            backoff_factor=0.5,
        ),
    ),
)

# Retries happen automatically on configured status codes
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)

The OpenAI SDK has built-in retry logic with exponential backoff:

from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.futureagi.com/v1",
    api_key="sk-prism-your-key",
    max_retries=3,  # built-in retry with backoff
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)

Without an SDK, implement the backoff loop yourself:

import time
import requests

RETRYABLE_STATUSES = (429, 500, 502, 503, 504)

def call_with_retry(max_attempts=4):
    for attempt in range(max_attempts):
        response = requests.post(
            "https://gateway.futureagi.com/v1/chat/completions",
            headers={
                "Authorization": "Bearer sk-prism-your-key",
                "Content-Type": "application/json",
            },
            json={
                "model": "gpt-4o",
                "messages": [{"role": "user", "content": "Hello"}],
            },
            timeout=60,  # raise requests.Timeout instead of hanging on a stalled connection
        )

        if response.ok:
            return response.json()

        is_last_attempt = attempt == max_attempts - 1
        if response.status_code in RETRYABLE_STATUSES and not is_last_attempt:
            wait = min(2 ** attempt, 30)  # 1s, 2s, 4s, capped at 30s
            print(f"Attempt {attempt + 1} failed ({response.status_code}), retrying in {wait}s")
            time.sleep(wait)
            continue

        # Non-retryable error or final attempt: raises requests.HTTPError
        response.raise_for_status()

    # Only reached when max_attempts is 0 or negative
    raise RuntimeError(f"Failed after {max_attempts} attempts")
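When many clients back off on the same fixed schedule, they tend to retry at the same instant. A common refinement, not part of the example above, is to randomize each delay. The sketch below uses "full jitter": the delay is drawn uniformly between 0 and the capped exponential value. The function name and defaults are hypothetical.

```python
import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0, rng=random) -> float:
    """Return a randomized delay in seconds for a zero-based attempt number.

    The upper bound doubles with each attempt (base, 2*base, 4*base, ...)
    and is capped at `cap`; the delay is uniform in [0, upper bound].
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0.0, ceiling)
```

To use it in a retry loop, replace the fixed wait with `time.sleep(backoff_delay(attempt))`. Passing a seeded `random.Random` instance as `rng` makes the delays reproducible in tests.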

What to retry

Status	Retry?	Why
400	No	Bad request, fix the input
401	No	Bad credentials, fix the API key
403	No	Blocked by guardrail or RBAC
404	No	Model not found, fix the model name
429	Yes	Rate limit, back off and retry. For `budget_exceeded`, retries keep failing until the budget period resets.
500	Yes	Internal error, may be transient
501	No	Provider doesn’t support this endpoint
502	Yes	Provider error, may recover
503	Yes	Service unavailable, may recover
504	Yes	Timeout, may succeed on retry
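The table can be encoded as a small predicate, sketched below. The function name is hypothetical. Treating `budget_exceeded` as non-retryable is an inference from the Budget exceeded section (the budget only resets at the start of the next period), not a documented rule.

```python
from typing import Optional

# Status codes marked "Yes" in the table above.
RETRYABLE_STATUSES = frozenset({429, 500, 502, 503, 504})


def should_retry(status: int, code: Optional[str] = None) -> bool:
    """Decide whether a failed request is worth retrying.

    `code` is the error.code field from the response body, if available.
    """
    # Assumption: a spent budget will not recover within a retry window,
    # even though it shares status 429 with rate limiting.
    if code == "budget_exceeded":
        return False
    return status in RETRYABLE_STATUSES
```

Checking the body's code as well as the HTTP status matters here because 429 covers both rate limits and budget limits.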

Using failover instead of retry

For production systems, configure routing with failover instead of client-side retries. Prism automatically routes to the next provider on failure, which is faster than waiting and retrying the same provider.

Error handling in SDKs

Prism SDK exceptions

from prism import Prism, APIStatusError, RateLimitError, AuthenticationError

client = Prism(
    api_key="sk-prism-your-key",
    base_url="https://gateway.futureagi.com",
)

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}],
    )
except RateLimitError:
    print("Rate limited, back off and retry")
except AuthenticationError:
    print("Bad API key")
except APIStatusError as e:
    print(f"API error {e.status_code}: {e.message}")

OpenAI SDK exceptions

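from openai import (
    OpenAI,
    APIConnectionError,
    APIStatusError,
    AuthenticationError,
    RateLimitError,
)

client = OpenAI(
    base_url="https://gateway.futureagi.com/v1",
    api_key="sk-prism-your-key",
)

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}],
    )
except RateLimitError:
    print("Rate limited")
except AuthenticationError:
    print("Bad API key")
except APIStatusError as e:
    # Any other non-2xx response; status_code exists only on APIStatusError
    print(f"API error {e.status_code}: {e.message}")
except APIConnectionError as e:
    # No HTTP response was received (network failure or timeout)
    print(f"Connection error: {e}")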

Next Steps

Troubleshooting

Routing & failover

Rate limiting

Request & response headers
