Error handling
Error response format, HTTP status codes, and retry strategies for the Prism Gateway.
About
All Prism errors follow a consistent JSON format with machine-readable codes. This page covers the error structure, HTTP status codes, and retry strategies.
Error format
All errors from Prism follow the same JSON structure:
{
  "error": {
    "message": "Human-readable description of what went wrong",
    "type": "error_category",
    "param": null,
    "code": "machine_readable_code"
  }
}
The type field groups errors into categories. The code field identifies the specific error. Use code for programmatic error handling.
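As a minimal sketch of dispatching on code, the snippet below parses an error body into a small record. The `PrismError` dataclass and `parse_error` helper are hypothetical names used for illustration, not part of any Prism SDK.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class PrismError:
    """Hypothetical container for the fields of the error envelope."""
    message: str
    type: str
    code: str
    param: Optional[str] = None


def parse_error(body: str) -> PrismError:
    """Parse a Prism error response body into a PrismError."""
    err = json.loads(body)["error"]
    return PrismError(
        message=err.get("message", ""),
        type=err.get("type", ""),
        code=err.get("code", ""),
        param=err.get("param"),  # some responses omit param, so default to None
    )


sample = json.dumps({
    "error": {
        "message": "Rate limit exceeded. Please retry after the window resets.",
        "type": "rate_limit_error",
        "param": None,
        "code": "rate_limit_exceeded",
    }
})

error = parse_error(sample)
if error.code == "rate_limit_exceeded":
    print("Back off and retry")
```

Branching on `code` rather than on the message text keeps the handler stable if the human-readable wording changes.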
HTTP status codes
Client errors (4xx)
| Status | Code | Meaning |
|---|---|---|
| 400 | invalid_json | Request body is not valid JSON |
| 400 | missing_model | model field is missing from the request |
| 400 | missing_messages | messages field is missing or empty |
| 400 | invalid_request_error | Other request validation failures |
| 401 | unauthorized | API key is missing or invalid |
| 403 | content_blocked | A guardrail blocked the request (enforce mode) |
| 404 | model_not_found | Model not configured for any provider. Check model_map or use provider/model format. |
| 429 | rate_limit_exceeded | Per-key or per-org rate limit exceeded |
| 429 | budget_exceeded | Organization budget limit reached |
Server errors (5xx)
| Status | Code | Meaning |
|---|---|---|
| 500 | internal_error | Unexpected gateway error |
| 501 | not_supported | Provider doesn’t support this endpoint (e.g. embeddings on a chat-only provider) |
| 502 | provider_error | Provider returned an error |
| 502 | provider_404 | Provider returned 404 (usually wrong API key or model access) |
| 502 | upstream_error | Generic upstream provider failure |
| 503 | service_unavailable | Gateway is overloaded or shutting down |
| 504 | timeout | Request timed out waiting for provider response |
Common errors and fixes
model not found (404)
{
  "error": {
    "message": "model \"gpt-4o\" not found in any configured provider. Configure model_map or use 'provider/model' format.",
    "type": "not_found",
    "code": "model_not_found"
  }
}
Causes:
- The model isn’t enabled for your organization’s providers
- Typo in the model name
- Using a model alias without configuring model_map
Fixes:
- Check available models: GET /v1/models
- Configure a model map
- Use the provider/model format: "openai/gpt-4o"
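For the typo case, one way to narrow things down is to compare the requested name against the IDs returned by GET /v1/models. The sketch below assumes an OpenAI-style list response of the form `{"data": [{"id": ...}]}`, which this page does not confirm. `suggest_models` is a hypothetical helper and the model IDs in the sample payload are placeholders.

```python
import difflib


def suggest_models(requested: str, models_response: dict, limit: int = 3) -> list:
    """Return available model IDs that closely resemble the requested name."""
    # Assumed response shape: {"data": [{"id": "gpt-4o"}, ...]}
    available = [m["id"] for m in models_response.get("data", [])]
    return difflib.get_close_matches(requested, available, n=limit, cutoff=0.6)


# Placeholder payload standing in for a real GET /v1/models response.
models_response = {
    "data": [{"id": "gpt-4o"}, {"id": "gpt-4o-mini"}, {"id": "claude-sonnet"}]
}
suggestions = suggest_models("gpt-4", models_response)
print(suggestions)
```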
Rate limit exceeded (429)
{
  "error": {
    "message": "Rate limit exceeded. Please retry after the window resets.",
    "type": "rate_limit_error",
    "param": null,
    "code": "rate_limit_exceeded"
  }
}
Check the x-ratelimit-remaining-requests and x-ratelimit-reset-requests response headers to know when to retry. See retry strategies below.
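A sketch of turning the reset header into a wait time. The value format of x-ratelimit-reset-requests is not specified on this page. This example assumes it is either a plain number of seconds or a duration string such as "1s" or "6m0s", and falls back to a default otherwise. `seconds_until_reset` is a hypothetical helper.

```python
import re

# Matches one "<number><unit>" piece of a duration such as "6m0s" or "500ms".
_DURATION_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}


def seconds_until_reset(headers: dict, default: float = 1.0) -> float:
    """Best-effort wait time in seconds, read from the rate-limit reset header."""
    # A plain dict lookup is case-sensitive; requests' response.headers is not.
    raw = headers.get("x-ratelimit-reset-requests", "").strip()
    if not raw:
        return default
    try:
        return float(raw)  # assumed format 1: plain seconds
    except ValueError:
        pass
    parts = _DURATION_PART.findall(raw)  # assumed format 2: duration string
    # Reject values that are not made up entirely of duration pieces.
    if not parts or "".join(num + unit for num, unit in parts) != raw:
        return default
    return sum(float(num) * _UNIT_SECONDS[unit] for num, unit in parts)
```

Falling back to a default keeps the caller's retry loop working even if the header is missing or uses a format this sketch does not anticipate.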
Budget exceeded (429)
{
  "error": {
    "message": "Organization monthly budget of $500.00 exceeded",
    "type": "budget_error",
    "param": null,
    "code": "budget_exceeded"
  }
}
The budget resets at the start of the next period (daily, weekly, or monthly). Increase the budget in Rate limiting & budgets or wait for the reset.
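Both rate_limit_exceeded and budget_exceeded arrive as HTTP 429, so the status alone does not say whether a short backoff will help. The sketch below branches on code instead. Treating budget_exceeded as not worth retrying is this example's assumption, based on the reset behaviour described above, and `should_retry_429` is a hypothetical helper.

```python
def should_retry_429(error_body: dict) -> bool:
    """Decide whether a 429 response is worth retrying with a short backoff."""
    code = error_body.get("error", {}).get("code")
    if code == "budget_exceeded":
        # Assumption: the budget only resets at the next period, so a
        # backoff measured in seconds will not help.
        return False
    return code == "rate_limit_exceeded"


budget_body = {
    "error": {
        "message": "Organization monthly budget of $500.00 exceeded",
        "type": "budget_error",
        "param": None,
        "code": "budget_exceeded",
    }
}
print(should_retry_429(budget_body))
```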
Guardrail blocked (403)
{
  "error": {
    "type": "guardrail_triggered",
    "code": "content_blocked",
    "message": "Request blocked by guardrail: pii-detector"
  }
}
The request or response triggered a guardrail in enforce mode. Check the x-prism-guardrail-triggered response header. See Guardrails for configuration.
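A small sketch of detecting a guardrail block and collecting what the response says about it. The exact value carried by x-prism-guardrail-triggered is not documented on this page, so the header is passed through unparsed. `guardrail_block_info` is a hypothetical helper.

```python
from typing import Optional


def guardrail_block_info(status_code: int, headers: dict, body: dict) -> Optional[dict]:
    """Return details when a response was blocked by a guardrail, else None."""
    error = body.get("error", {})
    if status_code != 403 or error.get("code") != "content_blocked":
        return None
    return {
        "message": error.get("message", ""),
        # Raw header value; its format is an unknown here, so it is not parsed.
        "header": headers.get("x-prism-guardrail-triggered"),
    }


blocked_body = {
    "error": {
        "type": "guardrail_triggered",
        "code": "content_blocked",
        "message": "Request blocked by guardrail: pii-detector",
    }
}
info = guardrail_block_info(403, {}, blocked_body)
```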
Provider error (502)
{
  "error": {
    "message": "provider error (HTTP 404): ",
    "type": "upstream_error",
    "code": "provider_404"
  }
}
The gateway reached the provider but got an error back. Common causes:
- Provider API key is invalid or expired
- Project-scoped key doesn’t have model access enabled
- Provider is experiencing an outage
Configure failover to automatically route to backup providers when this happens.
Retry strategies
Exponential backoff
Exponential backoff is the standard pattern for handling transient errors (429 and 5xx): wait progressively longer after each failed attempt before retrying.
The Prism SDK retries automatically when you configure RetryConfig:
from prism import Prism, GatewayConfig, RetryConfig

client = Prism(
    api_key="sk-prism-your-key",
    base_url="https://gateway.futureagi.com",
    config=GatewayConfig(
        retry=RetryConfig(
            max_retries=3,
            on_status_codes=[429, 500, 502, 503, 504],
            backoff_factor=0.5,
        ),
    ),
)

# Retries happen automatically on configured status codes
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
The OpenAI SDK has built-in retry logic with exponential backoff:
from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.futureagi.com/v1",
    api_key="sk-prism-your-key",
    max_retries=3,  # built-in retry with backoff
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
Without an SDK, implement the backoff loop yourself with requests:
import time
import requests


def call_with_retry(max_attempts=4):
    for attempt in range(max_attempts):
        response = requests.post(
            "https://gateway.futureagi.com/v1/chat/completions",
            headers={
                "Authorization": "Bearer sk-prism-your-key",
                "Content-Type": "application/json",
            },
            json={
                "model": "gpt-4o",
                "messages": [{"role": "user", "content": "Hello"}],
            },
        )
        if response.status_code == 200:
            return response.json()
        if response.status_code in (429, 500, 502, 503, 504):
            if attempt < max_attempts - 1:
                wait = min(2 ** attempt, 30)  # 1s, 2s, 4s, capped at 30s
                print(f"Attempt {attempt + 1} failed ({response.status_code}), retrying in {wait}s")
                time.sleep(wait)
                continue
        # Non-retryable error or final attempt
        response.raise_for_status()
    raise Exception(f"Failed after {max_attempts} attempts")
What to retry
| Status | Retry? | Why |
|---|---|---|
| 400 | No | Bad request, fix the input |
| 401 | No | Bad credentials, fix the API key |
| 403 | No | Blocked by guardrail or RBAC |
| 404 | No | Model not found, fix the model name |
| 429 | Yes | Rate limit, back off and retry |
| 500 | Yes | Internal error, may be transient |
| 502 | Yes | Provider error, may recover |
| 503 | Yes | Service unavailable, may recover |
| 504 | Yes | Timeout, may succeed on retry |
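The table can be encoded as a small lookup, paired with the capped exponential delay used in the requests example. This is a sketch, and `is_retryable` and `backoff_delay` are hypothetical helper names.

```python
RETRYABLE_STATUSES = frozenset({429, 500, 502, 503, 504})


def is_retryable(status_code: int) -> bool:
    """True for the statuses marked 'Yes' in the table above."""
    return status_code in RETRYABLE_STATUSES


def backoff_delay(attempt: int, cap: float = 30.0) -> float:
    """Seconds to wait before retrying after 0-based attempt `attempt`."""
    # Doubles each time (1s, 2s, 4s, ...) and never exceeds the cap.
    return min(2.0 ** attempt, cap)


for status in (404, 429, 503):
    print(status, is_retryable(status))
```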
Using failover instead of retry
For production systems, configure routing with failover instead of client-side retries. Prism automatically routes to the next provider on failure, which is faster than waiting and retrying the same provider.
Error handling in SDKs
Prism SDK exceptions
from prism import Prism, APIStatusError, RateLimitError, AuthenticationError

client = Prism(
    api_key="sk-prism-your-key",
    base_url="https://gateway.futureagi.com",
)

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}],
    )
except RateLimitError:
    print("Rate limited, back off and retry")
except AuthenticationError:
    print("Bad API key")
except APIStatusError as e:
    print(f"API error {e.status_code}: {e.message}")
OpenAI SDK exceptions
from openai import OpenAI, RateLimitError, AuthenticationError, APIStatusError

client = OpenAI(
    base_url="https://gateway.futureagi.com/v1",
    api_key="sk-prism-your-key",
)

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}],
    )
except RateLimitError:
    print("Rate limited")
except AuthenticationError:
    print("Bad API key")
# Catch APIStatusError rather than APIError: only status errors carry
# status_code (connection errors, also APIError subclasses, do not).
except APIStatusError as e:
    print(f"API error {e.status_code}: {e.message}")