Error handling

Error response format, HTTP status codes, and retry strategies for the Prism Gateway.

About

All Prism errors follow a consistent JSON format with machine-readable codes. This page covers the error structure, HTTP status codes, and retry strategies.


Error format

All errors from Prism follow the same JSON structure:

{
  "error": {
    "message": "Human-readable description of what went wrong",
    "type": "error_category",
    "param": null,
    "code": "machine_readable_code"
  }
}

The type field groups errors into categories; the code field identifies the specific error. Match on code in client logic, since message is human-readable prose and may change.
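A minimal sketch of branching on code (the helper name and sample body below are illustrative, not part of the gateway):

```python
def error_code(body):
    """Pull the machine-readable code out of a Prism error body."""
    return (body.get("error") or {}).get("code")

# Sample error body in the format shown above
body = {
    "error": {
        "message": "Rate limit exceeded. Please retry after the window resets.",
        "type": "rate_limit_error",
        "param": None,
        "code": "rate_limit_exceeded",
    }
}

if error_code(body) == "rate_limit_exceeded":
    action = "back off and retry"
else:
    action = "inspect the error"
```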


HTTP status codes

Client errors (4xx)

Status  Code                   Meaning
400     invalid_json           Request body is not valid JSON
400     missing_model          model field is missing from the request
400     missing_messages       messages field is missing or empty
400     invalid_request_error  Other request validation failures
401     unauthorized           API key is missing or invalid
403     content_blocked        A guardrail blocked the request (enforce mode)
404     model_not_found        Model not configured for any provider. Check model_map or use provider/model format.
429     rate_limit_exceeded    Per-key or per-org rate limit exceeded
429     budget_exceeded        Organization budget limit reached

Server errors (5xx)

Status  Code                 Meaning
500     internal_error       Unexpected gateway error
501     not_supported        Provider doesn’t support this endpoint (e.g. embeddings on a chat-only provider)
502     provider_error       Provider returned an error
502     provider_404         Provider returned 404 (usually wrong API key or model access)
502     upstream_error       Generic upstream provider failure
503     service_unavailable  Gateway is overloaded or shutting down
504     timeout              Request timed out waiting for provider response

Common errors and fixes

Model not found (404)

{
  "error": {
    "message": "model \"gpt-4o\" not found in any configured provider. Configure model_map or use 'provider/model' format.",
    "type": "not_found",
    "code": "model_not_found"
  }
}

Causes:

  • The model isn’t enabled for your organization’s providers
  • Typo in the model name
  • Using a model alias without configuring model_map

Fixes:

  • Check available models: GET /v1/models
  • Configure a model map
  • Use the provider/model format: "openai/gpt-4o"
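To guard against this error up front, a client can check GET /v1/models and fall back to the explicit provider/model format when an alias is unknown. A sketch, assuming the endpoint returns the OpenAI-style {"data": [{"id": ...}]} list shape; the helper names and default provider are illustrative:

```python
import requests

def available_models(base_url, api_key):
    """Fetch model IDs via GET /v1/models (assumes an OpenAI-style list body)."""
    resp = requests.get(
        f"{base_url}/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    resp.raise_for_status()
    return [m["id"] for m in resp.json()["data"]]

def resolve_model(requested, models, provider="openai"):
    """Use the alias when the gateway knows it; otherwise be explicit."""
    if requested in models:
        return requested
    return f"{provider}/{requested}"
```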

Rate limit exceeded (429)

{
  "error": {
    "message": "Rate limit exceeded. Please retry after the window resets.",
    "type": "rate_limit_error",
    "param": null,
    "code": "rate_limit_exceeded"
  }
}

Check the x-ratelimit-remaining-requests and x-ratelimit-reset-requests response headers to know when to retry. See retry strategies below.
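A sketch of turning the reset header into a wait time. This assumes x-ratelimit-reset-requests carries seconds until the window resets; verify the actual header format against your gateway before relying on it:

```python
def retry_wait(headers, default=1.0):
    """Choose a retry delay from rate-limit headers, with a fallback.

    Assumes x-ratelimit-reset-requests is a number of seconds until
    the window resets.
    """
    reset = headers.get("x-ratelimit-reset-requests")
    if reset is not None:
        try:
            return max(float(reset), 0.0)
        except ValueError:
            pass  # unexpected header format; use the fallback
    return default
```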

Budget exceeded (429)

{
  "error": {
    "message": "Organization monthly budget of $500.00 exceeded",
    "type": "budget_error",
    "param": null,
    "code": "budget_exceeded"
  }
}

Budget resets at the start of the next period (daily/weekly/monthly). Increase the budget in Rate limiting & budgets or wait for the reset.

Guardrail blocked (403)

{
  "error": {
    "type": "guardrail_triggered",
    "code": "content_blocked",
    "message": "Request blocked by guardrail: pii-detector"
  }
}

The request or response triggered a guardrail in enforce mode. Check the x-prism-guardrail-triggered response header. See Guardrails for configuration.
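Since the triggering guardrail's name travels in that header, a 403 handler can surface it directly (helper name is illustrative):

```python
def blocked_by_guardrail(status, headers):
    """Return the triggering guardrail's name on a 403, else None."""
    if status == 403:
        return headers.get("x-prism-guardrail-triggered")
    return None
```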

Provider error (502)

{
  "error": {
    "message": "provider error (HTTP 404): ",
    "type": "upstream_error",
    "code": "provider_404"
  }
}

The gateway reached the provider but got an error back. Common causes:

  • Provider API key is invalid or expired
  • Project-scoped key doesn’t have model access enabled
  • Provider is experiencing an outage

Configure failover to automatically route to backup providers when this happens.


Retry strategies

Exponential backoff

The standard pattern for handling transient errors (429 and 5xx) is exponential backoff: wait longer after each failed attempt, up to a cap. The Prism SDK retries automatically when you configure RetryConfig:

from prism import Prism, GatewayConfig, RetryConfig

client = Prism(
    api_key="sk-prism-your-key",
    base_url="https://gateway.futureagi.com",
    config=GatewayConfig(
        retry=RetryConfig(
            max_retries=3,
            on_status_codes=[429, 500, 502, 503, 504],
            backoff_factor=0.5,
        ),
    ),
)

# Retries happen automatically on configured status codes
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)

The OpenAI SDK has built-in retry logic with exponential backoff:

from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.futureagi.com/v1",
    api_key="sk-prism-your-key",
    max_retries=3,  # built-in retry with backoff
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
If you're calling the HTTP API directly, implement the backoff yourself:

import time
import requests

def call_with_retry(max_attempts=4):
    for attempt in range(max_attempts):
        response = requests.post(
            "https://gateway.futureagi.com/v1/chat/completions",
            headers={
                "Authorization": "Bearer sk-prism-your-key",
                "Content-Type": "application/json",
            },
            json={
                "model": "gpt-4o",
                "messages": [{"role": "user", "content": "Hello"}],
            },
        )

        if response.status_code == 200:
            return response.json()

        if response.status_code in (429, 500, 502, 503, 504):
            if attempt < max_attempts - 1:
                wait = min(2 ** attempt, 30)  # 1s, 2s, 4s, capped at 30s
                print(f"Attempt {attempt + 1} failed ({response.status_code}), retrying in {wait}s")
                time.sleep(wait)
                continue

        # Non-retryable error or final attempt
        response.raise_for_status()

    raise Exception(f"Failed after {max_attempts} attempts")

What to retry

Status  Retry?  Why
400     No      Bad request, fix the input
401     No      Bad credentials, fix the API key
403     No      Blocked by guardrail or RBAC
404     No      Model not found, fix the model name
429     Yes     Rate limit, back off and retry
500     Yes     Internal error, may be transient
502     Yes     Provider error, may recover
503     Yes     Service unavailable, may recover
504     Yes     Timeout, may succeed on retry
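The table above collapses into a one-line predicate you can reuse in retry loops (helper name is illustrative):

```python
# Rate limits and transient 5xx errors are worth retrying; everything
# else signals a problem with the request itself.
RETRYABLE_STATUSES = {429, 500, 502, 503, 504}

def is_retryable(status):
    """True if the HTTP status may succeed on a later attempt."""
    return status in RETRYABLE_STATUSES
```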

Using failover instead of retry

For production systems, configure routing with failover instead of client-side retries. Prism automatically routes to the next provider on failure, which is faster than waiting and retrying the same provider.


Error handling in SDKs

Prism SDK exceptions

from prism import Prism, APIStatusError, RateLimitError, AuthenticationError

client = Prism(
    api_key="sk-prism-your-key",
    base_url="https://gateway.futureagi.com",
)

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}],
    )
except RateLimitError:
    print("Rate limited, back off and retry")
except AuthenticationError:
    print("Bad API key")
except APIStatusError as e:
    print(f"API error {e.status_code}: {e.message}")

OpenAI SDK exceptions

from openai import OpenAI, RateLimitError, AuthenticationError, APIError

client = OpenAI(
    base_url="https://gateway.futureagi.com/v1",
    api_key="sk-prism-your-key",
)

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}],
    )
except RateLimitError:
    print("Rate limited")
except AuthenticationError:
    print("Bad API key")
except APIError as e:
    print(f"API error {e.status_code}: {e.message}")
