Supported providers

All LLM providers Prism supports, how to add them, and how to switch providers at request time.

About

Prism supports 20+ cloud and self-hosted LLM providers through a unified OpenAI-compatible API. Add a provider once with its API key, then switch between providers by changing the model name in your request.

Cloud providers

| Provider | Type | api_format | Auth | Notes |
| --- | --- | --- | --- | --- |
| OpenAI | openai | openai | API key | Native format |
| Anthropic | anthropic | anthropic | API key | Auto-translated to OpenAI format |
| Google Gemini | gemini | gemini | API key | Auto-translated to OpenAI format |
| Google Vertex AI | vertexai | gemini | Bearer token | Uses GCP project/location headers |
| AWS Bedrock | bedrock | bedrock | SigV4 | Requires AWS region; cross-region failover supported |
| Azure OpenAI | azure | azure | API key | Requires api_version; supports Azure AD bearer auth |
| Cohere | cohere | cohere | API key | Auto-translated to OpenAI format |
| Groq | groq | openai | API key | OpenAI-compatible |
| Mistral AI | mistral | openai | API key | OpenAI-compatible |
| Together AI | together | openai | API key | OpenAI-compatible |
| Fireworks AI | fireworks | openai | API key | OpenAI-compatible |
| DeepInfra | deepinfra | openai | API key | OpenAI-compatible |
| Perplexity | perplexity | openai | API key | OpenAI-compatible |
| Cerebras | cerebras | openai | API key | OpenAI-compatible |
| xAI (Grok) | xai | openai | API key | OpenAI-compatible |
| OpenRouter | openrouter | openai | API key | OpenAI-compatible |
| Hugging Face | huggingface | openai | API key | Inference API |
| Anyscale | anyscale | openai | API key | OpenAI-compatible |
| Replicate | replicate | openai | API key | OpenAI-compatible |

Providers marked “OpenAI-compatible” use the same wire format as OpenAI, so no translation is needed. Providers with native formats (Anthropic, Gemini, Bedrock, Cohere) are translated automatically by Prism; your code stays identical regardless of which provider handles the request.
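As a rough sketch of the kind of translation the gateway performs (an illustration of the idea, not Prism's actual code): Anthropic's Messages API, for example, takes the system prompt as a top-level field rather than as a message role, and requires a `max_tokens` value. A simplified version of that mapping:

```python
def openai_to_anthropic(body: dict) -> dict:
    """Sketch of an OpenAI-to-Anthropic request translation.

    Anthropic's Messages API takes the system prompt as a top-level
    field and only user/assistant turns in `messages`. This is
    simplified: a real gateway also maps tool calls, stop reasons, etc.
    """
    system_parts = [m["content"] for m in body["messages"] if m["role"] == "system"]
    turns = [m for m in body["messages"] if m["role"] != "system"]
    out = {
        "model": body["model"],
        # Anthropic requires max_tokens; 1024 here is an arbitrary default.
        "max_tokens": body.get("max_tokens", 1024),
        "messages": turns,
    }
    if system_parts:
        out["system"] = "\n".join(system_parts)
    return out

req = {
    "model": "claude-sonnet-4-6",
    "messages": [
        {"role": "system", "content": "Be terse."},
        {"role": "user", "content": "Hello"},
    ],
}
print(openai_to_anthropic(req)["system"])  # Be terse.
```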

Tip

Prism supports all models from each provider, including new releases. Use any model name your provider supports.

Self-hosted providers

| Provider | Type | Notes |
| --- | --- | --- |
| Ollama | ollama | Auto-discovers models from /v1/models |
| vLLM | vllm | Auto-discovers models from /v1/models |
| LM Studio | lmstudio | OpenAI-compatible |
| HuggingFace TGI | tgi | OpenAI-compatible |
| LocalAI | localai | OpenAI-compatible |
| Any OpenAI-compatible server | – | Works with any server implementing the OpenAI API |

Note

Your self-hosted endpoint must be reachable from the Prism gateway. Use a tunnel (ngrok, Cloudflare Tunnel), a cloud VM with a public IP, or deploy behind a reverse proxy.
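A quick way to confirm reachability before registering an endpoint is to query its /v1/models route, which Ollama and vLLM expose (and which Prism uses for auto-discovery). The URL below is a placeholder:

```python
import json
import urllib.request

def models_url(base_url: str) -> str:
    # Normalize trailing slashes before appending the discovery route.
    return base_url.rstrip("/") + "/v1/models"

def list_models(base_url: str, timeout: float = 10.0) -> list[str]:
    """Return model IDs from an OpenAI-compatible /v1/models route."""
    with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
        payload = json.load(resp)
    return [m["id"] for m in payload.get("data", [])]

# Example (placeholder endpoint; replace with yours):
# print(list_models("https://your-ollama.example.com"))
```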


Adding a provider

  1. Go to Prism > Providers in the Future AGI dashboard
  2. Click Add Provider
  3. Select the provider from the list
  4. Enter your API key and any required settings
  5. Click Save

You can also add providers programmatically via the SDK:
Python:

```python
from prism import Prism

client = Prism(
    api_key="sk-prism-your-key",
    base_url="https://gateway.futureagi.com",
    control_plane_url="https://api.futureagi.com",
)

client.org_configs.create(
    org_id="your-org-id",
    config={
        "providers": {
            "openai": {
                "api_key": "sk-your-openai-key",
                "api_format": "openai",
                "models": ["gpt-4o", "gpt-4o-mini"],
            },
            "anthropic": {
                "api_key": "sk-ant-your-key",
                "api_format": "anthropic",
            },
        }
    }
)
```
TypeScript:

```typescript
import { Prism } from "@futureagi/prism";

const client = new Prism({
    apiKey: "sk-prism-your-key",
    baseUrl: "https://gateway.futureagi.com",
    controlPlaneUrl: "https://api.futureagi.com",
});

await client.orgConfigs.create({
    orgId: "your-org-id",
    config: {
        providers: {
            openai: {
                api_key: "sk-your-openai-key",
                api_format: "openai",
                models: ["gpt-4o", "gpt-4o-mini"],
            },
            anthropic: {
                api_key: "sk-ant-your-key",
                api_format: "anthropic",
            },
        },
    },
});
```

Warning

Provider API keys are stored encrypted and never exposed in API responses.


Switching providers at request time

Change the model name to route to a different provider. Same code, same API, different LLM.

Prism SDK (Python):

```python
from prism import Prism

client = Prism(
    api_key="sk-prism-your-key",
    base_url="https://gateway.futureagi.com",
)

# OpenAI
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}]
)

# Anthropic - same code, different model
response = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Hello"}]
)

# Google Gemini
response = client.chat.completions.create(
    model="gemini-2.0-flash",
    messages=[{"role": "user", "content": "Hello"}]
)
```
OpenAI SDK:

```python
from openai import OpenAI

# Works with the OpenAI SDK: just swap base_url and api_key
client = OpenAI(
    base_url="https://gateway.futureagi.com/v1",
    api_key="sk-prism-your-key",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}]
)
```
LiteLLM:

```python
import litellm

response = litellm.completion(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    api_key="sk-prism-your-key",
    base_url="https://gateway.futureagi.com/v1",
)
```
cURL:

```shell
curl -X POST https://gateway.futureagi.com/v1/chat/completions \
  -H "Authorization: Bearer sk-prism-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```
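All of these calls send the same OpenAI-style JSON body; only the model field changes. A minimal illustration in plain Python (no network):

```python
import json

def chat_payload(model: str, content: str) -> dict:
    # The same OpenAI-style body works for every provider behind the gateway.
    return {
        "model": model,
        "messages": [{"role": "user", "content": content}],
    }

openai_req = chat_payload("gpt-4o", "Hello")
anthropic_req = chat_payload("claude-sonnet-4-6", "Hello")

# Only the model name differs between the two requests.
assert {k: v for k, v in openai_req.items() if k != "model"} == \
       {k: v for k, v in anthropic_req.items() if k != "model"}
print(json.dumps(openai_req, indent=2))
```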

Self-hosted setup

Connect models running on your own infrastructure.

  1. Go to Prism > Providers
  2. Click Add Provider
  3. Enter your model’s public endpoint URL
  4. Enter the model name
  5. Click Save

Programmatic equivalent:
Python:

```python
from prism import Prism

client = Prism(
    api_key="sk-prism-your-key",
    base_url="https://gateway.futureagi.com",
    control_plane_url="https://api.futureagi.com",
)

client.org_configs.create(
    org_id="your-org-id",
    config={
        "providers": {
            "ollama": {
                "base_url": "https://your-ollama.example.com",
                "api_format": "openai",
                "type": "ollama",
                # models auto-discovered from /v1/models
            },
            "vllm": {
                "base_url": "https://your-vllm.example.com",
                "api_format": "openai",
                "type": "vllm",
                "models": ["meta-llama/Llama-3.1-8B-Instruct"],
            },
        }
    }
)
```
TypeScript:

```typescript
import { Prism } from "@futureagi/prism";

const client = new Prism({
    apiKey: "sk-prism-your-key",
    baseUrl: "https://gateway.futureagi.com",
    controlPlaneUrl: "https://api.futureagi.com",
});

await client.orgConfigs.create({
    orgId: "your-org-id",
    config: {
        providers: {
            ollama: {
                base_url: "https://your-ollama.example.com",
                api_format: "openai",
                type: "ollama",
            },
            vllm: {
                base_url: "https://your-vllm.example.com",
                api_format: "openai",
                type: "vllm",
                models: ["meta-llama/Llama-3.1-8B-Instruct"],
            },
        },
    },
});
```

Provider health

Prism monitors provider health automatically. It tracks response times, error rates, and availability. When a provider becomes unhealthy:

  1. The circuit breaker opens to stop sending requests to the failing provider
  2. Traffic fails over to healthy alternatives
  3. After a cooldown period, Prism sends probe requests to check recovery
  4. Once the provider responds successfully, it’s added back to the rotation

See Failover & circuit breaking for configuration details.
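The open/probe/close cycle above can be sketched as a minimal circuit breaker. This is an illustration of the pattern, not Prism's actual implementation, and the threshold and cooldown values are arbitrary:

```python
import time

class CircuitBreaker:
    """Toy closed/open/half-open breaker mirroring the steps above."""

    def __init__(self, failure_threshold: int = 3, cooldown_s: float = 30.0):
        self.failure_threshold = failure_threshold
        self.cooldown_s = cooldown_s
        self.failures = 0
        self.opened_at = None  # None means the breaker is closed

    def allow_request(self) -> bool:
        if self.opened_at is None:
            return True  # closed: provider considered healthy
        if time.monotonic() - self.opened_at >= self.cooldown_s:
            return True  # half-open: let one probe request through
        return False     # open: fail over to a healthy provider

    def record_success(self) -> None:
        # Probe succeeded: close the breaker, provider rejoins the rotation.
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = time.monotonic()  # open the breaker

breaker = CircuitBreaker(failure_threshold=2, cooldown_s=5.0)
breaker.record_failure()
breaker.record_failure()
print(breaker.allow_request())  # False: breaker is open
```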

