# Prism AI Gateway
A unified API gateway for 100+ LLM providers with built-in guardrails, intelligent routing, caching, cost controls, and full observability.
> **Warning:** The `prism-ai` Python package and the `@futureagi/prism` TypeScript package are being renamed. The current packages will continue to work but are deprecated; watch for the updated package names in an upcoming release.
## About
Prism is Future AGI’s AI Gateway. It sits between your application and LLM providers, giving you a single API that handles routing across 100+ providers, safety guardrails, response caching, cost tracking, and full observability.
> **Note:** Already using the OpenAI SDK? Set `base_url` to `https://gateway.futureagi.com/v1` and swap in your Prism API key; no other code changes are needed. Switch between 100+ providers by changing the model name.
## Quick look
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.futureagi.com/v1",
    api_key="sk-prism-your-api-key-here"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is the capital of France?"}]
)

print(response.choices[0].message.content)
```

```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://gateway.futureagi.com/v1',
  apiKey: 'sk-prism-your-api-key-here'
});

const response = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'What is the capital of France?' }]
});

console.log(response.choices[0].message.content);
```

```shell
curl -X POST https://gateway.futureagi.com/v1/chat/completions \
  -H "Authorization: Bearer sk-prism-your-api-key-here" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "What is the capital of France?"}]}'
```

## Features
- **Manage Providers**: connect 100+ cloud and self-hosted LLM providers
- **Set Up Guardrails**: add safety policies and content moderation
- **Configure Routing**: load balancing, failover, and conditional routing
- **Enable Caching**: reduce costs and latency with response caching
- **Cost Tracking**: monitor spend and set budget limits
- **Shadow Experiments**: mirror traffic to alternative models for zero-risk evaluation
- **Rate Limiting**: control request throughput to the gateway
- **MCP & A2A**: connect agents via MCP and A2A protocols
- **Streaming**: stream responses in real time
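As a rough illustration of how response caching cuts both cost and latency, the sketch below keys a cache on the normalized request, so identical chat requests reach the provider only once. The keying scheme here is an assumption for illustration, not Prism's actual implementation.

```python
import hashlib
import json

def cache_key(model: str, messages: list) -> str:
    # Canonicalize the request so semantically identical calls
    # produce the same key.
    canonical = json.dumps(
        {"model": model, "messages": messages},
        sort_keys=True, separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode()).hexdigest()

cache: dict = {}
provider_calls = 0

def call_provider(model: str, messages: list) -> str:
    # Stands in for a paid upstream request.
    global provider_calls
    provider_calls += 1
    return "Paris"

def cached_completion(model: str, messages: list) -> str:
    key = cache_key(model, messages)
    if key not in cache:
        cache[key] = call_provider(model, messages)  # cache miss: pay for the call
    return cache[key]  # cache hit: no provider cost, near-zero latency

msgs = [{"role": "user", "content": "What is the capital of France?"}]
cached_completion("gpt-4o-mini", msgs)
cached_completion("gpt-4o-mini", msgs)
print(provider_calls)  # the second call is served from cache
```

Only one upstream call is made for the two identical requests; any change to the model or messages produces a new key and a fresh provider call.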
## Supported providers
Prism connects to cloud providers, API services, and self-hosted models. Providers with different native APIs (Anthropic, Gemini, Bedrock, Cohere) are automatically translated to the standard OpenAI format — your code stays the same regardless of which provider handles the request.
| Provider | Type |
|---|---|
| OpenAI | Cloud API |
| Anthropic | Cloud API |
| Google Gemini | Cloud API |
| AWS Bedrock | Cloud API |
| Azure OpenAI | Cloud API |
| Cohere | Cloud API |
| Groq, Together AI, Fireworks | Cloud API |
| Mistral AI, DeepInfra, Perplexity | Cloud API |
| Cerebras, xAI, OpenRouter | Cloud API |
| Ollama, vLLM, LM Studio, TGI | Self-hosted |
See Manage Providers for the full list and configuration details.
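Because every provider speaks the same OpenAI-format API through Prism, switching providers is just a model-name change. A minimal sketch; the non-OpenAI model identifiers below are illustrative placeholders (see Manage Providers for the exact names Prism expects):

```python
def chat_payload(model: str, prompt: str) -> dict:
    # The same OpenAI-format request body works for every provider;
    # only the model name selects the backend.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# Model names other than "gpt-4o-mini" are placeholders, not confirmed identifiers.
for model in ("gpt-4o-mini", "claude-3-5-haiku", "gemini-1.5-flash"):
    payload = chat_payload(model, "What is the capital of France?")
    print(payload["model"])
```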
## Frequently asked questions
**Do I need to change my code?**
No. If you use the OpenAI SDK, just change `base_url` and `api_key`. All providers work through the same OpenAI-format API.
**Which providers are supported?**
100+ including OpenAI, Anthropic, Google Gemini, AWS Bedrock, Azure, Mistral, Groq, and self-hosted models via Ollama, vLLM, and LM Studio.
**What happens if a provider goes down?**
Prism automatically fails over to healthy backup providers. Configure routing policies with retries, circuit breaking, and failover order.
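The failover behavior described above can be sketched as a priority-ordered retry loop. This is an illustrative simplification, not Prism's routing engine, and the provider names are placeholders:

```python
import time

def call_with_failover(providers, send, retries=2, backoff=0.0):
    # Try providers in priority order; retry transient failures with
    # exponential backoff before falling through to the next backend.
    last_error = None
    for provider in providers:
        for attempt in range(retries + 1):
            try:
                return send(provider)
            except ConnectionError as exc:
                last_error = exc
                time.sleep(backoff * (2 ** attempt))
    raise RuntimeError("all providers failed") from last_error

def fake_send(provider: str) -> str:
    # Simulate the primary being down while the backup is healthy.
    if provider == "primary":
        raise ConnectionError("primary unavailable")
    return f"response from {provider}"

print(call_with_failover(["primary", "backup"], fake_send))  # prints "response from backup"
```

A real routing policy would also track per-provider health (circuit breaking) so a failing backend is skipped outright instead of being retried on every request.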
**Is my data stored?**
Prism does not store your prompts or completions by default. Caching is opt-in and configurable per organization.
**What's the latency overhead?**
Prism adds minimal latency to requests. The exact overhead depends on enabled features (guardrails add more than simple routing).
**Can I self-host Prism?**
Yes. Prism is distributed as a Go binary and Docker image. See the Self-Hosted Deployment guide.