About
Prism exposes 97 endpoints across 20+ categories. All inference endpoints live under /v1/ and follow the OpenAI API format. Admin endpoints live under /-/ and require an admin token.
Base URL
All endpoints are relative to your Prism gateway URL:
https://gateway.futureagi.com
Inference endpoints use the /v1/ prefix and accept your virtual API key (sk-prism-...) as a Bearer token. Admin endpoints use the /-/ prefix and require the admin token.
Chat and completions
The primary endpoints for generating text with LLMs.
| Method | Path | Description |
|---|
| POST | /v1/chat/completions | Chat completion (streaming and non-streaming) |
| POST | /v1/completions | Text completion (legacy) |
| POST | /v1/count_tokens | Count tokens for a set of messages |
Embeddings, reranking, and search
| Method | Path | Description |
|---|
| POST | /v1/embeddings | Generate text embeddings |
| POST | /v1/rerank | Rerank text passages by relevance |
| POST | /v1/search | Search API |
| POST | /v1/ocr | Optical character recognition |
Audio
| Method | Path | Description |
|---|
| POST | /v1/audio/speech | Text-to-speech |
| POST | /v1/audio/speech/stream | Streaming text-to-speech |
| POST | /v1/audio/transcriptions | Speech-to-text (Whisper) |
| POST | /v1/audio/translations | Translate audio to English |
Images and video
| Method | Path | Description |
|---|
| POST | /v1/images/generations | Generate images from prompts |
| POST | /v1/videos | Submit video generation job |
| GET | /v1/videos | List video jobs |
| GET | /v1/videos/{video_id} | Get video job status |
| DELETE | /v1/videos/{video_id} | Cancel video job |
Files
| Method | Path | Description |
|---|
| POST | /v1/files | Upload a file |
| GET | /v1/files | List files |
| GET | /v1/files/{file_id} | Get file metadata |
| GET | /v1/files/{file_id}/content | Download file content |
| DELETE | /v1/files/{file_id} | Delete a file |
Vector stores
Used with the Assistants API for file-based retrieval.
| Method | Path | Description |
|---|
| POST | /v1/vector_stores | Create vector store |
| GET | /v1/vector_stores | List vector stores |
| GET | /v1/vector_stores/{id} | Get vector store |
| POST | /v1/vector_stores/{id} | Update vector store |
| DELETE | /v1/vector_stores/{id} | Delete vector store |
| POST | /v1/vector_stores/{id}/search | Search a vector store |
| POST | /v1/vector_stores/{id}/files | Add file to vector store |
| GET | /v1/vector_stores/{id}/files | List files in vector store |
| DELETE | /v1/vector_stores/{id}/files/{file_id} | Remove file from vector store |
| POST | /v1/vector_stores/{id}/file_batches | Batch add files |
Assistants API
Full proxy for the OpenAI Assistants API. Create assistants, manage threads, send messages, and execute runs.
Assistants
| Method | Path | Description |
|---|
| POST | /v1/assistants | Create assistant |
| GET | /v1/assistants | List assistants |
| GET | /v1/assistants/{id} | Get assistant |
| POST | /v1/assistants/{id} | Update assistant |
| DELETE | /v1/assistants/{id} | Delete assistant |
Threads
| Method | Path | Description |
|---|
| POST | /v1/threads | Create thread |
| GET | /v1/threads/{id} | Get thread |
| POST | /v1/threads/{id} | Update thread |
| DELETE | /v1/threads/{id} | Delete thread |
Messages
| Method | Path | Description |
|---|
| POST | /v1/threads/{id}/messages | Add message |
| GET | /v1/threads/{id}/messages | List messages |
| GET | /v1/threads/{id}/messages/{msg_id} | Get message |
| POST | /v1/threads/{id}/messages/{msg_id} | Update message |
| DELETE | /v1/threads/{id}/messages/{msg_id} | Delete message |
Runs
| Method | Path | Description |
|---|
| POST | /v1/threads/{id}/runs | Create run |
| GET | /v1/threads/{id}/runs | List runs |
| GET | /v1/threads/{id}/runs/{run_id} | Get run |
| POST | /v1/threads/{id}/runs/{run_id} | Update run |
| POST | /v1/threads/{id}/runs/{run_id}/cancel | Cancel run |
| POST | /v1/threads/{id}/runs/{run_id}/submit_tool_outputs | Submit tool outputs |
| GET | /v1/threads/{id}/runs/{run_id}/steps | List run steps |
| GET | /v1/threads/{id}/runs/{run_id}/steps/{step_id} | Get run step |
| POST | /v1/threads/runs | Create thread and run in one call |
Responses API
| Method | Path | Description |
|---|
| POST | /v1/responses | Create response |
| GET | /v1/responses/{id} | Get response |
| DELETE | /v1/responses/{id} | Delete response |
Async inference
| Method | Path | Description |
|---|
| GET | /v1/async/{job_id} | Get async job status and result |
| DELETE | /v1/async/{job_id} | Cancel async job |
Async jobs are created by sending a regular chat completion request with async mode enabled. The batch API is available via admin endpoints below.
Scheduled completions
| Method | Path | Description |
|---|
| POST | /v1/scheduled | Schedule a completion for later |
| GET | /v1/scheduled | List scheduled jobs |
| GET | /v1/scheduled/{job_id} | Get scheduled job |
| DELETE | /v1/scheduled/{job_id} | Cancel scheduled job |
Realtime (WebSocket)
| Method | Path | Description |
|---|
| GET | /v1/realtime | Upgrade to WebSocket for real-time audio/video streaming |
For clients that prefer a provider’s native API format instead of the OpenAI format.
| Method | Path | Description |
|---|
| POST | /v1/messages | Anthropic Messages API (native format) |
| POST | /v1/messages/count_tokens | Anthropic token counting |
| POST | /v1beta/models/{model}:generateContent | Google GenAI generate content |
| POST | /v1beta/models/{model}:streamGenerateContent | Google GenAI streaming |
Models
| Method | Path | Description |
|---|
| GET | /v1/models | List all available models |
| GET | /v1/models/{model} | Get model details |
MCP (Model Context Protocol)
Prism acts as an MCP server, aggregating tools from upstream MCP tool servers.
| Method | Path | Description |
|---|
| POST | /mcp | MCP protocol endpoint |
| GET | /mcp | MCP SSE streaming endpoint |
Management
| Method | Path | Description |
|---|
| GET | /-/mcp/status | MCP server status and stats |
| GET | /-/mcp/tools | List available tools |
| GET | /-/mcp/resources | List MCP resources |
| GET | /-/mcp/prompts | List MCP prompts |
| POST | /-/mcp/test | Test tool execution |
A2A (Agent-to-Agent)
| Method | Path | Description |
|---|
| GET | /.well-known/agent.json | Agent capabilities card |
| POST | /a2a | A2A protocol messages |
| GET | /v1/agents | List registered A2A agents |
Admin: key management
Requires admin token.
| Method | Path | Description |
|---|
| POST | /-/keys | Create API key |
| GET | /-/keys | List keys |
| GET | /-/keys/{key_id} | Get key details |
| PUT | /-/keys/{key_id} | Update key |
| DELETE | /-/keys/{key_id} | Revoke key |
| POST | /-/keys/{key_id}/credits | Add credits to key |
Admin: organization config
| Method | Path | Description |
|---|
| GET | /-/orgs/{org_id}/config | Get org config |
| PUT | /-/orgs/{org_id}/config | Set org config |
| DELETE | /-/orgs/{org_id}/config | Delete org config |
| GET | /-/orgs/configs | List all org configs |
| POST | /-/orgs/configs/bulk | Bulk load configs |
Admin: operations
| Method | Path | Description |
|---|
| GET | /-/cluster/nodes | List cluster nodes |
| POST | /-/admin/providers/{id}/rotate | Start key rotation |
| GET | /-/admin/providers/{id}/rotation | Get rotation status |
| POST | /-/admin/providers/{id}/rotate/promote | Promote rotated key |
| POST | /-/admin/providers/{id}/rotate/rollback | Rollback rotation |
| POST | /-/batches | Submit batch job |
| GET | /-/batches/{batch_id} | Get batch status |
| POST | /-/batches/{batch_id}/cancel | Cancel batch |
| GET | /-/shadow/stats | Shadow testing statistics |
Health and diagnostics
| Method | Path | Description |
|---|
| GET | /healthz | Liveness probe |
| GET | /livez | Liveness probe (alias) |
| GET | /readyz | Readiness probe |
| POST | /-/reload | Reload config from file |
| GET | /-/config | Server config summary |
| GET | /-/metrics | Prometheus metrics |
| GET | /-/health/providers | Provider health status |
| GET | /-/health/providers/{org_id} | Org-specific provider health |
Next Steps