Endpoints overview

Complete list of all API endpoints available through the Prism Gateway.

About

Prism exposes 97 endpoints across 20+ categories. All inference endpoints live under /v1/ and follow the OpenAI API format. Admin endpoints live under /-/ and require an admin token.

Base URL

All endpoints are relative to your Prism gateway URL:

https://gateway.futureagi.com

Inference endpoints use the /v1/ prefix and accept your virtual API key (sk-prism-...) as a Bearer token. Admin endpoints use the /-/ prefix and require the admin token.

Chat and completions

The primary endpoints for generating text with LLMs.

Method	Path	Description
POST	`/v1/chat/completions`	Chat completion (streaming and non-streaming)
POST	`/v1/completions`	Text completion (legacy)
POST	`/v1/count_tokens`	Count tokens for a set of messages

Embeddings, reranking, and search

Method	Path	Description
POST	`/v1/embeddings`	Generate text embeddings
POST	`/v1/rerank`	Rerank text passages by relevance
POST	`/v1/search`	Search API
POST	`/v1/ocr`	Optical character recognition

Audio

Method	Path	Description
POST	`/v1/audio/speech`	Text-to-speech
POST	`/v1/audio/speech/stream`	Streaming text-to-speech
POST	`/v1/audio/transcriptions`	Speech-to-text (Whisper)
POST	`/v1/audio/translations`	Translate audio to English

Images and video

Method	Path	Description
POST	`/v1/images/generations`	Generate images from prompts
POST	`/v1/videos`	Submit video generation job
GET	`/v1/videos`	List video jobs
GET	`/v1/videos/{video_id}`	Get video job status
DELETE	`/v1/videos/{video_id}`	Cancel video job

Files

Method	Path	Description
POST	`/v1/files`	Upload a file
GET	`/v1/files`	List files
GET	`/v1/files/{file_id}`	Get file metadata
GET	`/v1/files/{file_id}/content`	Download file content
DELETE	`/v1/files/{file_id}`	Delete a file

Vector stores

Used with the Assistants API for file-based retrieval.

Method	Path	Description
POST	`/v1/vector_stores`	Create vector store
GET	`/v1/vector_stores`	List vector stores
GET	`/v1/vector_stores/{id}`	Get vector store
POST	`/v1/vector_stores/{id}`	Update vector store
DELETE	`/v1/vector_stores/{id}`	Delete vector store
POST	`/v1/vector_stores/{id}/search`	Search a vector store
POST	`/v1/vector_stores/{id}/files`	Add file to vector store
GET	`/v1/vector_stores/{id}/files`	List files in vector store
DELETE	`/v1/vector_stores/{id}/files/{file_id}`	Remove file from vector store
POST	`/v1/vector_stores/{id}/file_batches`	Batch add files

Assistants API

Full proxy for the OpenAI Assistants API. Create assistants, manage threads, send messages, and execute runs.

Assistants

Method	Path	Description
POST	`/v1/assistants`	Create assistant
GET	`/v1/assistants`	List assistants
GET	`/v1/assistants/{id}`	Get assistant
POST	`/v1/assistants/{id}`	Update assistant
DELETE	`/v1/assistants/{id}`	Delete assistant

Threads

Method	Path	Description
POST	`/v1/threads`	Create thread
GET	`/v1/threads/{id}`	Get thread
POST	`/v1/threads/{id}`	Update thread
DELETE	`/v1/threads/{id}`	Delete thread

Messages

Method	Path	Description
POST	`/v1/threads/{id}/messages`	Add message
GET	`/v1/threads/{id}/messages`	List messages
GET	`/v1/threads/{id}/messages/{msg_id}`	Get message
POST	`/v1/threads/{id}/messages/{msg_id}`	Update message
DELETE	`/v1/threads/{id}/messages/{msg_id}`	Delete message

Runs

Method	Path	Description
POST	`/v1/threads/{id}/runs`	Create run
GET	`/v1/threads/{id}/runs`	List runs
GET	`/v1/threads/{id}/runs/{run_id}`	Get run
POST	`/v1/threads/{id}/runs/{run_id}`	Update run
POST	`/v1/threads/{id}/runs/{run_id}/cancel`	Cancel run
POST	`/v1/threads/{id}/runs/{run_id}/submit_tool_outputs`	Submit tool outputs
GET	`/v1/threads/{id}/runs/{run_id}/steps`	List run steps
GET	`/v1/threads/{id}/runs/{run_id}/steps/{step_id}`	Get run step
POST	`/v1/threads/runs`	Create thread and run in one call

Responses API

Method	Path	Description
POST	`/v1/responses`	Create response
GET	`/v1/responses/{id}`	Get response
DELETE	`/v1/responses/{id}`	Delete response

Async inference

Method	Path	Description
GET	`/v1/async/{job_id}`	Get async job status and result
DELETE	`/v1/async/{job_id}`	Cancel async job

Async jobs are created by sending a regular chat completion request with async mode enabled. The batch API is available via admin endpoints below.

Scheduled completions

Method	Path	Description
POST	`/v1/scheduled`	Schedule a completion for later
GET	`/v1/scheduled`	List scheduled jobs
GET	`/v1/scheduled/{job_id}`	Get scheduled job
DELETE	`/v1/scheduled/{job_id}`	Cancel scheduled job

Realtime (WebSocket)

Method	Path	Description
GET	`/v1/realtime`	Upgrade to WebSocket for real-time audio/video streaming

Native format passthrough

For clients that prefer a provider’s native API format instead of the OpenAI format.

Method	Path	Description
POST	`/v1/messages`	Anthropic Messages API (native format)
POST	`/v1/messages/count_tokens`	Anthropic token counting
POST	`/v1beta/models/{model}:generateContent`	Google GenAI generate content
POST	`/v1beta/models/{model}:streamGenerateContent`	Google GenAI streaming

Models

Method	Path	Description
GET	`/v1/models`	List all available models
GET	`/v1/models/{model}`	Get model details

MCP (Model Context Protocol)

Prism acts as an MCP server, aggregating tools from upstream MCP tool servers.

Method	Path	Description
POST	`/mcp`	MCP protocol endpoint
GET	`/mcp`	MCP SSE streaming endpoint

Management

Method	Path	Description
GET	`/-/mcp/status`	MCP server status and stats
GET	`/-/mcp/tools`	List available tools
GET	`/-/mcp/resources`	List MCP resources
GET	`/-/mcp/prompts`	List MCP prompts
POST	`/-/mcp/test`	Test tool execution

A2A (Agent-to-Agent)

Method	Path	Description
GET	`/.well-known/agent.json`	Agent capabilities card
POST	`/a2a`	A2A protocol messages
GET	`/v1/agents`	List registered A2A agents

Admin: key management

Requires admin token.

Method	Path	Description
POST	`/-/keys`	Create API key
GET	`/-/keys`	List keys
GET	`/-/keys/{key_id}`	Get key details
PUT	`/-/keys/{key_id}`	Update key
DELETE	`/-/keys/{key_id}`	Revoke key
POST	`/-/keys/{key_id}/credits`	Add credits to key

Admin: organization config

Method	Path	Description
GET	`/-/orgs/{org_id}/config`	Get org config
PUT	`/-/orgs/{org_id}/config`	Set org config
DELETE	`/-/orgs/{org_id}/config`	Delete org config
GET	`/-/orgs/configs`	List all org configs
POST	`/-/orgs/configs/bulk`	Bulk load configs

Admin: operations

Method	Path	Description
GET	`/-/cluster/nodes`	List cluster nodes
POST	`/-/admin/providers/{id}/rotate`	Start key rotation
GET	`/-/admin/providers/{id}/rotation`	Get rotation status
POST	`/-/admin/providers/{id}/rotate/promote`	Promote rotated key
POST	`/-/admin/providers/{id}/rotate/rollback`	Rollback rotation
POST	`/-/batches`	Submit batch job
GET	`/-/batches/{batch_id}`	Get batch status
POST	`/-/batches/{batch_id}/cancel`	Cancel batch
GET	`/-/shadow/stats`	Shadow testing statistics

Health and diagnostics

Method	Path	Description
GET	`/healthz`	Liveness probe
GET	`/livez`	Liveness probe (alias)
GET	`/readyz`	Readiness probe
POST	`/-/reload`	Reload config from file
GET	`/-/config`	Server config summary
GET	`/-/metrics`	Prometheus metrics
GET	`/-/health/providers`	Provider health status
GET	`/-/health/providers/{org_id}`	Org-specific provider health

Endpoints overview

About

Base URL

Chat and completions

Embeddings, reranking, and search

Audio

Images and video

Files

Vector stores

Assistants API

Assistants

Threads

Messages

Runs

Responses API

Async inference

Scheduled completions

Realtime (WebSocket)

Native format passthrough

Models

MCP (Model Context Protocol)

Management

A2A (Agent-to-Agent)

Admin: key management

Admin: organization config

Admin: operations

Health and diagnostics

Next Steps

How it works

Quickstart

Supported providers

Routing

Questions & Discussion

FutureAGI AI Assistant

About

Base URL

Chat and completions

Embeddings, reranking, and search

Audio

Images and video

Files

Vector stores

Assistants API

Assistants

Threads

Messages

Runs

Responses API

Async inference

Scheduled completions

Realtime (WebSocket)

Native format passthrough

Models

MCP (Model Context Protocol)

Management

A2A (Agent-to-Agent)

Admin: key management

Admin: organization config

Admin: operations

Health and diagnostics

Next Steps

How it works

Quickstart

Supported providers

Routing

Questions & Discussion