Rate Limiting and Budget Controls in Agent Command Center

Set per-key, per-org, and global RPM limits. Enforce monthly spend budgets and per-key credit balances to prevent runaway costs and protect provider quotas.

About

Rate limiting caps how many requests (RPM) and, where configured, tokens (TPM) a key or org can use per minute. Budgets cap how much money can be spent per period. Credits give individual keys a prepaid USD balance. All three work together to prevent runaway costs and protect provider quotas.

When to use

Prevent abuse: Cap RPM per key so one user can’t monopolize gateway capacity
Control spending: Set monthly budgets per org so teams can’t exceed their allocation
Reseller billing: Give each customer key a credit balance that auto-deducts per request
Protect provider quotas: Global RPM limits prevent hitting provider rate limits

Rate limiting

Agent Command Center supports rate limits at three levels: global, per-org, and per-key.

Level	Scope	How to set
Global	All requests to the gateway	`config.yaml`
Per-org	All requests from one organization	Org config via admin API
Per-key	Requests using a specific API key	Key config (RPM and TPM)

The most restrictive limit applies. If the global limit is 1000 RPM and a key’s limit is 100 RPM, that key is capped at 100 RPM.
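
The rule above amounts to taking the minimum across whichever levels have a limit configured. The following is a minimal sketch of that rule, not the gateway's implementation; the function name `effective_rpm` is hypothetical, and representing an unset level as `None` is an assumption.

```python
from typing import Optional


def effective_rpm(*limits: Optional[int]) -> Optional[int]:
    """Return the most restrictive (smallest) configured limit.

    Levels with no limit configured are passed as None and ignored.
    Returns None when no level sets a limit.
    """
    configured = [limit for limit in limits if limit is not None]
    return min(configured) if configured else None


# Global 1000 RPM, no org limit, key limit 100 RPM
print(effective_rpm(1000, None, 100))  # 100
```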

Configuration

Go to Agent Command Center > Rate Limits in the Future AGI dashboard to set global and per-org limits.

Per-key limits are set when creating or editing a key in Settings > API Keys.

Python:

from agentcc import AgentCC

client = AgentCC(
    api_key="sk-agentcc-your-key",
    base_url="https://gateway.futureagi.com",
    control_plane_url="https://api.futureagi.com",
)

# Set per-org rate limits
client.org_configs.create(
    org_id="your-org-id",
    config={
        "rate_limiting": {
            "enabled": True,
            "rpm": 500,     # requests per minute for this org
            "tpm": 100000,  # tokens per minute for this org
        }
    }
)

TypeScript:

import { AgentCC } from "@futureagi/agentcc";

const client = new AgentCC({
    apiKey: "sk-agentcc-your-key",
    baseUrl: "https://gateway.futureagi.com",
    controlPlaneUrl: "https://api.futureagi.com",
});

await client.orgConfigs.create({
    orgId: "your-org-id",
    config: {
        rate_limiting: {
            enabled: true,
            rpm: 500,
            tpm: 100000,
        },
    },
});

Self-hosted config.yaml:

# Global rate limit (all requests)
rate_limiting:
  enabled: true
  global_rpm: 1000

# Per-key limits are set on the key itself
auth:
  keys:
    - name: "limited-key"
      key: "sk-agentcc-..."
      rate_limit_rpm: 100
      rate_limit_tpm: 50000

Response headers

Every response includes rate limit headers:

Header	Description
`X-Ratelimit-Limit-Requests`	Maximum requests allowed per minute
`X-Ratelimit-Remaining-Requests`	Requests remaining in the current window
`X-Ratelimit-Reset-Requests`	Unix timestamp when the window resets
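
A client can read these headers to decide how long to pause before sending more traffic. The sketch below works on any plain header mapping; the helper names are hypothetical, and it assumes the reset value is a Unix timestamp in seconds and that the remaining count is a plain integer string.

```python
import time
from typing import Mapping, Optional


def _get(headers: Mapping[str, str], name: str) -> Optional[str]:
    """Case-insensitive header lookup (HTTP header names ignore case)."""
    for key, value in headers.items():
        if key.lower() == name.lower():
            return value
    return None


def remaining_requests(headers: Mapping[str, str]) -> Optional[int]:
    """Requests left in the current window, or None if the header is absent."""
    value = _get(headers, "X-Ratelimit-Remaining-Requests")
    return int(value) if value is not None else None


def seconds_until_reset(headers: Mapping[str, str],
                        now: Optional[float] = None) -> float:
    """Seconds until the window resets; 0.0 if absent or already past."""
    value = _get(headers, "X-Ratelimit-Reset-Requests")
    if value is None:
        return 0.0
    current = time.time() if now is None else now
    # Assumption: the timestamp is expressed in seconds, not milliseconds.
    return max(0.0, float(value) - current)


sample = {
    "x-ratelimit-limit-requests": "100",
    "x-ratelimit-remaining-requests": "0",
    "x-ratelimit-reset-requests": "1700000060",
}
print(remaining_requests(sample))                     # 0
print(seconds_until_reset(sample, now=1700000000.0))  # 60.0
```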

Error response (429)

When a rate limit is exceeded:

{
  "error": {
    "type": "rate_limit_exceeded",
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Please retry after the window resets."
  }
}

Retry logic

Python (agentcc SDK):

import time
from agentcc import AgentCC, RateLimitError

client = AgentCC(
    api_key="sk-agentcc-your-key",
    base_url="https://gateway.futureagi.com",
)

def call_with_retry(max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="gpt-4o",
                messages=[{"role": "user", "content": "Hello"}],
            )
        except RateLimitError:
            if attempt < max_retries - 1:
                time.sleep(2 ** attempt)  # backoff: 1s, then 2s (no sleep after the final attempt)
                continue
            raise

result = call_with_retry()

Python (OpenAI SDK):

import time
from openai import OpenAI, RateLimitError

client = OpenAI(
    base_url="https://gateway.futureagi.com/v1",
    api_key="sk-agentcc-your-key",
)

def call_with_retry(max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="gpt-4o",
                messages=[{"role": "user", "content": "Hello"}],
            )
        except RateLimitError:
            if attempt < max_retries - 1:
                time.sleep(2 ** attempt)
                continue
            raise

result = call_with_retry()
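
When many clients hit the limit at the same moment, a fixed `2 ** attempt` schedule can make them all retry in lockstep. A common mitigation is to add random jitter and an upper bound to the delay. The sketch below shows one such variant ("full jitter"); it is a swapped-in technique rather than something the gateway or SDK prescribes, and `backoff_delay`, `base`, and `cap` are hypothetical names.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0,
                  rng: Optional[random.Random] = None) -> float:
    """Full-jitter exponential backoff.

    Returns a delay drawn uniformly from [0, min(cap, base * 2**attempt)].
    Pass a seeded random.Random as `rng` for reproducible delays in tests.
    """
    source = rng if rng is not None else random
    upper = min(cap, base * (2 ** attempt))
    return source.uniform(0.0, upper)


# Example: delays for the first four attempts (values vary per run)
for attempt in range(4):
    print(attempt, round(backoff_delay(attempt), 2))
```

In the retry loops above, `time.sleep(2 ** attempt)` could be replaced with `time.sleep(backoff_delay(attempt))`.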

# Check rate limit headers with -i flag
curl -i -X POST https://gateway.futureagi.com/v1/chat/completions \
  -H "Authorization: Bearer sk-agentcc-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
# Look for X-Ratelimit-Remaining-Requests in the response headers

Budgets

Set spending limits per org, per key, per user, or per model. Budgets can be daily, weekly, monthly, or total.

Setting	Description
`period`	`daily`, `weekly`, `monthly`, or `total`
`limit`	USD amount
`action`	`block` (hard limit, reject requests) or `warn` (soft limit, log warning)
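
The difference between the two `action` values can be sketched as a small decision function. This is an illustration of the semantics in the table, not the gateway's code: the `Budget` class and `check_budget` function are hypothetical, and treating spend equal to the limit as exceeded is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    period: str   # "daily", "weekly", "monthly", or "total"
    limit: float  # USD
    action: str   # "block" or "warn"


def check_budget(budget: Budget, spent: float) -> str:
    """Return "allow", "warn", or "block" for the current spend."""
    # Assumption: reaching the limit exactly counts as exceeding it.
    if spent < budget.limit:
        return "allow"
    # Hard limit rejects the request; soft limit lets it through with a warning.
    return "block" if budget.action == "block" else "warn"


monthly = Budget(period="monthly", limit=500.00, action="block")
print(check_budget(monthly, 120.50))  # allow
print(check_budget(monthly, 500.00))  # block
```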

Go to Agent Command Center > Budgets in the Future AGI dashboard to set org-level budgets and alerts.

Python:

# Reuses the client created in the rate limiting configuration example
client.org_configs.create(
    org_id="your-org-id",
    config={
        "budgets": {
            "enabled": True,
            "org_budget": {
                "period": "monthly",
                "limit": 500.00,
                "action": "block",
            }
        }
    }
)

TypeScript:

// Reuses the client created in the rate limiting configuration example
await client.orgConfigs.create({
    orgId: "your-org-id",
    config: {
        budgets: {
            enabled: true,
            org_budget: {
                period: "monthly",
                limit: 500.00,
                action: "block",
            },
        },
    },
});

Self-hosted config.yaml:

budgets:
  enabled: true
  org_budget:
    period: monthly
    limit: 500.00
    action: block

When a budget is exceeded with `action: block`, new requests return:

{
  "error": {
    "type": "budget_exceeded",
    "code": "rate_limit_exceeded",
    "message": "Organization monthly budget of $500.00 exceeded"
  }
}
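
In the sample bodies on this page, the budget error and the rate limit error carry the same `code` and differ only in `type`, so a client that wants to treat them differently can branch on `type`. A minimal sketch, assuming the body shapes shown here (the function name `classify_limit_error` and its return labels are hypothetical):

```python
import json


def classify_limit_error(body: str) -> str:
    """Return "budget", "rate_limit", or "other" based on error.type."""
    error = json.loads(body).get("error", {})
    error_type = error.get("type")
    if error_type == "budget_exceeded":
        return "budget"
    if error_type == "rate_limit_exceeded":
        return "rate_limit"
    return "other"


budget_body = json.dumps({
    "error": {
        "type": "budget_exceeded",
        "code": "rate_limit_exceeded",
        "message": "Organization monthly budget of $500.00 exceeded",
    }
})
print(classify_limit_error(budget_body))  # budget
```

A rate limit error is usually worth retrying after a short wait, whereas a blocked budget is unlikely to clear until the period rolls over or the limit is raised, so retrying it with backoff is probably not useful.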

Managed key credits

Managed keys have a USD credit balance that auto-deducts the cost of each request. When credits run out, requests are blocked.

Create a managed key with credits:

curl -X POST https://gateway.futureagi.com/-/keys \
  -H "Authorization: Bearer your-admin-token" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "customer-key",
    "key_type": "managed",
    "credit_balance": 25.00
  }'

Add more credits:

curl -X POST "https://gateway.futureagi.com/-/keys/key_123/credits" \
  -H "Authorization: Bearer your-admin-token" \
  -H "Content-Type: application/json" \
  -d '{"amount": 50.00}'

The remaining balance is returned in the `x-agentcc-credits-remaining` response header on every request made with a managed key.
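
A reseller integration might watch this header to trigger a top-up before a customer key runs dry. The sketch below is one way to do that on a plain header mapping; it assumes the header value is a bare decimal USD amount, and the helper names and the 5.00 threshold are hypothetical.

```python
from typing import Mapping, Optional

CREDITS_HEADER = "x-agentcc-credits-remaining"


def credits_remaining(headers: Mapping[str, str]) -> Optional[float]:
    """Parse the remaining credit balance, or None if the header is absent."""
    for name, value in headers.items():
        if name.lower() == CREDITS_HEADER:
            # Assumption: the value is a plain decimal such as "12.34".
            return float(value)
    return None


def is_low_balance(headers: Mapping[str, str], threshold: float = 5.0) -> bool:
    """True when a balance is reported and it is below the threshold."""
    balance = credits_remaining(headers)
    return balance is not None and balance < threshold


print(is_low_balance({"X-AgentCC-Credits-Remaining": "3.20"}))   # True
print(is_low_balance({"X-AgentCC-Credits-Remaining": "24.75"}))  # False
```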
