Cost tracking
Track LLM costs per request, attribute spend by team and feature, and configure budget alerts.
About
Agent Command Center automatically calculates the cost of every request from its token usage and the model's pricing. The cost is returned in the x-agentcc-cost response header and exposed through the response.agentcc.cost accessor in the SDK. No setup is required.
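For clients that read the header directly, the value arrives as a string and needs to be parsed. The following is a minimal sketch, assuming the header carries a plain decimal number such as 0.00015; the helper name parse_cost is hypothetical and not part of any SDK.

```python
from decimal import Decimal, InvalidOperation


def parse_cost(headers):
    """Return the request cost as a Decimal, or None if absent or unparseable.

    `headers` is any mapping of header names to string values. A plain dict
    is case-sensitive, so this sketch expects the lowercase header name.
    """
    raw = headers.get("x-agentcc-cost")
    if raw is None:
        return None
    try:
        # Decimal avoids the rounding drift that floats show on tiny amounts.
        return Decimal(raw.strip())
    except InvalidOperation:
        return None


cost = parse_cost({"x-agentcc-cost": "0.00015"})
print(cost)
```

Decimal is used here instead of float so that summing many small per-request costs does not accumulate rounding error.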
Cost is calculated as:
cost = (input_tokens * input_price_per_token) + (output_tokens * output_price_per_token)
Exact cache hits return x-agentcc-cost: 0 since no provider call was made.
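The formula can be worked through with a short sketch. The per-token prices below are hypothetical placeholders chosen for illustration, not actual pricing for any model, and request_cost is an illustrative helper, not a gateway or SDK function.

```python
from decimal import Decimal

# Hypothetical prices in USD per token, for illustration only.
INPUT_PRICE_PER_TOKEN = Decimal("0.0000025")
OUTPUT_PRICE_PER_TOKEN = Decimal("0.00001")


def request_cost(input_tokens, output_tokens, cache_hit=False):
    """Apply the cost formula; an exact cache hit costs nothing."""
    if cache_hit:
        # No provider call was made, so there is nothing to bill.
        return Decimal("0")
    return (input_tokens * INPUT_PRICE_PER_TOKEN
            + output_tokens * OUTPUT_PRICE_PER_TOKEN)


# 1,000 input tokens and 500 output tokens at the assumed prices:
# (1000 * 0.0000025) + (500 * 0.00001) = 0.0025 + 0.005 = 0.0075
print(request_cost(1000, 500))
print(request_cost(1000, 500, cache_hit=True))
```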
Reading cost per request
from agentcc import AgentCC
client = AgentCC(
    api_key="sk-agentcc-your-key",
    base_url="https://gateway.futureagi.com",
)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
print(f"Cost: ${response.agentcc.cost}")
print(f"Provider: {response.agentcc.provider}")
print(f"Model: {response.agentcc.model_used}")

The Agent Command Center SDK also tracks cumulative cost across all requests made with a client:
# After several requests...
print(f"Total session cost: ${client.current_cost:.4f}")
# Reset the counter
client.reset_cost()

With the OpenAI SDK, read the cost from the raw response headers:
from openai import OpenAI
client = OpenAI(
    base_url="https://gateway.futureagi.com/v1",
    api_key="sk-agentcc-your-key",
)
# with_raw_response exposes the HTTP headers alongside the parsed body
raw = client.chat.completions.with_raw_response.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
print(f"Cost: ${raw.headers.get('x-agentcc-cost')}")
print(f"Provider: {raw.headers.get('x-agentcc-provider')}")

With cURL, pass -i to include the response headers in the output:
curl -i https://gateway.futureagi.com/v1/chat/completions \
    -H "Authorization: Bearer sk-agentcc-your-key" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "gpt-4o",
      "messages": [{"role": "user", "content": "Hello"}]
    }'
# Look for: x-agentcc-cost: 0.00015

Cost attribution
Tag requests with metadata to break down costs by team, feature, user, or any custom dimension. Metadata is indexed and queryable in the analytics dashboard.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    request_metadata={"team": "data-science", "feature": "recommendations", "user": "alice"},
)

With the OpenAI SDK, send the same metadata as a JSON-encoded x-agentcc-metadata header:
import json
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    extra_headers={
        "x-agentcc-metadata": json.dumps({"team": "data-science", "feature": "recommendations", "user": "alice"}),
    },
)

With cURL, set the header directly:
curl https://gateway.futureagi.com/v1/chat/completions \
    -H "Authorization: Bearer sk-agentcc-your-key" \
    -H "Content-Type: application/json" \
    -H 'x-agentcc-metadata: {"team":"data-science","feature":"recommendations","user":"alice"}' \
    -d '{
      "model": "gpt-4o",
      "messages": [{"role": "user", "content": "Hello"}]
    }'

Analytics dashboard
The Future AGI dashboard shows cost breakdowns and trends across your organization.
Available views:
- Total spend for the current period
- Cost by model
- Cost by provider
- Cost by API key
- Cost timeseries (daily/weekly/monthly)
- Cost by metadata dimension (team, feature, user)
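To make the last view concrete, here is a minimal local sketch of what a breakdown by metadata dimension computes. It assumes you have already collected per-request records yourself; the record shape (a "cost" string plus a "metadata" dict) and the function cost_by_dimension are hypothetical and do not reflect the dashboard's internal implementation.

```python
from collections import defaultdict
from decimal import Decimal


def cost_by_dimension(records, dimension):
    """Sum request costs grouped by one metadata key (e.g. "team")."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for record in records:
        # Requests sent without this tag are grouped under a placeholder.
        key = record["metadata"].get(dimension, "(untagged)")
        totals[key] += Decimal(record["cost"])
    return dict(totals)


# Hypothetical records, shaped like the metadata used in the examples above.
records = [
    {"cost": "0.0020", "metadata": {"team": "data-science", "feature": "recommendations"}},
    {"cost": "0.0010", "metadata": {"team": "data-science", "feature": "search"}},
    {"cost": "0.0005", "metadata": {"team": "platform"}},
]

print(cost_by_dimension(records, "team"))
print(cost_by_dimension(records, "feature"))
```

Grouping untagged requests under a placeholder key keeps the totals reconcilable with overall spend, which is why consistent tagging matters for attribution.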
SDK analytics
from agentcc import AgentCC
client = AgentCC(
    api_key="sk-agentcc-your-key",
    base_url="https://gateway.futureagi.com",
    control_plane_url="https://api.futureagi.com",
)
# Spending overview
overview = client.analytics.overview(
    start_date="2026-01-01",
    end_date="2026-01-31",
)
# Cost breakdown by model
costs = client.analytics.cost_breakdown(group_by="model")
# Compare models
comparison = client.analytics.model_comparison(
    models=["gpt-4o", "claude-sonnet-4-6"],
)
Budget alerts
Get notified when spending crosses a threshold. Alerts are configured per organization.
Go to Agent Command Center > Settings > Alerts in the Future AGI dashboard. Create a new alert by selecting the event type, setting recipients, and configuring severity.
alert = client.alerts.create(
    name="Budget warning at 80%",
    condition="cost > 80",
    recipients=["team@example.com"],
    severity="high",
)

The equivalent call in TypeScript:
const alert = await client.alerts.create({
  name: "Budget warning at 80%",
  condition: "cost > 80",
  recipients: ["team@example.com"],
  severity: "high",
});

Alert types
| Event | Trigger |
|---|---|
| budget_exceeded | Spend crosses the budget limit |
| budget_threshold | Spend crosses a percentage threshold (e.g. 80%) |
| error_spike | Error rate exceeds configured threshold |
| latency_spike | P95 latency exceeds configured threshold |
| guardrail_triggered | A guardrail blocks or flags a request |
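The difference between the two budget events can be sketched as follows. This is an illustration under stated assumptions, not the gateway's actual logic: it assumes "crosses" means spend moved from below a mark to at or above it between two observations, and the function names are hypothetical.

```python
def crossed(previous, current, mark):
    """True when spend moved from below `mark` to at or above it."""
    return previous < mark <= current


def budget_events(previous_spend, current_spend, limit, threshold_pct=80):
    """Return the budget event names that this change in spend would raise."""
    events = []
    if crossed(previous_spend, current_spend, limit * threshold_pct / 100):
        events.append("budget_threshold")
    if crossed(previous_spend, current_spend, limit):
        events.append("budget_exceeded")
    return events


# With a limit of 100 and an 80% threshold:
print(budget_events(79, 81, 100))   # passes the 80% mark only
print(budget_events(95, 101, 100))  # passes the limit only
print(budget_events(81, 90, 100))   # already past 80%, below the limit
```

Evaluating on the transition rather than on the current level means a mark is reported once when it is passed, not on every request afterwards.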
Tip
Configure a cooldown period to prevent alert flooding when thresholds are repeatedly crossed.
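The idea behind a cooldown can be sketched in a few lines. This is a generic illustration of the technique, not the product's implementation; the class AlertCooldown and its method names are hypothetical.

```python
class AlertCooldown:
    """Suppress repeat notifications for the same alert within a time window."""

    def __init__(self, cooldown_seconds):
        self.cooldown_seconds = cooldown_seconds
        self._last_sent = {}  # alert name -> timestamp of last notification

    def should_send(self, alert_name, now):
        """Return True and record the send if the cooldown has elapsed."""
        last = self._last_sent.get(alert_name)
        if last is not None and now - last < self.cooldown_seconds:
            return False  # still cooling down, drop this notification
        self._last_sent[alert_name] = now
        return True


cooldown = AlertCooldown(cooldown_seconds=300)
print(cooldown.should_send("budget_threshold", now=0))    # first trigger
print(cooldown.should_send("budget_threshold", now=100))  # inside the window
print(cooldown.should_send("budget_threshold", now=300))  # window elapsed
```

Tracking the last send time per alert name lets unrelated alerts fire independently while each one is individually rate-limited.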
Budget enforcement
Budgets are configured on the Rate limiting & budgets page. When a budget configured with action: block is exceeded, new requests return a 429 error until the next budget period begins. See that page for configuration details.
Next Steps
- Rate limiting & budgets: Configure spending limits and rate controls
- Request & response headers: Full reference for cost and metadata headers
- Routing: Cost-optimized routing across providers
- Caching: Reduce costs with response caching
- Custom Properties: Define structured metadata schemas for cost attribution dimensions