Troubleshooting
Step-by-step solutions for common Agent Command Center issues.
About
Common issues and how to diagnose them when requests through Agent Command Center fail.
Debug checklist
When something isn’t working, start here:
- Check the `x-agentcc-request-id` response header and search for it in your logs
- Check `x-agentcc-provider` to confirm which provider handled the request
- Check `x-agentcc-model-used` to confirm the actual model (may differ from the requested model if routing changed it)
- Compare `x-agentcc-latency-ms` against your expected latency
- Check `x-agentcc-cost` to verify pricing is as expected
Use `curl -i` to see all response headers:
curl -i https://gateway.futureagi.com/v1/chat/completions \
-H "Authorization: Bearer sk-agentcc-your-key" \
-H "Content-Type: application/json" \
-d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "hi"}]}'
Common issues
“model not found” but the model exists
Symptom: 404 with `model_not_found` even though the model appears in `GET /v1/models`.
Quick fix: Try the provider/model format to bypass model resolution:
# Check available models
curl https://gateway.futureagi.com/v1/models \
-H "Authorization: Bearer sk-agentcc-your-key" | jq '.data[].id'
# Use explicit provider prefix
curl https://gateway.futureagi.com/v1/chat/completions \
-H "Authorization: Bearer sk-agentcc-your-key" \
-H "Content-Type: application/json" \
-d '{"model": "openai/gpt-4o", "messages": [{"role": "user", "content": "hi"}]}'
If that works, set up a model map. See Error handling for all causes.
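Until a model map is in place, you can pin ambiguous names client-side. This is a stopgap sketch, not a gateway feature: a local alias table (the entries here are assumptions for illustration) applied before each request.

```python
# Client-side fallback: map bare model names to an explicit provider/model
# form before sending the request. The gateway's own model maps are the
# proper fix; this alias table is a hypothetical stopgap.
MODEL_ALIASES = {
    "gpt-4o": "openai/gpt-4o",
    "claude-3-5-sonnet": "anthropic/claude-3-5-sonnet",
}

def resolve_model(name: str) -> str:
    # Names that already carry a provider prefix pass through unchanged.
    if "/" in name:
        return name
    return MODEL_ALIASES.get(name, name)

print(resolve_model("gpt-4o"))         # openai/gpt-4o
print(resolve_model("openai/gpt-4o"))  # openai/gpt-4o (unchanged)
```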
Provider returns 404 upstream
Symptom: 502 with `provider_404`.
The gateway reached the provider, but the provider rejected the request. Most common cause: the provider API key is invalid or doesn't have access to the model. For OpenAI project-scoped keys (`sk-proj-...`), enable models in Project Settings > Model access.
See Error handling for details.
Responses are slow
Symptom: High `x-agentcc-latency-ms` values.
Possible causes:
- Provider latency: Check if the provider itself is slow. Compare `x-agentcc-latency-ms` with direct provider calls.
- No caching: Repeated identical requests hit the provider every time. Enable caching.
- Wrong routing strategy: `least-latency` routing picks the fastest provider automatically. See routing.
- Large prompts: Token count affects latency. Check `usage.prompt_tokens` in the response.
- Guardrail overhead: Pre-request guardrails add latency. Check if guardrails are processing-heavy.
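To separate provider/gateway time from network and client overhead, compare your own wall-clock measurement against the header. A small sketch (the sample millisecond values are illustrative):

```python
def latency_breakdown(total_ms: float, gateway_ms: float) -> dict:
    """Split observed request latency into the portion reported by
    x-agentcc-latency-ms and everything else (network, TLS, client)."""
    return {
        "gateway_ms": gateway_ms,
        "overhead_ms": max(total_ms - gateway_ms, 0.0),
    }

# Measure total_ms with time.monotonic() around the request, then read
# gateway_ms from the x-agentcc-latency-ms response header.
print(latency_breakdown(total_ms=1850.0, gateway_ms=1600.0))
```

If `overhead_ms` dominates, the problem is between you and the gateway; if `gateway_ms` dominates, look at the provider, routing, or guardrail causes above.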
Cache isn’t working
Symptom: `x-agentcc-cache` always shows `miss` or doesn't appear.
Checklist:
- Is caching enabled? Check your org config or `GatewayConfig`.
- Are you sending streaming requests? Streaming bypasses the cache entirely.
- Are the requests identical? Exact-match caching requires identical model, messages, temperature, and all other parameters.
- Is the TTL too short? Cached entries may expire before the next identical request arrives.
- Are you using different cache namespaces? Each namespace is isolated.
# Force a cache test: send the same non-streaming request twice
from agentcc import AgentCC, GatewayConfig, CacheConfig

client = AgentCC(
    api_key="sk-agentcc-your-key",
    base_url="https://gateway.futureagi.com",
    config=GatewayConfig(cache=CacheConfig(enabled=True, strategy="exact", ttl=300)),
)

# First call
r1 = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is 2+2?"}],
)
print(f"Call 1 cache: {r1.agentcc.cache_status}")  # miss or None

# Second call (same input)
r2 = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is 2+2?"}],
)
print(f"Call 2 cache: {r2.agentcc.cache_status}")  # hit_exact
Guardrails blocking legitimate requests
Symptom: 403 with `content_blocked` on requests that should be allowed.
Diagnosis:
- Check which guardrail fired: the error message includes the guardrail name
- Check for `x-agentcc-guardrail-triggered: true` in the response headers
- Switch the guardrail from `enforce` to `log` mode temporarily to see what's being flagged without blocking
See Guardrails for configuration options including fail-open behavior.
Rate limits hit unexpectedly
Symptom: 429 errors before you expect to hit limits.
Check the response headers:
x-ratelimit-limit-requests: 100
x-ratelimit-remaining-requests: 0
x-ratelimit-reset-requests: 1714000000
Common causes:
- Per-key limits are lower than per-org limits. The most restrictive limit applies.
- Multiple services share the same API key
- Burst traffic from retries (each retry counts against the limit)
Fix: Increase limits in Rate limiting, use separate keys per service, or add backoff to retry logic.
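A retry loop that respects the reset header avoids the burst-retry problem above. A minimal sketch, assuming `x-ratelimit-reset-requests` is a Unix timestamp as shown in the header example:

```python
import time

def seconds_until_reset(headers, now=None):
    """How long to wait before retrying after a 429, based on the
    x-ratelimit-reset-requests header (a Unix timestamp)."""
    now = time.time() if now is None else now
    reset = float(headers.get("x-ratelimit-reset-requests", now))
    return max(reset - now, 0.0)

# On a 429, sleep until the window resets instead of retrying immediately:
headers = {
    "x-ratelimit-remaining-requests": "0",
    "x-ratelimit-reset-requests": "1714000000",
}
delay = seconds_until_reset(headers, now=1713999990.0)
print(delay)  # 10.0
```

Each immediate retry counts against the limit, so sleeping for `delay` (plus a little jitter) is cheaper than hammering the endpoint.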
Cost is higher than expected
Diagnosis:
- Check `x-agentcc-cost` on individual requests to find expensive calls
- Use metadata tagging to identify which team/feature is driving costs:

  response = client.chat.completions.create(
      model="gpt-4o",
      messages=[{"role": "user", "content": "Hello"}],
      request_metadata={"team": "search", "feature": "autocomplete"},
  )

- Check the analytics dashboard for the cost-by-model breakdown
- Look for missing cache hits on repeated queries
- Check if the `race` routing strategy is enabled (it bills all providers, not just the winner)
See Cost tracking for attribution and budgets.
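If you log `x-agentcc-cost` alongside the metadata you attach to each request, a quick client-side rollup can point at the driver before you open the dashboard. The record shape below is a hypothetical log format, not a gateway API:

```python
from collections import defaultdict

def cost_by_tag(records, tag="team"):
    """Aggregate per-request costs (from x-agentcc-cost) by a metadata tag."""
    totals = defaultdict(float)
    for r in records:
        totals[r["metadata"].get(tag, "untagged")] += r["cost"]
    return dict(totals)

# Hypothetical log records: cost from x-agentcc-cost, metadata you attached.
log = [
    {"cost": 0.012, "metadata": {"team": "search"}},
    {"cost": 0.030, "metadata": {"team": "search"}},
    {"cost": 0.005, "metadata": {"team": "support"}},
]
print({k: round(v, 3) for k, v in cost_by_tag(log).items()})
```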
Failover isn’t working
Symptom: Requests fail with provider errors but don’t route to backup providers.
Checklist:
- Is failover enabled in your routing config?
- Is failover enabled in your routing config?
- Does `failover_on` include the status code you're seeing? (Default: `[429, 500, 502, 503, 504]`)
- Are backup providers configured with valid credentials?
- Check `x-agentcc-fallback-used: true` to confirm failover happened (or didn't)
- Check `x-agentcc-provider` to see which provider ultimately handled the request
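The `failover_on` check is easy to reproduce locally when you're unsure whether a given upstream status should have triggered a retry. A sketch mirroring the documented default list:

```python
# Default failover trigger list, per the routing docs.
DEFAULT_FAILOVER_ON = [429, 500, 502, 503, 504]

def should_failover(status_code, failover_on=DEFAULT_FAILOVER_ON):
    """Would this upstream status code route the request to a backup
    provider under the default failover_on config?"""
    return status_code in failover_on

print(should_failover(503))  # True
print(should_failover(404))  # False: an upstream 404 does not fail over by default
```

This is often the answer to "failover isn't working": the upstream error (e.g. a provider 404) simply isn't in the configured `failover_on` list.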
Getting help
If you can’t resolve the issue:
- Collect the `x-agentcc-request-id` from the failing request
- Note the timestamp and error message
- Check the Error handling guide for the specific error code
- Contact support with the request ID; it links to the full request/response log on our end