SDKs
Evaluate LLM outputs, trace AI calls, optimize prompts, and test voice agents. Python, TypeScript, Java, and C# supported.
- Python: evals, tracing, datasets, prompts, optimization, simulation
- TypeScript: evals, tracing, datasets, prompts
- Java / C#: tracing
`pip install ai-evaluation` or `npm install @future-agi/ai-evaluation` to get started.
Future AGI is a set of packages that evaluate LLM outputs, trace calls across your stack, optimize prompts, and load-test voice agents. Install what you need, skip what you don’t.
Language Support
| Module | Python | TypeScript | Java | C# |
|---|---|---|---|---|
| Evaluations | Full | Full | — | — |
| Tracing | Full (45+) | Full (40+) | Full (25+) | Full |
| Datasets | Full | Full | — | — |
| Prompts | Full | Full | — | — |
| Prompt Optimization | Full | — | — | — |
| Simulation | Full | — | — | — |
Quickstart
```
pip install ai-evaluation
```

Requires Python 3.10+. This also installs futureagi (datasets, prompts, knowledge bases) automatically.

```shell
export FI_API_KEY="your-api-key"
export FI_SECRET_KEY="your-secret-key"
```

```python
from fi.evals import evaluate

# Local metric — no API key needed
result = evaluate("contains", output="Hello world", keyword="Hello")
print(result.score)   # 1.0
print(result.passed)  # True

# Cloud metric — needs FI_API_KEY and FI_SECRET_KEY
result = evaluate("toxicity", output="Hello world", model="turing_flash")
print(result.score)   # 1.0
print(result.passed)  # True
```

Want tracing too? Add the instrumentor for your provider:
```
pip install fi-instrumentation-otel traceai-openai
```

For TypeScript:

```
npm install @future-agi/ai-evaluation
```

```shell
export FI_API_KEY="your-api-key"
export FI_SECRET_KEY="your-secret-key"
```

```typescript
import { Evaluator, Tone } from "@future-agi/ai-evaluation";

const evaluator = new Evaluator();
const result = await evaluator.evaluate({
  evalTemplates: [new Tone()],
  inputs: [{
    query: "Write a professional email",
    response: "Dear Sir/Madam, I hope this message finds you well..."
  }],
  modelName: "turing_flash"
});
console.log(result);
```

Want tracing too?

```
npm install @traceai/fi-core @traceai/openai
```

Java support covers tracing only, with 25+ instrumentors including Spring AI and LangChain4j.
```xml
<!-- Maven — add the JitPack repository -->
<repository>
    <id>jitpack.io</id>
    <url>https://jitpack.io</url>
</repository>

<dependency>
    <groupId>com.github.future-agi.traceAI</groupId>
    <artifactId>traceai-java-openai</artifactId>
    <version>LATEST</version>
</dependency>
```

See the Tracing docs for setup instructions.
C# support covers tracing only.

```
dotnet add package fi-instrumentation-otel
```

See the Tracing docs for setup instructions.
Getting an error? Check these common issues:

- `ModuleNotFoundError: No module named 'fi'` — the package is called ai-evaluation, not future-agi or futureagi-sdk. Fix with `pip install ai-evaluation`.
- `AuthenticationError` — both FI_API_KEY and FI_SECRET_KEY must be set. The API key alone is not enough.
- Python version error — ai-evaluation requires Python 3.10+. Check with `python --version`.
Packages
Python
Six packages, each installable independently:
| Package | Install | What it does | Python |
|---|---|---|---|
| futureagi | pip install futureagi | Datasets, prompt versioning, knowledge bases | 3.9+ |
| ai-evaluation | pip install ai-evaluation | 76+ local metrics + 100+ cloud templates, guardrails, streaming eval | 3.10+ |
| fi-instrumentation-otel | pip install fi-instrumentation-otel | OpenTelemetry tracing for AI apps | 3.9+ |
| traceai-* | pip install traceai-openai | Auto-instrumentation for 45+ frameworks | 3.9+ |
| agent-opt | pip install agent-opt | Prompt optimization (6 algorithms) | 3.10+ |
| agent-simulate | pip install agent-simulate | Simulate voice AI agents at scale | 3.10+ |
```
futureagi                ← standalone base layer
└── ai-evaluation        ← installs futureagi automatically
    └── agent-opt        ← installs ai-evaluation automatically

fi-instrumentation-otel  ← standalone tracing layer
├── traceai-*            ← each installs fi-instrumentation-otel
└── agent-simulate       ← installs fi-instrumentation-otel
```
Tip
You don’t need to install dependencies manually. pip install ai-evaluation gives you futureagi too. pip install traceai-openai gives you fi-instrumentation-otel too.
TypeScript
| Package | Install | What it does |
|---|---|---|
| @future-agi/sdk | npm install @future-agi/sdk | Datasets, prompt versioning, knowledge bases |
| @future-agi/ai-evaluation | npm install @future-agi/ai-evaluation | Eval metrics and guardrails |
| @traceai/fi-core | npm install @traceai/fi-core | Tracing core |
| @traceai/openai | npm install @traceai/openai | Framework instrumentors (40+) |
Java and C#
Tracing only. Java has 25+ instrumentors (Maven via JitPack, group ID com.github.future-agi.traceAI). C# has a single NuGet package (fi-instrumentation-otel). See the Tracing reference for details.
Evaluations — ai-evaluation
76+ local metrics for things like tone, hallucination, bias, and factual accuracy. Also includes guardrails (toxicity, PII, prompt injection) that run in under 10ms.
Available in Python and TypeScript.
Evaluations
All 76+ local metrics — browse by category, see config options, and run examples.
Protect
Real-time guardrails for toxicity, PII, prompt injection, and content moderation.
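The guardrail idea is simple: a fast local check runs before a response is released, and flagged output gets replaced with a safe fallback. The snippet below is a plain-Python illustration of that flow, not the Protect API itself; `blocklist_check` is a toy stand-in for a real toxicity/PII classifier.

```python
def blocklist_check(text: str, blocklist: set[str]) -> bool:
    """Toy stand-in for a real guardrail classifier: flag blocklisted terms."""
    lowered = text.lower()
    return any(term in lowered for term in blocklist)

def guarded_reply(llm_output: str, fallback: str = "[response withheld]") -> str:
    """Release the model output only if the fast local check passes."""
    flagged = blocklist_check(llm_output, {"ssn", "credit card"})
    return fallback if flagged else llm_output

print(guarded_reply("Here is the weather forecast."))   # passes through
print(guarded_reply("Your credit card number is ..."))  # withheld
```

A real guardrail swaps the keyword check for a classifier, but the gate-before-release shape stays the same, which is why sub-10ms latency matters.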
Optional extras (Python)
| Extra | Install | What it adds |
|---|---|---|
| NLI models | pip install ai-evaluation[nli] | DeBERTa for faithfulness and hallucination detection |
| Embeddings | pip install ai-evaluation[embeddings] | Sentence-transformers for semantic similarity |
| Feedback | pip install ai-evaluation[feedback] | ChromaDB-backed feedback collection |
| Distributed | pip install ai-evaluation[celery] | Celery + Redis for distributed eval runs |
| Everything | pip install ai-evaluation[all] | All optional dependencies |
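The embeddings extra exists because semantic-similarity metrics ultimately reduce to comparing embedding vectors, usually by cosine similarity. A minimal stdlib sketch of that comparison, with toy 3-dimensional vectors standing in for real sentence-transformer embeddings:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for sentence embeddings
v_reference = [0.9, 0.1, 0.0]
v_candidate = [0.8, 0.2, 0.1]
print(round(cosine_similarity(v_reference, v_candidate), 3))  # 0.984
```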
Tracing — fi-instrumentation-otel + traceai-*
Install the core library plus one instrumentor per framework. LLM calls, retrieval steps, and agent actions get traced and sent to your Future AGI dashboard.
Available in Python, TypeScript, Java, and C#.
```python
from fi_instrumentation import register
from fi_instrumentation.fi_types import ProjectType

trace_provider = register(
    project_name="my-project",
    project_type=ProjectType.OBSERVE,
)

from traceai_openai import OpenAIInstrumentor
OpenAIInstrumentor().instrument(tracer_provider=trace_provider)

# All OpenAI calls are now traced
# Traces appear in your Future AGI dashboard under "my-project"
```
LLM providers:

| Package | Framework |
|---|---|
| traceai-openai | OpenAI |
| traceai-anthropic | Anthropic |
| traceai-google-genai | Google Generative AI |
| traceai-vertexai | Google Vertex AI |
| traceai-bedrock | AWS Bedrock |
| traceai-mistralai | Mistral AI |
| traceai-groq | Groq |
| traceai-litellm | LiteLLM |
| traceai-cohere | Cohere |
| traceai-ollama | Ollama |
| traceai-deepseek | DeepSeek |
| traceai-together | Together AI |
| traceai-fireworks | Fireworks AI |
| traceai-cerebras | Cerebras |
| traceai-xai | xAI / Grok |
| traceai-vllm | vLLM |
| traceai-portkey | Portkey |
| traceai-huggingface | HuggingFace |
Agent frameworks and orchestration:

| Package | Framework |
|---|---|
| traceai-langchain | LangChain / LangGraph |
| traceai-llamaindex | LlamaIndex |
| traceai-crewai | CrewAI |
| traceai-openai-agents | OpenAI Agents SDK |
| traceai-autogen | Microsoft AutoGen |
| traceai-smolagents | HuggingFace SmolAgents |
| traceai-google-adk | Google Agent Dev Kit |
| traceai-claude-agent-sdk | Claude Agent SDK |
| traceai-pydantic-ai | Pydantic AI |
| traceai-strands | AWS Strands Agents |
| traceai-agno | Agno |
| traceai-beeai | IBM BeeAI |
| traceai-haystack | Haystack |
| traceai-dspy | DSPy |
| traceai-guardrails | Guardrails AI |
| traceai-instructor | Instructor |
| traceai-mcp | Model Context Protocol |
Voice:

| Package | Framework |
|---|---|
| traceai-pipecat | Pipecat |
| traceai-livekit | LiveKit |
Vector stores and databases:

| Package | Framework |
|---|---|
| traceai-pinecone | Pinecone |
| traceai-chromadb | ChromaDB |
| traceai-qdrant | Qdrant |
| traceai-weaviate | Weaviate |
| traceai-milvus | Milvus |
| traceai-lancedb | LanceDB |
| traceai-mongodb | MongoDB |
| traceai-pgvector | pgvector |
| traceai-redis | Redis |
Tip
Each instrumentor is lightweight and independent. Only install the ones for frameworks you actually use.
Core SDK — futureagi
Datasets, prompt versioning, and knowledge bases. If you installed ai-evaluation, you already have this.
Available in Python and TypeScript.
Datasets
Create, version, and manage test datasets. Import from CSV, DataFrames, or HuggingFace.
Knowledge Base
Upload documents to build knowledge bases for RAG evaluation and context injection.
Prompt Optimization — agent-opt
Six optimization algorithms: Random Search, Bayesian, ProTeGi, Meta-Prompt, PromptWizard, and GEPA. Each uses eval metrics to score prompt variants and find the best one.
Python only.
```
pip install agent-opt
```
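To give a feel for the simplest of the six algorithms, here is a plain-Python random-search loop: sample prompt variants, score each with an eval metric, keep the best. The variants and `score_prompt` are toy stand-ins for real eval metrics; this is not the agent-opt API.

```python
import random

def score_prompt(prompt: str) -> float:
    """Toy eval metric: reward prompts that ask for citations and brevity."""
    score = 0.0
    if "cite" in prompt.lower():
        score += 0.5
    if "concise" in prompt.lower():
        score += 0.5
    return score

variants = [
    "Answer the question.",
    "Answer the question and cite your sources.",
    "Give a concise answer and cite your sources.",
]

random.seed(0)
best_prompt, best_score = None, float("-inf")
for _ in range(20):  # random search: sample, score, keep the best so far
    candidate = random.choice(variants)
    candidate_score = score_prompt(candidate)
    if candidate_score > best_score:
        best_prompt, best_score = candidate, candidate_score

# With enough samples this converges on the cited + concise variant
print(best_prompt, best_score)
```

The other five algorithms replace the blind sampling with something smarter (Bayesian models of the score surface, LLM-written critiques in ProTeGi, and so on), but the score-and-select loop is the common core.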
Simulation Testing — agent-simulate
Run simulated conversations against your voice AI agents using configurable personas. Captures audio, transcripts, and eval scores.
Python only.
```
pip install agent-simulate
```
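The core loop behind persona-driven simulation can be sketched in plain Python: each persona produces user turns, the agent under test responds, and the transcript is captured for later evaluation. Everything below (the Persona fields, the echo agent) is illustrative, not the agent-simulate API.

```python
from dataclasses import dataclass, field

@dataclass
class Persona:
    name: str
    opening_turns: list[str]

@dataclass
class Transcript:
    persona: str
    turns: list[tuple[str, str]] = field(default_factory=list)  # (speaker, text)

def toy_agent(user_text: str) -> str:
    """Stand-in for the voice agent under test."""
    return f"I heard: {user_text}"

def simulate(persona: Persona) -> Transcript:
    """Run one simulated conversation and capture every turn."""
    transcript = Transcript(persona=persona.name)
    for user_text in persona.opening_turns:
        transcript.turns.append(("user", user_text))
        transcript.turns.append(("agent", toy_agent(user_text)))
    return transcript

personas = [
    Persona("impatient caller", ["Cancel my order now."]),
    Persona("confused caller", ["Where do I find my invoice?"]),
]
transcripts = [simulate(p) for p in personas]
print(len(transcripts), "transcripts captured")  # 2 transcripts captured
```

In the real package the agent is a live voice system, audio is recorded alongside the text turns, and eval metrics score each transcript, but the persona-in, transcript-out shape is the same.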