In-line Evaluations
Run evaluations directly inside a traced span so results are automatically attached to that span in the Future AGI dashboard.
About
Evaluation results are most useful when they sit next to the data that produced them. Running evals as a separate step means matching results back to specific spans after the fact. In-line evaluations remove that gap by running evaluator.evaluate() with trace_eval=True inside an active span. The evaluation result is automatically attached to that span as attributes, so both the trace data and the eval score appear together in the dashboard.
When to use
- Per-span quality checks: Attach groundedness, relevance, or custom eval scores directly to the LLM span that produced the output.
- Simplified evaluation setup: Skip configuring separate evaluation tasks and filters. Run evals inline where the logic runs.
- Side-by-side tracing and evaluation: View both the trace data and the evaluation result in the same span in the dashboard.
How to
Set up your environment
Register a tracer provider and initialize the Evaluator with your API credentials.
import os

import openai
from fi.evals import Evaluator
from fi_instrumentation import FITracer, register
from fi_instrumentation.fi_types import ProjectType

# Register the tracer
trace_provider = register(
    project_type=ProjectType.OBSERVE,
    project_name="YOUR_PROJECT_NAME",
    set_global_tracer_provider=True,
)

# Initialize the Evaluator
evaluator = Evaluator(
    fi_api_key=os.getenv("FI_API_KEY"),
    fi_secret_key=os.getenv("FI_SECRET_KEY"),
)

client = openai.OpenAI()
tracer = FITracer(trace_provider.get_tracer(__name__))
Run an evaluation inside a span
Call evaluator.evaluate() with trace_eval=True inside an active span. The evaluation result will be automatically linked to that span.
with tracer.start_as_current_span("parent_span") as span:
    completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "hi how are you?"}],
    )

    span.set_attribute("raw.input", "hi how are you?")
    span.set_attribute("raw.output", completion.choices[0].message.content)

    # Define the evaluation config
    config_groundedness = {
        "eval_templates": "groundedness",
        "inputs": {
            "input": "hi how are you?",
            "output": completion.choices[0].message.content,
        },
        "model_name": "turing_large",
    }
    # Run the evaluation with trace_eval=True
    eval_result1 = evaluator.evaluate(
        **config_groundedness,
        custom_eval_name="groundedness_check",
        trace_eval=True,
    )

    print(eval_result1)
Key concepts
- trace_eval=True: The essential parameter that enables in-line evaluation. It tells the system to find the current active span and attach the evaluation results to it as span attributes.
- custom_eval_name: Required. A unique, human-readable name for this evaluation instance. It distinguishes between multiple evaluations of the same type within a trace and appears as the label in the UI.
- Evaluator: The Future AGI evaluations client. Initialize it with your FI_API_KEY and FI_SECRET_KEY credentials.
- eval_templates: The name of the evaluation template from the Future AGI AI Evaluations library (e.g., "groundedness").
- Active span context: The evaluation must be called while a span is active (inside a with tracer.start_as_current_span(...) block) so the system knows which span to attach results to.
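Because custom_eval_name must be unique per evaluation, running several evals inside one span benefits from generating the configs programmatically. The sketch below is a hypothetical helper (not part of the Future AGI SDK), and the second template name is illustrative; it only builds the plain dicts that would be splatted into evaluator.evaluate(**config) inside the active span, following the config shape shown in the example above.

```python
# Hypothetical helper (not part of the SDK): builds one in-line eval
# config per template, each with a distinct custom_eval_name.
def build_inline_eval_configs(user_input, model_output, templates,
                              model_name="turing_large"):
    configs = []
    for template in templates:
        configs.append({
            "eval_templates": template,
            "inputs": {"input": user_input, "output": model_output},
            "model_name": model_name,
            # Unique, human-readable label shown in the dashboard UI
            "custom_eval_name": f"{template}_check",
            # Attach the result to the currently active span
            "trace_eval": True,
        })
    return configs


# "context_relevance" is an assumed template name for illustration;
# check the Future AGI AI Evaluations library for the real catalog.
configs = build_inline_eval_configs(
    "hi how are you?",
    "I'm doing well, thanks for asking!",
    templates=["groundedness", "context_relevance"],
)
print([c["custom_eval_name"] for c in configs])
```

Each resulting dict would then be passed as evaluator.evaluate(**config) inside the with tracer.start_as_current_span(...) block, so every score lands on the same span under its own label.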
Next Steps
Set Up Tracing
Register a tracer provider and add instrumentation.
Instrument with traceAI Helpers
Use FITracer decorators and context managers for typed spans.
Add Attributes & Metadata
Attach custom data to spans for filtering and evals.
Auto Instrumentation
Browse all supported framework instrumentors.