Annotate Datasets with Human-in-the-Loop Workflows
Create annotation views with categorical, numeric, and text labels, assign annotators, and log annotations programmatically using the FutureAGI SDK.
| Time | Difficulty | Package |
|---|---|---|
| 15 min | Intermediate | futureagi |
Prerequisites

- FutureAGI account → app.futureagi.com
- API keys: FI_API_KEY and FI_SECRET_KEY (see Get your API keys)
- Python 3.9+
- A dataset with at least a few rows (see Dataset Management to create one)
Install

```bash
pip install futureagi pandas

export FI_API_KEY="your-api-key"
export FI_SECRET_KEY="your-secret-key"
```
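Before running the examples below, it can be worth confirming both environment variables are actually set. A small standalone check (nothing SDK-specific):

```python
import os

REQUIRED_KEYS = ("FI_API_KEY", "FI_SECRET_KEY")

def missing_keys(env=None):
    """Return the names of required credentials that are unset or empty."""
    env = os.environ if env is None else env
    return [k for k in REQUIRED_KEYS if not env.get(k)]

# An empty list means you are ready to go
print(missing_keys() or "Credentials found.")
```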
What are annotations?
Annotations attach human judgments (labels, scores, free-text feedback) to dataset rows or traced spans. They close the feedback loop between automated evals and human review, letting you catch hallucinations, rate quality, and build gold-standard evaluation sets.
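Conceptually, each annotation ties one human judgment to one target. The sketch below is a plain-Python illustration of that shape, not an SDK type; all field names here are hypothetical:

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class AnnotationRecord:
    """Illustrative model of a single annotation (not an SDK class)."""
    target_id: str                  # dataset row or traced span id
    label: str                      # e.g. "sentiment", "relevance"
    value: Union[str, float, bool]  # category, score, or free text
    annotator: str                  # who applied the label

note = AnnotationRecord("span_abc123", "relevance", 4.5, "qa@example.com")
print(note.label, note.value)
```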
Tutorial
Open a dataset and go to the Annotations tab
Navigate to the dataset you want to annotate and switch to the annotation interface.
- Go to app.futureagi.com → Dataset (left sidebar under BUILD)
- Click the name of the dataset you want to annotate
- Click the Annotations tab at the top of the data table
Create an annotation view and define labels
An annotation view groups labels together and maps them to the columns annotators will see.
- Click Create New View
- Give the view a descriptive name (e.g., “Response Quality Review”)
Static Fields
Select the column(s) that provide read-only context to annotators (e.g., user_query, context). Annotators see these as reference material alongside the response but cannot edit them.
Response Fields
Select the column(s) containing the model output you want annotators to evaluate (e.g., response). This is the primary field annotators will judge and label.
Labels
- Click New Label for each annotation type you need. Configure the following fields for each label:
| Field | Description |
|---|---|
| Name | A clear label name (e.g., “Sentiment”, “Relevance Score”, “Reviewer Notes”) |
| Annotation Type | The input type: Categorical (predefined categories), Numeric (score on a scale), or Text (free-form feedback) |
| Description | Optional description to guide annotators on what the label means and how to apply it |
| Display Option | How the label renders in the annotation interface (e.g., dropdown or radio buttons for categorical; slider or number input for numeric) |
| Min / Max Value | For Numeric labels only — the lower and upper bounds of the score range (e.g., min: 1, max: 5) |
For this guide, create three labels:
| Label name | Annotation Type | Description | Min | Max |
|---|---|---|---|---|
| Sentiment | Categorical | Overall tone of the response | — | — |
| Relevance Score | Numeric | How well the response addresses the query | 1 | 5 |
| Reviewer Notes | Text | Free-form feedback or corrections | — | — |
For the Sentiment categorical label, define categories: “Positive”, “Negative”, “Neutral”.
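If you plan to bulk-upload annotations later, the label definitions above can be mirrored in plain Python and used to validate values client-side before sending them. This is not an SDK feature, just an illustrative helper that matches the table above:

```python
# Plain-Python mirror of the three labels defined above (not an SDK structure)
LABELS = {
    "Sentiment": {"type": "categorical", "categories": {"Positive", "Negative", "Neutral"}},
    "Relevance Score": {"type": "numeric", "min": 1, "max": 5},
    "Reviewer Notes": {"type": "text"},
}

def is_valid(label_name, value):
    """Return True if `value` is acceptable for the named label."""
    spec = LABELS[label_name]
    if spec["type"] == "categorical":
        return value in spec["categories"]
    if spec["type"] == "numeric":
        return isinstance(value, (int, float)) and spec["min"] <= value <= spec["max"]
    return isinstance(value, str)  # text labels accept any string
```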
Tip
For Categorical labels, enable Auto-Annotation during label creation. The platform learns from your initial manual annotations and automatically suggests labels for remaining rows — you can review, accept, or override every suggestion.
Annotators
- In the Annotators section of the view, add workspace members who should contribute annotations. Each assigned annotator can open the view and apply labels to dataset rows.
- Click Save to finalize the view.
Assign annotators and annotate rows
- In the Annotation View settings, find the Annotators section
- Add workspace members who should contribute annotations
- Each annotator opens the view, navigates through rows, and applies labels
Annotators see the static fields as read-only context and the response fields alongside the label inputs. Changes save automatically.
Log annotations programmatically with the SDK
For bulk annotation or CI pipelines, use Annotation.log_annotations() to push annotations via a pandas DataFrame. Each row references a traced span by its context.span_id.
```python
import os

import pandas as pd

from fi.annotations import Annotation

client = Annotation(
    fi_api_key=os.environ["FI_API_KEY"],
    fi_secret_key=os.environ["FI_SECRET_KEY"],
)

# Build a DataFrame with annotation columns.
# Column format: annotation.{label_name}.{type}
# Types: label (categorical), score (numeric), text, rating (1-5 stars), thumbs (True/False)
df = pd.DataFrame({
    "context.span_id": ["span_abc123", "span_def456", "span_ghi789"],
    "annotation.sentiment.label": ["Positive", "Negative", "Neutral"],
    "annotation.relevance.score": [4.5, 2.0, 3.5],
    "annotation.reviewer_notes.text": [
        "Accurate and well-structured response",
        "Hallucinated a date that wasn't in the context",
        "Correct but could be more concise",
    ],
    # A bare "annotation.notes" column attaches free-form notes; None rows create no note
    "annotation.notes": [
        "Reviewed by QA team",
        "Flagged for retraining",
        None,
    ],
})

result = client.log_annotations(df, project_name="My Tracing Project")
print(f"Annotations created: {result.annotationsCreated}")
print(f"Annotations updated: {result.annotationsUpdated}")
print(f"Notes created: {result.notesCreated}")
print(f"Errors: {result.errorsCount}")
```

Expected output:

```
Annotations created: 9
Annotations updated: 0
Notes created: 2
Errors: 0
```

Note
The context.span_id values must correspond to spans already recorded in a tracing project. The label names (e.g., sentiment, relevance) must match annotation labels defined in that project. Use client.get_labels(project_id=...) to list available labels.
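Because a misnamed column will fail to map to a label, it can help to check the `annotation.{label_name}.{type}` convention before calling `log_annotations()`. A minimal sketch — the regex and helper below are illustrative, not part of the SDK:

```python
import re

# Value types from the column convention described above
VALID_TYPES = {"label", "score", "text", "rating", "thumbs"}
ANNOTATION_RE = re.compile(r"^annotation\.(?P<label>\w+)\.(?P<type>\w+)$")

def check_annotation_columns(columns):
    """Return (parsed, problems): parsed maps column -> (label, type)."""
    parsed, problems = {}, []
    for col in columns:
        if not col.startswith("annotation."):
            continue  # e.g. context.span_id passes through untouched
        if col == "annotation.notes":
            continue  # the bare notes column has no {label}.{type} suffix
        m = ANNOTATION_RE.match(col)
        if m and m.group("type") in VALID_TYPES:
            parsed[col] = (m.group("label"), m.group("type"))
        else:
            problems.append(col)
    return parsed, problems
```

Running this over `df.columns` before the upload surfaces typos early instead of after a round-trip to the API.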
List available labels and projects
Before logging annotations programmatically, verify which labels and projects exist.
```python
import os

from fi.annotations import Annotation

client = Annotation(
    fi_api_key=os.environ["FI_API_KEY"],
    fi_secret_key=os.environ["FI_SECRET_KEY"],
)

# List projects
projects = client.list_projects()
for p in projects:
    print(f" {p.name} (id: {p.id}, type: {p.project_type})")

# List annotation labels for a specific project
labels = client.get_labels(project_id=projects[0].id)
for label in labels:
    print(f" {label.name}: type {label.type} (id: {label.id})")
```

Expected output:

```
 My Tracing Project (id: proj_abc123, type: observe)
 sentiment: type categorical (id: lbl_001)
 relevance: type numeric (id: lbl_002)
 reviewer_notes: type text (id: lbl_003)
```

What you built
You can now create annotation views, define labels, assign annotators, and log annotations programmatically using the FutureAGI SDK.
- Created an annotation view with categorical, numeric, and text labels in the dashboard
- Enabled auto-annotation for categorical labels to speed up large-dataset labeling
- Assigned annotators and reviewed rows in the annotation interface
- Logged annotations programmatically via Annotation.log_annotations() with a pandas DataFrame
- Listed projects and annotation labels using the SDK