Annotate Datasets with Human-in-the-Loop Workflows
Create annotation views with categorical, numeric, and text labels, assign annotators, and log annotations programmatically using the FutureAGI SDK.
| Time | Difficulty | Package |
|---|---|---|
| 15 min | Intermediate | futureagi |
Prerequisites

- FutureAGI account → app.futureagi.com
- API keys: FI_API_KEY and FI_SECRET_KEY (see Get your API keys)
- Python 3.9+
- A dataset with at least a few rows (see Dataset Management to create one)
Install

```bash
pip install futureagi pandas

export FI_API_KEY="your-api-key"
export FI_SECRET_KEY="your-secret-key"
```
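Before running the examples below, it can be worth confirming both environment variables are actually set. A small standalone check (nothing SDK-specific):

```python
import os

REQUIRED_KEYS = ("FI_API_KEY", "FI_SECRET_KEY")

def missing_keys(env=None):
    """Return the names of required credentials that are unset or empty."""
    env = os.environ if env is None else env
    return [k for k in REQUIRED_KEYS if not env.get(k)]

# An empty list means you are ready to go
print(missing_keys() or "Credentials found.")
```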
What are annotations?
Annotations attach human judgments (labels, scores, free-text feedback) to dataset rows or traced spans. They close the feedback loop between automated evals and human review, letting you catch hallucinations, rate quality, and build gold-standard evaluation sets.
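Conceptually, each annotation ties one human judgment to one target. The sketch below is a plain-Python illustration of that shape, not an SDK type; all field names here are hypothetical:

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class AnnotationRecord:
    """Illustrative model of a single annotation (not an SDK class)."""
    target_id: str                  # dataset row or traced span id
    label: str                      # e.g. "sentiment", "relevance"
    value: Union[str, float, bool]  # category, score, or free text
    annotator: str                  # who applied the label

note = AnnotationRecord("span_abc123", "relevance", 4.5, "qa@example.com")
print(note.label, note.value)
```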
Tutorial
Open a dataset and go to the Annotations tab
Navigate to the dataset you want to annotate and switch to the annotation interface.
- Go to app.futureagi.com → Dataset (left sidebar under BUILD)
- Click the name of the dataset you want to annotate
- Click the Annotations tab at the top of the data table
Create an annotation view and define labels
An annotation view groups labels together and maps them to the columns annotators will see.
- Click Create New View
- Give the view a descriptive name (e.g., “Response Quality Review”)
Static Fields
Select the column(s) that provide read-only context to annotators (e.g., user_query, context). Annotators see these as reference material alongside the response but cannot edit them.
Response Fields
Select the column(s) containing the model output you want annotators to evaluate (e.g., response). This is the primary field annotators will judge and label.
Labels
- Click New Label for each annotation type you need. Configure the following fields for each label:
| Field | Description |
|---|---|
| Name | A clear label name (e.g., “Sentiment”, “Relevance Score”, “Reviewer Notes”) |
| Annotation Type | The input type: Categorical (predefined categories), Numeric (score on a scale), or Text (free-form feedback) |
| Description | Optional description to guide annotators on what the label means and how to apply it |
| Display Option | How the label renders in the annotation interface (e.g., dropdown or radio buttons for categorical; slider or number input for numeric) |
| Min / Max Value | For Numeric labels only — the lower and upper bounds of the score range (e.g., min: 1, max: 5) |
For this guide, create three labels:
| Label name | Annotation Type | Description | Min | Max |
|---|---|---|---|---|
| Sentiment | Categorical | Overall tone of the response | — | — |
| Relevance Score | Numeric | How well the response addresses the query | 1 | 5 |
| Reviewer Notes | Text | Free-form feedback or corrections | — | — |
For the Sentiment categorical label, define categories: “Positive”, “Negative”, “Neutral”.
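If you plan to bulk-upload annotations later, the label definitions above can be mirrored in plain Python and used to validate values client-side before sending them. This is not an SDK feature, just an illustrative helper that matches the table above:

```python
# Plain-Python mirror of the three labels defined above (not an SDK structure)
LABELS = {
    "Sentiment": {"type": "categorical", "categories": {"Positive", "Negative", "Neutral"}},
    "Relevance Score": {"type": "numeric", "min": 1, "max": 5},
    "Reviewer Notes": {"type": "text"},
}

def is_valid(label_name, value):
    """Return True if `value` is acceptable for the named label."""
    spec = LABELS[label_name]
    if spec["type"] == "categorical":
        return value in spec["categories"]
    if spec["type"] == "numeric":
        return isinstance(value, (int, float)) and spec["min"] <= value <= spec["max"]
    return isinstance(value, str)  # text labels accept any string
```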
Tip
For Categorical labels, enable Auto-Annotation during label creation. The platform learns from your initial manual annotations and automatically suggests labels for remaining rows — you can review, accept, or override every suggestion.
Annotators
- In the Annotators section of the view, add workspace members who should contribute annotations. Each assigned annotator can open the view and apply labels to dataset rows.
- Click Save to finalize the view.
Assign annotators and annotate rows
- In the Annotation View settings, find the Annotators section
- Add workspace members who should contribute annotations
- Each annotator opens the view, navigates through rows, and applies labels
Annotators see the static fields as read-only context and the response fields alongside the label inputs. Changes save automatically.
Log annotations programmatically with the SDK
For bulk annotation or CI pipelines, use Annotation.log_annotations() to push annotations via a pandas DataFrame. Each row references a traced span by its context.span_id.
```python
import os

import pandas as pd

from fi.annotations import Annotation

client = Annotation(
    fi_api_key=os.environ["FI_API_KEY"],
    fi_secret_key=os.environ["FI_SECRET_KEY"],
)

# Build a DataFrame with annotation columns.
# Column format: annotation.{label_name}.{type}
# Types: label (categorical), score (numeric), text, rating (1-5 stars), thumbs (True/False)
df = pd.DataFrame({
    "context.span_id": ["span_abc123", "span_def456", "span_ghi789"],
    "annotation.sentiment.label": ["Positive", "Negative", "Neutral"],
    "annotation.relevance.score": [4.5, 2.0, 3.5],
    "annotation.reviewer_notes.text": [
        "Accurate and well-structured response",
        "Hallucinated a date that wasn't in the context",
        "Correct but could be more concise",
    ],
    # A bare "annotation.notes" column attaches free-form notes; None rows create no note
    "annotation.notes": [
        "Reviewed by QA team",
        "Flagged for retraining",
        None,
    ],
})

result = client.log_annotations(df, project_name="My Tracing Project")
print(f"Annotations created: {result.annotationsCreated}")
print(f"Annotations updated: {result.annotationsUpdated}")
print(f"Notes created: {result.notesCreated}")
print(f"Errors: {result.errorsCount}")
```

Expected output:

```
Annotations created: 9
Annotations updated: 0
Notes created: 2
Errors: 0
```

Note
The context.span_id values must correspond to spans already recorded in a tracing project. The label names (e.g., sentiment, relevance) must match annotation labels defined in that project. Use client.get_labels(project_id=...) to list available labels.
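Because a misnamed column will fail to map to a label, it can help to check the `annotation.{label_name}.{type}` convention before calling `log_annotations()`. A minimal sketch — the regex and helper below are illustrative, not part of the SDK:

```python
import re

# Value types from the column convention described above
VALID_TYPES = {"label", "score", "text", "rating", "thumbs"}
ANNOTATION_RE = re.compile(r"^annotation\.(?P<label>\w+)\.(?P<type>\w+)$")

def check_annotation_columns(columns):
    """Return (parsed, problems): parsed maps column -> (label, type)."""
    parsed, problems = {}, []
    for col in columns:
        if not col.startswith("annotation."):
            continue  # e.g. context.span_id passes through untouched
        if col == "annotation.notes":
            continue  # the bare notes column has no {label}.{type} suffix
        m = ANNOTATION_RE.match(col)
        if m and m.group("type") in VALID_TYPES:
            parsed[col] = (m.group("label"), m.group("type"))
        else:
            problems.append(col)
    return parsed, problems
```

Running this over `df.columns` before the upload surfaces typos early instead of after a round-trip to the API.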
List available labels and projects
Before logging annotations programmatically, verify which labels and projects exist.
```python
import os

from fi.annotations import Annotation

client = Annotation(
    fi_api_key=os.environ["FI_API_KEY"],
    fi_secret_key=os.environ["FI_SECRET_KEY"],
)

# List projects
projects = client.list_projects()
for p in projects:
    print(f" {p.name} (id: {p.id}, type: {p.project_type})")

# List annotation labels for a specific project
labels = client.get_labels(project_id=projects[0].id)
for label in labels:
    print(f" {label.name}: type {label.type} (id: {label.id})")
```

Expected output:

```
 My Tracing Project (id: proj_abc123, type: observe)
 sentiment: type categorical (id: lbl_001)
 relevance: type numeric (id: lbl_002)
 reviewer_notes: type text (id: lbl_003)
```

What you built
You can now create annotation views, define labels, assign annotators, and log annotations programmatically using the FutureAGI SDK.
- Created an annotation view with categorical, numeric, and text labels in the dashboard
- Enabled auto-annotation for categorical labels to speed up large-dataset labeling
- Assigned annotators and reviewed rows in the annotation interface
- Logged annotations programmatically via Annotation.log_annotations() with a pandas DataFrame
- Listed projects and annotation labels using the SDK