Future AGI Datasets: Evaluation and Experimentation Layer

Structured tables of examples for prompts, evaluations, and experiments. Create from file uploads, SDK, production traces, or synthetic generation.

About

Datasets are the core data layer for evaluation and experimentation in Future AGI. Each dataset is a table with columns (e.g. “user query”, “expected answer”, “score”), rows (one row per example), and cells (the value in each column for each row).
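To make the table model concrete, here is a minimal sketch in plain Python. It does not use the Future AGI SDK. The sample values and the `cell` helper are made up for illustration; only the column names come from the example above.

```python
# Illustrative only: a plain-Python picture of a dataset table,
# not the Future AGI SDK. The sample values are invented.
columns = ["user query", "expected answer", "score"]

# One dict per row; each key/value pair is one cell.
rows = [
    {"user query": "What is the capital of France?", "expected answer": "Paris", "score": 1.0},
    {"user query": "What is 2 + 2?", "expected answer": "4", "score": 0.5},
]


def cell(rows, row_index, column):
    """Return the value stored in one column of one row (hypothetical helper)."""
    return rows[row_index][column]


print(cell(rows, 0, "expected answer"))
```

Each row carries the same set of columns, so any tool that reads the table can rely on a fixed schema.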

Datasets are the single source of truth that prompts, evaluations, experiments, and optimizations run on. You can create them from file uploads, the SDK, observed production traces, or synthetic generation.

Column Types

Datasets support two types of columns:

Static columns: Data you add directly, whether manually, via file upload, or through the SDK. These hold your inputs, expected outputs, ground truth labels, or any other fixed data.
Dynamic columns: Generated on the fly by running a prompt, evaluation, or model against your dataset rows. For example, running GPT-4o on every row creates a dynamic column containing the model’s responses.

This distinction matters because dynamic columns let you add model outputs, evaluation scores, and computed fields to your dataset without duplicating data.
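The difference between the two column types can be sketched in plain Python. This is an assumption-laden illustration, not the platform's implementation: `add_dynamic_column` and `fake_model` are hypothetical names, and `fake_model` is a stand-in for a real model call.

```python
from typing import Callable

Row = dict[str, object]


def add_dynamic_column(rows: list[Row], name: str, generate: Callable[[Row], object]) -> None:
    """Fill a new column in place by running `generate` on every row (hypothetical helper)."""
    for row in rows:
        row[name] = generate(row)


def fake_model(row: Row) -> str:
    """Stand-in for a model call: upper-cases the query."""
    return str(row["user query"]).upper()


# Static columns: values supplied directly.
rows: list[Row] = [
    {"user query": "hello", "expected answer": "HELLO"},
    {"user query": "bye", "expected answer": "goodbye"},
]

# Dynamic columns: derived from existing cells, row by row.
add_dynamic_column(rows, "model response", fake_model)
add_dynamic_column(rows, "exact match", lambda r: r["model response"] == r["expected answer"])
```

The static columns are never copied or modified; each dynamic column is appended alongside them, which is how outputs and scores can accumulate on one table.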

How Datasets Connect to Other Features

Evaluation: Run 70+ built-in metrics across your dataset rows to score model outputs. Results are stored as new columns.
Experiments: Compare two prompts or models by running both against the same dataset and viewing their scores side by side.
Optimization: Use datasets as the training ground for prompt optimization algorithms.
Observe: Build datasets from production traces to test against real user queries.
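The idea behind evaluations and experiments, scoring several variants against the same rows, can be sketched as follows. This is a simplified illustration in plain Python, not the Future AGI API: `run_experiment`, `exact_match`, and the two toy variants are hypothetical, and the variants are ordinary functions standing in for prompts or models.

```python
def exact_match(output: str, expected: str) -> float:
    """Toy metric: 1.0 when output equals expected (ignoring case and outer spaces), else 0.0."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_experiment(rows, variants, metric):
    """Return the mean metric score for each variant over the same rows."""
    results = {}
    for name, generate in variants.items():
        scores = [metric(generate(row["user query"]), row["expected answer"]) for row in rows]
        results[name] = sum(scores) / len(scores)
    return results


rows = [
    {"user query": "2 + 2", "expected answer": "4"},
    {"user query": "3 + 5", "expected answer": "8"},
]

# Stand-ins for two prompts or models being compared.
variants = {
    "variant_a": lambda q: str(sum(int(t) for t in q.split("+"))),  # actually adds
    "variant_b": lambda q: "4",                                     # always answers "4"
}

results = run_experiment(rows, variants, exact_match)
print(results)
```

Because every variant sees identical rows and the same metric, differences in the averaged scores can be attributed to the variants themselves rather than to the data.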

Next Steps

Understanding Datasets: Deeper dive into dataset concepts, column types, and best practices
Generate Synthetic Data: Create realistic datasets from scratch when real data is unavailable
Import from HuggingFace: Bring existing HuggingFace datasets into Future AGI
