Installation

First, install the Future AGI Python client:

pip install futureagi

Then export your API credentials as environment variables so the client can read them:

export FI_API_KEY="your_api_key"
export FI_SECRET_KEY="your_secret_key"
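
Before creating a client, it can help to confirm that both variables are visible to the Python process. The following is a minimal sketch using only the standard library; the helper name `missing_credentials` is hypothetical and not part of the SDK.

```python
import os

# Variable names taken from the export commands above.
REQUIRED_VARS = ("FI_API_KEY", "FI_SECRET_KEY")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; not part of the Future AGI SDK.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Credentials found in the environment.")
```

If anything is reported missing, either export it in the shell or pass the value explicitly when constructing the client.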

Creating Datasets

1. Initialize the Client

from fi.datasets import DatasetClient, DatasetConfig, HuggingfaceDatasetConfig
from fi.datasets.types import ModelTypes

# Initialize with dataset configuration
dataset_config = DatasetConfig(
    name="my_dataset",
    model_type=ModelTypes.GENERATIVE_LLM
)

dataset_client = DatasetClient(
    fi_api_key="your_api_key",                # Optional if FI_API_KEY is exported
    fi_secret_key="your_secret_key",          # Optional if FI_SECRET_KEY is exported
    fi_base_url="https://api.futureagi.com",  # Optional: not required if set in the environment
    dataset_config=dataset_config
)

2. Create Empty Dataset

# Configure and create an empty dataset
dataset_config = DatasetConfig(
    name="my_dataset",
    model_type=ModelTypes.GENERATIVE_LLM
)

# Create an empty dataset (returns a DatasetClient instance)
dataset_client = DatasetClient.create_dataset(dataset_config=dataset_config)

# Operations can be chained: download the dataset to a file, then delete it
dataset_client.download(file_path="output.csv").delete()

3. Create Dataset from File

Supported file formats: CSV, Excel (.xlsx, .xls), JSON, and JSONL.

# Create dataset config
dataset_config = DatasetConfig(
    name="file_dataset",
    model_type=ModelTypes.GENERATIVE_LLM
)

# Create dataset using chained operations
dataset_client = (DatasetClient(dataset_config=dataset_config)
    .create(source="path/to/your/data.csv")
    .download(file_path="output.csv"))

# Delete when done
dataset_client.delete()
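
Since only the file formats listed above are accepted, a quick local check of the file extension can catch mistakes before an upload is attempted. This is a rough sketch: `is_supported_source` is a hypothetical helper, and the extension set is an assumption derived from the listed formats (the SDK may validate files differently).

```python
from pathlib import Path

# Extensions assumed from the supported formats: CSV, Excel, JSON, JSONL.
SUPPORTED_EXTENSIONS = {".csv", ".xlsx", ".xls", ".json", ".jsonl"}


def is_supported_source(path):
    """Return True if the file's final extension matches a supported format.

    Hypothetical helper for illustration; not part of the Future AGI SDK.
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

Only the final extension is checked, so a compressed file such as `data.csv.gz` would be rejected by this sketch.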

4. Create Dataset from Hugging Face

# Create dataset configs
fi_dataset_config = DatasetConfig(
    name="huggingface_dataset",
    model_type=ModelTypes.GENERATIVE_LLM
)

huggingface_dataset_config = HuggingfaceDatasetConfig(
    name="dataset_name",  # Required: Hugging Face dataset name
    split="train",        # Optional: specify dataset split
    num_rows=1000         # Optional: limit number of rows
)

# Create and chain operations
dataset_client = (DatasetClient.create_dataset(
    dataset_config=fi_dataset_config,
    source=huggingface_dataset_config
)
.download(file_path="output.csv"))

# Delete when done
dataset_client.delete()

Working with Datasets by Name

Download or Delete a Dataset by Name

# Download the dataset, optionally saving it to a file and/or returning a pandas DataFrame
dataset_df = DatasetClient.download_dataset(
    dataset_name="my_dataset",
    file_path="output.csv",  # Optional: save to file
    load_to_pandas=True      # Optional: return as pandas DataFrame
)

# Delete a dataset
DatasetClient.delete_dataset(dataset_name="my_dataset")
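
When a dataset is saved to a file rather than loaded into pandas, the result can be inspected with the standard library alone. The sketch below assumes the download produced a UTF-8 CSV with a header row; `preview_csv` is a hypothetical helper, not an SDK function.

```python
import csv


def preview_csv(path, limit=5):
    """Return the header and up to `limit` data rows from a CSV file.

    Hypothetical helper; assumes a UTF-8 file whose first row is the header.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        # zip stops once `limit` rows have been taken from the reader.
        rows = [row for _, row in zip(range(limit), reader)]
    return header, rows
```

For example, `preview_csv("output.csv", limit=3)` would return the column names and the first three rows as lists of strings.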

Data Types

The SDK automatically infers each column's data type by analyzing the provided data. The following types are supported:

  • Text
  • Boolean
  • Integer
  • Float
  • JSON
  • Array
  • Image
  • Datetime
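
To give a feel for what value-based type detection can look like, here is an illustrative sketch that maps a single Python value to one of the type names above. It is not the SDK's actual detection logic, which is not described here; `infer_type` is a hypothetical function, the rules are assumptions, and Image detection is left out because it depends on how images are represented.

```python
from datetime import datetime


def infer_type(value):
    """Guess a column type label for a single value (illustrative only)."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "Boolean"
    if isinstance(value, int):
        return "Integer"
    if isinstance(value, float):
        return "Float"
    if isinstance(value, datetime):
        return "Datetime"
    if isinstance(value, list):
        return "Array"
    if isinstance(value, dict):
        return "JSON"
    if isinstance(value, str):
        # Assumption: ISO 8601 strings are treated as datetimes.
        try:
            datetime.fromisoformat(value)
            return "Datetime"
        except ValueError:
            return "Text"
    # Anything unrecognized falls back to Text in this sketch.
    return "Text"
```

A real implementation would typically look at many values per column rather than one, and would need a rule for resolving columns with mixed types.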