Dataset
Use FutureAGI Dataset to create and manage your datasets
You can check out the Colab notebook to get started quickly with the FutureAGI Dataset.
Installing FutureAGI SDK
pip install futureagi
Initializing FutureAGI Dataset
from fi.datasets import Dataset
# Optional: pass the API key and secret key manually
dataset = Dataset(
    fi_api_key="<your_api_key>",
    fi_secret_key="<your_api_secret>",
)
Tip
Click here to learn how to access your API keys. It’s recommended to set the API key and secret key as environment variables.
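As a minimal sketch of the environment-variable approach, the helper below reads both keys from the environment and fails with a clear message if either is missing. The variable names FI_API_KEY and FI_SECRET_KEY and the helper load_credentials are assumptions made for illustration; confirm the names the SDK actually reads before relying on them.

```python
import os

# Assumed variable names; confirm them against the FutureAGI SDK documentation.
API_KEY_VAR = "FI_API_KEY"
SECRET_KEY_VAR = "FI_SECRET_KEY"


def load_credentials(env=None):
    """Return (api_key, secret_key) from the environment, or raise if either is missing."""
    env = os.environ if env is None else env
    missing = [name for name in (API_KEY_VAR, SECRET_KEY_VAR) if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env[API_KEY_VAR], env[SECRET_KEY_VAR]
```

The returned pair can then be passed to Dataset(fi_api_key=..., fi_secret_key=...) as in the initialization example, which keeps the secrets out of source files.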
Create a Dataset
from fi.datasets import Dataset, DatasetConfig, ModelTypes
from fi.datasets.models import Column, Row, Cell, DataTypeChoices, SourceChoices
import uuid
# Create a dataset configuration
config = DatasetConfig(
    id=None,  # Will be set by the server
    name="my_dataset",  # Choose a unique name
    model_type=ModelTypes.GENERATIVE_LLM,
)
# Initialize and create the dataset
dataset = Dataset(dataset_config=config)
dataset = dataset.create()
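Because the dataset name should be unique, one possible way to avoid reusing a name across repeated runs is to append a short random suffix. The helper unique_dataset_name below is hypothetical and not part of the SDK; it relies only on the standard library.

```python
import uuid


def unique_dataset_name(prefix="my_dataset"):
    """Append a short random suffix so repeated runs do not reuse a name."""
    return f"{prefix}_{uuid.uuid4().hex[:8]}"


name = unique_dataset_name()
print(name)
```

The result can be passed as the name argument of DatasetConfig in place of the fixed string.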
Add Columns to Dataset
# Define columns
columns = [
    Column(
        name="Name",
        data_type=DataTypeChoices.TEXT,
        source=SourceChoices.OTHERS,
        source_id=None,
    ),
    Column(
        name="Age",
        data_type=DataTypeChoices.INTEGER,
        source=SourceChoices.OTHERS,
        source_id=None,
    ),
    Column(
        name="AUDIO_URLS",
        data_type=DataTypeChoices.AUDIO,
        source=SourceChoices.OTHERS,
        source_id=None,
    ),
]
# Add columns to dataset
dataset = dataset.add_columns(columns=columns)
Add Rows to Dataset
# Define rows with cells
rows = [
    Row(
        order=1,
        cells=[
            Cell(column_name="Name", value="Alice"),
            Cell(column_name="Age", value=25),
            Cell(column_name="AUDIO_URLS", value="https://example.com/audio1.mp3"),
        ],
    ),
    Row(
        order=2,
        cells=[
            Cell(column_name="Name", value="Bob"),
            Cell(column_name="Age", value=30),
            Cell(column_name="AUDIO_URLS", value="https://example.com/audio2.mp3"),
        ],
    ),
]
# Add rows to dataset
dataset = dataset.add_rows(rows=rows)
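When the data already exists as a list of dictionaries, writing each Row and Cell by hand becomes repetitive. The sketch below builds the rows in a loop and rejects keys that do not match a known column. To stay runnable without the SDK, it defines stand-in Row and Cell dataclasses that mirror only the fields used above (order, cells, column_name, value); the helper rows_from_records is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any


# Stand-ins that mirror only the fields used above; in real code,
# import Row and Cell from fi.datasets.models instead.
@dataclass
class Cell:
    column_name: str
    value: Any


@dataclass
class Row:
    order: int
    cells: list = field(default_factory=list)


def rows_from_records(records, column_names, start=1):
    """Turn a list of dicts into Row objects, one Cell per known column."""
    rows = []
    for order, record in enumerate(records, start=start):
        unknown = set(record) - set(column_names)
        if unknown:
            raise ValueError(f"Row {order} has unknown columns: {sorted(unknown)}")
        cells = [
            Cell(column_name=name, value=record[name])
            for name in column_names
            if name in record
        ]
        rows.append(Row(order=order, cells=cells))
    return rows


records = [
    {"Name": "Alice", "Age": 25, "AUDIO_URLS": "https://example.com/audio1.mp3"},
    {"Name": "Bob", "Age": 30, "AUDIO_URLS": "https://example.com/audio2.mp3"},
]
rows = rows_from_records(records, ["Name", "Age", "AUDIO_URLS"])
```

With the real Row and Cell classes imported in place of the stand-ins, the resulting list has the same shape as the hand-written rows and could be passed to dataset.add_rows(rows=rows).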
Download Dataset
# Download dataset to a CSV file
file_path = "my_dataset.csv"
dataset.download(file_path=file_path)
# Read the downloaded file
with open(file_path, "r") as file:
    content = file.read()
    print(content)
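Printing the raw text is enough for a quick check, but a parsed form is usually easier to work with. The sketch below reads a CSV into a list of dictionaries using the standard csv module. The sample file stands in for the downloaded one, and its header row and column layout are assumptions rather than the SDK's documented output; read_records is a hypothetical helper.

```python
import csv
import os
import tempfile

# Stand-in for the file written by dataset.download(); the header row and
# column layout here are assumptions, not the SDK's documented output.
sample = (
    "Name,Age,AUDIO_URLS\n"
    "Alice,25,https://example.com/audio1.mp3\n"
    "Bob,30,https://example.com/audio2.mp3\n"
)


def read_records(path):
    """Read a CSV file with a header row into a list of dicts."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "my_dataset.csv")
    with open(sample_path, "w", newline="", encoding="utf-8") as handle:
        handle.write(sample)
    records = read_records(sample_path)

for record in records:
    print(record["Name"], record["Age"])
```

Note that csv.DictReader returns every value as a string, so numeric columns such as Age need an explicit conversion.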
Delete Dataset
# Delete the dataset
dataset.delete()
Tip
Remember to delete the downloaded file once you are done with it:
import os
if os.path.exists(file_path):
    os.remove(file_path)
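An alternative to manual cleanup is to download into a temporary directory that Python removes automatically, even if reading the file raises an error. The sketch below assumes the download callable accepts a file_path keyword, as dataset.download does above; with_temporary_download and fake_download are hypothetical names, and fake_download exists only so the example runs without the SDK.

```python
import os
import tempfile


def with_temporary_download(download, filename="my_dataset.csv"):
    """Call download(file_path=...) into a temp directory and return the file's text.

    The directory and the file are removed when the function returns,
    even if reading fails.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        file_path = os.path.join(tmp_dir, filename)
        download(file_path=file_path)
        with open(file_path, "r", encoding="utf-8") as handle:
            return handle.read()


# Stand-in for dataset.download, used only to make the sketch runnable.
def fake_download(file_path):
    with open(file_path, "w", encoding="utf-8") as handle:
        handle.write("Name,Age\nAlice,25\n")


content = with_temporary_download(fake_download)
print(content)
```

In real use the call would be with_temporary_download(dataset.download), made before the dataset is deleted.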