Import from Hugging Face
- Access to Diverse Datasets: Leverage high-quality, curated datasets for training and testing.
- Preprocessed Data: Many datasets are formatted and ready to use.
- Easy Integration: Datasets import directly into the platform without manual conversion.
Importing from Hugging Face lets you start experiments with an established dataset instead of preparing one by hand.
Steps to Import a Hugging Face Dataset
1. Access the Dataset Import Section
- Navigate to the Datasets & Experiments section.
- Click on “Add Dataset” to access dataset creation options.
- Select “Import from Hugging Face” from the available choices.
2. Browse and Select a Dataset
- The system presents a catalog of datasets sourced from Hugging Face.
- Each dataset includes key metadata such as:
  - Dataset Name (e.g., databricks-dolly-15k)
  - Source (e.g., OpenAI, Hugging Face, Microsoft)
  - Record Count
  - Usage popularity and other metadata
- Use the search functionality to locate a specific dataset.
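For readers who want to see roughly what a catalog search looks like outside the UI, here is a minimal sketch. It assumes the `huggingface_hub` package and its `list_datasets` call. How the platform builds its own catalog is not documented here, so the helper names `catalog_row` and `search_hub` are hypothetical and the fields are only an approximation of the metadata listed above. The record count is not part of the Hub listing, so it is left out.

```python
from types import SimpleNamespace


def catalog_row(info) -> dict:
    """Reduce a Hub dataset record to fields a catalog card might show (hypothetical layout)."""
    # Hub ids look like "owner/name"; ids without an owner have no slash.
    owner, _, name = info.id.rpartition("/")
    return {
        "name": name,
        "source": getattr(info, "author", None) or owner or None,
        "downloads": getattr(info, "downloads", None),  # rough popularity signal
    }


def search_hub(query: str, limit: int = 5) -> list:
    """Search the Hugging Face Hub by keyword (needs network access)."""
    from huggingface_hub import list_datasets  # pip install huggingface_hub

    return [catalog_row(info) for info in list_datasets(search=query, limit=limit)]


if __name__ == "__main__":
    # Offline demonstration with a stand-in record instead of a live Hub call.
    sample = SimpleNamespace(id="databricks/databricks-dolly-15k", author="databricks")
    print(catalog_row(sample))
```

Calling `search_hub("dolly")` returns up to five matching entries when network access is available.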
3. Configure Dataset Parameters
- Upon selection, a configuration panel appears displaying:
  - Dataset Overview: Summary, source, and dataset reference link.
  - Subset Selection: Options include Default, Train, or Split.
  - Number of Rows: Specify the number of records to be imported.
  - Additional Preferences: Optionally enable “Add selected rows” for precise filtering.
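The panel settings above correspond loosely to the arguments of `load_dataset` in the Hugging Face `datasets` library. The sketch below is an illustration under that assumption, not the platform's implementation. `ImportConfig` and `load_rows` are hypothetical names, and the mapping of "Subset Selection" onto the library's configuration name and split is a guess. The `train[:100]` slicing syntax is the library's own.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImportConfig:
    """Hypothetical mirror of the configuration panel fields."""

    dataset_id: str                 # Hub repository id, e.g. "databricks/databricks-dolly-15k"
    subset: Optional[str] = None    # configuration name; None selects the default
    split: str = "train"            # which split to read from
    num_rows: Optional[int] = None  # None imports every row in the split

    def split_spec(self) -> str:
        """Build a `datasets` split string such as 'train[:100]'."""
        if self.num_rows is None:
            return self.split
        if self.num_rows <= 0:
            raise ValueError("num_rows must be a positive integer")
        return f"{self.split}[:{self.num_rows}]"


def load_rows(config: ImportConfig):
    """Fetch the configured rows from the Hub (needs network access)."""
    from datasets import load_dataset  # pip install datasets

    return load_dataset(config.dataset_id, config.subset, split=config.split_spec())


if __name__ == "__main__":
    config = ImportConfig("databricks/databricks-dolly-15k", num_rows=100)
    print(config.split_spec())
```

Passing the same `config` to `load_rows` would download only the first 100 training rows, which is the closest library equivalent to setting Number of Rows in the panel.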
4. Initiate the Import Process
- Click “Start Experimenting” to begin importing the dataset.
- Once imported, the dataset appears in the Datasets & Experiments section, where you can process it further and use it in experiments.