How to Run Experiments
Learn how to set up and run experiments in the Future AGI platform
This guide walks you through the complete process of setting up and running an experiment in the Future AGI platform. You will learn how to execute experiments, analyze results, compare different configurations, and select the best-performing setup based on structured evaluation.
Building a Dataset
Before running any experiments, make sure you have a well-structured dataset. This dataset provides the necessary input information, allowing the model to generate responses that can later be evaluated.
After the dataset is available, verify that its structure is correct by inspecting the table in the dashboard and ensuring all fields are appropriately populated.
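For illustration only, a minimal dataset could be assembled as follows before uploading. The column names `question` and `reference_answer` are hypothetical stand-ins for whatever fields your prompt template and evaluations will reference.

```python
# Minimal illustrative dataset. The column names ("question",
# "reference_answer") are hypothetical; use whatever fields your
# prompt template and evaluations expect.
import pandas as pd

rows = [
    {"question": "What is the capital of France?", "reference_answer": "Paris"},
    {"question": "Who wrote 'Dune'?", "reference_answer": "Frank Herbert"},
]

df = pd.DataFrame(rows)
df.to_csv("experiment_dataset.csv", index=False)  # file to upload to the platform
print(df)
```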
Creating an Experiment
- Navigate to the Experiments tab within the dataset view
- Click "Create Experiment" to initiate the setup
- Assign a name to the experiment for easy identification
- Select the dataset that will serve as input for testing
Configuring the Experiment
Input Source
- Select the column in the dataset that contains the input text for the model
- This column provides the context for the experiment and determines how the model will generate responses
Model Selection
Choose the LLM that will process the input, then adjust key parameters to control how it generates responses (an illustrative configuration follows the list below):
- Temperature - Controls randomness; lower values produce more deterministic outputs
- Top P - Regulates sampling diversity by restricting the token probability mass
- Max Tokens - Defines the maximum response length
- Presence & Frequency Penalty - Adjusts token repetition behavior
- Response Format - Specifies the expected structure of the output
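As a rough sketch of how these settings fit together, the snippet below collects them into a single configuration. The parameter names follow common LLM provider conventions, and the model name and values are hypothetical examples, not recommendations or a Future AGI API.

```python
# Illustrative generation settings; parameter names follow common LLM
# provider conventions, and all values here are examples only.
model_config = {
    "model": "gpt-4o",                # hypothetical model choice
    "temperature": 0.2,               # lower -> more deterministic output
    "top_p": 0.9,                     # sample only from the top 90% probability mass
    "max_tokens": 512,                # hard cap on response length
    "presence_penalty": 0.0,          # > 0 discourages reusing tokens already present
    "frequency_penalty": 0.3,         # > 0 discourages frequently repeated tokens
    "response_format": {"type": "json_object"},  # request structured output
}
```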
Prompt Template
- Define the prompt template that will be used during inference
- Use placeholders such as `{{variable}}` to inject dataset column values (see the sketch after this list)
- Ensure the prompt aligns with your experiment goals
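To make the placeholder mechanics concrete, here is a minimal sketch of `{{variable}}` substitution. The platform performs this step for you at inference time; the template and column name below are hypothetical.

```python
# Minimal sketch of {{variable}} substitution. The platform does this
# automatically; shown here only to illustrate the mechanics.
import re

template = "Answer concisely.\n\nQuestion: {{question}}"
row = {"question": "What is the capital of France?"}  # one dataset row

def render(template: str, row: dict) -> str:
    """Replace each {{column}} placeholder with that column's value."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(row[m.group(1)]), template)

print(render(template, row))
# Answer concisely.
#
# Question: What is the capital of France?
```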
Evaluation Metrics
You can either:
- Create new evaluation metrics tailored to the experiment
- Use saved evaluations from previous experiments
Learn more about evaluations →
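Whichever route you take, an evaluation metric ultimately boils down to a function that maps a model response (and usually a reference or the input) to a score. The sketch below is a hypothetical exact-match metric, not one of the platform's built-in evaluators.

```python
# Hypothetical custom metric: exact-match accuracy against a reference
# answer. Real evaluators are configured in the platform UI; this only
# illustrates the general shape (response + reference -> score in [0, 1]).
def exact_match(response: str, reference: str) -> float:
    """Return 1.0 when the normalized response equals the reference."""
    return float(response.strip().lower() == reference.strip().lower())

print(exact_match("  Paris ", "paris"))  # 1.0
print(exact_match("Lyon", "Paris"))      # 0.0
```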
Running the Experiment
Once configured:
- Review all settings to ensure alignment with objectives
- Click "Save and Run" to begin
- Monitor progress in the Summary tab
The system will process the dataset through the configured model, applying the defined prompt structure and evaluation criteria.
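Conceptually, each run is a loop over dataset rows: render the prompt, call the model, score the response. The sketch below restates that loop with stubs; `call_model` is a stand-in for a real LLM provider client, not a platform API.

```python
# Conceptual view of what an experiment run does per row. All functions
# are stubs or repeats of earlier sketches; call_model stands in for a
# real LLM client and is NOT a platform API.
import re

def render(template, row):
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(row[m.group(1)]), template)

def call_model(prompt, **params):
    return "Paris"  # stub; a real run would call your chosen LLM here

def exact_match(response, reference):
    return float(response.strip().lower() == reference.strip().lower())

template = "Question: {{question}}"
rows = [{"question": "What is the capital of France?", "reference_answer": "Paris"}]

results = []
for row in rows:
    prompt = render(template, row)
    response = call_model(prompt, temperature=0.2)
    results.append({"response": response,
                    "score": exact_match(response, row["reference_answer"])})
print(results)  # [{'response': 'Paris', 'score': 1.0}]
```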
Choosing the Best Prompt
Accessing Results
- Navigate to the Experiments tab and select the completed experiment
- View detailed performance metrics in the Summary tab
- Compare response time, token usage, accuracy, and quality scores
Selecting the Winner
- Click "Choose Winner" in the summary view
- Adjust metric weights based on your priorities
- Confirm your selection
The winning configuration is then identified as the optimal choice for deployment and for future iterations in production.
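Conceptually, weighted winner selection reduces to a weighted sum of metric scores per configuration. The sketch below uses hypothetical metric names, scores, and weights; the platform computes this for you when you adjust the weights.

```python
# Hypothetical weighted scoring across two candidate configurations.
# Metric names, scores, and weights are illustrative; the platform
# computes this for you when you adjust weights in "Choose Winner".
candidates = {
    "prompt_v1": {"accuracy": 0.82, "quality": 0.75, "latency_score": 0.90},
    "prompt_v2": {"accuracy": 0.88, "quality": 0.80, "latency_score": 0.70},
}
weights = {"accuracy": 0.5, "quality": 0.3, "latency_score": 0.2}  # sums to 1

def weighted_score(metrics: dict) -> float:
    return sum(weights[name] * metrics[name] for name in weights)

winner = max(candidates, key=lambda name: weighted_score(candidates[name]))
print(winner, weighted_score(candidates[winner]))  # prompt_v2 (~0.82 vs ~0.815)
```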