How to Run Experiments
Learn how to set up and run experiments in the Future AGI platform
This guide walks you through the complete process of setting up and running an experiment in the Future AGI platform. You will learn how to execute experiments, analyze results, compare different configurations, and select the best-performing setup based on structured evaluation.
Building a Dataset
Before running any experiments, make sure you have a well-structured dataset. The dataset provides the input the model needs to generate responses, which can later be evaluated.
After the dataset is available, verify that the structure is correct by inspecting the table in the dashboard and ensuring all fields are appropriately populated.
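For illustration, a minimal dataset might look like the sketch below. The column names (`question`, `expected_answer`) and the CSV upload step are assumptions for this example, not a schema the platform requires:

```python
# Hypothetical dataset sketch; column names and the CSV format are
# illustrative assumptions, not a schema required by the platform.
import pandas as pd

dataset = pd.DataFrame(
    {
        # Input text the model will respond to
        "question": [
            "What is the capital of France?",
            "Summarize the plot of Hamlet in one sentence.",
        ],
        # Optional reference output that evaluation metrics can compare against
        "expected_answer": [
            "Paris",
            "A Danish prince avenges his father's murder at great cost.",
        ],
    }
)

# Export for upload; verify the resulting table in the dashboard afterwards
dataset.to_csv("experiment_dataset.csv", index=False)
```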
Creating an Experiment
- Navigate to the Experiments tab within the dataset view
- Click "Create Experiment" to initiate the setup
- Assign a name to the experiment for easy identification
- Select the dataset that will serve as input for testing
Configuring the Experiment
Input Source
- Select the column in the dataset that contains the input text for the model
- This column provides the context for the experiment and determines how the model will generate responses
Model Selection
Choose the LLM that will process the input. Adjust key parameters to control how the model generates responses (an illustrative configuration follows this list):
- Temperature - Controls randomness; lower values produce more deterministic outputs
- Top P - Restricts sampling to the smallest set of tokens whose cumulative probability mass exceeds P
- Max Tokens - Caps the maximum response length
- Presence & Frequency Penalty - Discourages repetition; the presence penalty applies once a token has appeared at all, while the frequency penalty scales with how often it appears
- Response Format - Specifies the expected structure of the output (e.g. plain text or JSON)
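These parameters mirror the sampling controls exposed by most LLM APIs. As a rough starting point, here is a sketch of such a configuration; the field names follow common API conventions and the values (including the model name) are arbitrary assumptions, not platform-specific recommendations:

```python
# Illustrative sampling configuration; field names follow common LLM API
# conventions and values are arbitrary starting points, not recommendations.
generation_config = {
    "model": "gpt-4o",         # hypothetical model choice
    "temperature": 0.2,        # low randomness -> more deterministic outputs
    "top_p": 0.9,              # sample from the top 90% of probability mass
    "max_tokens": 512,         # cap on response length
    "presence_penalty": 0.0,   # penalize tokens that have appeared at all
    "frequency_penalty": 0.3,  # penalize tokens by how often they appear
    "response_format": {"type": "json_object"},  # request structured output
}
```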
Prompt Template
- Define the prompt template that will be used during inference
- Use placeholders like `{{variable}}` to inject dataset column values, as shown in the sketch after this list
- Ensure the prompt aligns with your experiment goals
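As a minimal sketch of how placeholder injection works, assuming a `{{column}}` syntax and a dataset row represented as a dict (the template and helper below are illustrative, not platform code):

```python
import re

# Hypothetical template; {{question}} is replaced by the matching dataset column.
template = "You are a concise assistant.\n\nAnswer the following question:\n{{question}}"

def render(template: str, row: dict) -> str:
    """Substitute each {{column}} placeholder with the row's value."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(row[m.group(1)]), template)

print(render(template, {"question": "What is the capital of France?"}))
# You are a concise assistant.
#
# Answer the following question:
# What is the capital of France?
```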
Evaluation Metrics
You can either:
- Create new evaluation metrics tailored to the experiment (a conceptual sketch follows below)
- Use saved evaluations from previous experiments
Learn more about evaluations →
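Conceptually, an evaluation metric is just a function that maps a model response to a score. A hypothetical keyword-coverage metric, written here as plain Python rather than platform-specific code, might look like:

```python
def keyword_coverage(response: str, expected_keywords: list[str]) -> float:
    """Hypothetical metric: fraction of expected keywords present in the response."""
    if not expected_keywords:
        return 1.0
    hits = sum(1 for kw in expected_keywords if kw.lower() in response.lower())
    return hits / len(expected_keywords)

score = keyword_coverage("Paris is the capital of France.", ["Paris", "France"])
print(score)  # 1.0
```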
Running the Experiment
Once configured:
- Review all settings to ensure alignment with objectives
- Click "Save and Run" to begin
- Monitor progress in the Summary tab
The system will process the dataset through the configured model, applying the defined prompt structure and evaluation criteria.
Choosing the Best Prompt
Accessing Results
- Navigate to the Experiments tab and select the completed experiment
- View detailed performance metrics in the Summary tab
- Compare response time, token usage, accuracy, and quality scores
Selecting the Winner
- Click "Choose Winner" in the summary view
- Adjust metric weights based on your priorities (see the sketch after this section)
- Confirm your selection
The winning configuration is then marked as the preferred setup for deployment and for future iterations in production.
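Under the hood, this kind of winner selection reduces to a weighted average of metric scores. A minimal sketch, assuming normalized scores in [0, 1] and hypothetical configuration names and metrics:

```python
# Hypothetical configurations with normalized metric scores in [0, 1].
results = {
    "prompt_a": {"accuracy": 0.91, "quality": 0.80, "speed": 0.60},
    "prompt_b": {"accuracy": 0.85, "quality": 0.88, "speed": 0.75},
}

# Weights reflecting your priorities; they sum to 1 for easy interpretation.
weights = {"accuracy": 0.5, "quality": 0.3, "speed": 0.2}

def weighted_score(scores: dict[str, float]) -> float:
    """Combine a configuration's metric scores into one weighted value."""
    return sum(weights[metric] * scores[metric] for metric in weights)

winner = max(results, key=lambda name: weighted_score(results[name]))
print(winner, round(weighted_score(results[winner]), 3))  # prompt_b 0.839
```

Shifting the weights changes the outcome: with `speed` weighted heavily, a faster but slightly less accurate prompt could win instead.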