This guide walks you through the complete process of setting up and running an experiment on the Future AGI platform. You will learn how to execute experiments, analyze results, compare configurations, and select the best-performing setup based on structured evaluation.

Building a Dataset 📊

Before running any experiments, ensure you have a well-structured dataset. The dataset provides the input the model needs to generate responses that can later be evaluated.

Learn more about datasets →

After the dataset is available, verify that the structure is correct by inspecting the table in the dashboard and ensuring all fields are appropriately populated.

Creating an Experiment 🔬

  1. Navigate to the Experiments tab within the dataset view
  2. Click "Create Experiment" to initiate the setup
  3. Assign a name to the experiment for easy identification
  4. Select the dataset that will serve as input for testing

Configuring the Experiment ⚙️

Input Source 📥

  • Select the column in the dataset that contains the input text for the model
  • This column provides the context for the experiment and determines how the model will generate responses

Model Selection 🤖

Choose the LLM that will process the input. Adjust key parameters to control how the model generates responses:

  • Temperature 🌡️ - Controls randomness; lower values produce more deterministic outputs
  • Top P 📊 - Regulates sampling diversity by restricting token probability mass
  • Max Tokens 📏 - Defines the maximum response length
  • Presence & Frequency Penalty 🔄 - Adjusts token repetition behavior
  • Response Format 📝 - Specifies the expected structure of the output
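Taken together, these settings can be pictured as a configuration object. The sketch below is illustrative only; the field names are generic conventions used by many LLM APIs, not the platform's exact schema:

```python
# Illustrative model configuration (generic field names, not the platform's exact schema)
model_config = {
    "model": "gpt-4o",         # the LLM that will process each input
    "temperature": 0.2,        # low value -> more deterministic outputs
    "top_p": 0.9,              # sample only from the top 90% of token probability mass
    "max_tokens": 512,         # cap on response length
    "presence_penalty": 0.0,   # > 0 penalizes tokens that have already appeared at all
    "frequency_penalty": 0.5,  # penalizes tokens in proportion to how often they repeat
    "response_format": "json", # expected structure of the output
}
```

A low temperature with a moderate frequency penalty, as above, is a common starting point when you want consistent, non-repetitive outputs for evaluation.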

Prompt Template 💭

  • Define the prompt template that will be used during inference
  • Use placeholders {{variable}} to inject dataset column values
  • Ensure the prompt aligns with your experiment goals
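To make the placeholder mechanism concrete, here is a minimal sketch of how {{variable}} markers map to dataset columns. The helper function and column name are hypothetical, not the platform's implementation:

```python
import re

def render_prompt(template: str, row: dict) -> str:
    """Replace each {{column}} placeholder with the value from a dataset row."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(row[m.group(1)]), template)

# Hypothetical template and dataset row
template = "Summarize the following support ticket:\n\n{{ticket_text}}"
row = {"ticket_text": "My order arrived damaged."}

print(render_prompt(template, row))
# Summarize the following support ticket:
#
# My order arrived damaged.
```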

Learn more about prompts →

Evaluation Metrics 📈

You can either:

  • Create new evaluation metrics tailored to the experiment ✨
  • Use saved evaluations from previous experiments 💾
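Conceptually, an evaluation metric is a function that scores each generated response against a reference. This exact-match sketch is one of the simplest possible examples, not a built-in platform metric:

```python
def exact_match(expected: str, generated: str) -> float:
    """Illustrative metric: 1.0 if output matches the reference (ignoring case/whitespace), else 0.0."""
    return 1.0 if expected.strip().lower() == generated.strip().lower() else 0.0

# Averaging per-row scores gives an experiment-level metric
scores = [exact_match("Paris", " paris "), exact_match("Paris", "Lyon")]
average = sum(scores) / len(scores)  # 0.5
```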

Learn more about evaluations →

Running the Experiment ▶️

Once configured:

  1. Review all settings to ensure alignment with objectives 🔍
  2. Click "Save and Run" to begin 🚀
  3. Monitor progress in the Summary tab 📊

The system will process the dataset through the configured model, applying the defined prompt structure and evaluation criteria.

Choosing the Best Prompt 🏆

Accessing Results 📊

  • Navigate to the Experiments tab and select the completed experiment
  • View detailed performance metrics in the Summary tab
  • Compare response time, token usage, accuracy, and quality scores

Selecting the Winner 🎯

  1. Click "Choose Winner" in the summary view ✅
  2. Adjust metric weights based on your priorities ⚖️
  3. Confirm your selection 🎉
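The weighting step amounts to a weighted average over per-metric scores. The sketch below uses hypothetical experiment names and scores to show how the chosen weights determine which configuration wins:

```python
def weighted_score(metrics: dict, weights: dict) -> float:
    """Combine per-metric scores into a single number using user-chosen weights."""
    total = sum(weights.values())
    return sum(metrics[name] * w for name, w in weights.items()) / total

# Hypothetical results from two prompt configurations
experiments = {
    "prompt_a": {"accuracy": 0.82, "quality": 0.75},
    "prompt_b": {"accuracy": 0.78, "quality": 0.90},
}
weights = {"accuracy": 0.7, "quality": 0.3}  # prioritize accuracy over quality

winner = max(experiments, key=lambda name: weighted_score(experiments[name], weights))
```

Shifting weight toward quality would instead favor prompt_b by a wider margin; the point is that the "winner" depends on the priorities you encode, not on any single metric.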

The winning configuration is marked as the recommended choice for deployment and future iterations in production.
