This guide walks you through the complete process of setting up and running an experiment on the Future AGI platform. You will learn how to execute experiments, analyze results, compare different configurations, and select the best-performing setup based on structured evaluation.

Building a Dataset 📊

Before running any experiments, make sure you have a well-structured dataset. The dataset provides the input that allows the model to generate responses, which can later be evaluated.

Learn more about datasets →

Once the dataset is available, verify that its structure is correct by inspecting the table in the dashboard and confirming that all fields are populated.
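
For illustration, a dataset for a summarization experiment might look like the rows below. The column names (article_text, reference_summary) are hypothetical, and the structural check simply mirrors what you verify by eye in the dashboard:

```python
# Hypothetical dataset rows for a summarization experiment.
# Column names are illustrative; use the columns your dataset actually defines.
dataset = [
    {
        "article_text": "The city council approved the new transit plan on Tuesday...",
        "reference_summary": "Council approves transit plan.",
    },
    {
        "article_text": "Researchers published a multi-year study on sleep patterns...",
        "reference_summary": "New study examines sleep patterns.",
    },
]

# Structural check: every row should contain the same, fully populated fields.
required_columns = {"article_text", "reference_summary"}
for i, row in enumerate(dataset):
    missing = required_columns - {key for key, value in row.items() if value}
    assert not missing, f"Row {i} has missing or empty fields: {missing}"
```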

Creating an Experiment 🔬

  1. Navigate to the Experiments tab within the dataset view
  2. Click “Create Experiment” to initiate the setup
  3. Assign a name to the experiment for easy identification
  4. Select the dataset that will serve as input for testing

Configuring the Experiment ⚙️

Input Source 📥

  • Select the column in the dataset that contains the input text for the model
  • This column provides the context for the experiment and determines how the model will generate responses

Model Selection 🤖

Choose the LLM that will process the input. Adjust key parameters to control how the model generates responses (a sample configuration follows this list):

  • Temperature 🌡️ - Controls randomness; lower values produce more deterministic outputs
  • Top P 📊 - Controls sampling diversity by restricting generation to the smallest set of tokens whose cumulative probability reaches P
  • Max Tokens 📏 - Defines the maximum response length
  • Presence & Frequency Penalty 🔄 - Penalizes tokens that have already appeared, reducing repetition
  • Response Format 📝 - Specifies the expected structure of the output
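
As a rough sketch, these parameters map onto the sampling controls exposed by most LLM providers. The field names and values below are illustrative assumptions, not the platform's exact configuration schema:

```python
# Illustrative model configuration; field names mirror common LLM APIs,
# but the exact schema depends on the provider and model you select.
model_config = {
    "model": "gpt-4o",                           # hypothetical model choice
    "temperature": 0.2,                          # lower -> more deterministic output
    "top_p": 0.9,                                # sample from the top 90% probability mass
    "max_tokens": 512,                           # cap on response length
    "presence_penalty": 0.0,                     # discourage reintroducing seen tokens
    "frequency_penalty": 0.3,                    # discourage frequent repetition
    "response_format": {"type": "json_object"},  # expected output structure
}
```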

Prompt Template 💭

  • Define the prompt template that will be used during inference
  • Use {{variable}} placeholders to inject dataset column values; a substitution sketch follows this list
  • Ensure the prompt aligns with your experiment goals
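
To make the placeholder mechanics concrete, here is a minimal sketch of how {{variable}} placeholders resolve against a dataset row. The platform performs this substitution for you at run time; the template text and the render_prompt helper are hypothetical:

```python
import re

# Hypothetical template whose {{article_text}} placeholder maps to a dataset column.
template = "Summarize the following article in one sentence:\n\n{{article_text}}"

def render_prompt(template: str, row: dict) -> str:
    """Replace each {{column}} placeholder with the corresponding row value."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(row[m.group(1)]), template)

row = {"article_text": "The city council approved the new transit plan on Tuesday..."}
print(render_prompt(template, row))
```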

Learn more about prompts →

Evaluation Metrics 📈

You can either:

  • Create new evaluation metrics tailored to the experiment ✨ (a conceptual sketch follows this list)
  • Use saved evaluations from previous experiments 💾
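
Conceptually, a custom metric is just a function that scores a response, typically normalized to [0, 1]. The scorers below are hedged illustrations of that idea, not the platform's actual metric interface:

```python
# Conceptual custom metrics; each returns a score in [0, 1].
# Illustrative only; these do not reflect the platform's metric API.
def length_compliance(response: str, max_words: int = 30) -> float:
    """1.0 when the response fits the word budget, scaled down as it overruns."""
    words = len(response.split())
    return min(1.0, max_words / words) if words else 0.0

def keyword_coverage(response: str, keywords: list[str]) -> float:
    """Fraction of expected keywords that appear in the response."""
    hits = sum(1 for kw in keywords if kw.lower() in response.lower())
    return hits / len(keywords) if keywords else 0.0
```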

Learn more about evaluations →

Running the Experiment ▢️

Once configured:

  1. Review all settings to ensure alignment with objectives 🔍
  2. Click “Save and Run” to begin 🚀
  3. Monitor progress in the Summary tab 📊

The system will process the dataset through the configured model, applying the defined prompt structure and evaluation criteria.
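
Conceptually, that processing amounts to a loop like the one below over every dataset row: render the prompt, call the model, then apply each metric. This sketch reuses the hypothetical helpers from the earlier sections, and call_model stands in for the provider call the platform makes on your behalf:

```python
def call_model(prompt: str, config: dict) -> str:
    """Stand-in for the provider call the platform makes internally."""
    return "Council approves transit plan."  # stubbed response for illustration

def run_experiment(dataset, template, model_config, metrics):
    """Simplified sketch of an experiment run: one scored result per input row."""
    results = []
    for row in dataset:
        prompt = render_prompt(template, row)        # fill {{placeholders}}
        response = call_model(prompt, model_config)  # generate the output
        scores = {name: metric(response) for name, metric in metrics.items()}
        results.append({"input": row, "response": response, "scores": scores})
    return results

results = run_experiment(
    dataset, template, model_config,
    metrics={"length": length_compliance},
)
```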

Choosing the Best Prompt 🏆

Accessing Results 📊

  • Navigate to the Experiments tab and select the completed experiment
  • View detailed performance metrics in the Summary tab
  • Compare response time, token usage, accuracy, and quality scores

Selecting the Winner 🎯

  1. Click “Choose Winner” in the summary view ✅
  2. Adjust metric weights based on your priorities ⚖️
  3. Confirm your selection 🎉

The winning configuration is then marked as the recommended choice for deployment and future iterations in production.
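
The weight adjustment in step 2 boils down to a weighted aggregate over normalized metric scores. As a simple sketch of that arithmetic (the platform computes this for you in the UI; all names and numbers here are hypothetical):

```python
# Illustrative weighted scoring across two candidate configurations.
# Weights reflect your priorities and sum to 1.0 here; the higher aggregate wins.
weights = {"accuracy": 0.5, "quality": 0.3, "latency": 0.2}

experiment_scores = {
    "prompt_v1": {"accuracy": 0.82, "quality": 0.75, "latency": 0.90},
    "prompt_v2": {"accuracy": 0.88, "quality": 0.80, "latency": 0.70},
}

def aggregate(scores: dict, weights: dict) -> float:
    """Weighted sum of normalized metric scores."""
    return sum(weights[m] * scores[m] for m in weights)

winner = max(experiment_scores, key=lambda name: aggregate(experiment_scores[name], weights))
print(winner, round(aggregate(experiment_scores[winner], weights), 3))  # prompt_v2 0.82
```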