Overview

Create, manage, and optimize AI prompts for reliable and consistent language model outputs.

About

A prompt is the instruction you give an AI model to produce a response. Getting that instruction right is one of the highest-leverage ways to improve your AI product, but managing prompts without a dedicated tool is messy: they end up hardcoded in application logic, changes are hard to track, and there is no consistent way to compare versions or test them.

Prompt Workbench solves this by giving every prompt a permanent, versioned home on the platform. You write prompts using variables so they can accept dynamic inputs at runtime. Every edit creates a new version, and you can compare any two versions side by side or roll back instantly. Prompts are reusable across the entire platform: run them against dataset rows, use them in simulations, include them in experiments, or fetch them from your application via the SDK.
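The core ideas above (templates with variables, append-only versions, rollback) can be sketched in a few lines. This is an illustrative model only, not the Prompt Workbench SDK; the `Prompt` class, `save`, and `render` names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Prompt:
    """Minimal model of a versioned prompt with {variable} placeholders (hypothetical)."""
    name: str
    versions: list = field(default_factory=list)  # every edit appends; nothing is overwritten

    def save(self, template: str) -> int:
        """Store a new version and return its 1-based version number."""
        self.versions.append(template)
        return len(self.versions)

    def render(self, version: int, **variables) -> str:
        """Fill a specific version's variables with dynamic runtime inputs."""
        return self.versions[version - 1].format(**variables)

greeting = Prompt("support-reply")
v1 = greeting.save("Reply politely to: {message}")
v2 = greeting.save("Reply politely and concisely to: {message}")
print(greeting.render(v2, message="Where is my order?"))
# → Reply politely and concisely to: Where is my order?
```

Because earlier versions are kept, `greeting.render(v1, ...)` still works, which is what makes side-by-side comparison and instant rollback possible.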

The workbench is also connected to observability. Link a prompt to your production traces and see exactly how it performs on real traffic, closing the loop between what you write and what you ship.

How Prompts Connect to Other Features

  • Datasets: Run prompts against dataset rows to generate model outputs at scale.
  • Evaluation: Score prompt outputs with 70+ built-in metrics to measure quality.
  • Experiments: Compare prompt versions side by side on the same data.
  • Optimization: Feed eval scores into optimization algorithms to automatically improve prompts.
  • Observability: Link prompts to production traces to see latency, cost, and token usage per version.
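
The first integration in the list, running one prompt over many dataset rows, boils down to rendering the same template against each row's fields. The sketch below is a hedged illustration of that pattern; the template, dataset shape, and the idea of sending each rendered string to a model are assumptions, not the platform's actual API.

```python
# Hypothetical sketch: one prompt template rendered over dataset rows.
template = "Summarize in one sentence: {text}"

dataset = [
    {"text": "The quarterly report shows revenue grew 12%."},
    {"text": "Support tickets dropped after the onboarding redesign."},
]

# In a real run, each rendered prompt would be sent to a model and the
# outputs scored by evaluation metrics; here we just collect the inputs.
rendered = [template.format(**row) for row in dataset]
for prompt_text in rendered:
    print(prompt_text)
```

Because every row is rendered from the same versioned template, the resulting outputs are directly comparable across prompt versions, which is what experiments rely on.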

Getting Started
