Understanding Prototype

What Prototype is, the problem it solves, and how versions, traces, and evals work together before you ship.

About

Prototype is a pre-production testing environment for LLM applications. It gives you a structured way to run multiple configurations of your application (different prompts, models, or parameters) and compare their real outputs before deciding what goes to production.

Without Prototype, the only way to know if a change made things better is to ship it and see. That means real users encounter regressions, hallucinations, or tone problems before you do. Prototype moves that discovery earlier: you run versions, score outputs automatically with evaluations, and compare everything in one dashboard before any version reaches production.


The core workflow

  1. Register your project with a version name and the evaluations you want to run.
  2. Instrument your application so every LLM call is automatically traced.
  3. Run your application; each generation is captured, tagged to its version, and scored.
  4. Compare versions in the Prototype dashboard by evaluation scores, cost, and latency.
  5. Promote the best-performing version to production.

Every step is designed to be low-friction: instrumentation is automatic, scoring happens in the background, and the dashboard surfaces the comparison without manual analysis.
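
The workflow above can be sketched in plain Python. This is a minimal illustration and not the Prototype SDK: the `traced` decorator, the `TRACES` list, the `brevity` evaluation, and the two `app_*` functions are all hypothetical names invented for this example, and the "LLM calls" are hard-coded strings. It only shows the shape of the loop: tag each generation with a version, score it, then compare versions.

```python
import time
from dataclasses import dataclass, field
from functools import wraps
from statistics import mean

# Hypothetical in-memory stand-in for a trace store; not the Prototype SDK.
@dataclass
class Trace:
    version: str
    prompt: str
    output: str
    latency_s: float
    scores: dict = field(default_factory=dict)

TRACES = []

def traced(version, evals):
    """Capture each call, tag it with a version name, and score the output."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(prompt):
            start = time.perf_counter()
            output = fn(prompt)
            latency = time.perf_counter() - start
            scores = {name: ev(prompt, output) for name, ev in evals.items()}
            TRACES.append(Trace(version, prompt, output, latency, scores))
            return output
        return wrapper
    return decorator

def brevity(prompt, output):
    """Toy evaluation: 1.0 for outputs of at most 8 words, else 0.0."""
    return 1.0 if len(output.split()) <= 8 else 0.0

EVALS = {"brevity": brevity}

@traced("v1-terse", EVALS)
def app_v1(prompt):
    return "Paris."  # stand-in for a real LLM call

@traced("v2-verbose", EVALS)
def app_v2(prompt):
    return "The capital of France, as is widely known, is the city of Paris."

def mean_score(version, eval_name):
    """Average one evaluation over every trace recorded for a version."""
    return mean(t.scores[eval_name] for t in TRACES if t.version == version)

# Run both versions on the same inputs, then pick the higher-scoring one.
for question in ["What is the capital of France?", "Capital of France?"]:
    app_v1(question)
    app_v2(question)

best = max({t.version for t in TRACES}, key=lambda v: mean_score(v, "brevity"))
print(best)
```

In this sketch the scoring runs inline for simplicity; the text above describes scoring as a background step, which a real setup would handle outside the request path.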


What gets measured

Each version run is measured on three dimensions:

| Dimension | What it captures |
| --- | --- |
| Evaluation scores | Quality metrics like context adherence, toxicity, hallucination detection, and tone, scored automatically on every generation. |
| Cost | Token usage and estimated cost per generation for the model and configuration used. |
| Latency | Response time per generation, so you can see the performance tradeoff of different models or prompts. |

Together, these three dimensions expose tradeoffs that a single metric hides. A cheaper model lowers cost but may score worse on quality. A longer prompt may improve accuracy but add latency. Prototype shows all three at once.
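
To make the tradeoff concrete, here is a small sketch that averages quality, cost, and latency per version. Everything in it is an assumption for illustration: the `Generation` record, the version names, the per-token prices, and the sample numbers are made up, and none of it reflects Prototype's actual data model or real model pricing.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Generation:
    version: str
    quality: float       # evaluation score in [0, 1]
    input_tokens: int
    output_tokens: int
    latency_s: float

# Assumed prices in USD per 1,000 tokens as (input, output); illustrative only.
PRICE_PER_1K = {
    "small-model": (0.0005, 0.0015),
    "large-model": (0.005, 0.015),
}

def cost(g):
    """Estimated cost of one generation from its token counts."""
    p_in, p_out = PRICE_PER_1K[g.version]
    return g.input_tokens / 1000 * p_in + g.output_tokens / 1000 * p_out

def summarize(gens):
    """Group generations by version and average each dimension."""
    by_version = {}
    for g in gens:
        by_version.setdefault(g.version, []).append(g)
    return {
        v: {
            "quality": mean(g.quality for g in gs),
            "cost_usd": mean(cost(g) for g in gs),
            "latency_s": mean(g.latency_s for g in gs),
        }
        for v, gs in by_version.items()
    }

# Made-up sample data: the larger model scores higher but is slower and pricier.
runs = [
    Generation("small-model", 0.70, 1000, 200, 0.5),
    Generation("small-model", 0.80, 1000, 200, 0.7),
    Generation("large-model", 0.90, 1000, 200, 1.5),
    Generation("large-model", 1.00, 1000, 200, 1.7),
]

summary = summarize(runs)
for version, row in summary.items():
    print(version, row)
```

Looking at any one column alone would suggest a different winner, which is why the three dimensions are reported together.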


Key concepts

