Error Feed Sampling: Controlling Trace Analysis Rate

How sampling rate controls what percentage of traces Error Feed analyzes, and how to configure it per project in Observe settings.

About

Error Feed doesn’t analyze every trace by default. The sampling rate controls what percentage of incoming traces get analyzed. The tradeoff is coverage vs. cost: analyze more traces and you catch more errors, but you pay more for it.

Why sampling exists

Production agents can produce a lot of traces. Analyzing 100% of them at all times gets expensive at scale. Sampling lets you dial in a rate that makes sense for your situation: full coverage during development, a reduced rate in production, or 100% for critical projects where nothing can be missed.

The rate applies to new traces. Previously analyzed traces aren’t affected when you change it.
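Conceptually, per-trace sampling of this kind is usually an independent coin flip per incoming trace. The sketch below models that idea; it is an illustrative approximation, not Error Feed's actual implementation, and the function name is ours:

```python
import random

def should_analyze(sampling_rate: float) -> bool:
    """Decide whether a single incoming trace gets analyzed.

    sampling_rate is a fraction in [0, 1], e.g. 0.10 for 10% coverage.
    Illustrative model only, not Error Feed's real mechanism.
    """
    return random.random() < sampling_rate

# Over many traces, the analyzed fraction converges on the configured rate.
random.seed(42)  # seeded only to make the demo repeatable
analyzed = sum(should_analyze(0.10) for _ in range(100_000))
print(f"analyzed {analyzed} of 100,000 traces (~{analyzed / 1000:.1f}%)")
```

The key property this illustrates: sampling is memoryless and per-trace, which is why changing the rate affects only traces that arrive afterward.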

How to configure sampling

Sampling is configured per project in Observe settings.

Open the Observe project

Navigate to your project in the Observe section of the Future AGI dashboard.

Open the project settings drawer

Click the Configure (gear) icon in the project header to open the settings drawer.

Adjust the sampling slider

Find the Error Feed sampling rate control in the drawer. Drag the slider right to increase coverage, left to decrease.

Click Update

Click Update to apply. The new rate takes effect for traces that arrive after the update.

Note

The new rate only applies to new traces. Previously analyzed traces aren't re-analyzed, and their existing analyses aren't removed, when you change the rate.

Choosing a sampling rate

There’s no universally correct rate. It depends on your trace volume, cost tolerance for the project, and how critical full error coverage is.

Recommended rates by situation:

Development or testing: 100% — catch everything while you're actively iterating
Low-volume production: 100% or close to it — the absolute cost is low
High-volume production: 10–20% — enough to catch systematic issues, affordable at scale
Critical path / safety-sensitive: 100% — can't afford to miss errors
Cost-constrained, high volume: 5–10% — catches recurring patterns even at low rates

At 10% sampling, a systematic error that hits every trace shows up as a cluster with 10% of its true occurrence count. The error still gets detected and surfaced. Sampling reduces counts and may miss rare one-off failures, but it reliably catches recurring patterns.

Tip

Start at 100% when you first set up a project. Once you understand the error landscape and have addressed the biggest issues, drop the rate to something that makes sense for your production volume.

Effect on cluster trace counts

The trace count on an issue reflects how many analyzed traces ended up in the cluster, not the total number of traces where that error might have occurred. At 20% sampling, a cluster with 50 traces likely represents around 250 actual occurrences.
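The extrapolation is simple division by the sampling rate. A quick helper (the function name is ours, not part of any Future AGI SDK):

```python
def estimated_true_count(observed_count: int, sampling_rate: float) -> float:
    """Extrapolate a cluster's observed trace count to its likely true size.

    Illustrative helper, not part of any SDK: divides the observed count
    by the sampling rate that was in effect when the traces arrived.
    """
    if not 0 < sampling_rate <= 1:
        raise ValueError("sampling_rate must be in (0, 1]")
    return observed_count / sampling_rate

# A 50-trace cluster observed under 20% sampling:
print(estimated_true_count(50, 0.20))  # → 250.0
```

This is an expected-value estimate; the actual count fluctuates around it, with proportionally more noise for small clusters.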

Keep the sampling rate in mind when comparing cluster sizes across projects or time periods. A 100-trace cluster from a 10%-sampled project represents more actual errors than a 100-trace cluster from a 100%-sampled project.

Effect on new issue detection

Rare errors (the ones that show up in only a small fraction of traces) are more likely to be missed at low rates. If you’re hunting an edge case that only triggers occasionally, temporarily bump the sampling rate up for the duration of the investigation.
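To reason about whether a rare error is likely to surface at all, you can model detection with a simple binomial approximation. Everything below is illustrative (the function name and the example numbers are ours), assuming each trace independently triggers the error and is independently sampled:

```python
def detection_probability(error_rate: float, sampling_rate: float,
                          total_traces: int) -> float:
    """Probability that at least one occurrence of a rare error is sampled.

    Assumes each trace independently triggers the error with probability
    error_rate and is independently sampled with probability sampling_rate.
    """
    p_seen_per_trace = error_rate * sampling_rate
    return 1 - (1 - p_seen_per_trace) ** total_traces

# An error hitting 0.1% of traces, at 10% sampling, over 10,000 traces:
print(f"{detection_probability(0.001, 0.10, 10_000):.2f}")  # → 0.63
```

Raising the sampling rate (or waiting for more traffic) pushes this probability toward 1, which is why a temporary bump helps when chasing an occasional edge case.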
