Alerts and monitors
Define monitors on Observe project metrics (system or evaluation) and get notified by email or Slack when values cross a threshold.
What it is
Alerts and monitors let you get notified when a project metric crosses a threshold, so regressions in error rate, latency, cost, or evaluation quality trigger an email or Slack message instead of relying on someone watching a dashboard. Monitors cover system metrics (errors, response time, token usage) and evaluation metrics (e.g. toxicity, bias). Each monitor evaluates on a schedule and, when its threshold is breached, creates a critical or warning alert and sends notifications. Alert history is stored so past triggers can be reviewed and marked resolved, and monitors can be muted without being deleted.
Use cases
- Error and reliability — Alert when error rate, LLM API failure rate, or error-free session rate crosses a threshold so you catch outages or degradation early.
- Latency and performance — Monitor span or LLM response time and get notified when p95 or average exceeds a limit.
- Cost and usage — Track token usage or daily/monthly tokens spent and alert when spend crosses a budget threshold.
- Evaluation quality — Monitor an eval (e.g. fail rate for a pass/fail eval, or a numeric score) and alert when quality drops below or goes above a value.
- Notifications — Send alerts to up to five email addresses and/or a Slack webhook so the right people are informed without checking the UI.
How to
Choose the metric
Create a monitor for an Observe project and select the metric type:

- System metrics — e.g. count of errors, error-free session rates, LLM API failure rates, span response time, LLM response time, token usage, daily/monthly tokens spent.
- Evaluation metrics — Attach a CustomEvalConfig (eval) for that project. For pass/fail or choice evals you can set threshold_metric_value to the specific value to monitor (e.g. fail rate or a choice label).
The monitor is scoped to one project (Observe projects only).
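As a sketch, a static-threshold system-metric monitor and an eval-backed monitor might look like the following. The field names are the ones documented on this page; the overall payload shape and the metric identifiers are illustrative assumptions, not the exact API:

```python
# Illustrative monitor configurations. Field names (threshold_operator,
# critical_threshold_value, alert_frequency, ...) come from this page; the
# dict structure and metric names are assumptions for illustration only.
error_rate_monitor = {
    "name": "High error rate",
    "metric": "error_rate",            # a system metric (illustrative name)
    "threshold_operator": "greater_than",
    "threshold_type": "static",
    "critical_threshold_value": 5.0,   # fires a critical alert above this
    "warning_threshold_value": 2.0,    # optional warning level
    "alert_frequency": 60,             # evaluation interval in minutes (min 5)
    "notification_emails": ["oncall@example.com"],
    "is_mute": False,
}

eval_monitor = {
    "name": "Toxicity fail rate",
    "metric": "evaluation",            # backed by a CustomEvalConfig
    "threshold_metric_value": "fail",  # monitor the fail rate of a pass/fail eval
    "threshold_operator": "greater_than",
    "threshold_type": "static",
    "critical_threshold_value": 0.1,
    "alert_frequency": 30,
}
```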
Define the threshold
Set how the alert is triggered:

- threshold_operator — Greater than or Less than (the current metric value is compared to the threshold).
- threshold_type — How the threshold is determined:
  - Static — You set a fixed critical_threshold_value and optionally warning_threshold_value. The alert fires when the metric is greater than (or less than) these values.
  - Percentage change — The threshold is based on percentage change from a baseline (e.g. the historical mean over a time window). You set critical_threshold_value and optionally warning_threshold_value as percentage values. auto_threshold_time_window (default one week, in minutes) defines the window used to compute the baseline.
When the condition is met, the system creates an alert log (critical or warning) and triggers notifications.
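The trigger logic above can be sketched as follows. The helper function and its signature are ours for illustration; the actual evaluation runs server-side:

```python
def evaluate_threshold(current, operator, threshold_type,
                       critical, warning=None, baseline=None):
    """Return "critical", "warning", or None for one monitor check.

    For "percentage_change", `current` is first converted to a percentage
    change from `baseline` (e.g. the historical mean over
    auto_threshold_time_window) before comparison.
    """
    if threshold_type == "percentage_change":
        if baseline in (None, 0):
            return None  # no baseline yet, nothing to compare against
        current = (current - baseline) / baseline * 100.0

    def breached(limit):
        # threshold_operator is either greater-than or less-than
        return current > limit if operator == "greater_than" else current < limit

    if breached(critical):
        return "critical"
    if warning is not None and breached(warning):
        return "warning"
    return None
```

For example, a static monitor with critical 5 and warning 2 on a metric reading of 3 produces a warning alert, while a percentage-change monitor with a 15% critical threshold fires when the metric rises from a baseline of 100 to 120 (a 20% change).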
Set alert frequency
alert_frequency is how often the monitor is evaluated, in minutes (minimum 5, default 60). The monitor runs on this schedule and checks the metric over the relevant time window; if the threshold is breached, an alert is created and notifications are sent.
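A minimal sketch of this schedule, assuming only the documented minimum of 5 minutes and default of 60 (the real scheduler is server-side):

```python
from datetime import datetime, timedelta

MIN_FREQUENCY_MINUTES = 5
DEFAULT_FREQUENCY_MINUTES = 60

def next_run(last_run, alert_frequency=None):
    """Compute the next evaluation time for a monitor.

    alert_frequency is in minutes; None falls back to the default of 60,
    and values below the minimum of 5 are clamped up to it.
    """
    minutes = alert_frequency or DEFAULT_FREQUENCY_MINUTES
    minutes = max(minutes, MIN_FREQUENCY_MINUTES)
    return last_run + timedelta(minutes=minutes)
```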
Configure notifications
- Email — Add up to five addresses in notification_emails. They receive an email when an alert is triggered (subject and body include alert name, message, and type).
- Slack — Set slack_webhook_url to your Slack incoming webhook. Optional slack_notes are included in the message.
You can use email only, Slack only, or both. Mute a monitor with is_mute to stop notifications without deleting it.
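The five-address limit can be expressed as a small validation helper (illustrative; any real validation happens server-side, and the helper name is ours):

```python
MAX_NOTIFICATION_EMAILS = 5

def validate_notifications(notification_emails=None, slack_webhook_url=None):
    """Check the documented constraint of at most five notification emails.

    Email-only, Slack-only, and combined configurations are all valid.
    """
    emails = notification_emails or []
    if len(emails) > MAX_NOTIFICATION_EMAILS:
        raise ValueError(
            f"at most {MAX_NOTIFICATION_EMAILS} notification emails are allowed"
        )
    return {"emails": emails, "slack_webhook_url": slack_webhook_url}
```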
View and resolve alerts
Alert history is stored as UserAlertMonitorLog records (critical/warning, message, time window, link). You can list logs for a monitor, see when each alert fired, and mark them resolved. Use the monitor detail view in the UI to see trend data and unresolved count.
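A minimal sketch of reviewing and resolving alert history, using the fields named above (the class and helper are illustrative, not the actual UserAlertMonitorLog model):

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class AlertLog:
    """Illustrative shape for one alert log entry."""
    alert_type: str          # "critical" or "warning"
    message: str
    fired_at: datetime
    resolved: bool = False

def unresolved_count(logs):
    """Count logs not yet marked resolved, as shown in the monitor detail view."""
    return sum(1 for log in logs if not log.resolved)
```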
Note
Monitors are only available for projects with trace_type observe. Optional filters (same structure as eval-task filters) can narrow which spans are included when computing the metric.