Spans
Spans are the fundamental units of tracing in observability frameworks, providing structured, event-level data for monitoring, debugging, and performance analysis. A span represents a discrete operation executed within a system, capturing execution timing, hierarchical relationships, and metadata relevant to the operation’s context.
Spans are aggregated into traces, which collectively depict the flow of execution across system components. This document provides a technical analysis of spans, covering their attributes, classifications, and role in system observability.
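To make the span-to-trace relationship concrete, here is a minimal Python sketch that reconstructs a trace tree from a flat list of span records. All field names and values are hypothetical, not taken from any particular tracing framework.

```python
# Hypothetical flat span records, as a trace exporter might emit them;
# field names are illustrative only.
spans = [
    {"span_id": "a1", "parent_id": None, "trace_id": "t1", "name": "handle_request"},
    {"span_id": "b2", "parent_id": "a1", "trace_id": "t1", "name": "retrieve_documents"},
    {"span_id": "c3", "parent_id": "a1", "trace_id": "t1", "name": "llm_completion"},
]

def print_trace(spans, parent_id=None, depth=0):
    """Recursively print the span hierarchy for one trace."""
    for span in spans:
        if span["parent_id"] == parent_id:
            print("  " * depth + span["name"])
            print_trace(spans, span["span_id"], depth + 1)

print_trace(spans)
# handle_request
#   retrieve_documents
#   llm_completion
```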
Structure of Spans
A span consists of multiple attributes that encapsulate its execution details. These attributes fall into the following categories; a minimal structural sketch follows the list.
- Identification and context: the span’s unique ID, trace ID, and optional parent span ID, establishing hierarchical relationships. A span may also carry a project reference for system-wide organization.
- Execution details: the operation recorded, including a descriptive name, span type (e.g., function call, API request, database query), and input/output data. If an operation fails, error metadata captures failure details such as error codes, messages, and stack traces.
- Timing and performance: start and end timestamps, latency measurement, and resource usage, such as computational cost or token consumption for LLM-related spans.
- Metadata and custom attributes: tags, annotations, and JSON-based extensible fields. Execution environment details, including host machine, service instance, and deployment version, further enrich observability.
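The following sketch shows how these four attribute groups might map onto a single data structure. It assumes Python and illustrative field names; nothing here is a specific framework’s schema.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Any, Optional

@dataclass
class Span:
    # Identification and context
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    trace_id: str = ""
    parent_id: Optional[str] = None     # absent for root spans
    project: Optional[str] = None       # optional system-wide grouping

    # Execution details
    name: str = ""
    span_type: str = "unknown"          # e.g. "tool", "llm", "retriever"
    input: Any = None
    output: Any = None
    error: Optional[dict] = None        # code, message, stack trace on failure

    # Timing and performance
    start_time: float = field(default_factory=time.time)
    end_time: Optional[float] = None
    token_usage: Optional[dict] = None  # resource usage for LLM-related spans

    # Metadata and custom attributes
    tags: list = field(default_factory=list)
    attributes: dict = field(default_factory=dict)  # JSON-serializable extras

    @property
    def latency(self) -> Optional[float]:
        """Latency in seconds, derived once the span has ended."""
        return None if self.end_time is None else self.end_time - self.start_time
```

Note that latency is derived from the two timestamps rather than stored, which keeps the recorded data free of redundant fields.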
Types of Spans
Spans are categorized by the type of operation they capture. This classification ensures structured trace analysis and aids performance monitoring. An illustrative type taxonomy is sketched after the list.
- Tool Spans: Track operations executed by external tools or functions. They capture essential details, including the tool’s name, description, parameters, and performance metrics, enabling comprehensive monitoring of tool interactions.
- Chain Spans: Represent individual steps in a sequential workflow where data flows through multiple interconnected operations. They facilitate the visualization and analysis of execution pipelines, helping optimize process efficiency and detect bottlenecks.
- LLM Spans: Capture interactions with large language models, recording input prompts, generated completions, token usage, and invocation parameters. These spans provide insight into model performance, response times, and computational costs.
- Retriever Spans: Log data retrieval operations, such as querying a database or fetching documents from an index. They store search parameters and results, ensuring traceability and facilitating performance assessment of retrieval mechanisms.
- Embedding Spans: Track text-to-vector transformations used in machine learning applications. They record embedding vectors, associated model metadata, and processing details, supporting efficient monitoring of vectorization processes.
- Agent Spans: Document actions performed by autonomous agents, including decision-making logic and tool interactions. They capture the rationale behind an agent’s choices, providing transparency into automated workflows and AI-driven decision processes.
- Reranker Spans: Log result reordering or ranking adjustments based on specific scoring criteria. They retain input documents and their updated rankings, facilitating analysis of ranking models and relevance optimization.
- Unknown Spans: Serve as a fallback for operations that do not fit predefined span types. They ensure that all observed activities are recorded, even when their category is not explicitly defined.
- Guardrail Spans: Monitor compliance and enforce safety rules within a system. They capture validation results, applied policies, and compliance status, ensuring adherence to predefined operational constraints.
- Evaluator Spans: Represent assessment activities conducted to measure system performance or model effectiveness. They track evaluation metrics, scoring data, and feedback, supporting the continuous improvement of models and workflows.
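Continuing the hypothetical Span sketch above, this taxonomy could be modeled as an enumeration, with type-specific payloads (prompts, token counts, retrieved documents, rankings) carried in a span’s input, output, and usage fields:

```python
from enum import Enum

class SpanType(str, Enum):
    """Illustrative span-type taxonomy mirroring the categories above."""
    TOOL = "tool"
    CHAIN = "chain"
    LLM = "llm"
    RETRIEVER = "retriever"
    EMBEDDING = "embedding"
    AGENT = "agent"
    RERANKER = "reranker"
    UNKNOWN = "unknown"      # fallback for unclassified operations
    GUARDRAIL = "guardrail"
    EVALUATOR = "evaluator"

# A hypothetical LLM span: the prompt, completion, and token usage are
# exactly the type-specific payload this classification calls for.
llm_span = Span(
    trace_id="t1",
    name="chat_completion",
    span_type=SpanType.LLM.value,
    input={"prompt": "Summarize the report."},
    output={"completion": "The report covers..."},
    token_usage={"prompt_tokens": 12, "completion_tokens": 48},
)
```

A string-valued enum keeps serialized spans human-readable, and the UNKNOWN member preserves the fallback behavior described above, so every observed operation remains recordable.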