Standardizing span attributes across various models, frameworks, and vendors

When sending traces, you might want to define custom attributes for each span. Semantic conventions are specific attribute keys or values that hold special significance. In Future AGI, certain attribute keys are highlighted more prominently, in addition to showing up in the attributes tab like other keys.

Types of Attributes

  • Span
  • Message
  • Document
  • Reranker
  • Embedding
  • Tool Call
class SpanAttributes:
    # Output related attributes
    OUTPUT_VALUE = "output.value"
    OUTPUT_MIME_TYPE = "output.mime_type"
    # The type of output.value. If unspecified, the type is plain text by default.
    # If type is JSON, the value is a string representing a JSON object.

    INPUT_VALUE = "input.value"
    INPUT_MIME_TYPE = "input.mime_type"
    # The type of input.value. If unspecified, the type is plain text by default.
    # If type is JSON, the value is a string representing a JSON object.

    # Embedding related attributes
    EMBEDDING_EMBEDDINGS = "embedding.embeddings"
    # A list of objects containing embedding data, including the vector and represented piece of text.

    EMBEDDING_MODEL_NAME = "embedding.model_name"
    # The name of the embedding model.

    # LLM related attributes
    LLM_FUNCTION_CALL = "llm.function_call"
    # For models and APIs that support function calling. Records attributes such as the function
    # name and arguments to the called function.

    LLM_INVOCATION_PARAMETERS = "llm.invocation_parameters"
    # Invocation parameters passed to the LLM or API, such as the model name, temperature, etc.

    LLM_INPUT_MESSAGES = "llm.input_messages"
    # Messages provided to a chat API.

    LLM_OUTPUT_MESSAGES = "llm.output_messages"
    # Messages received from a chat API.

    LLM_MODEL_NAME = "llm.model_name"
    # The name of the model being used.

    LLM_PROVIDER = "llm.provider"
    # The provider of the model, such as OpenAI, Azure, Google, etc.

    LLM_SYSTEM = "llm.system"
    # The AI product as identified by the client or server.

    LLM_PROMPTS = "llm.prompts"
    # Prompts provided to a completions API.

    LLM_PROMPT_TEMPLATE = "llm.prompt_template.template"
    # The prompt template as a Python f-string.

    LLM_PROMPT_TEMPLATE_VARIABLES = "llm.prompt_template.variables"
    # A list of input variables to the prompt template.

    LLM_PROMPT_TEMPLATE_VERSION = "llm.prompt_template.version"
    # The version of the prompt template being used.

    LLM_TOKEN_COUNT_PROMPT = "llm.token_count.prompt"
    # Number of tokens in the prompt.

    LLM_TOKEN_COUNT_COMPLETION = "llm.token_count.completion"
    # Number of tokens in the completion.

    LLM_TOKEN_COUNT_TOTAL = "llm.token_count.total"
    # Total number of tokens, including both prompt and completion.

    LLM_TOOLS = "llm.tools"
    # List of tools advertised to the LLM that it is able to call.

    # Tool related attributes
    TOOL_NAME = "tool.name"
    # Name of the tool being used.

    TOOL_DESCRIPTION = "tool.description"
    # Description of the tool's purpose, typically used to select the tool.

    TOOL_PARAMETERS = "tool.parameters"
    # Parameters of the tool, represented as a dictionary serialized to a JSON string.

    RETRIEVAL_DOCUMENTS = "retrieval.documents"
    # List of documents retrieved for a query.

    METADATA = "metadata"
    # Metadata attributes are used to store user-defined key-value pairs.

    TAG_TAGS = "tag.tags"
    # Custom categorical tags for the span.

    FI_SPAN_KIND = "fi.span.kind"
    # The kind of span, e.g. CHAIN, LLM, RETRIEVER, RERANKER.

    SESSION_ID = "session.id"
    # The id of the session

    USER_ID = "user.id"
    # The id of the user

    INPUT_IMAGES = "llm.input.images"
    # A list of input images provided to the model.

    EVAL_INPUT = "eval.input"
    # Input being sent to the eval

    RAW_INPUT = "raw.input"
    # Raw input sent to OpenTelemetry.

    RAW_OUTPUT = "raw.output"
    # Raw output received from OpenTelemetry.

    QUERY = "query"
    # The query sent to the model.

    RESPONSE = "response"
    # The response returned by the model.

For a comprehensive guide to Python semantic conventions, refer to the following resource on GitHub: FI Python Semantic Conventions.

Attribute Overview

| Attribute | Type | Example | Description |
| --- | --- | --- | --- |
| document.content | String | "This is a sample document content." | The content of a retrieved document |
| document.id | String/Integer | "1234" or 1 | Unique identifier for a document |
| document.metadata | JSON String | "{'author': 'John Doe', 'date': '2023-09-09'}" | Metadata associated with a document |
| document.score | Float | 0.98 | Score representing the relevance of a document |
| embedding.embeddings | List of objects† | [{"embedding.vector": [...], "embedding.text": "hello"}] | List of embedding objects including text and vector data |
| embedding.model_name | String | "BERT-base" | Name of the embedding model used |
| embedding.text | String | "hello world" | The text represented in the embedding |
| embedding.vector | List of floats | [0.123, 0.456, ...] | The embedding vector consisting of a list of floats |
| exception.escaped | Boolean | true | Indicator if the exception has escaped the span's scope |
| exception.message | String | "Null value encountered" | Detailed message describing the exception |
| exception.stacktrace | String | "at app.main(app.java:16)" | The stack trace of the exception |
| exception.type | String | "NullPointerException" | The type of exception that was thrown |
| input.mime_type | String | "text/plain" or "application/json" | MIME type representing the format of input.value |
| input.value | String | "{'query': 'What is the weather today?'}" | The input value to an operation |
| llm.function_call | JSON String | "{function_name: 'add', args: [1, 2]}" | Object recording details of a function call in models or APIs |
| llm.input_messages | List of objects† | [{"message.role": "user", "message.content": "hello"}] | List of messages sent to the LLM in a chat API request |
| llm.invocation_parameters | JSON String | "{'model_name': 'gpt-3', 'temperature': 0.7}" | Parameters used during the invocation of an LLM or API |
| llm.model_name | String | "gpt-3.5-turbo" | The name of the language model being utilized |
| llm.output_messages | List of objects† | [{"message.role": "user", "message.content": "hello"}] | List of messages received from the LLM in a chat API response |
| llm.prompt_template.template | String | "Weather forecast for {city} on {date}" | Template used to generate prompts as Python f-strings |
| llm.prompt_template.variables | JSON String | "{'context': '<context from retrieval>', 'subject': 'math'}" | JSON of key-value pairs applied to the prompt template |
| llm.prompt_template.version | String | "v1.0" | The version of the prompt template |
| llm.token_count.completion | Integer | 15 | The number of tokens in the completion |
| llm.token_count.prompt | Integer | 5 | The number of tokens in the prompt |
| llm.token_count.total | Integer | 20 | Total number of tokens, including prompt and completion |
| message.content | String | "What's the weather today?" | The content of a message in a chat |
| message.function_call_arguments_json | JSON String | "{'x': 2}" | The arguments to the function call in JSON |
| message.function_call_name | String | "multiply" or "subtract" | The name of the function being called |
| message.role | String | "user" or "system" | Role of the entity in a message (e.g., user, system) |
| message.tool_calls | List of objects† | [{"tool_call.function.name": "get_current_weather"}] | List of tool calls (e.g., function calls) generated by the LLM |
| metadata | JSON String | "{'author': 'John Doe', 'date': '2023-09-09'}" | Metadata associated with a span |
| fi.span.kind | String | "CHAIN" | The kind of span (e.g., CHAIN, LLM, RETRIEVER, RERANKER) |
| output.mime_type | String | "text/plain" or "application/json" | MIME type representing the format of output.value |
| output.value | String | "Hello, World!" | The output value of an operation |
| reranker.input_documents | List of objects† | [{"document.id": "1", "document.score": 0.9, "document.content": "..."}] | List of documents as input to the reranker |
| reranker.model_name | String | "cross-encoder/ms-marco-MiniLM-L-12-v2" | Model name of the reranker |
| reranker.output_documents | List of objects† | [{"document.id": "1", "document.score": 0.9, "document.content": "..."}] | List of documents output by the reranker |
| reranker.query | String | "How to format timestamp?" | Query parameter of the reranker |
| reranker.top_k | Integer | 3 | Top-K parameter of the reranker |
| retrieval.documents | List of objects† | [{"document.id": "1", "document.score": 0.9, "document.content": "..."}] | List of retrieved documents |
| session.id | String | "26bcd3d2-cad2-443d-a23c-625e47f3324a" | Unique identifier for a session |
| tag.tags | List of strings | ["shopping", "travel"] | List of tags to give the span a category |
| tool.description | String | "An API to get weather data." | Description of the tool's purpose and functionality |
| tool.name | String | "WeatherAPI" | The name of the tool being utilized |
| tool.parameters | JSON String | "{'a': 'int'}" | The parameter definitions for invoking the tool |
| tool_call.function.arguments | JSON String | "{'city': 'London'}" | The arguments for the function being invoked by a tool call |
| tool_call.function.name | String | "get_current_weather" | The name of the function being invoked by a tool call |
| user.id | String | "9328ae73-7141-4f45-a044-8e06192aa465" | Unique identifier for a user |

† Lists of objects must be flattened into simple types before export; see Converting Messages to OpenTelemetry Span Attributes below.
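Several of these attributes are typically recorded together for a single LLM call. As a minimal sketch in plain Python (the attribute keys come from the table above; the surrounding helper function is hypothetical, not part of any package), note that JSON-valued attributes are serialized to strings, so every value ends up as a simple type that can be passed to span.set_attribute:

```python
import json

# Hypothetical helper: build a flat attribute dict for one LLM call.
# JSON-valued attributes (llm.invocation_parameters) are serialized to
# strings, as the conventions require.
def llm_call_attributes(model_name, params, prompt_tokens, completion_tokens):
    return {
        "llm.model_name": model_name,
        "llm.invocation_parameters": json.dumps(params),
        "llm.token_count.prompt": prompt_tokens,
        "llm.token_count.completion": completion_tokens,
        "llm.token_count.total": prompt_tokens + completion_tokens,
    }

attrs = llm_call_attributes("gpt-3.5-turbo", {"temperature": 0.7}, 5, 15)
# attrs["llm.token_count.total"] == 20
```

Every value here is a string or integer, so the whole dict can be applied with a loop over span.set_attribute(key, value).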

Using Semantic Conventions

Here’s an example of how to implement a semantic convention. Treat them as strings when setting an attribute on a span:

% pip install fi-instrumentation-otel

from opentelemetry import trace

from fi_instrumentation.fi_types import SpanAttributes, FiSpanKindValues

tracer = trace.get_tracer(__name__)

def chat(message: str):
    with tracer.start_as_current_span("an_llm_span") as span:
        span.set_attribute(
            SpanAttributes.FI_SPAN_KIND,
            FiSpanKindValues.LLM.value
        )
        
        # Equivalent to:
        # span.set_attribute(
        #     "fi.span.kind",
        #     "LLM",
        # )
        
        span.set_attribute(
            SpanAttributes.INPUT_VALUE,
            message,
        )

Converting Messages to OpenTelemetry Span Attributes

To export a list of objects as OpenTelemetry span attributes, flatten the list until the attribute values are simple types, such as bool, str, bytes, int, float, or simple lists like List[bool], List[str], List[bytes], List[int], List[float].

Python Example

# List of messages from OpenAI or another LLM provider
messages = [{"message.role": "user", "message.content": "hello"},
            {"message.role": "assistant", "message.content": "hi"}]

# Assuming you have a span object already created
for i, obj in enumerate(messages):
    for key, value in obj.items():
        span.set_attribute(f"llm.input_messages.{i}.{key}", value)
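The same loop generalizes to any of the "List of objects" attributes in the table, such as retrieval.documents or reranker.input_documents. A small helper (hypothetical, not part of the package) that flattens any list of objects under a given attribute prefix:

```python
# Hypothetical helper: flatten a list of objects into OpenTelemetry-safe
# span attributes of the form "<prefix>.<index>.<key>".
def flatten(prefix, objects):
    flat = {}
    for i, obj in enumerate(objects):
        for key, value in obj.items():
            flat[f"{prefix}.{i}.{key}"] = value
    return flat

documents = [
    {"document.id": "1", "document.score": 0.9, "document.content": "hello"},
    {"document.id": "2", "document.score": 0.5, "document.content": "world"},
]

attrs = flatten("retrieval.documents", documents)
# attrs["retrieval.documents.0.document.score"] == 0.9

# Then, given a span:
# for key, value in attrs.items():
#     span.set_attribute(key, value)
```

Because each flattened value is a string or float, the result satisfies the simple-type requirement above without any further conversion.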