OpenAI (Java)

Trace OpenAI chat completions, embeddings, and streaming responses in Java with TracedOpenAIClient.

📝 TL;DR
  • TracedOpenAIClient wraps the official com.openai Java SDK
  • Traces chat completions, embeddings, and streaming
  • Captures messages, token counts, model info, finish reason
  • Streaming collects all chunks into a single span

Prerequisites

Complete the Java SDK setup first: TraceAI must be initialized (for example via TraceAI.initFromEnvironment()) before you use this wrapper.

Installation

Maven:

<dependency>
    <groupId>com.github.future-agi.traceAI</groupId>
    <artifactId>traceai-java-openai</artifactId>
    <version>main-SNAPSHOT</version>
</dependency>

Gradle:

implementation 'com.github.future-agi.traceAI:traceai-java-openai:main-SNAPSHOT'

You also need the OpenAI Java SDK:

Maven:

<dependency>
    <groupId>com.openai</groupId>
    <artifactId>openai-java</artifactId>
    <version>0.8.0</version>
</dependency>

Gradle:

implementation 'com.openai:openai-java:0.8.0'

Wrap the client

import ai.traceai.TraceAI;
import ai.traceai.openai.TracedOpenAIClient;
import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;

// Initialize TraceAI (once, at startup)
TraceAI.initFromEnvironment();

// Create the OpenAI client
OpenAIClient client = OpenAIOkHttpClient.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .build();

// Wrap it
TracedOpenAIClient traced = new TracedOpenAIClient(client);

Or with an explicit tracer:

import ai.traceai.FITracer;

FITracer tracer = TraceAI.getTracer();
TracedOpenAIClient traced = new TracedOpenAIClient(client, tracer);

Chat completions

import com.openai.models.*;

ChatCompletion response = traced.createChatCompletion(
    ChatCompletionCreateParams.builder()
        .model("gpt-4o-mini")
        .addMessage(ChatCompletionMessageParam.ofChatCompletionSystemMessageParam(
            ChatCompletionSystemMessageParam.builder()
                .role(ChatCompletionSystemMessageParam.Role.SYSTEM)
                .content(ChatCompletionSystemMessageParam.Content.ofTextContent(
                    "You are a helpful assistant."))
                .build()))
        .addMessage(ChatCompletionMessageParam.ofChatCompletionUserMessageParam(
            ChatCompletionUserMessageParam.builder()
                .role(ChatCompletionUserMessageParam.Role.USER)
                .content(ChatCompletionUserMessageParam.Content.ofTextContent(
                    "What is the capital of France?"))
                .build()))
        .temperature(0.7)
        .build()
);

System.out.println(response.choices().get(0).message().content().orElse(""));

Span created: “OpenAI Chat Completion” with kind LLM


Embeddings

import com.openai.models.*;

CreateEmbeddingResponse response = traced.createEmbedding(
    EmbeddingCreateParams.builder()
        .model("text-embedding-3-small")
        .input(EmbeddingCreateParams.Input.ofString("Hello world"))
        .build()
);

System.out.println("Dimensions: " + response.data().get(0).embedding().size());

Span created: “OpenAI Embedding” with kind EMBEDDING


Streaming

The streaming wrapper collects all chunks, records the full response in the span, then returns them as an Iterable:

import com.openai.models.*;

Iterable<ChatCompletionChunk> chunks = traced.streamChatCompletion(
    ChatCompletionCreateParams.builder()
        .model("gpt-4o-mini")
        .addMessage(ChatCompletionMessageParam.ofChatCompletionUserMessageParam(
            ChatCompletionUserMessageParam.builder()
                .role(ChatCompletionUserMessageParam.Role.USER)
                .content(ChatCompletionUserMessageParam.Content.ofTextContent(
                    "Write a haiku about Java."))
                .build()))
        .build()
);

for (ChatCompletionChunk chunk : chunks) {
    chunk.choices().get(0).delta().content().ifPresent(System.out::print);
}

Span created: “OpenAI Chat Completion (Stream)” with kind LLM. The span captures the accumulated full response, not individual chunks.
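The accumulation step can be sketched in plain Java. ChatChunk below is a hypothetical stand-in for the SDK's ChatCompletionChunk delta structure (it is not part of either library); the point is that per-chunk content deltas are concatenated into one full response before being recorded on the span:

```java
import java.util.List;

// Hypothetical stand-in for ChatCompletionChunk: each chunk carries an
// optional content delta and, on the final chunk, a finish reason.
record ChatChunk(String contentDelta, String finishReason) {}

public class StreamAccumulator {
    // Concatenate per-chunk deltas into the full response text, the way
    // the streaming wrapper records a single accumulated output on the span.
    static String accumulate(List<ChatChunk> chunks) {
        StringBuilder full = new StringBuilder();
        for (ChatChunk chunk : chunks) {
            if (chunk.contentDelta() != null) {
                full.append(chunk.contentDelta());
            }
        }
        return full.toString();
    }

    public static void main(String[] args) {
        List<ChatChunk> chunks = List.of(
            new ChatChunk("Green ", null),
            new ChatChunk("coffee steams", null),
            new ChatChunk(null, "stop"));
        System.out.println(accumulate(chunks)); // prints "Green coffee steams"
    }
}
```

Because chunks are only handed back to your loop after being buffered, the span always contains the complete output even if your consumer stops iterating early.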


What gets captured

Chat completion spans

| Attribute | Example |
| --- | --- |
| llm.provider | openai |
| llm.request.model | gpt-4o-mini |
| llm.response.model | gpt-4o-mini-2024-07-18 |
| llm.response.id | chatcmpl-abc123 |
| llm.request.temperature | 0.7 |
| llm.request.top_p | 1.0 |
| llm.request.max_tokens | 1024 |
| llm.token_count.prompt | 15 |
| llm.token_count.completion | 42 |
| llm.token_count.total | 57 |
| llm.response.finish_reason | stop |
| Input/output messages | Structured role + content JSON |
| fi.raw_input / fi.raw_output | Full request/response JSON |
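The token-count attributes obey a simple invariant: llm.token_count.total is the sum of the prompt and completion counts (15 + 42 = 57 in the example values above). A minimal sketch, using a plain map as an illustrative stand-in for the span's attribute storage:

```java
import java.util.Map;

public class TokenCounts {
    // Illustrative only: span attributes modeled as a plain map.
    // Returns the expected value of llm.token_count.total.
    static long total(Map<String, Long> attrs) {
        return attrs.get("llm.token_count.prompt")
             + attrs.get("llm.token_count.completion");
    }

    public static void main(String[] args) {
        Map<String, Long> attrs = Map.of(
            "llm.token_count.prompt", 15L,
            "llm.token_count.completion", 42L);
        System.out.println(total(attrs)); // prints 57
    }
}
```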

Embedding spans

| Attribute | Example |
| --- | --- |
| embedding.model_name | text-embedding-3-small |
| embedding.vector_count | 1 |
| embedding.dimensions | 1536 |
| llm.token_count.prompt | 2 |
| llm.token_count.total | 2 |

Accessing the original client

If you need the unwrapped client for operations that aren’t traced:

OpenAIClient original = traced.unwrap();