Use wrap_agent / wrapAgent for the agent run and OpenInference for child model spans. If you’re calling a provider SDK directly, this is the piece that adds prompt, completion, model, and token data to the trace.
Looking for the full run/step/tool-call lifecycle docs? Start with Custom Instrumentation Overview.

What OpenInference adds

OpenInference patches supported provider SDKs so each LLM call emits a child span in the active trace, usually named something like gen_ai.chat. Those spans show prompt, completion, model, and token attributes in Lemma. It does not replace wrapAgent, and it does not trace your tool execution or app logic between model calls. For that, use wrap_agent / wrapAgent and manual spans where needed. For the full list of supported SDKs and frameworks, see the OpenInference instrumentation docs.

Getting Started

Install the Lemma tracing package, your provider SDK, and the matching OpenInference instrumentor. The examples below use OpenAI, Anthropic, and LiteLLM, but the same pattern applies to other SDKs supported by OpenInference.
```shell
npm install @uselemma/tracing @opentelemetry/instrumentation

# Example: OpenAI
npm install openai @arizeai/openinference-instrumentation-openai

# Example: Anthropic
npm install @anthropic-ai/sdk @arizeai/openinference-instrumentation-anthropic
```
Register instrumentation once at startup, before importing or calling your provider client.
```typescript
import { registerOTel } from '@uselemma/tracing';
import { registerInstrumentations } from '@opentelemetry/instrumentation';
import { OpenAIInstrumentation } from '@arizeai/openinference-instrumentation-openai';
import { AnthropicInstrumentation } from '@arizeai/openinference-instrumentation-anthropic';

const provider = registerOTel();

registerInstrumentations({
  instrumentations: [
    new OpenAIInstrumentation(),
    new AnthropicInstrumentation(),
  ],
  tracerProvider: provider,
});
```
Then call the provider inside a wrapped agent function:
```typescript
import OpenAI from 'openai';
import { wrapAgent } from '@uselemma/tracing';

const client = new OpenAI();

const callAgent = wrapAgent('my-agent', async ({ onComplete }, input) => {
  const response = await client.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: input.userMessage }],
  });

  const text = response.choices[0].message.content ?? '';
  onComplete(text);
  return text;
});
```
Anthropic and LiteLLM follow the same pattern: register the matching OpenInference instrumentor, then make the SDK call inside the wrapped function.
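As an illustration, here is what that pattern might look like with the Anthropic SDK. This is a hedged sketch, not official sample code: the model id is illustrative, and the response handling assumes a single text content block.

```typescript
import Anthropic from '@anthropic-ai/sdk';
import { wrapAgent } from '@uselemma/tracing';

const anthropic = new Anthropic();

const callAgent = wrapAgent('my-agent', async ({ onComplete }, input) => {
  // The AnthropicInstrumentation registered at startup emits the
  // child LLM span for this call automatically.
  const response = await anthropic.messages.create({
    model: 'claude-sonnet-4-5', // illustrative model id
    max_tokens: 1024,
    messages: [{ role: 'user', content: input.userMessage }],
  });

  const block = response.content[0];
  const text = block.type === 'text' ? block.text : '';
  onComplete(text);
  return text;
});
```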
Set LEMMA_API_KEY and LEMMA_PROJECT_ID in the environment for the process that runs your app.
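For example, in your shell or deployment environment (placeholder values):

```shell
export LEMMA_API_KEY="..."        # your Lemma API key
export LEMMA_PROJECT_ID="..."     # the project traces are sent to
```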

Common edge cases

Import and call-path pitfalls (Python)

Some Python instrumentors patch symbols on a module. If you bind a function before instrumentation runs, you can keep calling an unpatched reference.
  • Prefer module calls such as litellm.acompletion(...).
  • Avoid stale function imports such as from litellm import acompletion when debugging missing spans.
  • The same idea can apply to openai and other provider packages, depending on how the instrumentor patches the SDK.
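The failure mode is easy to reproduce with a toy object standing in for the SDK module — no real instrumentor involved. This TypeScript sketch mirrors the Python module-attribute mechanics:

```typescript
// Toy stand-in for a provider SDK module. This shows why a reference
// saved before patching bypasses the patch entirely.
const sdk = {
  complete(prompt: string): string {
    return `raw:${prompt}`;
  },
};

// Bound BEFORE "instrumentation" runs — analogous to
// `from litellm import acompletion` in Python.
const staleComplete = sdk.complete;

// What an instrumentor effectively does: replace the attribute in place.
const original = sdk.complete;
sdk.complete = (prompt: string) => `traced:${original(prompt)}`;

console.log(sdk.complete('hi'));  // "traced:raw:hi" — goes through the patch
console.log(staleComplete('hi')); // "raw:hi" — unpatched, no span emitted
```

The stale reference still works, which is what makes this pitfall hard to spot: calls succeed, but the spans silently never appear.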

Streaming

Keep stream consumption inside the wrapped function so the run span stays open. For more on that lifecycle, see Wrapping your agent. For example, with Anthropic, prefer messages.create({ ..., stream: true }) over the higher-level messages.stream() helper if you hit instrumentation compatibility issues.
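A minimal sketch of why draining matters — `drainStream` and `fakeStream` are hypothetical names, not part of any SDK; the toy generator stands in for a provider's streaming response:

```typescript
// Drain an async stream into a string. Doing this await INSIDE the
// wrapAgent handler keeps the run span open until the last chunk
// arrives; returning the raw iterator would let the span close early.
async function drainStream(stream: AsyncIterable<string>): Promise<string> {
  let out = '';
  for await (const chunk of stream) {
    out += chunk;
  }
  return out;
}

// Toy stream standing in for a streaming model response.
async function* fakeStream(): AsyncGenerator<string> {
  yield 'Hel';
  yield 'lo';
}

drainStream(fakeStream()).then((text) => console.log(text)); // logs "Hello"
```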

Next Steps