Use wrap_agent / wrapAgent for the agent run and OpenInference for child model spans. If you’re calling a provider SDK directly, this is the piece that adds prompt, completion, model, and token data to the trace.
Looking for the full run/step/tool-call lifecycle docs? Start with Custom Instrumentation Overview.

What OpenInference adds

OpenInference patches supported provider SDKs so each LLM call emits a child span in the active trace, usually named something like gen_ai.chat. Those spans show prompt, completion, model, and token attributes in Lemma. It does not replace wrapAgent, and it does not trace your tool execution or app logic between model calls. For that, use wrap_agent / wrapAgent and manual spans where needed. For the full list of supported SDKs and frameworks, see the OpenInference instrumentation docs.

Getting Started

Install the Lemma tracing package, your provider SDK, and the matching OpenInference instrumentor. The examples below use OpenAI, Anthropic, and LiteLLM, but the same pattern applies to other SDKs supported by OpenInference.
```shell
npm install @uselemma/tracing @opentelemetry/instrumentation

# Example: OpenAI
npm install openai @arizeai/openinference-instrumentation-openai

# Example: Anthropic
npm install @anthropic-ai/sdk @arizeai/openinference-instrumentation-anthropic
```
Register instrumentation once at startup, before importing or calling your provider client.
```typescript
import { registerOTel } from '@uselemma/tracing';
import { registerInstrumentations } from '@opentelemetry/instrumentation';
import { OpenAIInstrumentation } from '@arizeai/openinference-instrumentation-openai';
import { AnthropicInstrumentation } from '@arizeai/openinference-instrumentation-anthropic';

const provider = registerOTel();

registerInstrumentations({
  instrumentations: [
    new OpenAIInstrumentation(),
    new AnthropicInstrumentation(),
  ],
  tracerProvider: provider,
});
```
Then call the provider inside a wrapped agent function:
```typescript
import OpenAI from 'openai';
import { wrapAgent } from '@uselemma/tracing';

const client = new OpenAI();

const callAgent = wrapAgent('my-agent', async ({ onComplete }, input) => {
  const response = await client.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: input.userMessage }],
  });

  const text = response.choices[0].message.content ?? '';
  onComplete(text);
  return text;
});
```
Anthropic and LiteLLM follow the same pattern: register the matching OpenInference instrumentor, then make the SDK call inside the wrapped function.
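As an illustration, here is what that pattern might look like with the Anthropic SDK. This is a hedged sketch, not official sample code: the model id is illustrative, and the response handling assumes a single text content block.

```typescript
import Anthropic from '@anthropic-ai/sdk';
import { wrapAgent } from '@uselemma/tracing';

const anthropic = new Anthropic();

const callAgent = wrapAgent('my-agent', async ({ onComplete }, input) => {
  // The AnthropicInstrumentation registered at startup emits the
  // child LLM span for this call automatically.
  const response = await anthropic.messages.create({
    model: 'claude-sonnet-4-5', // illustrative model id
    max_tokens: 1024,
    messages: [{ role: 'user', content: input.userMessage }],
  });

  const block = response.content[0];
  const text = block.type === 'text' ? block.text : '';
  onComplete(text);
  return text;
});
```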
Set LEMMA_API_KEY and LEMMA_PROJECT_ID in the environment for the process that runs your app.
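For example, in your shell or deployment environment (placeholder values):

```shell
export LEMMA_API_KEY="..."        # your Lemma API key
export LEMMA_PROJECT_ID="..."     # the project traces are sent to
```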

Common edge cases

Import and call-path pitfalls (Python)

Some Python instrumentors patch symbols on a module. If you bind a function before instrumentation runs, you can keep calling an unpatched reference.
  • Prefer module calls such as litellm.acompletion(...).
  • Avoid stale function imports such as from litellm import acompletion when debugging missing spans.
  • The same idea can apply to openai and other provider packages, depending on how the instrumentor patches the SDK.
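The failure mode is easy to reproduce with a toy object standing in for the SDK module — no real instrumentor involved. This TypeScript sketch mirrors the Python module-attribute mechanics:

```typescript
// Toy stand-in for a provider SDK module. This shows why a reference
// saved before patching bypasses the patch entirely.
const sdk = {
  complete(prompt: string): string {
    return `raw:${prompt}`;
  },
};

// Bound BEFORE "instrumentation" runs — analogous to
// `from litellm import acompletion` in Python.
const staleComplete = sdk.complete;

// What an instrumentor effectively does: replace the attribute in place.
const original = sdk.complete;
sdk.complete = (prompt: string) => `traced:${original(prompt)}`;

console.log(sdk.complete('hi'));  // "traced:raw:hi" — goes through the patch
console.log(staleComplete('hi')); // "raw:hi" — unpatched, no span emitted
```

The stale reference still works, which is what makes this pitfall hard to spot: calls succeed, but the spans silently never appear.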

Streaming

Keep stream consumption inside the wrapped function so the run span stays open. For more on that lifecycle, see Wrapping your agent. For example, with Anthropic, prefer messages.create({ ..., stream: true }) over the higher-level messages.stream() helper if you hit instrumentation compatibility issues.
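A minimal sketch of why draining matters — `drainStream` and `fakeStream` are hypothetical names, not part of any SDK; the toy generator stands in for a provider's streaming response:

```typescript
// Drain an async stream into a string. Doing this await INSIDE the
// wrapAgent handler keeps the run span open until the last chunk
// arrives; returning the raw iterator would let the span close early.
async function drainStream(stream: AsyncIterable<string>): Promise<string> {
  let out = '';
  for await (const chunk of stream) {
    out += chunk;
  }
  return out;
}

// Toy stream standing in for a streaming model response.
async function* fakeStream(): AsyncGenerator<string> {
  yield 'Hel';
  yield 'lo';
}

drainStream(fakeStream()).then((text) => console.log(text)); // logs "Hello"
```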

Next Steps