When you call OpenAI, Anthropic, or LiteLLM directly (not through a framework like Vercel AI SDK or LangChain), you need to add OpenInference instrumentation to get per-call child spans that show prompts, completions, model, and token usage. Without it, you only get the top-level ai.agent.run span — inputs, outputs, and timing are captured, but there are no child spans showing individual LLM calls. The examples below cover the providers documented in Lemma’s integration pages. OpenInference supports additional providers — see the OpenInference docs for the full list and setup guides.

How it works

OpenInference patches your provider SDK at startup. Every subsequent call to openai.chat.completions.create() (or equivalent) emits a gen_ai.chat child span that automatically nests under the currently active agent() span.
ai.agent.run          ← agent()
  gen_ai.chat         ← OpenInference (OpenAI call 1)
  gen_ai.chat         ← OpenInference (OpenAI call 2)
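The patching mechanism can be sketched with a self-contained toy model (no real SDK or instrumentor involved; the `sdk`, `complete`, and `span(...)` names are illustrative only). It shows why a method reference bound before the instrumentor runs never produces spans, while calls made through the object after patching do:

```typescript
// Toy model: an instrumentor replaces a method on the SDK object at startup.
const sdk = {
  complete(prompt: string): string {
    return `raw:${prompt}`;
  },
};

// A reference bound BEFORE patching captures the unpatched function.
const boundEarly = sdk.complete.bind(sdk);

// Simplified "instrumentor": wrap the method so every call emits a span.
const original = sdk.complete.bind(sdk);
sdk.complete = (prompt: string) => `span(${original(prompt)})`;

console.log(boundEarly("hi"));   // unpatched path — no span is emitted
console.log(sdk.complete("hi")); // patched path — call is wrapped in a span
```

The real instrumentor wraps `openai.chat.completions.create()` the same way, attaching each wrapped call to the currently active `ai.agent.run` span.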

Setup

1. Install

npm install @uselemma/tracing openai @opentelemetry/instrumentation @arizeai/openinference-instrumentation-openai
For full registration steps (including registerOTel() + registerInstrumentations()), see your provider’s Integration page.

Register instrumentation before importing or instantiating the provider client: if you bind a function reference before the instrumentor runs, that reference stays unpatched and its calls emit no spans. For LiteLLM, always call litellm.acompletion(...) as a module call rather than importing the function directly, so the patched module-level function is resolved at call time.
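A minimal registration sketch for the OpenAI case, assuming the `registerInstrumentations` and `OpenAIInstrumentation` exports of the packages installed above (the `instrumentation.ts` filename and the dynamic import are conventions, not requirements; see your provider’s Integration page for the full registerOTel() setup):

```typescript
// instrumentation.ts — load this before any code that imports the provider
// client (e.g. via node --import, or as your app's first import).
import { registerInstrumentations } from "@opentelemetry/instrumentation";
import { OpenAIInstrumentation } from "@arizeai/openinference-instrumentation-openai";

registerInstrumentations({
  instrumentations: [new OpenAIInstrumentation()],
});

// Import the provider AFTER registration; a dynamic import makes the
// ordering explicit and guarantees the module is resolved post-patching.
const { default: OpenAI } = await import("openai");
export const openai = new OpenAI();
```

The dynamic import is the point: a static `import OpenAI from "openai"` at the top of the same file could be hoisted ahead of the registration call, leaving you with an unpatched client.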

What you’ll see in Lemma

Span            Source          Contains
ai.agent.run    agent()         Run input, output, timing, run ID
gen_ai.chat     OpenInference   Model name, prompt, completion, token usage

Sending traces to multiple destinations

→ See Dual export for adding Lemma alongside an existing OTel destination.

Next Steps

  • Manual instrumentation — if you also need custom tool(), llm(), or retrieval() spans
  • Troubleshooting — spans not appearing, missing child spans, import pitfalls
  • Recipes — copy-paste examples for tool loops, streaming, multi-step agents