When you call OpenAI, Anthropic, or LiteLLM directly (not through a framework like Vercel AI SDK or LangChain), you need to add OpenInference instrumentation to get per-call child spans that show prompts, completions, model, and token usage.
Without it, you only get the top-level ai.agent.run span — inputs, outputs, and timing are captured, but there are no child spans showing individual LLM calls.
The examples below cover the providers documented in Lemma’s integration pages. OpenInference supports additional providers — see the OpenInference docs for the full list and setup guides.
How it works
OpenInference patches your provider SDK at startup. Every subsequent call to openai.chat.completions.create() (or equivalent) emits a gen_ai.chat child span that automatically nests under the currently active agent() span.
```
ai.agent.run            ← agent()
├─ gen_ai.chat          ← OpenInference (OpenAI call 1)
└─ gen_ai.chat          ← OpenInference (OpenAI call 2)
```
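The nesting works through context propagation: whichever span is active when the patched SDK call runs becomes the parent. A stdlib-only toy model of that mechanism (the `Span` class and names here are illustrative, not Lemma's or OpenTelemetry's actual API):

```python
import contextvars

# Tracks the currently active span, per async/thread context.
_current_span = contextvars.ContextVar("current_span", default=None)

class Span:
    def __init__(self, name):
        self.name = name
        self.parent = _current_span.get()  # active span becomes the parent
        self.children = []
        if self.parent is not None:
            self.parent.children.append(self)

    def __enter__(self):
        self._token = _current_span.set(self)
        return self

    def __exit__(self, *exc):
        _current_span.reset(self._token)

# agent() opens ai.agent.run; each patched LLM call opens gen_ai.chat,
# which finds ai.agent.run as the active span and nests under it.
with Span("ai.agent.run") as run:
    with Span("gen_ai.chat"):  # OpenAI call 1
        pass
    with Span("gen_ai.chat"):  # OpenAI call 2
        pass

print([c.name for c in run.children])  # ['gen_ai.chat', 'gen_ai.chat']
```

This is why no manual wiring is needed per call: the instrumentor only has to open a span in the current context, and the parent/child relationship falls out of the context variable.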
Setup
1. Install
OpenAI (TypeScript)

```shell
npm install @uselemma/tracing openai @opentelemetry/instrumentation @arizeai/openinference-instrumentation-openai
```

Anthropic (TypeScript)

```shell
npm install @uselemma/tracing @anthropic-ai/sdk @opentelemetry/instrumentation @arizeai/openinference-instrumentation-anthropic
```

OpenAI (Python)

```shell
pip install uselemma-tracing openai openinference-instrumentation-openai
```

Anthropic (Python)

```shell
pip install uselemma-tracing anthropic openinference-instrumentation-anthropic
```

LiteLLM (Python)

```shell
pip install uselemma-tracing litellm openinference-instrumentation-litellm
```
For full registration steps (including registerOTel() + registerInstrumentations()), see your provider’s Integration page.
Register instrumentation before importing or instantiating the provider client. If you bind a function reference before the instrumentor runs, that reference stays unpatched and its calls produce no child spans. For LiteLLM, always call litellm.acompletion(...) as a module attribute rather than importing the function directly.
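The binding pitfall comes down to how Python monkey-patching works: instrumentors replace a module attribute, so any reference captured earlier still points at the original function. A minimal stdlib demonstration (the `provider` module and `completion` function here are stand-ins, not real APIs):

```python
import types

# Stand-in "provider" module; an instrumentor similarly replaces module
# attributes such as litellm.acompletion at registration time.
provider = types.ModuleType("provider")
provider.completion = lambda prompt: f"raw:{prompt}"

# Binding the function BEFORE the instrumentor runs captures the
# unpatched object; later patching cannot reach this reference.
early_ref = provider.completion

# The "instrumentor": wraps the module attribute with tracing.
_unpatched = provider.completion
provider.completion = lambda prompt: f"traced:{_unpatched(prompt)}"

print(early_ref("hi"))            # raw:hi (bypasses tracing entirely)
print(provider.completion("hi"))  # traced:raw:hi (module call is patched)
```

Calling through the module attribute always resolves to the current, patched function, which is why litellm.acompletion(...) is safe while `from litellm import acompletion` is not.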
What you’ll see in Lemma
| Span | Source | Contains |
|---|---|---|
| ai.agent.run | agent() | Run input, output, timing, run ID |
| gen_ai.chat | OpenInference | Model name, prompt, completion, token usage |
Sending traces to multiple destinations
→ See Dual export for adding Lemma alongside an existing OTel destination.
Next Steps
- Manual instrumentation — if you also need custom tool(), llm(), or retrieval() spans
- Troubleshooting — spans not appearing, missing child spans, import pitfalls
- Recipes — copy-paste examples for tool loops, streaming, multi-step agents