What you’ll build
One agent execution becomes one trace. A realistic support agent retrieves context, calls a tool, and asks a model — all nested under a single root:1. Install
- TypeScript
- Python
2. Configure the exporter (once, at startup)
Point Langfuse at Lemma and register it before any agent or model client runs. Late initialization is the most common reason spans go missing.- TypeScript
- Python
LANGFUSE_* credentials.
3. The complete instrumented agent
Here is the whole agent in one piece. Each part is explained below.- TypeScript
- Python
How each part maps to the contract
- Root span — one execution, one trace. Everything nests inside this callback. → Traces
- Trace context — agent name, thread (
sessionId), and user, set once for the whole trace. → Threads & context - Span — groups a sub-task so its work nests beneath it. → Spans
- Tool call (nested) — a tool invoked as part of retrieval;
inputis the args,outputis the result. → Tool calls - Tool call (top-level) — a tool directly under the root.
- Generation — the LLM call, carrying
modelandusageDetailsso Lemma can compute cost and tokens. → Generations - Output — the final answer recorded on the root.
- Errors —
level: "ERROR"on the failing span so failures are visible. - Flush — force a flush before a short-lived process exits. → Setup
4. Instrumenting an agent loop
Most agents loop: the model calls tools, you feed results back, repeat. The rule is unchanged — the whole loop is one trace. Open the root once, then create a generation per model turn and a tool call per tool invocation, all inside the root callback.- TypeScript
- Python
Already using a framework like the Vercel AI SDK, OpenAI Agents, or LangGraph? It can emit these spans for you — see Frameworks. You still wrap the run in one root span so every turn nests into a single trace.
5. Verify in Lemma
Open the Lemma dashboard → Traces. Confirm:- One trace per run — the whole execution is one trace, not separate traces per call.
- Root has input and output — the user message and the final answer.
- Generations are nested — each LLM call shows model and token usage.
- Tools are nested — each tool call shows arguments and result.
- Context is set — agent name, thread, and user appear on the trace.
Go deeper
Traces
The root span, errors, and context propagation.
Generations
LLM calls with model and token usage.
Tool calls
Tool arguments, results, and a reusable wrapper.
Trace contract
The exact shape Lemma reads.