This is the full path from an uninstrumented agent to one complete, well-shaped trace in Lemma. Everything you need is on this page; the per-primitive pages (Traces, Generations, Tool calls, Spans, Threads & context) go deeper on each piece.
Lemma is opinionated, not a generic OTLP destination. It reads a specific trace shape. Follow this walkthrough and your agent produces that shape; forward arbitrary spans and your traces will render empty or broken.

What you’ll build

One agent execution becomes one trace. A realistic support agent retrieves context, calls a tool, and asks a model — all nested under a single root:
support-agent                  ← trace root (input, output, agent name, thread, user)
├─ retrieve-context            ← span
│  └─ search_docs              ← tool call (args, result)
├─ lookup_order                ← tool call (args, result)
└─ answer                      ← generation (model, tokens, prompt, completion)

1. Install

npm install @langfuse/tracing @langfuse/otel @opentelemetry/sdk-trace-node @opentelemetry/exporter-trace-otlp-proto

2. Configure the exporter (once, at startup)

Point the Langfuse span processor at Lemma's OTLP endpoint and register it before any agent or model client runs. Late initialization is the most common reason spans go missing.
// instrumentation.ts — imported first, before your app code
import { LangfuseSpanProcessor } from "@langfuse/otel";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-proto";
import { NodeTracerProvider } from "@opentelemetry/sdk-trace-node";

export const lemmaProcessor = new LangfuseSpanProcessor({
  exporter: new OTLPTraceExporter({
    url: process.env.LEMMA_BASE_URL,
    headers: {
      Authorization: `Bearer ${process.env.LEMMA_API_KEY}`,
      "X-Lemma-Project-ID": process.env.LEMMA_PROJECT_ID,
    },
  }),
});

new NodeTracerProvider({ spanProcessors: [lemmaProcessor] }).register();
Set the environment variables below; the values are in your Lemma project settings. Lemma-only export needs no LANGFUSE_* credentials.
export LEMMA_BASE_URL="https://api.uselemma.ai/otel/v1/traces"
export LEMMA_API_KEY="lma_..."
export LEMMA_PROJECT_ID="proj_..."
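A missing variable may only show up later as missing traces, so it can help to fail fast at startup. The following is a minimal sketch, not part of Lemma or Langfuse: `readLemmaEnv` and `LemmaEnv` are hypothetical names, and it assumes all three variables above are required.

```typescript
// Hypothetical helper for illustration; not provided by Lemma or Langfuse.
type LemmaEnv = { baseUrl: string; apiKey: string; projectId: string };

// Reads the three Lemma variables from an env-like object and throws,
// naming every missing variable, if any of them is unset or empty.
function readLemmaEnv(env: Record<string, string | undefined>): LemmaEnv {
  const required = ["LEMMA_BASE_URL", "LEMMA_API_KEY", "LEMMA_PROJECT_ID"] as const;
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing Lemma environment variables: ${missing.join(", ")}`);
  }
  return {
    baseUrl: env.LEMMA_BASE_URL as string,
    apiKey: env.LEMMA_API_KEY as string,
    projectId: env.LEMMA_PROJECT_ID as string,
  };
}
```

Calling `readLemmaEnv(process.env)` at the top of instrumentation.ts and passing the result to the exporter would also keep the header values typed as plain strings rather than `string | undefined`.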

3. The complete instrumented agent

Here is the whole agent in one piece. Each part is explained below.
import {
  propagateAttributes,
  startActiveObservation,
} from "@langfuse/tracing";
import { lemmaProcessor } from "./instrumentation";

export async function handleSupportRequest(req: {
  message: string;
  conversationId: string;
  userId: string;
}): Promise<string> {
  // (1) One agent execution = one trace. Open the root span.
  return startActiveObservation("support-agent", async (root) => {
    root.update({ input: req.message });

    // (2) Trace-level context: agent name, thread, user.
    return propagateAttributes(
      {
        traceName: "support-agent",
        sessionId: req.conversationId,
        userId: req.userId,
        metadata: { "gen_ai.agent.name": "support-agent" },
      },
      async () => {
        try {
          // (3) A span groups a multi-step sub-task (retrieval).
          const docs = await startActiveObservation(
            "retrieve-context",
            async (span) => {
              span.update({ input: { query: req.message } });

              // (4) A tool call nested inside the span.
              const found = await startActiveObservation(
                "search_docs",
                async (tool) => {
                  const result = await searchDocs(req.message);
                  tool.update({ input: { query: req.message }, output: result });
                  return result;
                },
                { asType: "tool" },
              );

              span.update({ output: { count: found.length } });
              return found;
            },
          );

          // (5) Another tool call, directly under the root.
          const order = await startActiveObservation(
            "lookup_order",
            async (tool) => {
              const result = await lookupOrder(req.userId);
              tool.update({ input: { userId: req.userId }, output: result });
              return result;
            },
            { asType: "tool" },
          );

          // (6) A generation: the LLM call, with model + token usage.
          const answer = await startActiveObservation(
            "answer",
            async (gen) => {
              const messages = buildPrompt(req.message, docs, order);
              const r = await callModel(messages);
              gen.update({
                input: messages,
                output: r.text,
                model: "gpt-4o",
                usageDetails: { input: r.usage.inputTokens, output: r.usage.outputTokens },
              });
              return r.text;
            },
            { asType: "generation" },
          );

          // (7) Record the final output on the root.
          root.update({ output: answer });
          return answer;
        } catch (error) {
          // (8) Mark failures on the root so they surface in Lemma.
          root.update({
            level: "ERROR",
            statusMessage: error instanceof Error ? error.message : String(error),
          });
          throw error;
        } finally {
          // (9) Flush in short-lived / serverless runtimes.
          await lemmaProcessor.forceFlush();
        }
      },
    );
  });
}

How each part maps to the contract

  1. Root span — one execution, one trace. Everything nests inside this callback. → Traces
  2. Trace context — agent name, thread (sessionId), and user, set once for the whole trace. → Threads & context
  3. Span — groups a sub-task so its work nests beneath it. → Spans
  4. Tool call (nested) — a tool invoked as part of retrieval; input is the args, output is the result. → Tool calls
  5. Tool call (top-level) — a tool directly under the root.
  6. Generation — the LLM call, carrying model and usageDetails so Lemma can compute cost and tokens. → Generations
  7. Output — the final answer recorded on the root.
  8. Errors — level: "ERROR" on the failing span (the root, in this example) so failures are visible.
  9. Flush — force a flush before a short-lived process exits. → Setup
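Step (6) maps the provider's token counts onto usageDetails by hand. If several generations share that mapping, a small helper can keep it in one place. This is a sketch under assumptions: `toUsageDetails` and `ProviderUsage` are hypothetical names, and the `inputTokens` / `outputTokens` fields simply mirror the `r.usage` shape used in the example above; your model client may report usage differently.

```typescript
// Hypothetical provider usage shape, mirroring r.usage in the example above.
type ProviderUsage = { inputTokens?: number; outputTokens?: number };

// Maps provider token counts onto the { input, output } keys used for
// usageDetails. Counts that are absent are left out rather than reported as 0.
function toUsageDetails(usage?: ProviderUsage): Record<string, number> | undefined {
  const details: Record<string, number> = {};
  if (typeof usage?.inputTokens === "number") details.input = usage.inputTokens;
  if (typeof usage?.outputTokens === "number") details.output = usage.outputTokens;
  // Return undefined when nothing was reported, so callers can skip the field.
  return Object.keys(details).length > 0 ? details : undefined;
}
```

Leaving out absent counts, instead of defaulting them to zero, avoids recording a token count the provider never reported.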

4. Instrumenting an agent loop

Most agents loop: the model calls tools, you feed results back, repeat. The rule is unchanged — the whole loop is one trace. Open the root once, then create a generation per model turn and a tool call per tool invocation, all inside the root callback.
await startActiveObservation("support-agent", async (root) => {
  root.update({ input: req.message });
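  // Typed explicitly so the tool messages pushed below (which carry `name`) type-check.
  const messages: { role: string; content: string; name?: string }[] = [
    { role: "user", content: req.message },
  ];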

  while (true) {
    const turn = await startActiveObservation(
      "model-turn",
      async (gen) => {
        const r = await callModel(messages);
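        gen.update({
          input: messages,
          output: r,
          model: "gpt-4o",
          // Same mapping as in section 3: provider token counts → usageDetails keys.
          usageDetails: { input: r.usage.inputTokens, output: r.usage.outputTokens },
        });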
        return r;
      },
      { asType: "generation" },
    );

    if (!turn.toolCalls?.length) {
      root.update({ output: turn.text });
      return turn.text;
    }

    for (const call of turn.toolCalls) {
      const result = await startActiveObservation(
        call.name,
        async (tool) => {
          const out = await runTool(call.name, call.args);
          tool.update({ input: call.args, output: out });
          return out;
        },
        { asType: "tool" },
      );
      messages.push({ role: "tool", name: call.name, content: JSON.stringify(result) });
    }
  }
});
Already using a framework like the Vercel AI SDK, OpenAI Agents, or LangGraph? It can emit these spans for you — see Frameworks. You still wrap the run in one root span so every turn nests into a single trace.
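The `while (true)` loop in this section runs until the model stops requesting tools. One way to guard against a model that never stops is to cap the number of turns. The sketch below is self-contained and uses hypothetical names (`runBounded`, `Turn`, `step`); `step` stands in for one model turn plus its tool calls, and the shapes are assumptions rather than any SDK's types.

```typescript
// Hypothetical shapes for illustration; not part of any SDK.
type Turn = { text: string; toolCalls?: { name: string; args: unknown }[] };

// Runs `step` until it returns a turn without tool calls, or throws once
// `maxTurns` turns have been used without reaching a final answer.
async function runBounded(
  step: (turnIndex: number) => Promise<Turn>,
  maxTurns: number,
): Promise<string> {
  for (let i = 0; i < maxTurns; i++) {
    const turn = await step(i);
    // No tool calls means the model produced its final answer.
    if (!turn.toolCalls?.length) return turn.text;
  }
  throw new Error(`Agent did not finish within ${maxTurns} turns`);
}
```

If the limit is hit inside the root callback, the thrown error can be handled the same way as step (8) in section 3, so the runaway execution still appears as a single failed trace.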

5. Verify in Lemma

Open the Lemma dashboard → Traces. Confirm:
  • One trace per run — the whole execution is one trace, not separate traces per call.
  • Root has input and output — the user message and the final answer.
  • Generations are nested — each LLM call shows model and token usage.
  • Tools are nested — each tool call shows arguments and result.
  • Context is set — agent name, thread, and user appear on the trace.
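Most of the checks above can also be expressed in code, for example in an integration test. This is only a sketch: the `Obs` type is a hypothetical, simplified view of one run's observations, not Lemma's actual export format, and `checkTraceShape` is an illustrative name. It does not cover the trace-level context check.

```typescript
// Hypothetical, simplified view of exported observations; the real export
// format may differ.
type Obs = {
  id: string;
  traceId: string;
  parentId: string | null; // null marks the trace root
  type: "span" | "tool" | "generation";
  input?: unknown;
  output?: unknown;
  model?: string;
};

// Returns a list of human-readable problems; an empty list means the
// observations match the checklist above.
function checkTraceShape(observations: Obs[]): string[] {
  const problems: string[] = [];

  const traceIds = new Set(observations.map((o) => o.traceId));
  if (traceIds.size !== 1) problems.push(`expected 1 trace, found ${traceIds.size}`);

  const roots = observations.filter((o) => o.parentId === null);
  if (roots.length !== 1) problems.push(`expected 1 root, found ${roots.length}`);
  for (const root of roots) {
    if (root.input === undefined) problems.push("root is missing input");
    if (root.output === undefined) problems.push("root is missing output");
  }

  for (const o of observations) {
    if (o.type === "generation" && !o.model) {
      problems.push(`generation ${o.id} has no model`);
    }
    if (o.type === "tool" && (o.input === undefined || o.output === undefined)) {
      problems.push(`tool ${o.id} is missing arguments or result`);
    }
  }
  return problems;
}
```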

Go deeper

Traces

The root span, errors, and context propagation.

Generations

LLM calls with model and token usage.

Tool calls

Tool arguments, results, and a reusable wrapper.

Trace contract

The exact shape Lemma reads.