# Instrument an agent

> A complete, start-to-finish walkthrough for instrumenting an agent so Lemma can read it

This is the full path from an uninstrumented agent to one complete, well-shaped trace in Lemma. Everything you need is on this page; the per-primitive pages ([Traces](/tracing/instrumentation/traces), [Generations](/tracing/instrumentation/generations), [Tool calls](/tracing/instrumentation/tool-calls), [Spans](/tracing/instrumentation/spans), [Threads & context](/tracing/instrumentation/context)) go deeper on each piece.

<Warning>
  Lemma is opinionated, not a generic OTLP destination. It reads a specific [trace shape](/reference/trace-contract). Follow this walkthrough and your agent produces that shape; forward arbitrary spans and your traces will render empty or broken.
</Warning>

## What you'll build

One agent execution becomes one trace. A realistic support agent retrieves context, calls a tool, and asks a model — all nested under a single root:

```text theme={null}
support-agent                  ← trace root (input, output, agent name, thread, user)
├─ retrieve-context            ← span
│  └─ search_docs              ← tool call (args, result)
├─ lookup_order                ← tool call (args, result)
└─ answer                      ← generation (model, tokens, prompt, completion)
```

## 1. Install

<Tabs>
  <Tab title="TypeScript">
    ```bash theme={null}
    npm install @langfuse/tracing @langfuse/otel @opentelemetry/sdk-trace-node @opentelemetry/exporter-trace-otlp-proto
    ```
  </Tab>

  <Tab title="Python">
    ```bash theme={null}
    pip install "langfuse>=3,<4" opentelemetry-sdk opentelemetry-exporter-otlp
    ```
  </Tab>
</Tabs>

## 2. Configure the exporter (once, at startup)

Point the OTLP exporter at Lemma and register the tracer provider before any agent or model client is created. Late initialization is the most common reason spans go missing.

<Tabs>
  <Tab title="TypeScript">
    ```typescript theme={null}
    // instrumentation.ts — imported first, before your app code
    import { LangfuseSpanProcessor } from "@langfuse/otel";
    import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-proto";
    import { NodeTracerProvider } from "@opentelemetry/sdk-trace-node";

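    // Fail fast: an unset variable would otherwise silently produce an
    // invalid endpoint or header.
    const { LEMMA_BASE_URL, LEMMA_API_KEY, LEMMA_PROJECT_ID } = process.env;
    if (!LEMMA_BASE_URL || !LEMMA_API_KEY || !LEMMA_PROJECT_ID) {
      throw new Error("Set LEMMA_BASE_URL, LEMMA_API_KEY, and LEMMA_PROJECT_ID");
    }

    export const lemmaProcessor = new LangfuseSpanProcessor({
      exporter: new OTLPTraceExporter({
        url: LEMMA_BASE_URL,
        headers: {
          Authorization: `Bearer ${LEMMA_API_KEY}`,
          "X-Lemma-Project-ID": LEMMA_PROJECT_ID,
        },
      }),
    });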

    new NodeTracerProvider({ spanProcessors: [lemmaProcessor] }).register();
    ```
  </Tab>

  <Tab title="Python">
    ```python theme={null}
    # instrumentation.py — imported first, before your app code
    import os
    from opentelemetry import trace
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor
    from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

    provider = TracerProvider()
    provider.add_span_processor(
        BatchSpanProcessor(
            OTLPSpanExporter(
                endpoint=os.environ["LEMMA_BASE_URL"],
                headers={
                    "Authorization": f"Bearer {os.environ['LEMMA_API_KEY']}",
                    "X-Lemma-Project-ID": os.environ["LEMMA_PROJECT_ID"],
                },
            )
        )
    )
    trace.set_tracer_provider(provider)
    ```
  </Tab>
</Tabs>

Set the environment variables (find the values in [Lemma project settings](https://platform.uselemma.ai)). Lemma-only export needs no `LANGFUSE_*` credentials.

```bash theme={null}
export LEMMA_BASE_URL="https://api.uselemma.ai/otel/v1/traces"
export LEMMA_API_KEY="lma_..."
export LEMMA_PROJECT_ID="proj_..."
```
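
A fail-fast check at startup makes a missing variable obvious instead of leaving you with silently dropped spans. The following is a minimal sketch, assuming only the three variable names above; `load_lemma_config` is a hypothetical helper, not part of any SDK.

```python
import os
from typing import Mapping

REQUIRED_VARS = ("LEMMA_BASE_URL", "LEMMA_API_KEY", "LEMMA_PROJECT_ID")


def load_lemma_config(env: Mapping[str, str] = os.environ) -> dict:
    """Return the exporter endpoint and headers, or raise if a variable is unset."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing Lemma environment variables: {', '.join(missing)}")
    return {
        "endpoint": env["LEMMA_BASE_URL"],
        "headers": {
            "Authorization": f"Bearer {env['LEMMA_API_KEY']}",
            "X-Lemma-Project-ID": env["LEMMA_PROJECT_ID"],
        },
    }
```

The returned keys match the `endpoint` and `headers` arguments used in the Python setup, so `OTLPSpanExporter(**load_lemma_config())` could replace the inline `os.environ` lookups.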

## 3. The complete instrumented agent

Here is the whole agent in one piece. Each part is explained below.

<Tabs>
  <Tab title="TypeScript">
    ```typescript theme={null}
    import {
      propagateAttributes,
      startActiveObservation,
    } from "@langfuse/tracing";
    import { lemmaProcessor } from "./instrumentation";

    export async function handleSupportRequest(req: {
      message: string;
      conversationId: string;
      userId: string;
    }): Promise<string> {
      try {
        // (1) One agent execution = one trace. Open the root span.
        return await startActiveObservation("support-agent", async (root) => {
          root.update({ input: req.message });

          // (2) Trace-level context: agent name, thread, user.
          return propagateAttributes(
            {
              traceName: "support-agent",
              sessionId: req.conversationId,
              userId: req.userId,
              metadata: { "gen_ai.agent.name": "support-agent" },
            },
            async () => {
              try {
                // (3) A span groups a multi-step sub-task (retrieval).
                const docs = await startActiveObservation(
                  "retrieve-context",
                  async (span) => {
                    span.update({ input: { query: req.message } });

                    // (4) A tool call nested inside the span.
                    const found = await startActiveObservation(
                      "search_docs",
                      async (tool) => {
                        const result = await searchDocs(req.message);
                        tool.update({ input: { query: req.message }, output: result });
                        return result;
                      },
                      { asType: "tool" },
                    );

                    span.update({ output: { count: found.length } });
                    return found;
                  },
                );

                // (5) Another tool call, directly under the root.
                const order = await startActiveObservation(
                  "lookup_order",
                  async (tool) => {
                    const result = await lookupOrder(req.userId);
                    tool.update({ input: { userId: req.userId }, output: result });
                    return result;
                  },
                  { asType: "tool" },
                );

                // (6) A generation: the LLM call, with model + token usage.
                const answer = await startActiveObservation(
                  "answer",
                  async (gen) => {
                    const messages = buildPrompt(req.message, docs, order);
                    const r = await callModel(messages);
                    gen.update({
                      input: messages,
                      output: r.text,
                      model: "gpt-4o",
                      usageDetails: { input: r.usage.inputTokens, output: r.usage.outputTokens },
                    });
                    return r.text;
                  },
                  { asType: "generation" },
                );

                // (7) Record the final output on the root.
                root.update({ output: answer });
                return answer;
              } catch (error) {
                // (8) Mark failures on the root so they surface in Lemma.
                root.update({
                  level: "ERROR",
                  statusMessage: error instanceof Error ? error.message : String(error),
                });
                throw error;
              }
            },
          );
        });
      } finally {
        // (9) Flush in short-lived / serverless runtimes, after the root span
        // has ended: a span is exported only once it has ended.
        await lemmaProcessor.forceFlush();
      }
    }
    ```
  </Tab>

  <Tab title="Python">
    ```python theme={null}
    from langfuse import get_client

    langfuse = get_client()

    def handle_support_request(message: str, conversation_id: str, user_id: str) -> str:
        try:
            # (1) One agent execution = one trace. Open the root span.
            with langfuse.start_as_current_span(name="support-agent") as root:
                root.update(input=message)

                # (2) Trace-level context: agent name, thread, user.
                langfuse.update_current_trace(
                    name="support-agent",
                    session_id=conversation_id,
                    user_id=user_id,
                    metadata={"gen_ai.agent.name": "support-agent"},
                )

                try:
                    # (3) A span groups a multi-step sub-task (retrieval).
                    with langfuse.start_as_current_span(name="retrieve-context") as span:
                        span.update(input={"query": message})

                        # (4) A tool call nested inside the span.
                        with langfuse.start_as_current_observation(
                            name="search_docs", as_type="tool"
                        ) as tool:
                            docs = search_docs(message)
                            tool.update(input={"query": message}, output=docs)

                        span.update(output={"count": len(docs)})

                    # (5) Another tool call, directly under the root.
                    with langfuse.start_as_current_observation(
                        name="lookup_order", as_type="tool"
                    ) as tool:
                        order = lookup_order(user_id)
                        tool.update(input={"user_id": user_id}, output=order)

                    # (6) A generation: the LLM call, with model + token usage.
                    with langfuse.start_as_current_generation(name="answer", model="gpt-4o") as gen:
                        messages = build_prompt(message, docs, order)
                        r = call_model(messages)
                        gen.update(
                            input=messages,
                            output=r.text,
                            usage_details={"input": r.usage.input_tokens, "output": r.usage.output_tokens},
                        )
                        answer = r.text

                    # (7) Record the final output on the root.
                    root.update(output=answer)
                    return answer
                except Exception as error:
                    # (8) Mark failures on the root so they surface in Lemma.
                    root.update(level="ERROR", status_message=str(error))
                    raise
        finally:
            # (9) Flush in short-lived / serverless runtimes, after the root span
            # has ended: a span is exported only once it has ended.
            langfuse.flush()
    ```
  </Tab>
</Tabs>

### How each part maps to the contract

1. **Root span**: one execution, one trace. Everything nests inside this callback (TypeScript) or `with` block (Python). → [Traces](/tracing/instrumentation/traces)
2. **Trace context**: agent name, thread (`sessionId` in TypeScript, `session_id` in Python), and user, set once for the whole trace. → [Threads & context](/tracing/instrumentation/context)
3. **Span**: groups a sub-task so its work nests beneath it. → [Spans](/tracing/instrumentation/spans)
4. **Tool call (nested)**: a tool invoked as part of retrieval; `input` is the args, `output` is the result. → [Tool calls](/tracing/instrumentation/tool-calls)
5. **Tool call (top-level)**: a tool directly under the root.
6. **Generation**: the LLM call, carrying `model` and `usageDetails` (`usage_details` in Python) so Lemma can compute cost and tokens. → [Generations](/tracing/instrumentation/generations)
7. **Output**: the final answer recorded on the root.
8. **Errors**: `level: "ERROR"` and a status message set on the root, so a failed run is visible on the trace.
9. **Flush**: force a flush before a short-lived process exits. → [Setup](/tracing/instrumentation/setup#flushing)
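
Steps 4 and 5 repeat the same pattern, so a small decorator can remove the duplication. The sketch below is an illustration under stated assumptions: `traced_tool` is a hypothetical helper, and it relies only on the `start_as_current_observation(name=..., as_type="tool")` context manager and the `update(input=..., output=...)` call used in the Python example above. The client is passed in explicitly so the helper can be exercised with a stand-in.

```python
import functools
from typing import Any, Callable


def traced_tool(client: Any, name: str) -> Callable:
    """Wrap a function so each call is recorded as a tool observation.

    `client` is assumed to expose `start_as_current_observation`, as the
    Langfuse client does in the example above.
    """

    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)
        def wrapper(**kwargs: Any) -> Any:
            with client.start_as_current_observation(name=name, as_type="tool") as tool:
                # Record the arguments first so they survive if the tool raises.
                tool.update(input=kwargs)
                result = fn(**kwargs)
                tool.update(output=result)
                return result

        return wrapper

    return decorator
```

With the `langfuse` client from the example, `traced_tool(langfuse, "lookup_order")(lookup_order)` would stand in for the `with` block in step 5. The wrapper accepts keyword arguments only, so the recorded `input` always has named fields.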

## 4. Instrumenting an agent loop

Most agents loop: the model calls tools, you feed results back, repeat. The rule is unchanged: **the whole loop is one trace**. Open the root once, then create a generation per model turn and a tool call per tool invocation, all inside the root callback (TypeScript) or `with` block (Python).

<Tabs>
  <Tab title="TypeScript">
    ```typescript theme={null}
    await startActiveObservation("support-agent", async (root) => {
      root.update({ input: req.message });
      const messages: { role: string; content: string; name?: string }[] = [
        { role: "user", content: req.message },
      ];

      while (true) {
        const turn = await startActiveObservation(
          "model-turn",
          async (gen) => {
            const r = await callModel(messages);
            gen.update({ input: messages, output: r, model: "gpt-4o", usageDetails: r.usage });
            return r;
          },
          { asType: "generation" },
        );

        if (!turn.toolCalls?.length) {
          root.update({ output: turn.text });
          return turn.text;
        }

        for (const call of turn.toolCalls) {
          const result = await startActiveObservation(
            call.name,
            async (tool) => {
              const out = await runTool(call.name, call.args);
              tool.update({ input: call.args, output: out });
              return out;
            },
            { asType: "tool" },
          );
          messages.push({ role: "tool", name: call.name, content: JSON.stringify(result) });
        }
      }
    });
    ```
  </Tab>

  <Tab title="Python">
    ```python theme={null}
    with langfuse.start_as_current_span(name="support-agent") as root:
        root.update(input=message)
        messages = [{"role": "user", "content": message}]

        while True:
            with langfuse.start_as_current_generation(name="model-turn", model="gpt-4o") as gen:
                turn = call_model(messages)
                gen.update(input=messages, output=turn, usage_details=turn.usage)

            if not turn.tool_calls:
                root.update(output=turn.text)
                break

            for call in turn.tool_calls:
                with langfuse.start_as_current_observation(name=call.name, as_type="tool") as tool:
                    out = run_tool(call.name, call.args)
                    tool.update(input=call.args, output=out)
                messages.append({"role": "tool", "name": call.name, "content": str(out)})
    ```
  </Tab>
</Tabs>
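
To see what shape the loop produces without sending anything to Lemma, you can run it against a stand-in client. The sketch below is illustrative only: `RecordingClient`, `run_loop`, and `scripted_model` are hypothetical names, the stand-in mimics just the three context-manager methods used above, and the `max_turns` bound is an added assumption to keep a misbehaving model from looping forever.

```python
from contextlib import contextmanager
from types import SimpleNamespace


class RecordingClient:
    """Stand-in for the tracing client: records (depth, kind, name) per observation."""

    def __init__(self):
        self.events = []
        self._depth = 0

    @contextmanager
    def _observe(self, kind, name):
        self.events.append((self._depth, kind, name))
        self._depth += 1
        try:
            # The yielded object accepts update(...) calls and ignores them.
            yield SimpleNamespace(update=lambda **_: None)
        finally:
            self._depth -= 1

    def start_as_current_span(self, name):
        return self._observe("span", name)

    def start_as_current_generation(self, name, model=None):
        return self._observe("generation", name)

    def start_as_current_observation(self, name, as_type):
        return self._observe(as_type, name)


def run_loop(client, message, call_model, run_tool, max_turns=5):
    """The agent loop from above, with the client and helpers injected."""
    with client.start_as_current_span(name="support-agent") as root:
        root.update(input=message)
        messages = [{"role": "user", "content": message}]
        for _ in range(max_turns):
            with client.start_as_current_generation(name="model-turn", model="gpt-4o") as gen:
                turn = call_model(messages)
                gen.update(input=messages, output=turn)
            if not turn.tool_calls:
                root.update(output=turn.text)
                return turn.text
            for call in turn.tool_calls:
                with client.start_as_current_observation(name=call.name, as_type="tool") as tool:
                    out = run_tool(call.name, call.args)
                    tool.update(input=call.args, output=out)
                messages.append({"role": "tool", "name": call.name, "content": str(out)})
        raise RuntimeError("agent exceeded max_turns")


def scripted_model():
    """Return a fake call_model: the first turn asks for a tool, the second answers."""
    turns = iter([
        SimpleNamespace(
            text=None,
            tool_calls=[SimpleNamespace(name="lookup_order", args={"user_id": "u1"})],
        ),
        SimpleNamespace(text="Your order shipped.", tool_calls=[]),
    ])
    return lambda messages: next(turns)
```

Calling `run_loop(client, "Where is my order?", scripted_model(), lambda name, args: {"status": "shipped"})` with a fresh `RecordingClient` records one depth-0 span followed by a generation, a tool call, and a second generation, all at depth 1. This is the single-trace shape described above.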

<Note>
  Already using a framework like the Vercel AI SDK, OpenAI Agents, or LangGraph? It can emit these spans for you — see [Frameworks](/frameworks/vercel-ai-sdk). You still wrap the run in one root span so every turn nests into a single trace.
</Note>

<Expandable title="Every span appears as a separate trace">
  If LLM or tool calls show up as their own separate traces, the work ran outside the root's active context — usually a lost async context across a queue, worker, stream, or `setTimeout`. The default fix is to keep all work inside the root callback so child spans nest automatically.

  **Keep the root open until children finish.** The root must encompass all of its children. Do not let the root end — and do not flush — until every child span has completed, or those children can be orphaned into separate traces.

  **If context cannot propagate automatically** — across a queue, worker, or separate service — carry the IDs manually. Capture `getActiveTraceId()` and `getActiveSpanId()` on the parent, and attach the child with `parentSpanContext`:

  ```typescript theme={null}
  import { getActiveSpanId, getActiveTraceId, startObservation } from "@langfuse/tracing";

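  // On the parent, while the root is active: capture the IDs and send them with the job.
  const traceId = getActiveTraceId();
  const parentSpanId = getActiveSpanId();
  if (!traceId || !parentSpanId) throw new Error("No active span to attach to");

  // In the worker: start the child under the captured parent.
  const child = startObservation(
    "search_docs",
    { input: { query } },
    { asType: "tool", parentSpanContext: { traceId, spanId: parentSpanId, traceFlags: 1 } },
  );
  child.update({ output: await searchDocs(query) }).end();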
  ```

  See [Troubleshooting](/tracing/troubleshooting/common-problems#every-span-appears-as-a-separate-trace) and Langfuse's [trace and observation IDs](https://langfuse.com/docs/observability/sdk/instrumentation#trace-ids).
</Expandable>
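
The lost-context failure described above is easier to recognize once you have reproduced it. OpenTelemetry's Python SDK tracks the active span with `contextvars`, and a worker thread typically does not inherit the caller's context. The sketch below uses a plain `ContextVar` as a stand-in for the active span. It is an illustration only, not Langfuse or OpenTelemetry code, and `child_parent` and `run_in_worker` are hypothetical names.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional

# Stand-in for the tracer's "current span" slot.
active_span: contextvars.ContextVar = contextvars.ContextVar("active_span", default=None)


def child_parent() -> Optional[str]:
    """Return what a child span started here would see as its parent."""
    return active_span.get()


def run_in_worker(fn: Callable, propagate: bool):
    """Run fn on a worker thread, optionally carrying the caller's context."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        if propagate:
            # Snapshot the context on the caller's thread, then run fn inside it.
            ctx = contextvars.copy_context()
            return pool.submit(ctx.run, fn).result()
        return pool.submit(fn).result()
```

After `active_span.set("support-agent")`, `run_in_worker(child_parent, propagate=True)` sees the parent. With `propagate=False` the worker typically sees no parent, which is the condition that turns a child span into its own trace.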

## 5. Verify in Lemma

Open the [Lemma dashboard](https://platform.uselemma.ai) → **Traces**. Confirm:

* **One trace per run** — the whole execution is one trace, not separate traces per call.
* **Root has input and output** — the user message and the final answer.
* **Generations are nested** — each LLM call shows model and token usage.
* **Tools are nested** — each tool call shows arguments and result.
* **Context is set** — agent name, thread, and user appear on the trace.
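
The same checklist can be written as assertions, which is handy in a test suite. The sketch below assumes a hypothetical in-memory representation of one run: a list of dicts with `id`, `parent_id`, `type`, `input`, `output`, and a few context keys. These field names are illustrative and are not Lemma's API or the trace contract's attribute names.

```python
CONTEXT_KEYS = ("agent_name", "session_id", "user_id")  # hypothetical field names


def check_trace(observations: list) -> list:
    """Return a list of problems; an empty list means the shape looks right."""
    problems = []
    ids = {o["id"] for o in observations}

    # One trace per run: exactly one observation has no parent.
    roots = [o for o in observations if o.get("parent_id") is None]
    if len(roots) != 1:
        problems.append(f"expected exactly one root, found {len(roots)}")

    for root in roots:
        # Root has input and output.
        if root.get("input") is None or root.get("output") is None:
            problems.append(f"root {root['id']} is missing input or output")
        # Context is set.
        missing = [key for key in CONTEXT_KEYS if not root.get(key)]
        if missing:
            problems.append(f"root {root['id']} is missing context: {', '.join(missing)}")

    for o in observations:
        parent = o.get("parent_id")
        if parent is not None and parent not in ids:
            problems.append(f"{o['id']} points at unknown parent {parent}")
        # Generations show model and token usage.
        if o.get("type") == "generation" and not (o.get("model") and o.get("usage")):
            problems.append(f"generation {o['id']} is missing model or token usage")
        # Tools show arguments and result.
        if o.get("type") == "tool" and ("input" not in o or "output" not in o):
            problems.append(f"tool {o['id']} is missing arguments or result")
    return problems
```

An empty result corresponds to every item in the checklist passing. A tool call recorded with no parent, for example, is reported as a second root.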

## Go deeper

<CardGroup cols={2}>
  <Card title="Traces" icon="git-branch" href="/tracing/instrumentation/traces">
    The root span, errors, and context propagation.
  </Card>

  <Card title="Generations" icon="sparkles" href="/tracing/instrumentation/generations">
    LLM calls with model and token usage.
  </Card>

  <Card title="Tool calls" icon="wrench" href="/tracing/instrumentation/tool-calls">
    Tool arguments, results, and a reusable wrapper.
  </Card>

  <Card title="Trace contract" icon="file-check" href="/reference/trace-contract">
    The exact shape Lemma reads.
  </Card>
</CardGroup>
