Claude Agent SDK

The Claude Agent SDK runs Claude as a tool-using agent. Use its Langfuse integration to capture model and tool activity, point Langfuse at Lemma, and wrap each run in one root span so the whole execution is a single nested trace.

One agent execution = one trace. Wrap the run in a single root span so every model and tool call nests under it. See the trace contract.

Claude Agent SDK traces render fully in Lemma today. Automated issue detection is being expanded to this shape — see Good trace vs bad trace for current status.

Recipe

Install

npm install @anthropic-ai/claude-agent-sdk @langfuse/tracing @langfuse/otel @opentelemetry/sdk-trace-node @opentelemetry/exporter-trace-otlp-proto

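// instrumentation.ts: import this first, before your app code
import { LangfuseSpanProcessor } from "@langfuse/otel";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-proto";
import { NodeTracerProvider } from "@opentelemetry/sdk-trace-node";

// Fail at startup if a variable is missing, rather than exporting with an
// undefined URL or a "Bearer undefined" header (header values must be strings).
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) throw new Error(`Missing environment variable: ${name}`);
  return value;
}

export const lemmaProcessor = new LangfuseSpanProcessor({
  exporter: new OTLPTraceExporter({
    url: requireEnv("LEMMA_BASE_URL"),
    headers: {
      Authorization: `Bearer ${requireEnv("LEMMA_API_KEY")}`,
      "X-Lemma-Project-ID": requireEnv("LEMMA_PROJECT_ID"),
    },
  }),
});

// Register the provider globally so spans created later reach the processor.
new NodeTracerProvider({ spanProcessors: [lemmaProcessor] }).register();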

Set the environment variables. Lemma-only export needs no LANGFUSE_* credentials.

export LEMMA_BASE_URL="https://api.uselemma.ai/otel/v1/traces"
export LEMMA_API_KEY="lma_..."
export LEMMA_PROJECT_ID="proj_..."

Enable the Langfuse integration

Follow the Langfuse Claude Agent SDK guide to enable span emission for the SDK. The exporter you registered above ships those spans to Lemma.

Wrap the whole run in one root span

Wrap the agent loop in a single root span so each model turn and tool call nests under one trace. Record the input and final output on the root, set a stable agent name, and type child observations as generation and tool.

import { query } from "@anthropic-ai/claude-agent-sdk";
import { propagateAttributes, startActiveObservation } from "@langfuse/tracing";

export async function runSupportAgent(userMessage: string, threadId: string) {
  return await startActiveObservation("support-agent", async (root) => {
    root.update({ input: userMessage });

    return await propagateAttributes(
      {
        traceName: "support-agent",
        sessionId: threadId,
        metadata: { "gen_ai.agent.name": "support-agent" },
      },
      async () => {
        let finalText = "";

        for await (const message of query({ prompt: userMessage })) {
          if (message.type === "assistant") {
            await startActiveObservation(
              "claude",
              async (gen) => {
                gen.update({
                  output: message.message.content,
                  model: message.message.model,
                  usageDetails: {
                    input: message.message.usage.input_tokens,
                    output: message.message.usage.output_tokens,
                  },
                });
              },
              { asType: "generation" },
            );
          }

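          // Only the success variant of the result message carries `result`.
          if (message.type === "result" && message.subtype === "success") {
            finalText = message.result;
          }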
        }

        root.update({ output: finalText });
        return finalText;
      },
    );
  });
}

Record each tool invocation as a child tool observation the same way, so the trace nests cleanly:

support-agent              ← trace root (input, output)
├─ claude                  ← generation (model, tokens)
├─ search_docs             ← tool call (args, result)
└─ claude                  ← generation (final answer)
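
To record the tool step in the tree above, one option is to pair each `tool_use` block with the `tool_result` block that answers it. The sketch below does only that pairing. It rests on several assumptions: the block types are simplified local copies of the message content shapes, tool results are assumed to arrive as `tool_result` blocks, and `pairToolCalls` is a hypothetical helper that belongs to neither SDK.

```typescript
// Simplified local copies of the content blocks carried by agent messages.
// Assumption: the real SDK types have more fields; only these are used here.
type ToolUseBlock = { type: "tool_use"; id: string; name: string; input: unknown };
type ToolResultBlock = { type: "tool_result"; tool_use_id: string; content: unknown };
type TextBlock = { type: "text"; text: string };
type Block = ToolUseBlock | ToolResultBlock | TextBlock;

// One completed tool invocation, ready to be recorded as a tool observation.
type ToolCall = { name: string; args: unknown; result: unknown };

// Hypothetical helper: walks content blocks in arrival order and matches each
// tool_result to the earlier tool_use with the same id. Results without a
// matching tool_use are skipped.
function pairToolCalls(blocks: Block[]): ToolCall[] {
  const pending = new Map<string, ToolUseBlock>();
  const calls: ToolCall[] = [];

  for (const block of blocks) {
    if (block.type === "tool_use") {
      pending.set(block.id, block);
    } else if (block.type === "tool_result") {
      const use = pending.get(block.tool_use_id);
      if (use) {
        calls.push({ name: use.name, args: use.input, result: block.content });
        pending.delete(block.tool_use_id);
      }
    }
  }
  return calls;
}
```

Feed it the content blocks collected during one run. For each returned call, open a child observation inside the root callback with `{ asType: "tool" }`, using the tool name as the observation name, `args` as its input, and `result` as its output.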

Flush before the process exits

In serverless or other short-lived runtimes, flush so the whole trace ships in one batch.

import { lemmaProcessor } from "./instrumentation";

// at the end of a request / serverless handler
await lemmaProcessor.forceFlush();
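
Flushing only on the success path can lose the trace when the handler throws. The sketch below shows one way to flush on both paths. It assumes nothing beyond an object with a `forceFlush()` method, which `lemmaProcessor` provides; `withFlush` and `Flushable` are hypothetical names.

```typescript
// Anything with forceFlush(), such as the lemmaProcessor exported earlier.
type Flushable = { forceFlush(): Promise<void> };

// Hypothetical wrapper: runs the handler, then flushes whether the handler
// resolved or threw, so failed runs still ship their trace.
async function withFlush<T>(
  processor: Flushable,
  handler: () => Promise<T>,
): Promise<T> {
  try {
    return await handler();
  } finally {
    await processor.forceFlush();
  }
}
```

A handler would then call `await withFlush(lemmaProcessor, () => runSupportAgent(message, threadId))`. Note that if `forceFlush()` itself rejects, that rejection is what the caller sees.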

If model or tool calls show up as their own separate traces, the work ran outside the root’s active context. Keep the agent loop inside the startActiveObservation callback. See Troubleshooting.

Verify in Lemma

Open the Lemma dashboard → Traces and confirm:

One trace per run — a full agent run is one trace, not one per turn.
Root has input and output — the root span shows the user message and the final response.
Generations are nested — each model turn appears as a child generation with model and token usage.
Tools are nested — each tool invocation appears as a child tool span with arguments and result.
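
The checks above can also be written as a small function over exported span data, which may be useful in an integration test. This is a sketch only: the `SpanRecord` shape and `checkTraceShape` are hypothetical and do not correspond to any Lemma or Langfuse API, so adapt the field names to whatever your span export contains.

```typescript
// Hypothetical flattened span record; the field names are assumptions.
type SpanRecord = {
  id: string;
  parentId: string | null;
  kind: "span" | "generation" | "tool";
  input?: unknown;
  output?: unknown;
};

// Returns a list of problems. An empty list means the run looks like one
// trace with a single root and every other span attached to a known parent.
function checkTraceShape(spans: SpanRecord[]): string[] {
  const problems: string[] = [];
  const roots = spans.filter((s) => s.parentId === null);

  // More than one root means model or tool calls escaped the root context.
  if (roots.length !== 1) {
    problems.push(`expected 1 root span, found ${roots.length}`);
    return problems;
  }

  const root = roots[0];
  if (root.input === undefined) problems.push("root is missing input");
  if (root.output === undefined) problems.push("root is missing output");

  const ids = new Set(spans.map((s) => s.id));
  for (const span of spans) {
    if (span.parentId !== null && !ids.has(span.parentId)) {
      problems.push(`${span.kind} ${span.id} has an unknown parent`);
    }
  }
  return problems;
}
```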

Next steps

Trace contract

The exact shape Lemma reads.

Setup

Wire the Langfuse → Lemma exporter.

Threads and sessions

Group multi-turn conversations with a thread id.

Good vs bad traces

What issue detection looks for, per shape.
