
What tracing gives you

Every agent execution becomes a trace — a complete record of what happened, how long it took, and where it went wrong. Traces contain spans organized as a tree: the agent() call is the root, and every LLM call, tool invocation, or retrieval step is a child span nested under it.
agent()
└─ ai.agent.run — run ID · input · output · timing
   ├─ gen_ai.chat — prompt · completion · tokens
   ├─ tool.lookup-order — input · output
   └─ gen_ai.chat — second LLM call
The gen_ai.chat spans come from provider instrumentation (OpenInference) or framework-native telemetry. The tool.* spans come from Lemma’s tool() helper or your own tracer.startActiveSpan calls.

Setup guides

Using a supported framework

You use a framework with built-in OTel support. Spans emit automatically — minimal wiring required. This is where most users should start.
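For instance, with the Vercel AI SDK (one framework with built-in OTel support), turning on telemetry is a single option on the call; the model, prompt, and option names below are an illustrative sketch, not Lemma-specific:

```typescript
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

// Enabling experimental_telemetry makes the SDK emit gen_ai spans for this call.
const { text } = await generateText({
  model: openai("gpt-4o-mini"),
  prompt: "Summarize this order history.",
  experimental_telemetry: { isEnabled: true },
});
```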

Adding provider instrumentation

You call OpenAI, Anthropic, LiteLLM, or another provider SDK directly (not through a higher-level framework). Add OpenInference to get per-call child spans with prompts, completions, and token counts.
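As a sketch, registering OpenInference instrumentation for the OpenAI SDK might look like the following; package and class names are from the OpenInference project, and exact versions or options may differ:

```typescript
import { registerInstrumentations } from "@opentelemetry/instrumentation";
import { OpenAIInstrumentation } from "@arizeai/openinference-instrumentation-openai";

// Register before the OpenAI SDK is imported elsewhere so its calls are patched;
// each chat completion then emits a child span with prompt, completion, and token counts.
registerInstrumentations({
  instrumentations: [new OpenAIInstrumentation()],
});
```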

Manual instrumentation

You have a custom agent framework or need precise control over every span. Use Lemma’s typed span helpers (tool(), llm(), retrieval()) or the raw OpenTelemetry API.

How agent() works with any setup

Regardless of integration depth, agent() creates the root span that everything else nests under.
The wrapper captures the return value as ai.agent.output and closes the span automatically.
// Non-streaming — just return
const myAgent = agent("my-agent", async (input: string) => {
  const result = await doWork(input);
  return result; // span closes here automatically
});

// Streaming — opt into manual lifecycle
const streamingAgent = agent("my-agent", async (input: string, ctx) => {
  const result = await streamText({
    ...,
    onFinish({ text }) { ctx.complete(text); },
  });
  return result.toDataStreamResponse();
}, { streaming: true });
Marking a run as failed: call ctx.fail(error) to record the exception on the span. The wrapper closes the span automatically on exit.
const myAgent = agent("my-agent", async (input, ctx) => {
  try {
    return await doWork(input);
  } catch (error) {
    ctx.fail(error); // records exception, marks span errored
    throw error;
  }
});
The runId returned by the wrapper is Lemma’s stable identifier for the trace. Use it to look up runs in the dashboard or attach metric events.

Next Steps

  • Troubleshooting — spans not appearing, nesting issues, export timing
  • Integrations — framework-specific setup details
  • Concepts — runs, spans, thread IDs, and how they relate
  • Streaming — ctx.complete() lifecycle for streaming agents
  • Multi-turn threads — link related runs into a conversation thread
  • Custom attributes — attach user ID, session, and environment metadata
  • Dual export — send traces to Lemma alongside an existing OTel destination
  • MCP server — query and inspect traces from your IDE