Lemma is an observability platform for AI agents. Wrap your agent function, and every execution becomes a trace — a structured tree of timing, inputs, outputs, LLM calls, and tool invocations you can search, filter, and debug in the Lemma dashboard.

Get started

Quickstart

Send your first trace in under 2 minutes.

How it works

You add two things to your code:
  1. registerOTel() — sets up the tracer provider at startup
  2. agent() — wraps your agent function to create the root span
Everything inside the wrapped function — LLM calls, tool executions, retrieval steps — automatically nests as child spans under that root. The wrapper captures your return value as the run output and closes the span when the function returns.
import { registerOTel, agent } from "@uselemma/tracing";
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

// Set up the tracer provider once at startup.
registerOTel();

// Wrap the agent function; each call becomes a trace with a root span.
const myAgent = agent("my-agent", async (input: string) => {
  const result = await generateText({ model: openai("gpt-4o"), prompt: input });
  return result.text; // captured as the run output
});

const { result, runId } = await myAgent("What is the capital of France?");
For streaming agents, pass { streaming: true } and call ctx.complete() when the stream finishes. For Python, there’s also a context manager form that instruments existing code without extracting a function.
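A minimal sketch of the streaming shape, assuming the wrapped function receives a context object as a second argument and that the streaming flag is passed as an options argument to agent(). Both are assumptions to confirm against the streaming guide; streamText is the Vercel AI SDK's streaming counterpart to generateText.

import { registerOTel, agent } from "@uselemma/tracing";
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

registerOTel();

// Assumed shapes: the ctx parameter and the trailing options argument are
// illustrative, not the confirmed @uselemma/tracing signature.
const streamingAgent = agent(
  "streaming-agent",
  async (input: string, ctx: { complete: () => void }) => {
    const result = streamText({ model: openai("gpt-4o"), prompt: input });

    // Forward chunks to the caller as they arrive.
    for await (const chunk of result.textStream) {
      process.stdout.write(chunk);
    }

    // Signal that the stream has finished so the root span can close.
    ctx.complete();
  },
  { streaming: true }
);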

What you can do with Lemma

Trace agents — See the full execution tree for every run: which LLM calls were made, what tools were invoked, how long each step took, and where errors occurred.

Debug failures — Click into any run to inspect inputs, outputs, and errors at every level of the span tree. Filter by user, session, environment, or custom attributes.

Monitor production — Set up monitors on metrics like latency, error rate, or tool call success rate. When something breaks, Lemma creates an incident with root cause analysis and notifies you via webhook.

Connect your IDE — Use the Lemma MCP server to query traces directly from Cursor, Claude Desktop, or Claude Code without switching to the dashboard.
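Filtering by user, session, or custom attributes relies on attributes attached to the run. A minimal sketch using the raw OpenTelemetry API from inside the wrapped agent function; the attribute keys are placeholders, and Lemma may also expose typed helpers for this.

import { trace } from "@opentelemetry/api";

// Inside the wrapped agent function the root span is active, so custom
// attributes attached here become filterable on the run.
trace.getActiveSpan()?.setAttribute("user.id", "user_123");
trace.getActiveSpan()?.setAttribute("session.id", "sess_456");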

Choose your path

My framework has built-in OTel support

Start here — child spans emit automatically with no extra code.

I want per-call LLM visibility

Start here — get child spans with prompts, completions, and token counts for every LLM call.

I need full control over every span

Start here — use typed helpers or the raw OpenTelemetry API to instrument anything.
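As a rough sketch of the raw OpenTelemetry route: the tool function below is hypothetical, its span and attribute names are arbitrary, and Lemma's typed helpers are not shown.

import { trace, SpanStatusCode } from "@opentelemetry/api";

const tracer = trace.getTracer("my-agent");

// Hypothetical tool: wraps one step in a child span. Called from inside the
// agent() wrapper, it nests under the run's root span automatically.
async function searchDocs(query: string): Promise<string[]> {
  return tracer.startActiveSpan("search-docs", async (span) => {
    try {
      span.setAttribute("query", query);
      const hits = ["doc-1", "doc-2"]; // stand-in for a real retrieval call
      span.setAttribute("hit.count", hits.length);
      return hits;
    } catch (err) {
      span.setStatus({ code: SpanStatusCode.ERROR, message: String(err) });
      throw err;
    } finally {
      span.end();
    }
  });
}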

Just want to copy-paste and go?

Browse Recipes — complete working examples for common patterns.