What is Tracing?
Tracing in Lemma provides complete visibility into your AI agent’s execution. Using OpenTelemetry, the industry-standard observability framework, Lemma captures:
- Inputs and outputs — What goes into your agent and what comes out
- Execution flow — The sequence of operations (LLM calls, tool invocations, database queries)
- Timing and performance — How long each operation takes
- Token usage — Prompt and completion tokens for model calls
- Errors and exceptions — Where and why failures occur
Core Concepts
wrapAgent
wrapAgent is the primary interface for tracing in Lemma. It wraps your agent function and creates a top-level OpenTelemetry span that captures the full execution — inputs, outputs, timing, and any nested operations (LLM calls, tool invocations, etc.) that happen inside it.
Any OpenTelemetry-instrumented code that runs within the wrapped function automatically becomes a child span of the agent trace. This means frameworks like the Vercel AI SDK, which have built-in telemetry support, will have their spans nested under your agent’s trace with no additional setup.
Provider SDKs used directly (the raw OpenAI and Anthropic clients, for example) do not emit these child spans by default. Add OpenInference instrumentation via Provider instrumentation, or create spans by hand.
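Conceptually, the pattern behind wrapAgent is a higher-order function that records inputs, outputs, timing, and errors around a single agent call. The sketch below illustrates that pattern only; it is not Lemma’s implementation, and every name in it (wrapAgentSketch, AgentSpan, recordedSpans) is made up for illustration:

```typescript
// Conceptual sketch of the wrapAgent pattern: a higher-order function
// that records inputs, outputs, timing, and errors around an agent call.
// This is not Lemma's implementation; all names here are illustrative.

interface AgentSpan {
  name: string;
  input: unknown;
  output?: unknown;
  error?: string;
  durationMs: number;
}

const recordedSpans: AgentSpan[] = [];

function wrapAgentSketch<I, O>(
  name: string,
  agent: (input: I) => Promise<O>,
): (input: I) => Promise<O> {
  return async (input: I) => {
    const start = Date.now();
    const span: AgentSpan = { name, input, durationMs: 0 };
    try {
      const output = await agent(input);
      span.output = output;
      return output;
    } catch (err) {
      span.error = String(err);
      throw err;
    } finally {
      // The span is recorded whether the agent succeeds or throws.
      span.durationMs = Date.now() - start;
      recordedSpans.push(span);
    }
  };
}

// Usage: wrap once, then call like the original function; each call
// records one span with its input, output, and duration.
const supportAgent = wrapAgentSketch(
  "support-agent",
  async (q: { question: string }) => ({ answer: `echo: ${q.question}` }),
);
```

The real wrapAgent additionally opens an OpenTelemetry span so that instrumented frameworks can attach their own child spans; see the wrapAgent reference for the actual API.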
Traces
A trace represents a single execution of your agent from start to finish. Each trace has:
- Inputs — The initial state passed to wrapAgent
- Outputs — The final result recorded via onComplete
- Spans — Nested operations (LLM calls, tool invocations)
- Timing — Duration and timing of each operation
- Metadata — Model names, token counts, error states
Spans
Spans are the building blocks of a trace. Each span represents a single operation:
- LLM generation call
- Tool or function invocation
- Database query
- Custom operation
The top-level span is created by wrapAgent; child spans are created automatically by instrumented frameworks or manually via the OpenTelemetry API.
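The parent/child nesting described above can be pictured with a small mock. Real instrumentation uses OpenTelemetry’s context propagation rather than a module-level variable; this sketch only shows the resulting shape, and none of its names come from Lemma:

```typescript
// Minimal illustration of span nesting: any span started while a parent
// is active becomes that parent's child. Real code would use the
// OpenTelemetry context API; this mock only shows the resulting tree.

interface Span {
  name: string;
  children: Span[];
}

const rootSpans: Span[] = [];
let activeSpan: Span | undefined;

function startSpan<T>(name: string, fn: () => T): T {
  const span: Span = { name, children: [] };
  if (activeSpan) activeSpan.children.push(span);
  else rootSpans.push(span);
  const previous = activeSpan;
  activeSpan = span; // anything started inside fn nests under this span
  try {
    return fn();
  } finally {
    activeSpan = previous;
  }
}

// An agent run with two child operations:
startSpan("agent-run", () => {
  startSpan("llm-call", () => "model call here");
  startSpan("db-query", () => "query here");
});
```

In the resulting tree, "llm-call" and "db-query" are children of "agent-run", which is exactly how framework spans appear under the span wrapAgent creates.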
Run ID
The run ID is Lemma’s unique identifier for a trace. It’s returned by wrapAgent and used to:
- Link metric events to specific traces
- Associate experiment results with test cases
- Query and filter traces in the dashboard
Thread ID
The optional lemma.thread_id links separate runs (multi-turn chats, retries) in the dashboard. Pass it via the second argument to the function returned by wrapAgent / wrap_agent (threadId / thread_id). See Wrapping your agent.
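As a rough sketch of what thread linking enables: runs that share a thread ID group into one conversation, while runs without one stand alone. The RunRecord shape below is an assumption made for illustration, not Lemma’s actual schema:

```typescript
// Hypothetical sketch of thread linking: runs sharing a thread ID group
// into one conversation. RunRecord is an assumed shape, not Lemma's schema.

interface RunRecord {
  runId: string;     // unique per trace, returned by wrapAgent
  threadId?: string; // optional, shared across turns of one conversation
  input: string;
}

function groupByThread(runs: RunRecord[]): Map<string, RunRecord[]> {
  const threads = new Map<string, RunRecord[]>();
  for (const run of runs) {
    // Runs without a thread ID fall back to their own run ID,
    // so each one forms a standalone group.
    const key = run.threadId ?? run.runId;
    const group = threads.get(key) ?? [];
    group.push(run);
    threads.set(key, group);
  }
  return threads;
}
```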
What Gets Traced
Once tracing is set up, Lemma automatically captures:
- Top-level agent span — Created by wrapAgent, contains inputs and outputs
- Framework-specific spans — Automatically captured by supported integrations
- Custom spans — Any spans you create manually with OpenTelemetry
Supported Integrations
- Vercel AI SDK — Model calls, tool invocations, streaming events
- OpenAI Agents SDK — Agent runs, tool calls, handoffs, guardrails
- Langfuse — LLM calls, generation spans
- Arize Phoenix — Agent runs, LLM calls, tool invocations
- Azure Application Insights — Agent runs, LLM calls, distributed traces
- Claude Agent SDK — Coming soon
- LangGraph — Coming soon
- Custom instrumentation — Run/step/tool-call lifecycle docs
- Provider instrumentation — OpenInference for OpenAI/Anthropic SDKs, streaming, dual export
Next Steps
- Use wrapAgent — See the full API and usage patterns in wrapAgent
- Use custom instrumentation — Follow Custom Instrumentation for language-specific setup and lifecycle docs
- Choose your integration — Vercel AI SDK, OpenAI Agents SDK, and more in the Integrations tab
- Record metric events — Learn how to capture feedback in Recording Metric Events
- Run experiments — Learn about Experiments and Running Experiments

