
# Trace contract

> The exact shape and fields Lemma reads from a trace, and how to satisfy it with Langfuse

Lemma is an opinionated sink, **not a generic OpenTelemetry backend**. It does not just store whatever spans you send — it reads a specific **trace shape** to power input/output display, model and token stats, tool visibility, threads, and automated issue detection. Spans that do not match this contract still arrive, but render as broken or empty and are skipped by issue detection. This page is the canonical contract; every other page builds on it.

The rule that everything else follows:

<Note>
  **One agent execution = one trace.** The trace has a single root span. LLM calls, tool calls, retrieval, and app logic are **child spans** of that root, not separate traces.
</Note>

## The product contract

Think in four nouns. This is the vocabulary used across the docs and the Lemma dashboard.

| Concept        | What it is                                                        | Lemma primitive                   |
| -------------- | ----------------------------------------------------------------- | --------------------------------- |
| **Trace**      | One end-to-end agent execution, from user input to final response | Root span                         |
| **Span**       | A unit of work inside the trace (retrieval, ranking, app logic)   | Child span                        |
| **Generation** | A single LLM call (prompt, completion, model, tokens)             | Child span, typed as a generation |
| **Tool call**  | A single tool invocation (name, arguments, result)                | Child span, typed as a tool       |

A useful trace has:

* A **root span** with the user **input** and the final **output** (or error).
* A stable **agent name** so traces are groupable by workflow.
* **Generation** spans carrying model and token usage.
* **Tool** spans carrying arguments and results.
* A **thread id** when the execution is part of a multi-turn conversation.
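
As a mental model, the four nouns and the list above can be sketched as TypeScript types. This is illustrative only: `TraceSketch`, `ChildSpan`, and `totalTokens` are hypothetical names, not a Lemma or Langfuse API, and you never construct these objects yourself.

```ts
// Hypothetical types mirroring the vocabulary above. Not a real Lemma or Langfuse API.
type GenerationSpan = {
  kind: "generation";
  name: string;
  model: string;
  usage: { input: number; output: number }; // token counts
};

type ToolSpan = {
  kind: "tool";
  name: string; // the tool name
  args: unknown;
  result: unknown;
};

type PlainSpan = { kind: "span"; name: string }; // retrieval, ranking, app logic

type ChildSpan = GenerationSpan | ToolSpan | PlainSpan;

type TraceSketch = {
  agentName: string; // stable, so traces group by workflow
  threadId?: string; // only for multi-turn conversations
  input: unknown; // user input on the root span
  output?: unknown; // final response on the root span...
  error?: string; // ...or the error, if the run failed
  children: ChildSpan[];
};

// Sum input and output tokens across all generation spans in one trace.
function totalTokens(trace: TraceSketch): number {
  let total = 0;
  for (const child of trace.children) {
    if (child.kind === "generation") {
      total += child.usage.input + child.usage.output;
    }
  }
  return total;
}

const exampleTrace: TraceSketch = {
  agentName: "support-agent",
  input: "How do I reset my password?",
  output: "Use the reset link on the sign-in page.",
  children: [
    { kind: "tool", name: "search_docs", args: { query: "reset password" }, result: ["doc-1"] },
    { kind: "generation", name: "draft-reply", model: "gpt-4o", usage: { input: 120, output: 30 } },
  ],
};

console.log(totalTokens(exampleTrace));
```

Token and model statistics depend on generation spans carrying `model` and `usage`, which is why the sketch makes them mandatory on `GenerationSpan`.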

## How you satisfy it with Langfuse

Lemma standardizes on [Langfuse](https://langfuse.com) as the instrumentation library. You write normal Langfuse code — Lemma reads the result. You never touch raw OpenTelemetry attributes.

```ts
import { propagateAttributes, startActiveObservation } from "@langfuse/tracing";

// Assumed to exist in your app: userMessage, threadId, query, callModel(), searchDocs().

await startActiveObservation("support-agent", async (root) => {
  root.update({ input: userMessage });

  await propagateAttributes(
    {
      traceName: "support-agent",
      sessionId: threadId,                       // groups multi-turn conversations
      metadata: { "gen_ai.agent.name": "support-agent" },
    },
    async () => {
      // generation: a single LLM call
      const reply = await startActiveObservation(
        "draft-reply",
        async (gen) => {
          const r = await callModel(userMessage);
          gen.update({
            input: r.messages,
            output: r.text,
            model: "gpt-4o",
            usageDetails: { input: r.usage.inputTokens, output: r.usage.outputTokens },
          });
          return r;
        },
        { asType: "generation" },
      );

      // tool: a single tool invocation
      const docs = await startActiveObservation(
        "search_docs",
        async (tool) => {
          const result = await searchDocs(query);
          tool.update({ input: { query }, output: result });
          return result;
        },
        { asType: "tool" },
      );

      root.update({ output: reply.text });
    },
  );
});
```

Each Langfuse field maps to a part of the contract:

| Contract field      | Langfuse field                                     | Set on                |
| ------------------- | -------------------------------------------------- | --------------------- |
| Trace input         | `.update({ input })`                               | Root span             |
| Trace output        | `.update({ output })`                              | Root span             |
| Agent name          | `traceName` + `metadata["gen_ai.agent.name"]`      | `propagateAttributes` |
| Thread id           | `sessionId` (and/or `metadata["lemma.thread_id"]`) | `propagateAttributes` |
| User id             | `userId`                                           | `propagateAttributes` |
| LLM model           | `.update({ model })`                               | Generation span       |
| Token usage         | `.update({ usageDetails })`                        | Generation span       |
| Prompt / completion | `.update({ input, output })`                       | Generation span       |
| Tool name           | observation `name`                                 | Tool span             |
| Tool args / result  | `.update({ input, output })`                       | Tool span             |
| Error               | `.update({ level: "ERROR", statusMessage })`       | Any span              |

<Note>
  You do not set OpenTelemetry attribute keys by hand. Use the Langfuse fields above; Lemma reads the exported observation. See [Setup](/tracing/instrumentation/setup) to wire the exporter, then [Traces](/tracing/instrumentation/traces) for the full pattern.
</Note>
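
The error row in the table can be made concrete with a small helper. This is a minimal sketch, assuming you want every thrown value recorded as `{ level: "ERROR", statusMessage }` before it propagates. `toErrorUpdate` and `withErrorCapture` are hypothetical names, not part of Langfuse or Lemma, and `span` stands for any object with an `update` method that accepts those two fields.

```ts
// The two error fields from the table above.
type ErrorUpdate = { level: "ERROR"; statusMessage: string };

// Hypothetical helper: turn any thrown value into the update payload.
function toErrorUpdate(err: unknown): ErrorUpdate {
  const statusMessage = err instanceof Error ? err.message : String(err);
  return { level: "ERROR", statusMessage };
}

// Hypothetical wrapper: run `work`; if it throws, record the error on the
// span and rethrow so the caller still sees the failure.
async function withErrorCapture<T>(
  span: { update: (fields: ErrorUpdate) => unknown },
  work: () => Promise<T>,
): Promise<T> {
  try {
    return await work();
  } catch (err) {
    span.update(toErrorUpdate(err));
    throw err;
  }
}
```

Inside a tool observation, a call might then look like `await withErrorCapture(tool, () => searchDocs(query))`. Rethrowing is a deliberate choice in this sketch: it lets the root span record its own error as well.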

## Required vs optional

| Field                          | Required?    | Without it                                     |
| ------------------------------ | ------------ | ---------------------------------------------- |
| Single root span per execution | **Required** | Each call becomes its own trace; no agent view |
| Root input                     | **Required** | Traces show timing only                        |
| Root output or error           | **Required** | You cannot tell success from failure           |
| Agent name                     | Recommended  | Traces are hard to group and filter            |
| Generation model + usage       | Recommended  | No cost, token, or model analysis              |
| Tool name + args + result      | Recommended  | Tool calls are invisible or opaque             |
| Thread id                      | Optional     | Multi-turn conversations are not grouped       |
| User / session / environment   | Optional     | No per-user or per-environment slicing         |
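
The table can also be read as a checklist. The sketch below encodes the required and recommended rows as a function you could run in your own tests. `TraceSummary` and `checkContract` are hypothetical, and the field names are illustrative rather than Lemma's schema; Lemma's own validation may differ.

```ts
// Hypothetical summary of one trace, as your own test code might collect it.
type TraceSummary = {
  rootCount: number; // spans with no parent
  rootInput?: unknown;
  rootOutput?: unknown;
  rootError?: string;
  agentName?: string;
  generations: { model?: string; usage?: { input: number; output: number } }[];
  tools: { name?: string; args?: unknown; result?: unknown }[];
};

type ContractReport = { missingRequired: string[]; missingRecommended: string[] };

// Report which required and recommended rows of the table are not satisfied.
function checkContract(t: TraceSummary): ContractReport {
  const missingRequired: string[] = [];
  const missingRecommended: string[] = [];

  if (t.rootCount !== 1) missingRequired.push("single root span");
  if (t.rootInput === undefined) missingRequired.push("root input");
  if (t.rootOutput === undefined && t.rootError === undefined) {
    missingRequired.push("root output or error");
  }

  if (!t.agentName) missingRecommended.push("agent name");
  if (t.generations.some((g) => !g.model || !g.usage)) {
    missingRecommended.push("generation model + usage");
  }
  if (t.tools.some((x) => !x.name || x.args === undefined || x.result === undefined)) {
    missingRecommended.push("tool name + args + result");
  }

  return { missingRequired, missingRecommended };
}

// A trace that never recorded its final output and has no agent name.
const report = checkContract({
  rootCount: 1,
  rootInput: "How do I reset my password?",
  generations: [{ model: "gpt-4o", usage: { input: 120, output: 30 } }],
  tools: [],
});
console.log(report);
```

Optional fields (thread id, user id, environment) are left out of the check on purpose, since a trace without them is still complete.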

## Issue detection eligibility

Beyond rendering a trace, Lemma runs automated **issue detection** (silent failures, bad tool calls, loops). Today this runs for traces that arrive in a recognized shape:

* **Vercel AI SDK** traces (the AI SDK's `experimental_telemetry` output).
* **OpenInference / LangGraph** traces.

If you instrument with a [supported framework](/frameworks/vercel-ai-sdk), issue detection works automatically. Traces instrumented manually with Langfuse alone render today, but they do not yet have full issue-detection parity; that work is in progress. For the current status of each shape, see [Good trace vs bad trace](/reference/good-vs-bad-traces).

## Appendix: underlying OpenTelemetry keys

You do not need this section for Langfuse instrumentation. It documents the literal attribute keys Lemma reads, for teams exporting from an existing OpenTelemetry / OpenInference / Vercel AI SDK pipeline.

**Trace root** — the earliest span in the trace with no parent. All other spans must be its descendants.

| Field              | Attribute keys Lemma reads (priority order)                                                                                             |
| ------------------ | --------------------------------------------------------------------------------------------------------------------------------------- |
| Agent name         | `gen_ai.agent.name`, then `ai.agent.name` (on an agent-run root)                                                                        |
| Thread id          | `lemma.thread_id`                                                                                                                       |
| User id            | `user.id`, then `enduser.id`                                                                                                            |
| Input              | `ai.agent.input` → `ai.prompt` → `ai.prompt.messages` → `gen_ai.prompt` → OpenInference `llm.input_messages.*` → root `input.value`     |
| Output             | `ai.response.text` → `ai.response.object` → `gen_ai.completion` → OpenInference `llm.output_messages.*` → root `output.value`           |
| Model              | `ai.model.id`, `gen_ai.request.model`, `gen_ai.response.model`, `llm.model_name`                                                        |
| Tokens (in/out)    | `ai.usage.inputTokens` / `gen_ai.usage.input_tokens` / `gen_ai.usage.prompt_tokens` / `llm.token_count.prompt` (and output equivalents) |
| Generation span    | `openinference.span.kind="llm"`, a Vercel generation `ai.operationId`, or span name `response`                                          |
| Tool span          | `ai.toolCall.*` (Vercel), or `openinference.span.kind="tool"` + `tool.name`                                                             |
| Tool args / result | `ai.toolCall.args`/`ai.toolCall.input` and `ai.toolCall.result`/`ai.toolCall.output`, else `input.value`/`output.value`                 |
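
"Priority order" in the table means the first key that is present wins. A minimal sketch of that lookup, using the user id and agent name rows, is shown here. `firstPresent` is a hypothetical helper, not Lemma's code, and treating `null` and empty strings as missing is an assumption of this sketch.

```ts
type Attributes = Record<string, unknown>;

// Return the value of the first key, in the given order, that is present.
// Assumption: undefined, null, and "" all count as "not present".
function firstPresent(attrs: Attributes, keys: string[]): unknown {
  for (const key of keys) {
    const value = attrs[key];
    if (value !== undefined && value !== null && value !== "") return value;
  }
  return undefined;
}

// Key orders copied from the table above.
const USER_ID_KEYS = ["user.id", "enduser.id"];
const AGENT_NAME_KEYS = ["gen_ai.agent.name", "ai.agent.name"];

const spanAttributes: Attributes = {
  "enduser.id": "u-42", // only the fallback key is set
  "gen_ai.agent.name": "billing-agent", // both keys set; the first one wins
  "ai.agent.name": "support-agent",
};

const resolvedUserId = firstPresent(spanAttributes, USER_ID_KEYS);
const resolvedAgentName = firstPresent(spanAttributes, AGENT_NAME_KEYS);
console.log(resolvedUserId, resolvedAgentName);
```

If your pipeline emits more than one key for the same field, keeping their values identical avoids depending on the resolution order at all.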

<Warning>
  If any child references a parent span that is not in the same export batch, the trace can be dropped. In short-lived or serverless runtimes, flush before the process exits so the whole trace ships together. See [Setup](/tracing/instrumentation/setup#flushing).
</Warning>
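
To catch the situation in the warning before it reaches production, you can inspect a batch of spans in a test. The sketch below is one possible check, assuming each span exposes its own id and an optional parent id. `SpanRecord` and `inspectBatch` are hypothetical names, and this is not how Lemma itself validates a batch.

```ts
// Hypothetical minimal view of an exported span.
type SpanRecord = { spanId: string; parentSpanId?: string };

// Find roots (no parent) and orphans (parent id not present in this batch).
function inspectBatch(spans: SpanRecord[]): { rootIds: string[]; orphanIds: string[] } {
  const ids = new Set(spans.map((s) => s.spanId));
  const rootIds: string[] = [];
  const orphanIds: string[] = [];
  for (const span of spans) {
    if (span.parentSpanId === undefined) {
      rootIds.push(span.spanId);
    } else if (!ids.has(span.parentSpanId)) {
      orphanIds.push(span.spanId);
    }
  }
  return { rootIds, orphanIds };
}

const batch: SpanRecord[] = [
  { spanId: "root" },
  { spanId: "draft-reply", parentSpanId: "root" },
  { spanId: "search_docs", parentSpanId: "not-in-batch" }, // parent shipped separately
];

const batchReport = inspectBatch(batch);
console.log(batchReport);
```

A healthy batch under the contract has exactly one entry in `rootIds` and none in `orphanIds`.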