By default, agent() auto-closes the span when the wrapped function returns and captures the return value as ai.agent.output. This works perfectly for non-streaming agents, but breaks for streaming: the function returns a stream object before the output text is assembled. Pass { streaming: true } / streaming=True to opt into manual span lifecycle.

The contract

const streamingAgent = agent("my-agent", async (input: string, ctx) => {
  // ... build and start the stream ...
  // call ctx.complete(assembledText) once the full output is known
  // return the stream response for the caller
}, { streaming: true });
  • The span stays open when the function returns.
  • Call ctx.complete(output) exactly once, with the assembled text, to record ai.agent.output and close the span.
  • If ctx.complete() is never called, the SDK emits a console warning and the span stays open; the run will not appear in the Lemma dashboard until the span is closed.
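As a mental model, the contract can be sketched with a simplified stand-in for the streaming context. The MockCtx shape below is illustrative only, not the SDK's actual type:

```typescript
// Simplified stand-in for the streaming context -- illustrative only,
// not the actual @uselemma/tracing types.
interface MockCtx {
  state: "open" | "closed";
  output?: string;
  complete(output: string): void;
}

function makeMockCtx(): MockCtx {
  const ctx: MockCtx = {
    state: "open",
    complete(output: string) {
      if (ctx.state === "closed") {
        throw new Error("ctx.complete() called more than once");
      }
      ctx.output = output;  // recorded as ai.agent.output
      ctx.state = "closed"; // span closes here, not when the function returns
    },
  };
  return ctx;
}

const ctx = makeMockCtx();
// The wrapped function returns a stream handle while the span is still open;
// only ctx.complete() records the output and closes the span.
ctx.complete("full assembled text");
```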

Vercel AI SDK — onFinish pattern

The cleanest approach: call ctx.complete() inside the onFinish callback, which fires once the full output is assembled.
import { agent } from "@uselemma/tracing";
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

const streamingAgent = agent("my-agent", async (input: string, ctx) => {
  const result = await streamText({
    model: openai("gpt-4o"),
    prompt: input,
    experimental_telemetry: { isEnabled: true },
    onFinish({ text }) {
      ctx.complete(text); // records output, closes span
    },
  });
  return result.toDataStreamResponse();
}, { streaming: true });
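To see why onFinish is the right hook, note that the stream handle is returned before the callback fires. A tiny simulation makes the ordering concrete; mockStreamText here is a hypothetical stand-in, not the AI SDK:

```typescript
// Hypothetical stand-in for streamText: returns its "stream handle"
// immediately and fires onFinish with the full text afterwards.
async function mockStreamText(
  chunks: string[],
  opts: { onFinish(result: { text: string }): void },
): Promise<string[]> {
  queueMicrotask(() => opts.onFinish({ text: chunks.join("") }));
  return chunks; // handle is available before onFinish runs
}

let completedWith: string | undefined;

const handle = await mockStreamText(["Hel", "lo"], {
  onFinish({ text }) {
    completedWith = text; // this is where ctx.complete(text) belongs
  },
});
```

Because onFinish fires only once, completing the span there cannot double-close it, and the handler receives the fully assembled text without any manual accumulation.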

Raw stream consumption

When you consume the stream yourself (e.g. to forward chunks to an SSE response), accumulate the text and call ctx.complete() when the loop ends.
import { agent } from "@uselemma/tracing";
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic();

const streamingAgent = agent("my-agent", async (input: string, ctx) => {
  let fullText = "";

  const stream = anthropic.messages.stream({
    model: "claude-haiku-4-5",
    max_tokens: 512,
    messages: [{ role: "user", content: input }],
  });

  for await (const event of stream) {
    if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
      fullText += event.delta.text;
    }
  }

  ctx.complete(fullText); // assembled text recorded as output
  return fullText;
}, { streaming: true });
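The same accumulation loop works for any async iterable of text deltas, so it can be factored into a small helper. accumulate and mockDeltas below are illustrative names, not SDK exports:

```typescript
// Illustrative helper: fold an async iterable of text deltas into the
// full output, then hand it to a completion callback exactly once.
async function accumulate(
  deltas: AsyncIterable<string>,
  onComplete: (text: string) => void,
): Promise<string> {
  let fullText = "";
  for await (const delta of deltas) {
    fullText += delta;
  }
  onComplete(fullText); // e.g. ctx.complete(fullText)
  return fullText;
}

// Mock delta stream standing in for a provider's streaming response.
async function* mockDeltas(): AsyncGenerator<string> {
  yield "Hello, ";
  yield "world!";
}
```

Inside the agent body you would pass ctx.complete as the callback and return (or forward) the result, keeping the provider-specific event filtering separate from the span lifecycle.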

What happens if you forget ctx.complete()

The SDK emits a warning at runtime:
[lemma] Streaming agent "my-agent" returned without calling ctx.complete().
Call ctx.complete(output) inside the stream's onFinish callback to close the run span.
The span remains open and the run will not appear in the Lemma dashboard until the process exits (at which point it is flushed without an output value). Always call ctx.complete() inside a stream finish callback or after your accumulation loop.
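If you want a hard failure on double completion and an easy way to assert completion in tests, wrap the callback in a small once-guard. This is a defensive sketch, not an SDK feature:

```typescript
// Sketch of a "complete exactly once" guard -- a defensive pattern,
// not part of @uselemma/tracing.
function completeOnce(fn: (output: string) => void) {
  let called = false;
  return {
    complete(output: string) {
      if (called) throw new Error("complete() already called");
      called = true;
      fn(output);
    },
    wasCalled: () => called,
  };
}

// Usage: call guard.complete(fullText) wherever you would call
// ctx.complete, then check guard.wasCalled() after the stream ends.
const guard = completeOnce((output) => {
  /* forward to ctx.complete(output) here */
});
```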

Next Steps