Use this pattern when your agent streams response chunks back to a caller in real time while keeping the span open for the full duration of the stream.

The key constraint: the stream must be consumed inside the wrapped function. Returning a generator or stream handle directly ends the span before any chunk is produced (see Streaming and async generators for why). To forward chunks to a caller, pass a streaming bridge into the wrapped function and read from it externally:
import { registerOTel, wrapAgent } from "@uselemma/tracing";

registerOTel();

const wrapped = wrapAgent("my-agent", async ({ onComplete }, input: {
  userMessage: string;
  controller: ReadableStreamDefaultController<string>;
}) => {
  let fullResponse = "";

  // streamLLM is a placeholder for your model's streaming call
  for await (const chunk of streamLLM(input.userMessage)) {
    fullResponse += chunk;
    input.controller.enqueue(chunk);
  }

  onComplete(fullResponse);
  input.controller.close();
  return fullResponse;
});

export function handleRequest(userMessage: string) {
  // Definite-assignment assertion: start() runs synchronously in the
  // ReadableStream constructor, so runIdPromise is assigned before use.
  let runIdPromise!: Promise<string>;

  const stream = new ReadableStream<string>({
    start(controller) {
      runIdPromise = wrapped({ userMessage, controller }).then(({ runId }) => runId);
    },
  });

  return { stream, runIdPromise };
}
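On the caller side, the returned stream can be drained with a standard web-streams reader while `runIdPromise` resolves independently. A minimal sketch of the draining logic; the `ReadableStream` here is a trivial stand-in for the one returned by `handleRequest`, so the snippet is self-contained:

```typescript
// Stand-in for the stream returned by handleRequest (hypothetical content).
const stream = new ReadableStream<string>({
  start(controller) {
    controller.enqueue("partial ");
    controller.enqueue("response");
    controller.close();
  },
});

// Reads every chunk from a stream and concatenates it, mirroring what an
// SSE-forwarding loop or test harness would do with the bridge's output.
async function drain(s: ReadableStream<string>): Promise<string> {
  const reader = s.getReader();
  let text = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    text += value;
  }
  return text;
}
```

In a real handler you would write each `value` to the response as it arrives rather than accumulating it, and await `runIdPromise` afterwards.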
Key points:
  • The stream bridge decouples chunk forwarding from span lifetime — the span stays open until the wrapped function returns after the last chunk.
  • Wait for the wrapped invocation to finish before relying on runId or assuming the span is closed.
  • Replace the queue or ReadableStream example with whatever your framework uses to write SSE events.
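As a concrete alternative to `ReadableStream` for the last point, a hand-rolled async queue works as the same kind of bridge: the wrapped function enqueues chunks, the caller consumes them with `for await`, and the span still stays open until the producer finishes. Everything below is a hypothetical sketch, not part of @uselemma/tracing:

```typescript
// Minimal async queue bridge: a producer enqueues chunks and closes the
// queue; a consumer iterates it with for-await until it is drained.
class AsyncQueue<T> {
  private items: T[] = [];
  private waiters: ((r: IteratorResult<T>) => void)[] = [];
  private closed = false;

  enqueue(item: T) {
    const waiter = this.waiters.shift();
    if (waiter) waiter({ value: item, done: false });
    else this.items.push(item);
  }

  close() {
    this.closed = true;
    // Wake any pending consumers with an end-of-stream signal.
    for (const waiter of this.waiters.splice(0)) {
      waiter({ value: undefined as unknown as T, done: true });
    }
  }

  async *[Symbol.asyncIterator]() {
    while (true) {
      if (this.items.length > 0) {
        yield this.items.shift()!;
        continue;
      }
      if (this.closed) return;
      const r = await new Promise<IteratorResult<T>>((resolve) =>
        this.waiters.push(resolve),
      );
      if (r.done) return;
      yield r.value;
    }
  }
}

// Producer stands in for the wrapped agent writing into the bridge;
// consumer stands in for the caller forwarding SSE events.
async function demo(): Promise<string> {
  const queue = new AsyncQueue<string>();
  (async () => {
    for (const chunk of ["stream", "ed"]) queue.enqueue(chunk);
    queue.close();
  })();

  let out = "";
  for await (const chunk of queue) out += chunk;
  return out;
}
```

The queue plays the same role as the `controller` in the example above: pass it into the wrapped function in place of `input.controller`, call `enqueue`/`close` where the example calls `controller.enqueue`/`controller.close`, and iterate it from your framework's response handler.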