A step is a single LLM request/response inside a run. With Lemma custom instrumentation, steps are child spans nested under the run span that wrap_agent creates.

Required

Create steps with start_as_current_span inside the wrapped run:
from opentelemetry import trace
from uselemma_tracing import TraceContext, wrap_agent

tracer = trace.get_tracer("my-agent")

# run_agent is assumed to execute inside a run created by wrap_agent;
# llm_call stands in for your model provider call.
async def run_agent(ctx: TraceContext, user_message: str):
    with tracer.start_as_current_span("llm.step.generate") as step_span:
        response = await llm_call(user_message)
        step_span.set_attribute("llm.prompt", user_message)
        step_span.set_attribute("llm.response", response)

    ctx.on_complete(response)
    return response

Optional step data

Add attributes for model, tokens, cost, and finish reason:
step_span.set_attribute("llm.model.requested", "gpt-4o")
step_span.set_attribute("llm.model.used", "gpt-4o-2024-08-06")
step_span.set_attribute("llm.tokens.prompt_uncached", 320)
step_span.set_attribute("llm.tokens.prompt_cached", 80)
step_span.set_attribute("llm.tokens.completion", 140)
step_span.set_attribute("llm.cost.usd", 0.0042)
step_span.set_attribute("llm.finish_reason", "stop")
If you use provider instrumentation such as instrument_openai() or instrument_anthropic(), many of these attributes are emitted automatically.
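When your provider does not report cost, the token attributes above are enough to derive llm.cost.usd client-side. A minimal sketch, assuming hypothetical per-million-token prices (the rates and the compute_cost_usd helper are illustrative, not part of Lemma; look up your provider's real price sheet):

```python
# Illustrative per-million-token prices, keyed by requested model.
# These numbers are placeholders, not real provider rates.
PRICES = {
    "gpt-4o": {"prompt_uncached": 2.50, "prompt_cached": 1.25, "completion": 10.00},
}

def compute_cost_usd(model: str, prompt_uncached: int, prompt_cached: int, completion: int) -> float:
    """Sum cost across the three token buckets, priced per million tokens."""
    p = PRICES[model]
    cost = (
        prompt_uncached * p["prompt_uncached"]
        + prompt_cached * p["prompt_cached"]
        + completion * p["completion"]
    ) / 1_000_000
    return round(cost, 6)

# Token counts from the attributes above: 320 uncached + 80 cached prompt, 140 completion.
cost = compute_cost_usd("gpt-4o", 320, 80, 140)  # → 0.0023 at these placeholder rates
```

The result can then be recorded with step_span.set_attribute("llm.cost.usd", cost) exactly as in the snippet above.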

Mark a step as failed

with tracer.start_as_current_span("llm.step.generate") as step_span:
    try:
        response = await llm_call("hello")
    except Exception as err:
        step_span.record_exception(err)
        # Also mark the span itself as errored so OTel-aware tooling surfaces it.
        step_span.set_status(trace.Status(trace.StatusCode.ERROR, str(err)))
        step_span.set_attribute("step.status", "error")
        raise
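A transient failure is often worth retrying, with each attempt recorded as its own step span rather than reusing the failed one. A minimal retry-loop sketch in plain Python (call_with_retries is a hypothetical helper, not a Lemma API; the span-handling code from the example above would wrap each attempt):

```python
import asyncio

async def call_with_retries(call, *, attempts: int = 3, base_delay: float = 0.0):
    """Retry a coroutine-producing callable with exponential backoff.

    Each attempt would open its own step span; the final failure
    propagates, as in the error-handling example above.
    """
    for attempt in range(1, attempts + 1):
        try:
            return await call()
        except Exception:
            if attempt == attempts:
                raise
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
```

Keeping one span per attempt means the dashboard shows exactly how many calls a flaky step consumed before succeeding.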

Dashboard outcome

Each step appears nested under the run so you can inspect:
  • per-call latency
  • model and token usage
  • finish reason
  • where failures occurred in the reasoning chain
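When scanning finish reasons across many steps, the main question is usually whether each model call ended naturally or was cut off. A small triage sketch (the value set follows common provider conventions such as OpenAI's finish_reason field; step_needs_attention is an illustrative helper, not a Lemma API):

```python
# Common finish_reason values (OpenAI-style): "stop" and "tool_calls" are
# clean endings; "length" means the call hit its max-token limit, and
# "content_filter" means the provider blocked part of the output.
CLEAN_FINISH = {"stop", "tool_calls"}

def step_needs_attention(finish_reason: str) -> bool:
    """Flag steps whose output was truncated, filtered, or otherwise abnormal."""
    return finish_reason not in CLEAN_FINISH
```

Applied to the llm.finish_reason attribute recorded above, this lets you filter a run's steps down to the ones worth inspecting.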

Next Steps