ReqLLM.Telemetry (ReqLLM v1.14.0)

Native :telemetry emitter for ReqLLM request and reasoning lifecycle.

Every event for a logical request shares the same request_id, so request lifecycle, reasoning lifecycle, and token usage can be correlated without provider-specific parsing. The OpenTelemetry bridge (ReqLLM.OpenTelemetry) is built on top of these events — attach handlers here for billing, tenant attribution, or any integration that should not depend on an OpenTelemetry SDK.

Event families

Event                               Measurements
[:req_llm, :request, :start]        system_time
[:req_llm, :request, :stop]         duration, system_time
[:req_llm, :request, :exception]    duration, system_time
[:req_llm, :reasoning, :start]      system_time
[:req_llm, :reasoning, :update]     system_time
[:req_llm, :reasoning, :stop]       duration, system_time
[:req_llm, :token_usage]            token + cost counters

duration is in native monotonic time units — convert with System.convert_time_unit/3 if you want milliseconds.
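
As an illustration of consuming the duration measurement, here is a minimal sketch of a handler for the stop and exception events that converts the duration to milliseconds. The module name MyApp.LLMTimingHandler, the handler id, and the notify pid are hypothetical; only the event names and measurement keys come from the table above, and :telemetry.attach_many/4 comes from the telemetry library.

```elixir
defmodule MyApp.LLMTimingHandler do
  @moduledoc false
  # Hypothetical module: only the event names and the :duration key are from the docs.

  @events [
    [:req_llm, :request, :stop],
    [:req_llm, :request, :exception]
  ]

  # Attach once, e.g. at application start. `notify` is the pid that will
  # receive one timing message per finished request (an assumption of this sketch).
  def attach(notify) when is_pid(notify) do
    :telemetry.attach_many(
      "my-app-llm-timing",
      @events,
      &__MODULE__.handle_event/4,
      %{notify: notify}
    )
  end

  # Events that carry a duration: convert from native units to milliseconds.
  def handle_event([:req_llm, :request, outcome], %{duration: duration}, metadata, %{notify: pid}) do
    duration_ms = System.convert_time_unit(duration, :native, :millisecond)
    send(pid, {:llm_request_timing, outcome, Map.get(metadata, :request_id), duration_ms})
    :ok
  end

  # Anything without a duration measurement is ignored.
  def handle_event(_event, _measurements, _metadata, _config), do: :ok
end
```

The handler is referenced with an external function capture rather than an anonymous function, which is the form the telemetry library recommends for attached handlers.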

Request metadata

All request lifecycle events carry the same metadata map: request_id, operation, mode, provider, model, transport, reasoning, request_summary, response_summary, http_status, finish_reason, usage, request_options, server, streaming. The full shape and a worked example live in the Telemetry guide.

Reasoning events never include raw thinking text — they are metadata-only even with payload capture enabled.
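
Because every event carries the same request_id, request and reasoning events can be grouped without provider-specific parsing. The following is a rough sketch of that idea, not part of ReqLLM: MyApp.LLMTrace and its Agent-backed storage are hypothetical, and the only documented detail relied on is the request_id metadata key.

```elixir
defmodule MyApp.LLMTrace do
  @moduledoc false
  # Hypothetical helper: groups observed event names by request_id.

  # Start an Agent holding %{request_id => [event_name, ...]}.
  def start_link, do: Agent.start_link(fn -> %{} end)

  # Telemetry handler. The handler config is assumed to carry the Agent pid.
  def handle_event(event, _measurements, metadata, %{agent: agent}) do
    request_id = Map.get(metadata, :request_id, :unknown)

    Agent.update(agent, fn state ->
      # Append the event name to the list already recorded for this request.
      Map.update(state, request_id, [event], fn seen -> seen ++ [event] end)
    end)
  end

  # Return the events seen so far for one request, in arrival order.
  def events_for(agent, request_id) do
    Agent.get(agent, &Map.get(&1, request_id, []))
  end
end
```

Telemetry handlers run synchronously in the process that emits the event, so a production version would likely prefer a non-blocking write (for example an ETS insert) over Agent.update/2.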

Payload modes

Default is metadata-only. Opt into raw payloads globally or per call:

config :req_llm, telemetry: [payloads: :raw]

ReqLLM.generate_text(model, prompt, telemetry: [payloads: :raw])

Raw payloads are still sanitized — reasoning text is redacted, binary parts are summarized by byte size and media type, embeddings report vector counts rather than vectors. Use with care in multi-tenant systems.

Token usage compatibility

[:req_llm, :token_usage] remains available for existing consumers and now fires for streaming as well as non-streaming requests. For new integrations, prefer [:req_llm, :request, :stop] — it includes duration, finish reason, summaries, and normalized reasoning metadata alongside usage.
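
For consumers migrating from the compatibility event, one possible approach is a small normalizing function that accepts either event and returns the usage data in one shape. This sketch assumes, based on the table and metadata list above, that the compatibility event carries its counters in the measurements map while the stop event carries them under the usage metadata key. MyApp.LLMUsage is a hypothetical name, and the exact keys inside the usage map are not specified here.

```elixir
defmodule MyApp.LLMUsage do
  @moduledoc false
  # Hypothetical helper: normalises either usage-bearing event to {request_id, usage}.

  # Compatibility event: counters are assumed to live in the measurements map.
  def extract([:req_llm, :token_usage], measurements, metadata) do
    {Map.get(metadata, :request_id), measurements}
  end

  # Preferred event: usage is read from the request metadata.
  def extract([:req_llm, :request, :stop], _measurements, metadata) do
    {Map.get(metadata, :request_id), Map.get(metadata, :usage)}
  end

  # Every other event carries nothing of interest for usage accounting.
  def extract(_event, _measurements, _metadata), do: :ignore
end
```

Attach a handler built on this to only one of the two events; attaching to both would count each request twice.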

Types

context()

@type context() :: %{
  request_id: String.t(),
  model: LLMDB.Model.t(),
  operation: atom(),
  mode: lifecycle_mode(),
  transport: transport(),
  payload_mode: payload_mode(),
  reasoning_contract: reasoning_contract(),
  original_opts: keyword(),
  request_options: map(),
  server: map(),
  request_summary: map(),
  request_payload: any(),
  request_started?: boolean(),
  request_stopped?: boolean(),
  started_at: integer() | nil,
  request_started_system_time: integer() | nil,
  first_chunk_at: integer() | nil,
  requested_reasoning: map(),
  effective_reasoning: map(),
  reasoning_started?: boolean(),
  reasoning_started_at: integer() | nil,
  reasoning_observation: map(),
  response_summary_state: map()
}

lifecycle_mode()

@type lifecycle_mode() :: :sync | :stream

payload_mode()

@type payload_mode() :: :none | :raw

reasoning_contract()

@type reasoning_contract() ::
  :openai_effort
  | :openai_or_thinking
  | :anthropic_thinking
  | :platform_anthropic
  | :google_budget
  | :alibaba_thinking
  | :thinking_toggle
  | :zenmux_reasoning
  | :unsupported

transport()

@type transport() :: :req | :finch

Functions

emit_token_usage(model, usage, metadata \\ [])

@spec emit_token_usage(LLMDB.Model.t(), map() | nil, keyword()) :: :ok

Emits the compatibility token usage event.

exception_request(context, error, opts \\ [])

@spec exception_request(context(), Exception.t() | term(), keyword()) :: context()

Emits request exception telemetry and returns the updated context.

new_context(model, opts, extra \\ [])

@spec new_context(LLMDB.Model.t(), keyword(), keyword()) :: context()

Builds a telemetry context for a request lifecycle.

observe_response(context, response)

@spec observe_response(context(), any()) :: context()

Observes a terminal response and updates response and reasoning state.

observe_stream_chunk(context, chunk)

@spec observe_stream_chunk(context(), ReqLLM.StreamChunk.t()) :: context()

Folds a streaming chunk into the telemetry context and emits milestone reasoning events when applicable.

Called by ReqLLM's streaming pipeline (ReqLLM.StreamServer) for every chunk produced during a streaming request. Returns an updated context with:

  • first_chunk_at set to System.monotonic_time/0 on the first non-empty content chunk or first tool call — this feeds streaming.time_to_first_chunk on the request lifecycle metadata and gen_ai.client.operation.time_to_first_chunk on the OpenTelemetry bridge.
  • response_summary counters incremented for text bytes, thinking bytes, tool calls, etc.
  • A [:req_llm, :reasoning, :update] event emitted with milestone: :content_started the first time a reasoning chunk is observed.

Hosts integrating against the low-level streaming API (ReqLLM.Streaming) do not normally call this directly — ReqLLM threads it through the streaming pipeline. The high-level stream_text/3 / stream_object/4 APIs use it transparently.
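
To make the fold semantics concrete, here is a simplified stand-in written against plain maps. It is not ReqLLM's implementation and does not emit telemetry. The chunk fields (:type, :text) and the counter names (:text_bytes, :thinking_bytes) are assumptions chosen for illustration. It shows only two of the behaviours listed above: the first-chunk timestamp is set once, and byte counters accumulate.

```elixir
defmodule StreamFoldSketch do
  @moduledoc false
  # Simplified stand-in for the fold described above; NOT ReqLLM's code.
  # Chunks are plain maps with hypothetical :type and :text fields.

  def observe(ctx, chunk, now \\ System.monotonic_time()) do
    ctx
    |> mark_first_chunk(chunk, now)
    |> count_bytes(chunk)
  end

  # Set first_chunk_at only once: on the first non-empty content chunk...
  defp mark_first_chunk(%{first_chunk_at: nil} = ctx, %{type: :content, text: text}, now)
       when is_binary(text) and text != "" do
    %{ctx | first_chunk_at: now}
  end

  # ...or on the first tool call.
  defp mark_first_chunk(%{first_chunk_at: nil} = ctx, %{type: :tool_call}, now) do
    %{ctx | first_chunk_at: now}
  end

  # Thinking chunks, empty content, and later chunks leave the timestamp alone.
  defp mark_first_chunk(ctx, _chunk, _now), do: ctx

  defp count_bytes(ctx, %{type: :content, text: text}) when is_binary(text) do
    Map.update!(ctx, :text_bytes, &(&1 + byte_size(text)))
  end

  defp count_bytes(ctx, %{type: :thinking, text: text}) when is_binary(text) do
    Map.update!(ctx, :thinking_bytes, &(&1 + byte_size(text)))
  end

  defp count_bytes(ctx, _chunk), do: ctx
end
```

A stream of chunks can then be folded with Enum.reduce(chunks, initial_ctx, &StreamFoldSketch.observe(&2, &1)), where initial_ctx is a map such as %{first_chunk_at: nil, text_bytes: 0, thinking_bytes: 0}.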

put_request_context(request, context)

@spec put_request_context(Req.Request.t(), context()) :: Req.Request.t()

Stores telemetry context in a Req request.

put_response_context(response, context)

@spec put_response_context(Req.Response.t(), context()) :: Req.Response.t()

Stores telemetry context in a Req response private map.

put_server_from_source(context, source)

@spec put_server_from_source(context(), any()) :: context()

Pre-populates context.server from a request source that start_request cannot read directly (e.g. an HTTPContext for streaming flows). Has no effect if the source yields no server info.

reasoning_metadata(context, extra \\ %{})

@spec reasoning_metadata(context(), map()) :: map()

Returns the normalized metadata map for reasoning lifecycle events.

request_context(request)

@spec request_context(Req.Request.t()) :: context() | nil

Reads telemetry context from a Req request.

request_context_key()

@spec request_context_key() :: atom()

Returns the private key used to store telemetry context on Req requests.

request_metadata(context, extra)

@spec request_metadata(context(), map()) :: map()

Returns the normalized request metadata map for request lifecycle events.

start_request(context, request_source)

@spec start_request(context(), any()) :: context()

Emits request start telemetry and returns the updated context.

stop_request(context, response, opts \\ [])

@spec stop_request(context(), any(), keyword()) :: context()

Emits request stop telemetry and returns the updated context.

usage_from_response(arg1)

@spec usage_from_response(any()) :: map() | nil

Extracts token usage metadata from a Req response private map.