ALLM.Telemetry (allm v0.3.1)

Layer-B helper that wraps :telemetry.span/3 with ALLM-specific metadata defaults.

Every public Layer-C entry point (ALLM.generate/3, ALLM.stream_generate/3, ALLM.step/3, ALLM.stream_step/3, ALLM.chat/3, ALLM.stream/3, ALLM.generate_image/3 and its siblings) is wrapped in a span. Attach handlers with :telemetry.attach_many/4 to observe every execution.

Emitted events

| Event name | When | Measurements | Metadata |
|---|---|---|---|
| [:allm, :generate, :start] | non-streaming generate enters | system_time, monotonic_time | request_id, engine, model |
| [:allm, :generate, :stop] | non-streaming generate exits | duration, monotonic_time | start metadata + response |
| [:allm, :generate, :exception] | closure raised | duration, monotonic_time | start metadata + kind, reason, stacktrace |
| [:allm, :stream, :start \| :stop \| :exception] | streaming generate | as above; :stop response is nil (lazy) | start metadata + response: nil |
| [:allm, :step, :start \| :stop \| :exception] | single chat step | as above | start metadata + step_result |
| [:allm, :chat, :start \| :stop \| :exception] | multi-turn chat loop | as above | start metadata + chat_result |
| [:allm, :tool, :start \| :stop \| :exception] | per-tool execution | as above | start metadata + tool_call_id, tool_name, result |
| [:allm, :image, :start \| :stop \| :exception] | image generation | duration, plus image_count on :stop | request_id, engine, model, operation, n, plus usage, response, error on :stop |
| [:allm, :adapter, :retry] | per-attempt retry (non-streaming) | system_time | attempt, delay_ms, reason, request_id |
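
As a rough illustration of consuming these events, the sketch below attaches one handler module to a handful of them with :telemetry.attach_many/4. The module name `MyApp.ALLMLogger`, the handler id, and the log format are hypothetical. It also assumes :duration is reported in native time units, which is what :telemetry.span/3 emits.

```elixir
defmodule MyApp.ALLMLogger do
  # Hypothetical module; only the event names and metadata keys
  # come from the table above.
  require Logger

  @events [
    [:allm, :generate, :stop],
    [:allm, :generate, :exception],
    [:allm, :chat, :stop],
    [:allm, :adapter, :retry]
  ]

  # Attach a single handler function to several ALLM events at once.
  def attach do
    :telemetry.attach_many("my-app-allm-logger", @events, &__MODULE__.handle_event/4, nil)
  end

  # Retry events are plain (non-span) events and carry no :duration.
  def handle_event([:allm, :adapter, :retry], _measurements, metadata, _config) do
    Logger.warning("allm retry attempt=#{metadata.attempt} delay_ms=#{metadata.delay_ms}")
  end

  # Span :stop and :exception events both carry :duration.
  def handle_event([:allm, name, kind], %{duration: duration}, metadata, _config) do
    Logger.info(format(name, kind, duration, metadata))
  end

  # Assumption: duration is in native units, so convert before printing.
  def format(name, kind, duration, metadata) do
    ms = System.convert_time_unit(duration, :native, :millisecond)
    "allm #{name} #{kind} request_id=#{metadata[:request_id]} duration_ms=#{ms}"
  end
end
```

Calling `MyApp.ALLMLogger.attach()` once at application start would be enough. Passing the handler as a remote capture (`&__MODULE__.handle_event/4`) rather than an anonymous function follows the usual :telemetry recommendation.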

Common metadata

Every span carries :request_id, :engine, and :model on :start, and the same on :stop (plus the per-span *_result / response / error extras). :request_id is generated at the outermost public call and threaded into nested calls via opts[:request_id], so per-tool spans inside a chat/3 loop share the parent chat's id.

Why request_id is metadata-only

:request_id lives on telemetry span metadata and on Response.request_id post-collection — it does NOT extend the ALLM.Event closed tagged-tuple union. A consumer who folds events by hand (without ALLM.StreamCollector) reads :request_id from the surrounding span context, not from individual event payloads.

Span nesting

Per-tool spans ([:allm, :tool, ...]) execute inside their parent step span ([:allm, :step, ...]); the parent step span itself nests inside its parent chat span when invoked from chat/3 / stream/3.
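
The nesting order can be reproduced by hand. The sketch below nests three ALLM.Telemetry.span/3 calls manually and records the order in which their :start and :stop events fire. It simulates the nesting rather than calling chat/3, and the handler id, engine, and model values are placeholders.

```elixir
# Listen to :start and :stop for the three nested span kinds.
events = for name <- [:chat, :step, :tool], kind <- [:start, :stop], do: [:allm, name, kind]

:telemetry.attach_many(
  "nesting-demo",
  events,
  fn [:allm, name, kind], _measurements, %{request_id: rid}, %{owner: pid} ->
    send(pid, {:seen, name, kind, rid})
  end,
  %{owner: self()}
)

# One id shared by every span, mirroring how opts[:request_id] is threaded.
id = ALLM.Telemetry.request_id()
meta = %{request_id: id, engine: :demo_engine, model: "demo-model"}

:chat_done =
  ALLM.Telemetry.span(:chat, meta, fn ->
    step_result =
      ALLM.Telemetry.span(:step, meta, fn ->
        tool_result = ALLM.Telemetry.span(:tool, meta, fn -> {:tool_done, %{result: :ok}} end)
        {:step_done, %{step_result: tool_result}}
      end)

    {:chat_done, %{chat_result: step_result}}
  end)

:telemetry.detach("nesting-demo")

# Handlers run synchronously in the calling process, so mailbox order is event order.
seen =
  for _ <- events do
    receive do
      {:seen, name, kind, ^id} -> {name, kind}
    after
      100 -> :missing
    end
  end
```

Here `seen` lists the spans opening outermost first and closing innermost first, with every event carrying the same request id.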

Exception handling

span/3 delegates exception trapping to :telemetry.span/3, which automatically catches raises in the closure, emits [:allm, name, :exception] with %{kind, reason, stacktrace} metadata, and re-raises the exception to the caller. The existing bubble-up semantics of every wrapped function are therefore preserved.
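
A minimal sketch of that behaviour, assuming a closure that raises a RuntimeError: the handler sees the :exception event, and the caller still has to rescue the re-raised exception. The handler id and the request id are placeholders.

```elixir
:telemetry.attach(
  "exception-demo",
  [:allm, :generate, :exception],
  fn _event, _measurements, metadata, %{owner: pid} ->
    send(pid, {:exception_seen, metadata.kind, metadata.reason})
  end,
  %{owner: self()}
)

# The span emits the :exception event and then re-raises, so we rescue here.
outcome =
  try do
    ALLM.Telemetry.span(:generate, %{request_id: "demo"}, fn -> raise "boom" end)
  rescue
    e in RuntimeError -> {:reraised, e.message}
  end

:telemetry.detach("exception-demo")

seen =
  receive do
    {:exception_seen, kind, reason} -> {kind, reason}
  after
    100 -> :no_event
  end
```

For an Elixir `raise`, `kind` is `:error` and `reason` is the exception struct itself.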

Types

common_metadata()

@type common_metadata() :: %{
  optional(:request_id) => String.t(),
  optional(:engine) => term(),
  optional(:model) => term(),
  optional(atom()) => term()
}

Common metadata attached to every Layer-C span.

span_name()

@type span_name() :: :generate | :stream | :step | :chat | :tool | :image

Span suffix; the prefix [:allm] is fixed.

Functions

event_prefix()

@spec event_prefix() :: [:allm]

Return the fixed event-name prefix for every ALLM telemetry event.

Examples

iex> ALLM.Telemetry.event_prefix()
[:allm]

execute(suffix_path, measurements, metadata)

@spec execute([atom(), ...], map(), map()) :: :ok

Emit a single non-span event under the [:allm | suffix_path] event name. Used by ALLM.Retry for [:allm, :adapter, :retry]. No metadata merging or measurement injection — the caller supplies both maps in full.

Examples

iex> :telemetry.attach(
...>   "doctest-execute",
...>   [:allm, :doctest, :ping],
...>   fn _n, _m, _md, %{owner: pid} -> send(pid, :got_event) end,
...>   %{owner: self()}
...> )
iex> ALLM.Telemetry.execute([:doctest, :ping], %{count: 1}, %{tag: :doc})
:ok
iex> :telemetry.detach("doctest-execute")
iex> receive do
...>   :got_event -> :ok
...> after
...>   100 -> :no_event
...> end
:ok

request_id()

@spec request_id() :: String.t()

Generate a fresh 22-character URL-safe Base64 request id.

Sourced from 16 cryptographic random bytes. Used to correlate the start, stop, and any exception events of a single Layer-C call. The outermost public function generates the id; inner functions inherit via opts[:request_id].
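
The 22-character length follows from the encoding: 16 bytes are 128 bits, and Base64 carries 6 bits per character, so an unpadded encoding needs ceil(128 / 6) = 22 characters. The sketch below shows one way an id with these properties could be produced. `RequestIdSketch` is a hypothetical stand-in, not the library's implementation.

```elixir
defmodule RequestIdSketch do
  # Hypothetical stand-in: 16 random bytes, URL-safe Base64, no "=" padding.
  def generate do
    16
    |> :crypto.strong_rand_bytes()
    |> Base.url_encode64(padding: false)
  end
end

id = RequestIdSketch.generate()

# 16 bytes always encode to 22 unpadded characters from the URL-safe alphabet.
22 = byte_size(id)
true = id =~ ~r/^[A-Za-z0-9_-]{22}$/
```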

Examples

iex> id = ALLM.Telemetry.request_id()
iex> byte_size(id)
22
iex> id =~ ~r/^[A-Za-z0-9_-]{22}$/
true

span(name, start_metadata, fun)

@spec span(
  span_name(),
  common_metadata(),
  (-> {result, map()} | {result, map(), map()})
) :: result
when result: var

Wrap a closure in a :telemetry.span/3 call under the [:allm, name] prefix.

The closure must return either:

  • {result, stop_metadata_extras} — the 2-tuple form (default, used by :generate | :stream | :step | :chat | :tool spans). stop_metadata_extras is shallow-merged on top of the start metadata at :stop emission time so per-span extras (:response, :step_result, :chat_result, :result) appear alongside the common keys.
  • {result, extra_measurements, stop_metadata_extras} — the 3-tuple form, for spans that inject custom :stop measurements beyond :duration and :monotonic_time. :image spans use this form to carry :image_count as a measurement (numeric metrics → measurements; structured context → metadata).

Caller-supplied start_metadata is forwarded to the :start event unchanged and used as the base for the :stop event's metadata.

Raises ArgumentError for an unrecognised name (typo guard against :chats / :steps); valid names are :generate | :stream | :step | :chat | :tool | :image.
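
To make the 3-tuple form concrete, here is a sketch of an :image span that reports :image_count as a :stop measurement. The handler id, the metadata values, and the list standing in for generated images are all illustrative, and the closure is hand-written rather than taken from ALLM.generate_image/3.

```elixir
:telemetry.attach(
  "image-demo",
  [:allm, :image, :stop],
  fn _event, measurements, metadata, %{owner: pid} ->
    # :image_count arrives as a measurement; :operation comes from the start metadata.
    send(pid, {:image_stop, measurements.image_count, metadata.operation})
  end,
  %{owner: self()}
)

# Placeholder values; only the key names come from the events table.
start_meta = %{
  request_id: "img-1",
  engine: :demo_engine,
  model: "demo-image-model",
  operation: :generate,
  n: 2
}

images =
  ALLM.Telemetry.span(:image, start_meta, fn ->
    result = ["image-a", "image-b"]
    # 3-tuple: {result, extra :stop measurements, extra :stop metadata}
    {result, %{image_count: length(result)}, %{response: result}}
  end)

:telemetry.detach("image-demo")

stop_event =
  receive do
    {:image_stop, count, operation} -> {count, operation}
  after
    100 -> :no_event
  end
```

The span returns only the first tuple element, so `images` is the plain result list while the count travels separately as a numeric measurement.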

Carve-out: :stream :stop :response is nil

Per ALLM.StreamRunner.run/3, the :stream span's :stop metadata carries :response => nil — materialising the wrapped enumerable to populate the %Response{} would defeat consumer-driven laziness. Consumers needing the canonical response should either fold the returned stream themselves (ALLM.StreamCollector.to_response/1) or call ALLM.generate/3, whose :generate :stop DOES carry the reduced %Response{}.

Examples

iex> :telemetry.attach(
...>   "doctest-handler",
...>   [:allm, :generate, :stop],
...>   fn _name, _measurements, %{result: r}, %{owner: pid} ->
...>     send(pid, {:doctest_event, r})
...>   end,
...>   %{owner: self()}
...> )
iex> ALLM.Telemetry.span(:generate, %{request_id: "x"}, fn ->
...>   {:done, %{result: :ok}}
...> end)
:done
iex> :telemetry.detach("doctest-handler")
iex> receive do
...>   {:doctest_event, payload} -> payload
...> after
...>   100 -> :no_event
...> end
:ok