SkillKit instruments all meaningful runtime activity through the :telemetry library. Every agent turn, LLM call, tool execution, and rate-limit retry emits a structured event that you can forward to any metrics or logging backend without modifying SkillKit itself.

Two namespaces are used:

  • [:skill_kit, ...] — agent boundary spans and LLM pipeline events
  • [:anthropic, ...] — low-level HTTP client events for the Anthropic API

All durations are in :native time units (convert with System.convert_time_unit/3).
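
As a small illustration of that conversion, the sketch below wraps System.convert_time_unit/3 in a helper module. DurationFormat and its function names are hypothetical and not part of SkillKit; only System.convert_time_unit/3 comes from the standard library.

```elixir
defmodule DurationFormat do
  # Hypothetical helper, not part of SkillKit.

  # Convert a :native duration to whole milliseconds (the result is rounded down).
  def to_ms(native) when is_integer(native) do
    System.convert_time_unit(native, :native, :millisecond)
  end

  # Render a :native duration as a short human-readable string.
  def format(native) when is_integer(native) do
    "#{to_ms(native)}ms"
  end
end

# Build a sample :native value equivalent to 250 milliseconds and format it.
sample = System.convert_time_unit(250, :millisecond, :native)
IO.puts(DurationFormat.format(sample))
```

Converting at the point of display, rather than storing converted values, keeps the raw measurement intact for backends that expect :native units.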


SkillKit events

Boundary spans

Each agent boundary emits a telemetry span, letting you measure latency per boundary type and observe which crossings were allowed, denied, or suspended.

Event                                           Kind  Description
[:skill_kit, :tool_use, :start/:stop]           span  Individual tool execution
[:skill_kit, :tool_batch, :start/:stop]         span  Batch of parallel tool calls (wraps all :tool_use spans in one LLM turn)
[:skill_kit, :subagent, :start/:stop]           span  Spawning a subagent
[:skill_kit, :conversation_save, :start/:stop]  span  Persisting conversation history
[:skill_kit, :conversation_load, :start/:stop]  span  Loading conversation history
[:skill_kit, :llm_request, :start/:stop]        span  Sending a request to the LLM
[:skill_kit, :turn, :start/:stop]               span  Processing a batch of messages

Each span emits a :start event (with :system_time) and a :stop event (with :duration). The metadata map contains the boundary context keys described in the Hooks guide.

The :tool_batch span wraps the entire parallel execution of tool calls returned by a single LLM response. Its metadata includes:

  • :agent_name — the agent executing the batch
  • :tool_count — number of tool calls in the batch
  • :tool_names — list of tool names being executed

If any tool in the batch suspends (via {:pending, state}), the :duration of the :tool_batch span includes the time spent waiting for SkillKit.respond/3.

To observe every tool-use boundary crossing:

SkillKit.Telemetry.attach_many(
  :tool_use_spans,
  [
    [:skill_kit, :tool_use, :start],
    [:skill_kit, :tool_use, :stop]
  ],
  fn event, _measurements, meta, _ ->
    IO.inspect({List.last(event), meta.agent_name, meta.tool})
  end,
  %{}
)

To measure total batch execution time (including suspension waits):

SkillKit.Telemetry.attach_many(
  :tool_batch_spans,
  [[:skill_kit, :tool_batch, :stop]],
  fn _event, %{duration: d}, meta, _ ->
    ms = System.convert_time_unit(d, :native, :millisecond)
    IO.puts("[#{meta.agent_name}] #{meta.tool_count} tools completed in #{ms}ms")
  end,
  %{}
)

LLM events

Event                                Kind        Description
[:skill_kit, :llm, :stream, :start]  span start  An LLM stream is about to begin
[:skill_kit, :llm, :stream, :stop]   span stop   Stream completed (success or error)
[:skill_kit, :llm, :stream, :error]  point       Model URI could not be resolved before the stream

Measurements and metadata

Event            Measurements  Metadata keys
:stream, :start  :system_time  :provider (module), :model (string)
:stream, :stop   :duration     :provider, :model, :error (on failure)
:stream, :error  %{}           :error (the {:error, _} tuple), :model (string)
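
The sketch below shows one way a handler might tell the three outcomes apart using only the keys listed above. StreamOutcome is a hypothetical module, the model strings are placeholders, and the sketch assumes the :error key is absent (rather than nil) when a stream succeeds. Telemetry ignores handler return values; the tuples are returned here only so the clauses can be exercised directly.

```elixir
defmodule StreamOutcome do
  # Hypothetical handler module, not part of SkillKit.

  @stop [:skill_kit, :llm, :stream, :stop]
  @error [:skill_kit, :llm, :stream, :error]

  # Stream finished and the metadata carries :error, so it failed mid-stream.
  def handle_event(@stop, %{duration: d}, %{error: error} = meta, _config) do
    {:failed, meta.model, ms(d), error}
  end

  # Stream finished without :error (assumed absent on success).
  def handle_event(@stop, %{duration: d}, meta, _config) do
    {:ok, meta.model, ms(d)}
  end

  # The model URI could not be resolved; no stream ran, so there is no duration.
  def handle_event(@error, _measurements, %{error: error, model: model}, _config) do
    {:unresolved, model, error}
  end

  defp ms(native), do: System.convert_time_unit(native, :native, :millisecond)
end

# Call the handler directly with sample data shaped like the tables above.
d = System.convert_time_unit(120, :millisecond, :native)
meta = %{provider: SomeProvider, model: "example-model"}
IO.inspect(StreamOutcome.handle_event([:skill_kit, :llm, :stream, :stop], %{duration: d}, meta, %{}))
```

In a real application the module would be attached with SkillKit.Telemetry.attach_many/4, passing &StreamOutcome.handle_event/4 as the handler.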

Anthropic events

These events are emitted by the HTTP client layer regardless of which SkillKit agent triggered the request.

Event                               Kind            Description
[:anthropic, :request, :start]      span start      Before an API request is sent
[:anthropic, :request, :stop]       span stop       After a successful response
[:anthropic, :request, :exception]  span exception  On request failure or exception
[:anthropic, :rate_limited]         point           A 429 response triggered an automatic retry

Measurements and metadata

Event                 Measurements                            Metadata keys
:request, :start      :system_time                            (provider-defined)
:request, :stop       :duration                               (provider-defined)
:request, :exception  :duration, :kind, :reason, :stacktrace  (provider-defined)
:rate_limited         :retry_after (ms), :attempt (integer)   :endpoint (string)
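
To illustrate how the :rate_limited measurements and metadata might be consumed, here is a minimal sketch that tallies retries per endpoint in an Agent. RateLimitTally is a hypothetical module, and the endpoint string is a sample value; the exact format of :endpoint is not specified here.

```elixir
defmodule RateLimitTally do
  # Hypothetical module, not part of SkillKit or the Anthropic client.
  use Agent

  def start_link(_opts \\ []) do
    Agent.start_link(fn -> %{} end, name: __MODULE__)
  end

  # Handler: bump the per-endpoint count and remember the highest attempt seen.
  def handle_event([:anthropic, :rate_limited], %{attempt: attempt}, %{endpoint: endpoint}, _config) do
    Agent.update(__MODULE__, fn tally ->
      Map.update(tally, endpoint, %{count: 1, max_attempt: attempt}, fn entry ->
        %{count: entry.count + 1, max_attempt: max(entry.max_attempt, attempt)}
      end)
    end)
  end

  # Return the current tally as a map of endpoint => %{count, max_attempt}.
  def snapshot, do: Agent.get(__MODULE__, & &1)
end

# Feed two sample events to the handler and inspect the tally.
{:ok, _pid} = RateLimitTally.start_link()
event = [:anthropic, :rate_limited]
RateLimitTally.handle_event(event, %{retry_after: 1_000, attempt: 1}, %{endpoint: "/v1/messages"}, %{})
RateLimitTally.handle_event(event, %{retry_after: 2_000, attempt: 2}, %{endpoint: "/v1/messages"}, %{})
IO.inspect(RateLimitTally.snapshot())
```

Telemetry handlers run synchronously in the process that emits the event, so the blocking Agent.update/2 call adds a small delay to each retry; Agent.cast/2 is an alternative if that matters.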

Attaching handlers

SkillKit.Telemetry.attach_many/4 delegates to :telemetry.attach_many/4. Handler functions take four arguments, (event, measurements, metadata, config), where config is the term passed as the last argument to attach_many/4.

SkillKit.Telemetry.attach_many(
  :my_app_telemetry,
  [
    [:skill_kit, :turn, :stop],
    [:skill_kit, :llm_request, :stop],
    [:anthropic, :rate_limited]
  ],
  &MyApp.TelemetryHandler.handle_event/4,
  %{}
)

# Cleanup:
SkillKit.Telemetry.detach(:my_app_telemetry)
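
The config argument is a convenient place for per-handler settings. The sketch below uses it to flag slow turns; SlowTurnDetector and the :threshold_ms key are assumptions of this example, not SkillKit features. The return tuples exist only so the function can be called directly, since telemetry ignores handler return values.

```elixir
defmodule SlowTurnDetector do
  # Hypothetical handler; the :threshold_ms config key is an assumption of this sketch.
  require Logger

  def handle_event([:skill_kit, :turn, :stop], %{duration: d}, meta, %{threshold_ms: threshold}) do
    # Durations are in :native units, so convert before comparing.
    ms = System.convert_time_unit(d, :native, :millisecond)

    if ms > threshold do
      Logger.warning("[#{meta.agent_name}] slow turn: #{ms}ms (threshold #{threshold}ms)")
      {:slow, ms}
    else
      {:ok, ms}
    end
  end
end

# Call the handler directly with a sample 750 ms turn and a 500 ms threshold.
d = System.convert_time_unit(750, :millisecond, :native)
IO.inspect(SlowTurnDetector.handle_event([:skill_kit, :turn, :stop], %{duration: d}, %{agent_name: "my_agent"}, %{threshold_ms: 500}))
```

To wire it up, pass &SlowTurnDetector.handle_event/4 and a map such as %{threshold_ms: 500} as the last two arguments to SkillKit.Telemetry.attach_many/4.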

Alternatively, implement SkillKit.Telemetry.Handler to create a supervised GenServer handler:

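defmodule MyApp.Handlers.TurnLogger do
  use SkillKit.Telemetry.Handler, events: [
    [:skill_kit, :turn, :stop]
  ]

  # Logger's macros must be required before use.
  require Logger

  @impl true
  def handle_event([:skill_kit, :turn, :stop], measurements, metadata) do
    # Durations arrive in :native units, so convert before reporting milliseconds.
    ms = System.convert_time_unit(measurements.duration, :native, :millisecond)
    Logger.info("[#{metadata.agent_name}] turn completed in #{ms}ms")
    :ok
  end
end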

Add it to your supervision tree and it will subscribe automatically on startup.


Testing telemetry

SkillKit.TelemetryHelper wires up a per-test telemetry handler that forwards events to the test process as messages.

defmodule MyApp.AgentTest do
  use ExUnit.Case, async: true

  import SkillKit.TelemetryHelper

  setup :telemetry

  @tag telemetry: [
    [:skill_kit, :turn, :stop],
    [:skill_kit, :tool_use, :stop]
  ]
  test "agent emits turn and tool_use spans" do
    # ... trigger agent activity ...

    assert_receive {__MODULE__, [:skill_kit, :turn, :stop], meta}
    assert meta.agent_name == "my_agent"

    assert_receive {__MODULE__, [:skill_kit, :tool_use, :stop], _meta}
  end
end

setup :telemetry is a no-op when no @tag telemetry: is present, so it is safe in a shared setup block. Handlers are detached after each test.
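
The forwarding mechanism can be pictured with the sketch below: a handler whose config carries the test process pid and relays each event as a message. ForwardingSketch is purely illustrative and the real SkillKit.TelemetryHelper implementation may differ; the {tag, event, metadata} message shape mirrors the assert_receive patterns above.

```elixir
defmodule ForwardingSketch do
  # Hypothetical illustration; not the actual SkillKit.TelemetryHelper code.

  # Handler body: relay the event to the pid carried in the handler config.
  def forward(event, _measurements, metadata, %{pid: pid, tag: tag}) do
    send(pid, {tag, event, metadata})
    :ok
  end
end

# Simulate one event delivery to the current process.
config = %{pid: self(), tag: MyApp.AgentTest}
ForwardingSketch.forward([:skill_kit, :turn, :stop], %{duration: 1}, %{agent_name: "my_agent"}, config)

receive do
  {MyApp.AgentTest, event, meta} -> IO.inspect({event, meta.agent_name})
after
  100 -> IO.puts("no event received")
end
```

Because each message goes only to the pid stored in the handler config, tests running concurrently do not see each other's events, provided each test attaches under its own handler ID.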


Example: logger and metrics

A handler module that logs key events:

defmodule MyApp.TelemetryLogger do
  require Logger

  @events [
    [:skill_kit, :turn, :stop],
    [:skill_kit, :llm_request, :stop],
    [:anthropic, :rate_limited]
  ]

  # Use a remote capture: :telemetry logs a warning for local-function handlers.
  def attach, do: SkillKit.Telemetry.attach_many(__MODULE__, @events, &__MODULE__.handle_event/4, %{})

  def handle_event([:skill_kit, :turn, :stop], %{duration: d}, %{agent_name: name}, _) do
    Logger.info("[#{name}] turn completed in #{System.convert_time_unit(d, :native, :millisecond)}ms")
  end

  def handle_event([:skill_kit, :llm_request, :stop], %{duration: d}, meta, _) do
    Logger.debug("[#{meta.agent_name}] LLM request in #{System.convert_time_unit(d, :native, :millisecond)}ms")
  end

  def handle_event([:anthropic, :rate_limited], %{retry_after: ms, attempt: n}, _, _) do
    Logger.warning("Rate limited — retrying in #{ms}ms (attempt #{n})")
  end
end

For structured metrics with :telemetry_metrics (Prometheus, StatsD, etc.):

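# Assumes `alias Telemetry.Metrics` in the enclosing module.
def metrics do
  [
    Metrics.distribution("skill_kit.turn.stop.duration",
      unit: {:native, :millisecond}, tags: [:agent_name]),
    Metrics.distribution("skill_kit.tool_use.stop.duration",
      unit: {:native, :millisecond}, tags: [:agent_name]),
    # The last name segment is the measurement, so the event here is
    # [:anthropic, :rate_limited]; a counter ignores the measurement's value.
    Metrics.counter("anthropic.rate_limited.attempt", tags: [:endpoint])
  ]
end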