ObanPowertools.Telemetry.metrics/0 is opt-in and reporter-agnostic. It returns a list of Telemetry.Metrics definitions covering the five frozen Powertools control-plane event families. Powertools does not start a Telemetry supervisor or bundle a reporter — the host owns both. Swap in any reporter your stack already uses.

Wire it up

Add :telemetry_metrics to your host application's deps:

# mix.exs
defp deps do
  [
    {:telemetry_metrics, "~> 1.0"},
    # Optional: periodic VM and custom measurements
    {:telemetry_poller, "~> 1.0"}
  ]
end

Then mount ObanPowertools.Telemetry.metrics/0 inside your own Telemetry supervisor. Use Telemetry.Metrics.ConsoleReporter for local smoke-testing — it ships with :telemetry_metrics and requires no extra dependency. Replace it with your production reporter (Prometheus, StatsD, Datadog, etc.) before go-live.

defmodule MyApp.Telemetry do
  use Supervisor

  def start_link(arg) do
    Supervisor.start_link(__MODULE__, arg, name: __MODULE__)
  end

  def init(_arg) do
    children = [
      # Swap ConsoleReporter for your chosen reporter in production
      {Telemetry.Metrics.ConsoleReporter,
       metrics: ObanPowertools.Telemetry.metrics()},
      # Optional: periodic VM measurements
      {:telemetry_poller,
       measurements: periodic_measurements(),
       period: 10_000}
    ]

    Supervisor.init(children, strategy: :one_for_one)
  end

  defp periodic_measurements do
    [
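      {:process_info,
       event: [:my_app, :vm],
       # :name must be a registered process name (an atom), not a pid;
       # this supervisor registers itself as __MODULE__ in start_link/1
       name: __MODULE__,
       keys: [:message_queue_len, :memory]}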
    ]
  end
end

Add MyApp.Telemetry to your application's supervision tree:

# lib/my_app/application.ex
children = [
  # ... your existing children ...
  MyApp.Telemetry
]

ObanPowertools.Telemetry.metrics/0 raises a clear error if :telemetry_metrics is not loaded — it does not silently return an empty list.
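
The optional poller in MyApp.Telemetry emits [:my_app, :vm] events, but a reporter only sees events that some metric definition names. A minimal sketch of host-side definitions for those measurements follows; the module name MyApp.Telemetry.HostMetrics is hypothetical and not part of Powertools.

```elixir
defmodule MyApp.Telemetry.HostMetrics do
  # Hypothetical host-side helper; not part of Powertools.
  import Telemetry.Metrics

  # "my_app.vm.memory" resolves to event [:my_app, :vm], measurement :memory,
  # matching the process_info keys configured for the poller.
  def metrics do
    [
      last_value("my_app.vm.memory", unit: :byte),
      last_value("my_app.vm.message_queue_len")
    ]
  end
end
```

To report both sets, pass `ObanPowertools.Telemetry.metrics() ++ MyApp.Telemetry.HostMetrics.metrics()` as the reporter's `:metrics` option.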

The four golden signals for Oban-backed work

Latency, throughput, errors, and saturation are the four golden signals for any Oban-backed system. Three of the four come from Oban-core events, not Powertools.

| Signal | Event | Measurement |
| --- | --- | --- |
| Latency | [:oban, :job, :stop] | :duration (execution), :queue_time (wait) |
| Throughput | [:oban, :job, :stop] | :count over time |
| Errors | [:oban, :job, :exception] | :count by worker/queue |
| Saturation | host-sourced queue depth | deferred to v1.9 (QRY-06) |

These events are emitted by Oban, not by Powertools. ObanPowertools.Telemetry.metrics/0 deliberately does not re-emit them — duplicating Oban-core signals would create drift. Parapet's Universal Phoenix Metrics and every standard reporter already instrument these events out-of-the-box.

For saturation (live available/executing counts), a query-backed measurement over oban_jobs is deferred to v1.9 and will require an optional oban_met read source. It is not part of this phase.
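
If your reporter does not already cover the Oban-core events, the host can define them itself. The sketch below is one possible shape, not something Powertools ships. The module name MyApp.ObanCoreMetrics is hypothetical, and it assumes the event metadata carries the Oban.Job struct under :job.

```elixir
defmodule MyApp.ObanCoreMetrics do
  # Hypothetical host-side module covering latency, throughput, and errors.
  import Telemetry.Metrics

  def metrics do
    [
      # Latency: execution time and time spent waiting in the queue
      summary("oban.job.stop.duration",
        unit: {:native, :millisecond},
        tags: [:queue, :worker],
        tag_values: &job_tags/1
      ),
      summary("oban.job.stop.queue_time",
        unit: {:native, :millisecond},
        tags: [:queue, :worker],
        tag_values: &job_tags/1
      ),
      # Throughput: a counter counts events and ignores the measurement value
      counter("oban.job.stop.duration", tags: [:queue, :worker], tag_values: &job_tags/1),
      # Errors: one count per job that raised, exited, or threw
      counter("oban.job.exception.duration", tags: [:queue, :worker], tag_values: &job_tags/1)
    ]
  end

  # Assumption: metadata contains the job struct under :job.
  defp job_tags(%{job: job}), do: %{queue: job.queue, worker: job.worker}
end
```

Queue and worker names are bounded by your codebase, so they stay low-cardinality; avoid tagging by job id or args here for the same reason the Powertools contract excludes them.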

Powertools control-plane SLIs

ObanPowertools.Telemetry.metrics/0 contributes the control-plane SLIs that Oban-core cannot see: what your limiters, lifeline repair pipeline, workflows, and cron scheduler are doing. These are the events Oban itself is unaware of.

All tags are low-cardinality string values (e.g. scope: "partitioned", outcome: "ok"). The frozen contract explicitly excludes job_id, args, preview tokens, and free-form reasons.

Limiter saturation

| Metric | Tags | What it tracks |
| --- | --- | --- |
| oban_powertools.limiter.blocked.count | blocker_code, resource, scope | Job enqueues blocked by limiter |
| oban_powertools.limiter.released.count | resource, scope | Limiter reservations released |
| oban_powertools.limiter.cooled_down.count | resource, scope | Limiter buckets cooled down |

Use blocker_code to distinguish limit_reached from window_exhausted. Use resource and scope to isolate which partitioned or global limiter is firing.

Lifeline repair and incident outcomes

| Metric | Tags | What it tracks |
| --- | --- | --- |
| oban_powertools.lifeline.repair_previewed.count | action, incident_class, target_type | Repair actions previewed |
| oban_powertools.lifeline.repair_executed.count | action, incident_class, target_type | Repair actions executed |
| oban_powertools.lifeline.archive_prune_completed.count | outcome | Archive-prune cycle completions |
| oban_powertools.lifeline.heartbeat_refresh.count | (none) | Heartbeat refresh cycles |
| oban_powertools.lifeline.incident_projection.count | (none) | Incident projection cycles |

incident_class values (e.g. workflow_stuck, orphaned_job) let you break down repair activity by the type of issue being resolved. heartbeat_refresh and incident_projection counters are useful liveness signals confirming the Lifeline process is cycling.

Workflow terminal causes

| Metric | Tags | What it tracks |
| --- | --- | --- |
| oban_powertools.workflow.workflow_terminal.count | state, outcome, terminal_cause, semantics_version | Workflows reaching terminal state |
| oban_powertools.workflow.step_completed.count | outcome, terminal_cause, semantics_version | Individual workflow steps completed |
| oban_powertools.workflow.step_unblocked.count | scope, state, semantics_version | Workflow steps unblocked by dependency |
| oban_powertools.workflow.cascade_cancelled.count | scope, outcome, terminal_cause, semantics_version | Cascade cancellations |

terminal_cause (e.g. "completed", "failed_after_retries", "cancelled_by_dependency") gives you visibility into why workflows ended — not just that they did.

Cron schedule events

| Metric | Tags | What it tracks |
| --- | --- | --- |
| oban_powertools.cron.slot_claimed.count | source, overlap_policy, catch_up_policy | Cron slots claimed at schedule time |
| oban_powertools.cron.paused.count | source, overlap_policy | Cron entries paused by operator |
| oban_powertools.cron.resumed.count | source, overlap_policy | Cron entries resumed by operator |
| oban_powertools.cron.run_now.count | source, overlap_policy | Cron entries triggered run-now by operator |

Operator actions

| Metric | Tags | What it tracks |
| --- | --- | --- |
| oban_powertools.operator_action.complete.count | action, source | Operator actions completed (pause, resume, run-now) |
| oban_powertools.operator_action.previewed.count | action, source | Operator actions previewed |
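
To watch the raw events behind the metrics listed in this section during local development, one option is to attach a logging handler to every event the definitions reference. This is a rough sketch: MyApp.TelemetryDebug and the handler id are hypothetical names, and it relies only on the :event_name field that every Telemetry.Metrics definition struct carries.

```elixir
defmodule MyApp.TelemetryDebug do
  # Hypothetical host-side helper for local debugging; not part of Powertools.
  require Logger

  @handler_id "my-app-powertools-debug"

  # Pass ObanPowertools.Telemetry.metrics() (or any list of metric definitions).
  def attach(metrics) do
    events =
      metrics
      |> Enum.map(& &1.event_name)
      |> Enum.uniq()

    # Returns :ok, or {:error, :already_exists} if the id is already attached.
    :telemetry.attach_many(@handler_id, events, &__MODULE__.handle_event/4, nil)
  end

  def handle_event(event, measurements, metadata, _config) do
    Logger.debug("#{inspect(event)} #{inspect(measurements)} #{inspect(metadata)}")
  end

  def detach, do: :telemetry.detach(@handler_id)
end
```

Call `MyApp.TelemetryDebug.attach(ObanPowertools.Telemetry.metrics())` from an IEx session and `MyApp.TelemetryDebug.detach()` when done. Logging full metadata is fine for a smoke test but is not something to leave attached in production.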

Feeding Parapet SLOs

Powertools telemetry drops cleanly into Parapet SLO alerting because the frozen contract already satisfies Parapet's core tenets:

  • Telemetry as a Strict Public API — the @contract is versioned, documented, and SemVer-governed. Event names and tag sets do not change without a major version.
  • Cardinality Safety — the contract explicitly excludes job_id, args, preview tokens, and free-form reasons. Every tag is a low-cardinality string enum.

No oban_met dependency is required or referenced. The metrics surface described here uses only :telemetry, :telemetry_metrics, and the Powertools event families. oban_met is an optional live queue-depth read source for a future release (v1.9, QRY-06) and has no connection to these control-plane SLIs.

Example: repair success rate SLO

Track the ratio of successful to total repair executions — a burn-rate SLO on your Lifeline pipeline:

# Prometheus/Grafana example (no Parapet required — any reporter works)
#
# SLO: 99% of repair_executed events should have outcome = "ok"
# Alert if error budget burns faster than your chosen rate
#
# Numerator:   oban_powertools_lifeline_repair_executed_count{outcome="ok"}
#              (assumes repair_executed carries an outcome tag, as this SLO requires;
#              add action, incident_class, or target_type matchers only to narrow scope)
# Denominator: oban_powertools_lifeline_repair_executed_count (all outcomes)
#
# In Parapet (one consumer — works identically with Prometheus + Grafana without Parapet):
#
# Parapet.SLO.define(:lifeline_repair_success,
#   target: 0.99,
#   window: :rolling_28d,
#   good_events: [
#     metric: "oban_powertools.lifeline.repair_executed.count",
#     tag_filters: [outcome: "ok"]
#   ],
#   total_events: [
#     metric: "oban_powertools.lifeline.repair_executed.count"
#   ]
# )

Because the event names are stable, documented public API and the tag sets are low-cardinality, any alerting system (Parapet, Prometheus alerting rules, Datadog monitors) can consume them with confidence that the signal will not change without a SemVer-major announcement.
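
The burn-rate arithmetic behind such an alert is simple enough to sketch directly. The module below is illustrative only and belongs to neither Powertools nor Parapet. It assumes you already have good and total event counts for a window from your reporter.

```elixir
defmodule SloMath do
  @moduledoc "Illustrative SLO arithmetic; not part of Powertools or Parapet."

  # Fraction of good events; an empty window is treated as fully compliant.
  def success_ratio(_good, 0), do: 1.0
  def success_ratio(good, total) when total > 0, do: good / total

  # Burn rate = observed error ratio / allowed error ratio (1 - target).
  # A value of 1.0 spends the error budget exactly over the SLO window;
  # higher values exhaust it proportionally sooner.
  def burn_rate(good, total, target) when target > 0 and target < 1 do
    (1.0 - success_ratio(good, total)) / (1.0 - target)
  end
end

# 980 successful repairs out of 1000 against a 99% target:
# a 2% error ratio against a 1% budget, so a burn rate of about 2.
SloMath.burn_rate(980, 1000, 0.99)
```

A common pattern is to alert when the burn rate over a short window exceeds a threshold you choose, then confirm against a longer window to filter out brief spikes.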

Parapet is one consumer, not a coupling. The same metrics work equally well with Prometheus and Grafana alone.

What this is not

  • metrics/0 returns metric definitions, not a reporter or a running process.
  • Powertools does not start a Telemetry supervisor. The host owns the supervision tree.
  • Reporter choice is the host's. The library never bundles telemetry_metrics_prometheus, StatsD exporters, or any other reporter.
  • Tag values are strings (e.g. scope: "partitioned", outcome: "ok"), not atoms.
  • Powertools does not re-emit Oban-core [:oban, :job, :stop|:exception] events. Golden-signal latency, throughput, and error-rate come from Oban itself.
  • oban_met is not required and is not referenced by this feature.