CouncilEx exposes two complementary observability layers: a PubSub event bus (structured messages on a per-run topic) and :telemetry spans (standard Erlang/Elixir instrumentation). Both are zero-cost when unused.


PubSub events

Ten events fire on the run-scoped topic "council_ex:run:#{run_id}":

Event               When
:run_started        Once, before any round begins
:round_started      At the start of each round
:member_started     Before each member executes in a round
:member_token       Each streaming chunk (streaming members only)
:tool_call_request  Just before each tool execution
:tool_call_result   After each tool execution (success or error)
:member_completed   After each member's turn finishes
:round_completed    After all members in a round finish
:run_completed      Once, on the success terminal path
:run_failed         Once, on any non-success terminal path

:member_token only fires for members configured with stream: true.

:member_completed carries the full %CouncilEx.MemberResult{} and fires for both successful and failed members — inspect member_result.status (:ok | :error | :timeout | :skipped | :invalid_output | :eliminated) to discriminate. This replaces the v0.6 split between :member_completed (ok-only) and :member_failed (error-only).
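
As a minimal sketch of discriminating on the status field, the module below sorts :member_completed messages into coarse outcomes. The module name, the function name, and the three outcome buckets (in particular, treating :skipped and :eliminated as "not run") are assumptions made for illustration, not part of CouncilEx. Matching on %{status: _} is used so the sketch runs without the library; the same pattern also matches a struct that has a :status field.

```elixir
defmodule MemberStatusSketch do
  # Hypothetical helper, not part of CouncilEx.
  # Statuses treated here as failures (an illustrative grouping).
  @failure_statuses [:error, :timeout, :invalid_output]

  # Successful member turn.
  def classify({:member_completed, _run_id, _round_name, member_id, %{status: :ok}}),
    do: {:succeeded, member_id}

  # Member ran but did not produce a usable result.
  def classify({:member_completed, _run_id, _round_name, member_id, %{status: status}})
      when status in @failure_statuses,
      do: {:failed, member_id, status}

  # Member did not take a turn (assumed reading of these two statuses).
  def classify({:member_completed, _run_id, _round_name, member_id, %{status: status}})
      when status in [:skipped, :eliminated],
      do: {:not_run, member_id, status}

  # Any other message is left to the caller.
  def classify(_other), do: :ignore
end
```

A subscriber could call MemberStatusSketch.classify/1 on each received message and act only on the {:failed, member_id, status} results.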

:run_completed and :run_failed are mutually exclusive terminal events.

Subscribing:

:ok = CouncilEx.PubSub.subscribe("council_ex:run:#{run_id}")

receive do
  {:run_started, ^run_id, council_module, input} -> ...
  {:round_started, ^run_id, round_name, idx} -> ...
  {:member_started, ^run_id, round_name, member_id} -> ...
  {:member_token, ^run_id, round_name, member_id, %CouncilEx.StreamChunk{}} -> ...
  {:tool_call_request, ^run_id, round_name, member_id, %CouncilEx.ToolCall{}} -> ...
  {:tool_call_result, ^run_id, round_name, member_id, %CouncilEx.ToolCallResult{}} -> ...
  {:member_completed, ^run_id, round_name, member_id, %CouncilEx.MemberResult{}} -> ...
  {:round_completed, ^run_id, round_name, %CouncilEx.RoundResult{}} -> ...
  {:run_completed, ^run_id, %CouncilEx.Result{}} -> ...
  {:run_failed, ^run_id, [%CouncilEx.Error{}, ...], %CouncilEx.Result{}} -> ...
end

Subscribe before calling CouncilEx.start/2 to avoid missing :run_started. Run IDs are returned from CouncilEx.start/2 and also embedded in CouncilEx.run/2 results.
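
Because :run_completed and :run_failed are the only terminal events, a subscriber can loop on its mailbox until one of them arrives. The sketch below shows one way to do that, assuming the message shapes listed above. RunWatcherSketch and await_terminal/2 are hypothetical names, and the timeout applies to each wait between messages rather than to the run as a whole.

```elixir
defmodule RunWatcherSketch do
  # Hypothetical helper, not part of CouncilEx.
  # Drains events for `run_id` from the caller's mailbox until a terminal
  # event arrives, counting :member_completed messages along the way.
  def await_terminal(run_id, timeout \\ 5_000), do: loop(run_id, timeout, 0)

  defp loop(run_id, timeout, completed) do
    receive do
      {:member_completed, ^run_id, _round_name, _member_id, _member_result} ->
        loop(run_id, timeout, completed + 1)

      {:run_completed, ^run_id, result} ->
        {:ok, result, completed}

      {:run_failed, ^run_id, errors, _result} ->
        {:error, errors, completed}

      # Any other event tuple whose second element is this run's id.
      event when is_tuple(event) and tuple_size(event) >= 2 and elem(event, 1) == run_id ->
        loop(run_id, timeout, completed)
    after
      # No matching message within `timeout` milliseconds.
      timeout -> {:error, :timeout, completed}
    end
  end
end
```

In an application, the calling process would first subscribe to the run topic with CouncilEx.PubSub.subscribe/1 and then call RunWatcherSketch.await_terminal(run_id).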


Event surface

The full event surface is documented in CouncilEx.Events. It is frozen as of v0.7 and serves as the stable extension contract: host apps may safely build persistence layers, dashboards, and durable-execution integrations against it. Future versions may add new events but will not remove or rename existing events without a major version bump.

Event ordering guarantees:

:run_started
 :round_started (round 1)
    :member_started × N
    :member_token × M       (streaming members)
    :tool_call_request × K  (tool-calling members)
    :tool_call_result × K
    :member_completed × N
    :round_completed
 :round_started (round 2)
    ...
 :run_completed | :run_failed

Ordering is guaranteed within a single run (broadcasts originate from a single RunServer process). Across rounds: strictly ordered. Within a round: member event groups may interleave with other members, but each member's own events are ordered.
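
Since events from different members may interleave within a round while each member's own events stay ordered, a consumer that wants per-member timelines can group events by round and member. The following is a sketch under that assumption. MemberTimelineSketch and by_member/1 are hypothetical names, and the tuple positions follow the message shapes shown in the Subscribing example.

```elixir
defmodule MemberTimelineSketch do
  # Hypothetical helper, not part of CouncilEx.
  # Event kinds whose third element is the round name and fourth the member id.
  @member_events [
    :member_started,
    :member_token,
    :tool_call_request,
    :tool_call_result,
    :member_completed
  ]

  # Takes a run's events in arrival order and returns a map of
  # {round_name, member_id} => [event_kind, ...], preserving arrival order
  # within each member's list.
  def by_member(events) do
    events
    |> Enum.filter(fn event ->
      is_tuple(event) and tuple_size(event) >= 4 and elem(event, 0) in @member_events
    end)
    |> Enum.group_by(
      fn event -> {elem(event, 2), elem(event, 3)} end,
      fn event -> elem(event, 0) end
    )
  end
end
```

Enum.group_by/3 keeps elements in their original order within each group, so an interleaved stream still yields an ordered list per member.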


Phoenix.PubSub adapter

By default CouncilEx uses an internal :pg-based PubSub. Apps with an existing Phoenix.PubSub server can route CouncilEx events through it:

# In your supervision tree:
children = [
  {Phoenix.PubSub, name: MyApp.PubSub},
  # ...
]

# In config:
config :council_ex,
  pubsub: {CouncilEx.PubSub.Phoenix, name: MyApp.PubSub}

phoenix_pubsub is listed as an optional: true dependency; it is only pulled in if the host app declares it. CouncilEx never starts a PubSub server itself — the host app owns the supervision tree.

See examples/phoenix_pubsub_example.exs for a runnable end-to-end example, and docs/RUNNING_IN_PHOENIX.md for broader Phoenix integration guidance.


Default telemetry logger

CouncilEx emits :telemetry spans for four event kinds:

Kind       Event prefix
:run       [:council_ex, :run, :start|:stop|:exception]
:round     [:council_ex, :round, :start|:stop|:exception]
:member    [:council_ex, :member, :start|:stop|:exception]
:provider  [:council_ex, :provider, :request, :start|:stop|:exception]

CouncilEx.Telemetry ships a built-in Logger handler that forwards all spans to Logger:

# Attach Logger handlers to all CouncilEx telemetry events.
:ok = CouncilEx.Telemetry.attach_default_logger()

# Subset of event kinds + custom log level:
:ok = CouncilEx.Telemetry.attach_default_logger(events: [:run, :round], level: :debug)

# Only provider spans:
:ok = CouncilEx.Telemetry.attach_default_logger(events: [:provider])

# Detach:
:ok = CouncilEx.Telemetry.detach_default_logger()

Options:

Option   Default                             Description
:level   :info                               Logger level for :start / :stop events
:events  [:run, :round, :member, :provider]  Subset of event kinds to attach

Both functions return :ok.

:exception phases always log at :warning regardless of the :level option.

Re-attach is idempotent: calling attach_default_logger/0,1 again detaches any existing handlers first, so it is safe to call at startup without guard logic.

[:council_ex, :member, :stop] measurements

The member stop event carries a self-contained observability row — enough to persist a full trace entry without replaying round events:

# measurements
%{
  duration:      integer,          # native time units
  input_tokens:  non_neg_integer,
  output_tokens: non_neg_integer
}

# metadata
%{
  run_id:        String.t,
  member_id:     atom | String.t,
  member_module: module,
  round_name:    atom,
  round_idx:     non_neg_integer,
  model:         String.t | nil,
  provider:      atom | nil,
  status:        :ok | :error | :timeout | :skipped | :invalid_output | :eliminated,
  attempts:      pos_integer,
  confidence:    float | nil
}

cost_usd is intentionally omitted — integrators compute spend from input_tokens, output_tokens, and their own pricing model.
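
As an illustration of computing spend on the integrator side, the sketch below turns one member stop event into a trace row with a derived cost. It assumes the measurements and metadata shapes listed above. The module name, the price table, and the per-million-token pricing scheme are hypothetical placeholders, not real prices and not part of CouncilEx. The function takes the same four arguments as a :telemetry handler but returns the row instead of persisting it, so it can be tried without the library.

```elixir
defmodule MemberCostSketch do
  # Hypothetical USD prices per million tokens, keyed by model name.
  # Placeholder numbers for illustration only.
  @prices %{"example-model" => %{input: 1.00, output: 2.00}}

  # Same argument shape as a :telemetry handler:
  # (event_name, measurements, metadata, config).
  def handle_event([:council_ex, :member, :stop], measurements, metadata, _config) do
    %{
      run_id: metadata.run_id,
      member_id: metadata.member_id,
      status: metadata.status,
      # :duration is in native time units; convert before storing.
      duration_ms: System.convert_time_unit(measurements.duration, :native, :millisecond),
      cost_usd: cost_usd(metadata.model, measurements)
    }
  end

  # Returns nil when the model is unknown (or nil) rather than guessing a price.
  defp cost_usd(model, %{input_tokens: input, output_tokens: output}) do
    case Map.fetch(@prices, model) do
      {:ok, price} -> (input * price.input + output * price.output) / 1_000_000
      :error -> nil
    end
  end
end
```

To wire it up, the function could be registered with :telemetry.attach/4 for the [:council_ex, :member, :stop] event, passing &MemberCostSketch.handle_event/4 as the handler and writing the returned row to storage inside a small wrapper.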


Verbose mode

Verbose mode prints a per-run debug timeline to stdout. It has zero production cost when off, because it is a pure PubSub event consumer over the run's topic.

# Summary timeline (run/round/member lifecycle + durations):
{:ok, result} = CouncilEx.run(MyCouncil, input, verbose: true)

# Full debug output (also dumps responses):
{:ok, result} = CouncilEx.run(MyCouncil, input, verbose: :debug)

Accepted values: true, :debug, or false (the default). Redirect output with the verbose_io: option (any IO.device).

Sample output for verbose: true:

 run Xy7Q started council=MyCouncil
   round independent_analysis (#0)
     alice
     alice 842ms in=120 out=380
   round independent_analysis
 run Xy7Q ok

The scripts under examples/ honor a VERBOSE=1 environment variable:

VERBOSE=1 OPENAI_API_KEY=sk-... mix run examples/debate_example.exs

Diagrams

CouncilEx.Diagram renders a council module's static topology as ASCII or as a Mermaid flowchart TB. It is also the integration point for a future live-overlay UI: to_ir/1 returns a JSON-encodable IR that a web frontend can combine with PubSub events (see the Web overlay section of the Diagrams doc).

Quick reference:

mix council.diagram MyApp.MyCouncil --ascii
mix council.diagram MyApp.MyCouncil | pbcopy    # → mermaid.live
iex> CouncilEx.Diagram.topology(MyApp.MyCouncil, format: :ascii) |> IO.puts
iex> CouncilEx.Diagram.topology(MyApp.MyCouncil) |> IO.puts
iex> CouncilEx.Diagram.to_ir(MyApp.MyCouncil)      # JSON-encodable IR

Full documentation — output samples, Mermaid shape conventions, env-var controls, mix task, iex recipes, and the web overlay event table — is in docs/DIAGRAMS.md.