All notable changes to CouncilEx are documented here.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

Unreleased

0.1.0 - 2026-05-31

First public release. CouncilEx is a framework for building multi-model, multi-agent LLM council workflows in Elixir: define a council of members, run structured rounds, aggregate or judge their outputs, and synthesize a final answer. Provider-agnostic, OTP-supervised, single-BEAM by default.

Added

Councils & runs

  • Static & dynamic councils — declare councils with the use CouncilEx DSL, or build them as data via %CouncilEx.DynamicCouncil{} (pipeable builder, JSON ser/de, registry-by-string-name).
  • Polymorphic dispatch — CouncilEx.run/3, start/3, and start_link/3 accept either a module-form council or a %DynamicCouncil{}; one execution path, identical semantics.
  • Sync + async runs — blocking run/3 for short workflows; start/3 (GenServer.start/3 semantics: unsupervised, unlinked) and start_link/3 (linked) for async, both returning {:ok, pid}. Communicate with the runner via message passing.
  • Pre-run validation — CouncilEx.validate/1 returns structured [%{path, code, message}] errors for module-form and %DynamicCouncil{} councils. start/3 gates on it, so config errors return {:error, {:invalid_council, errs}} before any process spawns or token is spent.
  • Optional run grouping — CouncilEx.Supervisor, a thin DynamicSupervisor wrapper for tenant isolation, bulk-terminate, and in-flight visibility. Runs are unsupervised by default (caller-owned pids).
  • Failure handling — per-round failure_mode: :continue | :fail_fast, retry policies, member timeouts, run-level cancel/1, and structured %CouncilEx.Error{}.
  • Sub-councils — nest a council as a member (registered name, module atom, or nested %DynamicCouncil{}) with optional input mappers.
  • Routers — dynamic next-step selection between members or rounds, declared inline or registered by name.
  • Registry — config + runtime registration of profiles, tools, schemas, routers, rounds, sub-councils, and input mappers, all resolvable by string name.
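
The validate-then-start gate described above can be sketched in plain Elixir. This is an illustration only, not CouncilEx code: the module name ValidationGateSketch, the map-shaped council, and the :empty and :missing_model codes are hypothetical. Only the error-map keys (path, code, message) and the {:error, {:invalid_council, errs}} shape come from the entry above.

```elixir
defmodule ValidationGateSketch do
  @moduledoc """
  Hypothetical stand-in for a validate-then-start flow; not the CouncilEx API.
  """

  # Collect structured errors for a council described as a plain map.
  def validate(council) do
    members = Map.get(council, :members, [])

    # Assumed rule for illustration: every member needs a string :model.
    member_errors =
      members
      |> Enum.with_index()
      |> Enum.flat_map(fn {member, idx} ->
        if is_binary(member[:model]) do
          []
        else
          [%{path: [:members, idx, :model], code: :missing_model, message: "member has no model"}]
        end
      end)

    # Assumed rule for illustration: a council needs at least one member.
    empty_errors =
      if members == [] do
        [%{path: [:members], code: :empty, message: "council has no members"}]
      else
        []
      end

    empty_errors ++ member_errors
  end

  # Gate process creation on validation, so nothing is spawned for bad config.
  def start(council) do
    case validate(council) do
      # Agent.start/1 is unlinked and unsupervised, like GenServer.start/3.
      [] -> Agent.start(fn -> council end)
      errs -> {:error, {:invalid_council, errs}}
    end
  end
end
```

Calling ValidationGateSketch.start(%{members: []}) returns the error tuple without spawning anything, while a council whose members all carry a :model returns {:ok, pid}.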

Rounds, councils & aggregators

  • Round library — IndependentAnalysis, Critique, Vote, Synthesis, WeightedSynthesis, Iterate, Ranking, PairwiseElimination, PeerReview, AnonymizedPeerReview, plus a custom-round behaviour.
  • Built-in councils — ParallelPanel, PeerReview, Voting, Specialist, Consensus, Tournament, Chairman, WeightedConsensus, JuryWithRetry.
  • Confidence-triggered retry — Councils.JuryWithRetry runs K judges in parallel and re-samples on low average confidence (default threshold 0.7, max 2 iterations). Judges do not see each other across retries — independent re-sample, not debate (respects Wu et al. Can LLM Agents Really Debate?, arXiv:2511.07784).
  • Reliability-weighted consensus — Councils.WeightedConsensus weights members by static :weight, per-member :confidence, or historical Reliability lookups. Inspired by Wu et al. Council Mode (arXiv:2604.02923); mapping in docs/COUNCIL_MODE_PAPER.md.
  • Aggregators — Plurality, Borda, Condorcet, WeightedMean, Median, PeerRanking.
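
The confidence-triggered retry idea can be illustrated with a small, self-contained sketch. It is not the JuryWithRetry implementation: the module name JuryRetrySketch, the representation of judges as zero-arity functions returning {verdict, confidence}, and the option names are hypothetical. It also assumes that "max 2 iterations" counts total sampling passes and that a plurality vote picks the final verdict. The 0.7 threshold and the independent re-sample behaviour follow the entry above.

```elixir
defmodule JuryRetrySketch do
  @default_threshold 0.7
  @default_max_iterations 2

  # `judges` is a non-empty list of zero-arity functions returning
  # {verdict, confidence}. Hypothetical shape, chosen for illustration.
  def run(judges, opts \\ []) do
    threshold = Keyword.get(opts, :threshold, @default_threshold)
    max_iterations = Keyword.get(opts, :max_iterations, @default_max_iterations)
    sample(judges, threshold, max_iterations, 1)
  end

  defp sample(judges, threshold, max_iterations, iteration) do
    # Each judge runs in its own task and never sees another judge's output.
    results =
      judges
      |> Enum.map(&Task.async/1)
      |> Task.await_many()

    avg = average_confidence(results)

    if avg >= threshold or iteration >= max_iterations do
      %{verdict: plurality(results), confidence: avg, iterations: iteration}
    else
      # Independent re-sample: earlier results are discarded, not shared.
      sample(judges, threshold, max_iterations, iteration + 1)
    end
  end

  defp average_confidence(results) do
    results |> Enum.map(&elem(&1, 1)) |> Enum.sum() |> Kernel./(length(results))
  end

  # Most frequent verdict wins; ties are broken arbitrarily in this sketch.
  defp plurality(results) do
    results
    |> Enum.frequencies_by(&elem(&1, 0))
    |> Enum.max_by(fn {_verdict, count} -> count end)
    |> elem(0)
  end
end
```

With deterministic low-confidence judges the sketch stops after the second pass and reports iterations: 2; with confident judges it returns after the first pass.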

AutoCouncil

  • AutoCouncil — opt-in routing layer: a council that picks itself. Pluggable strategies (:rules, :cascade, plus stub :embedding / :llm_classify / :llm_build) select an existing council per prompt or synthesize a fresh %DynamicCouncil{}. Same CouncilEx.run/3 entry; routing decision surfaced in result.metadata.auto.

Per-member capabilities

  • Profiles — reusable per-member capability bundles (provider, model, temperature, tools, retry); 9 prebaked profiles plus user-defined use CouncilEx.Profile modules.
  • Structured output — Ecto-schema or inline JSON Schema per member, with native responseSchema (Gemini) and tool-shaped fallback (OpenAI/Anthropic).
  • Streaming — token-level streaming with sink callbacks, integrated with the tool-loop so tool-spanning turns read as one continuous response.
  • Tool calling — parallel tool execution with concurrency + timeout knobs, multi-iteration tool-loops in both complete/2 and stream/3, and :tool_choice (:auto | :required | :none | "name").
  • RAG via tools — council-level add_council_tool/2 shares a toolset across members; per-member :tools keeps specialist corpora private. CouncilEx.Tools.InMemoryDocs is a zero-dep BM25 retrieval tool. See docs/RAG.md.
  • Per-member confidence — opt-in :confidence strategies (:self_report, :logprob) populate %MemberResult{}.confidence for downstream weighting.
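
Parallel tool execution with concurrency and timeout knobs maps naturally onto Task.async_stream/3 from the Elixir standard library. The sketch below shows that general pattern only. Whether CouncilEx uses it internally is not stated here, and the module name ParallelToolsSketch, the {name, fun} call shape, and the option defaults are assumptions.

```elixir
defmodule ParallelToolsSketch do
  # `calls` is a list of {name, zero-arity function} pairs standing in for
  # tool invocations. Hypothetical shape, chosen for illustration.
  def run(calls, opts \\ []) do
    max_concurrency = Keyword.get(opts, :max_concurrency, 4)
    timeout = Keyword.get(opts, :timeout, 1_000)

    calls
    |> Task.async_stream(fn {_name, fun} -> fun.() end,
      max_concurrency: max_concurrency,
      timeout: timeout,
      # Kill a slow tool instead of exiting the caller.
      on_timeout: :kill_task,
      # Keep results in input order so they can be zipped back to names.
      ordered: true
    )
    |> Enum.zip(calls)
    |> Enum.map(fn
      {{:ok, value}, {name, _fun}} -> {name, {:ok, value}}
      {{:exit, reason}, {name, _fun}} -> {name, {:error, reason}}
    end)
  end
end
```

A tool that exceeds the timeout comes back as {name, {:error, :timeout}} while the others still return their values. Tools that raise are not handled in this sketch, because the tasks are linked to the caller.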

Providers

Reliability & bias

  • Reliability store — CouncilEx.Reliability (ETS default, pluggable) tracks per-member historical accuracy by query features; feeds WeightedConsensus.
  • BiasDetector — diagnostic-only CouncilEx.BiasDetector.analyze/2 flags when member disagreement correlates with demographic axes. Lexicon backend in core.
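
As a rough picture of what an ETS-backed accuracy store could look like, here is a minimal sketch. It does not reproduce the CouncilEx.Reliability API: the module name ReliabilitySketch, the function names, the row layout, and the 0.5 fallback for members without history are all assumptions.

```elixir
defmodule ReliabilitySketch do
  # Create an in-memory table; rows are {{member, feature}, correct, total}.
  def new, do: :ets.new(:reliability_sketch, [:set, :public])

  # Record one graded outcome for a member on a query feature.
  def record(table, member, feature, correct?) do
    key = {member, feature}
    hit = if correct?, do: 1, else: 0
    # Atomically bump both counters, inserting {key, 0, 0} on first sight.
    :ets.update_counter(table, key, [{2, hit}, {3, 1}], {key, 0, 0})
    :ok
  end

  # Historical accuracy in 0.0..1.0, or `default` when there is no history.
  def accuracy(table, member, feature, default \\ 0.5) do
    case :ets.lookup(table, {member, feature}) do
      [{_key, correct, total}] when total > 0 -> correct / total
      _ -> default
    end
  end
end
```

A weighting step could then call ReliabilitySketch.accuracy/4 per member and use the result as that member's weight.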

Persistence (optional)

  • Optional backends — *.Ecto backends for Reliability, Registry, and Recorder, plus *.Redis for Reliability and Registry.
  • Oban worker — CouncilEx.Workers.Oban for running councils as background jobs. See docs/RUNNING_WITH_OBAN.md.
  • Config :mode knob — :single_node / :multi_node flips all backends in one place; migration + recovery helpers under CouncilEx.Persistence.

Observability

  • PubSub events — 10 frozen events on "council_ex:run:#{run_id}" (CouncilEx.Events); idempotent subscribe across :pg and Phoenix.PubSub adapters.
  • Telemetry — [:council_ex, :run | :round | :member | :tool, :*] events with full parity on the async path (~3µs/event overhead), enriched with per-member token, model, provider, round, and confidence metadata.
  • Verbose tracer — verbose: true | :debug prints a human-readable per-run timeline; pure event consumer, zero production cost when off.
  • Diagram tooling — CouncilEx.Diagram.{to_ir, topology, sequence} for both council shapes (ASCII / Mermaid / sequence), plus the mix council.diagram task. IR is React-Flow-friendly JSON.
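
To illustrate why idempotent subscribe matters on a :pg adapter, the sketch below uses OTP's :pg module directly. A process that joins the same :pg group twice is listed twice and would receive each broadcast twice. The module name PgEventsSketch, its function names, and the {:council_event, event} message shape are hypothetical, and this is not the CouncilEx.Events code. Only the topic format is taken from the entry above.

```elixir
defmodule PgEventsSketch do
  # Topic string follows the format quoted in the changelog entry.
  def topic(run_id), do: "council_ex:run:#{run_id}"

  # Start the default :pg scope if it is not already running.
  def ensure_started do
    case :pg.start_link() do
      {:ok, _pid} -> :ok
      {:error, {:already_started, _pid}} -> :ok
    end
  end

  # Join only once, so repeated subscribes do not cause duplicate deliveries.
  def subscribe(run_id) do
    group = topic(run_id)

    if self() in :pg.get_local_members(group) do
      :ok
    else
      :pg.join(group, self())
    end
  end

  # Send the event to every current member of the run's group.
  def broadcast(run_id, event) do
    for pid <- :pg.get_members(topic(run_id)), do: send(pid, {:council_event, event})
    :ok
  end
end
```

Subscribing twice and broadcasting once leaves exactly one {:council_event, event} message in the subscriber's mailbox.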

Testing