Multi-model LLM council workflows for Elixir.
Define a council of specialized members, run structured rounds of analysis, and synthesize a final answer. Works against popular providers (OpenAI, Anthropic, Gemini, Ollama, OpenRouter). Built to get richer answers from multiple models while keeping control over the process.
Inspired by Andrej Karpathy's karpathy/llm-council: the multi-stage peer-review pattern that motivated this framework.

Contents
- Features: what ships in core, ordered by abstraction layer
- Installation: adding the dep
- Quickstart: runnable OpenRouter council in 4 steps
- Examples: index of examples/*.exs by topic
- Concepts: vocabulary used throughout the rest of the doc
- Council forms: static (DSL) vs dynamic (data) councils
- Providers: OpenAI / Anthropic / Gemini / Ollama / OpenRouter
- Profiles: reusable per-member capability bundles
- Running councils: sync, async, Phoenix integration, retries
- Council topologies: pre-built templates (Specialist, Tournament, WeightedConsensus, JuryWithRetry, …)
- Per-member capabilities: structured outputs, streaming, tools
- Composition: sub-councils + adaptive routers
- Auto-routing with AutoCouncil: councils that pick themselves
- Observability: PubSub events, telemetry, verbose tracer, diagrams, introspection
- Testing: Mock provider, test helpers, capture-events
- Deployment Considerations: single-node / cluster / replica / ephemeral
- Roadmap & changelog: shipped + planned
- License
Features
Ordered roughly from core primitives → execution → per-member capabilities → reliability → observability → dev tooling.
- 🏛️ Static & dynamic councils: declare councils with the use CouncilEx DSL or build them as data via %CouncilEx.DynamicCouncil{} (pipeable builder, JSON ser/de, registry-by-string-name).
- 🔌 Multi-provider adapters: OpenAI, Anthropic, Gemini, and OpenRouter implement the CouncilEx.Provider.Adapter behaviour; Ollama ships as a config preset over the OpenAI adapter. All five are built in.
- 🥊 Round library: :independent_analysis, :peer_review, :vote, :pairwise_elimination, plus prebuilt Councils.{Specialist,Consensus,Tournament,WeightedConsensus,JuryWithRetry} and a custom-round behaviour.
- ⚖️ Confidence-triggered retry: Councils.JuryWithRetry runs K judges in parallel and re-samples on low average confidence (default threshold 0.7, max 2 iterations). Judges DO NOT see each other across retries: independent re-sample, not debate. Pattern convergent across Chaos-MoA / Adjudicator / production systems; respects Wu et al. Can LLM Agents Really Debate? (arXiv:2511.07784).
- ⚖️ Reliability-weighted consensus: Councils.WeightedConsensus weights member contributions by static :weight opts, per-member :confidence scores, or historical Reliability lookups. Inspired by Wu et al. Council Mode (arXiv:2604.02923); full mapping in docs/COUNCIL_MODE_PAPER.md.
- 🎯 Per-member confidence: opt-in :confidence strategies (:self_report, :logprob) populate %MemberResult{}.confidence for downstream weighting.
- 🔍 BiasDetector: diagnostic-only CouncilEx.BiasDetector.analyze/2 flags when member disagreement correlates with demographic axes (gender, ethnicity, religion, age, ability). Lexicon backend in core. LLM-judge and embedding-cluster backends planned.
- 📚 Reliability store: CouncilEx.Reliability (ETS default, pluggable) tracks per-member historical accuracy by query features. Feeds WeightedConsensus for adaptive weighting.
- ⚡ Sync + async runs: blocking run/3 for short workflows, start/3 (GenServer.start/3 semantics: unsupervised, unlinked) and start_link/3 (linked to caller) for async. Both return {:ok, pid}. Communicate with the runner via message passing, like any GenServer.
- 🛂 Pre-run validation: CouncilEx.validate/1 returns structured [%{path, code, message}] errors for module-form or %DynamicCouncil{} councils. start/3 gates on it so config errors return {:error, {:invalid_council, errs}} before any process spawns or token is spent.
- 🌳 Optional run grouping: CouncilEx.Supervisor is a thin DynamicSupervisor wrapper for callers who want tenant isolation, bulk-terminate, or in-flight visibility. Library has no bundled supervisor: runs are unsupervised by default (caller's responsibility, like GenServer.start/3).
- 🪆 Sub-councils: nest a council as a member; works in static and dynamic forms (registered name, module atom, or nested %DynamicCouncil{}) with optional input mappers.
- 🚦 Routers: dynamic next-step selection between members or rounds, declared inline or registered by name.
- 🤖 AutoCouncil: opt-in routing layer. A council that picks itself. Pluggable strategies (:rules, :cascade, plus stub :embedding / :llm_classify / :llm_build) select an existing council per prompt, or synthesize a fresh %DynamicCouncil{} on the fly. Same CouncilEx.run/3 entry. Routing decision surfaced in result.metadata.auto.
- 🛠️ Tool calling: parallel tool execution with concurrency + timeout knobs, multi-iteration tool-loops in both complete/2 and stream/3, and :tool_choice (:auto | :required | :none | "name").
- 📚 RAG via tools: council-level add_council_tool/2 exposes a shared toolset to every member. Per-member :tools keeps specialist corpora private. CouncilEx.Tools.InMemoryDocs is a zero-dep BM25 retrieval tool baked from a compile-time corpus, useful for examples and tests. Production retrieval should wrap your real index. See docs/RAG.md.
- 📐 Structured output: Ecto-schema or inline JSON Schema per member, with native responseSchema (Gemini) and tool-shaped fallback (OpenAI/Anthropic).
- 🌊 Streaming: token-level streaming with sink callbacks, integrated with the tool-loop so tool-spanning turns look like one continuous response.
- 🎛️ Profiles: reusable per-member capability bundles (provider, model, temperature, tools, retry); 9 prebaked profiles plus user-defined use CouncilEx.Profile modules.
- 🔀 Polymorphic dispatch: CouncilEx.run/3 and start/3 take either a module-form council or a %DynamicCouncil{}; one execution path, identical semantics.
- 🛡️ Failure handling: per-round failure_mode: :continue | :fail_fast, retry policies, member timeouts, run-level cancel/1, and structured %CouncilEx.Error{}.
- 📒 Registry: config + runtime registration of profiles, tools, schemas, routers, rounds, sub-councils, and input mappers, all resolvable by string name.
- 📡 PubSub events: 10 frozen events on "council_ex:run:#{run_id}" (CouncilEx.Events); idempotent subscribe across :pg and Phoenix.PubSub adapters.
- 📊 Telemetry: [:council_ex, :run | :round | :member | :tool, :*] events with full parity on the modern async path; ~3µs/event overhead.
- 🔍 Verbose tracer: verbose: true | :debug opt prints a human-readable per-run timeline (member start/stop, durations, tokens, tool calls). Pure event consumer, zero production cost when off.
- 🗺️ Diagram tooling: CouncilEx.Diagram.{to_ir,topology,sequence} for both council shapes; IR is React-Flow-friendly JSON.
- 🧪 Mock provider: scriptable in-memory provider for tests and example fixtures (CouncilEx.Providers.Mock.script/2); not for production use.
Installation
def deps do
  [
    {:council_ex, "~> 0.1"}
  ]
end

Real LLM providers need a configured adapter + API key (e.g. OPENAI_API_KEY,
ANTHROPIC_API_KEY, GEMINI_API_KEY, OPENROUTER_API_KEY). See
Providers.
Optional dependencies
The core (parallel rounds, aggregation, streaming, tools, telemetry/PubSub observability) needs nothing beyond the dep above. Each opt-in backend pulls its own library — add it only when you use that feature:
| Feature | Add to deps | Docs |
|---|---|---|
| Ecto persistence (Recorder/Registry/Reliability.Ecto, migrations) | {:ecto_sql, "~> 3.13"} + a driver, e.g. {:postgrex, "~> 0.20"} | PERSISTENCE.md |
| Durable background runs | {:oban, "~> 2.19"} | RUNNING_WITH_OBAN.md |
| Redis backends (Registry/Reliability.Redis) | {:redix, "~> 1.5"} | - |
| Route events through your own PubSub | {:phoenix_pubsub, "~> 2.1"} | RUNNING_IN_PHOENIX.md |
These are declared optional: true, so they are not installed transitively
— including under Mix.install (e.g. in a Livebook). council_ex compiles fine
without them; the relevant modules are simply omitted until the dep is present.
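As a rough illustration of combining rows from the table, a project that wants Ecto persistence and Oban-backed runs might declare its deps as below. The version requirements are copied from the table; the choice of Postgres as the driver is an assumption made for the example.

```elixir
# mix.exs (fragment): a hypothetical app enabling two opt-in backends
def deps do
  [
    {:council_ex, "~> 0.1"},
    # Ecto persistence (Recorder/Registry/Reliability.Ecto, migrations)
    {:ecto_sql, "~> 3.13"},
    {:postgrex, "~> 0.20"},
    # Durable background runs
    {:oban, "~> 2.19"}
  ]
end
```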
Quickstart
This walkthrough uses OpenRouter to answer the meta-question: when
should you use an LLM council instead of a single model call?
OpenRouter is the easiest way to start. One API key reaches every major
frontier model (openai/gpt-4o, anthropic/claude-sonnet-4-6,
google/gemini-2.5-flash, meta-llama/llama-3.3-70b-instruct, etc.), so a multi-model
council needs no extra wiring. The same council code runs against OpenAI,
Anthropic, Gemini, or Ollama directly. See Providers.
# 1. Configure OpenRouter. Set OPENROUTER_API_KEY in your shell.
Application.put_env(:council_ex, :providers,
  openrouter: [
    adapter: CouncilEx.Provider.Adapters.OpenRouter,
    api_key: {:system, "OPENROUTER_API_KEY"}
  ]
)

# 2. Define members (identity: role + system prompt)
defmodule MyApp.Members.Advocate do
  use CouncilEx.Member

  role "Advocate"
  system_prompt """
  You argue FOR using a multi-model LLM council. Given the user's task,
  list 3-5 concrete situations where multiple model voices outperform a
  single call (e.g. high-stakes decisions, contested judgement, weak
  ground truth, creative divergence). Be specific. No hedging.
  """
end

defmodule MyApp.Members.Skeptic do
  use CouncilEx.Member

  role "Skeptic"
  system_prompt """
  You argue AGAINST using a multi-model LLM council. Given the user's
  task, list 3-5 concrete situations where a council is overkill or
  actively harmful (latency, cost, false consensus, deterministic
  problems with a known answer). Be specific. No hedging.
  """
end

defmodule MyApp.Members.Synthesizer do
  use CouncilEx.Member

  role "Synthesizer"
  system_prompt """
  Read the Advocate's and Skeptic's lists. Produce a short decision rule
  the reader can apply to their own task: "use a council when …, skip it
  when …". Two short paragraphs max.
  """
end

# 3. Define a council (capability: provider + model)
# Each member can run on a different frontier model. That's the
# point. OpenRouter exposes them all under one provider.
defmodule MyApp.WhenToCouncil do
  use CouncilEx

  member :advocate, MyApp.Members.Advocate,
    provider: :openrouter, model: "openai/gpt-4o-mini"
  member :skeptic, MyApp.Members.Skeptic,
    provider: :openrouter, model: "anthropic/claude-sonnet-4-6"

  round :independent_analysis

  chair MyApp.Members.Synthesizer, id: :chair,
    provider: :openrouter, model: "openai/gpt-4o"
end

# 4. Run
{:ok, result} =
  CouncilEx.run(
    MyApp.WhenToCouncil,
    %{question: "When should I use an LLM council instead of a single LLM call?"}
  )

IO.puts(result.final.content)

Example run output (VERBOSE=1 mix run examples/quickstart_example.exs)
```
VERBOSE=1 mix run examples/quickstart_example.exs
▶ run …Sd72NF started council=QuickstartCouncil.WhenToCouncil
▶ round independent_analysis (#0)
▶ advocate
▶ skeptic
✓ advocate 5547ms in=90 out=361
✓ skeptic 5839ms in=91 out=398
✓ round independent_analysis
▶ round synthesis (#1)
▶ chair
✓ chair 3825ms in=918 out=156
✓ round synthesis
✓ run …Sd72NF ok
=== Panel members (independent_analysis round) ===
[advocate]
1. **High-Stakes Decisions**: multiple model voices minimize catastrophic-error risk …
2. **Contested Judgement**: subjective calls benefit from differing viewpoints …
… (5 situations total)
[skeptic]
1. **Latency in Time-Sensitive Applications**: each model query adds delay …
2. **Cost-Efficiency in High-Volume Use Cases**: per-call costs multiply …
… (5 situations total)
=== Final synthesis (chair) ===
Use a council for complex, high-stakes, or contested decisions where multiple
perspectives or weak/disputed data justify the extra cost. Skip it when the task
demands speed, cost-efficiency, or has a clear single answer.
Total duration: 15226ms
Total tokens: 2014
```
The Advocate and Skeptic run in parallel during the
:independent_analysis round; the Synthesizer chair sees both outputs and
produces the final answer. Inspect result.rounds for each member's
verdict and result.metadata for token + timing totals.
A runnable version of this exact council lives in
examples/quickstart_example.exs.
Run it with OPENROUTER_API_KEY=sk-or-v1-... mix run examples/quickstart_example.exs.
Single-vendor variant: If you only have one vendor's API key (OPENAI_API_KEY, ANTHROPIC_API_KEY, GEMINI_API_KEY), swap the provider config in step 1 for that vendor's adapter and use that vendor's model ids (see Providers). The council code is unchanged.
Karpathy-style 3-stage council: for the opinions → anonymized peer review → chairman pattern from karpathy/llm-council, see docs/TUTORIAL_KARPATHY_COUNCIL.md and the runnable examples/karpathy_council_example.exs. Decision guide for picking between PeerReview and AnonymizedPeerReview lives at docs/PEER_REVIEW_PATTERNS.md.
Mock provider: CouncilEx.Providers.Mock exists for tests and deterministic example fixtures only. Do not use it as a stand-in for a real LLM in application code. See Test helpers.
Examples
Index of examples/*.exs. Every example runs against a real provider
(default OpenAI or OpenRouter: see the Run: comment at the top of
each file for the required API key). The Mock provider exists for
tests only; do not run it as a stand-in for an LLM in examples.
Most examples support the COUNCIL_FORM=static|dynamic env switch
(see Dual-form pattern). Examples that don't
support the switch: dynamic_council_example.exs (already dynamic),
the prebaked Councils.{Specialist,Consensus,Tournament}.new/1
wrappers (specialist, consensus, tournament), and the
council-bypass demos (parallel_tools, tool_call_events).
sub_council_example.exs also supports the switch.
Topologies & composition
- parallel_panel_example.exs: simplest panel + chair
- karpathy_council_example.exs: Karpathy llm-council 3-stage port: opinions → anonymized_peer_review (PeerRanking) → chairman. See docs/TUTORIAL_KARPATHY_COUNCIL.md
- debate_example.exs: Pro/Con + chained peer_review rounds
- peer_review_manuscript_example.exs: Councils.PeerReview.new/1 as literal scientific peer review: theorist presents → 3 distinct-lens reviewers critique → author rebuts → journal editor's decision letter
- specialist_example.exs: Councils.Specialist.new/1
- consensus_example.exs: Councils.Consensus.new/1 with convergence callback
- tournament_example.exs: Councils.Tournament.new/1
- weighted_consensus_example.exs: Councils.WeightedConsensus.new/1 with static :weight opts (Wu et al. Council Mode)
- jury_with_retry_example.exs: Councils.JuryWithRetry.new/1, K judges + confidence-triggered re-sample
- pr_review_example.exs: analyst → judge → chair topology. Per-round routers split a single roster (analysts run round 1, judges vote in round 2 with Plurality over Schemas.Vote, chair synthesizes citing tally + dissent + analyst findings; chair may override plurality on critical analyst severity)
- confidence_example.exs: per-member :self_report confidence driving WeightedConsensus weights
- bias_detector_example.exs: CouncilEx.BiasDetector.analyze/2 over a ParallelPanel on a value-laden question
- reliability_example.exs: CouncilEx.Reliability ETS store. Records outcomes, scores per (member, query_features) (no API key required)
- pairwise_direct_example.exs: raw Iterate(PairwiseElimination) composition
- router_example.exs: adaptive CouncilEx.Router
- auto_council_example.exs: CouncilEx.AutoCouncil routing across three small councils (inline rules, registry catalog, provider_check, CouncilEx.auto/1 shortcut)
- sub_council_example.exs: hierarchical sub-council member
- dynamic_sub_council_example.exs: three sub-council reference shapes (inline struct, registered name, registered name + input_mapper) for %DynamicCouncil{}
- presidential_debate_example.exs: N-member PeerReview: four candidates rebut across chained rounds, Pundit chair synthesizes
- multi_model_panel_example.exs: three vendors, one panel
- agi_debate_example.exs: all five providers (OpenAI / Anthropic / Gemini / Ollama / OpenRouter), one perspective each, debating AGI timing + post-AGI society + human-AI cooperation
Profiles & dynamic councils
- profile_example.exs: default_profile, profile: overrides, inline overrides
- creative_judge_example.exs: OpenAICreative writers + OpenAIDeterministic judge
- dynamic_council_example.exs: builder → validate → JSON round-trip → run; includes inline JSON schema member
Custom rounds & voting
- custom_round_example.exs: implements four of CouncilEx.Round's five callbacks (all but the optional converged?/3)
- vote_example.exs: :vote round with Plurality vs WeightedMean
Streaming & tools
- streaming_example.exs: OpenAI token streaming
- anthropic_streaming_example.exs: Anthropic typed-event streaming
- tool_calling_example.exs: full tool loop + tool error recovery
- tool_call_events_example.exs: per-call PubSub events (real OpenAI provider)
- rag_via_tools.exs: RAG via council-level + per-member InMemoryDocs tools (real OpenRouter provider)
- bench/parallel_tools.exs: sequential vs parallel tool exec (benchmark)
Operational concerns
- error_handling_example.exs: retry, failure_mode, cancel/1, await/2 timeout
- verbose_tutorial_example.exs: verbose: true / :debug and verbose_io: capture
- phoenix_pubsub_example.exs: pluggable PubSub backend
Per-provider quickstarts
gemini_example.exs, ollama_example.exs, openrouter_example.exs, anthropic_structured_output_example.exs, parallel_panel_real_provider.exs
Concepts
Vocabulary used throughout the rest of the README.
- Council: the workflow itself. A named ordering of members + rounds + an optional chair. Two interchangeable forms: module-form (use CouncilEx) or data-form (%CouncilEx.DynamicCouncil{}).
- Member: one LLM seat at the table. Defines identity (role, system_prompt, optional output_schema). Identity is reusable; pair it with different capability stacks via Profiles.
- Profile: capability stack (provider, model, temperature, max_tokens, tools, retry). Same Member + different Profile = same brain, different model. Resolution: inline opts > member :profile > council default_profile > app config.
- Round: one phase of the run. Built-in types: :independent_analysis (members run in parallel), :peer_review (members see each other's prior turn), :vote (each member emits a ballot, aggregator picks a winner), :pairwise_elimination (tournament bracket), plus :anonymized_peer_review, :critique, :ranking, :synthesis, :iterate, and user-defined CouncilEx.Round modules. A council can have any number of rounds.
- Chair: final synthesis member. Runs once after all rounds, sees every prior member output, and produces %Result{}.final. Optional. Councils without a chair return per-round results only.
- Router: dynamic next-step picker. Inspects state mid-run and chooses the next member or round. Inline closure or registered-by-name.
- Sub-council: a council used as a member of another council. Composes vertically: the outer council sees the sub-council's final as that member's response. Works in static + dynamic forms.
- Run: one execution of a council against an input. Identified by run_id. Sync via run/3, async via start/3 + await/2 / cancel/1.
- Result: %CouncilEx.Result{} returned from run/3 and await/2. Carries input, per-round %RoundResult{} (with per-member %MemberResult{}), final chair response, status, errors, and metadata (timings + token totals).
- Tool: Elixir module implementing CouncilEx.Tool that the model can call mid-turn. Parallel execution + multi-iteration tool-loops are built in.
- Aggregator: function that reduces a :vote round's ballots into a winner. Plurality, WeightedMean ship in core; user-defined ones plug into the same interface.
- Registry: runtime/config table of named profiles, tools, schemas, routers, rounds, sub-councils, and input mappers. Lets data-form councils reference behaviour by string name ("my_tool") instead of module atoms, required for JSON ser/de.
- Provider adapter: module behind a configured provider: key (:openai, :anthropic, …) that translates a normalized request into an HTTP call and parses the response. Implements CouncilEx.Provider.Adapter. OpenAI / Anthropic / Gemini / Ollama / OpenRouter ship in core.
- Council vs ensemble: a classical ensemble = N models in parallel + flat aggregator (one round, no roles). A council adds roles, multi-round flow, cross-member visibility, iteration, chair synthesis, sub-councils, and dynamic routing. Only the Voting topology reduces to ensemble shape; the other six add structure ensembles cannot express. See docs/COUNCILS.md for the full comparison.
- AutoCouncil: %CouncilEx.AutoCouncil{} data struct that resolves to a council at run time. Holds a :strategy (:rules, :cascade, …), a :catalog of routable councils (inline list or registry-backed), and an :on_no_match policy. From the runner's perspective it is a council. Pass it to CouncilEx.run/3 like any other. The picked council's identity surfaces in result.metadata.auto. See Auto-routing.
Council forms
CouncilEx exposes two interchangeable ways to declare a council. Both lower to the
same %CouncilEx.Spec{} and execute through the same runtime — behaviour,
telemetry, and %Result{} shape are identical.
| Pick | When |
|---|---|
| Static (use CouncilEx) | Workflow is checked into code. Members, rounds, chair, router known at compile time. |
| Dynamic (%DynamicCouncil{}) | Workflow built at runtime, persisted to a DB as JSON, edited in a UI. |
CouncilEx.run/3 and start/3 accept either form (polymorphic dispatch), so
you can switch a council from static to dynamic without touching call sites.
Static module-form
defmodule MyApp.MyCouncil do
  use CouncilEx

  default_profile CouncilEx.Profiles.OpenAIMini

  member :researcher, MyApp.Members.Researcher
  member :critic, MyApp.Members.Critic

  round :peer_review

  chair MyApp.Members.Synthesizer, profile: CouncilEx.Profiles.OpenAIBalanced
end

Full DSL macro reference (member forms, round, chair, router, default_profile,
output_schema) and prebuilt Councils.* templates (ParallelPanel, PeerReview,
Voting, Specialist, Consensus, Tournament, WeightedConsensus, JuryWithRetry):
docs/COUNCILS.md.
Dynamic form, registry, sub-councils, hybrid
docs/DYNAMIC_COUNCILS.md covers everything runtime-configurable:
- Dynamic data-form: pipeable builder (add_member/2, set_chair/2, …), JSON round-trip (to_json/2 / from_json/1), inline JSON Schema output, profile_overrides, React-Flow export (to_flow_graph/1).
- Registry: string-keyed lookup with config + runtime tiers; eight kinds (:profile, :tool, :schema, :router, :round, :sub_council, :input_mapper, :council).
- Sub-councils: nest any council (module, %DynamicCouncil{}, or registered name) as a member; :input_mapper projects input between layers.
- Hybrid form: static outer with dynamic sub-council, or dynamic outer referencing static modules; per-tenant flows and incremental migration.
- Prebuilt dynamic variants: Councils.{Specialist,Consensus,Tournament,WeightedConsensus}.new_dynamic/1 return a %DynamicCouncil{}.
- Dual-form pattern: run the same topology as static or dynamic via a COUNCIL_FORM=static|dynamic switch.
Providers
CouncilEx ships five provider adapters. Configure once in app config; route
members via the provider: opt or a Profile. The council DSL is provider-agnostic.
| Provider atom | Env var | Notes |
|---|---|---|
| :openai | OPENAI_API_KEY | Tool-calling, streaming, structured output. |
| :anthropic | ANTHROPIC_API_KEY | response_schema: and tools: are mutually exclusive per member. |
| :gemini | GEMINI_API_KEY | Native responseSchema; same mutual-exclusion as Anthropic. |
| :ollama | (none) | Config preset over the OpenAI adapter; not a separate adapter impl. |
| :openrouter | OPENROUTER_API_KEY | Thin wrapper over the OpenAI adapter; reaches any model OpenRouter routes. |
See docs/PROVIDERS.md for full config snippets, adapter quirks,
multi-provider council patterns, and the CouncilEx.Provider.Adapter behaviour
(7 required + 6 optional callbacks) for adding your own provider.
Profiles
A Profile bundles the capability stack (provider, model, temperature,
max_tokens, tools, retry) separately from the Member's identity (role, system
prompt, output schema). Nine prebaked profiles ship in CouncilEx.Profiles.*:
OpenAIBalanced, OpenAIMini, OpenAICreative, OpenAIDeterministic,
AnthropicBalanced, GeminiBalanced, OllamaLocal, OpenRouterAuto,
OpenRouterClaudeSonnet.
Resolution order (later wins): app config default → council default_profile
→ member :profile opt → inline opts.
See docs/PROFILES.md for defining custom profiles, dynamic-form
registration, profile_overrides, and the prebaked-profile capability table.
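The "later wins" resolution order above can be pictured as a chain of keyword-list merges. The sketch below is a plain-Elixir illustration, not CouncilEx's internal code: the module name ProfileResolutionSketch and the sample option values are hypothetical.

```elixir
defmodule ProfileResolutionSketch do
  @moduledoc """
  Illustrative only: models "later wins" resolution with plain keyword lists.
  This is not how CouncilEx is implemented internally.
  """

  # Layers are passed from lowest to highest precedence.
  def resolve(app_default, council_default, member_profile, inline_opts) do
    [app_default, council_default, member_profile, inline_opts]
    |> Enum.reduce([], fn layer, acc -> Keyword.merge(acc, layer) end)
  end
end

resolved =
  ProfileResolutionSketch.resolve(
    # app config default
    [provider: :openai, model: "gpt-4o-mini", temperature: 0.7],
    # council default_profile
    [temperature: 0.2],
    # member :profile opt
    [model: "gpt-4o"],
    # inline opts
    [max_tokens: 512]
  )

IO.inspect(Keyword.get(resolved, :model))
IO.inspect(Keyword.get(resolved, :temperature))
```

Here the member-level model replaces the app default, the council-level temperature replaces the app default, and keys set only at one layer pass through unchanged.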
Running councils
Start a run and block:
{:ok, result} = CouncilEx.run(MyCouncil, %{question: "go or wait?"})

Start async, stream progress events, then await:
{:ok, pid} = CouncilEx.start(MyCouncil, input, subscribe: true)
run_id = CouncilEx.RunServer.run_id(pid)
receive do
  {:round_completed, ^run_id, name, _rr} -> IO.puts("round done: #{name}")
end
{:ok, result} = CouncilEx.await(pid)

- Core API (async start/await, cancel/2 (cooperative), terminate_run/2 (non-cooperative), validate/1, start vs start_link, pid_for/2, run grouping with CouncilEx.Supervisor, retry policy): docs/RUNNING_COUNCILS.md
- Phoenix / LiveView / channels: docs/RUNNING_IN_PHOENIX.md
- Oban / background jobs: docs/RUNNING_WITH_OBAN.md
Council topologies
Nine pre-built templates (ParallelPanel, PeerReview, Voting,
Specialist, Consensus, Tournament, Chairman, WeightedConsensus,
JuryWithRetry), five aggregators (Plurality, Borda, Condorcet,
WeightedMean, Median), and the Iterate round wrapper for
convergence loops.
WeightedConsensus ports Wu et al. Council Mode (arXiv:2604.02923):
heterogeneous members aggregated by :weight / :confidence /
Reliability lookup rather than equal-weight chair synthesis. Mapping
in docs/COUNCIL_MODE_PAPER.md.
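To make the idea of non-equal weighting concrete, here is a minimal plain-Elixir sketch of a weighted tally. It is an illustration only: the module WeightedVoteSketch, the ballot shape, and the choice to multiply weight by confidence are all assumptions, and the library may combine these signals differently.

```elixir
defmodule WeightedVoteSketch do
  # Each ballot is a hypothetical map:
  #   %{answer: term, weight: number, confidence: number between 0 and 1}

  # Sums weight * confidence per distinct answer.
  def tally(ballots) do
    Enum.reduce(ballots, %{}, fn %{answer: a, weight: w, confidence: c}, acc ->
      Map.update(acc, a, w * c, &(&1 + w * c))
    end)
  end

  # Returns the answer with the highest weighted score.
  # Raises Enum.EmptyError when `ballots` is empty.
  def winner(ballots) do
    {answer, _score} =
      ballots
      |> tally()
      |> Enum.max_by(fn {_answer, score} -> score end)

    answer
  end
end

ballots = [
  %{answer: :ship, weight: 1.0, confidence: 0.9},
  %{answer: :wait, weight: 3.0, confidence: 0.6},
  %{answer: :ship, weight: 1.0, confidence: 0.5}
]

IO.inspect(WeightedVoteSketch.winner(ballots))
```

In this sample a simple head count would pick :ship (two ballots to one), while the weighted tally picks :wait because the single heavily weighted ballot outscores the other two combined.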
JuryWithRetry runs K judges and re-samples on low average confidence
(default threshold 0.7, max 2 iterations). Judges don't see each
other across iterations: a conformity mitigation that respects Wu et al.
Can LLM Agents Really Debate? (arXiv:2511.07784). The pattern is shared
with Chaos-MoA-Pipeline + Adjudicator. Full multi-paper context is in
docs/RELATED_WORK.md.
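The re-sampling behaviour described above can be sketched in plain Elixir as follows. This is not the library's implementation: JuryRetrySketch, the sample_fun callback, the verdict shape, and the return tuples are hypothetical, and the sketch assumes that "max 2 iterations" counts the first sample.

```elixir
defmodule JuryRetrySketch do
  @threshold 0.7
  @max_iterations 2

  # `sample_fun` is a hypothetical zero-arity function that returns one round
  # of judge verdicts as a non-empty list of %{verdict: term, confidence: float}.
  def run(sample_fun, iteration \\ 1) do
    verdicts = sample_fun.()
    avg = average_confidence(verdicts)

    cond do
      avg >= @threshold -> {:ok, verdicts, iteration}
      iteration >= @max_iterations -> {:low_confidence, verdicts, iteration}
      # Fresh, independent sample: earlier verdicts are not passed along.
      true -> run(sample_fun, iteration + 1)
    end
  end

  defp average_confidence(verdicts) do
    Enum.sum(Enum.map(verdicts, & &1.confidence)) / length(verdicts)
  end
end

# Demo: an Agent hands out a low-confidence sample first, then a confident one.
{:ok, agent} =
  Agent.start_link(fn ->
    [
      [%{verdict: :yes, confidence: 0.4}, %{verdict: :no, confidence: 0.5}],
      [%{verdict: :yes, confidence: 0.9}, %{verdict: :yes, confidence: 0.8}]
    ]
  end)

sample = fn -> Agent.get_and_update(agent, fn [next | rest] -> {next, rest} end) end

{status, _verdicts, iterations} = JuryRetrySketch.run(sample)
IO.inspect({status, iterations})
```

The key design point is that each retry calls the sampler again with no reference to the previous verdicts, so judges cannot converge on each other's answers.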
council =
  CouncilEx.Councils.Specialist.new(
    as: MyApp.MyCouncil,
    members: [
      {:seo, MyApp.Members.Seo, [provider: :openai, model: "gpt-4o-mini"]},
      {:tech, MyApp.Members.Tech, [provider: :openai, model: "gpt-4o-mini"]}
    ],
    chair: {MyApp.Members.Synth, [provider: :openai, model: "gpt-4o"]}
  )

{:ok, result} = CouncilEx.run(council, %{topic: "..."})

See docs/COUNCILS.md for the full topology table,
aggregator catalog, iteration semantics, and RoundResult.metadata.history
shape.
Per-member capabilities
CouncilEx members support structured outputs, streaming, and tool calling independently of one another. Full details — every default, Anthropic-specific behaviour, and PubSub event payloads — are in docs/PER_MEMBER_CAPABILITIES.md.
Structured outputs — set output_schema on a member to an Ecto embedded schema. CouncilEx.Providers.Instructor casts the LLM's JSON into that schema and runs the schema's optional validate_changeset/2; the member module's validate/1 then runs for business rules. On Anthropic, CouncilEx forces a synthetic _respond tool whose input_schema mirrors your Ecto schema; structured-output and user tools: are mutually exclusive on the same member.
Streaming — add stream true to a member. During streaming the adapter reassembles Anthropic partial_json SSE fragments; subscribers receive :member_token PubSub events carrying %CouncilEx.StreamChunk{content, index, finish_reason}. The [:council_ex, :member, :stream_chunk] telemetry event fires per chunk.
Tools — a tool implements CouncilEx.Tool (four callbacks: name/0, description/0, parameters_schema/0, execute/1). The dispatcher runs a bounded tool-call loop (default max_tool_iterations: 5); exceptions are caught by safe_execute/2 and surfaced as {:tool_raised, exception}. Multiple tool calls in one turn run in parallel by default (parallel_tools: true, strategy :collect, tool_concurrency_factor: 1.0, tool_timeout_ms: 30_000). CouncilEx.Providers.Instructor.stream/3 drives the same loop across streaming round-trips; subscribe for :tool_call_request / :tool_call_result events (the synthetic _respond tool is excluded).
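For intuition about what parallel execution with a timeout and caught exceptions looks like, here is a standalone sketch built on the standard library's Task.async_stream/3. It does not use CouncilEx's dispatcher: ParallelToolsSketch, the {name, fun} call shape, the {:error, ...} wrapping, and the use of System.schedulers_online() as the concurrency limit are all assumptions made for illustration.

```elixir
defmodule ParallelToolsSketch do
  # `calls` is a list of {name, fun} pairs; each fun is a hypothetical
  # zero-arity stand-in for one tool invocation.
  def run_all(calls, timeout_ms \\ 30_000) do
    results =
      Task.async_stream(
        calls,
        fn {_name, fun} -> safe_execute(fun) end,
        max_concurrency: System.schedulers_online(),
        timeout: timeout_ms,
        # Kill a tool that overruns instead of crashing the caller.
        on_timeout: :kill_task
      )

    # Results arrive in input order, so they can be zipped back to the names.
    calls
    |> Enum.zip(results)
    |> Enum.map(fn
      {{name, _fun}, {:ok, result}} -> {name, result}
      {{name, _fun}, {:exit, :timeout}} -> {name, {:error, :timeout}}
    end)
  end

  # Converts a raised exception into a tagged value rather than a crash.
  defp safe_execute(fun) do
    {:ok, fun.()}
  rescue
    exception -> {:error, {:tool_raised, exception}}
  end
end

results =
  ParallelToolsSketch.run_all([
    {"add", fn -> 1 + 2 end},
    {"boom", fn -> raise ArgumentError, "bad input" end}
  ])

IO.inspect(results)
```

A raising tool and a slow tool both come back as tagged error values next to the successful results, which is what lets a tool loop keep going and report the failure to the model.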
Composition
Two ways to scale a council beyond a flat member list: nest a council inside
another (sub-councils, including dynamic %DynamicCouncil{} forms with
input_mapper), and gate which members participate per round (adaptive routers
— council-level or per-round override). Excluded members land in RoundResult
with status: :skipped.
See docs/COMPOSITION.md for the full sub-council and
router surface (sub-run event topics, :sub_run_id / :sub_result metadata,
dynamic-form router registration, :skipped semantics, and a runnable two-level
example with mixed providers).
Auto-routing with AutoCouncil
CouncilEx.AutoCouncil is an opt-in routing layer for callers that don't
know up-front which council fits a given prompt. Pass it to CouncilEx.run/3
like any other council — internally a strategy picks from a catalog,
executes the winning council, and records the decision in result.metadata.auto:
auto = CouncilEx.AutoCouncil.new(
  strategy: :rules,
  catalog: [
    %{id: "seo", council: MyApp.Councils.SEO, match: ~r/seo|sitemap/i},
    %{id: "code", council: MyApp.Councils.CodeReview, match: ~r/code|PR/i}
  ]
)
{:ok, result} = CouncilEx.run(auto, %{question: "audit my SEO"})
result.metadata.auto
# => %{strategy: :rules, kind: :static, catalog_id: "seo",
#      reason: "matched ~r/seo|sitemap/i", score: nil, latency_ms: 1}

- Strategies: :rules (regex/fun, zero cost), :cascade (chain cheap→expensive), :embedding / :llm_classify / :llm_build (stubs, return {:error, :not_implemented}), or {MyModule, opts} for custom.
- Catalog: inline list or {:registry, :council} for hot-reloadable shared routing. provider_check: true drops entries whose providers aren't configured.
- Fallback: on_no_match: :error (default), {:fallback, MyCouncil}, or {:fallback, "registered_id"}.
- Shortcut: CouncilEx.auto/1,2 uses :council_ex, :auto app config as default; per-call opts override it and :verbose / :await_timeout forward to run/3.
Full reference — Strategy behaviour, custom-strategy recipe, decision-shape
contract, telemetry events (:decision, :cascade_step, :catalog_filtered),
composability — in docs/AUTO_COUNCILS.md.
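As a mental model for rule-based routing with a fallback, the following plain-Elixir sketch picks the first catalog entry whose rule matches the question. It is not the AutoCouncil implementation: RulesRoutingSketch, first-match-wins ordering, one-arity match functions, and the {:error, :no_match} return value are assumptions, and the council modules are only used as atoms here.

```elixir
defmodule RulesRoutingSketch do
  # Catalog entries mirror the shape shown earlier: %{id, council, match}.
  # `match` is either a Regex or a one-arity function returning a boolean.
  def pick(catalog, question, on_no_match \\ :error) do
    case Enum.find(catalog, &matches?(&1.match, question)) do
      %{id: id, council: council} -> {:ok, %{catalog_id: id, council: council}}
      nil -> no_match(on_no_match)
    end
  end

  defp matches?(%Regex{} = regex, question), do: Regex.match?(regex, question)
  defp matches?(fun, question) when is_function(fun, 1), do: fun.(question)

  defp no_match(:error), do: {:error, :no_match}
  defp no_match({:fallback, council}), do: {:ok, %{catalog_id: nil, council: council}}
end

catalog = [
  %{id: "seo", council: MyApp.Councils.SEO, match: ~r/seo|sitemap/i},
  %{id: "code", council: MyApp.Councils.CodeReview, match: ~r/code|PR/i}
]

IO.inspect(RulesRoutingSketch.pick(catalog, "audit my SEO"))
IO.inspect(RulesRoutingSketch.pick(catalog, "plan my holiday", {:fallback, MyApp.Councils.Default}))
```

Because matching is a pure function over the question text, this style of routing costs no tokens, which is the property the :rules strategy is described as having.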
Observability
Ten events fire on topic "council_ex:run:#{run_id}":
:run_started, :round_started, :member_started, :member_token,
:tool_call_request, :tool_call_result, :member_completed,
:round_completed, :run_completed, :run_failed
(documented in CouncilEx.Events).
- Phoenix.PubSub adapter: route events through your own server with config :council_ex, pubsub: {CouncilEx.PubSub.Phoenix, name: MyApp.PubSub}. CouncilEx never starts a PubSub server itself.
- Telemetry logger: CouncilEx.Telemetry.attach_default_logger/0,1 attaches Logger handlers (:events subset opt; :exception always logs at :warning; re-attach is idempotent); detach_default_logger/0 removes them.
- Verbose mode: verbose: true | :debug prints a per-run timeline to stdout (zero cost when off; verbose_io: to redirect).
Full reference: docs/OBSERVABILITY.md.
Topology diagrams: docs/DIAGRAMS.md.
Introspection — inspect a council's structure as data at runtime
(Mod.__council__/0 → %Spec{}, __providers__/0), export it as a node/edge
graph for a UI (CouncilEx.Diagram.to_ir/1, both forms), or query a live run
(CouncilEx.RunServer.state/1, list_active_runs/0). See
docs/INTROSPECTION.md.
Testing
import CouncilEx.Test for three helpers: script_council/2 (script Mock
responses for every member of a council — or nested sub-council — in one call),
capture_events/2 (drain a run's PubSub topic until the terminal event or
timeout), and assert_round_completed/3 (block on :round_completed and return
the %RoundResult{}). The Mock provider is CouncilEx.Providers.Mock (tests and
fixtures only; never production code).
See docs/UNIT_TESTING.md for the full helper reference,
streaming scripts, and state inspection; docs/TESTING.md for
live-provider and manual testing.
Deployment Considerations
A single :mode config knob picks the deployment shape: :single_node
(default, no config needed) uses an ETS-backed Registry, a Null reliability
store, and no Recorder; :multi_node flips all three to their *.Ecto defaults
and autowires Recorder.Ecto into every CouncilEx.start/3 call. Per-key
overrides (:reliability_store, :registry_backend, :recorder) always win
over the mode default, so mixing backends (e.g. Reliability.Redis +
Registry.Ecto + Recorder.Ecto) is one line each.
See docs/PERSISTENCE.md for the module map, migration
setup, Redis backends, Oban durable retries, and the deployment topology matrix.
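A configuration sketch of the :mode knob with one per-key override. It assumes the keys live under the :council_ex application (as the pubsub key does), that the Redis reliability backend is named CouncilEx.Reliability.Redis, and that a bare module is an accepted value. Check docs/PERSISTENCE.md for the exact module names and option shapes.

```elixir
# config/runtime.exs
import Config

# Multi-node deployment: Registry, reliability store, and Recorder switch to
# their Ecto defaults.
config :council_ex, mode: :multi_node

# Per-key override (assumed module name and value shape): keep the Ecto
# Registry and Recorder, but move the reliability store to Redis.
config :council_ex, reliability_store: CouncilEx.Reliability.Redis
```

Because per-key overrides always win over the mode default, the second line changes only the reliability store and leaves the other two backends on what :multi_node selects.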
Roadmap & changelog
Capabilities
Topic-tagged highlights of what ships in the 0.1.0 release. See
CHANGELOG.md for the full release notes.
- Paper-replication slate: Councils.WeightedConsensus + Rounds.WeightedSynthesis (Wu et al. Council Mode port, arXiv:2604.02923); per-member :confidence field on %MemberResult{} with :self_report and :logprob strategies; CouncilEx.BiasDetector diagnostic round (lexicon backend); CouncilEx.Reliability store (Null + ETS + Ecto/Postgres + Redis backends); Councils.JuryWithRetry with confidence-triggered re-sample (Chaos-MoA / Adjudicator pattern, Wu et al. Can LLM Agents Really Debate? (arXiv:2511.07784) conformity mitigation); bench/eval/ skeleton harness for TruthfulQA / HaluEval / BBQ; :expose_confidence opt on WeightedSynthesis. Mapping: docs/COUNCIL_MODE_PAPER.md + docs/RELATED_WORK.md.
- Dynamic councils: build / edit / validate / serialise data-form councils with sub-council composition (registered name / module / nested struct), polymorphic run/3 + start/3 dispatch, full run/round telemetry parity on the async path. Profile DSL + 9 prebaked profiles, per-run verbose: opt, OpenRouter adapter, diagram tooling (CouncilEx.Diagram), real-key-only examples, Gemini schema sanitization. :tool_choice member opt, atom-exhaustion DoS fix on JSON ser/de, idempotent :pg PubSub subscribe, cold-load tool-call adapter probe.
- Providers: stock OpenAI / Anthropic / Gemini / Ollama / OpenRouter adapters; pluggable Provider.Adapter behaviour; frozen CouncilEx.Events PubSub surface; :member_completed carries full %MemberResult{}.
- Tool calling: stream tool-loop in CouncilEx.Providers.Instructor.stream/3; parallel tool execution; tool-call PubSub events; Tournament Bracket round; Anthropic structured output via the tool-use API.
- GenServer-aligned run lifecycle: caller-owned pids via run/3, start/3, start_link/3; opt-in CouncilEx.Supervisor for tenant isolation; no auto-started supervisor (you own the pids).
- Persistence: optional *.Ecto backends for Reliability, Registry, Recorder, plus *.Redis for Reliability and Registry (Recorder is Ecto-only); CouncilEx.Config :mode knob (:single_node / :multi_node) flips all backends in one place.
Planned
- Nice-to-have, unscheduled: chained multi-step tool loops where one tool call feeds the next within a single member turn (gap #11 in docs/FUTURE_EXAMPLES.md); ranking-parser regex fallback for cheap models (karpathy pattern); Fairness.parity/2 metric helper (cultural_debate); persona-counterweight presets; LLM-judge / embedding-cluster backends for BiasDetector; logical-validity-aware aggregator (Wu 2025); deterministic pre-injection RAG (docs/future/RAG_PRE_INJECTION.md). Tracked in docs/RELATED_WORK.md.
- Out of scope for this repo: durable run history, durable execution, and a LiveView dashboard. Build them in your host app against the frozen CouncilEx.Events PubSub surface and Diagram.to_ir/1.
License
Apache-2.0. See LICENSE.
Built by Humberto Aquino · Brewing Elixir.
Acknowledgements
Special thanks to Andrej Karpathy, whose karpathy/llm-council sparked the initial idea behind this project. His "models review each other before a final synthesis" experiment is what we set out to bring to Elixir as a reusable framework. See docs/TUTORIAL_KARPATHY_COUNCIL.md for the Elixir port.
References
- Wu, S., Li, X., Feng, Y., Li, Y., Wang, Z., & Wang, R. (2026). Council Mode: A Heterogeneous Multi-Agent Consensus Framework for Reducing LLM Hallucination and Bias. arXiv:2604.02923. PDF. Implemented as Councils.WeightedConsensus, per-member confidence (MemberResult.:confidence), BiasDetector (diagnostic), and Reliability store. Full mapping in docs/COUNCIL_MODE_PAPER.md.
For broader context on multi-agent LLM papers and projects (MAD, Adjudicator, karpathy/llm-council, Chaos-MoA-Pipeline, cultural_debate, etc.) and how each maps onto CouncilEx, see docs/RELATED_WORK.md. The Wu et al. Can LLM Agents Really Debate? (arXiv:2511.07784) finding on conformity-under-visible-majority motivated Councils.JuryWithRetry's "judges don't see each other across iterations" design.