Testing CouncilEx Manually


A guide to setting up keys, running examples, and verifying CouncilEx end-to-end against real LLM providers. Without a key, the no-key smoke tests in §3 and the Mock-backed mix test suite in §7 still run.

Looking for the test suite (mix test)? Skip to §7.


1. Prerequisites

| Tool       | Required version    | Notes            |
|------------|---------------------|------------------|
| Elixir     | ~> 1.16             | elixir --version |
| Erlang/OTP | 26+                 | erl -version     |
| Mix        | bundled with Elixir | mix --version    |

Optional:

  • ollama (local install + ollama pull llama3.1) for the local-LLM path.
  • git to clone.

Clone and bootstrap:

git clone https://github.com/brewingelixir/council_ex.git
cd council_ex
mix deps.get
mix compile

2. API Keys

Each real provider needs an env var. CouncilEx reads them via the {:system, "VAR"} config tuple.

| Provider       | Env var                  | Get a key                                  |
|----------------|--------------------------|--------------------------------------------|
| OpenAI         | OPENAI_API_KEY           | https://platform.openai.com/api-keys       |
| Anthropic      | ANTHROPIC_API_KEY        | https://console.anthropic.com/settings/keys |
| Gemini         | GEMINI_API_KEY           | https://aistudio.google.com/apikey         |
| OpenRouter     | OPENROUTER_API_KEY       | https://openrouter.ai/settings/keys        |
| Ollama (local) | none (run ollama serve)  | https://ollama.com/download                |

Set them in your shell before running examples:

export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...
export GEMINI_API_KEY=...
export OPENROUTER_API_KEY=sk-or-...

Examples are real-key-only. Every example in examples/*.exs calls System.fetch_env!/1 for its required key and fails fast with a clear error naming the missing variable if the key is not set. The Mock provider is reserved for mix test (the unit suite). To preview a council's topology without any API call, use DIAGRAM_ONLY=1 (see §3).
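Before running a batch of examples, it can help to see which provider keys the current shell actually exports. The following is a minimal sketch; KeyPreflight is a hypothetical helper written for this guide, not part of CouncilEx, and it only uses the env var names from the table above.

```elixir
defmodule KeyPreflight do
  @moduledoc "Hypothetical helper for this guide; not part of CouncilEx."

  # Env var names copied from the provider table in this section.
  @provider_keys [
    openai: "OPENAI_API_KEY",
    anthropic: "ANTHROPIC_API_KEY",
    gemini: "GEMINI_API_KEY",
    openrouter: "OPENROUTER_API_KEY"
  ]

  # Returns the providers whose env var is unset or empty.
  def missing(keys \\ @provider_keys) do
    for {provider, var} <- keys, System.get_env(var) in [nil, ""], do: provider
  end
end

IO.inspect(KeyPreflight.missing(), label: "providers without a key")
```

An empty list means every keyed provider in §4 can be exercised; Ollama is not listed because it needs no key.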


3. No-Key Smoke Tests

A no-API-key bench exercises the parallel-batch executor directly:

mix run bench/parallel_tools.exs       # parallel-batch executor benchmark

bench/parallel_tools.exs should print a ~3x speedup line.
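To see why running a batch of independent tool calls concurrently gives a speedup of roughly the batch size, here is a self-contained sketch. It is not the code in bench/parallel_tools.exs; ParallelSketch and slow_tool/1 are made-up stand-ins, and the 100 ms sleep simulates a slow tool.

```elixir
defmodule ParallelSketch do
  # Stand-in for a slow tool call: sleeps, then returns a value.
  def slow_tool(n) do
    Process.sleep(100)
    n * 2
  end

  # Runs the tools one after another.
  def sequential(inputs), do: Enum.map(inputs, &slow_tool/1)

  # Runs every tool in its own task; results come back in input order.
  def parallel(inputs) do
    inputs
    |> Task.async_stream(&slow_tool/1, max_concurrency: length(inputs), timeout: 5_000)
    |> Enum.map(fn {:ok, value} -> value end)
  end
end

inputs = [1, 2, 3]
{seq_us, seq} = :timer.tc(fn -> ParallelSketch.sequential(inputs) end)
{par_us, par} = :timer.tc(fn -> ParallelSketch.parallel(inputs) end)

IO.puts("same results: #{seq == par}")
IO.puts("speedup: #{Float.round(seq_us / par_us, 1)}x")
```

With three equally slow calls the sequential path takes about three sleeps and the parallel path about one, which is where a figure near 3x comes from.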

The PubSub tool-call event surface is demonstrated end-to-end with a real provider in examples/tool_call_events_example.exs (requires OPENAI_API_KEY; see §2).

To inspect any other example's topology without making an API call, set DIAGRAM_ONLY=1:

DIAGRAM_ONLY=1 mix run examples/specialist_example.exs
DIAGRAM_ONLY=1 mix run examples/router_example.exs
DIAGRAM_ONLY=1 mix run examples/sub_council_example.exs

The example prints its diagram (default ascii; switch with DIAGRAM=mermaid or DIAGRAM=ir) and the description string, then halts before CouncilEx.run/3 fires. Handy for browsing topology shapes during library exploration.
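The gating described above can be pictured as a small env-var check at the top of a script. This is a sketch under assumptions: DiagramGate is a hypothetical module, and the real examples may structure the check differently. Only the variable names (DIAGRAM_ONLY, DIAGRAM) and the three format names come from this guide.

```elixir
defmodule DiagramGate do
  # Hypothetical helper; not CouncilEx API.
  @formats ~w(ascii mermaid ir)

  # True only when DIAGRAM_ONLY is exactly "1".
  def diagram_only?, do: System.get_env("DIAGRAM_ONLY") == "1"

  # Reads DIAGRAM, defaulting to "ascii", and rejects unknown values.
  def format do
    case System.get_env("DIAGRAM", "ascii") do
      f when f in @formats -> String.to_atom(f)
      other -> raise ArgumentError, "unknown DIAGRAM format: #{inspect(other)}"
    end
  end
end

if DiagramGate.diagram_only?() do
  IO.puts("would print the #{DiagramGate.format()} diagram and stop here")
else
  IO.puts("would continue to the provider call")
end
```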


4. Provider-by-Provider Tests

4.1 OpenAI

export OPENAI_API_KEY=sk-...
mix run examples/streaming_example.exs

Expected: text streams to stdout one chunk at a time, ending with a final synthesized response.

Tool-calling end-to-end against real OpenAI:

mix run examples/tool_calling_example.exs

Expected: solver emits a calculator tool call, the adapter executes Calculator.execute/1, the tool result feeds back, and the model returns the final answer (e.g. "The result of 17 * 23 is 391.").
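The request, execute, feed-back cycle can be sketched without a provider. Everything below is illustrative: ToolLoopSketch, its fake model, and the message tuples are invented for this guide and are not the adapter's implementation. The iteration cap of 5 and the :max_tool_iterations error name mirror the entry in §8.

```elixir
defmodule ToolLoopSketch do
  # Hypothetical calculator tool: multiplies two numbers.
  def calculator(%{"a" => a, "b" => b}), do: a * b

  # Fake model: asks for the tool once, then answers from the tool result.
  def fake_model(messages) do
    case List.last(messages) do
      {:tool_result, value} -> {:final, "The result of 17 * 23 is #{value}."}
      _ -> {:tool_call, %{"a" => 17, "b" => 23}}
    end
  end

  # Loops until the model returns a final answer or the budget runs out.
  def run(messages, model, remaining \\ 5)
  def run(_messages, _model, 0), do: {:error, :max_tool_iterations}

  def run(messages, model, remaining) do
    case model.(messages) do
      {:final, text} ->
        {:ok, text}

      {:tool_call, args} ->
        # Execute the tool and append its result so the model sees it next turn.
        run(messages ++ [{:tool_result, calculator(args)}], model, remaining - 1)
    end
  end
end

answer = ToolLoopSketch.run([{:user, "What is 17 * 23?"}], &ToolLoopSketch.fake_model/1)
IO.inspect(answer)
```

A model that keeps requesting tools exhausts the budget and yields {:error, :max_tool_iterations}, which is the situation the max_tool_iterations option in §8 addresses.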

4.2 Anthropic

export ANTHROPIC_API_KEY=sk-ant-...
mix run examples/anthropic_streaming_example.exs

Expected: streamed text from Claude (model: claude-sonnet-4-6).

Anthropic structured-output:

mix run examples/anthropic_structured_output_example.exs

Expected: %AnswerSchema{rating: <int>, reason: "..."} printed.

4.3 Gemini

export GEMINI_API_KEY=...
mix run examples/gemini_example.exs

Expected: Live Gemini structured response: %GeminiExample.AnswerSchema{rating: <int>, reason: "..."} (model: gemini-2.5-flash). The Gemini adapter strips additionalProperties from the JSON Schema before sending, since Gemini's response_schema is an OpenAPI 3.0 subset that rejects that key.
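The schema sanitization mentioned above amounts to a recursive key removal. The sketch below shows one way to do it on a string-keyed schema map; SchemaSanitizer is a hypothetical module, and the adapter's actual code may differ.

```elixir
defmodule SchemaSanitizer do
  # Recursively drops "additionalProperties" from a string-keyed JSON Schema.
  def strip(%{} = schema) do
    schema
    |> Map.delete("additionalProperties")
    |> Map.new(fn {key, value} -> {key, strip(value)} end)
  end

  # Lists (e.g. "required", "anyOf") are walked element by element.
  def strip(list) when is_list(list), do: Enum.map(list, &strip/1)

  # Scalars pass through unchanged.
  def strip(other), do: other
end

schema = %{
  "type" => "object",
  "additionalProperties" => false,
  "properties" => %{
    "rating" => %{"type" => "integer"},
    "tags" => %{
      "type" => "array",
      "items" => %{"type" => "object", "additionalProperties" => false}
    }
  },
  "required" => ["rating"]
}

clean = SchemaSanitizer.strip(schema)
IO.inspect(clean)
```

One limitation of this simple version: a schema property that is itself named additionalProperties would also be removed, so a production sanitizer would need to treat the "properties" map specially.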

4.4 OpenRouter

export OPENROUTER_API_KEY=sk-or-...
mix run examples/openrouter_example.exs

Expected: a three-member council (different upstream vendors via provider/model IDs) reaches its synthesis. Shows Provider.Adapters.OpenRouter (a thin wrapper over the OpenAI adapter; defaults base_url and adds optional :referer / :title attribution headers).
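What "defaults base_url and adds optional attribution headers" could look like is sketched below. This is an assumption-laden illustration, not the adapter's source: OpenRouterSketch is hypothetical, the base URL is OpenRouter's public API endpoint, and HTTP-Referer and X-Title are the attribution header names OpenRouter documents. The :referer and :title option names come from this guide.

```elixir
defmodule OpenRouterSketch do
  # Assumed default; a caller-supplied :base_url always wins.
  @default_base_url "https://openrouter.ai/api/v1"

  # Fills in :base_url only when the caller did not set one.
  def with_defaults(opts), do: Keyword.put_new(opts, :base_url, @default_base_url)

  # Builds the optional attribution headers; unset options add nothing.
  def attribution_headers(opts) do
    [{"HTTP-Referer", opts[:referer]}, {"X-Title", opts[:title]}]
    |> Enum.reject(fn {_name, value} -> is_nil(value) end)
  end
end

opts = OpenRouterSketch.with_defaults(title: "My Council App")
IO.inspect(opts[:base_url])
IO.inspect(OpenRouterSketch.attribution_headers(opts))
```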

If a model id 404s, swap from the alternatives comment block in the file or use openrouter/auto. See docs/PROVIDER_MODELS.md §5 for the live catalog refresh protocol.

4.5 Ollama (local)

Ollama is the local-LLM path. No key needed, just a running server.

ollama serve              # in one terminal
ollama pull llama3.1      # in another (one-time)

mix run examples/ollama_example.exs

Expected: a one-sentence response from the local Llama 3.1 model. The example raises if Ollama isn't reachable on localhost:11434.
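To check reachability yourself before running the example, a plain TCP probe is enough. The sketch below uses Erlang's :gen_tcp; OllamaProbe is a hypothetical helper, and the example script may detect an unreachable server differently.

```elixir
defmodule OllamaProbe do
  # Returns true when a TCP connection to host:port succeeds within the timeout.
  def reachable?(host \\ "localhost", port \\ 11_434, timeout_ms \\ 1_000) do
    case :gen_tcp.connect(String.to_charlist(host), port, [:binary, active: false], timeout_ms) do
      {:ok, socket} ->
        :gen_tcp.close(socket)
        true

      {:error, _reason} ->
        false
    end
  end
end

IO.puts("Ollama reachable? #{OllamaProbe.reachable?()}")
```

A true result only shows that something is listening on the port; it does not confirm that the llama3.1 model has been pulled.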


5. Council Topology Examples

These exercise the round/aggregator/router/sub-council machinery against real providers. They all require OPENAI_API_KEY (the topology demos were standardized on OpenAI to keep credentials simple).

mix run examples/consensus_example.exs       # Iterate(Critique) → Synthesis convergence
mix run examples/debate_example.exs           # Multi-round debate
mix run examples/specialist_example.exs       # Specialist council (chair-driven)
mix run examples/router_example.exs           # Adaptive routing by input
mix run examples/parallel_panel_example.exs   # Parallel independent_analysis
mix run examples/parallel_panel_real_provider.exs   # Mixed-model panel (mini + flagship)
mix run examples/sub_council_example.exs      # Hierarchical SubCouncil
mix run examples/tournament_example.exs       # Pairwise-elimination bracket
mix run examples/phoenix_pubsub_example.exs   # External Phoenix.PubSub adapter
mix run examples/profile_example.exs          # Profile DSL + default_profile + override
mix run examples/multi_model_panel_example.exs   # Different vendor per member
mix run examples/creative_judge_example.exs   # Divergent (creative) → judge (deterministic)
mix run examples/dynamic_council_example.exs  # Data-driven council, JSON ser/de, React-Flow export

Each prints either a final synthesis or the per-round results. To inspect a topology before paying for a run, prefix with DIAGRAM_ONLY=1. Add VERBOSE=1 to any of the above for a per-run timeline of member starts/stops and token usage.

Dynamic council smoke recipe

# iex -S mix
alias CouncilEx.{DynamicCouncil, Registry}

Registry.register_profile("mini", CouncilEx.Profiles.OpenAIMini)

# Builder accepts maps OR keyword lists for ergonomics. Pick whichever
# fits the call site. JSON ser/de always uses string-keyed maps.
council =
  DynamicCouncil.new("scratch-1")
  |> DynamicCouncil.set_default_profile("mini")
  |> DynamicCouncil.add_member(id: "a", system_prompt: "say hi in 1 sentence")
  |> DynamicCouncil.add_round(:independent_analysis)

:ok = DynamicCouncil.validate(council)
{:ok, json} = DynamicCouncil.to_json(council)
{:ok, restored} = DynamicCouncil.from_json(json)
{:ok, result} = CouncilEx.run(restored, %{q: "?"})

Verifies builder + validation + JSON round-trip + run path end-to-end without ever calling defmodule.

Add verbose: true to the run/3 call (or set VERBOSE=1 if running an example) to see the per-run timeline printed to stdout.


6. Stream + Tool-Call PubSub Events

CouncilEx broadcasts a documented set of run/round/member/tool events on the topic "council_ex:run:#{run_id}". See lib/council_ex/events.ex for the full catalog (or run mix docs and open the CouncilEx.Events page).

Quick demo:

mix run examples/tool_call_events_example.exs

Expected output shows :tool_call_request and :tool_call_result events for each tool execution.

To subscribe in your own code:

{:ok, pid} = CouncilEx.start(MyCouncil, %{topic: "..."})
# `run_id` must already be bound to this run's id; this snippet does not show how it is obtained.
:ok = CouncilEx.PubSub.subscribe("council_ex:run:#{run_id}")

receive do
  {:run_started, ^run_id, council_module, input} -> ...
  {:round_started, ^run_id, round_name, idx} -> ...
  {:member_started, ^run_id, round_name, member_id} -> ...
  {:member_token, ^run_id, round_name, member_id, %CouncilEx.StreamChunk{}} -> ...
  {:tool_call_request, ^run_id, round_name, member_id, %CouncilEx.ToolCall{}} -> ...
  {:tool_call_result, ^run_id, round_name, member_id, %CouncilEx.ToolCallResult{}} -> ...
  {:member_completed, ^run_id, round_name, member_id, %CouncilEx.MemberResult{}} -> ...
  {:round_completed, ^run_id, round_name, %CouncilEx.RoundResult{}} -> ...
  {:run_completed, ^run_id, %CouncilEx.Result{}} -> ...
end

Subscribe BEFORE start/3 if you want the initial :run_started event.
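A single receive handles one message, so consuming a whole run needs a loop. The sketch below shows a recursive receive that gathers streamed tokens until :run_completed arrives. It is self-contained and simulates the events with send/2; the tuple shapes follow the list above, but plain strings stand in for %CouncilEx.StreamChunk{} and the result struct, and EventLoopSketch is a hypothetical module.

```elixir
defmodule EventLoopSketch do
  # Collects streamed text for one run until :run_completed or a timeout.
  def collect(run_id, acc \\ []) do
    receive do
      {:member_token, ^run_id, _round, _member, chunk} ->
        collect(run_id, [chunk | acc])

      {:run_completed, ^run_id, _result} ->
        {:ok, acc |> Enum.reverse() |> Enum.join()}

      # Any other event for this run is ignored and the loop continues.
      other when is_tuple(other) and tuple_size(other) >= 2 and elem(other, 1) == run_id ->
        collect(run_id, acc)
    after
      5_000 -> {:error, :timeout}
    end
  end
end

# Simulated events; in a real run these arrive via the PubSub subscription.
run_id = "demo-run"
send(self(), {:run_started, run_id, :my_council, %{topic: "hi"}})
send(self(), {:member_token, run_id, :independent_analysis, :writer, "Hel"})
send(self(), {:member_token, run_id, :independent_analysis, :writer, "lo"})
send(self(), {:run_completed, run_id, :result_placeholder})

output = EventLoopSketch.collect(run_id)
IO.inspect(output)
```

The after clause keeps the loop from blocking forever if the run crashes or the subscription was made too late to see :run_completed.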


7. mix test + Statics

The full unit + Bypass-driven integration suite:

mix test --seed 0

Expected: the full suite passes with 1 excluded (:integration) on a clean checkout.

The excluded test carries the :integration tag (real-API tests). To include it:

mix test --include integration --seed 0

(Requires OPENAI_API_KEY, ANTHROPIC_API_KEY, and/or GEMINI_API_KEY set.)
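For readers new to ExUnit tag filtering, the sketch below reproduces the "1 excluded" behaviour in a standalone script. It uses only standard ExUnit features (the :exclude option and @tag); how this project's test_helper.exs is actually configured is an assumption, and TagSketchTest is a made-up module.

```elixir
# Exclude :integration by default, as `mix test` would with a matching test_helper.exs.
ExUnit.start(autorun: false, exclude: [:integration])

defmodule TagSketchTest do
  use ExUnit.Case, async: true

  test "runs on every invocation" do
    assert 1 + 1 == 2
  end

  # Skipped unless the tag is included (e.g. `mix test --include integration`).
  @tag :integration
  test "needs a real API key" do
    assert is_binary(System.fetch_env!("OPENAI_API_KEY"))
  end
end

# Runs the registered test module and returns a summary map.
result = ExUnit.run()
IO.inspect(result)
```

Tagging real-API tests this way keeps the default run hermetic while leaving them one flag away.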

Static analysis:

mix dialyzer            # 0 errors
mix credo --strict      # 0 issues
mix format --check-formatted   # clean

Generate local docs:

mix docs
open doc/index.html

8. Common Errors

| Symptom | Cause | Fix |
|---------|-------|-----|
| ** (ArgumentError) could not fetch environment variable "X" | Example's required key not exported in current shell | export X=... |
| ** (KeyError) key :api_key not found | Provider config missing required key | Set the env var, or pass api_key: "..." directly in the providers config |
| 404 not_found_error: model 'X' not found (Ollama) | Model not pulled locally | ollama pull <model> |
| 404 on Anthropic with claude-3-5-sonnet-... | Model name retired | Use claude-sonnet-4-6 (current); examples already updated |
| 404 model is no longer available (Gemini) | Date-suffixed Gemini ID retired | Use gemini-2.5-flash or check docs/PROVIDER_MODELS.md §1 |
| 404 on OpenRouter | Model id drifted (date suffix changed) or feature unsupported | Swap to an undated GA id from docs/PROVIDER_MODELS.md §5 or use openrouter/auto |
| Invalid JSON payload ... "additionalProperties" (Gemini) | Stale checkout; fixed on main | Update; adapter sanitizes the schema |
| Streaming silently produces no :member_token events | Stale checkout; fixed on main (cold-load function_exported? race) | Update |
| {:invalid_config, ...} on CouncilEx.run/3 | Config keyword has unknown opts | Check the adapter's @config_schema (e.g., OpenAI, Anthropic, Gemini, OpenRouter); only documented opts are accepted |
| :max_tool_iterations error after tool loop | Model emitted tool_calls > max iterations (default 5) | Pass max_tool_iterations: <N> in the provider config |
| Test: assert_receive timeout | CPU contention; transient | Re-run; if persistent on your hardware, file an issue |

9. Adding Your Own Test Scenario

Quickest path: copy an existing example and modify.

cp examples/tool_calling_example.exs examples/my_test.exs
$EDITOR examples/my_test.exs
mix run examples/my_test.exs

For a project-embedded test (not a script):

defmodule MyApp.MyCouncil do
  use CouncilEx

  member :writer do
    provider :openai
    model "gpt-4o"
    system_prompt "You are a clear writer."
    stream true
  end

  round :independent_analysis
end

# In iex -S mix:
{:ok, result} = CouncilEx.run(MyApp.MyCouncil, %{topic: "your prompt"})
IO.puts(hd(result.rounds).member_results[:writer].response.content)

For streaming + observability, subscribe to the PubSub topic per §6.


10. Reporting Issues

When something breaks, please include:

  • Output of mix test --seed 0 (truncated to the failure block).
  • elixir --version and erl -version.
  • Which provider + model.
  • The example file or minimal repro.
  • Relevant environment variables, with key values redacted.

File at https://github.com/brewingelixir/council_ex/issues.