Peer Review Patterns


Two rounds in council_ex look superficially similar but solve different problems. Picking the wrong one silently degrades signal. This page is the decision guide.

  • CouncilEx.Rounds.PeerReview: cross-visibility. Members read each other's prior outputs (keyed by original member id) and keep working with that context.
  • CouncilEx.Rounds.AnonymizedPeerReview: blind judging. Members rank each other's prior outputs under anonymous labels (Response A, Response B, …), and the aggregator reports rankings de-anonymized.

Decision matrix

| | PeerReview | AnonymizedPeerReview |
|---|---|---|
| Member ids visible to peers | yes (semantic) | no (Response A/B/C) |
| Own slot in peers map | omitted | omitted |
| Aggregates? | no | yes (default Aggregators.PeerRanking) |
| Suitable for ranking / voting | no: biased | yes |
| Suitable for collaboration / refinement | yes | no: strips role context |
| Output shape required from members | free-form | :ordering (e.g. Schemas.Ranking) |
| label_to_id map exposed | n/a | yes, in aggregated.raw.label_to_id |

Rule of thumb:

  • "Read the others, then keep working" → PeerReview.
  • "Read the others, then rank them" → AnonymizedPeerReview.

Why anonymization matters

When LLMs judge each other's work with author identities visible, the rankings carry little usable signal. Anonymization prevents three failure modes:

  1. Self-recognition bias. LLMs recognize their own writing style (idioms, formatting, hedging patterns). Given a mixed pile labeled by id, a model spots its own output and ranks itself first. Every model does it. Result: every judge picks self → no winner, no signal.

  2. Brand bias. If labels expose model names (gpt-4o-mini, claude-sonnet-4-6), models defer to known-strong brands or attack rivals based on training-data sentiment rather than actual answer quality. Judgments based on reputation, not text.

  3. Stable-position leakage. If ids appear in the same order on every run, a judge can learn "slot N = competitor, downrank." Stable id ordering across runs leaks the identity signal that anonymization is meant to remove.

Anonymous labels (Response A/B/C) plus own-slot removal close all three. The judge sees only text and is forced to evaluate substance.
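The two mechanisms above, anonymous labels and own-slot removal, can be sketched in a few lines of plain Elixir. This is a minimal illustration, not the library's implementation: the module `AnonymizeSketch`, its functions, and the choice to assign labels in list order are all assumptions made for the example.

```elixir
defmodule AnonymizeSketch do
  @moduledoc "Hypothetical helper for illustration; not part of council_ex."

  # Build ONE label map shared by every judge: the first id becomes
  # "Response A", the second "Response B", and so on.
  # Assumption: labels follow the order of the list passed in.
  # Single-letter labels limit this sketch to 26 members.
  def label_map(member_ids) do
    member_ids
    |> Enum.with_index()
    |> Map.new(fn {id, idx} -> {id, "Response " <> <<?A + idx>>} end)
  end

  # What one judge is shown: peers' outputs keyed by anonymous label,
  # with the judge's own output removed entirely.
  def view_for(judge_id, outputs, labels) do
    outputs
    |> Map.delete(judge_id)
    |> Map.new(fn {id, text} -> {Map.fetch!(labels, id), text} end)
  end
end

outputs = %{gpt: "Answer one", claude: "Answer two", gemini: "Answer three"}
labels = AnonymizeSketch.label_map([:gpt, :claude, :gemini])
judge_view = AnonymizeSketch.view_for(:gpt, outputs, labels)
# judge_view == %{"Response B" => "Answer two", "Response C" => "Answer three"}
```

The judge `:gpt` receives no entry for "Response A" (its own answer) and no member ids at all, so the only thing left to evaluate is the text.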

Anonymized peer review is the only stage of a multi-model council that adds signal a single-model call cannot produce. Stage 1 is N parallel queries (boring). Stage 3 is synthesis (any model can do it). Stage 2, anonymized peer review, is the actual research contribution. karpathy/llm-council calls this out explicitly in its README: it is the reason that project exists.

Why AnonymizedPeerReview lives in the library

User-side anonymization is doable but error-prone:

  • Easy to leak ids in prompts (forget to strip from one field).
  • Easy to assign per-judge labels inconsistently: if label A means a different model to different judges, aggregation breaks.
  • Easy to drop the de-anonymization map and lose UI traceability.

AnonymizedPeerReview solves all three:

  • Global stable map. Every judge sees gpt → Response A. Aggregation across judges is meaningful.
  • Own-slot removal. Judge never sees own answer at all. Self-recognition impossible.
  • Map preserved through to aggregator. winner, scores, avg_position, judge_ballots all reported in original-id space.
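To make the third point concrete, here is a rough sketch of translating label-space ballots back into original-id space and averaging positions. It is not the code behind Aggregators.PeerRanking: the module `RankingSketch`, the ballot shape (a best-first list of labels per judge), and the "lowest average position wins" rule are assumptions for illustration. Only the result keys `winner`, `avg_position`, and `judge_ballots` echo names from this page.

```elixir
defmodule RankingSketch do
  @moduledoc "Hypothetical aggregation sketch; not Aggregators.PeerRanking."

  # ballots:     %{judge_id => ["Response B", "Response C", ...]}, best first
  # label_to_id: %{"Response A" => :gpt, ...}, the same map for every judge
  def aggregate(ballots, label_to_id) do
    # Translate every ballot from labels back to original member ids.
    judge_ballots =
      Map.new(ballots, fn {judge, ordering} ->
        {judge, Enum.map(ordering, &Map.fetch!(label_to_id, &1))}
      end)

    # Collect the 1-based positions each member received across all ballots.
    positions =
      for {_judge, ordering} <- judge_ballots,
          {id, pos} <- Enum.with_index(ordering, 1),
          reduce: %{} do
        acc -> Map.update(acc, id, [pos], &[pos | &1])
      end

    avg_position =
      Map.new(positions, fn {id, ps} -> {id, Enum.sum(ps) / length(ps)} end)

    # Assumption: lowest average position wins; ties are not handled specially.
    {winner, _avg} = Enum.min_by(avg_position, fn {_id, avg} -> avg end)

    %{winner: winner, avg_position: avg_position, judge_ballots: judge_ballots}
  end
end

label_to_id = %{"Response A" => :gpt, "Response B" => :claude, "Response C" => :gemini}

# Each judge ranks only the two peers it was shown (its own slot is absent).
ballots = %{
  gpt: ["Response B", "Response C"],
  claude: ["Response A", "Response C"],
  gemini: ["Response B", "Response A"]
}

result = RankingSketch.aggregate(ballots, label_to_id)
# result.winner == :claude
# result.avg_position == %{claude: 1.0, gpt: 1.5, gemini: 2.0}
```

Because every judge shares one label map, a single `label_to_id` lookup is enough to combine the ballots. With per-judge labels, the same label would point at different members and the averages would be meaningless.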

When PeerReview (visible ids) is correct

Keep ids visible when identity carries meaning the next round needs.

  • Heterogeneous roles. :researcher → :critic → :synthesizer. The critic needs to know it's reading Researcher's draft, not "Response B." Anonymization destroys role context the workflow depends on.
  • Iterative refinement. :draft and :editor collaborating across rounds. The editor's prompt likely references "the draft above": id is the semantic anchor.
  • Non-judgment cross-pollination. "Each member sees what the others wrote, then writes a new version factoring in those perspectives." No ranking, no voting. Visibility for inspiration, not adjudication.
  • Critique chains. Rounds.Critique is built on top of PeerReview.prepare_input/3.

When NEITHER fits

  • Single-judge setups. No peer pool, nothing to anonymize over. Use a plain synthesis round.
  • Identity-as-signal tasks. "Which model is most aligned with house style?" Here you want the judge to know who wrote what. Use PeerReview so labels stay visible, or write a custom round.
  • Vendor-leaking content. Anonymization is label-level only. If models sign their answers ("As Claude, I…"), no round-level relabeling will hide that. Sanitize content first or rewrite the member system prompts to suppress self-identification.
  • Pairwise judgments at scale. For tournament-style elimination use Rounds.PairwiseElimination; ranking entire fields per judge doesn't scale past ~6 members.
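For the vendor-leaking case above, one possible pre-processing step is to scrub obvious self-identification from member outputs before they reach a judging round. The sketch below is deliberately crude and entirely hypothetical: `SanitizeSketch` and its pattern list are not part of council_ex, and a regex will miss anything it was not written to catch.

```elixir
defmodule SanitizeSketch do
  @moduledoc "Hypothetical content scrubber; not part of council_ex."

  # Removes a leading clause such as "As Claude, " or "As an AI language model, ".
  # Assumption: the vendor names listed here are examples only; extend the
  # alternation to cover the models in your own council.
  def scrub(text) do
    Regex.replace(
      ~r/\bAs (?:Claude|ChatGPT|Gemini|an AI(?: language)? model)[^,.]*[,.]\s*/i,
      text,
      ""
    )
  end
end

SanitizeSketch.scrub("As Claude, I think the answer is 4.")
# returns "I think the answer is 4."
```

Pattern-based scrubbing only catches phrasings you anticipated, so rewriting the member system prompts to suppress self-identification is likely the more robust of the two options.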

Both rounds, both supported

PeerReview is not deprecated by AnonymizedPeerReview. They do different jobs. Removing PeerReview would break Rounds.Critique and every collaborative-refinement council that depends on visible ids.

See also