Dsxir.Optimizer.COPRO
(dsxir v0.3.0)
Coordinate-ascent instruction optimizer. Port of DSPy's COPRO.
COPRO optimizes per-predictor instructions only (no demos). Each round it
asks a proposer LM for a breadth of candidate instructions per predictor,
evaluates every candidate as a whole program (the other predictors held at
their current committed override), and at the end of the round commits each
predictor's strict round winner. The next round's proposer is grounded in the
predictor's scored attempt history. After depth rounds the committed
overrides are applied to the student and returned.
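The round structure described above can be sketched in plain Elixir. This is a minimal illustration, not the library's implementation: the module `CoproSketch`, the `propose` and `score` callbacks, and the reading of "strict round winner" as "best candidate that strictly beats the round's baseline score" are all assumptions made for the example.

```elixir
defmodule CoproSketch do
  # Hypothetical names throughout; nothing here is part of Dsxir.
  #
  # predictors: list of predictor keys
  # seed:       %{predictor => instruction} starting overrides
  # propose:    fn predictor, history -> [instruction] end (stands in for the proposer LM)
  # score:      fn overrides -> number end (stands in for whole-program evaluation)
  def run(predictors, seed, propose, score, depth) do
    history = Map.new(predictors, fn p -> {p, []} end)

    {committed, _history} =
      Enum.reduce(1..depth//1, {seed, history}, fn _round, {committed, history} ->
        baseline = score.(committed)

        # Score every candidate as a whole program, holding the other
        # predictors at the overrides committed at the start of the round.
        results =
          for p <- predictors, cand <- propose.(p, history[p]) do
            {p, cand, score.(Map.put(committed, p, cand))}
          end

        # Record scored attempts so the next round's proposals can use them.
        history =
          Enum.reduce(results, history, fn {p, cand, s}, acc ->
            Map.update!(acc, p, &[{cand, s} | &1])
          end)

        # At the end of the round, commit each predictor's best candidate,
        # but only if it strictly beats the baseline (assumed interpretation).
        committed =
          results
          |> Enum.group_by(fn {p, _cand, _s} -> p end)
          |> Enum.reduce(committed, fn {p, rs}, acc ->
            {_p, cand, s} = Enum.max_by(rs, fn {_p, _cand, s} -> s end)
            if s > baseline, do: Map.put(acc, p, cand), else: acc
          end)

        {committed, history}
      end)

    committed
  end
end
```

Because commits happen only after the whole round has been scored, every candidate in a round is compared against the same baseline, regardless of the order in which predictors are visited.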
The pure coordinate-ascent state and its transitions live in
Dsxir.Optimizer.COPRO.Sampler; this module owns the IO — proposer (LM)
calls via Dsxir.Optimizer.COPRO.Proposer and whole-program scoring via
Dsxir.Optimizer.COPRO.Evaluator — and threads the sampler through.
Quick start
{:ok, compiled, stats} =
  Dsxir.compile(
    Dsxir.Optimizer.COPRO,
    program,
    trainset,
    metric,
    auto: :light
  )

Options
See Dsxir.Optimizer.COPRO.Auto for the :light | :medium | :heavy presets
controlling :breadth, :depth, and :init_temperature.
:auto (default: :medium) - budget preset.
:proposer_lm - {module, config} tuple for instruction proposals. Defaults to the resolved task LM.
Returned stats
Dsxir.Optimizer.COPRO.Stats.t/0. Notable fields:
:best_score - whole-program score under the committed overrides.
:best_instructions - the committed per-predictor overrides.
:rounds - committed rounds (equals cfg.depth on a full run).
:trials - per-candidate Dsxir.Optimizer.COPRO.Stats.Record list.
:proposer_calls - proposer LM calls issued.
:degraded - true when any proposer call failed and was substituted with the predictor's current best instruction.
Under compile/4 the reported best_score is a single confirmatory
whole-program evaluation under the committed best_overrides after the final
round commits, so it reflects the program actually returned rather than a
per-predictor candidate maximum.
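As a small illustration of reading the fields listed above, the following sketch formats a one-line summary. `StatsSummary` is a hypothetical helper, not part of Dsxir; it relies only on the field names documented here and accepts any map or struct that exposes them.

```elixir
defmodule StatsSummary do
  # Hypothetical helper: works on any map or struct with the documented fields.
  def summarize(stats) do
    # :degraded is true when a proposer call failed and was substituted.
    status = if stats.degraded, do: "degraded", else: "clean"

    "score=#{stats.best_score} rounds=#{stats.rounds} " <>
      "trials=#{length(stats.trials)} proposer_calls=#{stats.proposer_calls} (#{status})"
  end
end
```

Checking `:degraded` before trusting a run is a reasonable habit, since a degraded run explored fewer distinct instructions than its budget suggests.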
In session mode (Dsxir.OptimizerSession) best_program/best_score are
selected by the session from the highest-scoring individual trial, and each
COPRO trial's candidate program changes only one predictor's instruction. So
for a program where more than one predictor improves, a session run can return
a different program and score than compile/4, which applies every committed
override at once. This mirrors Dsxir.Optimizer.MIPROv2's session behaviour.
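A toy example makes the difference concrete. The module, the scoring function, and the instruction strings below are invented for illustration: two predictors each have one improving instruction, and the whole-program score (in percentage points) rises independently for each.

```elixir
defmodule SessionVsCompile do
  # Hypothetical whole-program score in percentage points.
  def score(overrides) do
    a_gain = if overrides.a == "better a", do: 20, else: 0
    b_gain = if overrides.b == "better b", do: 10, else: 0
    50 + a_gain + b_gain
  end

  def demo do
    seed = %{a: "seed a", b: "seed b"}

    # Each trial's candidate program changes exactly one predictor.
    trials = [
      Map.put(seed, :a, "better a"),
      Map.put(seed, :b, "better b")
    ]

    # Session-style selection: the highest-scoring individual trial.
    session_best = Enum.max_by(trials, &score/1)

    # compile/4-style result: every committed override applied at once.
    compiled = %{a: "better a", b: "better b"}

    {score(session_best), score(compiled)}
  end
end
```

Here the best single trial scores 70 while the program with both overrides scores 80, which is the gap the paragraph above describes.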
Wrapper-only bookkeeping (proposer call count, degraded flag, per-candidate
Stats.Record list) lives in first-class Sampler fields so session-mode
step/6 callers can checkpoint it; it serializes and survives Sampler
transitions transparently.
Functions
@spec compile(Dsxir.Program.t(), [Dsxir.Example.t()], Dsxir.Metric.t(), keyword()) :: Dsxir.Optimizer.result()
Compile student against trainset under metric.
Returns {:ok, program, stats} on success or {:error, exception} on
validation failure. See the module doc for opts.
@spec compile!(Dsxir.Program.t(), [Dsxir.Example.t()], Dsxir.Metric.t(), keyword()) :: {Dsxir.Program.t(), Dsxir.Optimizer.COPRO.Stats.t()}
Like compile/4 but raises the validation exception on {:error, _}.
@spec init_session(
        Dsxir.Program.t(),
        [Dsxir.Example.t()],
        nil | Dsxir.Metric.t(),
        keyword()
      ) ::
        {:ok, Dsxir.Optimizer.COPRO.Sampler.t(), pos_integer()} | {:error, Exception.t()}
Prepare a resumable COPRO session.
Expands the budget preset, validates the trainset and predictor set, scores
the seed program once to seed the round baseline, and builds the round-zero
candidate queue via the basic proposer. The returned planned-trial count is
cfg.depth * length(predictors) * cfg.breadth.
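The planned-trial formula is simple arithmetic. In the sketch below, `PlannedTrials` is a hypothetical helper and the depth and breadth values are illustrative only; they are not the values of any actual preset.

```elixir
defmodule PlannedTrials do
  # cfg is any map with :depth and :breadth; predictors is a list.
  def count(cfg, predictors) do
    cfg.depth * length(predictors) * cfg.breadth
  end
end

# Two predictors, 3 rounds, 5 candidates per predictor per round.
PlannedTrials.count(%{depth: 3, breadth: 5}, [:retrieve, :answer])
# => 30
```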
@spec step(
        Dsxir.Optimizer.COPRO.Sampler.t(),
        non_neg_integer(),
        Dsxir.Program.t(),
        [Dsxir.Example.t()],
        nil | Dsxir.Metric.t(),
        keyword()
      ) ::
        {:cont, Dsxir.Optimizer.COPRO.Sampler.t(), map()}
        | {:halt, Dsxir.Optimizer.COPRO.Sampler.t(), term()}
Run a single COPRO trial against the session sampler.
Halts once the planned trial budget is met. Otherwise pops the next candidate,
applies it on top of the current best overrides, scores the whole program,
records the result, and returns a trial_result map. When the round queue is
exhausted it commits the round; if the depth budget is not yet spent it builds
the next round's queue (grounded in the attempt history) and re-enters to
return one evaluated trial.
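A caller that drives the session by hand needs a loop over the `{:cont, ...}` and `{:halt, ...}` return shapes. The sketch below is one possible driver, with the step injected as a function so that it runs without the library. `SessionDriver` is hypothetical, and treating the `non_neg_integer()` argument as a zero-based trial index is an assumption; the spec does not say what it represents.

```elixir
defmodule SessionDriver do
  # step_fun: fn sampler, index -> {:cont, sampler, result} | {:halt, sampler, reason} end
  # Returns {final_sampler, halt_reason, results_in_order}.
  def run(sampler, step_fun, index \\ 0, results \\ []) do
    case step_fun.(sampler, index) do
      {:cont, next, result} ->
        # A checkpoint of `next` could be written here before continuing.
        run(next, step_fun, index + 1, [result | results])

      {:halt, final, reason} ->
        {final, reason, Enum.reverse(results)}
    end
  end
end
```

With the real module, `step_fun` would presumably wrap the call, for example `fn s, i -> Dsxir.Optimizer.COPRO.step(s, i, program, trainset, metric, opts) end`, where `program`, `trainset`, `metric`, and `opts` are the values passed to `init_session/4`.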