Dsxir.Optimizer.MIPROv2 (dsxir v0.3.0)


Joint instruction-and-demo optimizer. Port of DSPy's MIPROv2.

Bootstraps candidate demo bundles, asks a proposer LM for candidate instructions grounded in program / dataset summaries, then searches the joint categorical space with a configurable sampler (Dsxir.Optimizer.Search.TPE by default). Trials run in parallel on a minibatch via Dsxir.Optimizer.MIPROv2.Trial.run/1; top candidates are periodically re-run on the full valset and the highest full-eval score wins.

Quick start

{:ok, compiled, stats} =
  Dsxir.compile(
    Dsxir.Optimizer.MIPROv2,
    program,
    trainset,
    metric,
    auto: :medium
  )

Options

See Dsxir.Optimizer.MIPROv2.Auto for :light | :medium | :heavy presets.

  • :auto (default :medium) — preset for num_trials, num_instruction_candidates, num_demo_sets, minibatch_size.
  • :proposer_lm — {module, config} tuple used for program/dataset summaries and grounded instruction proposals. Defaults to the task LM.
  • :sampler (default Dsxir.Optimizer.Search.TPE) — sampler module.
  • :sampler_opts (default []) — passed to sampler.init/2.
  • :batch_size (default 4) — trials suggested + executed per loop step.
  • :minibatch_full_eval_steps (default 10) — every Nth trial triggers a full-valset rerank of the top :top_k_full_eval trials.
  • :top_k_full_eval (default 4) — how many minibatch winners to rerank.
  • :seed (default 0) — controls trainset/valset split and sampler PRNG.
  • :valset_fraction (default 0.2) — share of trainset reserved as valset (clamped to leave at least one training example and one valset example when possible).
  • :compile_cache (default true) — enables Dsxir.Optimizer.Cache.
  • :tip (default nil) — stylistic hint forwarded to the grounded proposer.
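
As a rough sketch of how these options might be combined, the keyword list below overrides a few defaults. The option names come from the list above; `MyApp.ProposerLM`, the shape of its config, and the chosen values are hypothetical placeholders rather than recommendations.

```elixir
# Hypothetical proposer module and config (assumption: config is a keyword
# list; check your LM adapter for the shape it actually expects).
proposer_lm = {MyApp.ProposerLM, [model: "proposer-model-name"]}

opts = [
  auto: :light,
  proposer_lm: proposer_lm,
  batch_size: 8,
  minibatch_full_eval_steps: 5,
  top_k_full_eval: 2,
  seed: 42,
  valset_fraction: 0.25,
  compile_cache: true
]

# The list is passed as the final argument, as in the quick start:
# Dsxir.compile(Dsxir.Optimizer.MIPROv2, program, trainset, metric, opts)
```

Options left out of the list keep the defaults documented above.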

Returned stats

Dsxir.Optimizer.MIPROv2.Stats.t/0. Notable fields:

  • :best_score / :best_config — winning trial.
  • :trials / :full_evals — per-trial records.
  • :proposer_calls — LM calls issued to the proposer.
  • :total_task_lm_calls / :total_cached_calls — task LM stats summed across trials.
  • :degraded — true when any proposer call failed and was substituted with an empty summary or candidate list.
  • :wall_clock_ms — wall time of the compile.
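
One way to surface these fields after a compile is a small reporting helper. The sketch below is not part of dsxir: `StatsReport` is a hypothetical module, and it is exercised with a plain map holding placeholder values instead of a real Dsxir.Optimizer.MIPROv2.Stats struct.

```elixir
defmodule StatsReport do
  # Works on any map or struct exposing the documented
  # :best_score, :wall_clock_ms and :degraded fields.
  def summarize(stats) do
    base = "best_score=#{stats.best_score} wall_clock_ms=#{stats.wall_clock_ms}"

    if stats.degraded do
      base <> " (degraded: some proposer calls failed)"
    else
      base
    end
  end
end

# Placeholder values for illustration only, not real results.
stats = %{best_score: 0.5, wall_clock_ms: 1200, degraded: true}
IO.puts(StatsReport.summarize(stats))
```

Checking :degraded before trusting :best_score is a reasonable habit, since a degraded run searched over fewer or emptier proposals than requested.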

Session mode

Dsxir.OptimizerSession.compile(MIPROv2, ...) runs trials sequentially via step/6 and skips the periodic full-valset rerank performed by compile/4. In session mode best_program is selected from minibatch scores only, so outcomes may differ from a non-session compile against the same inputs.
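
To see why the two modes can disagree, consider picking a winner from the same trials by minibatch score versus by full-valset score. The sketch below uses made-up scores and hypothetical field names (`minibatch_score`, `full_score`); it illustrates the selection difference only, not dsxir's internal data layout.

```elixir
# Placeholder scores for three trial configs (illustration only).
trials = [
  %{config: :a, minibatch_score: 0.9, full_score: 0.6},
  %{config: :b, minibatch_score: 0.8, full_score: 0.7},
  %{config: :c, minibatch_score: 0.5, full_score: 0.5}
]

# Session mode: only minibatch scores are available for selection.
by_minibatch = Enum.max_by(trials, & &1.minibatch_score)

# Non-session compile: top candidates are re-scored on the full valset.
by_full = Enum.max_by(trials, & &1.full_score)

IO.inspect({by_minibatch.config, by_full.config})
```

Here the minibatch favorite and the full-valset favorite are different configs, which is the kind of divergence the paragraph above warns about.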

Summary

Functions

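compile(student, trainset, metric, opts)

Compile student against trainset under metric.

init_session(student, trainset, metric, opts)

Prepare a resumable MIPROv2 session.

step(sampler, trial_idx, program, trainset, metric, opts)

Run a single trial against the session sampler.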

Functions

compile(student, trainset, metric, opts)

Compile student against trainset under metric.

Returns {:ok, program, stats} on success or {:error, exception} on validation failure. See module doc for opts.
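
Callers will typically branch on the two return shapes. The following is a minimal sketch with a hypothetical `CompileResult` helper; it is driven by stand-in tuples (an `ArgumentError` and an atom in place of a real program) because the concrete exception types dsxir returns are not specified here.

```elixir
defmodule CompileResult do
  # Turns the two documented return shapes into a log-friendly string.
  def describe({:ok, _program, stats}) do
    "compiled, best score #{inspect(stats.best_score)}"
  end

  def describe({:error, exception}) do
    "compile failed: #{Exception.message(exception)}"
  end
end

# Stand-in values; a real call would be
# Dsxir.Optimizer.MIPROv2.compile(student, trainset, metric, opts).
IO.puts(CompileResult.describe({:error, %ArgumentError{message: "trainset is empty"}}))
IO.puts(CompileResult.describe({:ok, :compiled_program, %{best_score: 0.5}}))
```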

init_session(student, trainset, metric, opts)

@spec init_session(
  Dsxir.Program.t(),
  [Dsxir.Example.t()],
  nil | Dsxir.Metric.t(),
  keyword()
) ::
  {:ok, Dsxir.Optimizer.MIPROv2.Sampler.t(), pos_integer()}
  | {:error, Exception.t()}

Prepare a resumable MIPROv2 session.

Performs the one-time, expensive proposer LM calls (program summary, dataset summary, grounded instruction proposals) and bootstraps the demo bundles so the resulting Dsxir.Optimizer.MIPROv2.Sampler can be checkpointed and the proposer never has to be called again across step/6 invocations.

The pos_integer() in the success tuple is the planned-trial count, which equals cfg.num_trials.

step(sampler, trial_idx, program, trainset, metric, opts)

Run a single trial against the session sampler.

Returns {:halt, sampler, :budget_exhausted} once the trial budget is met. Otherwise, it asks the underlying sampler module for one config, runs it through Trial.run/1 on the cached minibatch, and returns a trial_result map with the nine keys expected by Dsxir.OptimizerSession.

Session mode skips the periodic full-valset rerank that compile/4 runs.

Exceptions from the trial pipeline are caught and reported as status: :error trials rather than crashing the session.
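
The catch-and-report behaviour can be pictured with a small wrapper. This is a generic sketch of the pattern, not dsxir's implementation: `SafeTrial` is hypothetical, and apart from :status the result keys (`:score`, `:error`) are assumptions made for illustration.

```elixir
defmodule SafeTrial do
  # Runs a zero-arity trial function and converts a raised exception into
  # an error record instead of letting it propagate to the caller.
  def run(trial_fun) when is_function(trial_fun, 0) do
    %{status: :ok, score: trial_fun.()}
  rescue
    e -> %{status: :error, error: Exception.message(e)}
  end
end

ok = SafeTrial.run(fn -> 0.75 end)
failed = SafeTrial.run(fn -> raise "boom" end)

IO.inspect(ok)
IO.inspect(failed)
```

Because the failure is returned as data, a session loop can record the bad trial and continue with the next one.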