Dsxir.Optimizer.BootstrapFewShot (dsxir v0.3.0)

Two-phase optimizer: slot labeled demos from the trainset (phase 1), then augment with bootstrapped demos captured from successful traces (phase 2).

Phases

  1. Labeled. Up to :max_labeled_demos examples are picked from the trainset (uniform random; deterministic-by-hash when :deterministic is set). Each chosen example is slotted as %Dsxir.Demo{kind: :labeled} only into predictors whose declared input + output fields the demo's data keys cover; non-matching predictors get no labeled demo from that example. No LM call.

  2. Bootstrap. For each round in 1..max_rounds, the trainset is walked example-by-example. For each example, the program is run inside a Dsxir.with_trace/1 frame with per-call opts seeded for diversity (temperature: cfg.diversity_temperature, cache: false, plus a per-round per-example nonce). When the coerced metric value is >= :threshold, each trace entry is pushed into the matching predictor's demos_pool as %Dsxir.Demo{kind: :bootstrapped, source: %{round: R, example_index: I}} until that predictor reaches :max_bootstrapped_demos.
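The phase 1 coverage rule can be illustrated with a minimal sketch. CoverageSketch and covers?/3 are hypothetical names and not part of dsxir. The sketch assumes a predictor's declared fields are available as lists of atoms and that the example's data is a plain map.

```elixir
defmodule CoverageSketch do
  # Hypothetical helper, not a dsxir function: returns true when the
  # example's data keys cover every declared input and output field.
  def covers?(example_data, input_fields, output_fields) do
    required = MapSet.new(input_fields ++ output_fields)
    available = MapSet.new(Map.keys(example_data))
    MapSet.subset?(required, available)
  end
end

example = %{question: "What is 2 + 2?", answer: "4"}

# Every declared field is present, so the example is eligible for this predictor.
eligible = CoverageSketch.covers?(example, [:question], [:answer])

# :context is missing, so this predictor would get no labeled demo from the example.
skipped = CoverageSketch.covers?(example, [:question, :context], [:answer])
```

Here eligible is true and skipped is false. Extra keys in the example do not prevent a match, because only the declared fields need to be covered.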

Diversity is delivered by pushing a Dsxir.Settings.context/2 frame that swaps the resolved :lm config tuple with one carrying the diversity keywords. The LM dispatcher reads :lm from settings and merges per-call opts on top, so the temperature lever reaches the wire protocol.
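The merge order described above can be sketched with plain keyword lists. This is only an illustration of "per-call opts on top"; the actual shape of the :lm config tuple is not shown in this document, and the :max_tokens and :nonce keys are assumptions made for the example.

```elixir
# Hypothetical base LM options, as they might be resolved from settings.
base_opts = [temperature: 0.0, max_tokens: 256]

# Diversity keywords for one example in one round.
# The :nonce key name and its {round, example_index} shape are assumptions.
diversity_opts = [temperature: 1.0, cache: false, nonce: {2, 7}]

# Keyword.merge/2 lets the second list win on duplicate keys,
# so the diversity temperature replaces the base temperature.
effective_opts = Keyword.merge(base_opts, diversity_opts)

temperature = Keyword.get(effective_opts, :temperature)
max_tokens = Keyword.get(effective_opts, :max_tokens)
```

In this sketch temperature resolves to the diversity value while max_tokens keeps its base value, which is the behavior the paragraph above relies on.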

Options

  • :max_labeled_demos (default 4) — cap on phase 1 demos per predictor.
  • :max_bootstrapped_demos (default 4) — cap on phase 2 demos per predictor.
  • :max_rounds (default 1) — number of bootstrap passes over the trainset.
  • :threshold (default 1.0) — coerced metric must meet or exceed this to keep the trace. Accepted threshold types: true | false | integer() | float(). Booleans coerce to 1.0 / 0.0. Other values raise FunctionClauseError during option parsing — bootstrap is a fail-fast operation on bad configuration.
  • :max_errors (default 10) — aggregate cap on per-example errors. Exceeding this cap returns a framework-classed error.
  • :deterministic (default false) — when true, phase 1 selection and phase 2 trainset order are both hash-stable. Phase 2 LM outputs are still nondeterministic via temperature.
  • :diversity_temperature (default 1.0) — temperature forwarded as a per-call opt during phase 2.
  • :degraded_demos (default :exclude) — controls whether trace entries flagged degraded: true are eligible for the bootstrapped pool. :exclude drops them silently; :include keeps them. An entry is degraded when its resolved inputs include any nil from an upstream optional skip — see Dsxir.Trace.Entry.
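The :threshold coercion rules listed above can be sketched as a multi-clause function. ThresholdSketch is a hypothetical stand-in for the option parser, written only to show why an unsupported value fails fast with FunctionClauseError.

```elixir
defmodule ThresholdSketch do
  # Hypothetical stand-in for the coercion step, not the dsxir parser itself.
  def coerce(true), do: 1.0
  def coerce(false), do: 0.0
  def coerce(n) when is_integer(n), do: n * 1.0
  def coerce(f) when is_float(f), do: f
  # No catch-all clause: any other value raises FunctionClauseError.
end

from_bool = ThresholdSketch.coerce(true)
from_int = ThresholdSketch.coerce(1)
from_float = ThresholdSketch.coerce(0.8)
```

With these clauses, a value such as the string "0.8" matches nothing and raises immediately, rather than being silently treated as a number.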

Returned stats

%{
  labeled_demos: non_neg_integer(),
  bootstrapped_demos: non_neg_integer(),
  predictor_count: non_neg_integer(),
  rounds: non_neg_integer(),
  error_count: non_neg_integer(),
  max_errors: non_neg_integer(),
  threshold: float()
}

Errors

Per-example raises are caught and stamped with path: [:bootstrap_few_shot, :"round_R", :"example_I"]. When error_count > max_errors, compile/4 returns {:error, %Dsxir.Errors.Framework.OptimizerError{optimizer: __MODULE__, inner: aggregate}} where inner is an aggregate produced via Splode's to_class helper on Dsxir.Errors. Callers can traverse per-predictor sub-errors via Splode's traverse_errors helper.
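A small sketch of the path stamp and the cap check follows. ErrorCapSketch is a hypothetical module. It assumes R and I in the path are substituted with the round and example numbers, and it does not assume whether example indices start at 0 or 1.

```elixir
defmodule ErrorCapSketch do
  # Builds a path of the documented shape from a round and an example index.
  def path(round, example_index) do
    [:bootstrap_few_shot, :"round_#{round}", :"example_#{example_index}"]
  end

  # The documented condition is strictly greater-than, so hitting the cap
  # exactly does not yet trigger the aggregate error.
  def over_cap?(error_count, max_errors), do: error_count > max_errors
end

stamp = ErrorCapSketch.path(1, 3)
at_cap = ErrorCapSketch.over_cap?(10, 10)
past_cap = ErrorCapSketch.over_cap?(11, 10)
```

Under the default :max_errors of 10, this reading means the eleventh per-example error is the one that turns the run into an {:error, ...} result.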

Trainset hash

metadata.trainset_hash is :crypto.hash(:sha256, :erlang.term_to_binary(trainset)) |> Base.encode16(case: :lower).
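The expression above can be evaluated directly in IEx. The trainset here is a hypothetical list of plain maps standing in for Dsxir.Example structs.

```elixir
# Hypothetical trainset; real callers would pass Dsxir.Example structs.
trainset = [
  %{question: "What is 2 + 2?", answer: "4"},
  %{question: "What is 3 + 3?", answer: "6"}
]

# Same expression as in the documentation: SHA-256 over the
# external term format, hex-encoded in lowercase.
hash =
  :crypto.hash(:sha256, :erlang.term_to_binary(trainset))
  |> Base.encode16(case: :lower)

# Any change to the trainset, including reordering, changes the binary and so the hash.
reordered_hash =
  :crypto.hash(:sha256, :erlang.term_to_binary(Enum.reverse(trainset)))
  |> Base.encode16(case: :lower)
```

The result is a 64-character lowercase hex string. Because the hash is taken over the serialized term, it is sensitive to example order as well as content.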

Session mode

Dsxir.OptimizerSession.compile(BootstrapFewShot, ...) walks the trainset one example at a time via step/6, halting with :rounds_exhausted after all max_rounds passes complete. Session mode preserves the per-round, per-example scheduling but does not bail out early when the pool fills; the trial budget matches max_rounds * length(trainset).

Functions

init_session(student, trainset, metric, opts)

@spec init_session(
  Dsxir.Program.t(),
  [Dsxir.Example.t()],
  nil | Dsxir.Metric.t(),
  keyword()
) ::
  {:ok, Dsxir.Optimizer.BootstrapFewShot.Sampler.t(), pos_integer()}
  | {:error, Exception.t()}

Prepare a resumable BootstrapFewShot session.

Performs phase 1 (labeled demo selection) eagerly and seeds the per-round example queue used by step/6. The returned planned-trial count is max_rounds * length(trainset). This is an upper bound: errors do not reduce the budget, but they do increment the session's error counter.

step(sampler, idx, program, trainset, metric, opts)

Run a single example through one round of BootstrapFewShot.

When the current round's queue is empty and round < max_rounds, advances to the next round and recurses on the freshly prepared queue. When the queue is empty and round >= max_rounds, halts with :rounds_exhausted.
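The queue behavior described above can be sketched without any LM calls. RoundQueueSketch and its state map are hypothetical and do not reflect the real Sampler struct. The sketch also assumes examples are queued in index order, which the real sampler may not do.

```elixir
defmodule RoundQueueSketch do
  # Hypothetical sampler state: current round, pending example indices,
  # the round limit, and the trainset size.
  def new(trainset_size, max_rounds) do
    %{round: 1, queue: indices(trainset_size), max_rounds: max_rounds, size: trainset_size}
  end

  # Queue has work: hand out the next example index for the current round.
  def next(%{queue: [idx | rest]} = s), do: {:run, idx, s.round, %{s | queue: rest}}

  # Queue is empty and rounds remain: refill and recurse on the fresh queue.
  def next(%{queue: [], round: r, max_rounds: m} = s) when r < m do
    next(%{s | round: r + 1, queue: indices(s.size)})
  end

  # Queue is empty and no rounds remain: halt.
  def next(%{queue: []} = s), do: {:halt, :rounds_exhausted, s}

  # Counts how many trials are handed out before the session halts.
  def count_trials(state, n \\ 0) do
    case next(state) do
      {:run, _idx, _round, new_state} -> count_trials(new_state, n + 1)
      {:halt, :rounds_exhausted, _state} -> n
    end
  end

  defp indices(size), do: Enum.to_list(0..(size - 1)//1)
end

trials = RoundQueueSketch.count_trials(RoundQueueSketch.new(3, 2))
```

With a trainset of 3 examples and 2 rounds, the sketch hands out 6 trials before halting, matching the planned-trial count of max_rounds * length(trainset).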

Exceptions from the example pipeline are caught and reported as status: :error trials rather than crashing the session.