Dsxir.Optimizer.SIMBA.Evaluator (dsxir v0.5.0)

Trace-capturing parallel runner for Dsxir.Optimizer.SIMBA: DSPy's wrap_program, generalized to a batch of {program, example} pairs.

Each pair runs in its own worker under Dsxir.TaskSupervisor, and the worker replays the caller's Dsxir.Settings snapshot. Inside the worker, the forward pass is wrapped in Dsxir.with_trace/1. The trace slot is process-local, so each worker captures only its own execution. When sampling is enabled, the forward pass runs under a scoped, diverse LM. The metric then scores the prediction, and ScoreWithFeedback is drained into metadata.
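Why a process-local slot isolates traces can be shown with a minimal sketch. It uses the process dictionary as a stand-in and is not Dsxir's implementation: the module TraceSlotSketch, its with_trace/1 and record/1 functions, and the :trace_slot_sketch key are all hypothetical names chosen for illustration.

```elixir
defmodule TraceSlotSketch do
  # Hypothetical stand-in for a process-local trace slot (not Dsxir code).
  @key :trace_slot_sketch

  # Runs `fun` with a fresh slot and returns {result, entries_in_call_order}.
  def with_trace(fun) when is_function(fun, 0) do
    previous = Process.put(@key, [])

    try do
      result = fun.()
      {result, Enum.reverse(Process.get(@key, []))}
    after
      # Restore whatever the slot held before this call.
      if previous == nil, do: Process.delete(@key), else: Process.put(@key, previous)
    end
  end

  # Appends one entry to the slot of the calling process only.
  def record(entry) do
    Process.put(@key, [entry | Process.get(@key, [])])
    :ok
  end

  # Each input runs in its own task process, so each trace holds one entry.
  def run_all(inputs) do
    inputs
    |> Task.async_stream(
      fn input ->
        with_trace(fn ->
          record({:step, input})
          input * 2
        end)
      end,
      ordered: true
    )
    |> Enum.map(fn {:ok, result_and_trace} -> result_and_trace end)
  end
end
```

Because every task is a separate process with its own dictionary, entries recorded by one worker never appear in another worker's trace or in the caller's slot.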

Results are order-aligned with the input pairs (ordered: true). SIMBA treats a per-pair failure as a score of 0.0 rather than dropping the pair or returning nil: an exception from the recognised LM/adapter/invalid class set, or a worker :exit (timeout or kill), yields a zero-score record, so every input pair produces exactly one record.
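The failure policy above can be approximated with the standard library's Task.async_stream/3. This is a sketch under stated assumptions, not the module's source: ZeroScoreSketch is a hypothetical name, plain one-argument functions stand in for programs, the metric is assumed to take (example, prediction) and return a number, the :timeout option is invented for the example, and ArgumentError and RuntimeError are placeholders for the recognised exception classes, which this page does not list. Storing the failure reason in metadata is likewise an illustrative choice.

```elixir
defmodule ZeroScoreSketch do
  # Illustrative only: plain functions stand in for programs, and the
  # rescued exception list is a placeholder for the recognised class set.

  def run(pairs, metric, opts \\ []) do
    max_concurrency = Keyword.get(opts, :num_threads, 4)
    # :timeout is a hypothetical option added for this sketch.
    timeout = Keyword.get(opts, :timeout, 5_000)

    pairs
    |> Task.async_stream(fn pair -> score_pair(pair, metric) end,
      max_concurrency: max_concurrency,
      ordered: true,
      timeout: timeout,
      on_timeout: :kill_task
    )
    # ordered: true keeps results in input order, so zipping re-attaches
    # each example to its result even when the worker exited.
    |> Enum.zip(pairs)
    |> Enum.map(fn
      {{:ok, record}, _pair} -> record
      {{:exit, reason}, {_program, example}} -> zero_record(example, {:exit, reason})
    end)
  end

  # Runs one pair; a rescued exception becomes a zero-score record.
  defp score_pair({program, example}, metric) do
    prediction = program.(example)
    score = metric.(example, prediction) * 1.0

    %{prediction: prediction, trace: [], score: score, example: example, metadata: nil}
  rescue
    error in [ArgumentError, RuntimeError] -> zero_record(example, {:error, error})
  end

  defp zero_record(example, reason) do
    %{prediction: nil, trace: [], score: 0.0, example: example, metadata: reason}
  end
end
```

With on_timeout: :kill_task, a worker that exceeds the timeout surfaces as an {:exit, :timeout} element instead of crashing the caller, which is what lets the stream emit one record per input pair.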

Types

trajectory_record()

@type trajectory_record() :: %{
  prediction: Dsxir.Prediction.t() | nil,
  trace: [Dsxir.Trace.Entry.t()],
  score: float(),
  example: Dsxir.Example.t(),
  metadata: term()
}
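As a rough illustration of consuming these records, the sketch below aggregates a list of maps shaped like trajectory_record/0. It makes no Dsxir calls; TrajectorySummarySketch and summarize/1 are hypothetical names, and treating a nil prediction as worth counting separately is an assumption made only for the example.

```elixir
defmodule TrajectorySummarySketch do
  # Works on plain maps with the same keys as trajectory_record/0.
  def summarize(records) when is_list(records) do
    scores = Enum.map(records, & &1.score)
    count = length(scores)

    %{
      count: count,
      # An empty batch is reported with a mean of 0.0 to avoid dividing by zero.
      mean_score: if(count == 0, do: 0.0, else: Enum.sum(scores) / count),
      zero_scored: Enum.count(scores, &(&1 == 0.0)),
      without_prediction: Enum.count(records, &is_nil(&1.prediction))
    }
  end
end
```

Because failed pairs are recorded with a score of 0.0 rather than dropped, the mean here is taken over every input pair, not just the successful ones.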

Functions

run(pairs, metric, opts \\ [])

Runs each {program, example} pair concurrently and returns one trace-capturing trajectory_record/0 per pair, order-aligned with the input.

Options: :num_threads (maximum concurrency, default 4); :sampling and :temperature (enable a scoped diverse LM for the forward pass).