Dsxir.Optimizer.GEPA.Evaluator (dsxir v0.2.0)

Parallel devset evaluation for Dsxir.Optimizer.GEPA. Returns {scores, feedback}: two ordered lists, each aligned position by position with the devset.

Each per-example Dsxir.Metric.apply/4 call runs in its own worker under Dsxir.TaskSupervisor, and each worker replays the caller's Dsxir.Settings snapshot. After the metric runs, the worker drains the GEPA feedback slot via Dsxir.Metric.drain_gepa_feedback/0. Exceptions raised for a single example that fall in the recognised LM/adapter/invalid class set are rescued, and that example yields nil at its position in both lists.

Whole-evaluation failures (e.g. process crashes outside the rescue list) are raised rather than converted to nil; handling them is left to the caller.
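The worker flow described above can be approximated with the standard library alone. The following is a minimal sketch, not the module's implementation: `EvaluatorSketch`, the process-dictionary keys, the 1-arity `metric_fun`, and the two rescued exception modules are all hypothetical stand-ins, and `Task.async_stream/3` is used in place of a supervised task pool so the example is self-contained.

```elixir
defmodule EvaluatorSketch do
  @moduledoc false

  # Hypothetical process-dictionary keys standing in for the settings
  # snapshot and the feedback slot; the real storage is not shown here.
  @settings_key :sketch_settings
  @feedback_key :sketch_feedback

  # `metric_fun` is a hypothetical 1-arity function returning a score.
  def run(devset, metric_fun, num_threads \\ 4) do
    # Capture in the caller: worker processes do not inherit its dictionary.
    snapshot = Process.get(@settings_key)

    devset
    |> Task.async_stream(
      fn example -> evaluate_one(example, metric_fun, snapshot) end,
      max_concurrency: num_threads,
      ordered: true,
      timeout: :infinity
    )
    |> Enum.map(fn {:ok, pair} -> pair end)
    |> Enum.unzip()
  end

  defp evaluate_one(example, metric_fun, snapshot) do
    # Replay the caller's snapshot inside the worker process.
    Process.put(@settings_key, snapshot)
    score = metric_fun.(example)
    # Drain the slot: read the feedback and clear it in one step.
    {score, Process.delete(@feedback_key)}
  rescue
    # Assumed rescue list; the real set of exception modules is not given.
    _ in [ArgumentError, RuntimeError] -> {nil, nil}
  end
end

metric = fn
  :bad ->
    raise ArgumentError, "simulated failure"

  n ->
    Process.put(:sketch_feedback, "saw #{n}")
    n * 1.0
end

EvaluatorSketch.run([1, :bad, 3], metric)
# => {[1.0, nil, 3.0], ["saw 1", nil, "saw 3"]}
```

Because `ordered: true` preserves input order, a rescued failure shows up as nil at the failing example's position in both lists. An exception outside the rescued list is not converted to nil; in this sketch it crashes the task, and the linked caller with it.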

Types

predictor_feedback()

@type predictor_feedback() :: %{required(atom()) => String.t() | nil}
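As an illustration of this type, the sketch below builds one value and renders its non-nil entries. It assumes the atom keys are predictor names, which the type name suggests but this page does not state; `PredictorFeedbackSketch`, `:generate`, and `:rerank` are hypothetical.

```elixir
defmodule PredictorFeedbackSketch do
  # Renders the non-nil entries of a map shaped like predictor_feedback().
  # Entries are sorted so the output does not depend on map iteration order.
  def describe(per_predictor) when is_map(per_predictor) do
    per_predictor
    |> Enum.reject(fn {_name, text} -> is_nil(text) end)
    |> Enum.map(fn {name, text} -> "#{name}: #{text}" end)
    |> Enum.sort()
    |> Enum.join("; ")
  end
end

PredictorFeedbackSketch.describe(%{generate: "cite the source", rerank: nil})
# => "generate: cite the source"
```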

Functions

run(program, devset, metric, num_threads \\ 4)

@spec run(
  Dsxir.Program.t(),
  [Dsxir.Example.t()],
  Dsxir.Metric.t(),
  num_threads :: pos_integer()
) :: {[float() | nil], [predictor_feedback() | String.t() | nil]}
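Since both returned lists are aligned with the devset, a caller can aggregate scores and locate failed examples by position. The sketch below works on hand-written data in the shape the spec describes; `ResultSketch` and the literal values are hypothetical and not output from the library.

```elixir
defmodule ResultSketch do
  # Mean over the examples that produced a score; nil when none did.
  def mean_score(scores) do
    case Enum.reject(scores, &is_nil/1) do
      [] -> nil
      valid -> Enum.sum(valid) / length(valid)
    end
  end

  # Zero-based positions of examples whose score is nil.
  def failed_indexes(scores) do
    for {nil, index} <- Enum.with_index(scores), do: index
  end
end

scores = [0.5, nil, 1.0]
feedback = [%{generate: "cite the source"}, nil, "good answer"]

ResultSketch.mean_score(scores)
# => 0.75
ResultSketch.failed_indexes(scores)
# => [1]
Enum.zip(scores, feedback)
# => [{0.5, %{generate: "cite the source"}}, {nil, nil}, {1.0, "good answer"}]
```

Whether nil scores should be skipped or counted as zero depends on the caller; skipping them here is one possible choice.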

run_or_nils(program, devset, metric)

@spec run_or_nils(Dsxir.Program.t(), [Dsxir.Example.t()], Dsxir.Metric.t() | nil) ::
  {[float() | nil], [predictor_feedback() | String.t() | nil]}

Evaluator wrapper that returns positionally aligned lists of nil, one entry per devset example, when the metric is nil or the devset is empty; otherwise delegates to run/4.
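The guard logic can be sketched as follows. This is an assumption-laden illustration rather than the actual source: `RunOrNilsSketch` is hypothetical, `run_fun` is a hypothetical stand-in for run/4 with the program already applied, and returning two empty lists for an empty devset is inferred from the alignment requirement.

```elixir
defmodule RunOrNilsSketch do
  # `run_fun` is a hypothetical 2-arity stand-in for the real evaluation call.
  def run_or_nils(devset, metric, run_fun) do
    if is_nil(metric) or devset == [] do
      # One nil per example keeps both lists aligned with the devset.
      nils = List.duplicate(nil, length(devset))
      {nils, nils}
    else
      run_fun.(devset, metric)
    end
  end
end

never_called = fn _devset, _metric -> raise "not reached" end

RunOrNilsSketch.run_or_nils([:a, :b], nil, never_called)
# => {[nil, nil], [nil, nil]}
RunOrNilsSketch.run_or_nils([], :some_metric, never_called)
# => {[], []}
```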