Dsxir.Optimizer.SIMBA.Strategy.AppendRule
(dsxir v0.5.0)
SIMBA strategy that reflects on a better-vs-worse trajectory pair and appends per-predictor advice to each named predictor's instruction. It makes one LM call.
Skips when the good trajectory scores at or below the mini-batch 10th percentile, or when the bad trajectory scores at or above the 90th percentile. When the good and bad scores tie (the degenerate case), the weaker-but-higher-percentile trajectory is blanked before rendering so the contrast still favours one side.
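The skip rule can be illustrated with a minimal sketch. The module and function names below are hypothetical and not part of dsxir, and the linear-interpolation percentile is an assumption; the library may compute percentiles differently.

```elixir
defmodule AppendRuleGateSketch do
  # Hypothetical helper, not part of dsxir: linear-interpolation percentile
  # over a non-empty list of numeric scores, with p in 0..100.
  def percentile(scores, p) when scores != [] and p >= 0 and p <= 100 do
    sorted = Enum.sort(scores)
    rank = p / 100 * (length(sorted) - 1)
    lower = trunc(rank)
    upper = min(lower + 1, length(sorted) - 1)
    lo = Enum.at(sorted, lower)
    hi = Enum.at(sorted, upper)
    lo + (hi - lo) * (rank - lower)
  end

  # Returns :skip when the pair offers no usable contrast, :proceed otherwise.
  def gate(good_score, bad_score, batch_scores) do
    p10 = percentile(batch_scores, 10)
    p90 = percentile(batch_scores, 90)

    # Skip if the "good" side is among the worst of the mini-batch,
    # or the "bad" side is among the best.
    if good_score <= p10 or bad_score >= p90, do: :skip, else: :proceed
  end
end
```

For a mini-batch scored `[0.0, 0.2, 0.4, 0.6, 0.8, 1.0]`, a pair scored 0.8 and 0.2 would proceed, while a pair whose bad trajectory scored 0.95 would be skipped under these assumptions.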
The reflective LM runs the OfferFeedback signature through the adapter via
Dsxir.Predictor.Predict.forward/4 (the same mechanism that
Dsxir.Predictor.Refine uses), and the parsed advice is appended to each
predictor's effective instruction through instructions_override. If the
reflective LM fails or its advice cannot be parsed, the strategy degrades to
:skip rather than crashing the optimization step.
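The degrade-to-:skip behaviour might look roughly like the sketch below. Everything here is a stand-in: `reflect` is a hypothetical zero-arity function replacing the real reflective LM call, the plain map of instructions stands in for instructions_override, and the `{:ok, map}` success shape is an assumption rather than the strategy's actual return value.

```elixir
defmodule AppendRuleFallbackSketch do
  # `reflect` is assumed to return {:ok, advice} or {:error, reason}, where
  # advice maps predictor names to advice strings. Anything else means :skip.
  def run(instructions, reflect)
      when is_map(instructions) and is_function(reflect, 0) do
    with {:ok, advice} when is_map(advice) and map_size(advice) > 0 <-
           safe_call(reflect) do
      {:ok, append_advice(instructions, advice)}
    else
      _ -> :skip
    end
  end

  # Converts a raised exception into an error tuple so the caller never crashes.
  defp safe_call(fun) do
    fun.()
  rescue
    error -> {:error, error}
  end

  # Appends advice only to predictors that received a non-empty string.
  defp append_advice(instructions, advice) do
    Map.new(instructions, fn {name, text} ->
      case Map.get(advice, name) do
        rule when is_binary(rule) and rule != "" -> {name, text <> "\n\n" <> rule}
        _ -> {name, text}
      end
    end)
  end
end
```

Routing every failure through a single `else` clause keeps the optimization loop free of error handling: the caller sees either updated instructions or :skip.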