Dsxir.Optimizer.SIMBA.Bucket
(dsxir v0.5.0)
Per-example trajectory bucket with variance statistics.
A bucket groups all trajectory records for one trainset example and computes gap statistics used to prioritise which examples are processed first during candidate generation.
Summary
Functions
Builds a bucket from a list of records for one example.
Computes {p10, p90} over a flat list of numeric scores using linear interpolation (numpy-style).
Sorts a list of buckets descending by {max_to_min_gap, max_score, max_to_avg_gap}.
Types
Functions
@spec from_records([trajectory_record()]) :: t()
Builds a bucket from a list of records for one example.
Records are sorted by score in descending order, and three statistics are computed: max_to_min_gap, max_score, and max_to_avg_gap.
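The construction described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the module name `BucketSketch`, the struct layout, and the `:score` key on each record are assumptions, and the two gaps are assumed to mean "max minus min" and "max minus mean", as their names suggest.

```elixir
defmodule BucketSketch do
  # Hypothetical stand-in for the real bucket struct. The three stat field
  # names come from the docs; everything else is assumed for illustration.
  defstruct records: [], max_to_min_gap: 0.0, max_score: 0.0, max_to_avg_gap: 0.0

  # Assumes a non-empty list of maps, each with a numeric :score key.
  def from_records([_ | _] = records) do
    # Highest-scoring record first.
    sorted = Enum.sort_by(records, & &1.score, :desc)
    scores = Enum.map(sorted, & &1.score)

    top = hd(scores)
    bottom = List.last(scores)
    avg = Enum.sum(scores) / length(scores)

    %__MODULE__{
      records: sorted,
      max_to_min_gap: top - bottom,
      max_score: top,
      max_to_avg_gap: top - avg
    }
  end
end
```

Under these assumptions, records scoring 0.5, 1.0 and 0.0 produce a bucket ordered 1.0, 0.5, 0.0 with a max-to-min gap of 1.0 and a max-to-average gap of 0.5.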
Computes {p10, p90} over a flat list of numeric scores using linear interpolation (numpy-style).
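For percentile p over xs sorted ascending, of length n, with zero-based indices: rank = p/100 * (n - 1). The result interpolates linearly between xs[floor(rank)] and xs[ceil(rank)], weighted by the fractional part of rank.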
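The rank formula above can be illustrated with a short sketch. The module and function names (`PercentileSketch`, `percentile/2`, `p10_p90/1`) are hypothetical and chosen only for this example; the input is assumed to be a non-empty list of numbers.

```elixir
defmodule PercentileSketch do
  # Percentile p (0..100) of a non-empty list of numbers, using linear
  # interpolation between the two closest ranks.
  def percentile(scores, p) when scores != [] do
    xs = Enum.sort(scores)
    # Fractional zero-based position of the percentile in the sorted list.
    rank = p / 100 * (length(xs) - 1)
    lo = floor(rank)
    hi = ceil(rank)
    frac = rank - lo

    x_lo = Enum.at(xs, lo)
    x_hi = Enum.at(xs, hi)
    x_lo + (x_hi - x_lo) * frac
  end

  # Returns the {p10, p90} pair described in the docs.
  def p10_p90(scores), do: {percentile(scores, 10), percentile(scores, 90)}
end
```

For the scores 1 through 5, the p10 rank is 0.4 and the p90 rank is 3.6, so the sketch yields approximately {1.4, 4.6}, which agrees with the default linear method of `numpy.percentile`.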
Sorts a list of buckets descending by {max_to_min_gap, max_score, max_to_avg_gap}.
Examples with more variance and a higher score ceiling are processed first.
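A descending sort on that three-part key could look like the sketch below. It relies on Elixir comparing same-size tuples element by element, so ties on max_to_min_gap fall through to max_score and then to max_to_avg_gap. The module name `BucketSortSketch` and the use of plain maps with an `:id` field are assumptions made for illustration.

```elixir
defmodule BucketSortSketch do
  # Assumes each bucket exposes the three stats as fields; plain maps are
  # used in place of the real bucket struct.
  def sort_buckets(buckets) do
    Enum.sort_by(
      buckets,
      # Tuples of equal size compare element by element, left to right.
      fn b -> {b.max_to_min_gap, b.max_score, b.max_to_avg_gap} end,
      :desc
    )
  end
end

a = %{id: :a, max_to_min_gap: 0.2, max_score: 0.9, max_to_avg_gap: 0.1}
b = %{id: :b, max_to_min_gap: 0.5, max_score: 0.6, max_to_avg_gap: 0.3}
c = %{id: :c, max_to_min_gap: 0.5, max_score: 0.8, max_to_avg_gap: 0.2}

# b and c tie on the largest gap, so the higher max_score puts c ahead of b.
ordered_ids = Enum.map(BucketSortSketch.sort_buckets([a, b, c]), & &1.id)
```

Note that bucket `a` has the highest max_score but is placed last, because the gap is compared before the score ceiling.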