viva_math/scheduler

Learning-rate / coefficient schedulers.

Pure functions (step, config) -> Float with no internal state.

All schedulers are total: out-of-range inputs are clamped to a sensible value instead of raising an error, which makes them safe to call inside hot training loops.
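Because the schedulers hold no state, a training loop only needs to pass the current step index. The following is a minimal sketch, assuming the package is imported as viva_math/scheduler; lr_table and the hyperparameter values are hypothetical and chosen only for illustration.

```gleam
import gleam/float
import gleam/int
import gleam/io
import gleam/list
import viva_math/scheduler

/// Hypothetical helper: pairs each step with its exponentially decayed rate.
pub fn lr_table(steps: List(Int)) -> List(#(Int, Float)) {
  // Illustrative values, not recommendations.
  let base_lr = 0.1
  let gamma = 0.99
  // No scheduler object is created or advanced; the step index is the
  // only thing that changes between calls.
  list.map(steps, fn(step) {
    #(step, scheduler.exponential(base_lr, step, gamma))
  })
}

pub fn main() {
  lr_table([0, 1, 2, 100, 1000])
  |> list.each(fn(row) {
    let #(step, lr) = row
    io.println(
      "step " <> int.to_string(step) <> ": lr = " <> float.to_string(lr),
    )
  })
}
```

Calling lr_table twice with the same steps yields the same rows, so a schedule can be recomputed after a checkpoint restore without saving any scheduler state.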

Types

pub type OneCycleConfig {
  OneCycleConfig(
    max_lr: Float,
    initial_lr: Float,
    final_lr: Float,
    total_steps: Int,
    pct_start: Float,
  )
}

Constructors

  • OneCycleConfig(
      max_lr: Float,
      initial_lr: Float,
      final_lr: Float,
      total_steps: Int,
      pct_start: Float,
    )

    Arguments

    pct_start

    Fraction of total_steps spent increasing the learning rate (one_cycle_defaults uses 0.3).
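As an illustration, a configuration can be built directly with the record constructor and its field labels. This is a sketch only: example_config and every numeric value in it are hypothetical.

```gleam
import viva_math/scheduler.{type OneCycleConfig, OneCycleConfig}

/// Hypothetical configuration: a 1000-step cycle that spends 25% of its
/// steps increasing the learning rate.
pub fn example_config() -> OneCycleConfig {
  OneCycleConfig(
    max_lr: 0.01,
    initial_lr: 0.0004,
    final_lr: 0.000001,
    total_steps: 1000,
    pct_start: 0.25,
  )
}
```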

Values

pub fn cosine_annealing(
  base_lr: Float,
  step: Int,
  t_max: Int,
  min_lr: Float,
) -> Float

Cosine annealing: smooth cosine decay from base_lr to min_lr over t_max steps.

lr(t) = min_lr + ½(base_lr - min_lr)·(1 + cos(π · t / T_max)), where t = step and T_max = t_max
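To make the formula concrete, the sketch below samples the schedule at the start, midpoint, and end of a 100-step run. cosine_samples and the chosen rates are hypothetical. By the formula above, the three values should be close to 0.1, 0.0505, and 0.001, since cos(0) = 1, cos(π/2) = 0, and cos(π) = -1.

```gleam
import gleam/list
import viva_math/scheduler

/// Hypothetical helper: samples cosine annealing at steps 0, 50, and 100.
pub fn cosine_samples() -> List(Float) {
  let base_lr = 0.1
  let min_lr = 0.001
  let t_max = 100
  [0, 50, 100]
  |> list.map(fn(step) {
    scheduler.cosine_annealing(base_lr, step, t_max, min_lr)
  })
}
```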

pub fn cosine_warm_restarts(
  base_lr: Float,
  step: Int,
  period: Int,
  t_mult: Int,
  min_lr: Float,
) -> Float

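Cosine annealing with warm restarts: the rate follows a cosine decay toward min_lr and resets to base_lr at the end of each cycle. The first cycle lasts period steps and each later cycle is t_mult times longer than the previous one, so the period doubles at each restart when t_mult = 2.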
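The restart positions implied by a growing period can be worked out by hand. The sketch below does this in plain Gleam, assuming cycle k lasts period · t_mult^k steps; that convention is an assumption about the implementation, and restart_steps is a hypothetical helper that is not part of this module.

```gleam
import gleam/list

/// Hypothetical helper: step indices of the first `count` restarts,
/// assuming cycle k lasts period * t_mult^k steps.
pub fn restart_steps(period: Int, t_mult: Int, count: Int) -> List(Int) {
  do_restart_steps(period, t_mult, count, 0, [])
}

fn do_restart_steps(
  length: Int,
  t_mult: Int,
  remaining: Int,
  elapsed: Int,
  acc: List(Int),
) -> List(Int) {
  case remaining <= 0 {
    True -> list.reverse(acc)
    False -> {
      // The current cycle ends `length` steps after the previous restart.
      let boundary = elapsed + length
      do_restart_steps(length * t_mult, t_mult, remaining - 1, boundary, [
        boundary,
        ..acc
      ])
    }
  }
}
```

With period = 10 and t_mult = 2 the cycles last 10, 20, and 40 steps, so restart_steps(10, 2, 3) returns [10, 30, 70]. With t_mult = 1 the restarts are evenly spaced.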

pub fn exponential(
  base_lr: Float,
  step: Int,
  gamma: Float,
) -> Float

Exponential decay: base_lr · γ^step, where γ = gamma.

pub fn inverse(base_lr: Float, step: Int, tau: Float) -> Float

Inverse decay: base_lr / (1 + step / τ), where τ = tau.

pub fn inverse_sqrt(
  base_lr: Float,
  step: Int,
  tau: Float,
) -> Float

Inverse square-root decay: base_lr / √(1 + step / τ), where τ = tau. Common in transformer training.

pub fn linear_warmup(
  base_lr: Float,
  step: Int,
  warmup_steps: Int,
) -> Float

Linear warmup: ramps linearly from 0 to base_lr over the first warmup_steps steps.
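Since each scheduler is a plain function, two of them can be combined by calling both and merging the results. The sketch below pairs linear_warmup with inverse_sqrt by taking the smaller value at each step. This composition is an illustration rather than something the module provides; warmup_then_inverse_sqrt is a hypothetical name, and the sketch assumes linear_warmup stays at base_lr once step exceeds warmup_steps.

```gleam
import gleam/float
import viva_math/scheduler

/// Hypothetical composition: warm up linearly, then follow inverse
/// square-root decay once it falls below the warmup ramp.
pub fn warmup_then_inverse_sqrt(
  base_lr: Float,
  step: Int,
  warmup_steps: Int,
  tau: Float,
) -> Float {
  let warm = scheduler.linear_warmup(base_lr, step, warmup_steps)
  let decayed = scheduler.inverse_sqrt(base_lr, step, tau)
  // Early on the ramp is the smaller value; later the decay takes over.
  float.min(warm, decayed)
}
```

Taking the minimum keeps the combined schedule continuous at the crossover, because both curves have the same value at the step where they meet.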

pub fn multi_step_decay(
  base_lr: Float,
  step: Int,
  milestones: List(Int),
  gamma: Float,
) -> Float

Multi-step decay: multiplies the rate by γ (gamma) each time step passes a milestone in milestones.
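The rule can be read as base_lr · γ^k, where k is the number of milestones already passed. The following plain-Gleam sketch spells that out. It is not the module's implementation: multi_step_sketch is a hypothetical name, and treating a milestone as passed once step >= milestone is an assumption.

```gleam
import gleam/float
import gleam/int
import gleam/list

/// Hypothetical re-derivation of multi-step decay: base_lr * gamma^k,
/// where k counts the milestones at or below `step`.
pub fn multi_step_sketch(
  base_lr: Float,
  step: Int,
  milestones: List(Int),
  gamma: Float,
) -> Float {
  // Assumption: a milestone counts as passed once step >= milestone.
  let passed =
    milestones
    |> list.filter(fn(milestone) { step >= milestone })
    |> list.length
  case float.power(gamma, int.to_float(passed)) {
    Ok(factor) -> base_lr *. factor
    // float.power returns a Result; with a non-negative whole exponent
    // no error is expected, so this branch is only a defensive fallback.
    Error(Nil) -> base_lr
  }
}
```

For base_lr = 0.1, gamma = 0.1, and milestones [30, 80], the sketch gives about 0.1 before step 30, 0.01 from step 30, and 0.001 from step 80.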

pub fn one_cycle(config: OneCycleConfig, step: Int) -> Float

One-cycle policy: cosine ramp from initial_lr up to max_lr over the first pct_start fraction of total_steps, then cosine anneal down to final_lr.

pub fn one_cycle_defaults(
  max_lr: Float,
  total_steps: Int,
) -> OneCycleConfig

Default one-cycle configuration: 30% warmup (pct_start = 0.3), then anneal.
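A short usage sketch: build the default configuration, override one field with Gleam's record update syntax, and sample the schedule. one_cycle_samples and the values are hypothetical, and the comments describe what the documented behavior suggests rather than verified output.

```gleam
import gleam/list
import viva_math/scheduler

/// Hypothetical helper: samples a 1000-step one-cycle schedule at the
/// start, at the end of a 10% warmup, and at the final step.
pub fn one_cycle_samples() -> List(Float) {
  let defaults = scheduler.one_cycle_defaults(0.01, 1000)
  // Record update: keep every default except the warmup fraction.
  let config = scheduler.OneCycleConfig(..defaults, pct_start: 0.1)
  [0, 100, 1000]
  |> list.map(fn(step) { scheduler.one_cycle(config, step) })
}
```

Under the description above, the middle sample should sit at or near max_lr, with the first and last samples near initial_lr and final_lr respectively.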

pub fn polynomial(
  base_lr: Float,
  step: Int,
  total_steps: Int,
  power: Float,
) -> Float

Polynomial decay: base_lr · (1 - step / total_steps)^power, clamped at zero once step exceeds total_steps.

pub fn step_decay(
  base_lr: Float,
  step: Int,
  step_size: Int,
  gamma: Float,
) -> Float

Step decay: base_lr · γ^⌊step / step_size⌋, where γ = gamma.

The rate drops by a factor of γ every step_size steps.

pub fn triangle(step: Int, period: Int) -> Float

Triangle waveform: 0 → 1 → 0 over period steps.
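Because triangle returns a coefficient rather than a rate, it can be scaled into any range. The sketch below turns it into a cyclical learning rate that oscillates between two bounds. cyclical_lr is a hypothetical helper, and the sketch assumes triangle stays within 0 and 1 as described.

```gleam
import viva_math/scheduler

/// Hypothetical helper: maps the 0 -> 1 -> 0 triangle coefficient onto
/// the range between min_lr and max_lr.
pub fn cyclical_lr(
  min_lr: Float,
  max_lr: Float,
  step: Int,
  period: Int,
) -> Float {
  min_lr +. { max_lr -. min_lr } *. scheduler.triangle(step, period)
}
```

The same pattern works for other coefficients, such as a loss weight that is ramped up and down during training.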
