Dsxir.OptimizerSession (dsxir v0.3.0)

Resumable optimizer session driver.

Wraps a checkpointable Dsxir.Optimizer implementation in a gen_statem that schedules trials sequentially, persists a checkpoint after each trial, and exposes poll/1, await/2, and a synchronous compile/5 wrapper.

Pause, resume, and cancel are soft signals: after a pause or cancel request, the in-flight trial completes and is checkpointed before the status moves to :paused or :cancelled. Crashed sessions can be rehydrated from a persisted checkpoint via resume_session/3.

Note: await/2's timeout is enforced client-side. The underlying :gen_statem.call/3 uses :infinity as the transport timeout so that enqueued awaiters are cleaned up via monitor-and-prune rather than left as stale alias entries on a call-side exit. Callers that need a hard deadline should wrap await/2 in a Task and apply their own timeout.
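The hard-deadline advice above can be sketched with the standard library's Task module. This is a minimal sketch, not part of dsxir: the HardDeadline module and its run/2 function are hypothetical names, and the helper accepts any zero-arity function rather than a session reference.

```elixir
defmodule HardDeadline do
  @moduledoc """
  Hypothetical helper (not part of dsxir): run a function under a hard deadline.
  """

  @doc """
  Runs `fun` in a linked Task. Returns `{:ok, value}` if it finishes within
  `ms` milliseconds, otherwise shuts the task down and returns
  `{:error, :timeout}`.
  """
  def run(fun, ms) when is_function(fun, 0) and is_integer(ms) and ms >= 0 do
    task = Task.async(fun)

    # Task.yield/2 returns nil on timeout; Task.shutdown/2 then unlinks and
    # kills the task, returning a late reply if one arrived in the meantime.
    case Task.yield(task, ms) || Task.shutdown(task, :brutal_kill) do
      {:ok, value} -> {:ok, value}
      # Only reachable when the caller traps exits, because the task is linked.
      {:exit, reason} -> {:error, {:exit, reason}}
      nil -> {:error, :timeout}
    end
  end
end
```

A caller could then wrap the blocking call, for example HardDeadline.run(fn -> Dsxir.OptimizerSession.await(ref, :infinity) end, 30_000). Killing the task only abandons the wait; the session itself keeps running until it is paused or cancelled.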

compile/5 accepts :timeout in opts. It defaults to 5 minutes; the default is read at compile time from the :compile_default_timeout key of the :dsxir application environment (via Application.compile_env/3). On timeout the session is soft-cancelled and {:error, %OptimizerError{inner: :timeout}} is returned to the caller; the session pid keeps running briefly until its in-flight trial completes and the checkpoint is flushed.

Summary

Functions

await(ref, timeout \\ :infinity)
Block until the session reaches a terminal state.

cancel(ref)
Soft-cancel the session: in-flight trial completes and is checkpointed before transitioning to :cancelled.

compile(optimizer, program, trainset, metric, opts)
Synchronous wrapper: start a fresh session and await its terminal result.

delete_session(store_spec, session_id)
Delete a persisted session checkpoint. Idempotent: returns :ok even if absent.

list_sessions(store_spec, filter \\ [])
Return the checkpoint listings from the given store. Filter keys: :status, :optimizer, :updated_since.

pause(ref)
Soft-pause the session: in-flight trial completes and is checkpointed before transitioning to :paused.

poll(ref)
Snapshot the current session state without blocking.

resume(ref)
Resume a paused session. Returns {:error, :not_paused} from any other state.

resume_session(store_spec, session_id, opts \\ [])
Rehydrate a session from a persisted checkpoint.

start_link(opts)
Start a fresh session under Dsxir.OptimizerSession.DynamicSupervisor.

Functions

await(ref, timeout \\ :infinity)

@spec await(:gen_statem.server_ref(), timeout()) :: {:ok, map()} | {:error, term()}

Block until the session reaches a terminal state.

timeout is enforced client-side via a wrapping Task so abandoned awaiters are pruned via monitor on the caller. Returns {:ok, result} on :completed, {:error, _} on :failed/:paused/:cancelled, {:error, :timeout} on deadline, or {:error, {:session_down, reason}} if the session crashed.
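The return shapes listed above lend themselves to a single pattern match. The following is a small sketch of how a caller might classify them; the AwaitResult module and the tags it returns (:completed, :deadline_exceeded, :crashed, :not_completed) are hypothetical names chosen for illustration, not part of dsxir.

```elixir
defmodule AwaitResult do
  @moduledoc """
  Hypothetical helper (not part of dsxir): classify the documented return
  values of await/2.
  """

  # Session reached :completed.
  def classify({:ok, result}) when is_map(result), do: {:completed, result}

  # Client-side deadline elapsed; the session may still be running.
  def classify({:error, :timeout}), do: :deadline_exceeded

  # The session process crashed while we were waiting.
  def classify({:error, {:session_down, reason}}), do: {:crashed, reason}

  # :failed, :paused, or :cancelled; the error term is passed through as-is.
  def classify({:error, other}), do: {:not_completed, other}
end
```

The specific clauses come before the catch-all {:error, other} clause, so a timeout or a crashed session is never mistaken for an ordinary failure.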

cancel(ref)

@spec cancel(:gen_statem.server_ref()) :: :ok | {:error, term()}

Soft-cancel the session: in-flight trial completes and is checkpointed before transitioning to :cancelled.

compile(optimizer, program, trainset, metric, opts)

@spec compile(
  module(),
  Dsxir.Program.t(),
  [Dsxir.Example.t()],
  nil | Dsxir.Metric.t(),
  keyword()
) :: {:ok, Dsxir.Program.t(), map()} | {:error, term()}

Synchronous wrapper: start a fresh session and await its terminal result.

:timeout (in opts) defaults to Application.compile_env(:dsxir, :compile_default_timeout, :timer.minutes(5)). On deadline, soft-cancels the session and returns {:error, %Errors.Framework.OptimizerError{inner: :timeout}}.
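As a configuration sketch, the default could be raised in the host application's config file. The 15-minute value is an arbitrary assumption; only the :dsxir application and the :compile_default_timeout key come from the text above.

```elixir
# config/config.exs
import Config

# Assumed value for illustration: raise the compile/5 default timeout
# from 5 minutes to 15 minutes (expressed in milliseconds).
config :dsxir, compile_default_timeout: :timer.minutes(15)
```

Because the value is read with Application.compile_env/3, it is fixed when dsxir is compiled, so the dependency will likely need to be recompiled after the setting changes. For a one-off override, passing timeout: :timer.minutes(15) in the opts of a single compile/5 call avoids touching the configuration.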

delete_session(store_spec, session_id)

@spec delete_session(Dsxir.OptimizerSession.Store.store_spec(), String.t()) ::
  :ok | {:error, term()}

Delete a persisted session checkpoint. Idempotent: returns :ok even if absent.

list_sessions(store_spec, filter \\ [])

@spec list_sessions(
  Dsxir.OptimizerSession.Store.store_spec(),
  keyword()
) :: {:ok, [Dsxir.OptimizerSession.Store.listing()]} | {:error, term()}

Return the checkpoint listings from the given store. Filter keys: :status, :optimizer, :updated_since.

pause(ref)

@spec pause(:gen_statem.server_ref()) :: :ok | {:error, term()}

Soft-pause the session: in-flight trial completes and is checkpointed before transitioning to :paused.

poll(ref)

@spec poll(:gen_statem.server_ref()) :: map()

Snapshot the current session state without blocking.

resume(ref)

@spec resume(:gen_statem.server_ref()) :: :ok | {:error, term()}

Resume a paused session. Returns {:error, :not_paused} from any other state.

resume_session(store_spec, session_id, opts \\ [])

@spec resume_session(Dsxir.OptimizerSession.Store.store_spec(), String.t(), keyword()) ::
  {:ok, pid()} | {:error, term()}

Rehydrate a session from a persisted checkpoint.

Refuses terminal sessions (:completed/:failed/:cancelled) with AlreadyTerminal. :paused is resumable. Trainset hash, optimizer module, and sampler-format version are verified; mismatch returns a ResumeMismatch. Required opts: :program, :trainset, :metric.

start_link(opts)

@spec start_link(keyword()) :: {:ok, pid()} | {:error, term()}

Start a fresh session under Dsxir.OptimizerSession.DynamicSupervisor.

Required opts: :optimizer, :program, :trainset, :metric, :opts. Optional: :session_id (auto-generated when absent), :name, :settings_snapshot, :max_errors, :store.

Returns {:error, :already_running} if a session with the same session_id is already registered.
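A minimal sketch of starting a session and waiting for it is shown below. The SessionLifecycle module is hypothetical and not part of dsxir. It takes the session module as an argument (Dsxir.OptimizerSession in real use) so the flow can be exercised against a stub, and the {:error, {:missing_opts, keys}} tuple is an invented convention for the pre-flight check, not something start_link/1 is documented to return.

```elixir
defmodule SessionLifecycle do
  @moduledoc """
  Hypothetical helper (not part of dsxir): check required opts, start a
  session, and await its terminal result.
  """

  # Required keys as listed in the start_link/1 documentation.
  @required [:optimizer, :program, :trainset, :metric, :opts]

  def start_and_await(session_mod, opts, timeout \\ :infinity) do
    case Enum.reject(@required, &Keyword.has_key?(opts, &1)) do
      [] ->
        # Errors from start_link/1, such as {:error, :already_running},
        # fall through `with` unchanged.
        with {:ok, pid} <- session_mod.start_link(opts) do
          session_mod.await(pid, timeout)
        end

      missing ->
        # Invented error shape for this sketch only.
        {:error, {:missing_opts, missing}}
    end
  end
end
```

In real use the call would look like SessionLifecycle.start_and_await(Dsxir.OptimizerSession, opts, :timer.minutes(10)), where opts carries the five required keys plus any optional ones such as :session_id or :store.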