PtcRunner.Lisp.Eval.ParallelBudget (PtcRunner v0.11.0)


A shared, lock-free slot semaphore that bounds the number of parallel pmap/pcalls worker processes alive at once across an entire PtcRunner.Lisp.run/2 call.

Why a global slot budget (and not heap division)

An earlier model gave each worker max_heap / concurrency. That is unsound for nested parallelism: a parent pmap worker stays alive while its nested children run, so a parent and its children are all live simultaneously — dividing the heap cannot bound the aggregate once nesting compounds.

The current model is instead: every parallel worker (top-level or nested) runs under a fixed max_heap_size cap, and one shared semaphore with capacity max_parallel_workers limits how many such workers may be alive at once. The aggregate guarantee is then simply:

max live parallel heap ≤ max_parallel_workers × worker_max_heap
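As a worked instance of the bound above, the following sketch plugs in illustrative numbers. Both values are hypothetical, and it assumes the per-worker cap is expressed in words (as the BEAM's max_heap_size process flag is) on a 64-bit system; PtcRunner's actual defaults and units may differ.

```elixir
# Hypothetical values, chosen only to illustrate the arithmetic.
max_parallel_workers = 8
worker_max_heap_words = 1_250_000

# Assumption: 64-bit BEAM, so one heap word is 8 bytes.
bytes_per_word = 8

# Upper bound on the combined heap of all live parallel workers.
bound_bytes = max_parallel_workers * worker_max_heap_words * bytes_per_word

IO.puts("aggregate bound: #{div(bound_bytes, 1_000_000)} MB")
# prints "aggregate bound: 80 MB"
```

The bound does not depend on nesting depth: a nested worker consumes a slot from the same budget as its parent, so the count of live workers never exceeds max_parallel_workers.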

Why :atomics (not :counters, not a GenServer)

  • :atomics gives add_get/3 — an atomic increment-and-fetch. Try-acquire is one atomic op with no race: bump the counter, and if the new value exceeds capacity, atomically give it back. No lock, no extra process, no message round-trip.
  • :counters has only add/3 (which returns :ok) and a separate get/2; a try-acquire built from those two calls has a check-then-act race between concurrent acquirers.
  • A GenServer would add a process to supervise, monitor and clean up, plus a message round-trip on every spawn — all to serialise an operation :atomics already does atomically.
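The try-acquire described in the first bullet can be sketched as follows. This is a minimal illustration, not PtcRunner's actual source: the SlotSketch module and its map-based budget are hypothetical stand-ins, and only the :atomics calls (new/2, add_get/3, sub/3, get/2) are real OTP functions.

```elixir
defmodule SlotSketch do
  # Hypothetical stand-in for illustration; not the library's implementation.

  def new(capacity) when is_integer(capacity) and capacity > 0 do
    # One signed 64-bit counter at index 1, initialised to 0 (slots held).
    %{ref: :atomics.new(1, signed: true), capacity: capacity}
  end

  def try_acquire(%{ref: ref, capacity: capacity}) do
    # add_get/3 increments and returns the new value in one atomic step,
    # so two concurrent acquirers can never both observe the same count.
    if :atomics.add_get(ref, 1, 1) > capacity do
      # Over capacity: give back our own increment and fail fast.
      :atomics.sub(ref, 1, 1)
      :full
    else
      :ok
    end
  end

  def held(%{ref: ref}), do: :atomics.get(ref, 1)
end

budget = SlotSketch.new(2)
results = for _ <- 1..3, do: SlotSketch.try_acquire(budget)
IO.inspect(results)
# prints [:ok, :ok, :full]
```

In this sketch the counter can briefly read above capacity between the increment and the give-back, but no caller is ever told :ok while it does, so the number of granted slots stays within capacity.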

The :atomics reference is an opaque term; it is threaded through EvalContext and copied into worker closures unchanged — every process operates on the same underlying counter.
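The sharing behaviour described above can be checked with a short standalone sketch. It uses plain Task processes rather than PtcRunner's worker machinery (an assumption made for illustration), and shows that a reference captured in a closure still points at the same counter in every process.

```elixir
ref = :atomics.new(1, signed: true)

# Each closure captures `ref`; Task.async/1 runs it in a separate process.
tasks =
  for _ <- 1..50 do
    Task.async(fn -> :atomics.add_get(ref, 1, 1) end)
  end

values = Task.await_many(tasks)

# Every increment landed on the one shared counter.
IO.inspect(:atomics.get(ref, 1))
# prints 50
```

Because each add_get/3 call is atomic, the 50 returned values are exactly the integers 1 through 50 in some order, with no duplicates and no gaps.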

Acquire / release contract

  • try_acquire/1 is non-blocking. It never waits for a slot — a worker that cannot get one fails fast with :parallel_capacity_exceeded rather than deadlocking on a slot that can only free when the worker itself finishes.
  • Every acquired slot MUST be released on every termination path (normal, timeout, heap kill, cancellation). Callers pair try_acquire/1 with release/1 via monitor-based cleanup or a try/after block.
  • Releasing without a held slot is a caller bug. It raises rather than clamping because a decrement-then-clamp release can race with a valid acquire and erase the acquired slot.
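The contract above can be sketched as follows. ContractSketch and with_slot/2 are hypothetical names introduced for illustration, and the compare-and-swap release is one possible way to raise on underflow without ever clamping; the library's own implementation may differ.

```elixir
defmodule ContractSketch do
  # Hypothetical stand-in for illustration; not the library's implementation.

  def new(capacity), do: %{ref: :atomics.new(1, signed: true), capacity: capacity}

  def try_acquire(%{ref: ref, capacity: capacity}) do
    if :atomics.add_get(ref, 1, 1) > capacity do
      :atomics.sub(ref, 1, 1)
      :full
    else
      :ok
    end
  end

  def release(%{ref: ref} = budget) do
    case :atomics.get(ref, 1) do
      held when held <= 0 ->
        # Underflow is a caller bug: raise instead of clamping to zero.
        raise ArgumentError, "release/1 called with no slot held"

      held ->
        # Decrement only if the counter is still what we read; otherwise retry.
        case :atomics.compare_exchange(ref, 1, held, held - 1) do
          :ok -> :ok
          _changed -> release(budget)
        end
    end
  end

  # Runs `fun` while holding a slot; never waits for one.
  def with_slot(budget, fun) do
    case try_acquire(budget) do
      :ok ->
        try do
          {:ok, fun.()}
        after
          # Runs on normal return and when `fun` raises or throws.
          release(budget)
        end

      :full ->
        {:error, :parallel_capacity_exceeded}
    end
  end

  def held(%{ref: ref}), do: :atomics.get(ref, 1)
end

budget = ContractSketch.new(1)

# A nested attempt fails fast instead of waiting on the slot its parent holds.
nested =
  ContractSketch.with_slot(budget, fn ->
    ContractSketch.with_slot(budget, fn -> :inner end)
  end)

IO.inspect(nested)
# prints {:ok, {:error, :parallel_capacity_exceeded}}
```

Note that try/after only covers paths on which the worker process runs its own cleanup. A worker killed from outside (for example by a timeout or a heap limit) never reaches the after block, which is why the contract also mentions monitor-based cleanup performed by another process.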

Summary

Types

t()

Shared parallel-worker slot budget.

Functions

available(budget)

Returns the number of slots currently free.

held(parallel_budget)

Returns the number of slots currently held (for tests / introspection).

new(capacity)

Creates a budget with capacity slots, all initially free.

release(parallel_budget)

Releases one previously-acquired slot.

try_acquire(parallel_budget)

Non-blocking attempt to acquire one slot.

Types

t()

@type t() :: %PtcRunner.Lisp.Eval.ParallelBudget{
  atomics_ref: :atomics.atomics_ref(),
  capacity: pos_integer()
}

Shared parallel-worker slot budget.

Functions

available(budget)

@spec available(t()) :: non_neg_integer()

Returns the number of slots currently free.

held(parallel_budget)

@spec held(t()) :: non_neg_integer()

Returns the number of slots currently held (for tests / introspection).

new(capacity)

@spec new(pos_integer()) :: t()

Creates a budget with capacity slots, all initially free.

release(parallel_budget)

@spec release(t()) :: :ok

Releases one previously-acquired slot.

Safe to call exactly once per successful try_acquire/1. Raises if no slot is currently held; underflow is a caller bug and must not be hidden in a hard security budget.

try_acquire(parallel_budget)

@spec try_acquire(t()) :: :ok | :full

Non-blocking attempt to acquire one slot.

Returns :ok if a slot was acquired (the caller now owns it and must release/1 it), or :full if all slots are in use. Never blocks.