distribute/actor
Named actor lifecycle: start, register, supervise, pool.
Resource cleanup and the {terminate, Reason} gap
gleam/otp/actor 1.x does not implement OTP’s
{terminate, Reason} system message
(see gleam_otp_external.erl; there is an explicit TODO).
That has two consequences for any actor that owns external
resources (file handles, ETS tables, ports, locks, sockets):
- Handler-controlled exits run user code first. If your handler returns receiver.Stop or receiver.StopAbnormal, you have full control: release every resource you own before returning the stop variant. The actor will then exit and the BEAM frees its mailbox.
- External terminations skip the actor. When the actor is process.kill-ed (uncatchable), or when the supervisor sends exit(Pid, shutdown), gleam/otp/actor does not currently convert it into a callback. Resources owned only on the actor’s heap are released by the BEAM (memory), but any external handle (a file, a TCP connection, an ETS table you forgot to make public) leaks.
The OTP-pure mitigation is the “linked resource owner”
pattern: spawn a tiny dedicated process to own the external
resource, monitor the actor, and run a close callback when the
actor dies for any reason. The library ships
actor.start_resource_owner/3 as a ready-made helper for this
pattern. See docs/recipes.md for the underlying recipe.
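The pattern described above can be sketched with plain gleam/erlang/process calls. This is a minimal illustration, not the library's source: the name own_resource is hypothetical, the sketch assumes the gleam_erlang 1.x API (process.spawn, process.monitor, process.select_specific_monitor, process.selector_receive_forever), and it leaves out the exit trapping that the shipped helper performs.

```gleam
import gleam/erlang/process

/// Hypothetical re-creation of the "linked resource owner" pattern.
/// Not the implementation of actor.start_resource_owner/3.
pub fn own_resource(
  open: fn() -> resource,
  close: fn(resource) -> Nil,
  lifetime: process.Pid,
) -> process.Pid {
  // process.spawn links the new process to the caller.
  process.spawn(fn() {
    // Open inside the owner so the owner is the BEAM-level owner.
    let resource = open()
    let monitor = process.monitor(lifetime)
    // Block until the DOWN message for `lifetime` arrives.
    process.new_selector()
    |> process.select_specific_monitor(monitor, fn(_down) { Nil })
    |> process.selector_receive_forever
    // `lifetime` is gone, whatever the reason: release the resource.
    close(resource)
  })
}
```

Because the owner blocks on the monitor rather than on a message from the actor, close runs whether the actor stopped normally, crashed, or was killed.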
When upstream gleam/otp ships termination callbacks (tracking issue: gleam-lang/otp#126), the helper becomes redundant for the actor case. Its API is shaped so that the migration is a one-line change in user code and a deprecation here, with no shift in semantics.
Types
Error from start_registered: distinguishes an actor init failure from
a :global registration failure after a successful start.
pub type StartRegisteredError {
ActorStartFailed(actor.StartError)
GlobalRegisterFailed(registry.RegisterError)
}
Constructors
- ActorStartFailed(actor.StartError)
  The OTP actor failed to start (init crashed, timeout, etc.).
- GlobalRegisterFailed(registry.RegisterError)
  The actor started but :global registration failed; the orphaned actor process has been killed.
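As an illustration of how the two constructors might be told apart at a call site, here is a small sketch. The function name report and the log wording are hypothetical; only the constructor names come from the type above.

```gleam
import distribute/actor as dist_actor
import gleam/io

/// Hypothetical helper: log which phase of start_registered failed.
pub fn report(err: dist_actor.StartRegisteredError) -> Nil {
  case err {
    // The actor never came up, so there is nothing to clean up.
    dist_actor.ActorStartFailed(_) ->
      io.println("actor failed to start")
    // The actor started, but the name could not be claimed; per the
    // docs above, the orphaned process has already been killed.
    dist_actor.GlobalRegisterFailed(_) ->
      io.println("actor started, but :global registration failed")
  }
}
```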
Values
pub fn child_spec(
typed_name: registry.TypedName(msg),
initial_state: state,
handler: fn(msg, state) -> receiver.HandlerStep(state),
init_timeout_ms: Int,
) -> supervision.ChildSpecification(global.GlobalSubject(msg))
OTP child spec for a named actor that auto-registers on (re)start.
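A child spec is normally handed to a supervisor builder. The sketch below assumes the gleam_otp 1.x gleam/otp/static_supervisor module (new, add, start); the function name start_tree and the 5000 ms init timeout are illustrative choices, and the return type is left to inference rather than asserted here.

```gleam
import distribute/actor as dist_actor
import gleam/otp/static_supervisor as supervisor

/// Hypothetical wiring: one named actor under a one-for-one supervisor.
pub fn start_tree(typed_name, initial_state, handler) {
  supervisor.new(supervisor.OneForOne)
  |> supervisor.add(dist_actor.child_spec(
    typed_name,
    initial_state,
    handler,
    // init_timeout_ms, an arbitrary value for this sketch
    5000,
  ))
  |> supervisor.start
}
```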
pub fn child_spec_default(
typed_name: registry.TypedName(msg),
initial_state: state,
handler: fn(msg, state) -> receiver.HandlerStep(state),
) -> supervision.ChildSpecification(global.GlobalSubject(msg))
Like child_spec, with config.get().default_init_timeout_ms.
pub fn pool(
typed_name: registry.TypedName(msg),
size: Int,
initial_state: state,
handler: fn(msg, state) -> receiver.HandlerStep(state),
init_timeout_ms: Int,
) -> Result(process.Pid, actor.StartError)
Start N supervised actors, registered as name_1 .. name_N.
Returns Error(actor.InitFailed(...)) if size < 1. list.range
would otherwise produce a degenerate list of weird worker names
(e.g. name_0, name_-1) and silently start a useless supervisor.
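The size guard can be illustrated with a small pure function. This is a hypothetical sketch of the naming scheme, not the library's code: worker_names and its error string are made up for the example. It relies on list.range counting downwards when the start exceeds the stop, so list.range(1, 0) is [1, 0].

```gleam
import gleam/int
import gleam/list

/// Hypothetical illustration of the name_1 .. name_N scheme.
pub fn worker_names(base: String, size: Int) -> Result(List(String), String) {
  case size < 1 {
    // Without this guard, list.range(1, 0) would yield [1, 0].
    True -> Error("pool size must be at least 1")
    False ->
      list.range(1, size)
      |> list.map(fn(i) { base <> "_" <> int.to_string(i) })
      |> Ok
  }
}
```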
Cascading Failure Risk:
:global and fixed pools do not mix perfectly. If a single worker in the pool fails its global registration (e.g. because name_4 is legitimately held by a node on the other side of a network split), the worker crashes. The pool supervisor will try to restart it. If the conflict persists, the supervisor exhausts its MaxR restart intensity and crashes the entire pool. This means a collision on 1 worker brings down all N workers, causing the parent supervisor to restart the whole pool. This architectural limit will be solved natively when the syn registry backend is introduced in v4.2.0 (via syn process groups).
pub fn pool_default(
typed_name: registry.TypedName(msg),
size: Int,
initial_state: state,
handler: fn(msg, state) -> receiver.HandlerStep(state),
) -> Result(process.Pid, actor.StartError)
Like pool, with config.get().default_init_timeout_ms.
pub fn start(
typed_name: registry.TypedName(msg),
initial_state: state,
handler: fn(msg, state) -> receiver.HandlerStep(state),
init_timeout_ms: Int,
) -> Result(global.GlobalSubject(msg), actor.StartError)
Start a named actor.
pub fn start_default(
typed_name: registry.TypedName(msg),
initial_state: state,
handler: fn(msg, state) -> receiver.HandlerStep(state),
) -> Result(global.GlobalSubject(msg), actor.StartError)
Like start, but uses config.get().default_init_timeout_ms as the init timeout.
pub fn start_observed(
typed_name: registry.TypedName(msg),
initial_state: state,
handler: fn(msg, state) -> receiver.HandlerStep(state),
init_timeout_ms: Int,
on_decode_error: fn(codec.DecodeError) -> Nil,
) -> Result(global.GlobalSubject(msg), actor.StartError)
Start a named actor, with a callback for decode errors.
Useful for logging or metering malformed messages across nodes (e.g. during rolling deploys with mismatched codec versions).
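A minimal sketch of wiring the decode-error hook to a log line follows. The wrapper name start_logged, the 5000 ms timeout, and the message text are illustrative; parameter types are left to inference so the sketch does not assume module paths beyond distribute/actor.

```gleam
import distribute/actor as dist_actor
import gleam/io

/// Hypothetical wrapper: start a named actor and log every message
/// that fails to decode instead of dropping it silently.
pub fn start_logged(typed_name, initial_state, handler) {
  dist_actor.start_observed(
    typed_name,
    initial_state,
    handler,
    // init_timeout_ms, an arbitrary value for this sketch
    5000,
    // on_decode_error: must return Nil, which io.println does
    fn(_decode_error) { io.println("dropped a message that failed to decode") },
  )
}
```

In a real deployment the callback would more likely increment a metric or include the error details; the hook only needs to return Nil.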
pub fn start_registered(
typed_name: registry.TypedName(msg),
initial_state: state,
handler: fn(msg, state) -> receiver.HandlerStep(state),
init_timeout_ms: Int,
) -> Result(global.GlobalSubject(msg), StartRegisteredError)
Start an actor and register it globally. Kills the actor if registration fails.
See also: start_registered_default/3 (configured init timeout),
start_registered_observed/5 (decode-error hook),
start_supervised/4 (auto-restart on crash).
pub fn start_registered_default(
typed_name: registry.TypedName(msg),
initial_state: state,
handler: fn(msg, state) -> receiver.HandlerStep(state),
) -> Result(global.GlobalSubject(msg), StartRegisteredError)
Like start_registered, but uses config.get().default_init_timeout_ms as the init timeout.
pub fn start_registered_error_to_string(
err: StartRegisteredError,
) -> String
Render a StartRegisteredError as a human-readable string. Mirrors
the *_error_to_string formatters published by the other error
modules so observability paths can io.println an error directly
without pattern-matching at every call site.
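For example, a call site could log the failure in one line. This is a sketch; start_or_log and the 5000 ms timeout are hypothetical, while both dist_actor functions are the ones documented on this page.

```gleam
import distribute/actor as dist_actor
import gleam/io

/// Hypothetical call site: start, or print the rendered error.
pub fn start_or_log(typed_name, initial_state, handler) {
  case dist_actor.start_registered(typed_name, initial_state, handler, 5000) {
    Ok(subject) -> Ok(subject)
    Error(err) -> {
      // No per-constructor matching needed just to log.
      io.println(dist_actor.start_registered_error_to_string(err))
      Error(err)
    }
  }
}
```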
pub fn start_registered_observed(
typed_name: registry.TypedName(msg),
initial_state: state,
handler: fn(msg, state) -> receiver.HandlerStep(state),
init_timeout_ms: Int,
on_decode_error: fn(codec.DecodeError) -> Nil,
) -> Result(global.GlobalSubject(msg), StartRegisteredError)
Start a named actor and register it globally, with a callback for decode errors.
pub fn start_resource_owner(
open: fn() -> resource,
close: fn(resource) -> Nil,
lifetime: process.Pid,
) -> process.Pid
Spawn a dedicated process that owns an external resource and runs
close(resource) when lifetime dies for any reason. Workaround
for the missing {terminate, Reason} callback in
gleam/otp/actor 1.x: external resources (database connections,
distributed locks, NIF handles, OS pipes) cannot rely on the
actor’s exit to free them, but a dedicated observer process can.
Returns the owner’s PID, in case you need to stop the owner independently of the lifetime PID.
Why this shape
The resource is opened inside the owner, so it is BEAM-process-
owned by the owner. That matters for resources whose lifetime is
tied to a specific process at the runtime level (ETS tables, ports,
gen_tcp controlling process).
The owner is linked to the caller (which is usually the actor’s
init function). It also traps exits (process.trap_exits(True)).
This guarantees a fail-fast behaviour: if open() crashes before
it returns, the owner dies, and because it is linked, it pulls the
actor down with it immediately. This prevents the “partial failure”
scenario where the actor survives but its critical resource failed
to initialize.
The owner uses a process.monitor on lifetime. Monitor delivery
is asynchronous but reliable: the BEAM guarantees a DOWN for
every monitored PID, and a monitor on an already-dead PID fires
immediately. That makes the helper safe to call right after the
actor has started: even if there is a microsecond gap before the
monitor goes up, the BEAM resolves it correctly.
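The "already-dead PID" guarantee can be checked with a short experiment. The sketch assumes the gleam_erlang 1.x API (process.spawn_unlinked, process.sleep, process.monitor, process.select_specific_monitor, process.selector_receive); the function name and the sleep and timeout values are arbitrary.

```gleam
import gleam/erlang/process

/// Hypothetical check: does a monitor placed on a process that has
/// already exited still deliver a DOWN message?
pub fn down_arrives_for_dead_pid() -> Bool {
  // The spawned function returns at once, so the process exits.
  let pid = process.spawn_unlinked(fn() { Nil })
  // Give it time to terminate before the monitor is created.
  process.sleep(50)
  let monitor = process.monitor(pid)
  let selector =
    process.new_selector()
    |> process.select_specific_monitor(monitor, fn(_down) { True })
  // Ok(True) if DOWN was delivered within 100 ms, Error(Nil) otherwise.
  case process.selector_receive(selector, 100) {
    Ok(delivered) -> delivered
    Error(Nil) -> False
  }
}
```

Under the BEAM semantics described above, the DOWN message is expected to be delivered even though the monitor was created after the process exited.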
Usage
import distribute
import distribute/actor as dist_actor
import distribute/global

let assert Ok(gs) =
  distribute.start_registered(name, init, handler)
let assert Ok(actor_pid) = global.owner(gs)
let _owner = dist_actor.start_resource_owner(
  fn() { open_postgres_pool() },
  fn(pool) { close_postgres_pool(pool) },
  actor_pid,
)
If the actor needs to use the resource, have open build a
process.Subject the owner serves and pass it through your
actor’s init.
Failure modes
- open raises: the owner dies before the resource is opened. Because the owner is linked to the caller (the actor), the actor receives an exit signal and crashes immediately. This fail-fast behaviour avoids partial failures where the actor runs without its resource.
- close raises: the owner dies after the panic and no further cleanup runs. The resource may be in a partially-closed state.
- lifetime is already dead at call time: the monitor fires DOWN immediately and close runs on the next scheduler tick.
- The owner is process.kill-ed: the kill is uncatchable, so close never runs. The BEAM reaps the resource via process death (only effective for BEAM-process-tied resources). External handles leak. This is the same uncatchable-kill caveat OTP itself has.
Future
When gleam/otp/actor ships native termination callbacks
(gleam-lang/otp#126), this helper becomes redundant for the
actor case. The contract will not shift: callers can replace the
actor.start_resource_owner(open, close, pid) line with the
upstream API and delete the helper call site. The function will
stay (deprecated) for the more general “tie cleanup to any PID”
pattern.
pub fn start_supervised(
typed_name: registry.TypedName(msg),
initial_state: state,
handler: fn(msg, state) -> receiver.HandlerStep(state),
init_timeout_ms: Int,
) -> Result(process.Pid, actor.StartError)
Start a supervised actor that auto-registers on (re)start.
If registration fails, the worker crashes and the supervisor retries, which is the correct OTP pattern for transient registration failures.
pub fn start_supervised_default(
typed_name: registry.TypedName(msg),
initial_state: state,
handler: fn(msg, state) -> receiver.HandlerStep(state),
) -> Result(process.Pid, actor.StartError)
Like start_supervised, with config.get().default_init_timeout_ms.