Crown (crown v0.3.0)


Leader election and supervised child management backed by an external oracle.

Crown is a GenServer that coordinates leader election across an Erlang cluster. Leadership authority is delegated to a pluggable oracle (see Crown.Oracle) that consults an external system, such as a database lease or a distributed lock. Only one node holds the crown at a time, even during netsplits.

The elected leader starts a supervised child (the :child_spec option) and keeps it running for as long as leadership is held. When leadership is lost, the child is stopped and, optionally, a :follower_child_spec is started instead. Followers monitor the leader and attempt to claim the crown when it goes down.

Crown registers the leader globally as {Crown, name} so followers on other nodes can discover and monitor it.

Usage

children = [
  {Crown,
   name: :my_worker,
   oracle: {MyApp.RedisOracle, lock_key: "my_worker"},
   child_spec: MyApp.SingletonWorker}
]

Supervisor.start_link(children, strategy: :one_for_one)
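
As a variation on the example above, the following sketch also sets :follower_child_spec and :claim_delay. MyApp.StandbyWorker is a hypothetical module introduced only for illustration, and the delay value is arbitrary.

```elixir
children = [
  {Crown,
   name: :my_worker,
   oracle: {MyApp.RedisOracle, lock_key: "my_worker"},
   # Started while this node holds the crown.
   child_spec: MyApp.SingletonWorker,
   # Hypothetical module: started while this node does NOT hold the crown.
   follower_child_spec: MyApp.StandbyWorker,
   # Wait 1 second after startup before the first claim attempt.
   claim_delay: 1_000}
]

Supervisor.start_link(children, strategy: :one_for_one)
```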

Options

  • :name (atom/0) - Required. Atom used to register the process locally and, when leading, as {:global, {Crown, name}} for cross-node discoverability.

  • :oracle - Required. A {module, opts} tuple. module must implement Crown.Oracle. opts are passed to Crown.Oracle.init/1.

  • :child_spec (term/0) - Required. Child spec to start when this node holds the crown. May be nil if no child is needed (though Crown's main purpose is to supervise a child). Accepts any value accepted by Supervisor.child_spec/2.

  • :follower_child_spec (term/0) - Child spec to start when this node does not hold the crown. The default value is nil (no follower child).

  • :claim_delay (non_neg_integer/0) - Milliseconds to wait after startup before the first claim attempt. The default value is 0.

  • :monitor_delay (non_neg_integer/0) - Milliseconds to wait between attempts to find and monitor the current leader after a failed claim. The first attempt always happens immediately (delay 0); subsequent retries use this value. Useful in environments where the cluster takes time to form. The default value is 5000.

  • :monitor_timeout (pos_integer/0) - Maximum time in milliseconds to spend trying to find the leader before giving up and attempting to claim again.

    This timeout is checked each time a monitor_delay tick fires, so the actual elapsed time before a re-claim may exceed monitor_timeout by up to one monitor_delay interval.

    The default value is 30000.

  • :monitor_leader (boolean/0) - When true, nodes that fail to claim will monitor the current leader and attempt to claim when it goes down. Set to false in deployments where nodes cannot see each other and rely instead on oracle handle_info callbacks to trigger claim attempts. The default value is true.
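
The interaction between :monitor_delay and :monitor_timeout can be illustrated with a little arithmetic. The sketch below is not part of Crown; MonitorTimingSketch and reclaim_at/2 are hypothetical names. It assumes that time is measured from the first (immediate) monitor attempt, that ticks then fire every monitor_delay milliseconds, and that the check is "elapsed >= monitor_timeout". Crown's actual comparison may differ.

```elixir
defmodule MonitorTimingSketch do
  # Hypothetical helper, not part of Crown.
  # Returns the elapsed time (ms) of the first monitor_delay tick at which
  # monitor_timeout has been reached, assuming ticks fire at
  # monitor_delay, 2 * monitor_delay, ... after the first attempt.
  def reclaim_at(monitor_delay, monitor_timeout)
      when is_integer(monitor_delay) and monitor_delay > 0 and
             is_integer(monitor_timeout) and monitor_timeout > 0 do
    # Ceiling division: number of ticks needed to reach the timeout.
    ticks = div(monitor_timeout + monitor_delay - 1, monitor_delay)
    ticks * monitor_delay
  end
end

# With the defaults, the timeout falls exactly on a tick.
IO.inspect(MonitorTimingSketch.reclaim_at(5_000, 30_000))

# A timeout that is not a multiple of the delay is only noticed on the next
# tick, so the re-claim happens later than monitor_timeout.
IO.inspect(MonitorTimingSketch.reclaim_at(5_000, 12_000))
```

Under these assumptions, a monitor_timeout of 12000 with a monitor_delay of 5000 is noticed at 15000 ms, an overshoot of 3000 ms, which stays within one monitor_delay interval. The helper requires a positive monitor_delay to avoid dividing by zero, although the option itself accepts 0.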

Telemetry

Crown emits telemetry events under the [:crown, ...] prefix. Use attach_default_logger/1 for built-in logging or attach your own handlers. See Crown.Telemetry for the full list of events.
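
A minimal sketch of enabling the built-in logger at application start, assuming it is enough to call attach_default_logger/1 with its default (empty) filters before the supervision tree starts. MyApp.Application and the supervisor name are hypothetical; the accepted filters and the return value are not described here.

```elixir
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    # Attach Crown's built-in telemetry logger with the default filters.
    Crown.attach_default_logger()

    children = [
      {Crown,
       name: :my_worker,
       oracle: {MyApp.RedisOracle, lock_key: "my_worker"},
       child_spec: MyApp.SingletonWorker}
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```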

Summary

Functions

attach_default_logger(filters \\ [])

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

See Supervisor.

global_name(name)

leader?(server)

start(opts)

start_link(opts)

status(server)

stop(server)