DurableServer behaviour (durable_server v0.1.1)

DurableServer provides durable, distributed GenServer processes backed by pluggable storage.

DurableServer implements fault-tolerant, stateful processes that can survive node failures, restarts, and deployments by automatically persisting state to storage and coordinating across a distributed cluster.

Key Features

  • Durable state: Automatically persists state to storage with configurable sync intervals
  • Cluster coordination: Uses distributed registry for process discovery and health monitoring
  • Capacity-aware placement: Monitors CPU, memory, and disk usage to route new processes to nodes with available capacity
  • Sticky placement: Environment variable-based placement preferences (e.g., same machine, or same region via FLY_REGION) with time-gated fallback to less-preferred nodes
  • Automatic recovery: Failed processes are detected and restarted across the cluster
  • Graceful shutdown: Ensures state is synchronized before termination via DurableServer.Terminator

Architecture

DurableServers must be started through DurableServer.Supervisor, which provides:

  • Prefix-based isolation between different supervisor instances
  • Graceful shutdown coordination via Terminator GenServer
  • Automatic lifecycle management and restart capabilities with coordination across the cluster

See DurableServer.Supervisor for supervisor setup and configuration options.

Basic Usage

defmodule MyCounterServer do
  use DurableServer, vsn: 1

  def dump_state(state) do
    %{count: state.count}
  end

  def load_state(_old_vsn, %{"count" => count} = _dumped_state) do
    %{count: count}
  end

  def init(%{count: count} = state) do
    IO.puts("Starting with count #{count}")
    {:ok, Map.merge(state, %{started_at: DateTime.utc_now()}), permanent: true}
  end

  def handle_call(:increment, _from, state) do
    new_state = %{state | count: state.count + 1}
    {:reply, new_state.count, new_state}
  end

  def handle_call(:get_count, _from, state) do
    {:reply, state.count, state}
  end

  def handle_call(:reset, _from, state) do
    {:reply, :ok, %{state | count: 0}}
  end
end

# Start the supervisor first (typically in your application.ex supervision tree):

children = [
  ...,
  {DurableServer.Supervisor, name: MyDurableSup, prefix: "durable/"}
]

# or start directly if you simply want to demo:
{:ok, supervisor_pid} = DurableServer.Supervisor.start_link(
  name: MyDurableSup,
  prefix: "durable/"
)

# Start individual servers through the supervisor
{:ok, {pid, _meta}} = DurableServer.Supervisor.start_child(
  MyDurableSup,
  {MyCounterServer, key: "user_123", initial_state: %{count: 0}}
)

# Use the server
GenServer.call(pid, :increment)  # => 1
GenServer.call(pid, :increment)  # => 2
GenServer.call(pid, :get_count)  # => 2

Note: for releases, :os_mon must be added to extra_applications in mix.exs:

def application do
  [
    mod: {My.Application, []},
    extra_applications: [:logger, :runtime_tools, :os_mon]
  ]
end

Advanced Example: Session Manager

defmodule UserSessionServer do
  use DurableServer, vsn: 2

  def dump_state(state), do: Map.take(state, [:user_id, :session, :last_activity_at])

  # migration logic for version 1 -> 2
  def load_state(vsn, dumped_state) do
    case vsn do
      1 ->
        # migrate to v2 logic (placeholder: adapt the v1 shape here, then load it as v2)
        load_state(2, dumped_state)

      _ ->
        %{
          user_id: Map.fetch!(dumped_state, "user_id"),
          session: Map.get(dumped_state, "session") || %{},
          last_activity_at: dumped_state["last_activity_at"]
        }
    end
  end

  def init(%{} = loaded_state) do
    # Refresh the activity timestamp on every start
    init_state = %{loaded_state | last_activity_at: System.system_time(:millisecond)}
    {:ok, init_state, sync_every_ms: 30_000}
  end

  def handle_call({:update_session, func}, _from, state) do
    # The update function must return a map
    %{} = new_session = func.(state.session)
    new_state = %{state | session: new_session, last_activity_at: System.system_time(:millisecond)}
    {:reply, :ok, new_state}
  end

  def handle_call(:get_session, _from, state) do
    {:reply, state.session, %{state | last_activity_at: System.system_time(:millisecond)}}
  end

  def handle_call(:logout, _from, state) do
    {:stop, :normal, :ok, %{state | last_activity_at: System.system_time(:millisecond)}}
  end
end

Configuration Options

DurableServer supports these options in the init/1 or init/2 return tuple:

  • :auto_sync - Enable automatic periodic syncing (default: false)
  • :sync_every_ms - Sync interval in milliseconds (default: 30_000)
  • :meta - Optional metadata for the globally registered server, returned alongside the pid by DurableServer.Supervisor.lookup/2
  • :permanent - Mark server for automatic restart by LifecycleManager (default: false)

Accessing Runtime Info

DurableServer provides runtime information through the optional init/2 callback. The info map contains supervisor references and any user-defined data configured via the supervisor's :init_info option.

Built-in Keys

The following keys are always present in the info map:

  • :key - DurableServer key
  • :supervisor - The DurableServer.Supervisor name
  • :task_supervisor - Task supervisor for spawning async tasks
  • :dynamic_supervisor - The DynamicSupervisor managing DurableServer processes

User-defined Keys

Pass custom data to all servers via the supervisor's :init_info option:

# In your supervision tree
{DurableServer.Supervisor,
 name: MyApp.DurableSup,
 prefix: "myapp/",
 init_info: %{api_client: MyApp.APIClient, config: %{timeout: 5000}}}

Then access it in your server's init/2:

def init(state, info) do
  api_client = info.api_client
  timeout = info.config.timeout
  # Map.merge/2 adds the keys even when the loaded state does not have them yet
  {:ok, Map.merge(state, %{api_client: api_client, timeout: timeout})}
end

Choosing Between init/1 and init/2

  • Use init/1 if you don't need access to supervisor references or custom init_info
  • Use init/2 if you need the task supervisor, dynamic supervisor, or custom data

Both callbacks are optional. If you implement init/2, it takes precedence. If neither is implemented, the default init/1 returns {:ok, state}.

State Synchronization

State is synchronized to storage in these scenarios:

  1. Manual sync: Return :sync from any callback, e.g. {:noreply, state, :sync}. You can also combine sync with other actions via callback options, e.g. {:noreply, state, {:continue, term}, sync: true}.
  2. Automatic sync: When :auto_sync is enabled, all changes are written immediately whenever a callback returns. Alternatively, set :sync_every_ms to sync changes periodically.
  3. Graceful shutdown: Automatically synced during normal termination, e.g. cold deploys
  4. Before stopping: When returning {:stop, reason, state} from callbacks
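
The manual-sync return shapes from item 1 can be sketched as plain functions. This is a minimal sketch, not the library's implementation: the module omits use DurableServer so it runs standalone, and the message names ({:add, n}, :touch, :after_touch, :noop) are made up for illustration.

```elixir
defmodule SyncReturnSketch do
  # Manual sync: the third tuple element requests a sync after this callback.
  def handle_cast({:add, n}, state) do
    {:noreply, %{state | count: state.count + n}, :sync}
  end

  # Sync combined with another action through callback options.
  def handle_cast(:touch, state) do
    {:noreply, state, {:continue, :after_touch}, sync: true}
  end

  # No sync requested by this return value.
  def handle_cast(:noop, state), do: {:noreply, state}
end
```

In a real server these clauses would live in a module that calls use DurableServer, and the trailing :sync or sync: true is what asks for the write.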

Stopping Behavior

DurableServer supports different stop reasons with specific behaviors regarding exit signal propagation:

Shutdown-wrapped stops (exit signal propagates to linked processes)

  • {:stop, {:shutdown, :delete}, state} - Stops and deletes from storage, exit signal propagates
  • {:stop, {:shutdown, :permanent}, state} - Stops permanently, exit signal propagates. A :permanent stop makes the server no longer eligible for permanent restarts; it remains stopped until explicitly started with DurableServer.Supervisor.start_child/2.
  • {:stop, {:shutdown, :normal}, state} - Normal stop, exit signal propagates (syncs as stopped_graceful)

Shutdown-wrapped exits propagate to linked processes (allowing them to react) but don't kill them.

Non-shutdown stops (exit signal does NOT propagate to linked processes)

  • {:stop, :delete, state} - Stops and deletes, silent termination (no exit signal)
  • {:stop, :permanent, state} - Stops permanently, silent termination (no exit signal)
  • {:stop, :normal, state} - Normal stop, silent termination (syncs as stopped_graceful)

Non-shutdown stops are transformed to :normal exits which don't propagate to linked processes.

Error stops

  • {:stop, {:error, reason}, state} - Stops with error, marks as crashed, exit signal propagates

Use shutdown-wrapped stops when linked processes need to be notified of the shutdown. Use non-shutdown stops for silent termination without notifying linked processes.
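
The stop reasons above can be sketched as handle_call/3 clauses. This is a minimal sketch under stated assumptions: the module omits use DurableServer so it runs standalone, and the request names (:discard, :discard_and_notify, :retire, {:fail, why}) are hypothetical.

```elixir
defmodule StopReturnSketch do
  # Silent stop that also deletes the stored state (no exit signal to links).
  def handle_call(:discard, _from, state), do: {:stop, :delete, :ok, state}

  # Same deletion, but linked processes receive the exit signal.
  def handle_call(:discard_and_notify, _from, state),
    do: {:stop, {:shutdown, :delete}, :ok, state}

  # Stop and opt out of permanent restarts until started again explicitly.
  def handle_call(:retire, _from, state), do: {:stop, :permanent, :ok, state}

  # Error stop: marked as crashed, exit signal propagates.
  def handle_call({:fail, why}, _from, state),
    do: {:stop, {:error, why}, {:error, why}, state}
end
```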

Error Handling and Recovery

DurableServers are designed to be resilient:

  • Process crashes: LifecycleManager detects failures and restarts servers
  • Node failures: Other nodes claim and restart orphaned processes
  • Storage failures: Retries and graceful degradation where possible
  • Region-aware network partitions: Consistent hashing ensures only one node manages each key and places servers in their initial region where possible

Best Practices

  1. Always use DurableServer.Supervisor: Never start DurableServers directly
  2. Design for restarts: Assume your process can be restarted on any node at any time
  3. Ensure load_state/2 handles migrations and avoids side effects: Schema changes across code changes require state migrations, which you implement by bumping the :vsn option passed to use DurableServer and matching on old versions in load_state/2.

Note: A lock is not acquired until init/1 is entered, so load_state/2 should always be a pure function without side effects. That is, if you need process messaging, pubsub, or other work on process start, do it in init/1 after your state has loaded.

  4. Consider appropriate sync intervals: Balance durability vs performance needs
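
For the migration practice in item 3, a version-matched load_state/2 might look like the following sketch. It is a plain module (no use DurableServer) so it runs standalone, and the v1 shape, which stores the counter under a "total" key, is invented for illustration.

```elixir
defmodule CounterMigrationSketch do
  # Hypothetical v1 shape: %{"total" => n}. Rename the key, then load as v2.
  def load_state(1, %{"total" => total}), do: load_state(2, %{"count" => total})

  # Current (v2) shape, and first boot where old_vsn is nil.
  def load_state(_vsn, %{"count" => count}), do: %{count: count}
end
```

Both clauses are pure: they only reshape the map they receive, which keeps them safe to run before the lock is held.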

Distribution and Clustering

DurableServers work seamlessly in distributed environments:

  • Processes register in a cluster-wide registry with their unique keys
  • Permanent servers are started across the cluster, with a guarantee that each key is running in only one place globally at a given time
  • Servers can be configured with sticky placement preferences to restart on the same machine or in the same region where they were running
  • Health monitoring detects failures across the cluster
  • Automatic failover ensures high availability

See DurableServer.Supervisor documentation for cluster configuration options.

Capacity-Aware Placement

DurableServers support automatic capacity-aware placement with remote fallback.

Local Placement (Default)

When starting a child, the local node is tried first. If capacity limits are exceeded, remote placement is attempted automatically.

Remote Placement

If local capacity is exhausted, DurableServer automatically tries remote nodes:

  1. Same-region nodes first - Prioritizes nodes in the same region for lower latency
  2. Least busy nodes - Selects nodes with the lowest utilization across all limits
  3. Configurable retries - Default 3 remote nodes tried, configurable via max_placement_retries
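
The ordering described above can be illustrated with a small sketch. It is not the library's placement code: the candidate map shape is hypothetical, and treating "utilization across all limits" as the highest of the CPU, memory, and disk percentages is an assumption.

```elixir
defmodule PlacementOrderSketch do
  # Each candidate is a map such as
  # %{node: :a, region: "ord", cpu: 70, memory: 40, disk: 10} (percentages).
  # Assumption: a node's utilization is the highest of its three percentages.
  def candidates(nodes, local_region, max_placement_retries \\ 3) do
    nodes
    |> Enum.sort_by(fn n ->
      # Same-region nodes sort first, then the least busy nodes.
      same_region_rank = if n.region == local_region, do: 0, else: 1
      {same_region_rank, Enum.max([n.cpu, n.memory, n.disk])}
    end)
    |> Enum.take(max_placement_retries)
    |> Enum.map(& &1.node)
  end
end
```

Passing 0 as the last argument yields an empty candidate list, which mirrors max_placement_retries: 0 meaning no remote fallback.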

Capacity Limits

Configure capacity limits when starting a supervisor:

{DurableServer.Supervisor,
 name: MyDurableSup,
 prefix: "durable/",
 max_children: %{
   :total => 100,                     # Max total children on this node
   MyModule => 50                     # Max MyModule children on this node
 },
 max_cpu: 80,                         # Max CPU % before rejecting
 max_memory: 85,                      # Max memory % before rejecting
 max_disk: {90, "/data"}}             # Max disk % on mount point before rejecting

Unlike CPU and memory limits, disk limits are bypassed for sticky restarts (children returning to their previous node) since part of the disk usage is the child's own data.
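
The sticky-restart exception for disk limits can be sketched as a simple predicate. This is illustrative only: the usage and limits map shapes and the :sticky_restart option name are hypothetical, and the sketch assumes a limit is exceeded only when usage is strictly greater than it.

```elixir
defmodule CapacityCheckSketch do
  # `usage` and `limits` hold percentages, e.g. %{cpu: 42, memory: 60, disk: 95}.
  # Assumption: a limit is exceeded once usage is strictly greater than it.
  def accept?(usage, limits, opts \\ []) do
    sticky_restart? = Keyword.get(opts, :sticky_restart, false)

    # CPU and memory always apply; disk is skipped for sticky restarts.
    usage.cpu <= limits.cpu and
      usage.memory <= limits.memory and
      (sticky_restart? or usage.disk <= limits.disk)
  end
end
```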

Placement Options

Control remote placement behavior per start_child call:

# Default: Try local, then up to 3 remote nodes
DurableServer.Supervisor.start_child(sup, {MyServer, key: "user_1", initial_state: %{}})

# Local only, no remote fallback
DurableServer.Supervisor.start_child(sup, {MyServer, key: "user_1", initial_state: %{}},
  max_placement_retries: 0)

# Try local, then up to 5 remote nodes
DurableServer.Supervisor.start_child(sup, {MyServer, key: "user_1", initial_state: %{}},
  max_placement_retries: 5)

Note: Automatic restarts from LifecycleManager always use max_placement_retries: 0 to place processes on their current node only, deferring to the LifecycleManagers on other nodes to manage their own node-local placement.

See DurableServer.Supervisor for full configuration details.

Sticky Placement

Sticky placement allows DurableServers to prefer restarting on nodes with specific characteristics (e.g., same machine, same region) before falling back to other nodes. This is particularly useful for things like Litestream-backed databases to avoid unnecessary S3 restores when the database is already available locally.

Sticky Configuration

Configure sticky placement per-module when starting a supervisor using a keyword list where keys are environment variable names (as atoms) and values are delay times in milliseconds:

{DurableServer.Supervisor,
 name: MyDurableSup,
 prefix: "durable/",
 sticky_placement: %{
   MyDatabaseServer => [
     FLY_MACHINE_ID: 10_000,
     FLY_REGION: 20_000,
     any: 0
   ]
 }}

Sticky placement uses environment variables to create a progressive fallback strategy with cumulative time windows. Each delay value specifies how much time to add before the next level can claim. From the above configuration:

  1. Level 0 (immediate): Only nodes matching FLY_MACHINE_ID can claim
  2. Level 1 (after 10s): Nodes matching FLY_REGION can claim
  3. Level 2 (after 30s): Any node (:any) can claim

The delays are cumulative - each level unlocks at the sum of all previous delays:

  • Level 0 unlocks at 0ms (always immediate)
  • Level 1 unlocks at 10,000ms (sum of delays before level 1)
  • Level 2 unlocks at 30,000ms (10s + 20s)

The last level's delay value is unused (no subsequent level), so 0 is conventional. Earlier levels remain eligible even after later levels unlock, maintaining preference order.
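
The cumulative arithmetic can be checked with a short sketch. It is not the library's implementation: the module and function names are hypothetical, and it assumes a level becomes eligible exactly at its unlock time.

```elixir
defmodule StickyWindowsSketch do
  # Turn per-level delays into the elapsed time at which each level unlocks.
  def unlock_times(levels) do
    {unlocks, _total} =
      Enum.map_reduce(levels, 0, fn {level, delay_ms}, elapsed ->
        {{level, elapsed}, elapsed + delay_ms}
      end)

    unlocks
  end

  # Levels allowed to claim once `elapsed_ms` has passed, in preference order.
  def eligible_levels(levels, elapsed_ms) do
    for {level, unlock_at} <- unlock_times(levels), unlock_at <= elapsed_ms, do: level
  end
end
```

For the configuration above, unlock_times/1 gives [FLY_MACHINE_ID: 0, FLY_REGION: 10_000, any: 30_000], and eligible_levels/2 at 15,000ms returns [:FLY_MACHINE_ID, :FLY_REGION]. The final delay of 0 never contributes to any unlock time.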

Common Patterns

Machine stickiness with region fallback (no :any):

sticky_placement: %{
  MyServer => [
    FLY_MACHINE_ID: 20_000,
    FLY_REGION: 0
  ]
}

Same machine claims immediately, same region claims after 20s. Without :any, nodes in other regions can never claim - the server will only run in its original region.

Region stickiness, falling back to any node:

sticky_placement: %{
  MyServer => [
    FLY_REGION: 20_000,
    any: 0
  ]
}

Same region claims immediately, any node can claim after 20s.

Custom environment variables:

sticky_placement: %{
  MyServer => [
    DATACENTER: 15_000,
    AVAILABILITY_ZONE: 30_000,
    any: 0
  ]
}

Same datacenter claims immediately, same availability zone after 15s, any node after 45s.

Strict region pinning (no fallback):

sticky_placement: %{
  MyServer => [
    FLY_REGION: 0
  ]
}

Only nodes with matching FLY_REGION can claim, and they can claim immediately. Without :any, non-matching nodes can never claim the server - it will only run on nodes with the same FLY_REGION as where it was originally started. Use this when data locality is critical and you'd rather the server stay down than run in the wrong location.

Default Sticky Placement

Apply the same sticky placement configuration to all modules:

{DurableServer.Supervisor,
 name: MyDurableSup,
 prefix: "durable/",
 default_sticky_placement: [
   FLY_REGION: 20_000,
   any: 0
 ]}

Per-module configurations override the default.

Updating Sticky Placement Configuration

When a DurableServer starts, its sticky placement is captured based on the module configuration and the node's current environment variables. This placement is persisted with the server's state in object storage.

If you later change the module's sticky placement configuration (for example, adding :any as a fallback level), running servers retain their original placement from when they started. To ensure proper orphan claiming behavior, the lifecycle manager automatically augments persisted placement with the :any level if present in the updated module config.

For example, if you change from:

sticky_placement: %{MyServer => [FLY_MACHINE_ID: 60_000, FLY_REGION: 0]}

To:

sticky_placement: %{MyServer => [FLY_MACHINE_ID: 60_000, FLY_REGION: 120_000, any: 0]}

Servers started before the change will have their persisted placement augmented with the :any level at runtime. This ensures they can still be claimed by any node after their specific placement preferences are exhausted, using the delay specified in the module config.

Other environment variable levels cannot be added retroactively since their values were determined when the server originally started.

Important Notes

  • Environment variable values are captured when the server first starts
  • Values are stored in the server's metadata in object storage
  • nil environment variable values are preserved and can match
  • The :any atom matches any node, regardless of environment variables
  • Time windows are cumulative, not independent intervals
  • Earlier preference levels remain eligible after later levels unlock

Monitoring Events with Group

DurableServer uses Group for distributed process groups, registry, and lifecycle monitoring.

You can call into the Group instance of your Supervisor to monitor DurableServer events:

# Monitor a specific key
:ok = Group.monitor(MyDurableSup, "user/123")

# Monitor all keys with a prefix
:ok = Group.monitor(MyDurableSup, "user/")

# Monitor all events
:ok = Group.monitor(MyDurableSup, :all)

Monitors receive {:group, events, info} tuples in their mailbox:

def handle_info({:group, events, _info}, state) do
  Enum.each(events, fn
    %Group.Event{type: :registered, key: key, pid: pid, previous_meta: nil} ->
      # A DurableServer started (previous_meta is nil for first registration)
      :ok
    %Group.Event{type: :unregistered, key: key, reason: reason} ->
      # A DurableServer stopped
      :ok
    _ -> :ok
  end)
  {:noreply, state}
end

Event types: :registered, :unregistered, :joined, :left

:registered and :joined events include a previous_meta field (nil for new, old meta for re-register/re-join). Single operations produce one event per tuple; bulk operations (nodedown, process death) batch all events together.
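
The role of previous_meta can be sketched with a stand-in struct. The Event struct below is a hypothetical substitute for %Group.Event{} that carries only the fields mentioned above, so the example runs without the Group library.

```elixir
defmodule GroupEventSketch do
  # Stand-in for %Group.Event{}, limited to the fields mentioned above.
  defmodule Event do
    defstruct [:type, :key, :pid, :previous_meta, :reason]
  end

  # Summarize one delivered batch as a list of {kind, key} tuples.
  def summarize(events) do
    Enum.map(events, fn
      # previous_meta is nil only for a first registration.
      %Event{type: :registered, key: key, previous_meta: nil} -> {:started, key}
      %Event{type: :registered, key: key} -> {:re_registered, key}
      %Event{type: :unregistered, key: key} -> {:stopped, key}
      %Event{type: type, key: key} -> {type, key}
    end)
  end
end
```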

Joining as a Member

Non-DurableServer processes can join keys to be discoverable and receive dispatched messages:

# Join a key (e.g., from a Phoenix Channel)
:ok = Group.join(MyDurableSup, "room/123", %{type: :channel})

# Re-joining updates metadata in place
:ok = Group.join(MyDurableSup, "room/123", %{type: :channel, status: :active})

# Query all members of a key (DurableServers + joined processes)
members = Group.members(MyDurableSup, "room/123")
# => [{#PID<0.150.0>, %{...}}, {#PID<0.200.0>, %{type: :channel, status: :active}}]

# Leave when done (also happens automatically on process death)
:ok = Group.leave(MyDurableSup, "room/123")

Dispatching to Members

Send messages to all members of a key:

# From a DurableServer, broadcast to all connected channels
Group.dispatch(MyDurableSup, state.key, {:new_message, message})

Monitor vs Join

  • monitor/2: Receive lifecycle events (:registered, :unregistered, :joined, :left) - system-generated
  • join/3: Be discoverable via members/2 and receive dispatch/3 messages - application-level

These are independent - joining does not monitor events, and monitoring does not make you discoverable.

Summary

Callbacks

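after_terminate(terminate_return, info)

Optional callback invoked after terminate/2 and after final status sync.

dump_state(state)

Transform user state into a map for persistence.

init(loaded_state)

Initializes the DurableServer with loaded state.

load_state(old_vsn, persisted_state)

Transform backend-decoded persisted state back into user state format.

Functions

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

claim_restart_attempt(store, stored_state, opts)

Attempt to atomically claim a restart attempt for a server.

clear_restart_attempt(store, data)

Clear restart attempt metadata from a server object.

fetch_stored_state(source, request, opts \\ [])

Fetches the DurableServer's current state from storage.

get_server_metadata(store, path)

Get just the metadata for a server without the full object.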

Types

callback_option()

@type callback_option() :: {:meta, user_meta()} | {:sync, boolean()}

callback_options()

@type callback_options() :: [callback_option()]

init_option()

@type init_option() ::
  {:auto_sync, boolean()}
  | {:sync_every_ms, pos_integer()}
  | {:meta, map()}
  | {:permanent, boolean()}

sync_action()

@type sync_action() :: :sync

timeout_action()

@type timeout_action() :: timeout() | :hibernate | {:continue, term()} | sync_action()

user_meta()

@type user_meta() :: map()

user_stop_reason()

@type user_stop_reason() ::
  nil
  | :normal
  | :delete
  | :permanent
  | {:shutdown, :delete}
  | {:shutdown, :permanent}
  | {:shutdown, :normal}
  | {:error, term()}

Callbacks

after_terminate(terminate_return, info)

(optional)
@callback after_terminate(terminate_return :: term(), info :: map()) :: term()

Optional callback invoked after terminate/2 and after final status sync.

This callback is only invoked when the final status sync completed successfully for a graceful stop (final_status: :stopped_graceful and sync_result: :ok).

The first argument is exactly the return value from terminate/2. The second argument is an info map:

  • :key - DurableServer key
  • :supervisor - Supervisor name
  • :final_status - Final persisted status atom
  • :sync_result - :ok | {:error, term()}
  • :reason - Termination reason passed to terminate/2
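
A minimal sketch of an after_terminate/2 clause follows. It is a plain module (no use DurableServer) so it runs standalone, and the summary map it returns is a hypothetical follow-up action, not something the library requires.

```elixir
defmodule AfterTerminateSketch do
  # Invoked only after a graceful stop whose final sync succeeded, so both
  # :final_status and :sync_result can be matched directly.
  def after_terminate(
        terminate_return,
        %{final_status: :stopped_graceful, sync_result: :ok} = info
      ) do
    # Hypothetical follow-up: build a summary a caller could log or forward.
    %{key: info.key, reason: info.reason, terminate_return: terminate_return}
  end
end
```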

code_change(old_vsn, state, extra)

(optional)
@callback code_change(
  old_vsn :: term() | {:down, term()},
  state :: term(),
  extra :: term()
) ::
  {:ok, new_state :: term()} | {:error, reason :: term()}

dump_state(state)

@callback dump_state(state :: term()) :: map()

Transform user state into a map for persistence.

This required callback is used when saving state through the configured storage backend. It allows you to:

  • Filter out keys that shouldn't be persisted (like PIDs, refs, etc.)
  • Transform the state shape for storage
  • Remove ephemeral data

The returned value must be a plain map at the top level. Nested values are passed through to the configured backend as-is, so they only need to be encodable by the backend you are using.

This means persisted shapes may differ by backend. For example:

  • DurableServer.Backends.ObjectStore typically encodes to and decodes from JSON-shaped data with string keys
  • DurableServer.Backends.EKVStore may preserve richer Elixir terms

If you plan to move data between backends, load_state/2 should be prepared to handle multiple persisted shapes during the migration window.
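
A load_state/2 that tolerates both shapes might look like the sketch below. It is a plain module so it runs standalone; the :count field and the fetch_either!/2 helper are hypothetical names chosen for illustration.

```elixir
defmodule MultiShapeLoadSketch do
  # Accept both a string-keyed (JSON-shaped) map and an atom-keyed map.
  def load_state(_old_vsn, persisted) do
    %{count: fetch_either!(persisted, :count)}
  end

  # Look up `key` as an atom first, then as its string form.
  # Raises KeyError when neither form is present.
  defp fetch_either!(map, key) do
    case map do
      %{^key => value} -> value
      _ -> Map.fetch!(map, Atom.to_string(key))
    end
  end
end
```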

Examples

def dump_state(%{count: count, temp_data: _temp} = state) do
  # Only persist count, filter out temp_data
  %{count: count}
end

handle_call(request, from, state)

(optional)
@callback handle_call(request :: term(), from :: GenServer.from(), state :: term()) ::
  {:reply, reply, new_state}
  | {:reply, reply, new_state, timeout_action()}
  | {:reply, reply, new_state, callback_options()}
  | {:reply, reply, new_state, timeout_action(), callback_options()}
  | {:noreply, new_state}
  | {:noreply, new_state, timeout_action()}
  | {:noreply, new_state, callback_options()}
  | {:noreply, new_state, timeout_action(), callback_options()}
  | {:stop, reason, reply, new_state}
  | {:stop, {:shutdown, :delete}, reply, new_state}
  | {:stop, {:shutdown, :permanent}, reply, new_state}
  | {:stop, :delete, reply, new_state}
  | {:stop, :permanent, reply, new_state}
  | {:stop, reason, new_state}
  | {:stop, {:shutdown, :delete}, new_state}
  | {:stop, {:shutdown, :permanent}, new_state}
  | {:stop, :delete, new_state}
  | {:stop, :permanent, new_state}
when reply: term(), new_state: term(), reason: term()

handle_cast(request, state)

(optional)
@callback handle_cast(request :: term(), state :: term()) ::
  {:noreply, new_state}
  | {:noreply, new_state, timeout_action()}
  | {:noreply, new_state, callback_options()}
  | {:noreply, new_state, timeout_action(), callback_options()}
  | {:stop, reason :: term(), new_state}
  | {:stop, {:shutdown, :delete}, new_state}
  | {:stop, {:shutdown, :permanent}, new_state}
  | {:stop, :delete, new_state}
  | {:stop, :permanent, new_state}
when new_state: term()

handle_continue(continue, state)

(optional)
@callback handle_continue(continue :: term(), state :: term()) ::
  {:noreply, new_state}
  | {:noreply, new_state, timeout_action()}
  | {:noreply, new_state, callback_options()}
  | {:noreply, new_state, timeout_action(), callback_options()}
  | {:stop, reason :: term(), new_state}
  | {:stop, {:shutdown, :delete}, new_state}
  | {:stop, {:shutdown, :permanent}, new_state}
  | {:stop, :delete, new_state}
  | {:stop, :permanent, new_state}
when new_state: term()

handle_info(msg, state)

(optional)
@callback handle_info(msg :: :timeout | term(), state :: term()) ::
  {:noreply, new_state}
  | {:noreply, new_state, timeout_action()}
  | {:noreply, new_state, callback_options()}
  | {:noreply, new_state, timeout_action(), callback_options()}
  | {:stop, reason :: term(), new_state}
  | {:stop, {:shutdown, :delete}, new_state}
  | {:stop, {:shutdown, :permanent}, new_state}
  | {:stop, :delete, new_state}
  | {:stop, :permanent, new_state}
when new_state: term()

init(loaded_state)

(optional)
@callback init(loaded_state :: map()) ::
  :ignore | {:ok, state :: term()} | {:ok, state :: term(), [init_option()]}

Initializes the DurableServer with loaded state.

This callback is invoked after the server acquires its global lock and loads any persisted state. You can implement either init/1 or init/2:

  • init/1 - Receives only the loaded state
  • init/2 - Receives the loaded state and an info map with runtime information

If you implement init/2, it takes precedence over init/1.

The Info Map (init/2)

The info map in init/2 contains:

  • :key - The DurableServer key
  • :supervisor - The supervisor name (e.g., MyApp.DurableSup)
  • :task_supervisor - The task supervisor for async operations
  • :dynamic_supervisor - The dynamic supervisor managing DurableServer processes
  • Any user-defined keys from the supervisor's :init_info option

Return Values

  • {:ok, state} - Initialize with the given state
  • {:ok, state, opts} - Initialize with state and options
  • :ignore - Don't start the server, sync as stopped_graceful

Options

  • :auto_sync - Enable automatic syncing on every callback return (default: false)
  • :sync_every_ms - Periodic sync interval in milliseconds (default: 30_000)
  • :meta - User metadata returned by DurableServer.Supervisor.lookup/2
  • :permanent - Mark server for automatic restart by LifecycleManager (default: false)

Examples

# Simple init/1
def init(state) do
  {:ok, state, permanent: true}
end

# init/2 with runtime info
def init(state, info) do
  # Access built-in values
  %{key: key, task_supervisor: task_sup} = info

  # Access user-defined values from supervisor's init_info
  api_client = info.api_client

  {:ok, Map.merge(state, %{task_sup: task_sup, api_client: api_client})}
end

init(loaded_state, info)

(optional)
@callback init(loaded_state :: map(), info :: map()) ::
  :ignore | {:ok, state :: term()} | {:ok, state :: term(), [init_option()]}

load_state(old_vsn, persisted_state)

@callback load_state(old_vsn :: pos_integer() | nil, persisted_state :: map()) :: map()

Transform backend-decoded persisted state back into user state format.

This required callback is used when loading state from the configured backend. It allows you to:

  • Convert backend-specific persisted shapes into your runtime state format
  • Set default values for missing keys
  • Initialize ephemeral state that wasn't persisted

On first boot for a never-before-persisted server, DurableServer encodes and decodes the result of dump_state/1 through the configured backend before calling load_state/2. This keeps the first-boot shape consistent with the shape you will receive on later restarts for that backend.

Persisted state is backend-dependent. For example:

  • DurableServer.Backends.ObjectStore usually passes JSON-decoded maps with string keys
  • DurableServer.Backends.EKVStore may pass maps with atom keys or other native Elixir terms

During backend migrations, it is valid for load_state/2 to receive multiple historical shapes until the migration is complete.

For a server that has never been persisted, the old_vsn will be nil.

Note: the function is NOT guaranteed to be idempotent. The durable server is not considered started until after load_state/2 has run and a lock has been successfully obtained with your loaded state. Concurrent nodes can race your state load and acquire the lock before you, so this function should not issue side effects such as calling other processes. Perform such side-effecting work inside init/1, which is guaranteed to run only after your durable server has started with a successful global lock.

Examples

def load_state(_old_vsn, dumped_state) do
  # Convert string keys to atoms and add ephemeral state
  %{
    count: Map.fetch!(dumped_state, "count"),
    temp_data: nil,
    status: :initialized
  }
end

terminate(reason, state)

(optional)
@callback terminate(reason :: term(), state :: term()) :: term()

Functions

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

See Supervisor.

claim_restart_attempt(store, stored_state, opts)

Attempt to atomically claim a restart attempt for a server.

Returns :ok if the claim succeeds, or {:error, reason} if it fails.

clear_restart_attempt(store, data)

Clear restart attempt metadata from a server object.

fatal_exit!(reason)

fatal_exit!(pid, reason)

fetch_stored_state(source, request, opts \\ [])

Fetches the DurableServer's current state from storage.

get_server_metadata(store, path)

Get just the metadata for a server without the full object.

start_link(info)