ZenMonitor v1.0.0
ZenMonitor.Local.Connector

ZenMonitor.Local.Connector performs a variety of duties. For every remote node that the local node is interested in monitoring processes on, there is a dedicated ZenMonitor.Local.Connector. This collection of Connectors is managed by a GenRegistry registered under the ZenMonitor.Local.Connector atom.
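
For example, the Connector responsible for a given node can be obtained through the registry helpers documented below (the node name here is hypothetical):

    # Fetch the Connector dedicated to a remote node from the registry.
    connector = ZenMonitor.Local.Connector.get_for_node(:"app@remote-host")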

Connecting and Monitoring the remote ZenMonitor.Proxy

Connectors, as their name suggests, connect to the ZenMonitor.Proxy on the remote node that they are responsible for. They do this using standard ERTS Distribution, by invoking the remote Proxy’s ping command. A remote is considered compatible if the ping command returns the :pong atom; otherwise it is marked incompatible.
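
Conceptually, the probe looks something like the following sketch. The exact message shape and registered name used by ZenMonitor.Proxy are internal details, so every identifier below should be read as an assumption rather than the actual implementation:

    defmodule CompatibilityProbe do
      # Hypothetical sketch of a ping-based compatibility check over ERTS
      # Distribution; not ZenMonitor's actual implementation.
      def probe(remote) do
        try do
          case GenServer.call({ZenMonitor.Proxy, remote}, :ping, 1_000) do
            :pong -> :compatible
            _other -> :incompatible
          end
        catch
          # The call exits if the node is down or the Proxy is not running.
          :exit, _reason -> :incompatible
        end
      end
    end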

Connectors manage their remote node’s status in the global node status cache and provide facilities for efficiently querying remote status; see compatibility/1 and cached_compatibility/1.
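
For example (with a hypothetical node name):

    # Effective compatibility: always :compatible or :incompatible.
    ZenMonitor.Local.Connector.compatibility(:"app@remote-host")

    # Cache-only lookup: can additionally report :miss, {:expired, attempts},
    # or :unavailable, without performing any network work.
    ZenMonitor.Local.Connector.cached_compatibility(:"app@remote-host")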

Batching and Updating the remote ZenMonitor.Proxy

When a local process wishes to monitor a remote process, the Connector will be informed of this fact with a call to monitor/3. The Connector is responsible for maintaining a local record of this monitor for future fan-out and for efficiently batching up these requests to be delivered to the remote ZenMonitor.Proxy.
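
A minimal sketch of registering and later cancelling a monitor directly with the Connector; remote_pid is assumed to be a pid on the Connector’s remote node:

    # Record interest in a remote pid; the eventual :DOWN dispatch will be
    # delivered to the subscriber (here, self()).
    ref = make_ref()
    :ok = ZenMonitor.Local.Connector.monitor(remote_pid, ref, self())

    # Cancel using the same reference.
    :ok = ZenMonitor.Local.Connector.demonitor(remote_pid, ref)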

Fan-out of Dead Summaries

Periodically, the ZenMonitor.Proxy (technically the ZenMonitor.Proxy.Batcher) on the remote node will send a “Dead Summary”. This is a message from the remote that informs the Connector of all the processes the Connector has monitored that have gone down since the last summary.

The Connector uses its local records to generate a batch of down dispatches. These are messages that look identical to the messages provided by Process.monitor/1 when a process goes down. It is sometimes necessary for the original monitoring process to discern whether the :DOWN message originated from ERTS or from ZenMonitor; to aid this, ZenMonitor wraps the original reason in a tuple of {:zen_monitor, original_reason}.
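
For example, a subscriber can use the wrapper to tell the two origins apart; this sketch assumes the top-level ZenMonitor.monitor/1 entry point and hypothetical pids:

    zen_ref = ZenMonitor.monitor(remote_pid)    # monitored through ZenMonitor
    erts_ref = Process.monitor(local_pid)       # plain ERTS monitor

    receive do
      {:DOWN, ^zen_ref, :process, pid, {:zen_monitor, reason}} ->
        # Fanned out by the Connector; reason is the wrapped original reason.
        {:zen_monitor_down, pid, reason}

      {:DOWN, ^erts_ref, :process, pid, reason} ->
        # Delivered directly by ERTS; the reason is not wrapped.
        {:erts_down, pid, reason}
    end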

The fan-out messages are sent to ZenMonitor.Local for eventual delivery via ZenMonitor.Local.Dispatcher, see those modules for more information.

Fan-out of nodedown / ZenMonitor.Proxy down

The Connector is also responsible for monitoring the remote node and dealing with nodedown (or the node becoming incompatible, either due to the ZenMonitor.Proxy crashing or a code change).

If the Connector detects that the remote it is responsible for is down or no longer compatible, it will fire every established monitor with {:zen_monitor, :nodedown}. It uses the same mechanism as for Dead Summaries, see ZenMonitor.Local and ZenMonitor.Local.Dispatcher for more information.
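
Subscribers can match this case specifically; ref is the reference under which the monitor was established:

    receive do
      {:DOWN, ^ref, :process, _pid, {:zen_monitor, :nodedown}} ->
        # The remote node is down, or its ZenMonitor.Proxy is gone/incompatible.
        :remote_unavailable
    end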

Summary

Functions

Check the cached compatibility status for a remote node

Returns a specification to start this module under a supervisor

Gets the chunk size from the Application Environment

Puts the chunk size into the Application Environment

Determine the effective compatibility of a remote node

Connect to the provided remote

Asynchronously demonitors a pid

Get a connector from the registry by destination

Get a connector from the registry by remote node

Synchronous connect handler

Handles demonitoring a reference for a given pid

Handle other info

Invoked when the server is started. start_link/3 or start/3 will block until it returns

Asynchronously monitors a pid

Gets the sweep interval from the Application Environment

Puts the sweep interval into the Application Environment

Types

cached_compatibility() ::
  compatibility() | :miss | {:expired, integer()} | :unavailable

compatibility() :: :compatible | :incompatible

death_certificate() :: {pid(), reason :: any()}

down_dispatch() ::
  {pid(), {:DOWN, reference(), :process, pid(), {:zen_monitor, any()}}}

t() :: ZenMonitor.Local.Connector

Functions

cached_compatibility(remote)
cached_compatibility(remote :: node()) :: cached_compatibility()

Check the cached compatibility status for a remote node

This will only perform a fast client-side lookup in the ETS table. If an authoritative entry is found it will be returned (either :compatible, :incompatible, or :unavailable). If no entry is found then :miss is returned. If an expired entry is found then {:expired, attempts} is returned.
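
One plausible way for a caller to dispatch on each result; the branch actions are illustrative, not prescribed:

    case ZenMonitor.Local.Connector.cached_compatibility(remote) do
      :compatible        -> :use_remote
      :incompatible      -> :avoid_remote
      :unavailable       -> :avoid_remote
      :miss              -> ZenMonitor.Local.Connector.connect(remote)
      {:expired, _tries} -> ZenMonitor.Local.Connector.connect(remote)
    end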

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

See Supervisor.

chunk_size()
chunk_size() :: integer()

Gets the chunk size from the Application Environment

The chunk size is the maximum number of subscriptions that will be sent during each sweep; see ZenMonitor.Local.Connector’s @chunk_size for the default value.

This can be controlled at boot and runtime with the {:zen_monitor, :connector_chunk_size} setting; see ZenMonitor.Local.Connector.chunk_size/1 for a runtime convenience function.
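
For example, using the standard Application environment API (5_000 is an arbitrary value):

    # Set the value in config at boot, or override it at runtime:
    Application.put_env(:zen_monitor, :connector_chunk_size, 5_000)

    # Equivalent runtime convenience, via the function documented below:
    ZenMonitor.Local.Connector.chunk_size(5_000)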

chunk_size(value)
chunk_size(value :: integer()) :: :ok

Puts the chunk size into the Application Environment

This is a simple convenience function for overwriting the {:zen_monitor, :connector_chunk_size} setting at runtime.

compatibility(remote)
compatibility(remote :: node()) :: compatibility()

Determine the effective compatibility of a remote node

This will attempt a fast client-side lookup in the ETS table. Only a positive :compatible record results in :compatible; otherwise the effective compatibility is :incompatible.

connect(remote)
connect(remote :: node()) :: compatibility()

Connect to the provided remote

This function does not consult the cache before calling into the GenServer; the GenServer consults the cache before attempting to connect. This allows many callers to request a connection while guaranteeing that only one attempt actually performs network work.

If only the compatibility of a remote host is needed, callers should use compatibility/1 or cached_compatibility/1. compatibility/1 provides the effective compatibility; cached_compatibility/1 is mainly used internally but can provide more detailed information about the remote’s cache status. Neither function performs network work or calls into the GenServer.
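
For example (with a hypothetical node name):

    case ZenMonitor.Local.Connector.connect(:"app@remote-host") do
      :compatible   -> :ok
      :incompatible -> {:error, :incompatible}
    end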

demonitor(target, ref)
demonitor(target :: ZenMonitor.destination(), ref :: reference()) :: :ok

Asynchronously demonitors a pid.

get(target)

Get a connector from the registry by destination

get_for_node(remote)
get_for_node(remote :: node()) :: pid()

Get a connector from the registry by remote node

handle_call(msg, from, state)

Synchronous connect handler

Attempts to connect to the remote. This handler checks the cache before connecting to avoid a thundering herd.

Handles demonitoring a reference for a given pid

Cleans up the internal ETS record if it exists

Handle other info

If a call times out, the remote end might still reply later; that late reply arrives as a handle_info message.
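
A common defensive shape for such a handler, sketched here rather than taken from the module’s actual implementation, is a catch-all clause that discards unrecognized messages:

    # Late replies to a timed-out GenServer.call/3 arrive as {ref, reply}
    # info messages; a catch-all clause keeps them from crashing the server.
    def handle_info({ref, _late_reply}, state) when is_reference(ref) do
      {:noreply, state}
    end

    def handle_info(_other, state) do
      {:noreply, state}
    end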

Invoked when the server is started. start_link/3 or start/3 will block until it returns.

args is the argument term (second argument) passed to start_link/3.

Returning {:ok, state} will cause start_link/3 to return {:ok, pid} and the process to enter its loop.

Returning {:ok, state, timeout} is similar to {:ok, state} except handle_info(:timeout, state) will be called after timeout milliseconds if no messages are received within the timeout.

Returning {:ok, state, :hibernate} is similar to {:ok, state} except the process is hibernated before entering the loop. See c:handle_call/3 for more information on hibernation.

Returning {:ok, state, {:continue, continue}} is similar to {:ok, state} except that immediately after entering the loop the c:handle_continue/2 callback will be invoked with the value continue as first argument.

Returning :ignore will cause start_link/3 to return :ignore and the process will exit normally without entering the loop or calling c:terminate/2. If used when part of a supervision tree the parent supervisor will not fail to start nor immediately try to restart the GenServer. The remainder of the supervision tree will be started and so the GenServer should not be required by other processes. It can be started later with Supervisor.restart_child/2 as the child specification is saved in the parent supervisor. The main use cases for this are:

  • The GenServer is disabled by configuration but might be enabled later.
  • An error occurred and it will be handled by a different mechanism than the Supervisor. Likely this approach involves calling Supervisor.restart_child/2 after a delay to attempt a restart.

Returning {:stop, reason} will cause start_link/3 to return {:error, reason} and the process to exit with reason reason without entering the loop or calling c:terminate/2.

Callback implementation for GenServer.init/1.

monitor(target, ref, subscriber)
monitor(
  target :: ZenMonitor.destination(),
  ref :: reference(),
  subscriber :: pid()
) :: :ok

Asynchronously monitors a pid.

sweep_interval()
sweep_interval() :: integer()

Gets the sweep interval from the Application Environment

The sweep interval is the number of milliseconds to wait between sweeps; see ZenMonitor.Local.Connector’s @sweep_interval for the default value.

This can be controlled at boot and runtime with the {:zen_monitor, :connector_sweep_interval} setting; see ZenMonitor.Local.Connector.sweep_interval/1 for a runtime convenience function.
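
For example (100 is an arbitrary value, in milliseconds):

    Application.put_env(:zen_monitor, :connector_sweep_interval, 100)

    # Equivalent runtime convenience, via the function documented below:
    ZenMonitor.Local.Connector.sweep_interval(100)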

sweep_interval(value)
sweep_interval(value :: integer()) :: :ok

Puts the sweep interval into the Application Environment

This is a simple convenience function for overwriting the {:zen_monitor, :connector_sweep_interval} setting at runtime.