ExMCP.Transport.Beam.HealthMonitor (ex_mcp v0.9.2)

Health monitoring for MCP services in BEAM transport clustering.

Monitors service health through various mechanisms:

  • Process liveness checks
  • Custom health check callbacks
  • Response time monitoring
  • Circuit breaker integration

Automatically removes unhealthy services from the registry and notifies the cluster coordinator of health changes.

Health Check Methods

  • Process Monitor (:process): Monitor service processes for crashes
  • Ping (:ping): Send ping messages to verify responsiveness
  • Custom Callback (:custom): Use service-defined health check functions
  • Response Time (:response_time): Track response times and compare them against a threshold
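To make the process and ping methods concrete, here is a minimal sketch of a liveness check built only on standard `Process.monitor/1` and message passing. The module name `LivenessSketch` and the `{:ping, from, ref}` / `{:pong, ref}` message shapes are assumptions made for illustration; they are not taken from ex_mcp, whose actual ping protocol may differ.

```elixir
defmodule LivenessSketch do
  @moduledoc false
  # Hypothetical helper, not part of ex_mcp.

  # Returns :healthy if `pid` answers a ping within `timeout` milliseconds.
  # Returns :unhealthy if the process is dead, dies while we wait, or stays silent.
  def ping(pid, timeout \\ 3_000) when is_pid(pid) do
    # Monitoring a dead process delivers a :DOWN message immediately.
    ref = Process.monitor(pid)
    send(pid, {:ping, self(), ref})

    result =
      receive do
        {:pong, ^ref} -> :healthy
        {:DOWN, ^ref, :process, ^pid, _reason} -> :unhealthy
      after
        timeout -> :unhealthy
      end

    # Drop the monitor and flush any :DOWN message that is still queued.
    Process.demonitor(ref, [:flush])
    result
  end
end
```

Combining the monitor with the ping means a crashed service is reported as soon as the `:DOWN` message arrives, without waiting for the full timeout.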

Example Usage

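alias ExMCP.Transport.Beam.HealthMonitor

# registry_pid refers to an already-running service registry process
{:ok, monitor} = HealthMonitor.start_link(%{
  registry: registry_pid,
  check_interval: 5000,
  service_timeout: 3000,
  methods: [:process, :ping, :custom]
})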

Types

config()

@type config() :: %{
  registry: GenServer.server(),
  check_interval: non_neg_integer(),
  service_timeout: non_neg_integer(),
  methods: [health_method()],
  max_failures: non_neg_integer(),
  failure_window: non_neg_integer()
}
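The page does not say how `max_failures` and `failure_window` interact. One plausible reading, sketched below, is a sliding window: a service becomes unhealthy once `max_failures` failures fall within the last `failure_window` milliseconds. Both that interpretation and the millisecond unit are assumptions, and `FailureWindowSketch` is a hypothetical module, not ex_mcp code.

```elixir
defmodule FailureWindowSketch do
  @moduledoc false
  # Hypothetical illustration; ex_mcp's actual bookkeeping may differ.

  # `failures` is a list of timestamps (assumed milliseconds) of earlier failures.
  # Records a new failure at `now_ms`, drops failures older than the window,
  # and returns {status, kept_failures}.
  def record_failure(failures, now_ms, %{max_failures: max, failure_window: window}) do
    recent = Enum.filter([now_ms | failures], fn t -> now_ms - t <= window end)
    status = if length(recent) >= max, do: :unhealthy, else: :healthy
    {status, recent}
  end
end
```

Under this reading, isolated failures age out of the window, so only a burst of failures marks a service unhealthy.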

health_method()

@type health_method() :: :process | :ping | :custom | :response_time

health_status()

@type health_status() :: :healthy | :unhealthy | :unknown

Functions

check_service(monitor, service_id)

@spec check_service(GenServer.server(), String.t()) ::
  {:ok, health_status()} | {:error, term()}

Manually triggers a health check for a specific service.

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

See Supervisor.
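The wording above matches the default `child_spec/1` that `use GenServer` generates, which suggests the monitor can be listed as `{Module, config}` in a supervisor's children. The sketch below shows that mechanism with a stand-in module, `StandInMonitor`, so it runs without ex_mcp installed. Substituting `ExMCP.Transport.Beam.HealthMonitor` should behave the same way, assuming it does not override the default child spec.

```elixir
defmodule StandInMonitor do
  # Stand-in for the real monitor; hypothetical, for illustration only.
  use GenServer

  def start_link(config), do: GenServer.start_link(__MODULE__, config)

  @impl true
  def init(config), do: {:ok, config}
end

config = %{check_interval: 5000, service_timeout: 3000, methods: [:process]}

# The {module, arg} tuple is expanded via StandInMonitor.child_spec(config),
# which in turn tells the supervisor to call StandInMonitor.start_link(config).
{:ok, sup} = Supervisor.start_link([{StandInMonitor, config}], strategy: :one_for_one)

# Inspect the supervised child.
[{StandInMonitor, monitor_pid, :worker, [StandInMonitor]}] = Supervisor.which_children(sup)
```

Running the monitor under a supervisor means it is restarted after a crash, according to the supervisor's strategy.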

get_health_status(monitor, service_id)

@spec get_health_status(GenServer.server(), String.t()) ::
  {:ok, health_status()} | {:error, :not_found}

Gets health status for a specific service.
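Callers need to handle every shape in the spec above: `{:ok, status}` for each of the three `health_status()` values, plus `{:error, :not_found}`. A minimal sketch of one way to do that follows. `StatusHandlingSketch` and its `routable?/1` function are hypothetical names, and treating `:unknown` as not routable is a policy assumption rather than documented ex_mcp behavior.

```elixir
defmodule StatusHandlingSketch do
  @moduledoc false
  # Hypothetical helper: turns a get_health_status/2 result into a routing decision.

  # Only a confirmed healthy service receives traffic.
  def routable?({:ok, :healthy}), do: true

  # Assumption: :unknown is treated conservatively, the same as :unhealthy.
  def routable?({:ok, status}) when status in [:unhealthy, :unknown], do: false

  # The service was never registered with the monitor.
  def routable?({:error, :not_found}), do: false
end
```

With a running monitor, the result of `get_health_status(monitor, service_id)` can be piped straight into `StatusHandlingSketch.routable?/1`.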

get_stats(monitor)

@spec get_stats(GenServer.server()) :: {:ok, map()}

Gets health monitoring statistics.

monitor_service(monitor, service_id, health_config \\ %{})

@spec monitor_service(GenServer.server(), String.t(), map()) :: :ok

Adds a service to health monitoring.

start_link(config)

@spec start_link(config()) :: GenServer.on_start()

Starts the health monitor with the given configuration.

stop(monitor)

@spec stop(GenServer.server()) :: :ok

Stops the health monitor.

unmonitor_service(monitor, service_id)

@spec unmonitor_service(GenServer.server(), String.t()) :: :ok

Removes a service from health monitoring.