WebsockexAdapter Supervision Strategy

Overview

WebsockexAdapter provides optional supervision for WebSocket client connections, ensuring resilience and automatic recovery from failures. This is critical for financial trading systems where connection stability directly impacts order execution and risk management.

Important: As a library, WebsockexAdapter does not start any supervisors automatically. You must explicitly add supervision to your application's supervision tree when needed.

Architecture

Your Application Supervisor
├── WebsockexAdapter.ClientSupervisor (optional DynamicSupervisor)
│   ├── Client GenServer 1
│   ├── Client GenServer 2
│   └── Client GenServer N
└── Your other children...

Key Components

1. ClientSupervisor (WebsockexAdapter.ClientSupervisor)

  • DynamicSupervisor for managing client connections
  • Restart strategy: :one_for_one (isolated failures)
  • Maximum 10 restarts in 60 seconds (configurable)
  • Each client runs independently
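
The settings listed above map onto standard DynamicSupervisor options. The following is a minimal sketch of a supervisor configured that way; `Sketch.ClientSupervisor` is a hypothetical stand-in written for illustration and is not the library's own module, and an `Agent` plays the role of a client process.

```elixir
defmodule Sketch.ClientSupervisor do
  # Hypothetical stand-in, not WebsockexAdapter.ClientSupervisor itself.
  use DynamicSupervisor

  def start_link(opts \\ []) do
    DynamicSupervisor.start_link(__MODULE__, opts, name: __MODULE__)
  end

  @impl true
  def init(opts) do
    # :one_for_one is the only strategy DynamicSupervisor supports.
    # The restart limits mirror the defaults described above.
    DynamicSupervisor.init(
      strategy: :one_for_one,
      max_restarts: Keyword.get(opts, :max_restarts, 10),
      max_seconds: Keyword.get(opts, :max_seconds, 60)
    )
  end
end

{:ok, sup} = Sketch.ClientSupervisor.start_link(max_restarts: 5, max_seconds: 30)

# Children are started on demand and supervised independently of each other.
{:ok, agent} = DynamicSupervisor.start_child(sup, {Agent, fn -> :connected end})
```

Because children are added at runtime, the supervisor can start with no children and grow as connections are requested.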

2. Client GenServer (WebsockexAdapter.Client)

  • Manages individual WebSocket connections
  • Handles Gun process ownership and message routing
  • Integrated heartbeat handling
  • Automatic reconnection on network failures

Usage Patterns

Pattern 1: No Supervision (Simple/Testing)

# Direct connection without supervision
{:ok, client} = WebsockexAdapter.Client.connect("wss://example.com")

# Use the client
WebsockexAdapter.Client.send_message(client, "Hello")

# Clean up when done
WebsockexAdapter.Client.close(client)

Pattern 2: Using ClientSupervisor

First, add the supervisor to your application:

defmodule MyApp.Application do
  use Application
  
  def start(_type, _args) do
    children = [
      # Add the WebsockexAdapter supervisor
      WebsockexAdapter.ClientSupervisor,
      # Your other children...
    ]
    
    Supervisor.start_link(children, strategy: :one_for_one)
  end
end

Then create supervised connections:

# Basic supervised connection
{:ok, client} = WebsockexAdapter.ClientSupervisor.start_client("wss://example.com")

# With configuration
{:ok, client} = WebsockexAdapter.ClientSupervisor.start_client("wss://example.com",
  retry_count: 10,
  heartbeat_config: %{type: :deribit, interval: 30_000}
)

Pattern 3: Direct Client Supervision

Add individual clients directly to your supervision tree:

defmodule MyApp.Application do
  use Application
  
  def start(_type, _args) do
    children = [
      # Supervise individual clients
      {WebsockexAdapter.Client, [
        url: "wss://exchange1.com",
        id: :exchange1_client,
        heartbeat_config: %{type: :deribit, interval: 30_000}
      ]},
      {WebsockexAdapter.Client, [
        url: "wss://exchange2.com", 
        id: :exchange2_client
      ]},
      # Your other children...
    ]
    
    Supervisor.start_link(children, strategy: :one_for_one)
  end
end

Restart Behavior

Transient Restart Strategy

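  • Clients use the :transient restart type, so they are restarted only if they exit abnormally (any exit reason other than :normal, :shutdown, or {:shutdown, term})
  • Normal shutdowns (via Client.close/1) don't trigger a restart
  • Crashes and unrecovered connection failures trigger an automatic restart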
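
The difference between the two exit paths can be observed with plain OTP. This sketch assumes a hypothetical `Sketch.TransientWorker` in place of a real client; the `:close` call is an illustrative name, not the library's API.

```elixir
defmodule Sketch.TransientWorker do
  # Hypothetical worker standing in for a client process.
  use GenServer, restart: :transient

  def start_link(arg), do: GenServer.start_link(__MODULE__, arg)

  @impl true
  def init(arg), do: {:ok, arg}

  @impl true
  def handle_call(:close, _from, state) do
    # Stopping with reason :normal counts as a clean shutdown.
    {:stop, :normal, :ok, state}
  end
end

{:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one)
{:ok, first} = DynamicSupervisor.start_child(sup, {Sketch.TransientWorker, :state})

# Abnormal exit: the supervisor starts a replacement process.
Process.exit(first, :kill)
Process.sleep(100)
[{_, second, :worker, _}] = DynamicSupervisor.which_children(sup)
after_crash = DynamicSupervisor.count_children(sup).active

# Normal exit: the child is removed and not restarted.
:ok = GenServer.call(second, :close)
Process.sleep(100)
after_close = DynamicSupervisor.count_children(sup).active
```

The short sleeps only give the supervisor time to process the exit signals in this demonstration; production code would not rely on them.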

Failure Scenarios

  1. Network Disconnection

    • Client detects the connection loss
    • Attempts internal reconnection (configurable number of retries)
    • If the maximum number of retries is exceeded, the GenServer exits abnormally
    • Supervisor restarts the client
  2. Process Crash

    • Supervisor immediately detects exit
    • Starts new client process
    • Connection re-established from scratch
  3. Heartbeat Failure

    • Client tracks heartbeat failures
    • Closes connection after threshold
    • Supervisor restarts for fresh connection
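
The first scenario (retry internally, then exit so the supervisor can take over) can be sketched as follows. This is not the library's implementation: `Sketch.RetryingClient`, its `connect_fun` option, and the `:max_retries_exceeded` exit reason are all assumptions made for illustration.

```elixir
defmodule Sketch.RetryingClient do
  # Hypothetical client; option names here are illustrative only.
  use GenServer, restart: :transient

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(opts) do
    state = %{
      connect_fun: Keyword.fetch!(opts, :connect_fun),
      retry_count: Keyword.get(opts, :retry_count, 3),
      retry_delay: Keyword.get(opts, :retry_delay, 10),
      attempts: 0
    }

    # Connect after init returns so the supervisor is not blocked.
    {:ok, state, {:continue, :connect}}
  end

  @impl true
  def handle_continue(:connect, state), do: try_connect(state)

  @impl true
  def handle_info(:retry, state), do: try_connect(state)

  defp try_connect(%{attempts: attempts, retry_count: max} = state) do
    case state.connect_fun.() do
      :ok ->
        {:noreply, %{state | attempts: 0}}

      {:error, _reason} when attempts < max ->
        Process.send_after(self(), :retry, state.retry_delay)
        {:noreply, %{state | attempts: attempts + 1}}

      {:error, reason} ->
        # Retries exhausted: exit abnormally so a supervisor would restart us.
        {:stop, {:max_retries_exceeded, reason}, state}
    end
  end
end

# Started unlinked here so the exit reason can be observed directly.
{:ok, pid} =
  GenServer.start(Sketch.RetryingClient,
    connect_fun: fn -> {:error, :econnrefused} end,
    retry_count: 2
  )

ref = Process.monitor(pid)

exit_reason =
  receive do
    {:DOWN, ^ref, :process, ^pid, reason} -> reason
  after
    1_000 -> :timeout
  end
```

Under a supervisor with the :transient restart type, this abnormal exit reason is what triggers the restart described in the list above.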

Production Considerations

1. Resource Management

  • Each supervised client consumes:
    • 1 Erlang process (Client GenServer)
    • 1 Gun connection process
    • Associated memory for state and buffers

2. Restart Limits

  • Default: 10 restarts in 60 seconds
  • Prevents restart storms
  • Adjust based on expected failure patterns

3. Monitoring

# List all supervised clients
clients = WebsockexAdapter.ClientSupervisor.list_clients()

# Check client health
health = WebsockexAdapter.Client.get_heartbeat_health(client)
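
Since ClientSupervisor is a DynamicSupervisor, the standard OTP introspection calls should also apply to it. The sketch below uses an anonymous DynamicSupervisor with `Agent` children as stand-ins for clients, so it makes no assumption about the return values of the library functions above.

```elixir
{:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one)
{:ok, _} = DynamicSupervisor.start_child(sup, {Agent, fn -> :exchange1 end})
{:ok, _} = DynamicSupervisor.start_child(sup, {Agent, fn -> :exchange2 end})

# Aggregate counts: a map with :specs, :active, :supervisors and :workers.
counts = DynamicSupervisor.count_children(sup)

# Per-child view; a child that is mid-restart reports :restarting instead of a pid.
pids =
  for {_id, pid, :worker, _modules} <- DynamicSupervisor.which_children(sup),
      is_pid(pid),
      do: pid

alive = Enum.count(pids, &Process.alive?/1)
```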

4. Graceful Shutdown

# Stop a specific client
WebsockexAdapter.ClientSupervisor.stop_client(pid)

# Client won't be restarted (normal termination)

Best Practices

  1. Use Supervision for Production

    • Run production clients under supervision, via ClientSupervisor.start_client/2 or direct client supervision (Pattern 3)
    • Use unsupervised direct connections only for testing/development
  2. Configure Appropriate Timeouts

    • Set heartbeat intervals based on exchange requirements
    • Configure retry counts for network conditions
  3. Monitor Client Health

    • Implement health checks using get_heartbeat_health/1
    • Set up alerts for excessive restarts
  4. Handle Restart Events

    • Subscriptions may need re-establishment
    • Authentication may need renewal
    • Order state should be reconciled
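
One way to react to restarts is to monitor the client process and replay session setup whenever a new process appears. The sketch below assumes the client is registered under a name; `Sketch.SessionManager`, the `:sketch_client` name, and the `setup` callback are hypothetical, and an `Agent` holding a set of channels stands in for the real client.

```elixir
defmodule Sketch.SessionManager do
  # Hypothetical: tracks a named client and replays setup after each restart.
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)
  def setups(manager), do: GenServer.call(manager, :setups)

  @impl true
  def init(opts) do
    state = %{
      name: Keyword.fetch!(opts, :client_name),
      setup: Keyword.fetch!(opts, :setup),
      setups: 0
    }

    {:ok, attach(state)}
  end

  @impl true
  def handle_call(:setups, _from, state), do: {:reply, state.setups, state}

  @impl true
  def handle_info({:DOWN, _ref, :process, _pid, _reason}, state) do
    {:noreply, attach(state)}
  end

  def handle_info(:attach, state), do: {:noreply, attach(state)}

  defp attach(state) do
    case Process.whereis(state.name) do
      nil ->
        # Not restarted yet: check again shortly.
        Process.send_after(self(), :attach, 20)
        state

      pid ->
        Process.monitor(pid)
        # Re-authenticate, re-subscribe, reconcile order state, and so on.
        state.setup.(pid)
        %{state | setups: state.setups + 1}
    end
  end
end

client_spec = %{
  id: :client,
  start: {Agent, :start_link, [fn -> MapSet.new() end, [name: :sketch_client]]}
}

{:ok, _sup} = Supervisor.start_link([client_spec], strategy: :one_for_one)

setup = fn client -> Agent.update(client, &MapSet.put(&1, "book.BTC-PERPETUAL.raw")) end
{:ok, manager} = Sketch.SessionManager.start_link(client_name: :sketch_client, setup: setup)

# Simulate a crash; the supervisor restarts the client and setup runs again.
Process.exit(Process.whereis(:sketch_client), :kill)
Process.sleep(200)

setups = Sketch.SessionManager.setups(manager)
subs = Agent.get(:sketch_client, & &1)
```

The restarted process begins with empty state, which is why the subscription only exists afterwards because setup was replayed.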

Example: Production Deribit Connection

defmodule TradingSystem.DeribitConnection do
  use GenServer
  # Logger functions are macros and need an explicit require
  require Logger

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  @impl true
  def init(opts) do
    # Start supervised connection
    url = "wss://test.deribit.com/ws/api/v2"
    config = [
      heartbeat_config: %{type: :deribit, interval: 30_000},
      retry_count: 10,
      retry_delay: 1_000
    ]

    {:ok, client} = WebsockexAdapter.ClientSupervisor.start_client(url, config)

    # Create adapter with supervised client
    adapter = %WebsockexAdapter.Examples.DeribitAdapter{
      client: client,
      authenticated: false,
      subscriptions: MapSet.new(),
      client_id: opts[:client_id],
      client_secret: opts[:client_secret]
    }

    # Authenticate and subscribe
    {:ok, adapter} = WebsockexAdapter.Examples.DeribitAdapter.authenticate(adapter)
    {:ok, adapter} = WebsockexAdapter.Examples.DeribitAdapter.subscribe(adapter, [
      "book.BTC-PERPETUAL.raw",
      "trades.BTC-PERPETUAL.raw",
      "user.orders.BTC-PERPETUAL.raw"
    ])

    {:ok, %{adapter: adapter}}
  end

  # Handle reconnection events. Gun delivers :gun_down to the connection's
  # owner process, so this clause only runs if those messages reach this process.
  @impl true
  def handle_info({:gun_down, _, _, _, _}, state) do
    # Log disconnection (Logger.warn/1 is deprecated in favor of Logger.warning/1)
    Logger.warning("Deribit connection lost, supervisor will restart")
    {:noreply, state}
  end
end

Supervision Tree Visualization

YourApp.Supervisor
├── WebsockexAdapter.ClientSupervisor
│   ├── Client_1 (Deribit Production)
│   ├── Client_2 (Deribit Test)
│   └── Client_3 (Binance)
└── YourApp.TradingEngine

With this supervision strategy, WebSocket clients that fail are restarted automatically, which is critical for 24/7 financial trading operations. Session state such as authentication and subscriptions still has to be restored after each restart.