DiodeClient.NodeScorer (Diode Client v1.4.8)


Tracks per-node connection outcomes and drives backoff for reconnect attempts.

Reasoning

Not all nodes are equally reliable. Some accept connections and stay up; others fail to connect, drop connections, or crash. Without scoring we would retry bad nodes as often as good ones, wasting time and resources. The scorer records failures and successes per node (by server URL) and increases the delay before the next connect/restart attempt when a node has a negative score. That way we back off from unreliable nodes while still trying good ones at the normal rate.

The score is an integer in a bounded range. It decreases on failures (e.g. connect error or connection process crash) and increases when a connection becomes stable (e.g. first peak received). The delay before the next attempt is the base delay, plus extra time when the score is negative; the more negative the score, the longer we wait.
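The relationship between score and delay can be sketched as follows. This is a minimal illustration, not the module's actual code: `DelaySketch` is a hypothetical name, and the base delay, per-point step, and cap are assumed values (the 15 s base and 10 s step are the examples used elsewhere on this page; the cap is invented for the sketch).

```elixir
defmodule DelaySketch do
  # All constants are illustrative assumptions, not values read from DiodeClient.
  @base_delay_ms 15_000
  @step_ms 10_000
  @max_extra_ms 120_000

  # Delay in milliseconds for an integer score; only negative scores add time.
  def delay_ms(score) when is_integer(score) do
    # Number of points below zero (0 for zero or positive scores).
    negative_points = max(-score, 0)
    # Extra wait grows linearly with the negative points, up to the cap.
    extra_ms = min(negative_points * @step_ms, @max_extra_ms)
    @base_delay_ms + extra_ms
  end
end
```

Under these assumptions a score of 0 or above yields the base delay, and a score of -3 yields 45 seconds.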

Well-behaved nodes

  • Stable seed: Connects once, receives peaks, stays up. Score moves from 0 toward the positive cap with each reported success; delay stays at base (e.g. 15 s) on restart.
  • Brief outage then recovery: Node fails once or twice (e.g. SSL closed), then restarts and gets a peak. Score dips then recovers; after a few successes it is back in positive territory and delay is base again.
  • New node: A node never seen before has score 0, so the delay is the base delay; there is no penalty until we see failures.

Misbehaving nodes

  • Repeated connect failures: Node is down or rejecting connections. Each failed connect decrements the score; delay grows (e.g. 15 s + 10 s per negative point, capped). We retry less often and avoid hammering a bad host.
  • Connect then drop: Node accepts TLS, then closes the connection or crashes. The manager reports a crash and the score drops, so the next restart is delayed. If this repeats, the score stays negative and we back off further.
  • Flaky node: Intermittent failures and successes. Score oscillates; average behaviour determines whether we tend toward base delay (more successes) or longer delays (more failures).
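The scenarios above all come down to per-node bookkeeping with a clamped integer. The sketch below shows one way that could look; it is not the module's implementation. `ScoreSketch`, the bounds of -10 and 10, the step of one point per event, and the node URL are all assumptions made for illustration.

```elixir
defmodule ScoreSketch do
  # Bounds are assumptions for illustration; the real range is not stated here.
  @min_score -10
  @max_score 10

  # Apply one outcome (:success or :failure) to a node's score in the map.
  def record(scores, node_id, outcome) when outcome in [:success, :failure] do
    change = if outcome == :success, do: 1, else: -1
    # A node not yet in the map starts from 0, so its first score is just `change`.
    Map.update(scores, node_id, clamp(change), fn score -> clamp(score + change) end)
  end

  # Unknown nodes read as 0, so they carry no penalty.
  def score(scores, node_id), do: Map.get(scores, node_id, 0)

  # Keep the score inside the assumed bounds.
  defp clamp(score), do: score |> max(@min_score) |> min(@max_score)
end

# A flaky node (hypothetical URL): two failures, one success, one more failure.
flaky_scores =
  Enum.reduce([:failure, :failure, :success, :failure], %{}, fn outcome, acc ->
    ScoreSketch.record(acc, "node-a.example:41046", outcome)
  end)

IO.inspect(ScoreSketch.score(flaky_scores, "node-a.example:41046"), label: "flaky score")
```

Because the score is clamped, a node that has failed for a long time needs only a bounded number of successes to return to the base delay.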

Functions

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

See Supervisor.

get_delay(node_id)

Returns the recommended delay in milliseconds before the next connect/restart attempt. Returns base_delay when NodeScorer is not running.
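A caller would typically feed this value into a timer. The sketch below assumes a hypothetical `ReconnectSketch` module and a `{:reconnect, node_id}` message shape; neither is part of DiodeClient. The delay function is passed in as an argument so that the example runs on its own.

```elixir
defmodule ReconnectSketch do
  # Schedules a {:reconnect, node_id} message to `pid` after the delay that
  # `delay_fun` returns for this node. `delay_fun` is injected so the sketch
  # runs without DiodeClient loaded.
  def schedule(pid, node_id, delay_fun) when is_function(delay_fun, 1) do
    delay_ms = delay_fun.(node_id)
    timer_ref = Process.send_after(pid, {:reconnect, node_id}, delay_ms)
    {timer_ref, delay_ms}
  end
end
```

With the client loaded, `&DiodeClient.NodeScorer.get_delay/1` could be passed as `delay_fun`. Since `get_delay/1` falls back to the base delay when the scorer is not running, the caller should not need a separate code path for that case.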

report_failure(node_id)

Reports a connection failure or abort for the given node (server URL).

report_success(node_id)

Reports a stable connection for the given node (server URL).
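The two reporting functions are meant to be called at different points in a connection's life: failure on a connect error or crash, success once the connection is stable. A rough sketch of that mapping follows. `OutcomeSketch`, `StubScorer`, and the event names are hypothetical; the scorer module is passed in so the example runs without DiodeClient.

```elixir
defmodule OutcomeSketch do
  # Maps a connection event to a scorer call. `scorer` is any module exporting
  # report_failure/1 and report_success/1; the event shapes are assumptions.
  def handle_event(scorer, node_id, event) do
    case event do
      # The first peak is treated as the sign of a stable connection.
      :first_peak -> scorer.report_success(node_id)
      {:connect_error, _reason} -> scorer.report_failure(node_id)
      {:crashed, _reason} -> scorer.report_failure(node_id)
      _other -> :ignored
    end
  end
end

# Stand-in scorer that records calls by messaging the calling process.
defmodule StubScorer do
  def report_failure(node_id), do: send(self(), {:failure, node_id})
  def report_success(node_id), do: send(self(), {:success, node_id})
end
```

In real use the `scorer` argument would be `DiodeClient.NodeScorer`, with the server URL as `node_id`.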

start_link(opts \\ [])