Hybrid Logical Clock (HLC) for FerricStore.
An HLC combines a physical wall-clock component (milliseconds since epoch) with a logical counter to produce timestamps that are:
- Monotonically increasing -- even when the wall clock is not (NTP corrections, VM migration, etc.).
- Causally ordered -- merging a remote timestamp via
update/1ensures happens-before relationships are preserved across nodes. - Close to real time -- the physical component tracks
System.os_time(:millisecond)and only diverges when the wall clock jumps backward or a remote node is ahead.
Spec reference (2G.6)
- HLC is piggybacked on Raft heartbeats.
- Inter-node TTL precision is bounded to Raft heartbeat RTT (~10 ms).
- A telemetry warning is emitted at 500 ms drift between HLC physical and wall clock.
- TTL-sensitive reads are rejected at 1 000 ms drift.
- The HLC millisecond component is used for Stream ID generation (XADD
*). - HLC timestamps are stamped on commands before they enter Raft -- they
are not computed inside
apply().
Architecture
The hot-path functions now/0 and now_ms/0 are lock-free: they use
a single :atomics slot (stored in :persistent_term) instead of a
GenServer.call. Physical ms and logical counter are packed into one
64-bit integer, eliminating the two-slot race that broke monotonicity
under contention.
The GenServer is retained only for:
update/1-- merging remote timestamps from Raft heartbeats (~6 calls/sec). This must be serialized to avoid lost-update races between concurrent merges.- Process supervision -- the
init/1callback creates the atomics ref and stores it in:persistent_term.
Packed layout in a single unsigned 64-bit atomic:
|-- physical_ms (48 bits) --|-- logical (16 bits) --|48 bits covers ~8,920 years of milliseconds. 16 bits allows 65,535 increments per millisecond. If logical overflows, it spills into the physical bits — effectively advancing the clock by 1 ms, which is safe.
Usage
The application supervision tree starts a single named HLC process
(Ferricstore.HLC). Other modules obtain timestamps via:
{physical_ms, logical} = Ferricstore.HLC.now()
ms = Ferricstore.HLC.now_ms()When receiving a Raft heartbeat with a piggybacked timestamp:
:ok = Ferricstore.HLC.update(remote_timestamp)Timestamp representation
A timestamp is a 2-tuple {physical_ms, logical} where:
physical_ms-- milliseconds since Unix epoch (same scale asSystem.os_time(:millisecond))logical-- a non-negative integer counter that disambiguates events within the same physical millisecond
Timestamps are ordered lexicographically: physical first, then logical.
Cross-node synchronization
HLC sync across nodes currently happens via Raft heartbeats (~6/sec), giving ~150ms max drift. This is sufficient for all current use cases:
- Read-your-writes on the writing node: guaranteed by
ra:process_command(reply_from: :local). - Cross-node reads after quorum write: guaranteed by Raft consensus — all quorum nodes applied the command before the write returns.
- TTL expiry: second-granularity TTLs are unaffected by 150ms drift.
- Cross-shard WATCH: uses value hashing, not timestamps.
Per-command HLC stamping (not wired up)
The state machine can handle commands wrapped as
{inner_command, %{hlc_ts: {physical_ms, logical}}}. When apply/3
processes a wrapped command, it calls update/1 to merge the leader's
clock into the follower's HLC. This gives per-command causal sync (~0ms
drift) instead of heartbeat-level sync (~150ms drift).
Nothing sends wrapped commands today. Enable if any of these arise:
- Sub-second TTLs with cross-node reads — a key with
PX 100(100ms TTL) could appear alive on one node and expired on another within the 150ms drift window. - Cross-node causal ordering — if a feature needs "write A happened-before write B" guarantees across nodes beyond Raft log ordering (e.g., conflict resolution in multi-leader setups).
- Observed drift exceeding 500ms — if heartbeat intervals increase due to network issues or overloaded nodes.
To enable, stamp commands in Router.raft_write/4 before submitting:
stamped = {command, %{hlc_ts: HLC.now()}}
:ra.process_command(shard_id, stamped, opts)Cost: ~50ns per write (now/0 on leader) + one update/1 GenServer
call per follower during apply/3. CockroachDB and YugabyteDB stamp
every RPC and consider the cost negligible. Redis and Kvrocks do not
have cross-node transactions or HLC.
Graceful fallback
When the HLC GenServer has not been started (e.g. in unit tests that exercise
command modules without the full application), now/0 and now_ms/0 fall
back to System.os_time(:millisecond) with a logical counter of 0.
Summary
Types
An HLC timestamp: {physical_ms, logical_counter}.
Functions
Returns a specification to start this module under a supervisor.
Compares two HLC timestamps.
Returns true when drift exceeds the reject threshold (1 000 ms), meaning
TTL-sensitive reads should not be served.
Returns true when drift exceeds the reject threshold (1 000 ms).
Returns the absolute drift in milliseconds between the HLC physical component and the current wall clock.
Returns the absolute drift in milliseconds.
Extracts the millisecond (physical) component from an HLC timestamp.
Returns the current HLC timestamp as {physical_ms, logical}.
Returns the current HLC timestamp as {physical_ms, logical}.
Convenience: returns only the physical (millisecond) component of now/0.
Starts the HLC GenServer.
Merges a remote HLC timestamp received from another node.
Merges a remote HLC timestamp, sending the call to server.
Types
@type timestamp() :: {non_neg_integer(), non_neg_integer()}
An HLC timestamp: {physical_ms, logical_counter}.
Functions
Returns a specification to start this module under a supervisor.
See Supervisor.
Compares two HLC timestamps.
Returns :lt, :eq, or :gt following the same convention as
DateTime.compare/2.
Examples
iex> Ferricstore.HLC.compare({100, 0}, {200, 0})
:lt
iex> Ferricstore.HLC.compare({100, 1}, {100, 0})
:gt
iex> Ferricstore.HLC.compare({100, 0}, {100, 0})
:eq
@spec drift_exceeded?() :: boolean()
Returns true when drift exceeds the reject threshold (1 000 ms), meaning
TTL-sensitive reads should not be served.
Lock-free; does not call the GenServer.
@spec drift_exceeded?(GenServer.server()) :: boolean()
Returns true when drift exceeds the reject threshold (1 000 ms).
This overload accepts a server argument for backward compatibility.
Parameters
server-- ignored (kept for API compatibility).
@spec drift_ms() :: non_neg_integer()
Returns the absolute drift in milliseconds between the HLC physical component and the current wall clock.
This function is lock-free -- it reads the atomics ref directly.
Under normal single-node operation this is 0 or near-0. A non-zero drift
indicates that update/1 received a future timestamp from a remote node
or the wall clock jumped backward.
@spec drift_ms(GenServer.server()) :: non_neg_integer()
Returns the absolute drift in milliseconds.
This overload accepts a server argument for backward compatibility.
Parameters
server-- ignored (kept for API compatibility).
@spec encode_ms(timestamp()) :: non_neg_integer()
Extracts the millisecond (physical) component from an HLC timestamp.
This is used when a caller needs a plain integer millisecond value, for example as the ms part of a Redis Stream ID.
Examples
iex> Ferricstore.HLC.encode_ms({1_234_567_890, 42})
1_234_567_890
@spec now() :: timestamp()
Returns the current HLC timestamp as {physical_ms, logical}.
This function is lock-free -- it reads and updates a single packed
:atomics slot directly, bypassing the GenServer. Physical ms and logical
counter are packed into one 64-bit integer, so updates are truly atomic
with no torn-read races.
On the common path (same millisecond), uses add_get(ref, 1, 1) which
atomically increments the logical counter with zero contention. CAS is
only used at millisecond boundaries.
Falls back to {System.os_time(:millisecond), 0} when the HLC GenServer
has not been started.
Examples
iex> {phys, logical} = Ferricstore.HLC.now()
iex> phys > 0
true
iex> logical >= 0
true
@spec now(GenServer.server()) :: timestamp()
Returns the current HLC timestamp as {physical_ms, logical}.
This overload accepts a server argument for backward compatibility with
existing tests. In production, prefer the zero-arity now/0 which is
lock-free.
Parameters
server-- ignored (kept for API compatibility). The atomics ref is always read from:persistent_term.
@spec now_ms() :: non_neg_integer()
Convenience: returns only the physical (millisecond) component of now/0.
This is the value used as the millisecond part of Redis Stream IDs and as the base for TTL computations. Lock-free; does not call the GenServer.
Falls back to System.os_time(:millisecond) when the HLC GenServer is
not running.
@spec now_ms(GenServer.server()) :: non_neg_integer()
Convenience: returns only the physical (millisecond) component of now/0.
This overload accepts a server argument for backward compatibility.
Parameters
server-- ignored (kept for API compatibility).
@spec start_link(keyword()) :: GenServer.on_start()
Starts the HLC GenServer.
The GenServer creates an :atomics ref on init and stores it in
:persistent_term so that now/0 and now_ms/0 can read/write it
without a GenServer call.
Options
:name-- process name (default:Ferricstore.HLC)
@spec update(timestamp()) :: :ok
Merges a remote HLC timestamp received from another node.
Called in two contexts:
- Raft heartbeats (~6 calls/sec) -- the leader's HLC timestamp is piggybacked on heartbeat RPCs. This is the current sync mechanism, giving ~150ms max drift between nodes.
- Per-command stamping (not wired up) -- if commands are wrapped as
{inner_command, %{hlc_ts: ts}}, the state machine callsupdate/1duringapply/3on each follower, giving per-command causal sync. See the module doc for when to enable this.
This is a GenServer.call because remote timestamp merges must be
serialized to avoid lost-update races. This is acceptable because merges
are rare (~6 calls/sec from Raft heartbeats).
The merge rule follows the standard HLC algorithm:
new_physical = max(wall_clock, local_physical, remote_physical)- If all three physical values tie,
logical = max(local_logical, remote_logical) + 1. - If two tie at the max, the logical from the winner is incremented.
- If wall clock alone wins, logical resets to 0.
Parameters
remote_ts-- the remote HLC timestamp{physical_ms, logical}
@spec update(GenServer.server(), timestamp()) :: :ok
Merges a remote HLC timestamp, sending the call to server.
Parameters
server-- the HLC processremote_ts-- the remote HLC timestamp{physical_ms, logical}