Service registry
Barrel P2P's service registry lets a process register itself under a name; every other node in the cluster can then find that process. The registration is replicated across the cluster without coordination, and it merges deterministically under concurrent updates.
This page covers the data model, the lifecycle, and the patterns you reach for when building on top.
The data model
The registry is an Observed-Remove Map (OR-Map): a CRDT designed for the "named registrations that come and go" pattern.
Three properties matter:
- Adds and removes commute. Two concurrent adds of the same name produce two entries. A subsequent remove only deletes the dots it has observed; new concurrent adds survive.
- Tombstones are bounded. Old removes carry only the dots they observed and are garbage-collected once every node has applied them.
- Causal merging. Two replicas merge by union of dots plus the rule that a tombstoned dot stays tombstoned.
Each registration carries a dot, a {node, hlc_timestamp}
pair. The hybrid logical clock guarantees that two registrations
from the same node are ordered and that two registrations from
different nodes can be compared causally. The CRDT does not need
synchronised wall clocks.
For the underlying clock, see hybrid logical clocks.
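To make the three properties concrete, here is a minimal sketch of an observed-remove structure in plain Erlang. It is not Barrel P2P's implementation. The module name `ormap_sketch`, its functions, and the integer counter standing in for the HLC timestamp are all assumptions made for illustration, and tombstone garbage collection is left out.

```erlang
-module(ormap_sketch).
-export([new/0, add/4, remove/2, merge/2, lookup/2, demo/0]).

%% State: {Entries, Tombstones}
%%   Entries    :: #{Dot => {Name, Value}}
%%   Tombstones :: set of Dot
%% A Dot is {Node, Counter}; Counter stands in for the HLC timestamp.

new() -> {#{}, sets:new()}.

%% Add a registration under a fresh dot.
add(Name, Value, Dot, {Entries, Tombs}) ->
    {Entries#{Dot => {Name, Value}}, Tombs}.

%% Remove only the dots for Name that this replica has observed.
remove(Name, {Entries, Tombs}) ->
    Observed = [D || {D, {N, _}} <- maps:to_list(Entries), N =:= Name],
    {maps:without(Observed, Entries),
     sets:union(Tombs, sets:from_list(Observed))}.

%% Merge: union of dots, except that a tombstoned dot stays tombstoned.
merge({E1, T1}, {E2, T2}) ->
    Tombs = sets:union(T1, T2),
    All = maps:merge(E1, E2),
    Live = maps:filter(fun(D, _) -> not sets:is_element(D, Tombs) end, All),
    {Live, Tombs}.

%% All live values registered under Name, sorted for stable output.
lookup(Name, {Entries, _}) ->
    lists:sort([V || {N, V} <- maps:values(Entries), N =:= Name]).

%% A remove on replica A races with an add on replica B.
demo() ->
    A0 = add(worker, from_a, {a, 1}, new()),
    B0 = merge(A0, new()),                 %% B has seen dot {a,1}
    A1 = remove(worker, A0),               %% A removes what it observed
    B1 = add(worker, from_b, {b, 1}, B0),  %% concurrent add on B
    AB = merge(A1, B1),
    BA = merge(B1, A1),
    {lookup(worker, AB), lookup(worker, AB) =:= lookup(worker, BA)}.
```

`ormap_sketch:demo()` returns `{[from_b], true}`: the concurrent add survives the remove, and merging in either order gives the same result.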
The registration lifecycle
The typical flow:
%% On node A
1> barrel_p2p:register_service(my_worker, #{role => primary}).
ok
%% A moment later, on node B. lookup/1 returns #service_entry{}
%% records (fields: name, pid, node, meta).
2> barrel_p2p:lookup(my_worker).
{ok, [{service_entry, my_worker, <0.123.0>, 'node1@host', #{role => primary}}]}
3> {ok, Node, Pid} = barrel_p2p:whereis_service(my_worker).
{ok, 'node1@host', <0.123.0>}
4> Pid ! Msg. %% standard Erlang send, opens dist on demand
What happens under the hood:
- register_service/2 inserts an entry into the local OR-Map with a fresh dot.
- The registry's barrel_p2p_replica instance builds a delta of "what changed" and broadcasts it through Plumtree.
- Each peer receives the delta and merges it into its own OR-Map.
- From the moment the merge completes on node B, B's whereis_service/1 finds the registration.
Replication is asynchronous and eventually consistent. The
delay between registration on A and visibility on B is
typically under one second on a healthy cluster. Code that
registers and immediately looks up on a different node should
expect to retry briefly; whereis_service/2 accepts a
retries option that does this for you.
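For calls such as lookup/1, which this page does not show taking a retries option, the same effect can be approximated by hand. The helper below is a sketch under that assumption. The module name `retry_sketch`, the function `with_retry/3`, and the fixed delay are hypothetical and not part of barrel_p2p.

```erlang
-module(retry_sketch).
-export([with_retry/3]).

%% Call Fun() until it returns something other than {error, not_found}.
%% Retries is the number of extra attempts after the first one;
%% DelayMs is the pause between attempts, in milliseconds.
with_retry(Fun, Retries, DelayMs)
  when is_function(Fun, 0), is_integer(Retries), Retries >= 0 ->
    case Fun() of
        {error, not_found} when Retries > 0 ->
            timer:sleep(DelayMs),
            with_retry(Fun, Retries - 1, DelayMs);
        Result ->
            Result
    end.
```

A call site might look like `retry_sketch:with_retry(fun() -> barrel_p2p:lookup(my_worker) end, 3, 100)`. Keeping the total wait near the expected replication delay avoids masking a registration that is genuinely missing.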
Local vs remote lookups
Three lookup primitives, in increasing scope:
%% Local pids only.
barrel_p2p:lookup_local(Name) -> {ok, pid()} | {error, not_found}.
%% Every replica we know about, including remote ones.
barrel_p2p:lookup(Name) -> {ok, [#service_entry{}]} | {error, not_found}.
%% Single best match: local preferred, then remote, then overlay.
barrel_p2p:whereis_service(Name) ->
    {ok, pid()}             %% local
  | {ok, node(), pid()}     %% remote
  | {error, not_found}.
barrel_p2p:whereis_service(Name, #{retries => 3}).
whereis_service/1 is the function you reach for from
application code. It tries the local registry, then the local
cache of remote registrations, and finally an overlay route
request. The retries option absorbs the small replication
delay after a fresh registration.
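Because whereis_service/1 has two success shapes, callers often collapse them into one before doing anything else. The helper below is one possible sketch. The module and function names are hypothetical, and it takes the result as an argument so that it does not depend on the registry itself.

```erlang
-module(whereis_sketch).
-export([normalize/1]).

%% Collapse the local and remote success shapes into {ok, Node, Pid}.
%% A local hit is tagged with the calling node's own name.
normalize({ok, Pid}) when is_pid(Pid) ->
    {ok, node(), Pid};
normalize({ok, Node, Pid}) when is_atom(Node), is_pid(Pid) ->
    {ok, Node, Pid};
normalize({error, not_found}) ->
    {error, not_found}.
```

Usage: `whereis_sketch:normalize(barrel_p2p:whereis_service(my_worker))`.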
Metadata
The second argument to register_service/2 (the third to register_service/3) is a metadata map. The map is
replicated alongside the name and pid:
barrel_p2p:register_service(worker_service, #{
    shard => 1,
    capacity => 10000,
    version => <<"1.4.2">>
}).
You can put anything that fits in a map there. Two common uses:
- Load balancing. A worker periodically re-registers with an updated load metric; a load balancer reads the map at lookup time and picks the least-loaded worker.
- Capability discovery. A service publishes a list of capabilities; clients filter by capability when picking a service instance.
The metadata is not indexed: lookup/1 returns all instances
and the caller filters. For most workloads this is fine; if you
need a queryable index, build it on top.
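Caller-side filtering can be as small as a list comprehension over the entries returned by lookup/1. The sketch below assumes services publish a list under a `capabilities` metadata key. That key, the module name, and the function name are hypothetical. The record is redefined locally, with the field order described earlier, only so that the example compiles on its own.

```erlang
-module(capability_sketch).
-export([with_capability/2]).

%% Local stand-in for the record (fields: name, pid, node, meta).
%% In a real project, include barrel_p2p.hrl instead of redefining it.
-record(service_entry, {name, pid, node, meta = #{}}).

%% Keep only the entries whose metadata lists Cap under the
%% (hypothetical) `capabilities` key. Input order is preserved.
with_capability(Cap, Entries) ->
    [E || #service_entry{meta = Meta} = E <- Entries,
          lists:member(Cap, maps:get(capabilities, Meta, []))].
```

For example, `{ok, Entries} = barrel_p2p:lookup(image_service)` followed by `capability_sketch:with_capability(resize, Entries)` would keep only the instances that advertise `resize`.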
Multiple instances per name
The OR-Map permits multiple entries under the same name, one per registering node:
%% On node A
barrel_p2p:register_service(worker, #{shard => 1}).
%% On node B
barrel_p2p:register_service(worker, #{shard => 2}).
%% From node C
{ok, Entries} = barrel_p2p:lookup(worker).
%% length(Entries) =:= 2
This is the natural shape for a worker pool. If you want
exclusivity, layer it on top: only one node calls
register_service/2 at a time, or your application checks the
existing entries before registering.
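The "check before registering" approach could be sketched as follows. The module and function names are hypothetical, and the check is best effort rather than a lock: because replication is asynchronous, two nodes can both see an empty result and both register, so a caller that needs strict exclusivity still has to resolve duplicates afterwards.

```erlang
-module(exclusive_sketch).
-export([register_if_absent/2]).

%% Register Name only if this node currently sees no instance of it.
%% Returns ok, or {error, {already_registered, Entries}} when at least
%% one instance is visible. Not atomic across the cluster.
register_if_absent(Name, Meta) ->
    case barrel_p2p:lookup(Name) of
        {ok, [_ | _] = Entries} ->
            {error, {already_registered, Entries}};
        _NotFoundOrEmpty ->
            barrel_p2p:register_service(Name, Meta)
    end.
```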
Service events
A subscriber receives messages when services come and go:
barrel_p2p:subscribe_services().
%% Receives:
%% {barrel_p2p_service_event, {service_registered, Name, Node}}
%% {barrel_p2p_service_event, {service_unregistered, Name, Node}}
%% {barrel_p2p_service_event, {service_down, Name, Node, Reason}}
service_down fires when the registered pid dies (the registry
monitors every pid it stores). The local node's service_down
fires immediately; remote nodes see it after the gossip
propagates.
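A subscriber usually folds these messages into some local view. The sketch below keeps a map from service name to the nodes currently hosting it, assuming one instance per node per name. The module and function names are hypothetical. It is written as a pure function so that it can be called from a handle_info/2 clause or a plain receive loop.

```erlang
-module(events_sketch).
-export([apply_event/2]).

%% Fold one registry event into Live :: #{Name => [Node]}.
apply_event({barrel_p2p_service_event, {service_registered, Name, Node}}, Live) ->
    maps:update_with(Name,
                     fun(Nodes) -> lists:usort([Node | Nodes]) end,
                     [Node],
                     Live);
apply_event({barrel_p2p_service_event, {service_unregistered, Name, Node}}, Live) ->
    drop(Name, Node, Live);
apply_event({barrel_p2p_service_event, {service_down, Name, Node, _Reason}}, Live) ->
    drop(Name, Node, Live);
apply_event(_Other, Live) ->
    Live.

%% Remove Node from Name's list; forget Name once no node hosts it.
drop(Name, Node, Live) ->
    case lists:delete(Node, maps:get(Name, Live, [])) of
        [] -> maps:remove(Name, Live);
        Nodes -> Live#{Name => Nodes}
    end.
```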
Via callbacks
Barrel P2P implements the standard via-name interface, so
gen_server (and any module that uses the same convention) can
use barrel_p2p names directly:
%% Start a gen_server registered through barrel_p2p
gen_server:start({via, barrel_p2p, my_service}, my_module, [], []).
%% Call it by name
gen_server:call({via, barrel_p2p, my_service}, request).
%% Send to it
barrel_p2p:send(my_service, Msg).
This makes barrel_p2p a drop-in replacement for gproc-style
name registration when the name needs to be cluster-wide.
For remote services, the registry returns a service proxy: a
local pid that forwards gen_server calls and casts to the
remote service. The proxy is what makes via-barrel_p2p work for
processes that live on another node.
Common patterns
Pool of workers
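-include_lib("barrel_p2p/include/barrel_p2p.hrl").
%% Return the entry with the lowest `load` value in its metadata.
%% Entries without a `load` key count as load 0. Crashes with badmatch
%% if lookup/1 returns {error, not_found}.
least_loaded(Name) ->
    {ok, Entries} = barrel_p2p:lookup(Name),
    Sorted = lists:sort(
        fun(A, B) ->
            maps:get(load, A#service_entry.meta, 0)
                =< maps:get(load, B#service_entry.meta, 0)
        end,
        Entries
    ),
    hd(Sorted).
A worker that re-registers periodically with an updated load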
key publishes a load signal that callers can use.
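The publishing side could look like the sketch below: a gen_server that re-registers itself on a timer. The module name, the 5-second interval, and the use of the scheduler run-queue length as the load metric are assumptions for illustration. The sketch also assumes that register_service/2 registers the calling process and that calling it again under the same name updates the metadata, as the re-registration pattern above implies.

```erlang
-module(load_reporter_sketch).
-behaviour(gen_server).

-export([start_link/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-define(INTERVAL_MS, 5000).  %% hypothetical refresh period

start_link(Name) ->
    gen_server:start_link(?MODULE, Name, []).

init(Name) ->
    self() ! refresh,          %% publish once right after start
    {ok, Name}.

handle_info(refresh, Name) ->
    %% Run-queue length is only a stand-in for a real load metric.
    Load = erlang:statistics(run_queue),
    _ = barrel_p2p:register_service(Name, #{load => Load}),
    erlang:send_after(?INTERVAL_MS, self(), refresh),
    {noreply, Name};
handle_info(_Other, Name) ->
    {noreply, Name}.

handle_call(_Request, _From, Name) ->
    {reply, {error, unsupported}, Name}.

handle_cast(_Msg, Name) ->
    {noreply, Name}.
```

A shorter interval gives callers a fresher signal at the cost of more gossip traffic, since every refresh is replicated to the cluster.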
Graceful degradation
call_service(Name, Request) ->
    case barrel_p2p:whereis_service(Name, #{retries => 3}) of
        {ok, Pid} ->
            call_with_timeout(Pid, Request);
        {ok, _Node, Pid} ->
            call_with_timeout(Pid, Request);
        {error, not_found} ->
            {error, unavailable}
    end.
call_with_timeout(Pid, Request) ->
    try gen_server:call(Pid, Request, 5000) of
        Reply -> Reply
    catch
        exit:{timeout, _} -> {error, timeout};
        exit:{noproc, _} -> {error, gone}
    end.
Cache invalidation via service events
init(_) ->
    barrel_p2p:subscribe_services(),
    {ok, #{cache => #{}}}.
handle_info({barrel_p2p_service_event, {service_down, Name, _N, _R}}, S) ->
    {noreply, S#{cache := maps:remove(Name, maps:get(cache, S))}};
handle_info(_, S) ->
    {noreply, S}.
Configuration knobs
The registry itself has no operator-tunable knobs; it follows the cluster topology. Performance settings that matter live under the gossip broadcast layer.
API
%% Register and unregister.
register_service(Name) -> ok | {error, term()}.
register_service(Name, Meta) -> ok | {error, term()}.
register_service(Name, Pid, Meta) -> ok | {error, term()}.
unregister_service(Name) -> ok.
%% Lookup.
lookup(Name) -> {ok, [#service_entry{}]} | {error, not_found}.
lookup_local(Name) -> {ok, pid()} | {error, not_found}.
list_services() -> [Name].
whereis_service(Name) ->
    {ok, pid()} | {ok, node(), pid()} | {error, not_found}.
whereis_service(Name, #{retries => N}) ->
    {ok, pid()} | {ok, node(), pid()} | {error, not_found}.
%% Via callbacks.
register_name(Name, Pid) -> yes | no.
unregister_name(Name) -> ok.
whereis_name(Name) -> pid() | undefined.
send(Name, Msg) -> pid().
%% Events.
subscribe_services() -> ok.
subscribe_services(Pid) -> ok.
unsubscribe_services(Pid) -> ok.
%% Global integration (beta).
global_register(Name) -> {ok, pid()} | {error, term()}.
get_proxy(Name) -> {ok, pid()} | not_found.
Related
- Gossip broadcast is the protocol underneath the registry's replication.
- Hybrid logical clocks are the timestamps in each OR-Map dot.
- Cluster membership provides the active view that the registry builds on.
- Distributed chat tutorial applies the registry to a real application.