Nebulex v2.0.0-rc.0 Nebulex.Adapters.Partitioned

Built-in adapter for partitioned cache topology.

A partitioned cache is a clustered, fault-tolerant cache that has linear scalability. Data is partitioned among all the machines of the cluster. For fault tolerance, partitioned caches can be configured to keep each piece of data on one or more unique machines within a cluster. This adapter, however, has no built-in fault tolerance: each piece of data is kept on a single node/machine (sharding), so if a node fails, the data kept by that node is no longer available to the rest of the cluster.

PG2 is used under the hood by the adapter to manage the cluster nodes. When the partitioned cache is started on a node, it creates a PG2 group and joins it (the PID of the cache supervisor is added to the group). Then, when a function is invoked, the adapter picks a node from the node list (built from the PG2 group members) and executes the function on that node. Likewise, when the supervisor process of the partitioned cache dies, its PID is automatically removed from the PG2 group. Because the node list can change in this way, it's recommended to use a consistent hashing algorithm for the node selector.
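The node-selection step can be sketched in plain Elixir. This is a minimal illustration, not the adapter's actual implementation: the module NodePickSketch, its functions, and the choice to sort the node list are all assumptions made for the example. It uses simple modulo hashing via :erlang.phash2/2 so that the effect of a node leaving the group can be measured.

```elixir
defmodule NodePickSketch do
  # Hypothetical helper, not part of Nebulex.
  # Maps a key to one node of a non-empty node list. The list is sorted
  # (an assumption of this sketch) so every caller sees the same ordering.
  def pick_node(nodes, key, keyslot \\ &:erlang.phash2/2) when nodes != [] do
    sorted = Enum.sort(nodes)
    slot = keyslot.(key, length(sorted))
    Enum.at(sorted, slot)
  end

  # Counts how many keys land on a different node after the node list changes.
  def moved_keys(keys, nodes_before, nodes_after) do
    Enum.count(keys, fn key ->
      pick_node(nodes_before, key) != pick_node(nodes_after, key)
    end)
  end
end

nodes = [:"a@host", :"b@host", :"c@host"]

# The node that would own "mykey" in this sketch.
IO.inspect(NodePickSketch.pick_node(nodes, "mykey"))

# Number of keys (out of 1000) that change owner when one node leaves.
IO.inspect(NodePickSketch.moved_keys(1..1_000, nodes, List.delete(nodes, :"c@host")))
```

With modulo hashing, keys owned by the surviving nodes can also move when the group shrinks. A consistent hashing function plugged in as the keyslot argument is meant to keep that number close to the share of keys owned by the departed node.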

NOTE: pg2 will be replaced by pg in the future, since the pg2 module is deprecated as of OTP 23 and scheduled for removal in OTP 24.

This adapter depends on a local cache adapter (the primary storage). It adds a thin layer on top of it in order to distribute requests across a group of nodes, on which the local cache is assumed to be running already. However, you don't need to define or declare an additional cache module for the local store; the adapter initializes it automatically (adding the local cache store to the supervision tree) based on the options given within the primary: argument.

Features

  • Support for partitioned topology (Sharding Distribution Model).
  • Support for transactions via Erlang global name registration facility.
  • Configurable primary store (primary local cache).
  • Configurable hash-slot module to compute the node.

We can define a partitioned cache as follows:

defmodule MyApp.PartitionedCache do
  use Nebulex.Cache,
    otp_app: :my_app,
    adapter: Nebulex.Adapters.Partitioned

  @behaviour Nebulex.Adapter.HashSlot

  @impl true
  def keyslot(key, range) do
    key
    |> :erlang.phash2()
    |> :jchash.compute(range)
  end
end

Where the configuration for the cache must be in your application environment, usually defined in your config/config.exs:

config :my_app, MyApp.PartitionedCache,
  hash_slot: MyApp.PartitionedCache,
  primary: [
    adapter: Nebulex.Adapters.Local,
    gc_interval: 86_400_000,
    backend: :shards,
    partitions: System.schedulers_online()
  ]

For more information about the usage, see Nebulex.Cache documentation.

Options

This adapter supports the following options and all of them can be given via the cache configuration:

  • :primary - The options that will be passed to the adapter associated with the local primary store. These options depend on the adapter to use, except for the shared option adapter: (see shared primary options below).

  • :hash_slot - Defines the module implementing Nebulex.Adapter.HashSlot behaviour.

  • :task_supervisor_opts - Start-time options passed to Task.Supervisor.start_link/1 when the adapter is initialized.
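As a rough illustration of how the three options fit together, the following configuration fragment sets all of them. The values are illustrative assumptions only: :max_restarts and :max_seconds are standard Task.Supervisor options, but whether they suit a given deployment is not something the source states.

```elixir
import Config

config :my_app, MyApp.PartitionedCache,
  # Module implementing the Nebulex.Adapter.HashSlot behaviour.
  hash_slot: MyApp.PartitionedCache,
  # Options for the local primary store; only :adapter is shared.
  primary: [adapter: Nebulex.Adapters.Local],
  # Illustrative values; forwarded to Task.Supervisor.start_link/1.
  task_supervisor_opts: [max_restarts: 5, max_seconds: 10]
```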

Shared Primary Options

  • :adapter - The adapter to be used for the partitioned cache as the local primary store. Defaults to Nebulex.Adapters.Local.

The rest of the options depend on the adapter to use.

Runtime options

These options apply to all of the adapter's functions.

  • :timeout - The time-out value in milliseconds for the command that will be executed. If the timeout is exceeded, then the current process will exit. For executing a command on remote nodes, this adapter uses Task.await/2 internally for receiving the result, so this option tells how much time the adapter should wait for it. If the timeout is exceeded, the task is shut down but the current process doesn't exit; only the result associated with that task is skipped in the reduce phase.
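The "shut down the task and skip its result" behaviour can be sketched with the standard Task module. This is not the adapter's code: the module TimeoutSketch is hypothetical, and it uses Task.yield_many/2 plus Task.shutdown/2 (rather than Task.await/2, which the adapter is described as using) because that combination lets the caller survive a timeout.

```elixir
defmodule TimeoutSketch do
  # Hypothetical helper, not part of Nebulex.
  # Runs each function in its own task and collects only the results that
  # arrive within `timeout` milliseconds.
  def collect(funs, timeout) do
    funs
    |> Enum.map(&Task.async/1)
    |> Task.yield_many(timeout)
    |> Enum.reduce([], fn
      # The task replied in time: keep its result.
      {_task, {:ok, result}}, acc ->
        [result | acc]

      # The task exited: skip its result.
      {_task, {:exit, _reason}}, acc ->
        acc

      # The task timed out: shut it down and skip its result.
      {task, nil}, acc ->
        Task.shutdown(task, :brutal_kill)
        acc
    end)
    |> Enum.reverse()
  end
end

slow = fn -> Process.sleep(500); :slow end
fast = fn -> :fast end

# Only the fast result is expected to survive a 100 ms timeout.
IO.inspect(TimeoutSketch.collect([fast, slow], 100))
```

The calling process keeps running after the timeout; it simply ends up with fewer results to reduce.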

Extended API

This adapter provides some additional convenience functions to the Nebulex.Cache API.

Retrieving the cluster nodes associated with the given cache name:

MyCache.nodes()
MyCache.nodes(:cache_name)

Getting a cluster node for the cache name based on the given key:

MyCache.get_node("mykey")
MyCache.get_node(:cache_name, "mykey")

If no cache name is passed to the previous functions, the name of the calling cache module is used by default.

Limitations

Nebulex.Cache.get_and_update/3 and Nebulex.Cache.update/4 both take an anonymous function as a parameter. An anonymous function is compiled into the module where it is created, which means it doesn't necessarily exist on remote nodes. To ensure these functions work as expected, you must provide functions from modules that exist on all nodes of the group.
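The difference between the two kinds of function can be seen with Function.info/2. In this minimal sketch the module MyApp.Counters is hypothetical and is assumed to be compiled and loaded on every node of the group; the commented-out cache call at the end assumes the MyApp.PartitionedCache module defined earlier and a running cache.

```elixir
defmodule MyApp.Counters do
  # Hypothetical module; it must exist on all nodes of the group.
  def increment(n) when is_integer(n), do: n + 1
end

# External capture: refers to the function by module, name and arity,
# so any node that has MyApp.Counters loaded can call it.
named = &MyApp.Counters.increment/1

# Anonymous function: tied to the code where it was created.
anonymous = fn n -> n + 1 end

IO.inspect(Function.info(named, :type))
IO.inspect(Function.info(anonymous, :type))

# Assumed usage (requires a running cache, so it is left commented out):
# MyApp.PartitionedCache.update(:hits, 1, &MyApp.Counters.increment/1)
```

Passing the external capture is the safer choice when the call may be executed on a remote node.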

Functions

eval_local_stream(primary, primary_meta, query, opts)

Helper to perform stream/3 locally.