GenHerder behaviour (gen_herder v0.1.2)

A behaviour for avoiding the stampeding-herd problem.

Rationale

On a cold cache, several processes may try to obtain the same data in parallel. Each of them performs the full computation and caches its result, so only subsequent calls benefit from the cache.

GenHerder ensures that, for several concurrent identical calls, the result will be computed only once and returned to all the callers.

Example

GenHerder takes care of the compute-once logic and only requires that the handle_request/1 and time_to_live/1 callbacks be implemented.

Here is a simple token generator that encodes the request as the result, with a random component and an expiry baked in.

defmodule TokenGenerator do
  use GenHerder

  # Callbacks

  def handle_request(request) do
    # Simulate work
    Process.sleep(2000)

    # Simply encode the request and a random component as the token
    access_token =
      %{request: request, ref: make_ref()} |> :erlang.term_to_binary() |> Base.encode64()

    %{access_token: access_token, expires_in: 2000}
  end

  def time_to_live(%{expires_in: expires_in} = _result) do
    # Make it expire 10% earlier
    trunc(expires_in * 0.9)
  end
end

# Start the process
{:ok, pid} = TokenGenerator.start_link()

# Usage
TokenGenerator.call(%{any: "kind", of: "data"})
#=> %{access_token: ..., expires_in: 2000}

No matter how many times TokenGenerator.call/1 is called with the same arguments in parallel within the time-to-live, handle_request/1 will be invoked only once.

Caching

Valid returns for time_to_live/1 are :infinity, to cache the result forever, or any integer, to cache the result for that many milliseconds. A TTL of 0 or smaller means the result is not cached at all, but it is still sent to all callers that made the request before it completed.
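The TTL rules above can be summarised as a small classification function. This is only a sketch: TtlSketch and cache_policy/1 are hypothetical names used for illustration and are not part of GenHerder.

```elixir
defmodule TtlSketch do
  # Hypothetical helper, not part of GenHerder: maps a TTL value to the
  # caching behaviour described in the text.
  @spec cache_policy(integer() | :infinity) ::
          :forever | {:for_ms, pos_integer()} | :no_cache
  def cache_policy(:infinity), do: :forever
  def cache_policy(ttl) when is_integer(ttl) and ttl > 0, do: {:for_ms, ttl}
  def cache_policy(ttl) when is_integer(ttl), do: :no_cache
end

IO.inspect(TtlSketch.cache_policy(:infinity))
IO.inspect(TtlSketch.cache_policy(1_800))
IO.inspect(TtlSketch.cache_policy(0))
```

In the :no_cache case the callers already waiting still receive the result; only the storing step is skipped.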

Supervision

You would typically add implementations of the behaviour to your supervision tree.

children = [
  TokenGenerator
]

Supervisor.start_link(children, strategy: :one_for_all)

It should be possible to start the GenHerder globally by providing the :name option as {:global, :anything} or by using a "via tuple". While this guarantees that only a single process for a given module is started across the cluster, the guarantee does not hold during a network split. It is up to you to decide whether the possibility of multiple GenHerders for the same module could result in inconsistencies in your app.
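To illustrate what the two naming schemes do, here is a sketch that uses a plain Agent as a stand-in for a GenHerder implementation, since the exact start_link options of GenHerder are not confirmed here. The names :token_generator and SketchRegistry are made up for the example.

```elixir
# Stand-in process: an Agent instead of a GenHerder implementation.
{:ok, pid} =
  Agent.start_link(fn -> :some_state end, name: {:global, :token_generator})

# The global name resolves to the pid that was just started.
^pid = :global.whereis_name(:token_generator)

# A second start under the same global name is rejected.
{:error, {:already_started, ^pid}} =
  Agent.start_link(fn -> :other_state end, name: {:global, :token_generator})

# A "via tuple" delegates name registration to another module, here Registry.
{:ok, _registry} = Registry.start_link(keys: :unique, name: SketchRegistry)
via_name = {:via, Registry, {SketchRegistry, :token_generator}}
{:ok, via_pid} = Agent.start_link(fn -> :some_state end, name: via_name)

# The registered process can be addressed through the same tuple.
IO.inspect(Agent.get(via_name, & &1))
```

Note that Registry is local to a node, so the via tuple shown here only gives per-node uniqueness; a cluster-wide registry would be needed for the global case.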

Under the hood

GenHerder employs a supervisor that supervises a GenServer and a TaskSupervisor.

The GenServer keeps track of all the processes that make a specific request. On an incoming request, if no such request was seen before (or it has expired), a task is spawned (supervised by the TaskSupervisor) and the caller is appended to a list of callers. If a task was spawned previously for the request but has not completed, the caller is simply added to the list.

When the task for a given request is completed, all the callers are notified and the result is cached for the duration of the TTL.

If a request is made for a value that has already been computed, and is still in the cache, the result is simply returned.
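The flow described above can be sketched with a small GenServer. This is not GenHerder's actual source: CoalescerSketch is a hypothetical module, it uses a linked Task rather than a supervised one for brevity, and it omits expiry and error handling.

```elixir
defmodule CoalescerSketch do
  # Illustrative only: not GenHerder's implementation.
  use GenServer

  def start_link(fun), do: GenServer.start_link(__MODULE__, fun)

  def call(server, request),
    do: GenServer.call(server, {:request, request}, :infinity)

  @impl true
  def init(fun), do: {:ok, %{fun: fun, pending: %{}, cache: %{}}}

  @impl true
  def handle_call({:request, request}, from, state) do
    case state do
      # Already computed: answer straight from the cache.
      %{cache: %{^request => result}} ->
        {:reply, result, state}

      # Computation in flight: remember one more caller.
      %{pending: %{^request => callers}} ->
        {:noreply, put_in(state.pending[request], [from | callers])}

      # First sighting: compute in a separate process, reply later.
      _ ->
        server = self()
        fun = state.fun
        Task.start_link(fn -> send(server, {:done, request, fun.(request)}) end)
        {:noreply, put_in(state.pending[request], [from])}
    end
  end

  @impl true
  def handle_info({:done, request, result}, state) do
    # Notify every waiting caller, then cache the result.
    {callers, pending} = Map.pop(state.pending, request, [])
    Enum.each(callers, &GenServer.reply(&1, result))
    cache = Map.put(state.cache, request, result)
    {:noreply, %{state | pending: pending, cache: cache}}
  end
end

# Count how often the slow function actually runs.
{:ok, counter} = Agent.start_link(fn -> 0 end)

slow = fn request ->
  Agent.update(counter, &(&1 + 1))
  Process.sleep(200)
  {:computed, request}
end

{:ok, server} = CoalescerSketch.start_link(slow)

# Five concurrent callers issue the same request.
results =
  1..5
  |> Enum.map(fn _ -> Task.async(fn -> CoalescerSketch.call(server, :token) end) end)
  |> Task.await_many()

invocations = Agent.get(counter, & &1)
IO.inspect({Enum.uniq(results), invocations})
```

Because the GenServer processes requests one at a time, each caller either joins the pending list or hits the cache, so the slow function runs once regardless of how the five calls interleave.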

Expiry works by sending a message to the GenServer to drop the given result. There is no guarantee regarding how long that message might be held up in the GenServer's mailbox, so a result may remain available slightly longer than its TTL.
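Message-based expiry of this kind can be sketched with Process.send_after/3. ExpirySketch is a hypothetical module, not GenHerder's code, and it makes a simplifying assumption: a key is stored only once, so stale timers from earlier writes are not cancelled.

```elixir
defmodule ExpirySketch do
  # Illustrative only: a tiny cache whose entries are dropped by a
  # delayed message sent to the GenServer itself.
  use GenServer

  def start_link(), do: GenServer.start_link(__MODULE__, %{})

  def put(server, key, value, ttl)
      when ttl == :infinity or (is_integer(ttl) and ttl > 0),
      do: GenServer.call(server, {:put, key, value, ttl})

  def get(server, key), do: GenServer.call(server, {:get, key})

  @impl true
  def init(cache), do: {:ok, cache}

  @impl true
  def handle_call({:put, key, value, ttl}, _from, cache) do
    # For :infinity nothing is scheduled, so the entry is never dropped.
    if is_integer(ttl), do: Process.send_after(self(), {:expire, key}, ttl)
    {:reply, :ok, Map.put(cache, key, value)}
  end

  def handle_call({:get, key}, _from, cache),
    do: {:reply, Map.fetch(cache, key), cache}

  @impl true
  def handle_info({:expire, key}, cache) do
    # Handled in mailbox order, possibly later than the nominal TTL.
    {:noreply, Map.delete(cache, key)}
  end
end

{:ok, cache} = ExpirySketch.start_link()
:ok = ExpirySketch.put(cache, :token, "abc", 100)
fresh = ExpirySketch.get(cache, :token)

# Wait well past the TTL before reading again.
Process.sleep(400)
expired = ExpirySketch.get(cache, :token)
IO.inspect({fresh, expired})
```

The expiry message queues behind any calls already in the mailbox, which is why the TTL is a lower bound on how long a result lives rather than an exact deadline.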

Since results are computed in tasks, computation does not block the GenServer.

Types

@type request() :: any()
@type result() :: any()

time_to_live()

@type time_to_live() :: integer() | :infinity

Callbacks

handle_request(request)

@callback handle_request(request()) :: result()

time_to_live(result)

@callback time_to_live(result()) :: time_to_live()