ExternalService (ExternalService v2.0.0-rc.2)


ExternalService handles all retry and circuit breaker logic for calls to external services.

The recommended entry point is the declarative, module-based API: adding use ExternalService to a module (see __using__/1) lets you configure a service's circuit breaker, rate limiting, and default retry options in one place. The functional API (start/2, call/3, and friends) is the lower-level foundation it is built on, and can be used directly when you need more control.

Telemetry

ExternalService emits :telemetry events so that calls to external services can be observed and instrumented. Attach a handler to any of the events below to forward them to your metrics or logging backend.

All events carry a :service key in their metadata, which is the name of the service the event relates to.

  • [:external_service, :call, :start] - emitted when a guarded call begins.

    • Measurements: :system_time, :monotonic_time
    • Metadata: :service
  • [:external_service, :call, :stop] - emitted when a guarded call completes (including when it returns an error such as ExternalService.RetriesExhausted or ExternalService.CircuitBreakerOpen).

    • Measurements: :duration, :monotonic_time
    • Metadata: :service, :result (the value returned from the call)
  • [:external_service, :call, :exception] - emitted when a guarded call raises (for example a non-retriable exception, or call!/3 raising on an open circuit breaker or exhausted retries).

    • Measurements: :duration, :monotonic_time
    • Metadata: :service, :kind, :reason, :stacktrace
  • [:external_service, :call, :retry] - emitted each time a call's function fails in a way that melts the circuit breaker: it returned :retry / {:retry, reason}, it returned a result matched by the :retry_on predicate, or it raised an exception listed in the :retry_exceptions retry option. Exceptions not listed in :retry_exceptions neither melt the breaker nor emit this event. Whether another attempt is actually made depends on the retry options.

    • Measurements: :count (always 1)
    • Metadata: :service, :reason
  • [:external_service, :circuit_breaker, :blown] - emitted when a call is rejected because the service's circuit breaker is blown.

    • Measurements: :count (always 1)
    • Metadata: :service
  • [:external_service, :rate_limit, :sleep] - emitted when a call is throttled and put to sleep to stay within the configured rate limit.

    • Measurements: :sleep_time (milliseconds)
    • Metadata: :service
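As a rough illustration of consuming the events listed above, the sketch below attaches a single handler with :telemetry.attach_many/4 and logs a line per event. The module name, handler id, and message wording are hypothetical; only the event names and measurement/metadata keys come from the list above. It also assumes :duration is reported in native time units, as is conventional for :telemetry spans, which the list above does not state.

```elixir
defmodule MyApp.ExternalServiceTelemetry do
  # Hypothetical module; requires the :telemetry dependency for attach/0.
  require Logger

  @events [
    [:external_service, :call, :stop],
    [:external_service, :call, :retry],
    [:external_service, :circuit_breaker, :blown],
    [:external_service, :rate_limit, :sleep]
  ]

  # Attach one handler to several events, typically from Application.start/2.
  def attach do
    :telemetry.attach_many("my-app-external-service", @events, &__MODULE__.handle_event/4, nil)
  end

  # Handler invoked by :telemetry with (event, measurements, metadata, config).
  def handle_event(event, measurements, metadata, _config) do
    Logger.info(describe(event, measurements, metadata))
  end

  # Pure formatting, kept separate from logging so it can be unit tested.
  def describe([:external_service, :call, :stop], %{duration: duration}, %{service: service}) do
    # Assumption: :duration is in native time units.
    ms = System.convert_time_unit(duration, :native, :millisecond)
    "#{inspect(service)} call finished in #{ms}ms"
  end

  def describe([:external_service, :call, :retry], _measurements, %{service: service, reason: reason}) do
    "#{inspect(service)} call failed and may be retried: #{inspect(reason)}"
  end

  def describe([:external_service, :circuit_breaker, :blown], _measurements, %{service: service}) do
    "#{inspect(service)} call rejected: circuit breaker blown"
  end

  def describe([:external_service, :rate_limit, :sleep], %{sleep_time: ms}, %{service: service}) do
    "#{inspect(service)} throttled for #{ms}ms"
  end
end
```

Passing the handler as an external capture (&__MODULE__.handle_event/4) rather than an anonymous function follows the :telemetry recommendation for handler performance.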

Summary

Types

circuit_breaker_open()

Error returned when a service's circuit breaker is open

error()

Union type representing all the possible error return values

options()

Options for start/2. See the schema documented under start/2.

retries_exhausted()

Error returned when the allowable number of retries has been exceeded

service()

A term that uniquely identifies an external service.

service_not_started()

Error returned when a service has not been started with ExternalService.start/2

sleep_function()

The sleep function called when a call is throttled to stay within the rate limit.

Functions

__using__(opts)

Defines a module-based gateway to an external service.

all_available?(services)

Returns true only if every service in services is available (see available?/1).

available?(service)

Returns true if the service is currently available, meaning its circuit breaker is not blown.

blown?(service)

Returns true if the service's circuit breaker is currently blown.

call(service, function)

Executes a function for the given service, handling retry and circuit breaker logic.

call!(service, function)

Like call/3, but raises an exception if retries are exhausted or the circuit breaker is open.

call_async(service, function)

Asynchronous version of ExternalService.call.

call_async_stream(enumerable, service, function)

Parallel, streaming version of ExternalService.call.

call_async_stream(enumerable, service, retry_opts_or_async_opts, function)

Parallel, streaming version of ExternalService.call.

call_async_stream(enumerable, service, retry_opts, async_opts, function)

Parallel, streaming version of ExternalService.call.

reset(service)

Resets the circuit breaker for the given service.

start(service, options \\ [])

Initializes the circuit breaker (and optional rate limiting and default retry options) for a specific service.

stop(service)

Stops the fuse for a specific service.

Types

circuit_breaker_open()

@type circuit_breaker_open() :: {:error, ExternalService.CircuitBreakerOpen.t()}

Error returned when a service's circuit breaker is open

error()

Union type representing all the possible error return values

options()

@type options() :: keyword()

Options for start/2. See the schema documented under start/2.

retriable_function()

@type retriable_function() :: (-> retriable_function_result())

retriable_function_result()

@type retriable_function_result() ::
  :retry | {:retry, reason :: any()} | (function_result :: any())

retries_exhausted()

@type retries_exhausted() :: {:error, ExternalService.RetriesExhausted.t()}

Error returned when the allowable number of retries has been exceeded

service()

@type service() :: term()

A term that uniquely identifies an external service.

service_not_started()

@type service_not_started() :: {:error, ExternalService.ServiceNotStarted.t()}

Error returned when a service has not been started with ExternalService.start/2

sleep_function()

@type sleep_function() :: (non_neg_integer() -> any())

The sleep function called when a call is throttled to stay within the rate limit.

Blocking the calling process for an extended period is sometimes undesirable (for example in tests), so this can be overridden. Defaults to Process.sleep/1.
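A minimal sketch of overriding the sleep function in a test, so that throttled calls report how long they would have slept instead of blocking. The service name, limits, and message shape are hypothetical, and the sketch assumes the function is invoked with the sleep time in milliseconds, as the type above suggests.

```elixir
# Capture the test process so the sleep function can report back to it.
test_pid = self()

:ok =
  ExternalService.start(:search_api,
    rate_limit: [limit: 5, per: :timer.seconds(1)],
    # Record the requested sleep instead of blocking the caller.
    sleep_function: fn ms -> send(test_pid, {:would_sleep, ms}) end
  )
```

A test could then use ExUnit's assert_receive to check for a {:would_sleep, ms} message once the limit is exceeded.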

Functions

__using__(opts)

(macro)

Defines a module-based gateway to an external service.

use ExternalService generates a small, declarative wrapper around the functional API. Configure the circuit breaker, rate limiting, and default retry options at the module level, then start the module under a supervisor and call the service through the generated call/1 (and friends).

Example

defmodule MyApp.Stripe do
  use ExternalService,
    circuit_breaker: [tolerate: 5, within: :timer.seconds(1), reset: :timer.seconds(5)],
    rate_limit: [limit: 100, per: :timer.seconds(1)],
    retry: [max_attempts: 5, backoff: :exponential, jitter: true]

  def charge(params) do
    call fn ->
      case Stripe.charge(params) do
        {:ok, result} -> {:ok, result}
        {:error, %{status: status}} when status in 500..599 -> :retry
        other -> other
      end
    end
  end
end

Start it under your supervision tree:

children = [MyApp.Stripe]
Supervisor.start_link(children, strategy: :one_for_one)

Configuration can be overridden when starting (useful in tests), and is deep merged with the options given to use:

{MyApp.Stripe, circuit_breaker: [tolerate: 1], retry: [max_attempts: 1]}
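One way this might look in a test, using ExUnit's start_supervised!/1 so the service is stopped when the test finishes. The test module is hypothetical, and the assertion assumes the service is registered under the module name (the default for :name described below).

```elixir
defmodule MyApp.StripeTest do
  use ExUnit.Case, async: false

  setup do
    # Deep merge: only :tolerate and :max_attempts change. With the `use`
    # options shown above, the effective circuit breaker config would be
    # [tolerate: 1, within: 1000, reset: 5000], and :rate_limit is untouched.
    start_supervised!({MyApp.Stripe, circuit_breaker: [tolerate: 1], retry: [max_attempts: 1]})
    :ok
  end

  test "the service is available once started" do
    # Assumes the default :name, i.e. the module itself.
    assert ExternalService.available?(MyApp.Stripe)
  end
end
```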

Options

Accepts the same options as start/2 (:circuit_breaker, :rate_limit, :retry, :sleep_function), plus:

  • :name - the term that identifies the service. Defaults to the module name.

Generated functions

all_available?(services)

@spec all_available?([service()]) :: boolean()

Returns true only if every service in services is available (see available?/1).

Useful for guarding work that depends on several services at once.

Examples

if ExternalService.all_available?([:payments, :inventory]) do
  place_order(order)
else
  {:error, :service_unavailable}
end

available?(service)

@spec available?(service()) :: boolean()

Returns true if the service is currently available, meaning its circuit breaker is not blown.

This is useful for the circuit breaker pattern: before kicking off expensive work, you can check whether the services it depends on are available and bail out early (returning a degraded response) if any of them are not.

A service that has not been started (see start/2) is reported as not available. Note that availability can change between this check and a subsequent call/3, so this is a best-effort signal, not a guarantee.

Examples

if ExternalService.available?(:payments) do
  charge(order)
else
  {:error, :payments_unavailable}
end

blown?(service)

@spec blown?(service()) :: boolean()

Returns true if the service's circuit breaker is currently blown.

A service that has not been started (see start/2) is not considered blown; use available?/1 if you want "ready to use" semantics that also account for services that were never started.
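The distinction between blown?/1 and available?/1 can be sketched as a small helper that classifies a service into three states. The module and the returned atoms are hypothetical; the logic relies only on the behaviour described above (a service that was never started is neither available nor blown).

```elixir
defmodule MyApp.ServiceStatus do
  # Hypothetical helper built on available?/1 and blown?/1.
  def status(service) do
    cond do
      # Started and the breaker is not blown.
      ExternalService.available?(service) -> :available
      # Started, but the breaker is currently blown.
      ExternalService.blown?(service) -> :blown
      # Neither available nor blown: start/2 was never called.
      true -> :not_started
    end
  end
end
```

Because the two checks are separate calls, the state may change between them, so the result is a best-effort snapshot.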

call(service, function)

@spec call(service(), retriable_function()) :: error() | (function_result :: any())

Executes a function for the given service, handling retry and circuit breaker logic.

ExternalService.start/2 must be called for the service before using call.

The provided function can indicate that a retry should be performed by returning the atom :retry or a tuple of the form {:retry, reason}, where reason is any arbitrary term. Any other result is considered successful, so the operation will not be retried and the result of the function will be returned as the result of call.
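A hedged sketch of the retry protocol with the functional API. The service name and MyApp.Http.get/1 are hypothetical stand-ins for a real client, and matching on the error structs assumes the t() types shown under Types are ordinary structs.

```elixir
# Hypothetical service; start/2 must run before call/2.
:ok = ExternalService.start(:weather_api)

result =
  ExternalService.call(:weather_api, fn ->
    case MyApp.Http.get("/forecast") do
      {:ok, %{status: 200, body: body}} -> {:ok, body}
      # Ask for a retry and record why.
      {:ok, %{status: 503}} -> {:retry, :service_unavailable}
      # Ask for a retry without a reason.
      {:error, :timeout} -> :retry
      # Anything else counts as a final result and is returned as-is.
      other -> other
    end
  end)

case result do
  {:error, %ExternalService.RetriesExhausted{}} -> {:error, :gave_up}
  {:error, %ExternalService.CircuitBreakerOpen{}} -> {:error, :breaker_open}
  {:error, %ExternalService.ServiceNotStarted{}} -> {:error, :not_started}
  other -> other
end
```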

For functions that were not written to return :retry/{:retry, reason}, the :retry_on retry option takes a predicate that is run on the return value; when it returns a truthy value the call is retried as though the function had returned {:retry, result} (the result becomes the retry reason and the circuit breaker melts). An explicit :retry/{:retry, reason} return always takes precedence over the predicate.

Raised exceptions are only retried if their type is listed in the :retry_exceptions retry option (which defaults to []); otherwise they propagate to the caller untouched. An exception that is not retried also does not melt the circuit breaker. In other words, :retry_exceptions governs both retrying and whether a raised exception counts as a circuit-breaker failure.

retry_opts may be an ExternalService.RetryOptions.t/0 struct or a keyword list of retry options. A keyword list is treated as per-call overrides: it is merged onto the service's configured default retry options (from start/2), so it overrides only the keys it lists and inherits the rest. A RetryOptions struct, being a complete set of options, replaces the service defaults entirely. When omitted (the two-argument form call/2), the service's configured defaults are used.
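As an illustration of per-call overrides, the sketch below passes a keyword list to call/3. The service name, MyApp.Http.get/1, and MyApp.Http.ConnectionError are hypothetical; :max_attempts is taken from the __using__/1 example, and :retry_on and :retry_exceptions from the paragraphs above.

```elixir
ExternalService.call(
  :weather_api,
  [
    # Overrides only these keys; other retry options come from start/2.
    max_attempts: 2,
    # Retry when the plain return value looks like a timeout.
    retry_on: fn result -> match?({:error, :timeout}, result) end,
    # Retry (and melt the breaker) when this hypothetical exception is raised.
    retry_exceptions: [MyApp.Http.ConnectionError]
  ],
  fn -> MyApp.Http.get("/forecast") end
)
```

This style suits client functions that were not written to return :retry, since the retry policy stays outside the function body.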

call(service, retry_opts, function)

@spec call(
  service(),
  ExternalService.RetryOptions.t() | keyword(),
  retriable_function()
) ::
  error() | (function_result :: any())

call!(service, function)

@spec call!(service(), retriable_function()) :: function_result :: any() | no_return()

Like call/3, but raises an exception if retries are exhausted or the circuit breaker is open.
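A sketch of handling the raising variant. It assumes the exceptions raised are the ExternalService.RetriesExhausted and ExternalService.CircuitBreakerOpen modules named elsewhere on this page, which is not stated explicitly here. The module, service name, and MyApp.Http.get/1 are hypothetical.

```elixir
defmodule MyApp.Forecast do
  require Logger

  def fetch do
    ExternalService.call!(:weather_api, fn -> MyApp.Http.get("/forecast") end)
  rescue
    # Assumption: these are the exception modules raised by call!.
    error in [ExternalService.RetriesExhausted, ExternalService.CircuitBreakerOpen] ->
      Logger.warning("weather_api unavailable: " <> Exception.message(error))
      {:error, :weather_unavailable}
  end
end
```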

call!(service, retry_opts, function)

@spec call!(
  service(),
  ExternalService.RetryOptions.t() | keyword(),
  retriable_function()
) ::
  function_result :: any() | no_return()

call_async(service, function)

@spec call_async(service(), retriable_function()) :: Task.t()

Asynchronous version of ExternalService.call.

Returns a Task that may be used to retrieve the result of the async call.
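For example, the returned task might be awaited with Task.await/2 after doing other work. The service name and MyApp.Http.get/1 are hypothetical.

```elixir
task = ExternalService.call_async(:weather_api, fn -> MyApp.Http.get("/forecast") end)

# ... do other work while the call runs ...

# Block for up to 10 seconds; the value is whatever call would have returned.
result = Task.await(task, 10_000)
```

As with any Task, the await must happen in the process that started the task.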

call_async(service, retry_opts, function)

call_async_stream(enumerable, service, function)

@spec call_async_stream(Enumerable.t(), service(), (any() ->
                                                retriable_function_result())) ::
  Enumerable.t()

Parallel, streaming version of ExternalService.call.

See call_async_stream/5 for full documentation.

call_async_stream(enumerable, service, retry_opts_or_async_opts, function)

@spec call_async_stream(
  Enumerable.t(),
  service(),
  ExternalService.RetryOptions.t() | (async_opts :: list()),
  (any() -> retriable_function_result())
) :: Enumerable.t()

Parallel, streaming version of ExternalService.call.

See call_async_stream/5 for full documentation.

call_async_stream(enumerable, service, retry_opts, async_opts, function)

@spec call_async_stream(
  Enumerable.t(),
  service(),
  ExternalService.RetryOptions.t() | keyword() | nil,
  async_opts :: list(),
  (any() -> retriable_function_result())
) :: Enumerable.t()

Parallel, streaming version of ExternalService.call.

This function uses Elixir's built-in Task.async_stream/3 function and the description below is taken from there.

Returns a stream that runs the given function concurrently on each item in enumerable.

Each item in enumerable is passed as the argument to function and processed by its own task. The tasks are linked to the current process, similarly to Task.async/1.
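A rough usage sketch of the five-argument form. It assumes async_opts is forwarded to Task.async_stream/3 (so options such as :max_concurrency and :timeout apply), that passing nil for retry_opts falls back to the service defaults, and that each element of the stream is wrapped in {:ok, result} as Task.async_stream/3 does. The service name and MyApp.Http.get/1 are hypothetical.

```elixir
results =
  ["london", "paris", "tokyo"]
  |> ExternalService.call_async_stream(
    :weather_api,
    # Assumption: nil means "use the service's default retry options".
    nil,
    # Assumption: forwarded to Task.async_stream/3.
    [max_concurrency: 2, timeout: 10_000],
    fn city -> MyApp.Http.get("/forecast/" <> city) end
  )
  # The stream is lazy; nothing runs until it is consumed.
  |> Enum.to_list()
```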

reset(service)

@spec reset(service()) :: :ok | {:error, :not_found}

Resets the circuit breaker for the given service.

After reset, the breaker will be closed with no recorded failures.
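A small sketch of handling both return values from the spec. The service name is hypothetical, and treating {:error, :not_found} as "service was never started" is an assumption.

```elixir
case ExternalService.reset(:weather_api) do
  # The breaker is closed again with no recorded failures.
  :ok -> :ok
  # Assumption: no breaker is registered under this name.
  {:error, :not_found} -> {:error, :service_not_started}
end
```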

start(service, options \\ [])

@spec start(service(), options()) :: :ok

Initializes the circuit breaker (and optional rate limiting and default retry options) for a specific service.

The service is a term that uniquely identifies an external service within the scope of an application.

Options

  • :circuit_breaker (keyword/0) - Circuit breaker configuration. The default value is [].

    • :tolerate (pos_integer/0) - Number of failures tolerated within the :within window before the breaker opens. The default value is 10.

    • :within (pos_integer/0) - Length of the failure-counting window, in milliseconds. The default value is 10000.

    • :reset (pos_integer/0) - Milliseconds to wait before the breaker resets (closes) after it has opened. The default value is 60000.

    • :fault_injection (float/0) - If set to a rate between 0.0 and 1.0, randomly fails that fraction of calls. Intended for testing how dependents behave when this service is degraded.

  • :rate_limit (keyword/0) - Optional rate-limiting configuration. Omit for no rate limiting.

    • :limit (pos_integer/0) - Required. Maximum number of calls allowed within each :per window.

    • :per (pos_integer/0) - Required. Length of the rate-limiting window, in milliseconds.

  • :retry - Default retry options for the service, used by call/2. See ExternalService.RetryOptions for the available keys. The default value is [].

  • :sleep_function (function of arity 1) - Overrides the function used to sleep while rate limited (defaults to Process.sleep/1).
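Putting the options above together, a start call might look like the following sketch. The service name and the specific numbers are illustrative only; :max_attempts is borrowed from the __using__/1 example.

```elixir
:ok =
  ExternalService.start(:weather_api,
    # Open the breaker after 5 failures within 10 seconds; reset after 30 seconds.
    circuit_breaker: [tolerate: 5, within: :timer.seconds(10), reset: :timer.seconds(30)],
    # Allow at most 20 calls per second.
    rate_limit: [limit: 20, per: :timer.seconds(1)],
    # Defaults used by call/2; see ExternalService.RetryOptions for the keys.
    retry: [max_attempts: 3]
  )
```

Omitted keys fall back to the documented defaults, so calling start with only a service name yields a breaker that tolerates 10 failures within 10000 ms and resets after 60000 ms.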

stop(service)

@spec stop(service()) :: :ok

Stops the fuse for a specific service.

Stopping is idempotent: it is safe to call on a service that was never started or has already been stopped.
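Because stopping is idempotent, cleanup code can call it unconditionally, as in this brief sketch with a hypothetical service name:

```elixir
:ok = ExternalService.start(:weather_api)

# Both calls return :ok, whether or not the service is still running.
:ok = ExternalService.stop(:weather_api)
:ok = ExternalService.stop(:weather_api)
```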