Many external services impose a quota: no more than N requests per time window. Exceed it and you get throttled (or billed, or blocked). ExternalService can keep you under that quota automatically, across your entire application, using the ex_rated library.

Rate limiting is opt-in: omit the :rate_limit option and no limiting is applied.

Configuration

Add a :rate_limit option with a :limit and a :per window (in milliseconds):

use ExternalService,
  rate_limit: [
    limit: 100,                # at most 100 calls...
    per: :timer.seconds(1)     # ...per 1-second window
  ]

Option   Required  Meaning
:limit   yes       Maximum number of calls allowed within each :per window.
:per     yes       Length of the rate-limiting window, in milliseconds.

Both keys are required when :rate_limit is present.
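
Because :per is a plain number of milliseconds, the helpers in Erlang's :timer module are a convenient way to spell out longer windows. The quota below (600 requests per minute) is a made-up figure used only for illustration:

```elixir
# :timer.seconds/1 and :timer.minutes/1 convert to milliseconds.
per_second = :timer.seconds(1)
per_minute = :timer.minutes(1)

# Hypothetical quota: at most 600 requests per minute.
rate_limit = [limit: 600, per: per_minute]

IO.inspect(rate_limit, label: "rate_limit")
```

The resulting keyword list has the same shape as the value given to :rate_limit in the example above.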

How it works

The limit is tracked per service and shared across every caller in your application — every process that calls the service draws from the same bucket. So the example above guarantees no more than 100 calls per second in total, no matter how many processes are making them.

When a call would exceed the limit, ExternalService does not fail it. Instead it sleeps until the window has room, then proceeds. From your code's point of view the call simply takes a little longer; it still succeeds.

# This will never make more than 100 calls/second, even in a tight loop —
# excess calls sleep until the window allows them.
Enum.each(1..10_000, fn i ->
  MyApp.Api.fetch(i)
end)
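
The sleep-until-there-is-room behaviour described above can be pictured as a small retry loop. The sketch below is not the library's implementation: ThrottleSketch, the check function, and its :ok / {:wait, ms} return values are all hypothetical stand-ins for whatever bucket check ExternalService performs through ex_rated.

```elixir
defmodule ThrottleSketch do
  @moduledoc "Illustrative only: a sleep-until-there-is-room loop."

  # `check` is a hypothetical zero-arity function that returns :ok when the
  # call fits in the current window, or {:wait, ms} when the window is full.
  # `sleep` is a one-arity function taking milliseconds.
  def run(check, fun, sleep \\ &Process.sleep/1) do
    case check.() do
      :ok ->
        # There is room in the window: run the call and return its result.
        fun.()

      {:wait, ms} ->
        # The window is full: sleep, then ask the bucket again.
        sleep.(ms)
        run(check, fun, sleep)
    end
  end
end
```

Note that the loop has no exit other than the check eventually returning :ok, and that the sleep happens in the process that called run/3.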

Who sleeps?

The sleeping happens in whichever process is making the call:

  • With call/1 (synchronous), the calling process sleeps. Your code blocks until the call is allowed.
  • With call_async/1 and call_async_stream/2, the background task(s) sleep, not your calling process. This is often what you want for bulk work: kick off the stream and let the workers pace themselves.

# Bulk import that respects the rate limit without blocking the caller:
ids
|> MyApp.Api.call_async_stream(fn id -> MyApp.Api.fetch(id) end)
|> Enum.to_list()
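
To make the "who sleeps" distinction concrete, here is a sketch that uses only the standard library. Task.async_stream/3 stands in for call_async_stream/2, and the throttled function is a hypothetical stand-in for a rate-limited call; nothing here is claimed to mirror how ExternalService runs its tasks.

```elixir
caller = self()

# Stand-in for a throttled call: it sleeps, then reports which process slept.
throttled = fn id ->
  Process.sleep(10)
  {id, self()}
end

results =
  1..4
  |> Task.async_stream(throttled, max_concurrency: 2)
  |> Enum.map(fn {:ok, result} -> result end)

sleepers = Enum.map(results, fn {_id, pid} -> pid end)

# The sleeping was done by worker processes, never by the caller.
IO.inspect(caller in sleepers, label: "caller slept?")
```

The caller still waits when it consumes the stream (here in Enum.map/2), but the individual sleeps are spread across the worker processes.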

Throttling has no timeout

A throttled call sleeps and retries the bucket until there is room; there is no upper bound on how long it waits. If callers produce work faster than the limit allows, the backlog grows and individual calls can block for a long time. Rate limiting paces calls; it does not shed load. On latency- or demand-sensitive paths, watch the [:external_service, :rate_limit, :sleep] telemetry event (see "Observing throttling" below) and, if needed, run the work through call_async_stream/2 (so a pool of tasks absorbs the wait) or apply your own back-pressure upstream.
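
If a particular path needs a bounded wait, one option is a caller-side deadline built from the standard Task functions. This is a different technique from anything described in this guide: DeadlineSketch and with_deadline/2 are hypothetical names, and the sketch abandons the call once the deadline passes instead of letting it wait.

```elixir
defmodule DeadlineSketch do
  # Runs `fun` in a task and gives up after `timeout_ms` milliseconds.
  def with_deadline(fun, timeout_ms) do
    task = Task.async(fun)

    # Task.yield/2 returns nil on timeout; Task.shutdown/1 then stops the
    # task and picks up a reply that may have arrived in the meantime.
    case Task.yield(task, timeout_ms) || Task.shutdown(task) do
      {:ok, result} -> {:ok, result}
      {:exit, reason} -> {:error, {:exit, reason}}
      nil -> {:error, :timeout}
    end
  end
end
```

A call such as DeadlineSketch.with_deadline(fn -> MyApp.Api.fetch(id) end, 2_000) would then return {:error, :timeout} when throttling (or anything else) keeps the call from finishing within two seconds.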

Customizing the sleep

By default sleeping uses Process.sleep/1. In tests — where you don't want real delays — you can override it with :sleep_function:

use ExternalService,
  rate_limit: [limit: 100, per: :timer.seconds(1)],
  sleep_function: fn _ms -> :ok end

The function receives the number of milliseconds the library would otherwise sleep. This is also where you'd hook in deterministic test control or custom instrumentation.
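
For deterministic tests, a sleep function can record the requested durations instead of waiting. The module below is a sketch: RecordingSleep is a hypothetical name, and whether :sleep_function accepts a function capture such as &RecordingSleep.sleep/1 is an assumption to verify against the library.

```elixir
defmodule RecordingSleep do
  use Agent

  # Starts a named Agent that holds the recorded durations (newest first).
  def start_link(_opts \\ []) do
    Agent.start_link(fn -> [] end, name: __MODULE__)
  end

  # Same shape as Process.sleep/1: takes milliseconds, returns :ok,
  # but records the value instead of pausing.
  def sleep(ms) do
    Agent.update(__MODULE__, fn recorded -> [ms | recorded] end)
  end

  # Returns the recorded durations in the order they were requested.
  def recorded do
    Agent.get(__MODULE__, &Enum.reverse/1)
  end
end
```

A test can start the Agent, exercise the throttled code, and then assert on RecordingSleep.recorded() to check how often and for how long the library asked to sleep.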

Observing throttling

Every time a call is throttled and put to sleep, an [:external_service, :rate_limit, :sleep] telemetry event is emitted, with the sleep duration in its measurements. Attach a handler to track how often (and how long) you are being rate limited — a useful signal that you may need a higher quota or fewer calls. See the Telemetry guide.
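
A handler for this event might look like the sketch below. It assumes the :telemetry package is available and uses :telemetry.attach/4; ThrottleLogger and the handler id are hypothetical names. Since this section does not name the measurement key that carries the sleep duration, the handler logs the whole measurements map instead of guessing at a key.

```elixir
defmodule ThrottleLogger do
  require Logger

  @event [:external_service, :rate_limit, :sleep]

  # Call once, for example at application start.
  # Requires the :telemetry package to be loaded.
  def attach do
    :telemetry.attach("throttle-logger", @event, &__MODULE__.handle_event/4, nil)
  end

  # Telemetry handlers receive the event name, measurements, metadata,
  # and the config given to attach/4.
  def handle_event(@event, measurements, metadata, _config) do
    Logger.warning(format(measurements, metadata))
  end

  # Builds the log line; kept separate so it can be checked on its own.
  def format(measurements, metadata) do
    "rate limited: measurements=#{inspect(measurements)} metadata=#{inspect(metadata)}"
  end
end
```

Counting these log lines, or feeding the same data into a metrics library, gives the "how often and how long" signal described above.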

Rate limiting and the circuit breaker

Rate-limit sleeps are independent of the circuit breaker: being throttled is not a failure and does not melt the breaker. A throttled call waits and then runs normally, succeeding or failing on its own merits.