Otel.OTLP.HTTP.Retry (otel v0.2.0)

Retry wrapper around :httpc.request/4 for OTLP/HTTP exporters.

Spec protocol/exporter.md §Retry L181-L183:

"Transient errors MUST be handled with a retry strategy. This retry strategy MUST implement an exponential back-off with jitter to avoid overwhelming the destination until the network is restored or the destination has recovered."

Linked from opentelemetry-proto/docs/specification.md §"OTLP/HTTP Throttling" L590-L600, §"All Other Responses" L605-L611, and §"OTLP/HTTP Connection" L615-L621.

Retryable conditions

Per opentelemetry-proto/docs/specification.md §"Retryable Response Codes" L565-L573:

| Status | Retry? |
| --- | --- |
| 200-299 | success, no retry |
| 429 Too Many Requests | retry, honor Retry-After |
| 502 Bad Gateway | retry |
| 503 Service Unavailable | retry, honor Retry-After |
| 504 Gateway Timeout | retry |
| other 4xx/5xx | non-retryable, fail |
| connection errors | retry |
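
The table can be read as a small classification function. The following is a minimal sketch, not the module's actual implementation: `RetryClassifierSketch`, `classify/1`, and the outcome atoms are hypothetical names, and treating every status outside the table (for example 1xx or 3xx) as a failure is an assumption.

```elixir
defmodule RetryClassifierSketch do
  # Hypothetical helper, not part of Otel.OTLP.HTTP.Retry's public API.
  @type outcome :: :ok | :retry | :retry_honor_retry_after | :fail

  @spec classify({:ok, 100..599} | {:error, term()}) :: outcome()
  # 2xx: success, no retry.
  def classify({:ok, status}) when status in 200..299, do: :ok
  # 429 and 503: retry, and honor Retry-After when the header is present.
  def classify({:ok, status}) when status in [429, 503], do: :retry_honor_retry_after
  # 502 and 504: retry with the computed backoff.
  def classify({:ok, status}) when status in [502, 504], do: :retry
  # Any other status is treated as non-retryable (assumption for codes
  # that the table does not list explicitly).
  def classify({:ok, _other_status}), do: :fail
  # Connection errors are transient: retry.
  def classify({:error, _connection_error}), do: :retry
end
```

Keeping the classification separate from the sleep-and-retry loop makes each row of the table checkable on its own.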

When the server returns a retryable response with a Retry-After header (RFC 7231 §7.1.3), the retry honors the delta-seconds value rather than computing its own backoff (spec L590-L596 SHOULD).

Only the delta-seconds form is parsed. RFC 7231 also permits an HTTP-date form ("Fri, 31 Dec 1999 23:59:59 GMT"); when the server uses that form, the parser returns nil and the code falls back to the computed exponential backoff for that retry. The spec clause is SHOULD, so falling back is conformant; OTLP servers emit delta-seconds in practice.
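
As an illustration of the delta-seconds-only behaviour described above, here is a minimal sketch. `RetryAfterSketch` and `parse_delta_seconds/1` are hypothetical names, and the sketch assumes the header value has already been converted to a binary (`:httpc` returns headers as charlists).

```elixir
defmodule RetryAfterSketch do
  # Hypothetical helper. Returns the delay in milliseconds, or nil when the
  # value is not in delta-seconds form (for example, an HTTP-date), so that
  # the caller can fall back to the computed exponential backoff.
  @spec parse_delta_seconds(String.t()) :: non_neg_integer() | nil
  def parse_delta_seconds(value) when is_binary(value) do
    trimmed = String.trim(value)

    # delta-seconds is one or more decimal digits, nothing else.
    if Regex.match?(~r/\A\d+\z/, trimmed) do
      String.to_integer(trimmed) * 1_000
    else
      nil
    end
  end
end
```

For example, `RetryAfterSketch.parse_delta_seconds("120")` yields `120_000`, while the HTTP-date `"Fri, 31 Dec 1999 23:59:59 GMT"` yields `nil`.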

Backoff formula

  • delay(n) = min(initial * multiplier^n, max_backoff), where n is the zero-based retry index, so delay(0) = initial
  • jitter: each delay is multiplied by a random factor (1 + jitter_ratio * U(-1, 1))
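
The two rules above can be sketched as follows. This is an illustration rather than the module's own code: `BackoffSketch` and its function names are hypothetical, `n` is assumed to be the zero-based retry index, and rounding the result to whole milliseconds is an assumption.

```elixir
defmodule BackoffSketch do
  # delay(n) = min(initial * multiplier^n, max_backoff), before jitter.
  @spec base_delay(non_neg_integer(), pos_integer(), float(), pos_integer()) :: number()
  def base_delay(n, initial_ms, multiplier, max_ms) do
    min(initial_ms * :math.pow(multiplier, n), max_ms)
  end

  # Multiplies the base delay by (1 + jitter_ratio * u), with u drawn
  # uniformly from [-1, 1). Rounding to an integer is an assumption.
  @spec jittered_delay(non_neg_integer(), pos_integer(), float(), pos_integer(), float()) ::
          non_neg_integer()
  def jittered_delay(n, initial_ms, multiplier, max_ms, jitter_ratio) do
    # :rand.uniform/0 returns a float in [0.0, 1.0); rescale it to [-1, 1).
    u = :rand.uniform() * 2 - 1
    round(base_delay(n, initial_ms, multiplier, max_ms) * (1 + jitter_ratio * u))
  end
end
```

With a jitter ratio of 0.2 and a base delay of 1000 ms, the jittered delay in this sketch lands between 800 and 1200 ms.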

Defaults are chosen to match the Java OTLP SDK's published defaults (no spec mandate on values):

| Option | Default | Description |
| --- | --- | --- |
| :max_attempts | 5 | total attempts including the first |
| :initial_backoff_ms | 1_000 | first retry delay before jitter |
| :max_backoff_ms | 5_000 | upper bound on per-attempt delay |
| :multiplier | 1.5 | exponential growth factor |
| :jitter_ratio | 0.2 | ±20% randomization on each delay |
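
To make the defaults concrete, the sketch below computes the pre-jitter delay before each retry. It assumes the retry index starts at 0 and that five total attempts leave four waits between them; `DefaultScheduleSketch` is a hypothetical name.

```elixir
defmodule DefaultScheduleSketch do
  # Default values copied from the table above.
  @defaults %{
    max_attempts: 5,
    initial_backoff_ms: 1_000,
    max_backoff_ms: 5_000,
    multiplier: 1.5
  }

  # Returns the delay (in ms, before jitter) preceding each retry.
  # Number of retries = total attempts minus the first attempt.
  def schedule do
    for n <- 0..(@defaults.max_attempts - 2) do
      min(
        @defaults.initial_backoff_ms * :math.pow(@defaults.multiplier, n),
        @defaults.max_backoff_ms
      )
    end
  end
end

# DefaultScheduleSketch.schedule() returns [1000.0, 1500.0, 2250.0, 3375.0]
```

Under these assumptions the 5_000 ms cap is never reached with the defaults; it would first apply at a fifth retry, where 1000 * 1.5^4 = 5062.5.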

Return shape

request/4 returns :ok if any attempt receives a 2xx response. After exhausting retries on transient errors, or on the first non-retryable response, it returns {:error, reason}. The caller (an OTLP exporter) is then responsible for satisfying its own behaviour contract (SpanExporter.export/3, etc.); exporters typically map {:error, _} to :error on the SDK behaviour.
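
On the exporter side, that mapping might look like the sketch below. `ExporterResultSketch` and `to_export_result/1` are hypothetical, and logging the reason before discarding it is a design suggestion, not something this module prescribes.

```elixir
defmodule ExporterResultSketch do
  require Logger

  # Hypothetical exporter-side helper: collapses the retry wrapper's result
  # into the :ok | :error shape that exporters typically return.
  @spec to_export_result(:ok | {:error, term()}) :: :ok | :error
  def to_export_result(:ok), do: :ok

  def to_export_result({:error, reason}) do
    # The reason is lost once it is mapped to :error, so record it first.
    Logger.warning("OTLP export failed: #{inspect(reason)}")
    :error
  end
end
```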

Summary

Functions

request(request_args, http_options, request_options, retry_opts \\ %{})

Sends an OTLP/HTTP POST with retry on transient errors.

Types

request_args()

@type request_args() ::
  {url :: charlist(), headers :: [{charlist(), charlist()}],
   content_type :: charlist(), body :: binary()}

retry_opts()

@type retry_opts() :: %{
  optional(:max_attempts) => pos_integer(),
  optional(:initial_backoff_ms) => pos_integer(),
  optional(:max_backoff_ms) => pos_integer(),
  optional(:multiplier) => float(),
  optional(:jitter_ratio) => float()
}

Functions

request(request_args, http_options, request_options, retry_opts \\ %{})

@spec request(
  request_args :: request_args(),
  http_options :: keyword(),
  request_options :: keyword(),
  retry_opts :: retry_opts()
) :: :ok | {:error, term()}

Sends an OTLP/HTTP POST with retry on transient errors.

request_args is the request tuple ({url, headers, content_type, body}) that :httpc.request/4 takes as its second argument. http_options and request_options are passed through as the third and fourth :httpc.request/4 arguments.

Retries are governed by retry_opts; defaults are applied for any keys not provided.
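
A call might look like the following sketch. The endpoint URL, content type, timeout, and option values are illustrative assumptions, and `RetryUsageSketch` with its two functions is a hypothetical wrapper; only `Otel.OTLP.HTTP.Retry.request/4` and its argument shapes come from the documentation above.

```elixir
defmodule RetryUsageSketch do
  # Builds the request_args tuple. :httpc expects charlists for the URL,
  # header names and values, and the content type; the body stays a binary.
  def build_args(body) when is_binary(body) do
    {~c"http://localhost:4318/v1/traces", [], ~c"application/x-protobuf", body}
  end

  # Sends an already-encoded payload, overriding two retry options.
  # Keys left out of the map fall back to the documented defaults.
  def export_bytes(body) do
    Otel.OTLP.HTTP.Retry.request(
      build_args(body),
      # http_options: third :httpc.request/4 argument (timeout in ms).
      [timeout: 10_000],
      # request_options: fourth :httpc.request/4 argument.
      [],
      %{max_attempts: 3, initial_backoff_ms: 500}
    )
  end
end
```

`export_bytes/1` returns whatever `request/4` returns, that is `:ok` or `{:error, reason}`.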