Monitorex.UrlNormalizer (monitorex v0.3.0)

Normalizes URLs to prevent a high-cardinality explosion of distinct URLs in the outbound dashboard.

Dynamic path segments such as UUIDs, numeric IDs, hex strings, and long tokens are replaced with generic placeholders (:uuid, :id, :hex_id, :token). Similar endpoints are then grouped together in the dashboard instead of creating an unbounded number of distinct entries.

Supports:

  • Heuristic normalization of common dynamic segment patterns
  • User-defined regex → template patterns via application config
  • Cardinality cap per host (:max_endpoints_per_host, default 200)
  • Idempotent normalization (applying it twice produces the same result)
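The features listed above can be illustrated with a small standalone sketch. This is not Monitorex's implementation: the module name UrlNormalizerSketch, the length thresholds (8+ characters for hex, 20+ for tokens), the order of the checks, and the shape of the custom pattern list ({regex, template} pairs matched against the whole path) are all assumptions made for illustration. The sketch shows one way heuristic segment replacement, user-supplied patterns, and idempotence can fit together.

```elixir
defmodule UrlNormalizerSketch do
  # Hypothetical illustration only; not Monitorex's actual code.

  # `custom` is an assumed list of {regex, template} pairs that are tried
  # against the whole path before the per-segment heuristics.
  def normalize(url, custom \\ [])
  def normalize(nil, _custom), do: nil
  def normalize("", _custom), do: ""

  def normalize(url, custom) when is_binary(url) do
    uri = URI.parse(url)
    path = uri.path || ""

    new_path =
      case Enum.find(custom, fn {regex, _template} -> Regex.match?(regex, path) end) do
        {_regex, template} -> template
        nil -> path |> String.split("/") |> Enum.map_join("/", &segment/1)
      end

    URI.to_string(%{uri | path: new_path})
  end

  # Classifies one path segment. Placeholders such as ":id" contain a colon,
  # which none of the patterns accept, so a second pass leaves them unchanged.
  defp segment(seg) do
    cond do
      Regex.match?(~r/\A[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\z/i, seg) ->
        ":uuid"

      Regex.match?(~r/\A\d+\z/, seg) ->
        ":id"

      # Assumed threshold: 8 or more hex characters.
      Regex.match?(~r/\A[0-9a-f]{8,}\z/i, seg) ->
        ":hex_id"

      # Assumed threshold: 20 or more URL-safe characters with at least one digit.
      String.length(seg) >= 20 and Regex.match?(~r/\A[A-Za-z0-9_-]+\z/, seg) and
          Regex.match?(~r/\d/, seg) ->
        ":token"

      true ->
        seg
    end
  end
end

once = UrlNormalizerSketch.normalize("https://api.example.com/users/123/orders/550e8400-e29b-41d4-a716-446655440000")
# once == "https://api.example.com/users/:id/orders/:uuid"

# Idempotence: normalizing the already-normalized URL changes nothing.
twice = UrlNormalizerSketch.normalize(once)
# twice == once

# A custom pattern takes precedence over the heuristics.
custom = [{~r{\A/reports/\d{4}-\d{2}\z}, "/reports/:period"}]
UrlNormalizerSketch.normalize("https://api.example.com/reports/2024-05", custom)
# => "https://api.example.com/reports/:period"
```

Checking the UUID pattern before the numeric and hex patterns matters in this sketch, because a UUID's individual groups would otherwise never be seen as a unit. How Monitorex orders its own checks is not stated here.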

Summary

Functions

normalize(url)

Normalizes a URL by replacing dynamic path segments with placeholders.

normalize(url, tracked_paths)

Normalizes a URL and applies the cardinality cap per host.

Functions

normalize(url)

@spec normalize(url :: String.t() | nil) :: String.t() | nil

Normalizes a URL by replacing dynamic path segments with placeholders.

Returns nil when given nil, and an empty string when given an empty string.

Examples

iex> Monitorex.UrlNormalizer.normalize("https://api.example.com/users/12345")
"https://api.example.com/users/:id"

iex> Monitorex.UrlNormalizer.normalize("https://api.example.com/orders/550e8400-e29b-41d4-a716-446655440000")
"https://api.example.com/orders/:uuid"

iex> Monitorex.UrlNormalizer.normalize("https://api.example.com/v2/abc123def456")
"https://api.example.com/v2/:hex_id"

iex> Monitorex.UrlNormalizer.normalize("https://api.example.com/assets/a1b2c3d4e5f6_token_value")
"https://api.example.com/assets/:token"

iex> Monitorex.UrlNormalizer.normalize("https://api.example.com/users/123/orders/550e8400-e29b-41d4-a716-446655440000")
"https://api.example.com/users/:id/orders/:uuid"

iex> Monitorex.UrlNormalizer.normalize(nil)
nil

iex> Monitorex.UrlNormalizer.normalize("")
""

normalize(url, tracked_paths)

@spec normalize(url :: String.t(), tracked_paths :: MapSet.t()) :: String.t()

Normalizes a URL and applies the cardinality cap per host.

tracked_paths is a MapSet of {host, normalized_path} tuples. Once a host has :max_endpoints_per_host (default 200) distinct normalized paths tracked, the result is bucketed to /:other.

Examples

iex> tracked = MapSet.new([{"api.example.com", "/users/:id"}])
iex> Monitorex.UrlNormalizer.normalize("https://api.example.com/users/999", tracked)
"https://api.example.com/users/:id"
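The capping rule can be sketched in isolation. The module CardinalityCapSketch and its cap/4 function are hypothetical, and they work on an already-normalized host and path rather than a full URL to keep the example small. One point is an assumption rather than documented behavior: a path that is already tracked passes through unchanged even when the host is at the cap, so only new paths fall into /:other.

```elixir
defmodule CardinalityCapSketch do
  # Hypothetical illustration only; not Monitorex's actual code.

  # Returns the path to report for `host`, given the set of
  # {host, normalized_path} tuples tracked so far and the per-host cap.
  def cap(host, path, tracked, max \\ 200) do
    host_count = Enum.count(tracked, fn {h, _path} -> h == host end)

    cond do
      # Assumption: known paths are never re-bucketed.
      MapSet.member?(tracked, {host, path}) -> path
      # The host still has room for a new distinct path.
      host_count < max -> path
      # The host is at the cap, so the new path is bucketed.
      true -> "/:other"
    end
  end
end

tracked =
  MapSet.new([
    {"api.example.com", "/users/:id"},
    {"api.example.com", "/orders/:uuid"}
  ])

# With a cap of 2, api.example.com is already full.
CardinalityCapSketch.cap("api.example.com", "/users/:id", tracked, 2)
# => "/users/:id"
CardinalityCapSketch.cap("api.example.com", "/carts/:id", tracked, 2)
# => "/:other"

# The cap is counted per host, so another host is unaffected.
CardinalityCapSketch.cap("other.example.com", "/carts/:id", tracked, 2)
# => "/carts/:id"
```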