ExLLM.Infrastructure.CircuitBreaker.Metrics (ex_llm v0.8.1)

Comprehensive metrics integration for circuit breakers.

Provides metrics collection and export for Prometheus and StatsD, enabling detailed monitoring and observability of circuit breaker performance.

Supported Metrics Backends

  • Prometheus: Via :prometheus_ex and :prometheus_plugs
  • StatsD: Via a :statsd client
  • Custom: Pluggable metrics backend interface

Metrics Collected

Circuit Breaker Metrics

  • State Duration: Time spent in each state (closed/open/half_open)
  • Request Counts: Total requests, successes, failures by circuit
  • Failure Rates: Percentage of failed requests over time windows
  • Recovery Times: Time a circuit spends in the open state before it recovers
  • Transition Counts: State change frequencies
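As a rough illustration of the failure-rate metric listed above, the sketch below computes the percentage of failed requests inside a sliding time window. The module name `FailureRateSketch`, the `{timestamp_ms, result}` event shape, and the 60-second default window are assumptions made for this example; ex_llm may compute the rate differently.

```elixir
defmodule FailureRateSketch do
  # Hypothetical helper, not part of ex_llm.
  # Each event is a tuple {timestamp_ms, :success | :failure}.
  def failure_rate(events, now_ms, window_ms \\ 60_000) do
    # Keep only the events that fall inside the window ending at now_ms.
    recent = Enum.filter(events, fn {ts, _result} -> now_ms - ts <= window_ms end)

    case length(recent) do
      0 ->
        # No traffic in the window: report 0.0 rather than dividing by zero.
        0.0

      total ->
        failures = Enum.count(recent, fn {_ts, result} -> result == :failure end)
        failures / total * 100
    end
  end
end
```

Returning `0.0` for an empty window is a design choice in this sketch; a real implementation might instead report "no data" so that idle circuits are not mistaken for healthy ones.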

Bulkhead Metrics

  • Concurrency: Active request counts and utilization
  • Queue Metrics: Queue length, wait times, timeouts
  • Throughput: Requests per second, completion rates
  • Rejection Rates: Percentage of rejected requests

Performance Metrics

  • Response Times: Request duration histograms
  • Error Rates: Error classification and frequencies
  • Circuit Health: Overall health scores and status

Configuration

config :ex_llm, :circuit_breaker_metrics,
  enabled: true,
  backends: [:prometheus, :statsd],
  prometheus: [
    registry: :default,
    namespace: "ex_llm_circuit_breaker"
  ],
  statsd: [
    host: "localhost",
    port: 8125,
    namespace: "ex_llm.circuit_breaker"
  ]
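To inspect these settings at runtime, the standard `Application.get_env/3` and `Keyword.get/3` calls can be used. The sketch below is an assumption about how the values could be read, not a description of how ex_llm itself consumes them; the `Application.put_env/3` call only stands in for the config file so the snippet runs on its own.

```elixir
# Stand-in for the config file entry shown above (illustration only).
Application.put_env(:ex_llm, :circuit_breaker_metrics, [
  enabled: true,
  backends: [:prometheus, :statsd],
  statsd: [host: "localhost", port: 8125, namespace: "ex_llm.circuit_breaker"]
])

# Read the whole keyword list, falling back to an empty list when unset.
config = Application.get_env(:ex_llm, :circuit_breaker_metrics, [])

enabled? = Keyword.get(config, :enabled, false)
backends = Keyword.get(config, :backends, [])

# Nested keyword lists support the Access protocol, so get_in/2 works here.
statsd_port = get_in(config, [:statsd, :port])

IO.inspect({enabled?, backends, statsd_port})
```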

Usage

# Start metrics collection
ExLLM.Infrastructure.CircuitBreaker.Metrics.setup()

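# Manual metric recording
# record_request(circuit_name, result, duration_ms): a successful request that took 150 ms
ExLLM.Infrastructure.CircuitBreaker.Metrics.record_request("api_service", :success, 150)
# record_state_change(circuit_name, from_state, to_state): the circuit moved from closed to open
ExLLM.Infrastructure.CircuitBreaker.Metrics.record_state_change("api_service", :closed, :open)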

# Get current metrics
ExLLM.Infrastructure.CircuitBreaker.Metrics.get_metrics("api_service")
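Manual recording is often wrapped in a small helper that times a call and reports the outcome. The following is a minimal sketch under stated assumptions: `TimedCallSketch` is a hypothetical module, and the `:failure` result atom is a guess (only `:success` appears in the usage above). The recorder is passed as a function so the helper can be exercised without ex_llm loaded.

```elixir
defmodule TimedCallSketch do
  # Hypothetical helper, not part of ex_llm.
  # `recorder` is any 3-arity function (circuit_name, result, duration_ms).
  def run(circuit_name, fun, recorder) do
    started = System.monotonic_time(:millisecond)

    {status, result} =
      try do
        {:success, fun.()}
      rescue
        # :failure is an assumed result value; check the library for the real one.
        error -> {:failure, {:error, error}}
      end

    duration_ms = System.monotonic_time(:millisecond) - started
    recorder.(circuit_name, status, duration_ms)
    result
  end
end
```

With ex_llm available, the recorder could be `&ExLLM.Infrastructure.CircuitBreaker.Metrics.record_request/3`, which matches the arity documented below.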

Summary

Functions

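export_prometheus()

Export metrics in Prometheus format.

get_metrics(circuit_name)

Get current metrics for a circuit.

get_system_metrics()

Get system-wide metrics summary.

record_bulkhead_metrics(circuit_name, metrics)

Record bulkhead metrics.

record_health_metrics(circuit_name, health)

Record circuit health metrics.

record_request(circuit_name, result, duration_ms)

Record a circuit breaker request with timing and result.

record_state_change(circuit_name, from_state, to_state)

Record a circuit breaker state change.

record_state_duration(circuit_name, state, duration_ms)

Record circuit state duration.

setup()

Initialize metrics collection system.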

Functions

export_prometheus()

Export metrics in Prometheus format.
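For orientation, the Prometheus text exposition format consists of `# HELP` and `# TYPE` comment lines followed by one sample line per label set. The sketch below renders a counter in that format. It is an illustration only: `PrometheusTextSketch`, the metric name, and the label names are assumptions, and the actual output of `export_prometheus()` may differ.

```elixir
defmodule PrometheusTextSketch do
  # Hypothetical formatter, not part of ex_llm.
  # `samples` is a list of {labels_keyword_list, value} tuples.
  # Label values are not escaped here; real exporters must escape \, " and newlines.
  def counter(name, help, samples) do
    header = ["# HELP #{name} #{help}", "# TYPE #{name} counter"]

    lines =
      for {labels, value} <- samples do
        label_text = Enum.map_join(labels, ",", fn {key, val} -> ~s(#{key}="#{val}") end)
        "#{name}{#{label_text}} #{value}"
      end

    Enum.join(header ++ lines, "\n") <> "\n"
  end
end

IO.puts(
  PrometheusTextSketch.counter(
    "ex_llm_circuit_breaker_requests_total",
    "Total requests",
    [{[circuit: "api_service", result: "success"], 3}]
  )
)
```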

get_metrics(circuit_name)

Get current metrics for a circuit.

get_system_metrics()

Get system-wide metrics summary.

record_bulkhead_metrics(circuit_name, metrics)

Record bulkhead metrics.

record_health_metrics(circuit_name, health)

Record circuit health metrics.

record_request(circuit_name, result, duration_ms)

Record a circuit breaker request with timing and result.

record_state_change(circuit_name, from_state, to_state)

Record a circuit breaker state change.

record_state_duration(circuit_name, state, duration_ms)

Record circuit state duration.
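State durations are naturally derived at transition time: when a circuit leaves a state, the time spent in it is the difference between now and the moment the state was entered. The sketch below shows one way to pair that calculation with a state-change record. `StateTimerSketch` and its fields are hypothetical, and the two callbacks stand in for the recording functions so the example runs without ex_llm.

```elixir
defmodule StateTimerSketch do
  # Hypothetical tracker, not part of ex_llm.
  # Remembers the current state and the time it was entered.
  defstruct [:circuit_name, :state, :entered_at_ms]

  def new(circuit_name, state, now_ms) do
    %__MODULE__{circuit_name: circuit_name, state: state, entered_at_ms: now_ms}
  end

  # Reports the time spent in the old state, then the transition itself,
  # and returns the tracker updated to the new state.
  def transition(%__MODULE__{} = timer, to_state, now_ms, on_change, on_duration) do
    on_duration.(timer.circuit_name, timer.state, now_ms - timer.entered_at_ms)
    on_change.(timer.circuit_name, timer.state, to_state)
    %{timer | state: to_state, entered_at_ms: now_ms}
  end
end
```

With ex_llm loaded, `on_change` and `on_duration` could be `&ExLLM.Infrastructure.CircuitBreaker.Metrics.record_state_change/3` and `&ExLLM.Infrastructure.CircuitBreaker.Metrics.record_state_duration/3`, whose argument order matches the signatures documented here.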

setup()

Initialize metrics collection system.