Baton.LLMWorker (Baton v0.1.0)


Base module for workflow steps that call an LLM API.

Extends Baton.Worker with LLM-appropriate defaults:

  • Configurable timeout/1 — defaults to :infinity (Oban's own default), overridable per worker with the :timeout option (in milliseconds). LLM calls can run long, so most workers want either the default or a generous explicit cap to release a stuck connection's queue slot — for example use Baton.LLMWorker, timeout: :timer.minutes(5).
  • :llm queue — dedicated queue so LLM jobs don't compete with fast background jobs for concurrency slots
  • Jitter backoff — spreads retries across a window to avoid thundering herd against the LLM API rate limiter when multiple jobs fail simultaneously
  • Idempotency — inherited from Baton.Worker; a retried job that already stored a result returns it immediately without re-calling the API
  • Automatic stats recording — if your result map includes an "llm_usage" key, it is stripped out and written to workflow_step_stats automatically
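
The stats-recording step in the last bullet can be pictured as splitting the result map in two. This is a minimal sketch, not Baton's actual code: the module UsageSplit and the function split_usage/1 are hypothetical, and it assumes the usage key may arrive either as the atom :llm_usage or as the string "llm_usage".

```elixir
defmodule UsageSplit do
  # Hypothetical helper, not part of Baton: separates usage stats from the
  # payload that downstream steps would receive.
  # Assumption: the key may be an atom or a string.
  @usage_keys [:llm_usage, "llm_usage"]

  @spec split_usage(map()) :: {map(), map() | nil}
  def split_usage(result) when is_map(result) do
    # Take the first usage key that is present, or nil if there is none.
    usage = Enum.find_value(@usage_keys, fn key -> Map.get(result, key) end)

    # Drop both key variants so the payload never carries usage data.
    {Map.drop(result, @usage_keys), usage}
  end
end

{payload, usage} =
  UsageSplit.split_usage(%{text: "summary", llm_usage: %{input_tokens: 120}})

# payload is %{text: "summary"}; usage is %{input_tokens: 120}
```

In this sketch the payload half is what a downstream step would see, and the usage half is what would be written to workflow_step_stats.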

Usage

defmodule MyApp.Workers.Summarize do
  use Baton.LLMWorker

  @impl true
  def perform_workflow(%Oban.Job{args: args} = job) do
    start = System.monotonic_time(:millisecond)

    case MyApp.LLM.complete(build_prompt(args)) do
      {:ok, response} ->
        latency = System.monotonic_time(:millisecond) - start

        {:ok, %{
          # Your actual result — passed to downstream steps
          text: response.text,

          # Picked up automatically by LLMWorker, stripped from result,
          # written to workflow_step_stats. Never seen by downstream steps.
          llm_usage: %{
            model:               response.model,
            input_tokens:        response.usage.input_tokens,
            output_tokens:       response.usage.output_tokens,
            cache_read_tokens:   response.usage.cache_read_input_tokens,
            cache_write_tokens:  response.usage.cache_creation_input_tokens,
            latency_ms:          latency
          }
        }}

      {:error, %{status: 429}} ->
        {:snooze, 30}

      {:error, reason} ->
        {:error, reason}
    end
  end
end

Overriding defaults

All Oban.Worker options can still be overridden at the use site:

use Baton.LLMWorker,
  max_attempts: 5,
  timeout: :timer.minutes(10),
  priority: 1
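
One way to picture the override behaviour is a merge of the options passed to use over a set of defaults, with the caller's values winning. The sketch below is an assumption about the mechanism, not Baton's source: the module WorkerDefaults and the function resolve/1 are hypothetical, and it lists only the two defaults stated on this page (the :llm queue and the :infinity timeout).

```elixir
defmodule WorkerDefaults do
  # Hypothetical stand-in for the merge a `use` macro might perform.
  # Only the :queue and :timeout defaults are documented; anything else
  # Baton may set is unknown here.
  @defaults [queue: :llm, timeout: :infinity]

  @spec resolve(keyword()) :: keyword()
  def resolve(opts) when is_list(opts) do
    # Keyword.merge/2 keeps the value from the second list on conflict,
    # so caller-supplied options take precedence over the defaults.
    Keyword.merge(@defaults, opts)
  end
end

opts = WorkerDefaults.resolve(max_attempts: 5, timeout: :timer.minutes(10), priority: 1)

# opts[:queue] is still :llm; opts[:timeout] is now 600_000 (10 minutes in ms)
```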

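Queue config

The :llm queue must be declared in your Oban configuration, otherwise jobs enqueued to it are never picked up. Its limit caps how many LLM jobs run concurrently on each node: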

config :baton, Oban,
  queues: [default: 20, llm: 5]

Backoff behaviour

Exponential backoff with ±50% uniform jitter:

  • Attempt 1 failure → retry in ~15–45s
  • Attempt 2 failure → retry in ~23–83s
  • Attempt 3 failure → retry in ~42–162s
  • Attempt 4 failure → retry in ~79–319s
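
As a rough illustration of the technique (not Baton's implementation), exponential backoff with ±50% uniform jitter can be sketched as follows. The module name BackoffSketch, the 15-second base, and the doubling factor are assumptions chosen for illustration, so the windows it produces will not necessarily match the ranges listed above.

```elixir
defmodule BackoffSketch do
  # Illustrative only: the base delay and growth factor are assumptions,
  # not values taken from Baton.
  @base_seconds 15

  @spec jittered_backoff(pos_integer()) :: pos_integer()
  def jittered_backoff(attempt) when is_integer(attempt) and attempt > 0 do
    # Exponential midpoint: 30s, 60s, 120s, ... for attempts 1, 2, 3, ...
    midpoint = @base_seconds * Integer.pow(2, attempt)

    # :rand.uniform/0 returns a float in [0.0, 1.0), so the factor lies in
    # [0.5, 1.5): a uniform spread of ±50% around the midpoint.
    factor = 0.5 + :rand.uniform()

    # Round to whole seconds and never return less than one second.
    max(1, round(midpoint * factor))
  end
end

delay = BackoffSketch.jittered_backoff(1)
# delay is an integer number of seconds between 15 and 45
```

Oban passes the job to a worker's backoff/1 callback and expects a delay in seconds, so a function of this shape would typically be called with job.attempt.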

Functions

jittered_backoff(attempt)

@spec jittered_backoff(pos_integer()) :: pos_integer()

Exponential backoff with uniform jitter. Exposed as a public function so it can be tested directly.