Multi-Model Workflows


Baton.MultiModel helps when different steps should run on different LLMs, or when you want to run the same step across several models and combine the results. It builds on the normal workflow API; see the guide on building a workflow for the basics.

Per-step models

Hardcoding model strings inside each worker is awkward to change and test. Instead, declare a model map on the workflow with configure/2, and let each worker read its model from the job's args.

configure/2 records the map on the workflow; Baton.add/4 then injects the matching model into each step's args automatically as "workflow_model":

Baton.new(workflow_name: "analysis")
|> Baton.MultiModel.configure(%{
  parse:  "claude-sonnet-4-20250514",
  assess: "claude-opus-4-20250514",
  report: "claude-sonnet-4-20250514"
})
|> Baton.add(:parse,  ParseDoc.new(%{text: text}))
|> Baton.add(:assess, Assess.new(%{}),  deps: [:parse])
|> Baton.add(:report, Report.new(%{}),  deps: [:assess])
|> Baton.insert!()

In a worker, read the model with model_for/2, which falls back to a default when no model was configured for that step:

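def perform_workflow(%Oban.Job{} = job) do
  # Use the model injected by configure/2, or the default if this step has none.
  model = Baton.MultiModel.model_for(job, "claude-sonnet-4-20250514")

  # `messages` is the prompt for this step; building it from job.args is omitted here.
  Baton.Debug.call_llm(job, messages, model: model)
end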

Steps not in the map are untouched

Only steps whose name appears in the model map get workflow_model injected; every other step is added exactly as a plain Baton.add/4 call would add it.
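
As a mental model of the injection and fallback rules described above, here is a minimal sketch on plain maps. The module ModelInjectionSketch and both of its functions are hypothetical illustrations, not part of Baton, and they do not reflect how Baton implements configure/2 or model_for/2 internally.

```elixir
defmodule ModelInjectionSketch do
  # Hypothetical helper (not Baton code): add "workflow_model" to a step's args
  # only when the step name appears in the model map.
  def inject(model_map, step, args) do
    case Map.fetch(model_map, step) do
      {:ok, model} -> Map.put(args, "workflow_model", model)
      :error -> args
    end
  end

  # Hypothetical stand-in for the fallback described for model_for/2,
  # operating on a plain args map instead of an %Oban.Job{}.
  def model_or_default(args, default), do: Map.get(args, "workflow_model", default)
end

models = %{parse: "claude-sonnet-4-20250514", assess: "claude-opus-4-20250514"}

# :assess is in the map, so its args gain "workflow_model".
assess_args = ModelInjectionSketch.inject(models, :assess, %{})

# :cleanup is not in the map, so its args come back unchanged.
cleanup_args = ModelInjectionSketch.inject(models, :cleanup, %{"keep" => true})

IO.inspect(assess_args)
IO.inspect(cleanup_args)
IO.inspect(ModelInjectionSketch.model_or_default(cleanup_args, "claude-sonnet-4-20250514"))
```

The last call returns the default because no model was injected for :cleanup, which mirrors the fallback a worker gets for unconfigured steps.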

Fan-out across models, then synthesize

To run one analysis across several models in parallel and merge the outputs, use fan_out/4. It generates one step per model plus an optional synthesis step that depends on all of them:

Baton.new(workflow_name: "multi-model-quality")
|> Baton.add(:parse, ParseDoc.new(%{text: text}))
|> Baton.MultiModel.fan_out(:assess, Assess,
     models: [
       "claude-sonnet-4-20250514",
       "claude-opus-4-20250514",
       "gpt-4o"
     ],
     args: %{doc_id: "doc-123"},
     deps: [:parse],
     synthesize_with: SynthesizeAssessments,
     synthesize_model: "claude-opus-4-20250514"
   )
|> Baton.add(:report, Report.new(%{}), deps: [:assess_synthesis])
|> Baton.insert!()

This produces three fan-out steps, each depending on :parse, which feed the synthesis step and then :report:

:assess_sonnet_4  --+
:assess_opus_4    --+--> :assess_synthesis --> :report
:assess_gpt_4o    --+

  • Step names are derived from the model string: {base}_{short} (e.g. assess_sonnet_4), and the synthesis step is {base}_synthesis.
  • Each fan-out step runs Assess with its model injected and the shared args.
  • The synthesis step runs SynthesizeAssessments once all models have finished.
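
If you need to predict the generated step names, the sketch below shows one rule that reproduces the three names in this guide. It is an assumption inferred from those examples only: StepNameSketch is a hypothetical module, and Baton's actual shortening rule may differ for other model strings, so confirm names against the Baton.MultiModel documentation before relying on them.

```elixir
defmodule StepNameSketch do
  # Hypothetical rule, inferred from the examples in this guide only:
  # drop a leading "claude-", drop a trailing 8-digit date, and turn the
  # remaining separators into underscores.
  def step_name(base, model) do
    short =
      model
      |> String.replace(~r/^claude-/, "")
      |> String.replace(~r/-\d{8}$/, "")
      |> String.replace(~r/[^a-zA-Z0-9]+/, "_")

    String.to_atom("#{base}_#{short}")
  end
end

models = ["claude-sonnet-4-20250514", "claude-opus-4-20250514", "gpt-4o"]
names = Enum.map(models, &StepNameSketch.step_name(:assess, &1))

IO.inspect(names)
# prints [:assess_sonnet_4, :assess_opus_4, :assess_gpt_4o]
```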

Options

Option             Required  Purpose
:models            yes       model strings to fan out across
:args              no        base args passed to each model's worker (default: %{})
:deps              no        upstream dependencies shared by all fan-out steps
:synthesize_with   no        worker module for the synthesis step; omit to skip it
:synthesize_model  no        model for the synthesis step (default: first model)
:synthesize_args   no        extra args merged into the synthesis step

If you omit :synthesize_with, no synthesis step is added. Add your own downstream step that depends on the generated fan-out step names instead.
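
A sketch of that variant follows, using only the calls shown in this guide. CompareAssessments and MyApp.Workflows are hypothetical names, the worker body is a placeholder, and ParseDoc and Assess are the workers from the earlier examples. The step names in deps are the ones listed above for these three model strings. It needs the Baton dependency and those workers to run.

```elixir
defmodule CompareAssessments do
  # Hypothetical downstream worker; replace the body with your own comparison logic.
  use Baton.Worker, queue: :default

  @impl true
  def perform_workflow(%Oban.Job{} = _job) do
    {:ok, %{"compared" => true}}
  end
end

defmodule MyApp.Workflows do
  # Hypothetical wrapper so `text` is passed in rather than left unbound.
  def compare_models(text) do
    Baton.new(workflow_name: "multi-model-compare")
    |> Baton.add(:parse, ParseDoc.new(%{text: text}))
    # No :synthesize_with, so only the three fan-out steps are generated.
    |> Baton.MultiModel.fan_out(:assess, Assess,
      models: ["claude-sonnet-4-20250514", "claude-opus-4-20250514", "gpt-4o"],
      deps: [:parse]
    )
    # Depend on each generated fan-out step by name.
    |> Baton.add(:compare, CompareAssessments.new(%{}),
      deps: [:assess_sonnet_4, :assess_opus_4, :assess_gpt_4o]
    )
    |> Baton.insert!()
  end
end
```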

The synthesis worker

In the synthesis step, collect_fan_results/1 returns each model's result keyed by its model string, so you can compare or vote across them:

defmodule SynthesizeAssessments do
  use Baton.Worker, queue: :default

  @impl true
  def perform_workflow(%Oban.Job{} = job) do
    by_model = Baton.MultiModel.collect_fan_results(job)
    # => %{
    #   "claude-sonnet-4-20250514" => %{"score" => 7, ...},
    #   "claude-opus-4-20250514"   => %{"score" => 8, ...},
    #   "gpt-4o"                    => %{"score" => 6, ...}
    # }

    {:ok, %{"consensus" => MyApp.Vote.median(by_model)}}
  end
end
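
MyApp.Vote.median/1 above is application code rather than part of Baton. A minimal sketch of one possible implementation follows, assuming every model's result carries a numeric "score" key as in the example; a real voting rule would likely need to handle missing or malformed results.

```elixir
defmodule MyApp.Vote do
  # Hypothetical implementation: median of the "score" values across models.
  # Assumes a non-empty map whose values all contain a numeric "score".
  def median(by_model) when map_size(by_model) > 0 do
    scores =
      by_model
      |> Map.values()
      |> Enum.map(&Map.fetch!(&1, "score"))
      |> Enum.sort()

    count = length(scores)
    mid = div(count, 2)

    if rem(count, 2) == 1 do
      # Odd count: the middle element is the median.
      Enum.at(scores, mid)
    else
      # Even count: average the two middle elements (returns a float).
      (Enum.at(scores, mid - 1) + Enum.at(scores, mid)) / 2
    end
  end
end

by_model = %{
  "claude-sonnet-4-20250514" => %{"score" => 7},
  "claude-opus-4-20250514" => %{"score" => 8},
  "gpt-4o" => %{"score" => 6}
}

IO.inspect(MyApp.Vote.median(by_model))
# prints 7
```

A median is less sensitive to a single outlier model than a mean, which is one reason to prefer it when only a few models vote.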

Tracking cost

Fanning out multiplies your LLM spend, so cost visibility matters. If a worker's result map includes an "llm_usage" key, Baton.LLMWorker strips it from the result and records per-step token counts and cost to workflow_step_stats automatically. Query it with Baton.Stats:

Baton.Stats.workflow_totals(workflow_id)
# => %{input_tokens: ..., output_tokens: ..., cost_usd: #Decimal<...>, ...}

Baton.Stats.cost_by_model(from_dt, to_dt)

Cost is computed by the configured pricing module (config :baton, pricing: MyApp.Pricing); see Baton.Pricing. Provide your own module and keep its prices current; the built-in Baton.Pricing.Default is a starting point only.

See Baton.MultiModel for the full API.