Agentic.Strategy.Experiment (agentic v0.2.2)

Experiment runner for head-to-head strategy comparison.

Runs each prompt through every configured strategy for a configurable number of repetitions, then computes per-strategy comparison metrics.

Summary

Functions

compare(experiment)

Compare results across strategies, computing aggregate metrics.

run(experiment)

Run an experiment, collecting results for each (prompt, strategy, repetition) triple.

Types

comparison()

@type comparison() :: %{
  strategy: atom(),
  run_count: non_neg_integer(),
  success_count: non_neg_integer(),
  success_rate: float(),
  avg_duration_ms: float(),
  avg_cost: float(),
  avg_tokens: non_neg_integer(),
  avg_tool_calls: non_neg_integer()
}

result()

@type result() :: %{
  strategy: atom(),
  prompt: String.t(),
  repetition: pos_integer(),
  result: {:ok, map()} | {:error, term()},
  duration_ms: non_neg_integer()
}

t()

@type t() :: %Agentic.Strategy.Experiment{
  base_opts: keyword(),
  description: String.t() | nil,
  id: term(),
  name: String.t() | nil,
  prompts: [String.t()],
  repetitions: pos_integer(),
  results: [result()] | nil,
  status: atom() | nil,
  strategies: [atom()]
}

Functions

compare(experiment)

@spec compare(t()) :: [comparison()]

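Compare results across strategies, computing aggregate metrics. Returns a list of comparison() maps, each carrying the run count, success count, success rate, and the average duration, cost, tokens, and tool calls for a strategy.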
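
The aggregation described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the library's implementation: `CompareSketch.summarize/1` and the strategy names `:react` and `:plan` are hypothetical, and only the metrics that can be derived from the fields of result() are computed. Cost, token, and tool-call averages are left out because the keys inside the `{:ok, map()}` payload are not documented here.

```elixir
defmodule CompareSketch do
  # Hypothetical helper, not part of Agentic.
  # Groups result()-shaped maps by strategy and aggregates each group.
  def summarize(results) do
    results
    |> Enum.group_by(& &1.strategy)
    |> Enum.map(fn {strategy, runs} ->
      # Groups produced by Enum.group_by/2 are never empty, so run_count >= 1.
      run_count = length(runs)
      success_count = Enum.count(runs, &match?({:ok, _}, &1.result))
      total_ms = runs |> Enum.map(& &1.duration_ms) |> Enum.sum()

      %{
        strategy: strategy,
        run_count: run_count,
        success_count: success_count,
        success_rate: success_count / run_count,
        avg_duration_ms: total_ms / run_count
      }
    end)
    |> Enum.sort_by(& &1.strategy)
  end
end

# Sample data in the shape of result(); values are made up for illustration.
results = [
  %{strategy: :react, prompt: "p1", repetition: 1, result: {:ok, %{}}, duration_ms: 100},
  %{strategy: :react, prompt: "p1", repetition: 2, result: {:error, :timeout}, duration_ms: 300},
  %{strategy: :plan, prompt: "p1", repetition: 1, result: {:ok, %{}}, duration_ms: 200}
]

summary = CompareSketch.summarize(results)
```

Here `summary` holds one map per strategy, sorted by strategy name. Using `/` for the averages yields floats, which matches the `float()` fields of comparison().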

run(experiment)

@spec run(t()) :: t()

Run an experiment, collecting results for each (prompt, strategy, repetition) triple. Returns a t() struct, whose results field is typed as a list of result() maps.
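
One way to picture the collection loop is the standalone sketch below. It is an illustration under stated assumptions and not the library's code: `RunSketch.collect/4`, the `runner` callback, and the strategy names `:ok_strategy` and `:fails` are all hypothetical, and the iteration order used by the real `run/1` is not documented here.

```elixir
defmodule RunSketch do
  # Hypothetical helper, not part of Agentic. `runner` is an assumed
  # 2-arity function: (strategy, prompt) -> {:ok, map()} | {:error, term()}.
  def collect(prompts, strategies, repetitions, runner) do
    for prompt <- prompts,
        strategy <- strategies,
        repetition <- 1..repetitions do
      # :timer.tc/1 returns {elapsed_microseconds, return_value}.
      {micros, outcome} = :timer.tc(fn -> runner.(strategy, prompt) end)

      %{
        strategy: strategy,
        prompt: prompt,
        repetition: repetition,
        result: outcome,
        duration_ms: div(micros, 1000)
      }
    end
  end
end

# Stub runner standing in for a real strategy invocation.
runner = fn
  :fails, _prompt -> {:error, :stubbed_failure}
  strategy, prompt -> {:ok, %{output: "#{strategy}: #{prompt}"}}
end

results = RunSketch.collect(["a", "b"], [:ok_strategy, :fails], 2, runner)
```

With two prompts, two strategies, and two repetitions, the comprehension produces eight entries, each in the shape of result(). Failures are kept as `{:error, term()}` entries instead of aborting the loop, so a later aggregation step can count them.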