Capture and inspect full LLM context windows at each workflow step.
Why this exists
When debugging a multi-step LLM pipeline you need to answer:
- What exact prompt did step 3 send?
- What upstream context did it have access to?
- Did the model see the right information or was something lost in transit?
- What did the raw response look like before parsing?
This module captures the full request/response payload and stores it in a
dedicated table, separate from the job's meta (which only holds the
parsed result) and separate from workflow_step_stats (which only holds
numeric usage data).
How to use it
Option A: Use call_llm/4 (recommended)
A drop-in replacement for your LLM client that handles capture automatically:
    def perform_workflow(%Oban.Job{} = job) do
      {:ok, parsed} = Results.get_result(job, :parse_patent)

      messages = [
        %{role: "system", content: "You are a patent analyst."},
        %{role: "user", content: "Assess this patent: #{parsed["title"]}"}
      ]

      case Debug.call_llm(job, messages, model: "claude-sonnet-4-20250514") do
        {:ok, response, _debug_log} ->
          {:ok, %{
            quality: Jason.decode!(response.text),
            llm_usage: %{model: response.model, ...}
          }}

        {:error, reason} ->
          {:error, reason}
      end
    end

Option B: Manual capture
If you need more control, call log_request/3 and log_response/2 yourself:
    # log_request returns {:ok, debug_log} or {:error, changeset}
    {:ok, debug} = Debug.log_request(job, request_payload)
    {:ok, response} = MyApp.LLM.complete(request_payload)
    Debug.log_response(debug, response)

Option C: Capture upstream context separately
    upstream = Results.get_all_results(job)
    Debug.log_upstream_context(job, upstream)

Enabling/disabling
Debug logging is controlled by config. In production, disable it to avoid storing large payloads:
    # config/dev.exs
    config :baton, Baton.Debug, enabled: true

    # config/prod.exs
    config :baton, Baton.Debug, enabled: false

Or enable it selectively per-workflow:

    Baton.new(workflow_name: "debug-run", debug: true)

Pruning
Debug logs can be large. Prune old ones periodically:
    # In a daily cron job or Oban plugin
    Baton.Debug.prune_older_than(days: 7)

Security
Captured logs contain the full prompt (messages) and request options, and
may include PII — disable debug logging in production unless you need it. As a
safeguard, options whose key looks like a credential (:api_key,
:authorization, :token, :headers, …) are masked as "[REDACTED]" before
storage, so secrets passed through your LLM client's options don't leak into
the database.
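The masking step described above can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the `RedactSketch` module is hypothetical, the key list contains only the four keys named above (the real list is longer), and exact key matching is an assumption chosen so that an option like `:max_tokens` is not masked by accident.

```elixir
defmodule RedactSketch do
  # Only the keys named in the docs; the real list is longer (assumption).
  @sensitive [:api_key, :authorization, :token, :headers]

  # Replaces the value of every credential-like option with "[REDACTED]"
  # and leaves all other options untouched.
  def redact(opts) when is_list(opts) do
    Enum.map(opts, fn {key, value} ->
      if key in @sensitive, do: {key, "[REDACTED]"}, else: {key, value}
    end)
  end
end

# Usage: the API key is masked, the model name and token limit are kept.
RedactSketch.redact(model: "claude-sonnet-4-20250514", api_key: "sk-123", max_tokens: 100)
```

Masking happens on the options only, so PII inside the messages themselves is still stored as-is.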
Functions
Call the LLM and automatically capture the full request/response for debugging.
This is a thin wrapper around the configured LLM client's complete/2 that:
- Builds a request map from the messages and options
- Logs the request before calling the LLM
- Logs the response (or error) after the call returns
- Returns {:ok, response, debug_log} or {:error, reason}
The debug_log is the inserted DebugLog struct — you can ignore it.
Options
The client is resolved from config :baton, llm_client: MyApp.LLM.
All options are passed through to the client's complete/2. Common ones:
- :model - model string
- :system - system prompt (will be included in the captured request)
- :max_tokens - max completion tokens
- :temperature - sampling temperature
Upstream context capture
Pass :upstream_context to also store what this step received from its deps:
    Debug.call_llm(job, messages,
      model: "claude-sonnet-4-20250514",
      upstream_context: Results.get_all_results(job)
    )
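The build/log/call/log flow listed for call_llm can be sketched in isolation. Everything here is a stand-in under stated assumptions: `CallLlmSketch` is a hypothetical module, an in-memory `Agent` replaces the debug table, the client is passed as a 2-arity function rather than resolved from config, and a plain map plays the role of the DebugLog struct.

```elixir
defmodule CallLlmSketch do
  # In-memory stand-in for the debug table (assumption; the real module
  # writes to a database).
  def start_store, do: Agent.start_link(fn -> [] end)

  # Returns recorded entries in the order they were logged.
  def logs(store), do: Agent.get(store, &Enum.reverse/1)

  # `client` is a 2-arity function standing in for the client's complete/2.
  def call_llm(store, client, messages, opts \\ []) do
    # Build the request map and log it before calling the LLM.
    request = %{messages: messages, options: Map.new(opts)}
    record(store, {:request, request})

    # Log the response or the error after the call returns.
    case client.(messages, opts) do
      {:ok, response} ->
        record(store, {:response, response})
        {:ok, response, %{request: request, response: response}}

      {:error, reason} ->
        record(store, {:error, reason})
        {:error, reason}
    end
  end

  defp record(store, entry), do: Agent.update(store, &[entry | &1])
end

# Usage with a stub client that always succeeds.
{:ok, store} = CallLlmSketch.start_store()
stub = fn _messages, _opts -> {:ok, %{text: "{}", model: "stub"}} end

{:ok, _response, _debug_log} =
  CallLlmSketch.call_llm(store, stub, [%{role: "user", content: "hi"}], model: "stub")
```

Logging the request before the call means a crash or timeout inside the client still leaves a record of what was sent.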
Get just the request messages for a step — the "context window" view. Returns the messages array from the stored request.
Check whether debug logging is enabled globally.
Check whether debug logging is enabled for a specific job.
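One way the global and per-job checks could relate is sketched below. This is an assumption, not the documented behavior: the `EnabledSketch` module is hypothetical, the config is passed in as a keyword list instead of being read from the application environment, and storing the per-workflow flag under a `"debug"` key in the job's meta is a guess at where `debug: true` ends up.

```elixir
defmodule EnabledSketch do
  # Global switch, mirroring `config :baton, Baton.Debug, enabled: ...`.
  def enabled?(config), do: Keyword.get(config, :enabled, false)

  # A job is captured when its own flag is set or the global switch is on.
  # The "debug" meta key is hypothetical.
  def enabled_for_job?(config, job_meta) do
    Map.get(job_meta, "debug") == true or enabled?(config)
  end
end

# Usage: debugging is off globally but on for this one job.
EnabledSketch.enabled_for_job?([enabled: false], %{"debug" => true})
```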
Reconstruct the full conversation as the LLM saw it, formatted for display.
Returns a list of %{role: string, content: string, token_estimate: integer}.
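The shape of that return value can be illustrated with a small sketch. How the module actually estimates tokens is not stated here, so the four-characters-per-token heuristic below is an assumption, as is the `ConversationSketch` module name.

```elixir
defmodule ConversationSketch do
  # Rough heuristic (assumption): about four characters per token.
  @chars_per_token 4

  # Adds a token_estimate to each stored message, keeping role and content.
  def format(messages) do
    Enum.map(messages, fn %{role: role, content: content} ->
      %{role: role, content: content, token_estimate: estimate(content)}
    end)
  end

  # Integer ceiling of length / @chars_per_token.
  defp estimate(content) do
    div(String.length(content) + @chars_per_token - 1, @chars_per_token)
  end
end

# Usage with the system message from the example workflow.
ConversationSketch.format([%{role: "system", content: "You are a patent analyst."}])
```

A character-based estimate is only good enough for spotting oversized context windows; exact counts need the provider's tokenizer.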
Update a debug log with error info after a failed call.
Get the debug log for a specific step (most recent attempt).
Log the full request payload before making the LLM call.
Returns {:ok, debug_log} or {:error, changeset}.
Update a debug log with the LLM response after a successful call.
Store the upstream context for a step separately. Useful when you want to capture what a step received from its deps independently of the prompt that was built from it.
Get all debug logs for a workflow, ordered by step execution time.
Delete debug logs older than the given duration.
Examples
    Debug.prune_older_than(days: 7)
    Debug.prune_older_than(hours: 24)
    Debug.prune_older_than(seconds: 3_600)
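To make the duration argument concrete, here is a minimal sketch of turning `days: 7` into a cutoff timestamp. It is not the module's implementation: `PruneSketch` is a hypothetical name, only the three units shown in the examples are supported, and deleting the rows older than the cutoff is left out.

```elixir
defmodule PruneSketch do
  # Seconds per unit, limited to the units used in the examples above.
  @seconds_per %{seconds: 1, hours: 3_600, days: 86_400}

  # Converts a duration such as [days: 7] into the UTC instant before which
  # logs would count as old. `now` is injectable so the result is testable.
  def cutoff([{unit, amount}], now \\ DateTime.utc_now()) do
    DateTime.add(now, -amount * Map.fetch!(@seconds_per, unit), :second)
  end
end

# Usage: everything inserted before this instant is a pruning candidate.
PruneSketch.cutoff(days: 7)
```

An unsupported unit raises a `KeyError` from `Map.fetch!/2` instead of silently pruning with the wrong window.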