14. Workflows report Checkpoint Bundles and honest partial outcomes

Date: 2026-06-04
Status: Accepted
Implementation status: Implemented

Context

ADR 0012 deliberately shipped a small Workflow v1: structural dependency edges, read/write posture, conservative write-set serialization, execution through the Subagents manager, and compact terminal summaries. It explicitly deferred typed outputs, merge-back, workflow-level Events, and whole-workflow resume.

The T3 Code UI smoke after ADR 0012 exposed the next reliability gap. A tiny read-only Workflow completed cleanly, but a broader Workflow produced useful Subagent outputs while other steps timed out and the root retried. The UI could show run_workflow and Subagent lifecycle updates, but the runtime did not yet give the parent a first-class answer to:

  • which steps completed;
  • which steps timed out or failed;
  • which results are safe for downstream steps;
  • which dependents were held;
  • whether retry is safe;
  • what partial evidence can still be used.

The hybrid Claude Code orchestrator used "checkpoint bundles" to decide whether a dependent task could proceed. Pixir should borrow that semantic contract without borrowing Claude Code's script/runtime mechanics.

Decision

Pixir will extend Workflow results with Step Outcomes, Checkpoint Bundles, and Partial Workflow Outcomes.

A Step Outcome records both the raw Subagent lifecycle status and the Workflow's derived usability decision. It includes at least:

  • step_id;
  • agent_id;
  • child_session_id when available;
  • wave;
  • raw subagent_status;
  • derived checkpoint_status;
  • concise summary;
  • elapsed time when available;
  • error kind/details when terminal failure occurred;
  • any Checkpoint Bundle produced.
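
The field list above could be carried as an Elixir struct. This is a minimal sketch, not Pixir's actual module: the name `StepOutcomeSketch` and the field names `elapsed_ms`, `error`, and `checkpoint_bundle` are assumptions chosen for illustration.

```elixir
defmodule StepOutcomeSketch do
  @moduledoc "Hypothetical shape for a Step Outcome; not Pixir's actual module."

  # Fields the ADR lists without a "when available" qualifier are enforced.
  @enforce_keys [:step_id, :agent_id, :wave, :subagent_status, :checkpoint_status, :summary]
  defstruct [
    :step_id,
    :agent_id,
    :wave,
    # Raw Subagent lifecycle status, preserved as reported.
    :subagent_status,
    # Workflow-level projection derived from the raw status.
    :checkpoint_status,
    :summary,
    # Optional fields default to nil (assumed names).
    child_session_id: nil,
    elapsed_ms: nil,
    error: nil,
    checkpoint_bundle: nil
  ]
end
```

Keeping `subagent_status` and `checkpoint_status` as separate fields mirrors the ADR's point that the projection does not replace the lifecycle status.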

checkpoint_status is a Workflow-level projection, not a replacement for Subagent lifecycle status:

| Status | Meaning |
| --- | --- |
| checkpoint_ready | The step produced a result safe for dependents to consume. |
| partial | The step produced useful evidence but not enough to unblock dependents. |
| failed | The step reached a terminal non-usable state, including failed, timed out, cancelled, closed, or detached with no usable checkpoint. |
| held | The step was not started because a dependency did not become checkpoint-ready. |
| needs_orchestrator | The step found a material ambiguity, seam conflict, or unreconciled decision the Workflow cannot safely resolve. |
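
One way to read the projection is as a small pure function over the raw status and whatever evidence the step left behind. The sketch below is an assumption-heavy illustration: the module name, the `flags` argument, the bundle shape, and the precedence of the branches are all hypothetical, and the ADR does not say how a raw `completed` without a bundle should be classified (it is treated as `partial` here).

```elixir
defmodule CheckpointProjectionSketch do
  @moduledoc "Hypothetical projection; names and precedence are assumptions."

  # subagent_status: raw terminal lifecycle atom, or nil if the step never started.
  # bundle: nil, or a map with a boolean :dependent_safe key (assumed shape).
  # flags: may contain :dependency_unmet or :needs_orchestrator (assumed names).
  def derive(subagent_status, bundle, flags \\ []) do
    cond do
      # Never started because a dependency did not become checkpoint-ready.
      :dependency_unmet in flags -> :held
      # The step escalated an ambiguity or conflict it cannot resolve.
      :needs_orchestrator in flags -> :needs_orchestrator
      # Raw completion alone is not enough; the bundle must be dependent-safe.
      subagent_status == :completed and match?(%{dependent_safe: true}, bundle) -> :checkpoint_ready
      # Some result or evidence exists, but nothing unblocks dependents.
      subagent_status == :completed or is_map(bundle) -> :partial
      # Terminal state with no usable checkpoint.
      true -> :failed
    end
  end
end
```

For example, `CheckpointProjectionSketch.derive(:timed_out, nil)` yields `:failed`, while a timed-out step that still left a non-safe bundle yields `:partial`.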

A Checkpoint Bundle is the structured evidence that makes a step safe to consume:

  • produced contract or artifact;
  • verification evidence when the template/practice requires it;
  • known limitations;
  • dependent-safe flag;
  • source Subagent id and Session id;
  • optional structured payload once typed outputs exist.
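
As an illustration of how a consumer might check a bundle before relying on it, the sketch below models the listed elements as a struct with a single predicate. The module name, field names, and the `:require_verification` option are hypothetical; the ADR leaves the concrete shape open, and templates may define validity differently.

```elixir
defmodule CheckpointBundleSketch do
  @moduledoc "Hypothetical Checkpoint Bundle shape and consumption check."

  defstruct artifact: nil,
            verification: nil,
            limitations: [],
            dependent_safe: false,
            subagent_id: nil,
            session_id: nil,
            # Reserved for a structured payload once typed outputs exist.
            payload: nil

  # A bundle is consumable when it carries an artifact, is flagged
  # dependent-safe, and includes verification evidence if required.
  def consumable?(%__MODULE__{} = bundle, opts \\ []) do
    require_verification? = Keyword.get(opts, :require_verification, false)

    not is_nil(bundle.artifact) and bundle.dependent_safe == true and
      (not require_verification? or not is_nil(bundle.verification))
  end
end
```

The flag defaults to `false` so that a bundle is never treated as dependent-safe by omission.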

A Partial Workflow Outcome is returned when a Workflow started and produced some operational truth but did not reach completed status. It records:

  • all Step Outcomes;
  • usable Checkpoint Bundles;
  • failed/partial/held/needs-orchestrator steps;
  • held dependents;
  • unresolved Seam Obligations;
  • safe next actions such as retry, rerun failed steps only, ask user, or abort.
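
The record above can be thought of as a summary folded from the Step Outcomes. The following sketch assumes each outcome is a plain map with `:step_id`, `:checkpoint_status`, and an optional `:checkpoint_bundle`; the module name, result keys, and the `next_actions/1` heuristic are hypothetical. Seam Obligations are left out because this ADR does not describe their shape.

```elixir
defmodule PartialOutcomeSketch do
  @moduledoc "Hypothetical assembly of a Partial Workflow Outcome."

  def summarize(outcomes) when is_list(outcomes) do
    # Group step ids by their derived checkpoint_status.
    by_status = Enum.group_by(outcomes, & &1.checkpoint_status, & &1.step_id)
    ready? = fn o -> o.checkpoint_status == :checkpoint_ready end

    %{
      # Only an all-ready Workflow counts as completed.
      status: if(Enum.all?(outcomes, ready?), do: :completed, else: :partial),
      step_outcomes: outcomes,
      # Bundles are usable only when their step is checkpoint-ready.
      usable_bundles:
        for(o <- outcomes, ready?.(o), b = Map.get(o, :checkpoint_bundle), do: b),
      non_ready: Map.delete(by_status, :checkpoint_ready),
      held_dependents: Map.get(by_status, :held, []),
      next_actions: next_actions(by_status)
    }
  end

  # Illustrative policy only; real retry safety needs more context.
  defp next_actions(%{needs_orchestrator: _}), do: [:ask_user, :abort]
  defp next_actions(%{failed: _}), do: [:rerun_failed_steps, :abort]
  defp next_actions(_), do: [:retry, :abort]
end
```

A parent agent reading this map can see at a glance which evidence survived and which steps never ran, without parsing summary prose.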

Invalid Workflow specifications remain Tool errors (:invalid_args) per ADR 0005. Provider or runtime crashes that prevent Pixir from knowing what happened may also remain Tool errors. Expected agentic outcomes, such as one step timing out while another completes, are Workflow outcomes, not protocol failures. They should be returned as structured Workflow data so that parent agents and presenters cannot mistake partial work for success or lose useful evidence.
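
The boundary between a Tool error and a Workflow outcome can be pictured as two different return shapes. In this sketch the `{:error, :invalid_args}` and `{:ok, map}` tuples, the module name, the `spec` shape, and the caller-supplied `run_steps` function are all assumptions for illustration, not Pixir's actual `run_workflow` contract.

```elixir
defmodule RunWorkflowResultSketch do
  # spec: a map expected to hold a non-empty :steps list (assumed shape).
  # run_steps: caller-supplied function returning [{step_id, checkpoint_status}].
  def run(spec, run_steps) when is_function(run_steps, 1) do
    case spec do
      %{steps: [_ | _] = steps} ->
        results = run_steps.(steps)
        all_ready? = Enum.all?(results, fn {_id, status} -> status == :checkpoint_ready end)
        status = if all_ready?, do: :completed, else: :partial
        # Timeouts and step failures are data inside an :ok tuple.
        {:ok, %{status: status, steps: results}}

      _ ->
        # A malformed specification stays a protocol-level Tool error.
        {:error, :invalid_args}
    end
  end
end
```

Callers that pattern-match on `{:ok, %{status: :partial}}` can then keep the evidence from finished steps instead of discarding it as they would on `{:error, _}`.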

Consequences

  • Downstream steps should unlock only from checkpoint_ready, not merely from raw Subagent completed.
  • A Workflow can fail honestly without erasing completed child work.
  • ACP/T3 and terminal presenters can show partial outcomes without misleading success prose.
  • Parent agents can decide whether to retry only failed steps, ask for approval, or synthesize from partial evidence.
  • This creates a natural bridge to Skill-backed Workflow Templates: each template can define what a valid Checkpoint Bundle means for its practice.
  • Typed outputs remain future work, but the result envelope can be designed now so typed payloads fit later.
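
The first consequence, unlocking dependents only from checkpoint_ready, can be sketched as a gate over the dependency graph. The module name and the two map shapes (`deps` and `statuses`) are hypothetical; this is not the scheduler ADR 0012 describes, only an illustration of the rule.

```elixir
defmodule UnlockGateSketch do
  # deps: map of step_id => list of step_ids it depends on (assumed shape).
  # statuses: map of step_id => checkpoint_status for steps already decided.

  # Steps not yet decided whose dependencies are all checkpoint-ready.
  def runnable(deps, statuses) do
    for {step, needs} <- deps,
        not Map.has_key?(statuses, step),
        Enum.all?(needs, &(Map.get(statuses, &1) == :checkpoint_ready)),
        do: step
  end

  # Steps not yet decided with at least one dependency that ended non-ready.
  def held(deps, statuses) do
    non_ready = [:partial, :failed, :held, :needs_orchestrator]

    for {step, needs} <- deps,
        not Map.has_key?(statuses, step),
        Enum.any?(needs, &(Map.get(statuses, &1) in non_ready)),
        do: step
  end
end
```

With `deps = %{"a" => [], "b" => ["a"], "c" => ["b"]}` and `"a"` marked `:partial`, `held/2` reports `"b"`; recording `"b"` as `:held` and calling it again would then hold `"c"`, so the hold propagates one wave at a time.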

Non-goals

  • Do not add automatic merge-back from isolated writer snapshots in this ADR.
  • Do not add whole-workflow resume or canonical workflow_event yet.
  • Do not replace Subagent lifecycle statuses; preserve them and add a Workflow projection over them.
  • Do not make partial outcomes count as completed Workflows.

Verification Direction

The first implementation should add no-network tests before real-network smoke:

mix test test/pixir/workflows_test.exs test/pixir/tools_test.exs
mix pixir.smoke.workflows --dry-run --json

Required scenarios:

  • all steps checkpoint-ready;
  • one step times out while another step has a usable partial or ready checkpoint;
  • dependent steps become held;
  • invalid graphs still return :invalid_args;
  • run_workflow renders partial status without claiming completion;
  • ACP presentation distinguishes completed, partial, held, and failed outcomes.

References

  • ADR 0003: stateless Turns; local Log is source of truth.
  • ADR 0004: unified Event envelope and canonical vs ephemeral events.
  • ADR 0005: tool ergonomics and structured errors.
  • ADR 0011: BEAM-native Subagents as supervised child Sessions.
  • ADR 0012: structural Workflows over Subagents.
  • ADR 0013: Skills can provide Workflow Templates as installed practices.