One tool, two orthogonal axes

A tool is declared once. What varies is captured on two independent axes, not one, because "needs human input" and "server handles it" are two different questions tangled together.

  1. Executor — who produces the result.
  2. Approval gate — whether the call is gated by human confirmation before it runs.

These are independent. A server tool can be gated (send_email runs server-side but needs a confirm click first) or ungated. Confirmation is not a kind of tool; it is a gate in front of an otherwise-server tool. "Answering," by contrast, is a tool whose result is the human's input — a different control flow entirely.

Executor axis

  • :server — your code runs it and returns a result. The default.
  • :human (elicitation) — the human is the executor; their answer is the result. The agent suspends. One-phase: nothing runs after the human responds.
  • :client — runs in the browser/LiveView: geolocation, client/device state, a file picker, a client-side computation. Not a human decision and not the server. The execution boundary is the socket; only works while one is live.
  • :provider — the provider executes it (hosted web search, code interpreter) and the result returns in the stream. Pass it through, do not dispatch locally. Already in the stack via ReqLLM's provider-hosted tools.

:sub_agent is deferred — it is :server with a spawn inside; modelling it as a distinct executor adds complication for no gain.

Approval gate (orthogonal modifier)

A policy on side-effecting tools: :auto or :requires_approval. Two-phase: suspend → human approves → the tool then executes → result. Distinct from :human elicitation, which is one-phase.

Legal matrix: the gate applies to :server and :client only. :provider cannot be gated — the provider executes mid-stream; there is no pre-execution suspend point. Gating :human is circular (approve asking the human?) — reject it at definition time. A gated :client call suspends twice: once for approval, once for client execution — two pending entries, two resolutions.

Progressive tools (a property, not a type)

A tool emitting intermediate progress before a final result (a long server job streaming logs) is marked :streaming?. Orthogonal to the executor — it affects the renderer (show progress) and whether the call resolves as one event or many.

Tool definition (sketch)

name:             string
description:      string
parameter_schema: NimbleOptions keyword list, passed through to ReqLLM
executor:         :server | :human | :client | :provider
approval:         :auto | :requires_approval
streaming?:       boolean
retention:        see 05-compaction (per-tool age/count + never_evict)
callback:         required for :server (and the post-approval phase of gated tools)

:server callbacks receive parsed args plus the %Agentix.Turn{} scope (see 02) for ambient context (current user, db handle).

Schema pass-through, not a new layer. ReqLLM's Tool.new/1 already natively accepts NimbleOptions keyword lists (and raw JSON Schema) and compiles them to JSON Schema itself (verified v1.16.0). Agentix hands the schema through verbatim — it must not re-compile or re-validate, or schemas get double-compiled and the two interpretations drift.

The control-flow collapse

From the state machine's point of view every executor reduces to one of two shapes:

  • Resolve-in-process: :server, and :provider (resolves in-stream).
  • Suspend-and-await-external-resolution: :human, :client, anything gated, and (later) :sub_agent.

So you build one suspension primitive; the executor only parameterizes who may resolve and how the UI prompts. The distinction stays explicit where it matters (tool declaration, loop dispatch, renderer) without multiplying machinery.

The suspension/resolution primitive

A suspended call is a pending tool call with a correlation id (tool_call_id) and a resolver. A single turn can carry a mix — three tool calls, two :server (self-resolve in ms) and one :human (resolves whenever). awaiting_input means "awaiting resolution of N pending calls, some already done." The agent holds:

pending: %{tool_call_id => status}   # :running | :awaiting | :resolved | :errored

The resume to the next LLM turn fires only when the pending set empties.

Two related-but-distinct shapes share the name "pending" — keep them straight:

  • This in-memory tracking map holds every call in the turn with its status, including :running :server calls. It is the agent's working state for deciding when to resume; it is never persisted as-is.
  • The persisted/rendered pending (fsm_state.pending in 01/04, the renderer assign in 06) is only the awaiting-external subset, shaped %{tool_call_id => %{executor, kind, prompt}} where kind is :approval | :elicitation | :client_exec — the field the renderer actually switches on. Running :server calls are not in it — if the agent is killed mid-turn, those are recovered by re-running from the log (the tool_call with no tool_result), not from the snapshot. The renderer shows running calls via in_flight_tools, awaiting ones via pending.

Resolver interface

:gen_statem.call(via(conversation_id), {:resolve, tool_call_id, result})

call, not cast, for the synchronous ack — a confirm click that silently vanishes is a terrible HITL failure mode. The agent:

  1. Validates tool_call_id against the pending set; if stale, unknown, or already resolved, replies {:error, :stale} (covers double-clicks, resubmits, expiries).
  2. Records the result, replies :ok immediately.
  3. Only then, via an internal :next_event, decides whether to start the next turn.

Replying before resuming is essential: resume-first blocks the caller for the whole next turn and trips the 5s call timeout.

Addressing and revival

The caller never holds the agent pid. It resolves through ensure_started(conversation_id), which returns the live process or starts/rehydrates one, then calls. This is what lets a suspended-on-human conversation survive the agent being killed: the answer arriving revives it.

Resolution is a public API, not a socket affordance: anything holding a conversation_id and a tool_call_id — a LiveView, a webhook controller, a job, an external system — calls the same resolve. This is what generalizes the suspension primitive from HITL chat into durable workflows (see 08).

Timeout and idempotency

Every suspending call needs a timeout. Default: resolve an unanswered call to a tool-error result the model can recover from ("user did not respond"). Idempotency keyed on tool_call_id covers the kill → revive → late-answer race — and the same id is the idempotency key side-effecting tools should honor, because a kill mid-:server-tool is recovered by re-dispatching that exact call (see 01). Timeout machinery is owned by the persistence adapter (schedule_expiry / cancel_expiry — see 04) rather than a per-agent timer, since a per-agent timer dies with a killed agent.

Resolved: :client is :human with JS as the "user"

The agent emits {:suspended, id, :client, args}; a registered JS hook maps tool name → client function, executes, and pushEvents back to the LiveView, which calls the same resolve. Mechanically symmetric to elicitation, just no visible prompt.

Security rule (write it down): client results are user-controllable. The server validates them and never trusts a :client result for a privileged decision.

Two edge rules: with no live socket (headless/API callers) a :client call fails fast to a tool-error after a short grace period — it must not park the conversation in awaiting_input forever. With multiple sockets (two tabs) both execute the JS; the second resolve gets {:error, :stale} server-side, but client-side double side effects (two file pickers) are the app's to guard.

Resolved: approval vs elicitation — one mechanism, two components

The resolution path is identical ({:resolve, tool_call_id, result}), so the headless layer has one pending concept and one resolver. But ship two default components (see 06): <.approval> (a boolean gate) and <.elicitation> (an arbitrary form). Don't force a form abstraction over a yes/no.

Who consumes the executor field

  • The loop — dispatch: run, suspend, or pass through.
  • The renderer:human → elicitation form; gated → confirm card; :provider → "searched the web" affordance; :client → execute (often invisible); :server → tool-call card with result/progress.
  • Persistence / replay — the tool_calls table tracks executor and status so suspensions survive a kill.

Result convention

Structured results carry model-visible semantics in the content body as JSON — %{ok: true, result: ...} / %{ok: false, error: ...} — following ReqLLM, so follow-up turns don't depend on adapter-only metadata.