AtpMcp.Runtime (atp_mcp v0.3.0)

Copy Markdown View Source

GenServer that drives the MCP protocol loop: owns stdout, runs each tools/call in its own Task, and handles notifications/cancelled.

Why a process

The synchronous AtpMcp.handle_rpc/1 path is fine for tests but insufficient at runtime for two reasons:

  1. Cancellation. While a tool call is in flight, the server must still read further stdin lines so a notifications/cancelled can interrupt it. That requires the tool's work to run in a separate process the dispatcher can kill.
  2. Progress. Long-running tools should emit periodic notifications/progress frames when the client supplied a _meta.progressToken. Multiple processes therefore need to write to stdout — and the writes must not interleave.

This GenServer is the only owner of stdout. Tool calls run in short-lived Tasks; their results flow back through the GenServer's mailbox and are serialized onto stdout in arrival order.

Cancellation semantics

Process.exit(task_pid, :kill) is the cancellation signal. Per AtpClient.Backend's contract, each backend treats death of the calling process as a request to abort and tears down the upstream work it owns. In practice:

  • LocalExec — the spawned Port is closed, which SIGKILLs the prover binary; a small guard process also catches edge cases by SIGKILLing the OS pid directly.
  • StarExec — a cancel-guard process issues DELETE against the remote job before letting go.
  • Isabelle — the session is torn down, which drops any in-flight use_theories task on the server side. The next call pays the session-start cost again (typically a few seconds for HOL).
  • SystemOnTPTP — the local Req/Finch request errors out and the connection slot is released, but the remote prover keeps running to its TimeLimit. SOTPTP has no remote-cancel endpoint; :time_limit_sec is the only server-side bound.

The MCP response is suppressed in all four cases as the spec requires.

Progress semantics

When a tools/call carries _meta.progressToken, the Task emits a notifications/progress frame every :heartbeat_ms (default 5000) while the work is in flight. The progress field is the elapsed whole seconds; total is omitted because most ATP runs have no meaningful upper bound. Real per-step progress would require AtpClient changes (e.g. streaming query_selected_systems).

Summary

Functions

Block until the runtime has no in-flight tools/call tasks. Called by AtpMcp.main/1 after stdin closes so pending tool responses reach stdout before the BEAM exits.

Returns a specification to start this module under a supervisor.

Hand a raw stdin line to the runtime. Returns immediately; the runtime decodes, dispatches, and (for tools/call) spawns a Task.

Block until the runtime has drained its current cast backlog. Useful in tests after delivering a cancellation to confirm the kill has been processed before asserting that no response was written.

Write an arbitrary already-encoded line to the runtime's output sink. Primarily for tests; production paths flow through deliver/2.

Functions

await_idle(server \\ __MODULE__, poll_ms \\ 25)

@spec await_idle(GenServer.server(), non_neg_integer()) :: :ok

Block until the runtime has no in-flight tools/call tasks. Called by AtpMcp.main/1 after stdin closes so pending tool responses reach stdout before the BEAM exits.

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

See Supervisor.

deliver(line, server \\ __MODULE__)

@spec deliver(String.t(), GenServer.server()) :: :ok

Hand a raw stdin line to the runtime. Returns immediately; the runtime decodes, dispatches, and (for tools/call) spawns a Task.

start_link(opts \\ [])

sync(server \\ __MODULE__)

@spec sync(GenServer.server()) :: :ok

Block until the runtime has drained its current cast backlog. Useful in tests after delivering a cancellation to confirm the kill has been processed before asserting that no response was written.

write(line, server \\ __MODULE__)

@spec write(String.t(), GenServer.server()) :: :ok

Write an arbitrary already-encoded line to the runtime's output sink. Primarily for tests; production paths flow through deliver/2.