All notable changes to AtpClient are documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[0.4.0] - 2026-06-29

Changed

  • AtpClient.ResultNormalization.per_lemma_results/2 becomes /3 and now takes (payload, lemma_specs, opts) instead of (payload, opts). The caller supplies one %{name: String.t(), range: Range.t()} spec per lemma (AtpClient.Isabelle.lemma_specs/1 builds these from the theory body) and the classifier buckets messages by body-line range rather than re-deriving structure from pos.line alone. Returned lemma_result() maps lose the :line field — line numbers referred to the generated theory file, not the caller's source, and were ambiguous when a bucket contained multiple messages.
  • Per-lemma classifier no longer scans concatenated text across messages. Each message is classified individually before the bucket is reconciled. This fixes two regressions visible with multi-lemma Isabelle jobs: (a) Nitpick found a model + Nitpick found no counterexample surfaced as :csat because the "Nitpick found a" + "counterexample" substring test fired across the message join, and (b) by <tactic> failing on a False goal was reported as :thm because Isabelle echoes theorem name: <goal> at the by position alongside the Failed to finish proof error. Verdict precedence is documented on per_lemma_results/3.
  • per_lemma_results/3 accepts :file to drop messages whose pos.file does not end with a given suffix. prove_lemmas/4 uses this to filter out noise from the bundled TPTP.thy and other transitively imported theories that previously produced phantom lemma rows.
  • Lemma names always come through on the per-lemma path — the name comes from the body (via lemma_specs/1), not from parsing theorem name: out of Isabelle messages. Sledgehammer / Nitpick verdicts no longer surface as name: nil.

Added

  • AtpClient.Isabelle.lemma_specs/1 — extracts [%{name, range}] from a theory body. Used internally by prove_lemmas/4 and prove_tptp/3; exposed for callers that want to pre-compute specs.

Migration notes

  • If you call ResultNormalization.per_lemma_results(payload) or per_lemma_results(payload, line_offset: n), switch to per_lemma_results(payload, specs, opts) where specs come from AtpClient.Isabelle.lemma_specs(body). The high-level prove_lemmas/4 / prove_tptp/3 / query_tptp/2 entry points already do this for you.
  • Drop reads of the :line field on lemma_result entries. Attribute by :name instead — line numbers were a generated-theory leak and have been removed from the returned shape.

[0.3.0] - 2026-06-26

Added

  • AtpClient.LocalExec backend. Invokes a locally installed, TPTP-compliant prover binary (E, Vampire, …) via System.cmd/3 and normalizes its stdout through the existing SZS classifier. Two-layered timeout: the prover-side CPU limit (passed via :args) lets provers emit a clean SZS status Timeout, and an independent BEAM-side wall-clock timeout (:wall_timeout_ms) kills wedged processes. Both paths fold into the same {:ok, :timeout} result so callers do not have to branch on the failure mode. Binary resolution goes through System.find_executable/1; missing binaries surface {:error, {:prover_not_found, name}} rather than raising.
  • AtpClient.Backend behaviour so a UI (Smart Cell, Livebook, …) can enumerate and drive backends without hard-coding per-backend knowledge. Every backend (SystemOnTptp, StarExec, Isabelle, LocalExec) now implements config_key/0, label/0, config_schema/0, verify/1, and query/2. The new query/2 is the cross-backend entry point: it takes a TPTP-format problem string plus a keyword list and returns a single atp_result(), hiding session login/logout, prover selection, theory bookkeeping, etc. Per-backend low-level entry points (query_system/3, prove_theory/4, create_job/3, query_tptp/2, …) remain available for callers that need sessions, multi-system fan-out, or per-lemma detail.
  • AtpClient.Config.Field struct carrying the UI metadata each backend exposes via config_schema/0: logical :type (:string | :integer | :boolean | :string_list), layout :group (:connection | :defaults | :advanced), :required?, :secret?, :default, :label, and :doc. The library performs no coercion; values flow into Application.put_env/3 and the per-call opts as-is.
  • AtpClient.backends/0 returns the list of behaviour-implementing backend modules so a UI can discover them without hard-coding.
  • TPTP-shaped entry points on AtpClient.Isabelle. query_tptp/2 and prove_tptp/3 accept a TPTP/THF problem string, route it through IsabelleClient.TPTP.isabellize_theory/1, append the configured :proof_method (default "by auto") to each generated lemma, and submit the resulting theory to Isabelle with imports "TPTP" and unbundle from_TPTP active. The bundled TPTP.thy support theory is copied into :local_dir on first use so Isabelle's loader can resolve the import. Lemma names carry over from TPTP formula names. Pass proof_method: "by metis", "sledgehammer", etc. for stronger or probing tactics. Smart-cell consumers can now drive Isabelle from the same TPTP editor they use for the SystemOnTPTP / StarExec / LocalExec backends; theory-text entry points (prove_theory/4, prove_lemmas/4, query/3, query_lemmas/3) remain unchanged.
  • AtpClient.Isabelle.SessionOwner — a private GenServer that owns the link to IsabelleClient.Shared. open_session/1 now returns {:error, reason} on a failed connection without killing non-trapping callers, a later Shared crash surfaces through a monitor rather than an :EXIT, and dropping the caller monitors the owner to a clean shutdown so server-side Isabelle sessions are no longer orphaned. The Session struct gains an opaque :owner field; treat it as internal.
  • scripts/build_eprover.sh — builds the E theorem prover from source and installs it to priv/bin/eprover for use as the :local_exec backend's binary.
  • AtpClient.ResultNormalization.failure_t/0 gains {:prover_not_found, String.t()} for the LocalExec binary-resolution failure mode.

Changed

  • Library defaults are merged per-key inside AtpClient.Config.get/2 instead of being seeded via mix.exs's :env block. The previous arrangement let any config :atp_client, :<backend>, … in user config.exs silently replace the whole keyword list (the OTP Application.put_env/3 semantics), dropping defaults the user had not explicitly re-set — most visibly causing a fresh install with partial SystemOnTPTP config to fail with {:unrecognized_output, ""} because :url had vanished. Defaults now live in AtpClient.Config's @defaults (exposed via Config.defaults/0) and are merged underneath Application env on every read, so a partial user config only overrides the keys it actually names. No call-site changes required.
  • AtpClient.SystemOnTptp.list_provers/0 blocks on the first call until the startup refresh completes or :sotptp, :refresh_timeout_ms (default 15 s) elapses; subsequent calls return the cached list immediately. On timeout it returns [] rather than raising. Previously it could return [] immediately after application start, which the caller had no way to distinguish from "the SystemOnTPTP deployment is empty."
  • per_lemma_results/2 now applies check_tool_signals per source line. Sledgehammer's "found a proof" and Nitpick's "found a counterexample"/ "found a model" verdicts surface as {:ok, :thm} / {:ok, :csat} / {:ok, :sat} on the per-lemma path, matching interpret_isabelle_result/1. Previously only Isabelle's "theorem name:" completion notification was recognised, so probe-style proofs (sledgehammer nitpick oops) collapsed to :gave_up. {:ok, :timeout} and {:ok, :out_of_resources} are now surfaced per lemma too. Lemma names attach to results derived from theorem name: completions; sledgehammer/nitpick verdicts carry name: nil because the underlying messages do not name the lemma.
  • :isabelle_elixir is now pinned to its GitHub main branch (previously ~> 0.3 from Hex). The new TPTP entry points rely on the IsabelleClient.TPTP module, which is not in the 0.3.0 Hex release. Re-pin to a Hex constraint once a release containing IsabelleClient.TPTP ships.

Breaking

  • AtpClient.Isabelle.query/3 no longer defaults its opts argument. Pass [] explicitly if you have no options to set. The default was removed so the three-arity query/3 (theory text + name) does not collide with the Backend behaviour's query/2 (TPTP problem string + opts), both of which now live on the same module.
  • AtpClient.Isabelle.Session gained an enforced :owner field. Anyone constructing %Session{client: …, config: …} outside the library must now also pass :owner (the pid of an AtpClient.Isabelle.SessionOwner). Treat Session as opaque and construct it only via open_session/1.

Migration notes

  • If you call Isabelle.query(theory, name), change it to Isabelle.query(theory, name, []).
  • If you read the :env block of AtpClient.MixProject to discover library defaults, call AtpClient.Config.defaults/0 instead. The :env block has been removed since the defaults are no longer load-bearing there.
  • If you construct %AtpClient.Isabelle.Session{} directly, stop — use open_session/1 so the :owner pid is set up correctly.

[0.2.0]

Changed

  • Bumped :isabelle_elixir to ~> 0.2. The upstream library has moved to a task-based wire model; AtpClient.Isabelle has been rewritten on top of the new IsabelleClient (the stateful struct client) instead of IsabelleClientMini's polling API. End users of the high-level functions (open_session/1, prove_theory/4, query/3, close_session/1) see the same surface — only the internals and a few error / raw-result shapes have changed.
  • AtpClient.Isabelle.Session now wraps an %IsabelleClient{} rather than carrying :socket and :session_id directly. Code that pattern-matched %Session{socket: socket, session_id: id} must instead use %Session{client: %IsabelleClient{socket: socket, session_id: id}}.
  • :raw mode of prove_theory/4 and query/3 now returns the use_theories result map directly (with top-level "ok", "errors", "nodes" keys) instead of the previous keyword list of poll messages. AtpClient.ResultNormalization.extract_isabelle_text/1 already accepted this map shape, so call sites that used it continue to work unchanged.
  • AtpClient.ResultNormalization.interpret_isabelle_status/1 has been renamed to interpret_isabelle_result/1 and now takes the use_theories result map directly. The keyword-list input form is no longer supported — the previous name and shape were leaks of the old polling abstraction.
  • AtpClient.ResultNormalization.extract_isabelle_text/1 no longer accepts a keyword list; only the result map is supported.
  • {:error, {:isabelle_failed, payload, status}} errors now carry payload (the FAILED body) and notes (intermediate NOTE payloads accumulated by IsabelleClient.Task) instead of the old full poll-message keyword list. The "Cannot load theory file" annotation path continues to emit a 3-tuple {:isabelle_failed, payload, [hint:, local_dir:, isabelle_dir:]} as before.

Removed

  • The :poll_interval_ms Isabelle config setting is no longer used — the new client uses blocking await_task semantics rather than polling. Existing config :atp_client, :isabelle, poll_interval_ms: ... entries are removed.

Migration notes

For most callers the upgrade is a no-op beyond updating mix.exs. Action is only needed if you:

  • Pattern-match on AtpClient.Isabelle.Session fields directly. Use session.client.socket / session.client.session_id instead, or treat Session as opaque.
  • Call AtpClient.ResultNormalization.interpret_isabelle_status/1. Rename to interpret_isabelle_result/1 and pass the use_theories payload map (or task.result of a finished IsabelleClient.Task) instead of the previous poll keyword list.
  • Use prove_theory(..., raw: true) or query(..., raw: true) and consume the result with anything other than extract_isabelle_text/1. The shape is now a plain map; expect payload["nodes"], payload["ok"], etc.
  • Match on the third element of {:error, {:isabelle_failed, _, _}} errors. It is now either a list of NOTE payload maps (plain failures) or a [hint:, local_dir:, isabelle_dir:] keyword list (annotated "Cannot load theory file" failures).

[0.1.3]

Prior versions are not retroactively documented here. See git history for changes before 0.2.0.