All notable changes to AtpClient are documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[0.5.0] - 2026-06-30
Changed
AtpClient.ResultNormalization.atp_result()now surfaces the full SZS Ontology.{:ok, status}carries the SZS verdict as a downcased atom rather than one of six collapsed atoms that lost information. The recognised vocabulary covers every Success and NoSuccess atom a sledgehammer-targeted prover (or a contemporary TPTP-compliant deployment) is likely to emit:Success: :theorem, :unsatisfiable, :satisfiable, :counter_satisfiable, :contradictory_axioms, :equivalent, :counter_equivalent, :counter_theorem, :equivalent_counter_theorem, :equi_satisfiable, :tautology, :tautologous_conclusion, :weaker_conclusion, :no_consequence NoSuccess: :gave_up, :unknown, :incomplete, :timeout, :resource_out, :memory_out, :forced, :user, :inappropriate, :error, :input_errorThe most consequential distinctions the previous shape collapsed are
:theorem("conjecture follows from the axioms", e.g. Isabelle and model-checker endorsement) vs:unsatisfiable("the negated-conjecture clause set has no model", e.g. TPTP refutation provers), and:satisfiable("a model of the premises was found") vs:counter_satisfiable("a model of the negated conjecture was found"). See https://tptp.org/Seminars/SZSOntologies/Summary.html for the full ontology.Prover-specific patterns are reclassified to honest SZS atoms rather than sledgehammer's negated-conjecture reading. SPASS "Completion found" becomes
:satisfiable(was:gave_up); iProver "CNFRefutation" becomes:unsatisfiable(was:theorem); Vampire "Satisfiability detected" and "Termination reason: Satisfiable" become:satisfiable(were:csat); Alt-Ergo "Unknown" becomes:unknown(was:gave_up); Vampire SIGINT becomes:forced(was:interrupted). Callers running a negated-conjecture refutation pipeline should treat:satisfiablefrom these provers as a counter-model to the original goal.SPASS / Waldmeister input-rejection patterns move from
{:error, :malformed_input}to{:ok, :input_error}. SPASS "Undefined symbol", "Free Variable", "No formulae and clauses found in input file", and Waldmeister "Unexpected end of file" are the prover's own SZS InputError verdict — the prover ran, read the input, and rejected it. SPASS "Please report this error" and Waldmeister "Unrecoverable Segmentation Fault" remain{:error, :internal_error}because those are prover crashes, not verdicts. The SZSInappropriatestatus likewise moved from{:error, :malformed_input}to{:ok, :inappropriate}.reqbumped to~> 0.6. ResolvesGHSA-655f-mp8p-96gv(decompression-bomb DoS via auto-decoded archive / compressed response bodies, HIGH) andGHSA-px9f-whj3-246m(multipart form-data header injection, LOW). No API changes at the AtpClient surface.
Added
t:szs_status/0,t:szs_success/0,t:szs_no_success/0typespecs onAtpClient.ResultNormalizationenumerate the SZS atoms recognised explicitly.szs_statuswidens toatom()to admit the permissive fallback below.- Permissive SZS fallback. Any unrecognised
% SZS status <CamelCase>(or… says <CamelCase>) line passes through as its snake_case atom — so SZS additions likeEquivalentTheorembecome{:ok, :equivalent_theorem}without a code change. Bounded length and strict CamelCase keep the atom table safe against pathological input. - Word-boundary SZS line extraction. Replaces substring matching, so
longer names no longer collapse onto shorter prefixes:
EquivalentTheorem≠Equivalent,CounterSatisfiable≠Satisfiable,InputError≠Error.
Breaking
- The collapsed
t:success_t/0atoms (:thm,:csat,:sat,:out_of_resources,:interrupted) are gone fromAtpClient.ResultNormalization.atp_result()and fromAtpClient.Isabelle(which produced:thm/:csat/:sat/:timeout/:out_of_resources/:gave_upfrom the Isabelle message scanner). Pattern matches must be rewritten — see the migration notes. - The
{:error, :malformed_input}failure path is gone. SPASS / Waldmeister input-rejection patterns and the SZSInappropriatestatus surface in the{:ok, _}channel as:input_error/:inappropriaterespectively.failure_t()no longer lists:malformed_input.
Migration notes
A given old atom can map to one of several SZS atoms depending on which prover family produced the result. Match the disjunction to preserve coverage:
:thm→:theorem(Isabelle / Sledgehammer / Alt-Ergo Valid) or:unsatisfiable(TPTP refutation provers, iProver CNFRefutation).:csat→:counter_satisfiable(SZS CounterSatisfiable, Isabelle / Nitpick counter-example) or:satisfiable(SZS Satisfiable, Isabelle / Nitpick model, Vampire "Satisfiability detected", SPASS "Completion found").:sat→:satisfiable(only emitted by the Isabelle classifier for Nitpick / Quickcheck models).:out_of_resources→:resource_out(SZS ResourceOut, SPASS / Waldmeister resource patterns) or:memory_out(SZS MemoryOut, Isabelle "Out of memory").:interrupted→:forced(SZS Forced, Vampire SIGINT) or:user(SZS User).:gave_upagainst an Alt-Ergo verdict →:unknown; against SZSIncomplete→:incomplete; otherwise still:gave_up.{:error, :malformed_input}→ match{:ok, :input_error}instead.
So a previously written TPTP triage like
case AtpClient.SystemOnTptp.query(problem, default_system: "cvc5---1.3.0") do
{:ok, :thm} -> :proved
{:ok, :csat} -> :disproved
{:ok, atom} when atom in [:timeout, :out_of_resources, :gave_up, :interrupted]
-> :inconclusive
{:error, :malformed_input} -> :bad_input
{:error, _} -> :error
endbecomes
case AtpClient.SystemOnTptp.query(problem, default_system: "cvc5---1.3.0") do
{:ok, status} when status in [:theorem, :unsatisfiable]
-> :proved
{:ok, status} when status in [:satisfiable, :counter_satisfiable]
-> :disproved
{:ok, status} when status in [:timeout, :resource_out, :memory_out,
:gave_up, :unknown, :incomplete,
:forced, :user, :inappropriate]
-> :inconclusive
{:ok, :input_error} -> :bad_input
{:error, _} -> :error
end[0.4.0] - 2026-06-29
Changed
AtpClient.ResultNormalization.per_lemma_results/2becomes/3and now takes(payload, lemma_specs, opts)instead of(payload, opts). The caller supplies one%{name: String.t(), range: Range.t()}spec per lemma (AtpClient.Isabelle.lemma_specs/1builds these from the theory body) and the classifier buckets messages by body-line range rather than re-deriving structure frompos.linealone. Returnedlemma_result()maps lose the:linefield — line numbers referred to the generated theory file, not the caller's source, and were ambiguous when a bucket contained multiple messages.- Per-lemma classifier no longer scans concatenated text across
messages. Each message is classified individually before the bucket is
reconciled. This fixes two regressions visible with multi-lemma Isabelle
jobs: (a)
Nitpick found a model+Nitpick found no counterexamplesurfaced as:csatbecause the"Nitpick found a" + "counterexample"substring test fired across the message join, and (b)by <tactic>failing on a False goal was reported as:thmbecause Isabelle echoestheorem name: <goal>at thebyposition alongside theFailed to finish prooferror. Verdict precedence is documented onper_lemma_results/3. per_lemma_results/3accepts:fileto drop messages whosepos.filedoes not end with a given suffix.prove_lemmas/4uses this to filter out noise from the bundledTPTP.thyand other transitively imported theories that previously produced phantom lemma rows.- Lemma names always come through on the per-lemma path — the name
comes from the body (via
lemma_specs/1), not from parsingtheorem name:out of Isabelle messages. Sledgehammer / Nitpick verdicts no longer surface asname: nil.
Added
AtpClient.Isabelle.lemma_specs/1— extracts[%{name, range}]from a theory body. Used internally byprove_lemmas/4andprove_tptp/3; exposed for callers that want to pre-compute specs.
Migration notes
- If you call
ResultNormalization.per_lemma_results(payload)orper_lemma_results(payload, line_offset: n), switch toper_lemma_results(payload, specs, opts)wherespecscome fromAtpClient.Isabelle.lemma_specs(body). The high-levelprove_lemmas/4/prove_tptp/3/query_tptp/2entry points already do this for you. - Drop reads of the
:linefield onlemma_resultentries. Attribute by:nameinstead — line numbers were a generated-theory leak and have been removed from the returned shape.
[0.3.0] - 2026-06-26
Added
AtpClient.LocalExecbackend. Invokes a locally installed, TPTP-compliant prover binary (E, Vampire, …) viaSystem.cmd/3and normalizes its stdout through the existing SZS classifier. Two-layered timeout: the prover-side CPU limit (passed via:args) lets provers emit a cleanSZS status Timeout, and an independent BEAM-side wall-clock timeout (:wall_timeout_ms) kills wedged processes. Both paths fold into the same{:ok, :timeout}result so callers do not have to branch on the failure mode. Binary resolution goes throughSystem.find_executable/1; missing binaries surface{:error, {:prover_not_found, name}}rather than raising.AtpClient.Backendbehaviour so a UI (Smart Cell, Livebook, …) can enumerate and drive backends without hard-coding per-backend knowledge. Every backend (SystemOnTptp,StarExec,Isabelle,LocalExec) now implementsconfig_key/0,label/0,config_schema/0,verify/1, andquery/2. The newquery/2is the cross-backend entry point: it takes a TPTP-format problem string plus a keyword list and returns a singleatp_result(), hiding session login/logout, prover selection, theory bookkeeping, etc. Per-backend low-level entry points (query_system/3,prove_theory/4,create_job/3,query_tptp/2, …) remain available for callers that need sessions, multi-system fan-out, or per-lemma detail.AtpClient.Config.Fieldstruct carrying the UI metadata each backend exposes viaconfig_schema/0: logical:type(:string | :integer | :boolean | :string_list), layout:group(:connection | :defaults | :advanced),:required?,:secret?,:default,:label, and:doc. The library performs no coercion; values flow intoApplication.put_env/3and the per-call opts as-is.AtpClient.backends/0returns the list of behaviour-implementing backend modules so a UI can discover them without hard-coding.- TPTP-shaped entry points on
AtpClient.Isabelle.query_tptp/2andprove_tptp/3accept a TPTP/THF problem string, route it throughIsabelleClient.TPTP.isabellize_theory/1, append the configured:proof_method(default"by auto") to each generatedlemma, and submit the resulting theory to Isabelle withimports "TPTP"andunbundle from_TPTPactive. The bundledTPTP.thysupport theory is copied into:local_diron first use so Isabelle's loader can resolve the import. Lemma names carry over from TPTP formula names. Passproof_method: "by metis","sledgehammer", etc. for stronger or probing tactics. Smart-cell consumers can now drive Isabelle from the same TPTP editor they use for the SystemOnTPTP / StarExec / LocalExec backends; theory-text entry points (prove_theory/4,prove_lemmas/4,query/3,query_lemmas/3) remain unchanged. AtpClient.Isabelle.SessionOwner— a private GenServer that owns the link toIsabelleClient.Shared.open_session/1now returns{:error, reason}on a failed connection without killing non-trapping callers, a later Shared crash surfaces through a monitor rather than an:EXIT, and dropping the caller monitors the owner to a clean shutdown so server-side Isabelle sessions are no longer orphaned. TheSessionstruct gains an opaque:ownerfield; treat it as internal.scripts/build_eprover.sh— builds the E theorem prover from source and installs it topriv/bin/eproverfor use as the:local_execbackend's binary.AtpClient.ResultNormalization.failure_t/0gains{:prover_not_found, String.t()}for theLocalExecbinary-resolution failure mode.
Changed
- Library defaults are merged per-key inside
AtpClient.Config.get/2instead of being seeded viamix.exs's:envblock. The previous arrangement let anyconfig :atp_client, :<backend>, …in userconfig.exssilently replace the whole keyword list (the OTPApplication.put_env/3semantics), dropping defaults the user had not explicitly re-set — most visibly causing a fresh install with partial SystemOnTPTP config to fail with{:unrecognized_output, ""}because:urlhad vanished. Defaults now live inAtpClient.Config's@defaults(exposed viaConfig.defaults/0) and are merged underneath Application env on every read, so a partial user config only overrides the keys it actually names. No call-site changes required. AtpClient.SystemOnTptp.list_provers/0blocks on the first call until the startup refresh completes or:sotptp, :refresh_timeout_ms(default 15 s) elapses; subsequent calls return the cached list immediately. On timeout it returns[]rather than raising. Previously it could return[]immediately after application start, which the caller had no way to distinguish from "the SystemOnTPTP deployment is empty."per_lemma_results/2now appliescheck_tool_signalsper source line. Sledgehammer's"found a proof"and Nitpick's"found a counterexample"/"found a model"verdicts surface as{:ok, :thm}/{:ok, :csat}/{:ok, :sat}on the per-lemma path, matchinginterpret_isabelle_result/1. Previously only Isabelle's"theorem name:"completion notification was recognised, so probe-style proofs (sledgehammer nitpick oops) collapsed to:gave_up.{:ok, :timeout}and{:ok, :out_of_resources}are now surfaced per lemma too. Lemma names attach to results derived fromtheorem name:completions; sledgehammer/nitpick verdicts carryname: nilbecause the underlying messages do not name the lemma.:isabelle_elixiris now pinned to its GitHubmainbranch (previously~> 0.3from Hex). The new TPTP entry points rely on theIsabelleClient.TPTPmodule, which is not in the 0.3.0 Hex release. Re-pin to a Hex constraint once a release containingIsabelleClient.TPTPships.
Breaking
AtpClient.Isabelle.query/3no longer defaults itsoptsargument. Pass[]explicitly if you have no options to set. The default was removed so the three-arityquery/3(theory text + name) does not collide with theBackendbehaviour'squery/2(TPTP problem string + opts), both of which now live on the same module.AtpClient.Isabelle.Sessiongained an enforced:ownerfield. Anyone constructing%Session{client: …, config: …}outside the library must now also pass:owner(the pid of anAtpClient.Isabelle.SessionOwner). TreatSessionas opaque and construct it only viaopen_session/1.
Migration notes
- If you call
Isabelle.query(theory, name), change it toIsabelle.query(theory, name, []). - If you read the
:envblock ofAtpClient.MixProjectto discover library defaults, callAtpClient.Config.defaults/0instead. The:envblock has been removed since the defaults are no longer load-bearing there. - If you construct
%AtpClient.Isabelle.Session{}directly, stop — useopen_session/1so the:ownerpid is set up correctly.
[0.2.0]
Changed
- Bumped
:isabelle_elixirto~> 0.2. The upstream library has moved to a task-based wire model;AtpClient.Isabellehas been rewritten on top of the newIsabelleClient(the stateful struct client) instead ofIsabelleClientMini's polling API. End users of the high-level functions (open_session/1,prove_theory/4,query/3,close_session/1) see the same surface — only the internals and a few error / raw-result shapes have changed. AtpClient.Isabelle.Sessionnow wraps an%IsabelleClient{}rather than carrying:socketand:session_iddirectly. Code that pattern-matched%Session{socket: socket, session_id: id}must instead use%Session{client: %IsabelleClient{socket: socket, session_id: id}}.:rawmode ofprove_theory/4andquery/3now returns theuse_theoriesresult map directly (with top-level"ok","errors","nodes"keys) instead of the previous keyword list of poll messages.AtpClient.ResultNormalization.extract_isabelle_text/1already accepted this map shape, so call sites that used it continue to work unchanged.AtpClient.ResultNormalization.interpret_isabelle_status/1has been renamed tointerpret_isabelle_result/1and now takes theuse_theoriesresult map directly. The keyword-list input form is no longer supported — the previous name and shape were leaks of the old polling abstraction.AtpClient.ResultNormalization.extract_isabelle_text/1no longer accepts a keyword list; only the result map is supported.{:error, {:isabelle_failed, payload, status}}errors now carrypayload(the FAILED body) andnotes(intermediateNOTEpayloads accumulated byIsabelleClient.Task) instead of the old full poll-message keyword list. The "Cannot load theory file" annotation path continues to emit a 3-tuple{:isabelle_failed, payload, [hint:, local_dir:, isabelle_dir:]}as before.
Removed
- The
:poll_interval_msIsabelle config setting is no longer used — the new client uses blockingawait_tasksemantics rather than polling. Existingconfig :atp_client, :isabelle, poll_interval_ms: ...entries are removed.
Migration notes
For most callers the upgrade is a no-op beyond updating mix.exs. Action is
only needed if you:
- Pattern-match on
AtpClient.Isabelle.Sessionfields directly. Usesession.client.socket/session.client.session_idinstead, or treatSessionas opaque. - Call
AtpClient.ResultNormalization.interpret_isabelle_status/1. Rename tointerpret_isabelle_result/1and pass theuse_theoriespayload map (ortask.resultof a finishedIsabelleClient.Task) instead of the previous poll keyword list. - Use
prove_theory(..., raw: true)orquery(..., raw: true)and consume the result with anything other thanextract_isabelle_text/1. The shape is now a plain map; expectpayload["nodes"],payload["ok"], etc. - Match on the third element of
{:error, {:isabelle_failed, _, _}}errors. It is now either a list of NOTE payload maps (plain failures) or a[hint:, local_dir:, isabelle_dir:]keyword list (annotated "Cannot load theory file" failures).
[0.1.3]
Prior versions are not retroactively documented here. See git history for changes before 0.2.0.