Interprets output from various provers into a standardized format.
The TPTP-oriented prover outputs (used by SystemOnTPTP and StarExec backends
when solvers follow the SZS Ontology) are interpreted by interpret_result/1.
Isabelle use_theories results are interpreted by
interpret_isabelle_result/1.
Summary
Types
Standardized format for ATP outputs.
Unexpected failures (including parsing errors of user input)
Expected ATP results
Functions
Concatenates all "message" strings from a use_theories payload (the
result field of a finished %IsabelleClient.Task{}) into a single
newline-separated string.
Interprets the result payload of a finished use_theories task — the map
returned in %IsabelleClient.Task{status: :finished, result: ...}, with
top-level keys "ok", "errors", "nodes", etc.
Interprets the output of provers that follow the SZS ontology (including those available on SystemOnTPTP) into the standardized representation.
Types
Standardized format for ATP outputs.
@type failure_t() :: :internal_error | :malformed_input | {:unrecognized_output, String.t()}
Unexpected failures (including parsing errors of user input)
@type success_t() ::
:thm | :csat | :sat | :timeout | :out_of_resources | :gave_up | :interrupted
Expected ATP results
Functions
Concatenates all "message" strings from a use_theories payload (the
result field of a finished %IsabelleClient.Task{}) into a single
newline-separated string.
Walks all nodes in the payload, not just the first — useful when a theory transitively imports others and the server reports messages against multiple nodes.
@spec interpret_isabelle_result(map()) :: atp_result()
Interprets the result payload of a finished use_theories task — the map
returned in %IsabelleClient.Task{status: :finished, result: ...}, with
top-level keys "ok", "errors", "nodes", etc.
Classification combines structural and textual signals from the payload:
- a "Nitpick found a counterexample" message yields
{:ok, :csat}; - a "Nitpick found a model" message yields
{:ok, :sat}; - a Sledgehammer "found a proof" message yields
{:ok, :thm}; - a "timed out" / "TIMEOUT" message yields
{:ok, :timeout}; - a "Out of memory" message yields
{:ok, :out_of_resources}; - otherwise, if the payload reports
ok: truewith no top-level errors and every theory node'sstatus.okis true, all proofs were discharged →{:ok, :thm}. This covers plainby auto,by simp, structuredproof ... qedblocks, and anything else that closes the goal without going through Sledgehammer or Nitpick; - remaining cases (finished with errors, or unclassifiable text) yield
{:ok, :gave_up}.
Failed tasks should be surfaced through AtpClient.Isabelle.prove_theory/4's
{:error, {:isabelle_failed, _, _}} channel and never reach this function.
@spec interpret_result(String.t()) :: atp_result()
Interprets the output of provers that follow the SZS ontology (including those available on SystemOnTPTP) into the standardized representation.
This might not work for all available systems but at least for those available
through the sledgehammer tool in Isabelle/HOL. Much of the result
interpretation is taken directly from the sledgehammer source code
(https://github.com/seL4/isabelle/blob/master/src/HOL/Tools/Sledgehammer/sledgehammer_atp_systems.ML).