AtpClient.ResultNormalization (AtpClient v0.1.7)

Copy Markdown View Source

Interprets output from various provers into a standardized format.

The TPTP-oriented prover outputs (used by SystemOnTPTP and StarExec backends when solvers follow the SZS Ontology) are interpreted by interpret_result/1. Isabelle use_theories results are interpreted by interpret_isabelle_status/1.

Summary

Types

Standardized format for ATP outputs.

Unexpected failures (including parsing errors of user input)

Expected ATP results

Functions

Concatenates all "message" strings from a finished Isabelle use_theories payload into a single newline-separated string, mirroring what the IsabelleClientMini.extract_results/1 helper produces.

Interprets an Isabelle use_theories poll result (the keyword list returned by IsabelleClientMini.poll_status/1 once a :finished tuple is present).

Interprets the output of provers that follow the SZS ontology (including those available on SystemOnTPTP) into the standardized representation.

Types

atp_result()

@type atp_result() :: {:ok, success_t()} | {:error, failure_t()}

Standardized format for ATP outputs.

failure_t()

@type failure_t() ::
  :internal_error | :malformed_input | {:unrecognized_output, String.t()}

Unexpected failures (including parsing errors of user input)

success_t()

@type success_t() ::
  :thm | :csat | :sat | :timeout | :out_of_resources | :gave_up | :interrupted

Expected ATP results

Functions

extract_isabelle_text(status)

@spec extract_isabelle_text(keyword() | map()) :: String.t()

Concatenates all "message" strings from a finished Isabelle use_theories payload into a single newline-separated string, mirroring what the IsabelleClientMini.extract_results/1 helper produces.

interpret_isabelle_status(status)

@spec interpret_isabelle_status(keyword()) :: atp_result()

Interprets an Isabelle use_theories poll result (the keyword list returned by IsabelleClientMini.poll_status/1 once a :finished tuple is present).

Classification combines structural and textual signals from the finished payload:

  • a "Nitpick found a counterexample" message yields {:ok, :csat};
  • a "Nitpick found a model" message yields {:ok, :sat};
  • a Sledgehammer "found a proof" message yields {:ok, :thm};
  • a "timed out" / "TIMEOUT" message yields {:ok, :timeout};
  • a "Out of memory" message yields {:ok, :out_of_resources};
  • otherwise, if the finished payload reports ok: true with no top-level errors and every theory node's status.ok is true, all proofs were discharged → {:ok, :thm}. This covers plain by auto, by simp, structured proof ... qed blocks, and anything else that closes the goal without going through Sledgehammer or Nitpick;
  • remaining cases (finished with errors, or unclassifiable text) yield {:ok, :gave_up}.

A :failed entry in the status list yields {:error, :malformed_input} as a defensive fallback; normally prove_theory/4 surfaces those as {:error, {:isabelle_failed, _, _}} before this function is called.

interpret_result(res_str)

@spec interpret_result(String.t()) :: atp_result()

Interprets the output of provers that follow the SZS ontology (including those available on SystemOnTPTP) into the standardized representation.

This might not work for all available systems but at least for those available through the sledgehammer tool in Isabelle/HOL. Much of the result interpretation is taken directly from the sledgehammer source code (https://github.com/seL4/isabelle/blob/master/src/HOL/Tools/Sledgehammer/sledgehammer_atp_systems.ML).