Equation metadata and generated lemmas

Copy Markdown View Source

Theoria's equation compiler is still internal groundwork, but compiled library definitions already carry auditable equation metadata.

Current flow:

Signature + Case.Template + Clause/Pattern
  → Theoria.Equation.Schema.Builder
  → Theoria.Equation.Compiler
  → Theoria.Equation.Compiled
  → Theoria.Equation.Definition.Spec package
  → Theoria.Equation.Recursor.Descriptor package
  → Theoria.Equation.Matcher.Descriptor package
  → Theoria.Equation.Matcher.Type.Shape plan
  → Theoria.Equation.Matcher.Type package
  → Theoria.Equation.Matcher.Spec package
  → Theoria.Equation.Info metadata in the environment
  → generated Theoria.Equation.Lemma metadata
  → generated Theoria.Equation.Matcher.Equation metadata
  → Theoria.Env.Matcher declarations
  → Theoria.Equation.Extension registry helpers
  → Theoria.Equation.Eqns / Matcher.Eqns lookup
  → optional/lazy theorem realization
  → optional opaque theorem declarations
  → rewrite rules/databases

The metadata is Theoria-owned data. It is checked by native validation and can be translated to Lean by the contributor-only oracle, but Lean is not part of the runtime trusted path.

Equation identities

Generated equation artifacts use Theoria.Equation.Identity as their canonical domain identity. The struct records the owner, kind, and target of an equation:

alias Theoria.Equation.Identity

Identity.equation(:nat_add, :succ)
Identity.unfold(:nat_add)
Identity.matcher_equation(:nat_add_match_1, :succ)
Identity.indexed_matcher_equation(:vec_match, :vec_cons)

Identity.format/1 is the human display form used by Mix tasks and docs, for example nat_add.eq_succ or vec_match.eq_vec_cons. When generated equation theorems are installed, the same struct is used as the environment key; no generated atom encoding is needed.

Selector-based APIs accept structured names or explicit selectors where that is clearer:

Theoria.Equation.Matcher.Eqns.indexed_statement(info, env, constructor: :vec_cons)
Theoria.Equation.Matcher.Eqns.indexed_realize(info, env, constructor: :vec_cons)

Trusted boundary

Equation, matcher, rewrite, simp, and Lean-oracle machinery are intentionally outside the trusted kernel. They may synthesize terms, metadata, declarations, or external validation modules, but they do not make proofs trusted by themselves.

Trusted:

  • Theoria.Kernel type-checks definitions, theorem declarations, inductive declarations, recursor metadata, and checked matcher declarations.
  • Environment replay validates that stored declarations can be reconstructed through kernel entrypoints.

Untrusted helpers:

The safety rule is simple: generated artifacts become meaningful only after the native kernel checks them. The Lean oracle is contributor confidence, not runtime trust.

Stored equation metadata

Compiled definitions such as bool_not, nat_add, list_length, and list_append store Theoria.Equation.Info in their environment declaration metadata. Use:

Theoria.Equation.Info.fetch(env, :nat_add)
Theoria.Equation.Info.all(env)
Theoria.Equation.Info.equation?(env, :nat_add)

Theoria.Equation.Signature summarizes the function telescope, recursive argument, result type, and fixed parameters. Fixed parameters are derived from the signature's parameter telescope by default, with explicit overrides still available for future edge cases. Theoria.Equation.Case.Template describes schematic constructor equations without tying schema generation to concrete library definition names. Theoria.Equation.Schema.Builder turns those inputs into validated schemas and matcher metadata, including discriminant names/positions/types and simple overlap records. The compiler returns Theoria.Equation.Compiled, which contains the recursor body plus generated clause/schema/matcher metadata. Theoria.Equation.Definition.Spec wraps that compiler result with the final checked definition type/value before installation. Theoria.Equation.Recursor.Descriptor reads checked Env.Recursor / Env.RecursorRule metadata, including indexed rule patterns, constructor fields, and recursive field indices. Theoria.Equation.Matcher.Descriptor can represent indexed Vec-like matcher shapes with normalized matcher-owned fields, but matcher type/body generation currently consumes descriptors fully only for the supported non-indexed families. Theoria.Equation.Matcher.Descriptor combines that recursor-derived shape with schema/matcher metadata into a structural matcher description with discriminants, alternatives, result, and recursor. Theoria.Equation.Matcher.Type first plans a descriptor-derived Matcher.Type.Shape containing parameters, motive, discriminants, alternatives, alternative case binders, result, recursor arguments, recursor metadata when available, and body, then generates checked matcher type/body packages by folding that telescope for supported simple shapes. Alternative case binders and recursor arguments are derived from planned binders and normalized descriptor fields for the current simple families; indexed descriptors are planned into validated shapes with major motive arguments/results, per-constructor case results, recursor metadata-backed arity/count checks, and a recursor application body. Experimental indexed emitters can fold those shapes into terms for inspection only. The emitted indexed type and value are kernel-checkable in the experimental path; checked matcher declaration emission remains disabled until indexed matcher integration with Matcher.Spec, validation, and generated equations is complete. Indexed matcher equation statements are dispatched by Theoria.Equation.Matcher.Statement; the current indexed Vec fragment lives in Theoria.Equation.Matcher.Statement.Vec. Matcher.Eqns exposes lookup/planning APIs while theorem realization remains separate. The current indexed Vec statement planner builds kernel-checkable constructor equations for vec_nil and vec_cons, including the recursive matcher call used by the vec_cons case. The indexed Vec planner uses Theoria.Equation.Matcher.Statement.Frame for explicit binder-frame operations and derives constructor fields, constructor applications, index patterns, and recursive hypotheses from planned matcher shapes where the indexed Vec fragment currently supports them. Recursive hypotheses use recursive field metadata instead of fixed Vec field names, so renamed case binders still produce kernel-checkable statements. Statement constants derive their current universe levels from the planned shape parameters, and recursive index instantiation reports tagged errors instead of using bang-style frame lookups. Planned statements can be converted to metadata-only indexed equation lemmas; these lemmas are validated as types and can be realized in the validation-only package with telescope-folded refl proof terms when the planned sides are definitionally equal. Theoria.Equation.Matcher.Indexed.Realization records the current proof strategy (:recursor_iota_refl) and realizes those checked equation artifacts without installing them into Prelude or rewrite/simp databases. Unsupported indexed statement shapes now return tagged errors instead of silently falling back to placeholder equality metadata. Theoria.Equation.Matcher.Spec installs those packages as Theoria.Env.Matcher metadata. Stored metadata records the definition name, checked type/value, recursive argument position, fixed parameters, level parameters, original clause metadata, Lean-shaped matcher alternatives, matcher discriminants, matcher declaration names, matcher mode, and a validated schema that drives schematic theorem generation.

Generated equation lemmas

For the currently supported library definitions, Theoria can generate schematic equation-lemma metadata from compiler-owned Equation.Schema entries:

{:ok, info} = Theoria.Equation.Info.fetch(env, :nat_add)
Theoria.Equation.Lemma.generated_for(info)
#=> [nat_add.eq_zero, nat_add.eq_succ]

Supported definitions today:

  • bool_not
  • bool_and
  • bool_or
  • nat_add
  • list_length
  • list_append

These generated lemmas are metadata until explicitly checked or installed. Schematic lemmas carry binders and an equality type, so they can be checked as forall-theorems by reflexivity when the two sides are definitionally equal, and optionally installed as opaque theorem declarations:

Theoria.Equation.Lemma.add_generated_to_env(env, :nat_add)

The prelude does not install generated equation theorems by default yet.

Equation lookup

Theoria.Equation.Eqns is the small Theoria equivalent of Lean's equation-theorem lookup layer:

alias Theoria.Equation.Identity

Theoria.Equation.Eqns.get(env, :nat_add)
#=> {:ok, [Identity.equation(:nat_add, :zero), Identity.equation(:nat_add, :succ)]}

Theoria.Equation.Eqns.generated(env, :nat_add)
Theoria.Equation.Eqns.unfold(env, :nat_add)
Theoria.Equation.Eqns.source(env, Identity.equation(:nat_add, :succ))
Theoria.Equation.Eqns.installed?(env, :nat_add)

Theoria.Equation.Matcher.Eqns is the matcher-equation side of the same groundwork and reads first-class matcher declarations from the environment:

Theoria.Equation.Matcher.Eqns.get(env, :nat_add_match_1)
#=> {:ok, [Identity.matcher_equation(:nat_add_match_1, :zero), Identity.matcher_equation(:nat_add_match_1, :succ)]}

Theoria.Equation.Matcher.Eqns.source(env, Identity.matcher_equation(:nat_add_match_1, :succ))
#=> {:ok, :nat_add_match_1}

Generated metadata can also be realized as checked Theoria.Equation.Realized artifacts without installing declarations:

Theoria.Equation.Eqns.realize(env, :nat_add)
Theoria.Equation.Eqns.realize(env, Identity.equation(:nat_add, :succ))
Theoria.Equation.Eqns.realize(env, Identity.unfold(:nat_add))
Theoria.Equation.Matcher.Eqns.realize(env, Identity.matcher_equation(:nat_add_match_1, :succ))

Theoria.Equation.Extension provides typed registry helpers for source lookup, matcher lookup, unfold identities, generated identity enumeration, structured theorem identities (theorem_ids/1, equation_ids/1, matcher_equation_ids/2, unfold_id/2), and an in-memory Extension.Registry snapshot. It is currently rebuilt from environment metadata rather than stored as a persistent disk-backed extension.

Matcher maturity

Matcher declarations have explicit modes:

  • :source_aligned — the declaration is kernel checked, but its type/value mirror the source definition while the matcher metadata is developed.
  • :matcher — the declaration has a generated matcher type and body.

Current status:

  • indexed Vec-like matcher descriptors can be represented, validated, and planned into Matcher.Type.Shape from vec_ind, including constructor fields, recursive fields, index binders, dependent motive binders/results, per-alternative index patterns, constructor applications, per-constructor case results, recursor arguments, and a planned recursor application body. Experimental indexed type/value term emission exists for inspection, and indexed matcher equation statement planning is separated under Matcher.Statement, but normal Matcher.Type/Matcher.Spec checked declaration emission intentionally rejects indexed shapes until dependent matcher generation exists;
  • unary Bool definitions such as bool_not use :matcher mode with type ∀ motive, ∀ b, motive → motive → motive and a Bool recursor body;
  • Nat definitions such as nat_add use :matcher mode with type ∀ motive, ∀ n, motive → (Nat → motive → motive) → motive and a Nat recursor body;
  • List definitions such as list_length and list_append use :matcher mode with a List recursor body;
  • binary Bool definitions such as bool_and and bool_or use :matcher mode with two discriminants and four alternatives.

Future stages will add true matcher arity for more families, independently compiled matcher bodies, and persistent matcher equation extensions.

Rewrite databases

Generated equation lemmas can feed the provisional rewrite database. Ordinary matcher equations can be included explicitly; indexed matcher equation lemmas remain metadata-only and are not added to rewrite/simp databases until proof realization exists:

database = Theoria.Rewrite.Database.from_env_equations(env)
Theoria.Rewrite.Database.once(database, term)

combined = Theoria.Rewrite.Database.from_env_all_equations(env)
Theoria.Rewrite.Database.once(combined, term)

The rewrite layer is intentionally untrusted and first-order for now. It can match schematic equation-rule binders, but it does not produce proofs and does not replace kernel checking.

Simp groundwork

Theoria.Simp is a tiny consumer of generated equation databases:

Theoria.Simp.once(env, term)
Theoria.Simp.normalize(env, term, max_steps: 100)

It repeatedly applies priority-sorted generated equation rewrite rules and returns the final term plus rich %Theoria.Simp.Step{} trace entries. It is not a trusted simplifier yet.

Indexed matcher equations are excluded from combined rewrite/simp databases by default. For explicit experiments, use:

Theoria.Rewrite.Database.from_env_indexed_matcher_equations(env)
Theoria.Simp.Database.from_env_indexed_matcher_equations(env)
Theoria.Rewrite.Database.from_env_all_equations(env, include_indexed_matchers: true)

A small smoke-test task runs built-in examples:

mix theoria.simp --examples

Mix tooling

List equation metadata, generated lemma names, matcher metadata, discriminants, alternatives, unfold metadata, and matcher equations:

mix theoria.equations
mix theoria.equations nat_add

Opt-in install generated equation theorems during the task run:

mix theoria.equations --install nat_add

Native validation kernel-checks generated equation theorems, matcher declarations, registry coherence, and lazy realization APIs, then reports them explicitly:

✓ matcher declarations: 6 checked matcher(s)
✓ generated equations: 16 artifact(s)
✓ matcher equations: 16 artifact(s)

It can also show equation metadata:

mix theoria.validate --equations
mix theoria.validate --equations --verbose

The verbose form prints generated lemma names under each stored equation definition.

Limitations by layer

LayerCurrent limitation
Public syntaxNo public equation-definition syntax yet. Library definitions feed the internal compiler directly.
Fixed parametersSignature-derived fixed parameters only; no mutual-recursive permutation/dependency analysis yet.
Recursor.DescriptorDescribes non-indexed and indexed recursor rules, including Vec index patterns, constructor fields, and recursive field indices; it is not yet a full dependent telescope analyzer.
Matcher.DescriptorRecursor-informed Bool/Nat/List descriptors plus indexed Vec-like descriptor shape with normalized matcher-owned fields; dependent/indexed matcher type/body generation is still unsupported.
Matcher.TypePlans and validates descriptor-derived Matcher.Type.Shape values with parameters, indices, motive binders/results, discriminants, alternatives, derived case binders, index patterns, constructor applications, per-constructor case results, recursor metadata-backed counts, recursor arguments, and recursor application bodies, then folds simple non-indexed telescopes. Experimental indexed emitters are term-planning only; not a general dependent matcher compiler.
Eqns / Matcher.EqnsGenerated equation metadata and Realized artifact helpers, not a persistent Lean-style environment extension. Indexed statement construction lives in Matcher.Statement; realization remains separate from optional theorem installation.
Extension.RegistryIn-memory snapshot rebuilt from environment metadata; not persisted to disk.
Structural recursionNo general structural recursion checker and no brecOn/below dictionaries.
Rewrite / SimpUntrusted first-order rewriting with no proof terms and no attribute/prioritized simp database.
Lean oracleContributor-only external validation; not a runtime dependency and not trusted by the kernel.

The current generator is deliberately small. It emits schematic equation lemmas from schemas for the built-in Bool/Nat/List definitions, derives fixed parameters from signatures, and stores basic matcher/discriminant/overlap metadata in checked matcher declarations. It is not yet Lean's full equation theorem machinery.