Equality Saturation
Copy MarkdownIntro
These notebooks are Livebook adaptations of the upstream egglog tutorial from https://egraphs-good.github.io/egglog-tutorial/. The egglog examples follow the upstream chapter order closely; the extra Elixir cells are there to make normally silent egglog state visible in Livebook.
This first chapter is about the core equality-saturation loop:
- Define a small term language.
- Insert ground terms into an e-graph.
- Add equations as rewrite rules.
- Run those rules until the e-graph contains useful equivalent forms.
- Extract a representative term from an equivalence class.
Each executable egglog block is sent directly to one live mutable Egglog.EGraph session. The setup cell creates that session and defines a small Tutorial module only for displaying outputs and snapshots. The tutorial cells mostly have this shape:
egglog = """
... <egglog code> ...
"""
{:ok, result} = Egglog.EGraph.run(egraph, egglog)
Tutorial.print_outputs(result, egglog)The egraph value is the stateful Elixir handle to the native egglog session. If you want to experiment, edit the egglog program between the triple quotes and re-run the cell.
For visual snapshots, install Graphviz and make sure dot is available on PATH. If dot is unavailable, the helper returns Graphviz DOT text instead of an SVG picture.
The helper deliberately keeps egglog syntax visible. The Elixir code is only notebook plumbing; congruence closure, matching, rebuilding, scheduling, and extraction all happen in native egglog.
When using the published package, replace the dependency with {:egglog_elixir, "~> 0.1"}.
Mix.install([
{:egglog_elixir, path: Path.expand("../..", __DIR__)},
{:kino, "~> 0.19"}
])defmodule Tutorial do
def print_outputs(result, code \\ "") do
(result.outputs ++ check_outputs(code))
|> Enum.map(& &1.text)
|> Enum.reject(&is_nil/1)
|> Enum.map(&String.trim_trailing/1)
|> Enum.reject(&(&1 == ""))
|> Enum.each(&IO.puts/1)
end
def render_snapshot(%{format: :svg, svg: svg}) when is_binary(svg) do
kino_html = Module.concat(Kino, HTML)
if Code.ensure_loaded?(kino_html) do
apply(kino_html, :new, [svg])
else
svg
end
end
def render_snapshot(%{text: text}), do: text
def summarize_egraph(egraph, opts \\ []) do
summary =
egraph
|> json_snapshot(opts)
|> Egglog.Snapshot.summary()
%{
e_nodes: summary.nodes,
e_classes: summary.classes,
by_operator: summary.operators,
omitted: summary.omitted
}
end
def sample_json_snapshot(egraph, opts \\ []) do
snapshot = json_snapshot(egraph, opts)
graph = Jason.decode!(snapshot.json)
%{
format: snapshot.format,
top_level_keys: Map.keys(graph) |> Enum.sort(),
node_count: map_size(graph["nodes"]),
class_count: map_size(graph["class_data"]),
sample_node: graph["nodes"] |> Enum.take(1)
}
end
defp json_snapshot(egraph, opts) do
Egglog.EGraph.snapshot!(egraph, Keyword.merge([render: :json], opts))
end
defp check_outputs(code) do
code
|> String.split("\n")
|> Enum.map(&String.trim/1)
|> Enum.reject(&(&1 == ""))
|> Enum.filter(&String.contains?(&1, "(check "))
|> Enum.map(fn line ->
text =
if String.starts_with?(line, "(fail ") do
"expected failure observed: #{line}"
else
"check succeeded: #{line}"
end
%{type: :check_success, text: text <> "\n"}
end)
end
end
{:ok, egraph} = Egglog.EGraph.new()From here on, the important API call is Egglog.EGraph.run(egraph, egglog). The e-graph is stateful, so every successful run mutates the same native session.
The remaining Tutorial functions are display helpers:
Tutorial.print_outputs(result, egglog) shows printed egglog output and friendly messages for successful checks.
Tutorial.summarize_egraph(egraph) asks the wrapper for a JSON snapshot and summarizes the current e-graph.
Tutorial.sample_json_snapshot(egraph) shows a small sample of the native JSON snapshot format.
Tutorial.render_snapshot(snapshot) turns an SVG snapshot into Livebook HTML when Graphviz is available.
Equality Saturation
In the upstream tutorial, this chapter begins by building an optimizer for a subset of linear algebra using egglog. We start with simple integer arithmetic expressions. Our initial DSL supports constants, variables, addition, and multiplication.
This Livebook version stays close to native egglog syntax. We write the object language explicitly:
2 * (x + 3) in the mathematical DSL becomes (Mul (Num 2) (Add (Var "x") (Num 3))) in egglog.
(datatype Expr
(Num i64)
(Var String)
(Add Expr Expr)
(Mul Expr Expr))This declaration introduces a first-order signature:
Num : i64 -> ExprVar : String -> ExprAdd : Expr x Expr -> ExprMul : Expr x Expr -> Expr
At this point there are no interesting Expr terms yet. We have described the grammar, but we have not inserted any arithmetic expressions into the e-graph.
egglog = """
(datatype Expr
(Num i64)
(Var String)
(Add Expr Expr)
(Mul Expr Expr))
"""
{:ok, _result} = Egglog.EGraph.run(egraph, egglog)
Tutorial.summarize_egraph(egraph)Now, let's define some simple expressions.
; expr1 = 2 * (x * 3)
(let expr1 (Mul (Num 2) (Mul (Var "x") (Num 3))))
; expr2 = 6 * x
(let expr2 (Mul (Num 6) (Var "x")))egglog = """
(let expr1 (Mul (Num 2) (Mul (Var "x") (Num 3))))
(let expr2 (Mul (Num 6) (Var "x")))
"""
{:ok, _result} = Egglog.EGraph.run(egraph, egglog)
Tutorial.summarize_egraph(egraph)Now the e-graph under discussion contains two named terms:
expr1 = 2 * (x * 3)expr2 = 6 * x
They are not equal yet. We have only inserted them.
The upstream web tutorial shows the e-graph in the browser. In Livebook we ask the wrapper for a comparable snapshot. The snapshot is produced by native egglog's serializer; the helper only asks Graphviz to render it.
Egglog.EGraph.snapshot!(egraph)
|> Tutorial.render_snapshot()The same native serializer can export JSON. This is more useful for programmatic inspection than for visualization. The top-level maps expose serialized e-nodes and e-class metadata.
Tutorial.sample_json_snapshot(egraph)The JSON summary is intentionally small: count e-nodes, count e-classes, and show which constructors currently appear. Think of e-nodes as operation-symbol applications and e-classes as congruence classes.
We can use the extract command to print the results of expressions in egglog.
(extract "Hello, world!") ; prints "Hello, world!"
(extract 42) ;; prints 42
(extract expr1) ;; prints (Mul (Num 2) (Mul (Var "x") (Num 3)))
(extract expr2) ;; prints (Mul (Num 6) (Var "x"))egglog = """
(extract "Hello, world!")
(extract 42)
(extract expr1)
(extract expr2)
"""
{:ok, result} = Egglog.EGraph.run(egraph, egglog)
Tutorial.print_outputs(result, egglog)
Tutorial.summarize_egraph(egraph)Right now extraction is boring: each e-class mostly contains the term we inserted. Later, after saturation, extraction will choose among many equivalent terms.
We can use the check commands to check properties of our e-graph.
Successful checks in egglog are assertions: they normally print nothing. The Livebook helper makes them visible by adding a small :check_success output. We also ask for an extraction and a JSON snapshot summary so the cell shows what the assertion is talking about.
(check (= expr1 (Mul x y)))
(extract expr1)egglog = """
(check (= expr1 (Mul x y)))
(extract expr1)
"""
{:ok, result} = Egglog.EGraph.run(egraph, egglog)
Tutorial.print_outputs(result, egglog)
Tutorial.summarize_egraph(egraph)This checks if expr1 is equivalent to some expression (Mul x y), where x and y are
variables that can be mapped to any expression in the e-graph.
There are two different uses of the word “variable” here:
(Var "x")is an object-language variable, represented by theVarconstructor in ourExprdatatype. It is part of the expression being optimized.- The bare names
xandyin(check (= expr1 (Mul x y)))are egglog pattern variables. They range over e-classes while egglog tries to match the pattern.
This distinction is easy to miss because both are printed as variables, but they live at different levels: object-language terms versus egglog patterns.
Checks can fail. For example the following check fails because expr1 is not equivalent to
(Add x y) for any x and y in the e-graph.
(fail (check (= expr1 (Add x y))))egglog = """
(fail (check (= expr1 (Add x y))))
"""
{:ok, result} = Egglog.EGraph.run(egraph, egglog)
Tutorial.print_outputs(result, egglog)fail is egglog's way to turn an expected failed check into a successful command. In a tutorial or test, this is usually better than letting a failed check abort the whole cell.
Let us define some rewrite rules over our small DSL.
(rewrite (Add x y) (Add y x))egglog = """
(rewrite (Add x y) (Add y x))
"""
{:ok, _result} = Egglog.EGraph.run(egraph, egglog)
Tutorial.summarize_egraph(egraph)This rule asserts that addition is commutative. Operationally: whenever the e-graph contains a term matching (Add x y), egglog may add (Add y x) and union the two e-classes.
Similarly, we can define the associativity rule for addition.
(rewrite (Add x (Add y z)) (Add (Add x y) z))egglog = """
(rewrite (Add x (Add y z)) (Add (Add x y) z))
"""
{:ok, _result} = Egglog.EGraph.run(egraph, egglog)
Tutorial.summarize_egraph(egraph)This rule says that a right-nested addition may be associated to the left. Notice that we are giving egglog equations, not an evaluator. A rewrite adds equality information; it does not destructively replace the old term.
There are two subtleties to rules in egglog.
- Defining a rule is different from running it. The addition commutativity rule is now registered, but it has not produced new equivalences yet. We can make that visible with a small probe term:
(let add_probe (Add (Var "x") (Num 3)))
(fail (check (= add_probe (Add (Num 3) (Var "x")))))egglog = """
(let add_probe (Add (Var "x") (Num 3)))
(fail (check (= add_probe (Add (Num 3) (Var "x")))))
"""
{:ok, result} = Egglog.EGraph.run(egraph, egglog)
Tutorial.print_outputs(result, egglog)
Tutorial.summarize_egraph(egraph)- Rules are not instantiated for every possible ground term in the free algebra. They are instantiated against terms currently represented in the e-graph. To show this cleanly, we run only the addition rules and then check a term that was never inserted as a named expression.
(run 1)
(check (= add_probe (Add (Num 3) (Var "x"))))
(fail (check (= (Add (Num -2) (Num 2)) (Add (Num 2) (Num -2)))))egglog = """
(run 1)
(check (= add_probe (Add (Num 3) (Var "x"))))
(fail (check (= (Add (Num -2) (Num 2)) (Add (Num 2) (Num -2)))))
"""
{:ok, result} = Egglog.EGraph.run(egraph, egglog)
Tutorial.print_outputs(result, egglog)
Tutorial.summarize_egraph(egraph)After the one-step run, add_probe is equivalent to its swapped form because add_probe was present and matched the rule. The unrelated -2 + 2 and 2 + -2 terms are not automatically equated merely because the rule exists: equality saturation is saturation of the current e-graph, not enumeration of every ground term over the signature.
Let's also define commutativity and associativity for multiplication.
(rewrite (Mul x y) (Mul y x))
(rewrite (Mul x (Mul y z)) (Mul (Mul x y) z))egglog = """
(rewrite (Mul x y) (Mul y x))
(rewrite (Mul x (Mul y z)) (Mul (Mul x y) z))
"""
{:ok, _result} = Egglog.EGraph.run(egraph, egglog)
Tutorial.summarize_egraph(egraph)egglog also defines a set of built-in functions over the primitive types, such as + and *.
(extract (+ 1 2))egglog = """
(extract (+ 1 2))
"""
{:ok, result} = Egglog.EGraph.run(egraph, egglog)
Tutorial.print_outputs(result, egglog)egglog supports operator overloading, so the same operator can be used with different types.
(extract (+ "1" "2"))
(extract (+ 1.0 2.0))egglog = """
(extract (+ "1" "2"))
(extract (+ 1.0 2.0))
"""
{:ok, result} = Egglog.EGraph.run(egraph, egglog)
Tutorial.print_outputs(result, egglog)
Tutorial.summarize_egraph(egraph)With primitives, we can define rewrite rules that talk about the semantics of operators. The following rules show constant folding over addition and multiplication.
(rewrite (Add (Num a) (Num b))
(Num (+ a b)))
(rewrite (Mul (Num a) (Num b))
(Num (* a b)))egglog = """
(rewrite (Add (Num a) (Num b))
(Num (+ a b)))
(rewrite (Mul (Num a) (Num b))
(Num (* a b)))
"""
{:ok, _result} = Egglog.EGraph.run(egraph, egglog)
Tutorial.summarize_egraph(egraph)The addition rules have already acted on add_probe, but the multiplication and constant-folding rules have not yet optimized expr1 or connected it to expr2. To run the rules we have defined so far, we can use the run command.
(run 10)
(extract expr1)
(extract expr2)egglog = """
(run 10)
(extract expr1)
(extract expr2)
"""
{:ok, result} = Egglog.EGraph.run(egraph, egglog)
Tutorial.print_outputs(result, egglog)
Tutorial.summarize_egraph(egraph)At this point the graph contains the original expressions and the alternatives created by associativity, commutativity, and constant folding. In particular, egglog can discover the path
2 * (x * 3) = (2 * 3) * x = 6 * x.
This snapshot shows equality saturation as growth of an equivalence structure, not as a single destructive rewrite path.
Egglog.EGraph.snapshot!(egraph)
|> Tutorial.render_snapshot()This tells egglog to run our rules for 10 iterations. More precisely, egglog runs the following pseudo code:
G = currentEgraph()
for i in 1..10:
for each rule r:
ms = r.find_matches(G)
for m in ms:
G = G.apply_rule(r, m)
G = rebuild(G)In other words, egglog computes all the matches for one iteration before making any
updates to the e-graph. This is in contrast to an evaluation model where rules are immediately
applied and the matches are obtained on demand over a changing e-graph.
Thus, after running the rules, you should see the e-graph has grown
a little bit, and that (Mul (Num 2) (Mul (Var "x") (Num 3))) and (Mul (Num 6) (Var "x")) are in the same E-class. In fact, we can check that
(check (= expr1 expr2))
(extract expr1)
(extract expr2)egglog = """
(check (= expr1 expr2))
(extract expr1)
(extract expr2)
"""
{:ok, result} = Egglog.EGraph.run(egraph, egglog)
Tutorial.print_outputs(result, egglog)
Tutorial.summarize_egraph(egraph)The check is the important mathematical assertion: the two named terms now lie in the same congruence class generated by the rewrite equations and the currently represented terms.
Now that expr1 and expr2 have been checked equal, the snapshot should show their representatives living in the same e-class.
Egglog.EGraph.snapshot!(egraph)
|> Tutorial.render_snapshot()Exercises
Distributivity. Define the distributivity rule for multiplication over addition. When you're done, the following code should pass.
; expr3 = 2 * (x + 3)
(let expr3 (Mul (Num 2) (Add (Var "x") (Num 3))))
; expr4 = 6 + 2 * x
(let expr4 (Add (Num 6) (Mul (Num 2) (Var "x"))))
(fail (check (= expr3 expr4)))
(run 10)
(check (= expr3 expr4))Try this yourself first. One possible solution is below. This cell continues from the same live e-graph session as the upstream chapter.
(rewrite (Mul a (Add b c))
(Add (Mul a b) (Mul a c)))
(let expr3 (Mul (Num 2) (Add (Var "x") (Num 3))))
(let expr4 (Add (Num 6) (Mul (Num 2) (Var "x"))))
(fail (check (= expr3 expr4)))
(run 10)
(check (= expr3 expr4))
(extract expr3)
(extract expr4)egglog = """
(rewrite (Mul a (Add b c))
(Add (Mul a b) (Mul a c)))
(let expr3 (Mul (Num 2) (Add (Var "x") (Num 3))))
(let expr4 (Add (Num 6) (Mul (Num 2) (Var "x"))))
(fail (check (= expr3 expr4)))
(run 10)
(check (= expr3 expr4))
(extract expr3)
(extract expr4)
"""
{:ok, result} = Egglog.EGraph.run(egraph, egglog)
Tutorial.print_outputs(result, egglog)
Tutorial.summarize_egraph(egraph)The first check is wrapped in fail because before running the new rule, expr3 and expr4 should not yet be known equal. After run 10, distributivity, commutativity, associativity, and constant folding are enough to put them in the same e-class.
Egglog.EGraph.snapshot!(egraph)
|> Tutorial.render_snapshot()Birewrite. Egglog has the birewrite keyword, which is similar to rewrite, but it allows rewriting in both directions.
Try making some of the rules above into birewrite rules. Looking at the rules we have defined:
- When is a bidirectional equation useful for exploration?
- When can it cause avoidable e-graph growth?
- When is a direction not even well-defined because one side has variables or primitive computations that do not make sense in reverse?
For example, commutativity is naturally bidirectional. Constant folding is not: (Num (+ a b)) does not uniquely determine a and b.
Take Away
Equality saturation is not normal-form rewriting. We did not choose one rewrite order and hope it was good. Instead, we accumulated many equivalent terms in one congruence structure, then asked extraction to choose a representative.
The finite e-graph matters. Rules fire against represented terms and represented patterns; they do not enumerate all possible ground terms over the signature.