A production-grade Datalog engine for Elixir.

ExDatalog implements bottom-up Datalog evaluation using semi-naive fixpoint computation. Programs are built with a declarative builder API, validated, compiled to an engine-neutral IR, and evaluated by a pluggable backend.

Hex.pm Hex.pm Hex.pm HexDocs.pm Coverage Status

New to Datalog? Read the What is Datalog? guide for a comprehensive introduction covering history, concepts, industry use cases, and how Datalog can serve as a knowledge layer for LLMs.

Why Datalog Matters

Datalog remains one of the most elegant ways to express:

  • recursive queries;
  • graph traversal;
  • rule-based systems;
  • derived knowledge;
  • dependency analysis;
  • provenance tracking;
  • temporal reasoning;
  • incremental computation.

It continues to influence modern databases, compilers, static analysis tools, knowledge graphs, authorization systems, and AI reasoning engines.

Features

  • Builder API for constructing programs (relations, facts, rules, constraints)
  • Constraint types: comparisons, arithmetic, type predicates, string predicates, membership
  • Negation with stratified evaluation
  • Recursive rules with semi-naive fixpoint evaluation
  • Pluggable storage backends: Storage.Map (default, on-heap) and Storage.ETS (off-heap, concurrent reads)
  • Provenance / derivation explain (explain: true)
  • Telemetry integration (:telemetry events for query lifecycle)
  • Deterministic: same program + same facts = same result regardless of backend
  • 601 tests, 0 failures, credo clean, dialyzer clean

Installation

Add ex_datalog to your dependencies in mix.exs:

def deps do
  [
    {:ex_datalog, "~> 0.2.0"}
  ]
end

Quick Start

Transitive closure

The classic Datalog example: compute all ancestors from parent facts.

alias ExDatalog
alias ExDatalog.{Program, Rule, Atom, Term}

{:ok, result} =
  Program.new()
  |> Program.add_relation("parent", [:atom, :atom])
  |> Program.add_relation("ancestor", [:atom, :atom])
  |> Program.add_fact("parent", [:alice, :bob])
  |> Program.add_fact("parent", [:bob, :carol])
  |> Program.add_fact("parent", [:carol, :dave])
  |> Program.add_rule(
       Rule.new(
         Atom.new("ancestor", [Term.var("X"), Term.var("Y")]),
         [{:positive, Atom.new("parent", [Term.var("X"), Term.var("Y")])}]
       )
     )
  |> Program.add_rule(
       Rule.new(
         Atom.new("ancestor", [Term.var("X"), Term.var("Z")]),
         [
           {:positive, Atom.new("parent", [Term.var("X"), Term.var("Y")])},
           {:positive, Atom.new("ancestor", [Term.var("Y"), Term.var("Z")])}
         ]
       )
     )
  |> ExDatalog.query()

result.relations["ancestor"]
#=> MapSet.new([{:alice, :bob}, {:bob, :carol}, {:carol, :dave},
#=>             {:alice, :carol}, {:bob, :dave}, {:alice, :dave}])

Arithmetic constraints

Compute derived values in rules. Find numbers and their doubles:

{:ok, result} =
  Program.new()
  |> Program.add_relation("number", [:integer])
  |> Program.add_relation("doubled", [:integer, :integer])
  |> Program.add_fact("number", [1])
  |> Program.add_fact("number", [2])
  |> Program.add_fact("number", [3])
  |> Program.add_rule(
       Rule.new(
         Atom.new("doubled", [Term.var("X"), Term.var("Y")]),
         [{:positive, Atom.new("number", [Term.var("X")])}],
         [Constraint.add(Term.var("X"), {:const, 2}, Term.var("Y"))]
       )
     )
  |> ExDatalog.query()

result.relations["doubled"]
#=> MapSet.new([{1, 3}, {2, 4}, {3, 5}])

Type predicates and membership

Filter bindings by Elixir type or list membership:

# Keep only integer values from a mixed-type relation
Rule.new(
  Atom.new("int_value", [Term.var("X")]),
  [{:positive, Atom.new("value", [Term.var("X")])}],
  [Constraint.type_integer(Term.var("X"))]
)

# Keep only "primary" colors
Rule.new(
  Atom.new("primary_color", [Term.var("X")]),
  [{:positive, Atom.new("color", [Term.var("X")])}],
  [Constraint.member(Term.var("X"), {:const, [:red, :blue, :green]})]
)

# Keep only strings that start with "hel"
Rule.new(
  Atom.new("hello_word", [Term.var("X")]),
  [{:positive, Atom.new("word", [Term.var("X")])}],
  [Constraint.starts_with(Term.var("X"), {:const, "hel"})]
)

Negation

Use negative body atoms with stratified evaluation. Find people who are not parents:

Program.add_rule(
  Rule.new(
    Atom.new("childless", [Term.var("X")]),
    [
      {:positive, Atom.new("person", [Term.var("X")])},
      {:negative, Atom.new("parent", [Term.var("X"), Term.wildcard()])}
    ]
  )
)

ETS backend

For workloads exceeding ~100K facts, use the ETS backend for off-heap storage and reduced GC pressure:

{:ok, result} = ExDatalog.query(program, storage: ExDatalog.Storage.ETS)

# Or with options:
{:ok, result} =
  ExDatalog.query(program,
    storage: ExDatalog.Storage.ETS,
    storage_opts: [access: :public, write_concurrency: true]
  )

Provenance / explain

Track which rule derived each fact:

{:ok, result} = ExDatalog.query(program, explain: true)
result.provenance.fact_origins
#=> %{"ancestor" => %{{:alice, :bob} => "rule_0", ...}, ...}

Telemetry

ExDatalog emits :telemetry events at the start, end, and on exceptions during query evaluation:

:telemetry.attach("my-handler", [:ex_datalog, :query, :stop], &handle_stop/4, nil)

def handle_stop(_event, measurements, metadata, _config) do
  IO.puts("Query completed in #{measurements.duration} µs (#{measurements.iterations} iterations)")
  IO.puts("Storage backend: #{metadata.storage_type}")
end

Architecture

ExDatalog.Program  (builder)
        |
        v
ExDatalog.Validator  (structural + semantic + stratification)
        |
        v
ExDatalog.Compiler  (AST -> IR)
        |
        v
ExDatalog.Engine  (behaviour)
        |
        v
ExDatalog.Engine.Naive  (semi-naive fixpoint)
        |
        v  uses
ExDatalog.Storage.Map | ExDatalog.Storage.ETS
        |
        v
ExDatalog.Result

The Storage behaviour defines the contract for pluggable backends. Storage.Map uses on-heap Maps and MapSets; Storage.ETS uses per-relation ETS tables with wrapped-tuple keys for O(1) membership and correct set semantics. Both guarantee deterministic output order from stream/2, so identical programs produce identical results on either backend.

Constraint evaluation dispatches through ExDatalog.Constraint.evaluate/3, which routes to the appropriate module (Comparison, Arithmetic, Type, StringPredicate, Membership) based on the constraint's op field. The evaluation context carries the storage backend's capabilities, enabling future constraint types to inspect them.

Storage backends

BackendStorageBest forConcurrent reads
Storage.MapOn-heap Maps/MapSets<100K factsNo
Storage.ETSPer-relation ETS tables>100K facts, GC reliefYes (with access: :public)

Both backends produce identical output for the same program and facts. The portability test suite verifies this for transitive closure, arithmetic, comparison, negation, type predicates, string predicates, and membership constraints.

See docs/storage_backends.md for the full backend contract, options, and capability model.

Constraints

See docs/constraints.md for the complete constraint reference.

CategoryOpsBinds result?
Comparisongt, lt, gte, lte, eq, neqNo (filter)
Arithmeticadd, sub, mul, divYes
Type predicatetype_integer, type_binary, type_atomNo (filter)
String predicatestarts_with, containsNo (filter)
MembershipmemberNo (filter)

Documentation

  • What is Datalog? — introduction, history, Prolog comparison, industry use cases, LLM integration
  • Constraints — constraint types, evaluation, and the dispatch model
  • Storage Backends — Map vs ETS, options, capabilities, determinism guarantee
  • API reference — full module and function documentation

Generate docs locally:

mix docs

Further Reading on Datalog

Datalog sits at the intersection of databases, logic programming, graph reasoning, and knowledge systems.
The following references are highly recommended for understanding both the theory and practice behind modern Datalog systems.

Modern Datalog Systems

  • DataScript

    • Immutable in-memory Datalog database for Clojure and JavaScript.
    • Excellent examples of practical Datalog query design.
  • XTDB

    • Bitemporal document database with a powerful Datalog query engine.
    • Demonstrates how Datalog can power modern data-intensive systems.
  • Soufflé

    • High-performance Datalog engine widely used in static analysis and security research.
    • Excellent resource for understanding scalable and compiled Datalog execution.
  • Differential Datalog (DDlog)

    • Incremental and streaming-oriented Datalog system built on differential dataflow.
    • Useful for understanding reactive and continuously updated derived state.

Foundational Theory


Talks and Presentations


Roadmap

VersionDescription
v0.2.0ETS backend, constraint behaviour, type/string/membership predicates, capabilities, provenance, telemetry
v0.3.0Aggregation (count, sum, min, max), general predicates as deterministic BEAM callbacks
v0.4.0Magic sets / demand-driven evaluation, external solver adapter (experimental Z3/Soufflé)
v1.0.0Stable public API, hardened production semantics

License

MIT