ExDatalog.Storage.Map (ExDatalog v0.2.0)

Copy Markdown View Source

Map-based storage implementation using Maps and MapSets.

This is the default storage backend. It uses immutable Elixir data structures (Maps and MapSets) which are thread-safe and easy to inspect, but may suffer GC pressure for very large fact sets (>100K tuples). For workloads exceeding 100K facts, use Storage.ETS which provides off-heap storage and better concurrent read scalability.

Internal structure

%{
  relations: %{
    "parent" => MapSet.new([{:alice, :bob}, {:carol, :dave}]),
    ...
  },
  indexes: %{
    {"parent", [1]} => %{
      {:bob} => MapSet.new([{:alice, :bob}]),
      {:dave} => MapSet.new([{:carol, :dave}])
    }
  },
  schemas: %{
    "parent" => %{arity: 2, types: [:atom, :atom]},
    ...
  }
}

Index keys are always tuples, even for single-column indexes. This avoids ambiguity between a plain value and a 1-tuple.

Summary

Functions

Builds a hash index on the specified columns for a relation.

Returns the capabilities of this storage backend.

Retrieves tuples matching an indexed key.

Initializes storage for the given relation schemas.

Inserts a single tuple into a relation.

Inserts an enumerable of tuples into a relation.

Checks whether a tuple exists in a relation.

Returns a sorted list of all relation names in the storage.

Returns the number of tuples stored in a relation.

Returns all tuples in a relation as a deterministically sorted list.

Releases resources held by this storage backend.

Incrementally merges delta tuples into an existing index.

Types

index()

@type index() :: %{required(ExDatalog.Storage.key_values()) => MapSet.t()}

t()

@type t() :: %ExDatalog.Storage.Map{
  indexes: %{
    required({ExDatalog.Storage.relation_name(), ExDatalog.Storage.index_key()}) =>
      index()
  },
  relations: %{required(ExDatalog.Storage.relation_name()) => MapSet.t()},
  schemas: ExDatalog.Storage.schemas()
}

Functions

build_index(state, relation, columns)

Builds a hash index on the specified columns for a relation.

Creates a lookup structure mapping projected column values to sets of tuples. Raises ArgumentError if relation is not in the schema.

capabilities(map)

@spec capabilities(t()) :: ExDatalog.Capabilities.t()

Returns the capabilities of this storage backend.

The Map backend supports arithmetic, comparison, type-check, and string constraints, provenance, but does not support indexed lookup or concurrent reads. It uses on-heap immutable Elixir data structures.

get_indexed(map, relation, columns, key)

Retrieves tuples matching an indexed key.

Returns tuples from a pre-built index that match the given key on the specified columns. Returns [] if the index or key does not exist.

Must call build_index/3 before calling get_indexed/4 for the same relation and columns.

init(schemas)

@spec init(ExDatalog.Storage.schemas()) :: t()

Initializes storage for the given relation schemas.

schemas is a map from relation name to %{arity: n, types: [atom()]}. Creates an empty MapSet for each relation and returns the initial state.

Raises if two schemas share the same relation name (should be caught by validation before reaching storage).

init(schemas, opts)

@spec init(
  ExDatalog.Storage.schemas(),
  keyword()
) :: t()

insert(state, relation, tuple)

Inserts a single tuple into a relation.

Idempotent: inserting a tuple that already exists is a no-op (MapSet semantics). Raises ArgumentError if relation is not in the schema.

insert_many(state, relation, tuples)

@spec insert_many(t(), ExDatalog.Storage.relation_name(), Enumerable.t()) :: t()

Inserts an enumerable of tuples into a relation.

More efficient than repeated insert/3 calls for bulk loading, as it reduces Map updates. Idempotent for individual tuples. Raises ArgumentError if relation is not in the schema.

member?(map, relation, tuple)

Checks whether a tuple exists in a relation.

Returns true if tuple is a member of the relation's MapSet, false otherwise. Returns false if relation is unknown.

relations(map)

@spec relations(t()) :: [ExDatalog.Storage.relation_name()]

Returns a sorted list of all relation names in the storage.

Useful for debugging and introspection.

size(map, relation)

Returns the number of tuples stored in a relation.

Returns 0 if relation is unknown.

stream(map, relation)

Returns all tuples in a relation as a deterministically sorted list.

Used by the engine to iterate over a relation's facts during join evaluation. Returns [] if relation is unknown.

The output is sorted to guarantee deterministic iteration order regardless of the underlying data structure's internal ordering. This ensures that identical programs with identical facts produce identical results across all storage backends.

teardown(map)

@spec teardown(t()) :: :ok

Releases resources held by this storage backend.

For the Map backend, this is a no-op since all data is held in immutable Elixir data structures that are garbage-collected automatically.

update_index(state, relation, columns, delta)

Incrementally merges delta tuples into an existing index.

If no index exists for the given relation and columns, one is built from the current relation data first, then the delta is merged in.

Raises ArgumentError if relation is not in the schema.