View Source README

Hex.pm Version

Elixir implementation of a pairing heap.

Background

A heap is a tree data structure that ensures that the root always has the minimum or maximum key value, depending on the desired sort order. The heap was first introduced in the context of efficient sorting (see heapsort), but it can also be used to implement a priority queue.

Each node of a heap has a key on which the data is ordered, as well as a collection of child nodes. Every heap node satisfies the heap property that the node key is either less than or equal to (a min-heap) or greater than or equal to (a max-heap) the keys of its children. In a min-heap, the root node has the smallest key in the tree, while for a max-heap, the root key is the largest.

A pairing heap is one of a handful of commonly-used heap data structures. It has excellent performance characteristics, with O(1) time complexity for an insert and for viewing the root node, and O(log n) amortized time complexity for removing the root node and rebuilding the heap. The algorithms for maintaining a pairing heap have especially simple implementations in a functional language, which is one reason for doing this in Elixir.

Usage

PairingHeap provides operations for the creation of a heap, for insertion, retrieval, and deletion of elements, and for merging multiple heaps.

To create an empty min-heap that compares keys with &<=/2, use PairingHeap.new/1:

iex> PairingHeap.new(:min)
#PairingHeap<root: :empty, size: 0, mode: :min>

An empty heap is indicated with root: :empty.

The PairingHeap struct is not intended to be used directly. Use the public API when working with an instance of PairingHeap.

The data contained in each node of a heap is a key-value pair expressed as the tuple {key, value}. A key is what is being ordered in the heap, while the value can be data of any type. Note that a heap is a not a set; duplicate keys, values, and key-value pairs can be present.

Multiple key-value pairs can be added to a heap at the time of creation with PairingHeap.new/2:

iex> PairingHeap.new(:min, [{2, :b}, {1, :a}])
#PairingHeap<root: {1, :a}, size: 2, mode: :min>

The first argument to PairingHeap.new/1 and PairingHeap.new/2 is the mode of the heap, which can be one of

  • :min - a min-heap, where keys are compared with &<=/2
  • :max - a max-heap, where keys are compared with &>=/2
  • {:min | :max, module} - a min- or max-heap, where the module must implement a compare function with signature (any, any -> :lt | :eq | :gt)

For example, a min-heap with Date keys would be initialized as

iex> PairingHeap.new({:min, Date}, [{~D[2023-09-01], :b}, {~D[2023-08-01], :a}])
#PairingHeap<root: {~D[2023-08-01], :a}, size: 2, mode: {:min, Date}>

To add a single key-value pair to a heap, use PairingHeap.put/3:

iex> PairingHeap.new(:min) |> PairingHeap.put(1, :a)
#PairingHeap<root: {1, :a}, size: 1, mode: :min>

PairingHeap.peek/1 is used to obtained the key-value pair of the root without modifying the heap:

iex> PairingHeap.new(:min, [{1, :a}]) |> PairingHeap.peek()
{:ok, {1, :a}}

Extraction and removal of the root node is accomplished with PairingHeap.pop/1:

iex> {:ok, {key, value}, heap} = 
...>   PairingHeap.new(:min, [{1, :a}]) 
...>   |> PairingHeap.pop()
iex> {key, value}
{1, :a}
iex> heap
#PairingHeap<root: :empty, size: 0, mode: :min>

Both PairingHeap.peek/1 and PairingHeap.pop/1 return :error if the heap is empty.

PairingHeap.pull/2 pops zero or more key-value pairs from the heap and returns the updated heap:

iex> {pairs, heap} =
...>   PairingHeap.new(:min, [{3, :c}, {1, :a}, {2, :b}])
...>   |> PairingHeap.pull(2)
iex> pairs
[{1, :a}, {2, :b}]
iex> heap
#PairingHeap<root: {3, :c}, size: 1, mode: :min>

If the number in PairingHeap.pull/2 is greater than the heap size, all of the key-value pairs are returned, along with an empty heap.

Two heaps with the same mode can be merged with PairingHeap.merge/2:

iex> heap1 = PairingHeap.new(:min, [{2, :b}])
iex> heap2 = PairingHeap.new(:min, [{1, :a}])
iex> PairingHeap.merge(heap1, heap2)
#PairingHeap<root: {1, :a}, size: 2, mode: :min>

Similarly, one or more heaps can be merged with PairingHeap.merge/1, which takes a list of heaps as its sole argument:

iex> heap1 = PairingHeap.new(:min, [{2, :b}])
iex> heap2 = PairingHeap.new(:min, [{1, :a}])
iex> PairingHeap.merge([heap1, heap2])
#PairingHeap<root: {1, :a}, size: 2, mode: :min>

To quickly dump the key-values pairs in the heap without guarantees on the ordering, use PairingHeap.dump/1:

iex> h = PairingHeap.new(:min, [{2, :b}, {1, :a}, {3, :c}])
iex> PairingHeap.dump(h)
[{1, :a}, {3, :c}, {2, :b}]

This is faster than Enum.to_list/1 (see below).

Enumerable and Collectable

PairingHeap implements the Enumerable and Collectable protocols, meaning that functions from Enum and Stream are available.

Enum.into is a simple way to add a batch of key-value pairs to a heap:

iex> heap = PairingHeap.new(:min, [{2, :b}])
iex> [{3, :c}, {1, :a}] |> Enum.into(heap)
#PairingHeap<root: {1, :a}, size: 3, mode: :min>

Use Enum.to_list to list the entire contents of the heap in key order:

iex> heap = PairingHeap.new(:min, [{3, :c}, {1, :a}, {2, :b}])
iex> heap |> Enum.to_list()
[{1, :a}, {2, :b}, {3, :c}]

map, filter, and reduce are also available for PairingHeap. But beware:

Any operation that pops k elements from a heap will have O(k log n) complexity on average.

Miscellaneous details

  • Like most other heap algorithms, a pairing heap does not produce a stable sort.

  • In this implementation of a pairing heap, a new item whose key equals the key of the root node will become the new root node.

iex> heap = PairingHeap.new(:min, [{2, :b}, {1, :a}])
iex> heap |> PairingHeap.put(1, :aa) |> Enum.to_list()
[{1, :aa}, {1, :a}, {2, :b}]
  • Emptying a PairingHeap with 1,000,000 randomly inserted items takes a couple seconds. For PairingHeap sizes less than 1,000,000 the time to pop an item is of order 1 microsecond.