prolly v0.2.0 Prolly.HyperLogLog
Use HyperLogLog when you want to estimate the number of distinct elements in a stream using sublinear memory.
m = the number of registers, >= 16
a = the "alpha" corrective factor, which varies with m
b = the number of least-significant bits of the hash that go toward the register index. Must equal log2(m); e.g. 64 registers means the 6 rightmost bits are the ones that determine the register
alpha_m_squared = a * m * m, memoized
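The relationship between these fields can be sketched as follows. This is a minimal illustration, not Prolly's actual implementation: the module name `HLLParamsSketch` and its functions are hypothetical, and the alpha constants are the standard ones from the HyperLogLog literature (0.709 for m = 64 matches the value shown in the examples on this page).

```elixir
defmodule HLLParamsSketch do
  # Hypothetical helper module, for illustration only.
  import Bitwise

  # Standard alpha constants for small m; closed-form approximation for m >= 128.
  def alpha(16), do: 0.673
  def alpha(32), do: 0.697
  def alpha(64), do: 0.709
  def alpha(m) when m >= 128, do: 0.7213 / (1 + 1.079 / m)

  # b = log2(m); m must be a power of two so that b is a whole number of bits.
  def index_bits(m) when m >= 16 do
    b = round(:math.log2(m))

    if (1 <<< b) == m do
      b
    else
      raise(ArgumentError, "m must be a power of two")
    end
  end

  # The b least-significant bits of the hash select the register (0-based).
  def register_index(hash, m) do
    hash &&& ((1 <<< index_bits(m)) - 1)
  end

  # a * m * m, computed once so the estimator does not recompute it.
  def alpha_m_squared(m), do: alpha(m) * m * m
end

HLLParamsSketch.index_bits(64)
# => 6
HLLParamsSketch.register_index(742, 64)
# => 38 (742 is 0b1011100110; its 6 rightmost bits are 0b100110)
```

With m = 64, `alpha_m_squared/1` evaluates to approximately 2904.064, the same value the `new` examples on this page show.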
Functions
Get the count-distinct from a HyperLogLog
Examples
iex> require Prolly.HyperLogLog, as: HLL
iex> hll = HLL.new(64, fn(value) -> :erlang.phash2(value) end)
iex> Enum.reduce(1..5800, hll, fn(val, acc) -> HLL.update(acc, val) end) |> HLL.count
5813
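For readers curious how a count can be derived from the registers, here is a rough sketch of the textbook HyperLogLog estimator. It is an assumption that Prolly follows the same formula; the module `HLLCountSketch` is hypothetical, the registers are a plain list rather than a `Vector`, and the large-range correction is omitted for brevity.

```elixir
defmodule HLLCountSketch do
  # Hypothetical module: the textbook estimator, not necessarily Prolly's code.

  # registers: list of non-negative integers, one per register
  # alpha_m_squared: a * m * m for this register count
  def estimate(registers, alpha_m_squared) do
    m = length(registers)

    # Sum of 2^(-register) over all registers; dividing alpha_m_squared by it
    # gives the raw estimate (a scaled harmonic mean).
    inverse_sum =
      registers
      |> Enum.map(fn r -> :math.pow(2, -r) end)
      |> Enum.sum()

    raw = alpha_m_squared / inverse_sum
    zeros = Enum.count(registers, fn r -> r == 0 end)

    # Small-range correction: fall back to linear counting while some
    # registers are still empty and the raw estimate is small.
    if raw <= 2.5 * m and zeros > 0 do
      m * :math.log(m / zeros)
    else
      raw
    end
  end
end

empty = List.duplicate(0, 64)
HLLCountSketch.estimate(empty, 0.709 * 64 * 64)
# => 0.0
```

This sketch returns a float, while the example above returns an integer, so the library presumably rounds its result.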
Create a new HyperLogLog
Examples
iex> require Prolly.HyperLogLog, as: HLL
iex> HLL.new(64, fn(value) -> :erlang.phash2(value) end).m
64
iex> require Prolly.HyperLogLog, as: HLL
iex> HLL.new(64, fn(value) -> :erlang.phash2(value) end).a
0.709
iex> require Prolly.HyperLogLog, as: HLL
iex> HLL.new(64, fn(value) -> :erlang.phash2(value) end).b
6
iex> require Prolly.HyperLogLog, as: HLL
iex> HLL.new(64, fn(value) -> :erlang.phash2(value) end).alpha_m_squared
2904.064
iex> require Prolly.HyperLogLog, as: HLL
iex> HLL.new(64, fn(value) -> :erlang.phash2(value) end).registers |> Vector.to_list
Enum.map(1..64, fn _ -> 0 end)
Update a HyperLogLog
Examples
# with a String
iex> require Prolly.HyperLogLog, as: HLL
iex> hll = HLL.new(64, fn(value) -> :erlang.phash2(value) end)
iex> HLL.update(hll, "hi").registers |> Vector.to_list
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
# with any term
iex> require Prolly.HyperLogLog, as: HLL
iex> hll = HLL.new(64, fn(value) -> :erlang.phash2(value) end)
iex> HLL.update(hll, 4242).registers |> Vector.to_list
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]
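The update step shown above can be sketched roughly as follows: the low b bits of the hash pick a register, and the register keeps the largest "rank" seen so far. Everything here is an assumption for illustration: the module `HLLUpdateSketch` is hypothetical, registers are a plain list, the hash is treated as 27 bits wide (the default range of `:erlang.phash2/1`), and rank is taken as the position of the leftmost 1-bit in the remaining bits. Prolly may use a different width or rank convention, so this sketch is not expected to reproduce the register values shown above.

```elixir
defmodule HLLUpdateSketch do
  # Hypothetical module, for illustration only.
  import Bitwise

  # Assumed hash width: :erlang.phash2/1 returns values in 0..2^27-1.
  @hash_bits 27

  # registers: list of integers; b: number of index bits; hash: integer hash.
  def update(registers, b, hash) do
    index = hash &&& ((1 <<< b) - 1)
    rest = hash >>> b
    new_rank = rank(rest, @hash_bits - b)

    # A register only ever grows: keep the maximum rank observed.
    List.update_at(registers, index, fn old -> max(old, new_rank) end)
  end

  # Rank = 1-based position of the leftmost 1-bit within a `width`-bit value.
  # An all-zero value gets width + 1.
  def rank(0, width), do: width + 1
  def rank(rest, width), do: width - bit_length(rest) + 1

  defp bit_length(n), do: n |> Integer.digits(2) |> length()
end

registers = List.duplicate(0, 64)
updated = HLLUpdateSketch.update(registers, 6, 742)
Enum.at(updated, 38)
# => 18 (742 >>> 6 is 11, which needs 4 of the 21 remaining bits)
```

Because each register stores only a maximum, updating with the same value twice leaves the registers unchanged, which is what makes the structure suitable for counting distinct elements.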