prolly v0.1.0 Prolly.CountMinSketch

Use CountMinSketch when you want to count and query the approximate number of occurrences of values in a stream using sublinear memory.

For example, “how many times has the string foo been in the stream so far?” is a reasonable question for CountMinSketch.

A CountMinSketch never undercounts, but it may overcount: the count it reports for a given value can be higher than that value's real number of occurrences.
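
The overcount-only behaviour follows from how a count-min sketch is built. The module below is a minimal, self-contained sketch of the general technique, not Prolly's actual implementation: the name TinySketch, its map-based rows, and the index/3 helper are all assumptions made for illustration. Each value increments one counter per hash function, and a query returns the smallest of those counters. Collisions with other values can only add to a counter, so the minimum is never below the true count.

```elixir
# Illustrative only: TinySketch is a hypothetical module, not part of Prolly.
defmodule TinySketch do
  # One row of counters per hash algorithm, `width` counters per row.
  # Rows are stored as sparse maps from column index to count.
  def new(width, hashes) do
    %{width: width, hashes: hashes, rows: Enum.map(hashes, fn _ -> %{} end)}
  end

  # Increment exactly one counter in every row.
  def update(sketch, value) do
    rows =
      sketch.hashes
      |> Enum.zip(sketch.rows)
      |> Enum.map(fn {hash, row} ->
        Map.update(row, index(hash, value, sketch.width), 1, &(&1 + 1))
      end)

    %{sketch | rows: rows}
  end

  # The estimate is the smallest counter the value maps to.
  def get_count(sketch, value) do
    sketch.hashes
    |> Enum.zip(sketch.rows)
    |> Enum.map(fn {hash, row} ->
      Map.get(row, index(hash, value, sketch.width), 0)
    end)
    |> Enum.min()
  end

  # Hash the value and reduce the digest to a column index.
  defp index(hash, value, width) do
    hash
    |> :crypto.hash(value)
    |> :binary.decode_unsigned()
    |> rem(width)
  end
end

sketch =
  TinySketch.new(5, [:sha, :md5, :sha256])
  |> TinySketch.update("hi")
  |> TinySketch.update("hi")

IO.inspect(TinySketch.get_count(sketch, "hi"))
# prints 2

# With a single column every value collides, so an unseen value is overcounted.
tiny = TinySketch.new(1, [:sha]) |> TinySketch.update("hi")
IO.inspect(TinySketch.get_count(tiny, "yo"))
# prints 1, although "yo" was never added
```

Wider rows make collisions less likely, and more rows make it less likely that every counter for a value is inflated at once.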

Summary

Functions

get_count(count_min_sketch, value)
Query a sketch for the count of a given value

new(width, depth, hashes)
Create a CountMinSketch

union(sketch, count_min_sketch)
Union two sketches by cell-wise adding their counts

update(sketch, value)
Update a sketch with a value

Types

t()
t() :: Prolly.CountMinSketch

Functions

get_count(count_min_sketch, value)

Query a sketch for the count of a given value

Examples

iex> require Prolly.CountMinSketch, as: Sketch
iex> Sketch.new(3, 5, [:sha, :md5, :sha256]) |> Sketch.update("hi") |> Sketch.get_count("hi")
1

iex> require Prolly.CountMinSketch, as: Sketch
iex> sketch = Sketch.new(3, 5, [:sha, :md5, :sha256])
...> |> Sketch.update("hi")
...> |> Sketch.update("hi")
...> |> Sketch.update("hi")
iex> Sketch.get_count(sketch, "hi")
3

new(width, depth, hashes)

Create a CountMinSketch

Examples

iex> require Prolly.CountMinSketch, as: Sketch
iex> Sketch.new(3, 5, [:sha, :md5, :sha256]).matrix |> Enum.map(&Vector.to_list(&1))
[[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]

iex> require Prolly.CountMinSketch, as: Sketch
iex> Sketch.new(3, 5, [:sha, :md5, :sha256]).hashes
[:sha, :md5, :sha256]

union(sketch, count_min_sketch)

Union two sketches by cell-wise adding their counts

Examples

iex> require Prolly.CountMinSketch, as: Sketch
iex> sketch1 = Sketch.new(3, 5, [:sha, :md5, :sha256]) |> Sketch.update("hi")
iex> sketch2 = Sketch.new(3, 5, [:sha, :md5, :sha256]) |> Sketch.update("hi")
iex> Sketch.union(sketch1, sketch2).matrix |> Enum.map(&Vector.to_list(&1))
[[0, 2, 0, 0, 0], [0, 0, 2, 0, 0], [0, 2, 0, 0, 0]]
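
Cell-wise addition can be illustrated on plain nested lists. This is only a sketch of the idea, assuming both sketches share the same dimensions and hash functions (otherwise the cells do not line up); it uses ordinary lists rather than the Vector rows shown above, and Enum.zip_with/3 requires Elixir 1.12 or later.

```elixir
# Illustrative only: plain lists stand in for the sketch matrices.
a = [[0, 1, 0, 0, 0], [0, 0, 1, 0, 0], [0, 1, 0, 0, 0]]
b = [[0, 1, 0, 0, 0], [0, 0, 1, 0, 0], [0, 1, 0, 0, 0]]

# Pair up rows, then add the paired cells within each row.
union =
  Enum.zip_with(a, b, fn row_a, row_b ->
    Enum.zip_with(row_a, row_b, &+/2)
  end)

IO.inspect(union)
# prints [[0, 2, 0, 0, 0], [0, 0, 2, 0, 0], [0, 2, 0, 0, 0]]
```

Because each cell is a simple sum, querying the combined matrix gives the same answer as a single sketch that had seen both streams.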

update(sketch, value)

Update a sketch with a value

Examples

iex> require Prolly.CountMinSketch, as: Sketch
iex> sketch = Sketch.new(3, 5, [:sha, :md5, :sha256]) |> Sketch.update("hi")
iex> sketch.matrix |> Enum.map(&Vector.to_list(&1))
[[0, 1, 0, 0, 0], [0, 0, 1, 0, 0], [0, 1, 0, 0, 0]]