prolly v0.1.0 Prolly.BloomFilter
Use a Bloom filter when you want to keep track of whether you have seen a given value or not.
For example, the question “have I seen the string foo so far in the stream?” is a reasonable question for a Bloom filter.
Specifically, a Bloom filter can tell you two things:
- When a value may be in a set.
- When a value is definitely not in a set.
Note the asymmetry: a negative answer is always correct, but a positive answer may be a false positive. A Bloom filter cannot tell you that a value is definitely in a set.
Summary
Functions
false_positive_rate/3
Find the false positive rate for a given filter size, expected input size, and number of hash functions.
new/2
Create a Bloom filter.
optimal_number_of_hashes/2
Find the optimal number of hash functions for a given filter size and expected input size.
possible_member?/2
Test whether a value might be in a Bloom filter.
update/2
Add a value to a Bloom filter.
Types
Functions
false_positive_rate/3
Find the false positive rate for a given filter size, expected input size, and number of hash functions.
Examples
iex> alias Prolly.BloomFilter
iex> BloomFilter.false_positive_rate(10000, 3000, 3) |> Float.round(2)
0.21
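The value in the example above is consistent with the standard Bloom filter approximation (1 - e^(-k*n/m))^k, where m is the filter size in bits, n is the expected number of inserted values, and k is the number of hash functions. Whether Prolly computes the rate with this approximation or with the exact form is not stated here, so treat the following as a minimal sketch under that assumption. The module name FalsePositiveSketch is hypothetical and not part of Prolly.

```elixir
defmodule FalsePositiveSketch do
  # Hypothetical helper, not part of Prolly.
  # m: filter size in bits
  # n: expected number of inserted values
  # k: number of hash functions
  # Assumes the standard approximation (1 - e^(-k*n/m))^k.
  def rate(m, n, k) do
    :math.pow(1 - :math.exp(-k * n / m), k)
  end
end

# Same arguments as the doctest above.
FalsePositiveSketch.rate(10_000, 3_000, 3)
|> Float.round(2)
|> IO.inspect()
# prints 0.21
```

With the filter size and input size fixed, the rate first falls and then rises again as k grows, which is why an optimal number of hash functions exists.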
new/2
Create a Bloom filter.
Examples
iex> alias Prolly.BloomFilter
iex> BloomFilter.new(20, [:md5, :sha, :sha256]).filter |> Enum.to_list()
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
iex> alias Prolly.BloomFilter
iex> BloomFilter.new(20, [:md5, :sha, :sha256]).hashes
[:md5, :sha, :sha256]
iex> alias Prolly.BloomFilter
iex> BloomFilter.new(20, Enum.into([:md5, :sha, :sha256], MapSet.new())).hashes
#MapSet<[:md5, :sha, :sha256]>
optimal_number_of_hashes/2
Find the optimal number of hash functions for a given filter size and expected input size.
Examples
iex> alias Prolly.BloomFilter
iex> BloomFilter.optimal_number_of_hashes(10000, 1000) |> round()
7
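The result above agrees with the textbook optimum k = (m / n) * ln 2, where m is the filter size in bits and n is the expected number of inserted values. The sketch below assumes that formula; OptimalHashesSketch is a hypothetical name, and Prolly's own implementation may differ in detail.

```elixir
defmodule OptimalHashesSketch do
  # Hypothetical helper, not part of Prolly.
  # m: filter size in bits, n: expected number of inserted values.
  # Assumes the textbook optimum k = (m / n) * ln 2.
  def optimal(m, n) do
    m / n * :math.log(2)
  end
end

# Same arguments as the doctest above; the raw value is about 6.93.
OptimalHashesSketch.optimal(10_000, 1_000)
|> round()
|> IO.inspect()
# prints 7
```

The formula returns a float, so a caller has to round it to a whole number of hash functions, as the doctest does.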
possible_member?/2
Test whether a value might be in a Bloom filter.
Examples
iex> alias Prolly.BloomFilter
iex> bf = BloomFilter.new(20, [:md5, :sha, :sha256])
iex> bf = BloomFilter.update(bf, "hi")
iex> BloomFilter.possible_member?(bf, "hi")
true
iex> alias Prolly.BloomFilter
iex> bf = BloomFilter.new(20, [:md5, :sha, :sha256])
iex> bf = BloomFilter.update(bf, "hi")
iex> BloomFilter.possible_member?(bf, "this is not hi!")
false
update/2
Add a value to a Bloom filter.
This operation runs in time proportional to the number of hash functions.
Examples
iex> alias Prolly.BloomFilter
iex> bf = BloomFilter.new(20, [:md5, :sha, :sha256])
iex> BloomFilter.update(bf, "hi").filter |> Enum.to_list()
[0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
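In the example above only two positions are set even though three hash functions are used, presumably because two of the hashes landed on the same position. To illustrate how adding a value and testing membership relate, here is a minimal stand-in written for this page. It is not Prolly's implementation: the BloomSketch module, its fields, and the way a digest is reduced to an index (digest as an unsigned integer, modulo the filter size) are all assumptions, and it stores set positions in a MapSet rather than a bit vector. Values are assumed to be binaries.

```elixir
defmodule BloomSketch do
  # Hypothetical stand-in, not Prolly's implementation.
  defstruct [:size, :hashes, :bits]

  # size: number of positions; hashes: list of :crypto hash algorithm names.
  def new(size, hashes) do
    %__MODULE__{size: size, hashes: hashes, bits: MapSet.new()}
  end

  # One position per hash function. Assumption: the digest is read as an
  # unsigned integer and reduced modulo the filter size.
  defp indexes(%__MODULE__{size: size, hashes: hashes}, value) do
    Enum.map(hashes, fn algo ->
      :crypto.hash(algo, value)
      |> :binary.decode_unsigned()
      |> rem(size)
    end)
  end

  # Mark every position for the value: one hash computation per function,
  # so the work grows with the number of hash functions.
  def update(bf, value) do
    %{bf | bits: Enum.into(indexes(bf, value), bf.bits)}
  end

  # true only if every position for the value is already marked.
  def possible_member?(bf, value) do
    Enum.all?(indexes(bf, value), &MapSet.member?(bf.bits, &1))
  end
end

bf =
  BloomSketch.new(20, [:md5, :sha, :sha256])
  |> BloomSketch.update("hi")

# An inserted value always tests positive.
IO.inspect(BloomSketch.possible_member?(bf, "hi"))
# prints true
```

A value that was never added can still test positive if all of its positions were marked by other values, which is the source of false positives.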