View Source Talan.CountingBloomFilter (talan v0.1.3)
Counting bloom filter implementation with concurrent accessibility, powered by :atomics module.
Features
- Fixed size Counting Bloom filter
- Concurrent reads & writes
- Custom & default hash functions
- Estimate number of unique elements
- Estimate false positive probability
Counting bloom filters support probabilistic deletion of elements but have higher memory consumption because they need to store a counter of N bits for every bloom filter bit.
Summary
Functions
Returns probabilistic count of term in counter
.
Probabilistically delete term
from bloom_filter
and
decrement counters in counter
.
See Talan.BloomFilter.false_positive_probability/1
for
docs.
Returns a new %Talan.CountingBloomFilter{}
struct.
Puts term
into bloom_filter
and increments counters in counter
.
Types
@type t() :: %Talan.CountingBloomFilter{ bloom_filter: reference(), counter: Abit.Counter.t() }
Functions
@spec cardinality(t()) :: non_neg_integer()
See Talan.BloomFilter.cardinality/1
for docs.
Examples
iex> cbf = Talan.CountingBloomFilter.new(10_000)
iex> cbf |> Talan.CountingBloomFilter.put("hat")
iex> cbf |> Talan.CountingBloomFilter.put("hat")
iex> cbf |> Talan.CountingBloomFilter.put("hat")
iex> cbf |> Talan.CountingBloomFilter.put("car keys")
iex> cbf |> Talan.CountingBloomFilter.cardinality()
2
@spec count(t(), any()) :: non_neg_integer()
Returns probabilistic count of term in counter
.
This means that (given no hash collisions) it returns how many times the item was put into the CountingBloomFilter. A few hash collisions should be also fine since it returns the average count of the counters. An single item is hashed with multiple counters.
Examples
iex> cbf = Talan.CountingBloomFilter.new(10_000)
iex> cbf |> Talan.CountingBloomFilter.put("hat")
iex> cbf |> Talan.CountingBloomFilter.put("hat")
iex> cbf |> Talan.CountingBloomFilter.put("hat")
iex> cbf |> Talan.CountingBloomFilter.count("hat")
3
Probabilistically delete term
from bloom_filter
and
decrement counters in counter
.
Examples
iex> cbf = Talan.CountingBloomFilter.new(10_000)
iex> cbf |> Talan.CountingBloomFilter.put("hat")
iex> cbf |> Talan.CountingBloomFilter.count("hat")
1
iex> cbf |> Talan.CountingBloomFilter.delete("hat")
:ok
iex> cbf |> Talan.CountingBloomFilter.count("hat")
0
iex> cbf |> Talan.CountingBloomFilter.delete("this wasn't there")
iex> cbf |> Talan.CountingBloomFilter.count("this wasn't there")
-1
See Talan.BloomFilter.false_positive_probability/1
for
docs.
See Talan.BloomFilter.member?/2
for docs.
Examples
iex> cbf = Talan.CountingBloomFilter.new(10_000)
iex> cbf |> Talan.CountingBloomFilter.put("hat")
iex> cbf |> Talan.CountingBloomFilter.member?("hat")
true
@spec new(pos_integer(), list()) :: t()
Returns a new %Talan.CountingBloomFilter{}
struct.
cardinality
is the expected number of unique items. Duplicated items
can be infinite.
Options
:counters_bit_size
- bit size of counters, defaults to8
:signed
- to have signed or unsigned counters, defaults totrue
:false_positive_probability
- a float, defaults to0.01
:hash_functions
- a list of hash functions, defaults to randomly seeded murmur
Examples
iex> cbf = Talan.CountingBloomFilter.new(10_000)
iex> cbf |> Talan.CountingBloomFilter.put("hat")
iex> cbf |> Talan.CountingBloomFilter.put("hat")
iex> cbf |> Talan.CountingBloomFilter.put("phone")
:ok
iex> cbf |> Talan.CountingBloomFilter.count("hat")
2
iex> cbf |> Talan.CountingBloomFilter.count("phone")
1
Puts term
into bloom_filter
and increments counters in counter
.
After this the member?/2
function will return true
for the membership of term
unless bits representing
membership are modified by the delete/2
function.
Returns :ok
.
Examples
iex> cbf = Talan.CountingBloomFilter.new(10_000)
iex> cbf |> Talan.CountingBloomFilter.put("hat")
:ok