Bloomex v1.0.2 Bloomex View Source

This module implements a Scalable Bloom Filter.

Examples

iex> bf = Bloomex.scalable(1000, 0.1, 0.1, 2)
%Bloomex.ScalableBloom...

iex> bf = Bloomex.add(bf, 5)
%Bloomex.ScalableBloom...

iex> Bloomex.member?(bf, 5)
true

iex> bf = Bloomex.add(bf, 100)
%Bloomex.ScalableBloom...

iex> Bloomex.member?(bf, 100)
true

iex> Bloomex.member?(bf, 105)
false

iex> Bloomex.member?(bf, 101) # false positive
true

Link to this section Summary

Functions

Returns a bloom filter with the element e added

Returns the capacity of the bloom filter

Returns true if the element e exists in the bloom filter, otherwise returns false

Returns a plain Bloom filter based on the provided arguments:

  • capacity, used to calculate the size of each bitvector slice
  • error, the error probability
  • hash_func, a hashing function

Returns a scalable Bloom filter based on the provided arguments:

  • capacity, the initial capacity before expanding
  • error, the error probability
  • error_ratio, the error probability ratio
  • growth, the growth ratio when full
  • hash_func, a hashing function

Returns the number of elements currently in the bloom filter

Link to this section Types

Link to this section Functions

Returns a bloom filter with the element e added.

Link to this function capacity(arg1) View Source
capacity(Bloomex.t()) :: pos_integer() | :infinity

Returns the capacity of the bloom filter.

A plain bloom filter will always have a fixed capacity, while a scalable one will always have a theoretically infite capacity.

Link to this function member?(bloom, e) View Source
member?(Bloomex.t(), any()) :: boolean()

Returns true if the element e exists in the bloom filter, otherwise returns false.

Keep in mind that you may get false positives, but never false negatives.

Link to this function plain(capacity, error, hash_func \\ fn x -> :erlang.phash2(x, 1 <<< 32) end) View Source
plain(pos_integer(), float(), (term() -> pos_integer())) :: Bloomex.Bloom.t()

Returns a plain Bloom filter based on the provided arguments:

  • capacity, used to calculate the size of each bitvector slice
  • error, the error probability
  • hash_func, a hashing function

If a hash function is not provided then :erlang.phash2/2 will be used with the maximum range possible (2^32).

Restrictions:

  • capacity must be a positive integer
  • error must be a float between 0 and 1
  • hash_func must be a function of type term -> pos_integer

The function follows a rule of thumb due to double hashing where capacity >= 4 / error must hold true.

Link to this function scalable(capacity, error, error_ratio, growth, hash_func \\ fn x -> :erlang.phash2(x, 1 <<< 32) end) View Source
scalable(pos_integer(), float(), float(), 1 | 2 | 3, (term() -> pos_integer())) ::
  Bloomex.ScalableBloom.t()

Returns a scalable Bloom filter based on the provided arguments:

  • capacity, the initial capacity before expanding
  • error, the error probability
  • error_ratio, the error probability ratio
  • growth, the growth ratio when full
  • hash_func, a hashing function

If a hash function is not provided then :erlang.phash2/2 will be used with the maximum range possible (2^32).

Restrictions:

  • capacity must be a positive integer
  • error must be a float between 0 and 1
  • error_ratio must be a float between 0 and 1
  • growth must be a positive integer between 1 and 3
  • hash_func must be a function of type term -> pos_integer

The function follows a rule of thumb due to double hashing where capacity >= 4 / (error * (1 - error_ratio)) must hold true.

Returns the number of elements currently in the bloom filter.