Cuckoo
This module implements a Cuckoo Filter.
Implementation Details
The implementation follows the specification as per the paper above.
For hashing we use the x64_128 variant of Murmur3 and the Erlang phash2.
Examples
iex> cf = Cuckoo.new(1000, 16, 4)
%Cuckoo.Filter{...}
iex> {:ok, cf} = Cuckoo.insert(cf, 5)
%Cuckoo.Filter{...}
iex> Cuckoo.contains?(cf, 5)
true
iex> {:ok, cf} = Cuckoo.delete(cf, 5)
%Cuckoo.Filter{...}
iex> Cuckoo.contains?(cf, 5)
false
Summary
contains?(filter, element) | Checks if the Cuckoo Filter contains |
delete!(filter, element) | Returns a filter with the removed element or raises |
delete(filter, element) | Attempts to delete |
insert!(filter, element) | Returns a filter with the inserted element or raises |
insert(filter, element) | Tries to insert |
new(max_num_keys, fingerprint_size, fingerprints_per_bucket \\ 4) | Creates a new Cuckoo Filter using the given |
Functions
Specs:
- contains?(Cuckoo.Filter.t, any) :: boolean
Checks if the Cuckoo Filter contains element
.
Returns true
if does, otherwise returns false
.
Specs:
- delete(Cuckoo.Filter.t, any) :: {:ok, Cuckoo.Filter.t} | {:err, :inexistent}
Attempts to delete element
from the Cuckoo Filter if it contains it.
Returns {:err, :inexistent}
if the element doesn’t exist in the filter, otherwise
returns {:ok, filter}
.
Specs:
- delete!(Cuckoo.Filter.t, any) :: Cuckoo.Filter.t | no_return
Returns a filter with the removed element or raises Cuckoo.Error
if an error occurs.
Specs:
- insert(Cuckoo.Filter.t, any) :: {:ok, Cuckoo.Filter.t} | {:err, :full}
Tries to insert element
into the Cuckoo Filter.
Returns {:ok, filter}
if successful, otherwise returns {:err, :full}
from which
you should consider the Filter to be full.
Specs:
- insert!(Cuckoo.Filter.t, any) :: Cuckoo.Filter.t | no_return
Returns a filter with the inserted element or raises Cuckoo.Error
if an error occurs.
Specs:
- new(pos_integer, pos_integer, pos_integer) :: Cuckoo.Filter.t
Creates a new Cuckoo Filter using the given max_num_keys
, fingerprint_size
and
fingerprints_per_bucket
.
The suggested values for the last two according to one of the publications should
be 16
and 4
respectively, as it allows the Cuckoo Filter to achieve a sweet spot
in space effiency and table occupancy.