SpiritFingers.SimHash (SpiritFingers v0.4.1)

SimHash module that delegates the hashing, similarity, and distance calculations to Rust NIFs.

Summary

Types

distance()

64-bit floating-point representation of the Hamming distance between two SimHash.t values.

similarity()

Similarity between two SimHash.t values, represented as a value between 0.0 and 1.0.

t()

Unsigned 64-bit integer representation of a SimHash.

Functions

hamming_distance(hash0, hash1)

Bitwise Hamming distance between two SimHash.t hashes.

hash_similarity(hash0, hash1)

Calculate the similarity of two hashes as a SimHash.similarity. 0.0 means no similarity, 1.0 means identical.

similarity(text0, text1)

Calculate the SimHash.similarity of two strings, each split on whitespace and hashed with SimHash.

similarity_hash(text)

Calculate the SimHash.t of a string split on whitespace.

Types

@type distance() :: float()

64-bit floating-point representation of the Hamming distance between two SimHash.t values.

@type similarity() :: float()

Similarity between two SimHash.t, represented as a value between 0.0 and 1.0.

  • 0.0 means no similarity,
  • 1.0 means identical.

@type t() :: pos_integer()

Unsigned 64-bit integer representation of a SimHash.

Functions

hamming_distance(hash0, hash1)

@spec hamming_distance(t(), t()) :: {:ok, distance()}

Bitwise Hamming distance between two SimHash.t hashes.

Examples

iex> SpiritFingers.SimHash.hamming_distance(0, 0)
{:ok, 0}

iex> SpiritFingers.SimHash.hamming_distance(0b1111111, 0b0000000)
{:ok, 7}

iex> SpiritFingers.SimHash.hamming_distance(0b0100101, 0b1100110)
{:ok, 3}
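
The distance is the number of bit positions at which the two hashes differ. The following is a minimal pure-Elixir sketch of that calculation, for illustration only: the library performs it in a Rust NIF, and the module name `HammingSketch` is hypothetical. Unlike the library function, it returns a bare integer rather than an `{:ok, distance}` tuple.

```elixir
defmodule HammingSketch do
  import Bitwise

  # Hypothetical helper, not part of SpiritFingers.
  # XOR leaves a 1 bit wherever the inputs differ; counting those bits
  # gives the Hamming distance.
  def distance(hash0, hash1)
      when is_integer(hash0) and is_integer(hash1) and hash0 >= 0 and hash1 >= 0 do
    popcount(bxor(hash0, hash1), 0)
  end

  # Count set bits by clearing the lowest set bit until none remain.
  defp popcount(0, acc), do: acc
  defp popcount(n, acc), do: popcount(band(n, n - 1), acc + 1)
end

HammingSketch.distance(0b0100101, 0b1100110)
# => 3
```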

hash_similarity(hash0, hash1)

@spec hash_similarity(t(), t()) :: {:ok, similarity()}

Calculate the similarity of two hashes as a SimHash.similarity. 0.0 means no similarity, 1.0 means identical.

Examples

iex> SpiritFingers.SimHash.hash_similarity(0, 0)
{:ok, 1.0}

iex> SpiritFingers.SimHash.hash_similarity(0xFFFFFFFFFFFFFFFF, 0)
{:ok, 0.0}

iex> SpiritFingers.SimHash.hash_similarity(0xFFFFFFFF, 0)
{:ok, 0.5}
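
The three results above are consistent with the relationship similarity = 1 - distance / 64: 0 differing bits gives 1.0, 64 gives 0.0, and 32 gives 0.5. The sketch below works under that assumption, which is inferred from the examples rather than confirmed against the NIF source. The module name `SimilaritySketch` is hypothetical, and the function returns a bare float rather than an `{:ok, similarity}` tuple.

```elixir
defmodule SimilaritySketch do
  import Bitwise

  # Assumed hash width, matching the 64-bit SimHash.t type.
  @bits 64

  # Hypothetical helper: the fraction of the 64 bit positions on which
  # the two hashes agree.
  def hash_similarity(hash0, hash1)
      when is_integer(hash0) and is_integer(hash1) and hash0 >= 0 and hash1 >= 0 do
    differing = count_bits(bxor(hash0, hash1), 0)
    1.0 - differing / @bits
  end

  # Count set bits by inspecting the lowest bit and shifting right.
  defp count_bits(0, acc), do: acc
  defp count_bits(n, acc), do: count_bits(n >>> 1, acc + band(n, 1))
end

SimilaritySketch.hash_similarity(0xFFFFFFFF, 0)
# => 0.5
```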

similarity(text0, text1)

@spec similarity(binary(), binary()) :: {:ok, similarity()}

Calculate the SimHash.similarity of two strings, each split on whitespace and hashed with SimHash.

Examples

iex> SpiritFingers.SimHash.similarity("Stop hammertime", "Stop hammertime")
{:ok, 1.0}

iex> SpiritFingers.SimHash.similarity("Hocus pocus", "Hocus pocus pilatus pas")
{:ok, 0.9375}

iex> SpiritFingers.SimHash.similarity("Peanut butter", "Strawberry cocktail")
{:ok, 0.59375}

similarity_hash(text)

@spec similarity_hash(binary()) :: {:ok, t()}

Calculate the SimHash.t of a string split on whitespace.

Examples

iex> SpiritFingers.SimHash.similarity_hash("The cat sat on the mat")
{:ok, 2595200813813010837}

iex> SpiritFingers.SimHash.similarity_hash("The cat sat under the mat")
{:ok, 2595269945604666783}

iex> SpiritFingers.SimHash.similarity_hash("Why the lucky stiff")
{:ok, 1155526875459215761}
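
For readers unfamiliar with the technique, here is a rough sketch of how a SimHash is generally built from whitespace-separated tokens: each token is hashed to 64 bits, every bit position collects a +1 or -1 vote per token, and the output bit is set where the positive votes win. This illustrates the general approach only. The module name `SimHashSketch` is hypothetical, and the token hash (two salted calls to `:erlang.phash2/2`) is an assumption chosen for portability, so the values it produces will not match those returned by SpiritFingers.

```elixir
defmodule SimHashSketch do
  import Bitwise

  @bits 64

  # Hypothetical illustration of the general SimHash technique.
  # The token hash is an assumption; results differ from the library's.
  def similarity_hash(text) when is_binary(text) do
    text
    |> String.split()
    |> Enum.reduce(List.duplicate(0, @bits), fn token, weights ->
      hash = token_hash(token)

      # Vote +1 where the token hash has a 1 bit, -1 where it has a 0 bit.
      weights
      |> Enum.with_index()
      |> Enum.map(fn {weight, bit} ->
        if band(hash >>> bit, 1) == 1, do: weight + 1, else: weight - 1
      end)
    end)
    |> Enum.with_index()
    |> Enum.reduce(0, fn {weight, bit}, acc ->
      # Set the output bit when the positive votes win.
      if weight > 0, do: bor(acc, 1 <<< bit), else: acc
    end)
  end

  # Build a 64-bit value from two salted 32-bit phash2 results.
  defp token_hash(token) do
    high = :erlang.phash2({token, 0}, 4_294_967_296)
    low = :erlang.phash2({token, 1}, 4_294_967_296)
    bor(high <<< 32, low)
  end
end

hash_a = SimHashSketch.similarity_hash("The cat sat on the mat")
hash_b = SimHashSketch.similarity_hash("The cat sat under the mat")
```

Because the votes are summed per bit, token order and repeated whitespace do not affect the result in this sketch, and texts that share most of their tokens tend to produce hashes that differ in fewer bits.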