SpiritFingers v0.1.2 SpiritFingers.SimHash

SimHash module that delegates the hashing, similarity, and distance calculations to Rust NIFs.

Summary

Types

distance()

64-bit floating-point representation of the Hamming distance between two SimHash.t values.

similarity()

Similarity between two SimHash.t values, represented as a value between 0.0 and 1.0.

  • 0.0 means no similarity,
  • 1.0 means identical.

t()

Unsigned 64-bit integer representation of a simhash.

Functions

hamming_distance(hash0, hash1)

Bitwise Hamming distance between two SimHash.t hashes.

hash_similarity(hash0, hash1)

Calculates the similarity of two hashes as a SimHash.similarity. 0.0 means no similarity, 1.0 means identical.

simhash(bin)

Calculates the SimHash.t of a string, splitting it on whitespace.

similarity(text0, text1)

Calculates the SimHash.similarity of two strings, each split on whitespace and hashed with simhash.

Types

distance()
distance() :: float()

64-bit floating-point representation of the Hamming distance between two SimHash.t values.

similarity()
similarity() :: float()

Similarity between two SimHash.t, represented as a value between 0.0 and 1.0.

  • 0.0 means no similarity,
  • 1.0 means identical.

t()

Unsigned 64-bit integer representation of a simhash.

Functions

hamming_distance(hash0, hash1)
hamming_distance(t(), t()) :: {:ok, distance()}

Bitwise Hamming distance between two SimHash.t hashes, that is, the number of bit positions in which they differ.

Examples

iex> SpiritFingers.SimHash.hamming_distance(0, 0)
{:ok, 0.0}

iex> SpiritFingers.SimHash.hamming_distance(0b1111111, 0b0000000)
{:ok, 7.0}

iex> SpiritFingers.SimHash.hamming_distance(0b0100101, 0b1100110)
{:ok, 3.0}
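
The examples above are consistent with XOR-ing the two hashes and counting the set bits. The following is a minimal pure-Elixir sketch of that idea, not the Rust NIF that the library actually calls; the module name `HammingSketch` is hypothetical, and the float result simply mirrors the documented `{:ok, distance()}` shape.

```elixir
defmodule HammingSketch do
  import Bitwise

  # XOR leaves a 1 in every bit position where the two hashes differ;
  # counting those 1 bits gives the Hamming distance.
  def hamming_distance(hash0, hash1) do
    differing_bits = count_bits(bxor(hash0, hash1), 0)
    # Multiply by 1.0 to return a float, as distance() is documented to be.
    {:ok, differing_bits * 1.0}
  end

  # Count set bits by inspecting the lowest bit and shifting right.
  defp count_bits(0, acc), do: acc
  defp count_bits(n, acc), do: count_bits(n >>> 1, acc + (n &&& 1))
end
```

Under these assumptions, `HammingSketch.hamming_distance(0b0100101, 0b1100110)` evaluates to `{:ok, 3.0}`, matching the last example above.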

hash_similarity(hash0, hash1)
hash_similarity(t(), t()) :: {:ok, similarity()}

Calculates the similarity of two hashes as a SimHash.similarity. 0.0 means no similarity, 1.0 means identical.

Examples

iex> SpiritFingers.SimHash.hash_similarity(0, 0)
{:ok, 1.0}

iex> SpiritFingers.SimHash.hash_similarity(0xFFFFFFFFFFFFFFFF, 0)
{:ok, 0.0}

iex> SpiritFingers.SimHash.hash_similarity(0xFFFFFFFF, 0)
{:ok, 0.5}
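
All three results above agree with the formula `1.0 - differing_bits / 64`, where `differing_bits` is the Hamming distance between the two 64-bit hashes. This relation is inferred from the examples rather than from the NIF source, so treat the sketch below as an assumption; the module name `SimilaritySketch` is hypothetical.

```elixir
defmodule SimilaritySketch do
  import Bitwise

  # Width of a SimHash.t, documented as an unsigned 64-bit integer.
  @bits 64

  # Assumed relation: similarity = 1 - (differing bits / 64).
  def hash_similarity(hash0, hash1) do
    differing_bits =
      hash0
      |> bxor(hash1)
      # Binary digits of the XOR; summing them counts the set bits.
      |> Integer.digits(2)
      |> Enum.sum()

    {:ok, 1.0 - differing_bits / @bits}
  end
end
```

For instance, `0xFFFFFFFF` and `0` differ in 32 of 64 bits, so the sketch returns `{:ok, 0.5}`.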

simhash(bin)
simhash(binary()) :: {:ok, t()}

Calculates the SimHash.t of a string, splitting it on whitespace.

Examples

iex> SpiritFingers.SimHash.simhash("The cat sat on the mat")
{:ok, 2595200813813010837}

iex> SpiritFingers.SimHash.simhash("The cat sat under the mat")
{:ok, 2595269945604666783}

iex> SpiritFingers.SimHash.simhash("Why the lucky stiff")
{:ok, 1155526875459215761}
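
For readers unfamiliar with the technique, here is a rough sketch of the general simhash algorithm over whitespace-separated tokens. It is illustrative only: the per-token hash used by the Rust NIF is not documented here, so `token_hash/1` below is a hypothetical stand-in built on `:erlang.phash2/2`, and the sketch will not reproduce the hash values shown above.

```elixir
defmodule SimHashSketch do
  import Bitwise

  @bits 64

  # Hypothetical 64-bit token hash: two 32-bit phash2 values glued together.
  # The real library very likely uses a different hash function.
  def token_hash(token) do
    hi = :erlang.phash2({token, :hi}, 4_294_967_296)
    lo = :erlang.phash2({token, :lo}, 4_294_967_296)
    (hi <<< 32) ||| lo
  end

  def simhash(text) do
    # One signed counter per bit position, starting at zero.
    counts =
      text
      |> String.split()
      |> Enum.reduce(List.duplicate(0, @bits), fn token, acc ->
        h = token_hash(token)

        # Each token votes +1 for a bit it has set and -1 for a bit it lacks.
        acc
        |> Enum.with_index()
        |> Enum.map(fn {count, i} ->
          if ((h >>> i) &&& 1) == 1, do: count + 1, else: count - 1
        end)
      end)

    # A bit is set in the result when its counter ended up positive.
    hash =
      counts
      |> Enum.with_index()
      |> Enum.reduce(0, fn {count, i}, acc ->
        if count > 0, do: acc ||| (1 <<< i), else: acc
      end)

    {:ok, hash}
  end
end
```

Because every token only nudges the per-bit counters, texts that share most of their tokens tend to produce hashes that differ in few bits, which is what makes the Hamming distance between two simhashes meaningful.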

similarity(text0, text1)
similarity(binary(), binary()) :: {:ok, similarity()}

Calculates the SimHash.similarity of two strings, each split on whitespace and hashed with simhash.

Examples

iex> SpiritFingers.SimHash.similarity("Stop hammertime", "Stop hammertime")
{:ok, 1.0}

iex> SpiritFingers.SimHash.similarity("Hocus pocus", "Hocus pocus pilatus pas")
{:ok, 0.9375}

iex> SpiritFingers.SimHash.similarity("Peanut butter", "Strawberry cocktail")
{:ok, 0.59375}