Nilsimsa (nilsimsa v1.0.0)
Nilsimsa is an implementation of a locality-sensitive hashing algorithm, in which similar input values produce similar hashes. The more similar the input strings are, the smaller the bitwise difference between the generated hashes.
Nilsimsa hashes are useful for detecting texts of the same origin.
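The "bitwise difference" between two digests can be measured as the number of bit positions in which they differ. The following is a minimal sketch in plain Elixir, assuming both digests are hex strings of equal length; `DigestDiff` and `differing_bits/2` are hypothetical names used for illustration and are not part of the Nilsimsa package.

```elixir
defmodule DigestDiff do
  # Hypothetical helper, not part of the Nilsimsa package.
  import Bitwise

  # Count the bit positions that differ between two equal-length hex digests.
  def differing_bits(hex_a, hex_b) when byte_size(hex_a) == byte_size(hex_b) do
    bytes_a = hex_a |> Base.decode16!(case: :mixed) |> :binary.bin_to_list()
    bytes_b = hex_b |> Base.decode16!(case: :mixed) |> :binary.bin_to_list()

    bytes_a
    |> Enum.zip(bytes_b)
    |> Enum.map(fn {x, y} -> popcount(bxor(x, y)) end)
    |> Enum.sum()
  end

  # Number of set bits in a non-negative integer.
  defp popcount(0), do: 0
  defp popcount(n), do: (n &&& 1) + popcount(n >>> 1)
end
```

For two 64-character digests, the result lies between 0 (identical digests) and 256 (every bit differs).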
Processing a string
To process a string, pass the value to the process/1 function:
Nilsimsa.process("abcdefgh")
You can also process a stream:
"war_and_peace.txt"
|> File.stream!()
|> Enum.reduce(Nilsimsa.process(""), &Nilsimsa.process/2)
Generating a digest
To generate a digest of the Nilsimsa hash, pass the struct returned by process/1 to the to_string/1 function:
to_string(Nilsimsa.process("abcdefgh"))
# => "14c8118000000000030800000004042004189020001308014088003280000078"
Comparing values
To compare two values, use the compare/2 function:
Nilsimsa.compare(Nilsimsa.process("hello world"), Nilsimsa.process("all of your base"))
# => 3
Summary
Functions
compare/2: Compare two hashed values
digest/1: Generate the digest of a hash
process/1: Process the given string as a Nilsimsa hash
process/2: Process the given string as a Nilsimsa hash using the given accumulator struct
Functions
compare/2
Compare two hashed values
This returns a value between -127 and 128, where -127 indicates the most dissimilar hashes and 128 indicates the most similar.
Examples
iex> Nilsimsa.compare(Nilsimsa.process("abc"), Nilsimsa.process("def"))
126
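In the conventional Nilsimsa definition, the comparison score is 128 minus the number of differing bits between two 256-bit digests. Whether this package's compare/2 computes exactly that is an assumption; this page does not confirm it. The sketch below works on hex digests rather than hash structs, and `ScoreSketch.score/2` is a hypothetical helper, not part of the package.

```elixir
defmodule ScoreSketch do
  # Hypothetical helper illustrating the conventional Nilsimsa score.
  # Assumption: score = 128 - (number of differing bits).
  import Bitwise

  def score(hex_a, hex_b) do
    # XOR leaves a 1 in every bit position where the digests differ.
    diff = bxor(String.to_integer(hex_a, 16), String.to_integer(hex_b, 16))

    # Summing the binary digits counts those 1 bits.
    differing = diff |> Integer.digits(2) |> Enum.sum()

    128 - differing
  end
end
```

Under this definition, identical digests score 128, and each differing bit lowers the score by one.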
digest/1
Generate the digest of a hash
Examples
iex> to_string(Nilsimsa.digest(Nilsimsa.process("abcdefgh")))
"14c8118000000000030800000004042004189020001308014088003280000078"
process/1
Process the given string as a Nilsimsa hash
Examples
iex> to_string(Nilsimsa.process("abcdefghijklmnopqrstuvwxyz"))
"94ca95850773045cabb93869ba8657373499beb81a17587fd6f9107fc54cc978"
process/2
Process the given string as a Nilsimsa hash using the given accumulator struct
Examples
iex> to_string(Nilsimsa.process("abcdefghijklmnopqrstuvwxyz", %Nilsimsa{}))
"94ca95850773045cabb93869ba8657373499beb81a17587fd6f9107fc54cc978"