FuzzyCompare.ChunkSet (fuzzy_compare v1.1.0)
The ChunkSet is ideal for strings that share some words but also contain many dissimilar words.
It works in the following way:
Our input strings are:
"oscar claude monet"
"alice hoschedé was the wife of claude monet"
From the two input strings, three strings are created:
common_words = "claude monet"
common_words_plus_remaining_words_left = "claude monet oscar"
common_words_plus_remaining_words_right = "claude monet alice hoschedé was the wife of"
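The construction of these three strings can be sketched roughly as follows. This is only an illustration: the module `ChunkSketch` and its `chunks/2` function are hypothetical names, not part of fuzzy_compare, and the library's actual tokenization and ordering may differ. The sketch assumes words are separated by whitespace and keeps them in their original order.

```elixir
defmodule ChunkSketch do
  # Hypothetical helper, not the library's implementation.
  # Returns {common, common_plus_left_rest, common_plus_right_rest}.
  def chunks(left, right) do
    left_words = left |> String.split() |> Enum.uniq()
    right_words = right |> String.split() |> Enum.uniq()

    # Words that occur in both inputs, in the order they appear on the left.
    common = Enum.filter(left_words, fn word -> word in right_words end)

    # Words that are unique to each side.
    left_rest = left_words -- common
    right_rest = right_words -- common

    {
      Enum.join(common, " "),
      Enum.join(common ++ left_rest, " "),
      Enum.join(common ++ right_rest, " ")
    }
  end
end

ChunkSketch.chunks(
  "oscar claude monet",
  "alice hoschedé was the wife of claude monet"
)
```

For the two input strings above, this sketch yields the same three strings listed in the text.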
These three strings are then compared with each other pairwise, and the maximum ratio is returned.
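A minimal sketch of the pairwise comparison step, under stated assumptions: `PairwiseMaxSketch` and `max_ratio/4` are hypothetical names, and the similarity function is passed in by the caller. The usage line uses Elixir's built-in `String.jaro_distance/2` purely as a stand-in, so its scores will not match the ratios fuzzy_compare returns.

```elixir
defmodule PairwiseMaxSketch do
  # Hypothetical helper, not the library's implementation.
  # `ratio_fun` is any two-argument function returning a similarity score.
  def max_ratio(common, left, right, ratio_fun) do
    # The three possible pairs of the three chunk strings.
    [{common, left}, {common, right}, {left, right}]
    |> Enum.map(fn {a, b} -> ratio_fun.(a, b) end)
    |> Enum.max()
  end
end

# Stand-in metric (assumption): Jaro distance from the Elixir standard library.
PairwiseMaxSketch.max_ratio(
  "claude monet",
  "claude monet oscar",
  "claude monet alice hoschedé was the wife of",
  &String.jaro_distance/2
)
```

Taking the maximum means that one strongly matching pair is enough to produce a high score, even when the remaining words differ a lot.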
Examples
iex> FuzzyCompare.ChunkSet.standard_similarity("oscar claude monet", "alice hoschedé was the wife of claude monet")
0.8958333333333334
iex> FuzzyCompare.ChunkSet.substring_similarity("oscar claude monet", "alice hoschedé was the wife of claude monet")
1.0