fuzzy_compare v1.0.0 FuzzyCompare.ChunkSet

ChunkSet is ideal for strings that share some words but also contain many dissimilar words.

It works in the following way:

Suppose the input strings are:

  • "oscar claude monet"
  • "alice hoschedé was the wife of claude monet"

From the two input strings, three strings are created:

  • common_words = "claude monet"
  • common_words_plus_remaining_words_left = "claude monet oscar"
  • common_words_plus_remaining_words_right = "claude monet alice hoschedé was the wife of"

These three strings are then compared with each other pairwise, and the maximum ratio is returned.
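
The steps above can be sketched roughly as follows. This is an illustration only, not the library's implementation: the module `ChunkSetSketch` and its functions `chunks/2` and `max_pairwise/2` are hypothetical names, splitting on whitespace and keeping words in their original order are assumptions made to reproduce the strings shown above, and `String.jaro_distance/2` from the Elixir standard library stands in for whatever ratio the library actually uses.

```elixir
defmodule ChunkSetSketch do
  # Hypothetical helper module for illustration; not part of fuzzy_compare.

  # Builds the three strings described above from two input strings.
  def chunks(left, right) do
    left_words = String.split(left)
    right_words = String.split(right)

    # Words present in both inputs, in the order they appear on the left.
    common =
      left_words
      |> Enum.filter(&(&1 in right_words))
      |> Enum.uniq()

    # Words that only occur on one side, in their original order.
    left_rest = left_words |> Enum.reject(&(&1 in common)) |> Enum.uniq()
    right_rest = right_words |> Enum.reject(&(&1 in common)) |> Enum.uniq()

    {
      Enum.join(common, " "),
      Enum.join(common ++ left_rest, " "),
      Enum.join(common ++ right_rest, " ")
    }
  end

  # Compares the three strings pairwise and returns the highest score.
  # ratio_fun is any two-argument similarity function returning a number.
  def max_pairwise({common, left, right}, ratio_fun) do
    Enum.max([
      ratio_fun.(common, left),
      ratio_fun.(common, right),
      ratio_fun.(left, right)
    ])
  end
end

# Usage with a stand-in similarity measure (an assumption, see above):
"oscar claude monet"
|> ChunkSetSketch.chunks("alice hoschedé was the wife of claude monet")
|> ChunkSetSketch.max_pairwise(&String.jaro_distance/2)
```

Because the similarity measure here is a stand-in, the score from this sketch will not necessarily match the values in the examples below.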

Examples

iex> FuzzyCompare.ChunkSet.standard_similarity("oscar claude monet", "alice hoschedé was the wife of claude monet")
0.8958333333333334

iex> FuzzyCompare.ChunkSet.substring_similarity("oscar claude monet", "alice hoschedé was the wife of claude monet")
1.0

Functions

standard_similarity(left, right)

substring_similarity(left, right)
substring_similarity(
  binary() | FuzzyCompare.Preprocessed.t(),
  binary() | FuzzyCompare.Preprocessed.t()
) :: float()