TheFuzz

Provides shorthand functions for comparing two strings with a range of string metric algorithms.

Summary

compare(metric_type, a, b)

Compares the given strings using the string metric algorithm selected by metric_type.

compare(metric_type, a, b, opts)

Compares the given strings using the string metric algorithm selected by metric_type, configured with the given opts.

Functions

compare(metric_type, a, b)

Compares the given strings using the string metric algorithm selected by metric_type.

Available metric types are:

  • Sørensen-Dice coefficient: :dice_sorensen
  • Hamming distance: :hamming
  • Jaccard similarity coefficient: :jaccard
  • Jaro distance: :jaro
  • Jaro-Winkler distance: :jaro_winkler
  • Levenshtein distance: :levenshtein
  • N-gram similarity: :n_gram
  • Overlap coefficient: :overlap
  • Tanimoto coefficient: :tanimoto
  • Weighted Levenshtein distance: :weighted_levenshtein

Note: Some of these metrics fall back to default values for any additional parameters they need, such as the n-gram size for Jaccard.
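
As a rough illustration of the three-argument form, the script below runs a few of the listed metrics over the same pairs of strings. It is a sketch, not part of the library's documentation: it assumes the package is available on Hex as :the_fuzz, and the sample strings are arbitrary. Return values are printed rather than asserted, since their type and range differ from metric to metric.

```elixir
# Assumption: the library is published on Hex under the name :the_fuzz.
Mix.install([:the_fuzz])

# Arbitrary sample inputs chosen for illustration only.
pairs = [{"kitten", "sitting"}, {"martha", "marhta"}]

for metric <- [:levenshtein, :jaro, :jaro_winkler, :jaccard],
    {a, b} <- pairs do
  # compare/3 selects the algorithm from the atom; metrics that need
  # extra parameters (such as an n-gram size) use their defaults here.
  result = TheFuzz.compare(metric, a, b)
  IO.puts("#{metric}(#{inspect(a)}, #{inspect(b)}) = #{inspect(result)}")
end
```

Note that the results are not directly comparable across metrics: some are distances (lower means more alike) and others are similarity coefficients (higher means more alike).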

compare(metric_type, a, b, opts)

Compares the given strings using the string metric algorithm selected by metric_type, configured with the given opts.

opts is the n-gram size for Dice Sorensen, Jaccard, and N-gram similarity, and the weights for Weighted Levenshtein.

Available metric types are:

  • Sørensen-Dice coefficient: :dice_sorensen
  • Jaccard similarity coefficient: :jaccard
  • N-gram similarity: :n_gram
  • Tversky index: :tversky
  • Weighted Levenshtein distance: :weighted_levenshtein