Newxp.SimilarityUtils
(newxp v0.1.1)
Summary
Functions
Calculate Jaccard similarity between two token lists over n-grams up to n_range.
Extract n-grams from a list of tokens.
Lowercase and split text into word tokens.
Functions
Calculate Jaccard similarity between two token lists over n-grams up to n_range.
Jaccard(A, B) = |A ∩ B| / |A ∪ B|
Scores range from 0.0 to 1.0: 0.0 means the two n-gram sets share nothing, and 1.0 means they are identical.
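As an illustration of the formula, here is a minimal Python sketch, not the library's Elixir implementation. It assumes that "up to n_range" means pooling all n-grams of length 1 through n_range into one set per input, and that two empty inputs score 0.0; neither choice is confirmed by this page. The names `ngram_set` and `jaccard_similarity` are hypothetical.

```python
def ngram_set(tokens, n_range):
    """Collect every n-gram of length 1..n_range as a tuple (assumed reading of n_range)."""
    return {
        tuple(tokens[i:i + n])
        for n in range(1, n_range + 1)
        for i in range(len(tokens) - n + 1)
    }


def jaccard_similarity(tokens_a, tokens_b, n_range=2):
    """Jaccard(A, B) = |A ∩ B| / |A ∪ B| over the pooled n-gram sets."""
    a = ngram_set(tokens_a, n_range)
    b = ngram_set(tokens_b, n_range)
    union = a | b
    if not union:
        # Assumption: two empty inputs score 0.0 rather than raising on 0/0.
        return 0.0
    return len(a & b) / len(union)


# 3 shared n-grams ("the", "cat", "the cat") out of 7 distinct ones.
score = jaccard_similarity(["the", "cat", "sat"], ["the", "cat", "ran"], n_range=2)
```

With these inputs the score is 3/7, since each side contributes three unigrams and two bigrams, and only two unigrams and one bigram overlap.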
Extract n-grams from a list of tokens.
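For readers unfamiliar with the term, an n-gram is a run of n consecutive tokens. The Python sketch below shows one common way to extract them with a sliding window; it is an illustration only, and the name `ngrams`, the tuple representation, and the argument order are assumptions rather than the library's actual interface.

```python
def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, in order, as tuples."""
    # range() is empty when n exceeds len(tokens), so short inputs yield [].
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


bigrams = ngrams(["a", "b", "c"], 2)  # [("a", "b"), ("b", "c")]
```

A list of k tokens yields k - n + 1 n-grams, or none when n is larger than k.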
Lowercase and split text into word tokens.
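A minimal Python sketch of this kind of tokenizer follows. It assumes a "word token" is a maximal run of letters, digits, or underscores, so punctuation is dropped; the library's exact splitting rule is not stated on this page, and the name `tokenize` is hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text, then return its word tokens in order."""
    # Assumption: \w+ defines a word, so punctuation and whitespace act as separators.
    return re.findall(r"\w+", text.lower())


tokens = tokenize("Hello, World! Hello")  # ["hello", "world", "hello"]
```

Under this rule an apostrophe also splits a word, which may or may not match the library's behavior.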