Peach v0.2.0

Peach provides fuzzy matching as well as tools for preprocessing text.

Example

iex> "hɘllo🧐 " |> Peach.pre_process |> Peach.levenshtein_distance("hello")
1

Summary

Functions

Convert the string to a single line.

Find whether there is an exact match in the keyword set. The keywords may be numbers.

Find the fuzzy matches to the keyword_threshold set. Each keyword has its own threshold.

Find the fuzzy matches to the keyword set. All keywords use the same threshold.

Extract the first few characters of the utterance.

Calculate the Levenshtein edit distance.

Normalise text using Unicode NFKC (Normalisation Form Compatibility Composition).

Replace spans of whitespace with a single space.

Pre-process an utterance in preparation for both number and keyword matching.

Remove emojis from a string.

Remove numbers without substitution. Applied before keyword matching.

Remove punctuation without substitution.

Replace punctuation marks with spaces.

Functions

convert_to_one_line(phrase)

Convert the string to a single line.
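The documentation does not say how line breaks are handled. As a minimal sketch, assuming each run of carriage returns and line feeds becomes one space, the idea could look like this (`OneLineSketch` is a hypothetical module, not Peach's implementation):

```elixir
defmodule OneLineSketch do
  # Hypothetical helper, not Peach's implementation.
  # Assumption: every run of \r and \n characters is replaced by one space.
  def convert_to_one_line(phrase) do
    String.replace(phrase, ~r/[\r\n]+/, " ")
  end
end

IO.inspect(OneLineSketch.convert_to_one_line("first line\r\nsecond line\nthird"))
# "first line second line third"
```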

find_exact_match(input, keyword_set)

Find whether there is an exact match in the keyword set. The keywords may be numbers.
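Because keywords may be numbers while the input is text, an exact comparison needs both sides in the same form. The sketch below converts each keyword with `to_string/1` before comparing. The return value (the matching keyword, or `nil`) is an assumption made for illustration, and `ExactMatchSketch` is a hypothetical module rather than Peach's code:

```elixir
defmodule ExactMatchSketch do
  # Hypothetical helper, not Peach's implementation.
  # Assumption: returns the first keyword equal to the input, or nil.
  def find_exact_match(input, keyword_set) do
    # Keywords may be strings or numbers, so compare their string forms.
    Enum.find(keyword_set, fn keyword -> to_string(keyword) == input end)
  end
end

keywords = MapSet.new(["menu", "help", 1, 2])
IO.inspect(ExactMatchSketch.find_exact_match("help", keywords)) # "help"
IO.inspect(ExactMatchSketch.find_exact_match("2", keywords))    # 2
IO.inspect(ExactMatchSketch.find_exact_match("hlep", keywords)) # nil
```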

find_fuzzy_matches(input, keyword_threshold_set)

Find the fuzzy matches to the keyword_threshold set. Each keyword has its own threshold.

find_fuzzy_matches(input, keyword_set, threshold)

Find the fuzzy matches to the keyword set. All keywords use the same threshold.
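The two arities differ only in where the threshold comes from. The following sketch shows one way that could work, assuming a keyword matches when its edit distance to the input is at most its threshold and that matches come back as `{keyword, distance}` pairs. Both assumptions are illustrative, and `FuzzyMatchSketch` is a hypothetical module. The private distance helper is the naive recursive definition, which is adequate only for short keywords:

```elixir
defmodule FuzzyMatchSketch do
  # Hypothetical sketch, not Peach's implementation.

  # Shared threshold: pair every keyword with the same limit.
  def find_fuzzy_matches(input, keyword_set, threshold) do
    find_fuzzy_matches(input, Enum.map(keyword_set, &{&1, threshold}))
  end

  # Per-keyword thresholds: an enumerable of {keyword, threshold} pairs.
  def find_fuzzy_matches(input, keyword_threshold_set) do
    keyword_threshold_set
    |> Enum.map(fn {keyword, threshold} -> {keyword, distance(input, keyword), threshold} end)
    |> Enum.filter(fn {_keyword, dist, threshold} -> dist <= threshold end)
    |> Enum.map(fn {keyword, dist, _threshold} -> {keyword, dist} end)
  end

  defp distance(a, b) when is_binary(a) and is_binary(b) do
    distance(String.graphemes(a), String.graphemes(b))
  end

  # Naive recursive edit distance on grapheme lists (exponential time).
  defp distance([], b), do: length(b)
  defp distance(a, []), do: length(a)
  defp distance([same | a], [same | b]), do: distance(a, b)

  defp distance([_ | a_rest] = a, [_ | b_rest] = b) do
    1 + Enum.min([distance(a_rest, b), distance(a, b_rest), distance(a_rest, b_rest)])
  end
end

IO.inspect(FuzzyMatchSketch.find_fuzzy_matches("helo", ["hello", "help", "menu"], 1))
# [{"hello", 1}, {"help", 1}]
IO.inspect(FuzzyMatchSketch.find_fuzzy_matches("helo", [{"hello", 1}, {"menu", 2}]))
# [{"hello", 1}]
```

In the second call "menu" is three edits away from "helo", so its own threshold of 2 excludes it.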

get_brief(phrase, num_chars \\ 20)

Extract the first few characters of the utterance.
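A plausible reading of the signature is that `num_chars` caps the length of the returned prefix, with 20 as the default. The sketch below assumes characters are counted as graphemes, which is what `String.slice/3` does. `BriefSketch` is a hypothetical module, not Peach's code:

```elixir
defmodule BriefSketch do
  # Hypothetical helper, not Peach's implementation.
  # Assumption: "characters" means graphemes, as counted by String.slice/3.
  def get_brief(phrase, num_chars \\ 20) do
    String.slice(phrase, 0, num_chars)
  end
end

IO.inspect(BriefSketch.get_brief("I would like to order a large pizza"))
# "I would like to orde"
IO.inspect(BriefSketch.get_brief("hello", 3))
# "hel"
```

A phrase shorter than `num_chars` comes back unchanged.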

levenshtein_distance(first_phrase, second_phrase)

Calculate the Levenshtein edit distance.
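The Levenshtein distance is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. For reference, here is a self-contained sketch of the standard row-by-row dynamic-programming algorithm. It is not taken from Peach's source, `LevenshteinSketch` is a hypothetical module, and comparing by grapheme is an assumption:

```elixir
defmodule LevenshteinSketch do
  # Hypothetical sketch of the classic algorithm, not Peach's implementation.
  def distance(first_phrase, second_phrase) do
    a_chars = String.graphemes(first_phrase)
    b_chars = String.graphemes(second_phrase)

    # Row 0: turning "" into each prefix of b costs one insertion per grapheme.
    first_row = Enum.to_list(0..length(b_chars))

    a_chars
    |> Enum.reduce(first_row, fn a_char, prev_row -> next_row(a_char, b_chars, prev_row) end)
    |> List.last()
  end

  # Builds the row for one grapheme of a from the previous row.
  defp next_row(a_char, b_chars, [first | rest] = prev_row) do
    start = first + 1

    {reversed, _left} =
      [b_chars, prev_row, rest]
      |> Enum.zip()
      |> Enum.reduce({[start], start}, fn {b_char, diag, above}, {acc, left} ->
        cost = if a_char == b_char, do: 0, else: 1
        # deletion, insertion, substitution (or match when cost is 0)
        cell = Enum.min([above + 1, left + 1, diag + cost])
        {[cell | acc], cell}
      end)

    Enum.reverse(reversed)
  end
end

IO.inspect(LevenshteinSketch.distance("hɘllo", "hello"))    # 1
IO.inspect(LevenshteinSketch.distance("kitten", "sitting")) # 3
```

Only one row is kept at a time, so memory use grows with the length of the second phrase rather than with the product of both lengths.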

Normalise text using Unicode NFKC (Normalisation Form Compatibility Composition).
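NFKC folds compatibility characters (ligatures, circled digits, fullwidth letters) into their plain equivalents and then composes combining sequences. Elixir's standard library exposes this through `String.normalize/2`. Whether Peach calls that function internally is not stated here, so treat the hypothetical `NfkcSketch` module below as an illustration of the normalisation form only:

```elixir
defmodule NfkcSketch do
  # Hypothetical wrapper, not Peach's implementation.
  def normalise(text), do: String.normalize(text, :nfkc)
end

IO.inspect(NfkcSketch.normalise("\uFB01"))        # ligature fi -> "fi"
IO.inspect(NfkcSketch.normalise("\u2460"))        # circled one -> "1"
IO.inspect(NfkcSketch.normalise("\uFF28\uFF49"))  # fullwidth Hi -> "Hi"
IO.inspect(NfkcSketch.normalise("e\u0301") == "\u00E9")  # true: composed into one code point
```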

normalise_whitespace(phrase)

Replace spans of whitespace with a single space.
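A single regular-expression replacement is enough to express this. The sketch below assumes that "whitespace" means anything matched by `\s` in Unicode mode and that leading and trailing spaces are not trimmed. `WhitespaceSketch` is a hypothetical module, not Peach's code:

```elixir
defmodule WhitespaceSketch do
  # Hypothetical helper, not Peach's implementation.
  # Assumption: any run of \s characters (spaces, tabs, newlines) becomes one space.
  def normalise_whitespace(phrase) do
    String.replace(phrase, ~r/\s+/u, " ")
  end
end

IO.inspect(WhitespaceSketch.normalise_whitespace("order   two\t\tlarge\n pizzas"))
# "order two large pizzas"
```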

Pre-process an utterance in preparation for both number and keyword matching.

Remove emojis from a string.

WARNING: this currently does not work for 0️⃣ 1️⃣ 2️⃣ 3️⃣ 4️⃣ 5️⃣ 6️⃣ 7️⃣ 8️⃣ 9️⃣ #️⃣ *️⃣ ©️ ®️. These are replaced with the underlying symbol rather than removed completely.
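One plausible explanation for this behaviour, offered as an assumption rather than a description of Peach's code, is that these emojis are sequences built on an ordinary character. A keycap such as 1️⃣ is the digit `1` followed by U+FE0F (variation selector-16) and U+20E3 (combining enclosing keycap), and ©️ is `©` followed by U+FE0F. Stripping only the emoji-specific code points therefore leaves the base symbol behind. The hypothetical `KeycapSketch` module shows that effect, along with one way to remove a whole keycap sequence:

```elixir
defmodule KeycapSketch do
  # Hypothetical helpers, not Peach's implementation.

  # Removes only the emoji-specific code points, leaving the base character.
  def strip_markers(text) do
    String.replace(text, ~r/[\x{FE0F}\x{20E3}]/u, "")
  end

  # Removes a whole keycap sequence: base character, optional U+FE0F, U+20E3.
  def strip_sequences(text) do
    String.replace(text, ~r/[0-9#*]\x{FE0F}?\x{20E3}/u, "")
  end
end

keycap_one = "1\uFE0F\u20E3"
IO.inspect(String.to_charlist(keycap_one))                     # [49, 65039, 8419]
IO.inspect(KeycapSketch.strip_markers(keycap_one))             # "1"
IO.inspect(KeycapSketch.strip_sequences("press 1\uFE0F\u20E3 now")) # "press  now"
```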

Remove numbers without substitution. Applied before keyword matching.

Remove punctuation without substitution.

Replace punctuation marks with spaces.
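The difference between removing and replacing matters when punctuation is the only thing separating two words. The sketch below assumes "numbers" means runs of digits and "punctuation" means the Unicode `P` categories. The function names and the `StripSketch` module are hypothetical and are not taken from Peach's source:

```elixir
defmodule StripSketch do
  # Hypothetical helpers, not Peach's implementation.

  # Deletes runs of digits; nothing is inserted in their place.
  def remove_numbers(phrase), do: String.replace(phrase, ~r/\d+/, "")

  # Deletes punctuation, so neighbouring words are joined together.
  def remove_punctuation(phrase), do: String.replace(phrase, ~r/\p{P}/u, "")

  # Swaps each punctuation mark for a space, so neighbouring words stay apart.
  def replace_punctuation(phrase), do: String.replace(phrase, ~r/\p{P}/u, " ")
end

IO.inspect(StripSketch.remove_numbers("call me at 5pm"))   # "call me at pm"
IO.inspect(StripSketch.remove_punctuation("yes,please!"))  # "yesplease"
IO.inspect(StripSketch.replace_punctuation("yes,please!")) # "yes please "
```

Replacing with spaces keeps "yes" and "please" as separate tokens for keyword matching, at the cost of extra whitespace that a later whitespace-normalisation step can collapse.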