Modules
This module provides functions that implement some of the Unicode standards.
Implements the Unicode break algorithms for graphemes, words, sentences and line-breaks.
Single-pass DFA-style implementation of UAX #29 grapheme cluster segmentation.
Single-pass line-break implementation following UAX #14.
Single-pass DFA-style implementation of UAX #29 sentence break with locale-specific class extensions and abbreviation suppressions.
Single-pass DFA-style implementation of UAX #29 word break.
Implements the Unicode Case Folding algorithm.
The Unicode Case Mapping algorithm defines the process and data to transform text into upper case, lower case or title case.
Implements the special upper casing rules for the Greek language.
Implements basic dictionary functions for dictionary-based word break.
Implements ICU's lookahead-based dictionary word break algorithm for scripts that don't use spaces between words.
Implements the compilation of the Unicode segment rules.
Mix Tasks
Downloads the ICU (Unicode) dictionaries supporting word breaks for the Chinese, Japanese, Thai, Burmese and Lao languages.