Needlepoint.Tokenizer.Treebank (Needlepoint v0.1.0)

A port of the NLTK Treebank tokenizer.

Examples

iex(1)> alias Needlepoint.Tokenizer.Treebank
Needlepoint.Tokenizer.Treebank

iex(2)> Treebank.tokenize("Good muffins cost $3.88 in New York.  Please buy me two of them. Thanks.")
["Good", "muffins", "cost", "$", "3.88", "in", "New", "York.", "Please", "buy",
 "me", "two", "of", "them.", "Thanks", "."]

iex(3)> Treebank.tokenize("They'll save and invest more.")
["They", "'ll", "save", "and", "invest", "more", "."]

iex(4)> Treebank.tokenize("hi, my name can't hello,")
["hi", ",", "my", "name", "ca", "n't", "hello", ","]
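In the outputs above, a period is split off only at the very end of the input (`York.` and `them.` stay intact), while commas, the dollar sign, and contractions such as `'ll` and `n't` become separate tokens. The following is a minimal sketch of how such rules can be expressed with regular expressions. It is not Needlepoint's implementation: the `TreebankSketch` module and its patterns are hypothetical and are only meant to reproduce the three examples shown here.

```elixir
defmodule TreebankSketch do
  # Hypothetical illustration only; not part of Needlepoint or NLTK.
  # Covers just enough rules to reproduce the three examples above.

  def tokenize(text) do
    text
    # Surround commas and dollar signs with spaces so they split off.
    |> String.replace(~r/([,$])/, " \\1 ")
    # Split a period only when it ends the whole input.
    |> String.replace(~r/\.\s*$/, " .")
    # Split the negative contraction: "can't" becomes "ca n't".
    |> String.replace(~r/n't\b/, " n't")
    # Split common clitics: "They'll" becomes "They 'll".
    |> String.replace(~r/'(ll|re|ve|s|m|d)\b/, " '\\1")
    # Break on any run of whitespace.
    |> String.split()
  end
end

IO.inspect(TreebankSketch.tokenize("They'll save and invest more."))
IO.inspect(TreebankSketch.tokenize("hi, my name can't hello,"))
```

A real Treebank tokenizer applies many more rules (quotes, brackets, ellipses, further contractions), so this sketch should be read as an aid to understanding the outputs, not as a substitute for `Treebank.tokenize/1`.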