Simple Bayes v0.7.1 SimpleBayes.Tokenizer

Summary

Functions

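accumulate(map, list, acc_size)

Converts a list with a value into a map, and merges the maps with accumulated values

filter_out(list, filter_list)

Filters out a list based on another list

map_values(list, value)

Converts a list with a value into a map

tokenize(string)

Converts a string into a list of words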

Functions

accumulate(map, list, acc_size)

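Converts a list with a value into a map, and merges the maps with accumulated values. Each element of list becomes a key whose value is increased by acc_size, once per occurrence. Keys already present in map start from their existing value, and keys that appear only in map are left unchanged.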

Examples

iex> SimpleBayes.Tokenizer.accumulate(%{}, [:cat, :dog], 1)
%{cat: 1, dog: 1}

iex> SimpleBayes.Tokenizer.accumulate(%{cat: 1, fish: 1}, [:cat, :dog], 2)
%{cat: 3, fish: 1, dog: 2}

iex> SimpleBayes.Tokenizer.accumulate(%{cat: 1, fish: 1}, [:cat, :cat, :dog], 1)
%{cat: 3, fish: 1, dog: 1}
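
The merge described above can be sketched roughly as follows. This is a minimal illustration inferred from the examples, not the library's source: the module name AccumulateSketch is hypothetical, and it assumes that a repeated element adds acc_size once per occurrence.

```elixir
defmodule AccumulateSketch do
  # Hypothetical stand-in for SimpleBayes.Tokenizer.accumulate/3.
  def accumulate(map, list, acc_size) do
    # Step 1: turn the list into a map of increments,
    # adding acc_size for every occurrence of an element (assumption).
    increments =
      Enum.reduce(list, %{}, fn key, acc ->
        Map.update(acc, key, acc_size, &(&1 + acc_size))
      end)

    # Step 2: merge into the existing map, summing values for shared keys.
    Map.merge(map, increments, fn _key, old, new -> old + new end)
  end
end

IO.inspect(AccumulateSketch.accumulate(%{cat: 1, fish: 1}, [:cat, :dog], 2))
# => %{cat: 3, dog: 2, fish: 1}
```

Elixir maps compare by content, so the key order printed here does not affect equality with the results shown in the examples.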

filter_out(list, filter_list)

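Filters out a list based on another list. Returns list without the elements that also appear in filter_list. Entries of filter_list that do not occur in list are ignored.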

Examples

iex> SimpleBayes.Tokenizer.filter_out(["foo", "bar", "baz"], ["baz"])
["foo", "bar"]

iex> SimpleBayes.Tokenizer.filter_out(["foo", "bar", "baz"], ["baz", "bazz"])
["foo", "bar"]
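
One way to get this behaviour is to reject every element that is a member of the filter list. The sketch below is an assumption about how such a filter could be written, not the library's implementation. FilterOutSketch is a hypothetical module name, and the MapSet is a design choice made here so that membership checks stay cheap for long filter lists.

```elixir
defmodule FilterOutSketch do
  # Hypothetical stand-in for SimpleBayes.Tokenizer.filter_out/2.
  def filter_out(list, filter_list) do
    # Build a set once so each membership test avoids rescanning filter_list.
    blocked = MapSet.new(filter_list)

    # Keep the original order, dropping anything found in the set.
    Enum.reject(list, &MapSet.member?(blocked, &1))
  end
end

IO.inspect(FilterOutSketch.filter_out(["foo", "bar", "baz"], ["baz", "bazz"]))
# => ["foo", "bar"]
```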

map_values(list, value)

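Converts a list with a value into a map. Each distinct element of list becomes a key. With a value of 1, repeated elements are counted, as the second example shows.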

Examples

iex> SimpleBayes.Tokenizer.map_values([:cat, :dog], 1)
%{cat: 1, dog: 1}

iex> SimpleBayes.Tokenizer.map_values([:cat, :cat, :dog], 1)
%{cat: 2, dog: 1}
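
The counting behaviour in the second example can be reproduced with Enum.frequencies/1 (available since Elixir 1.10). The sketch below is illustrative only: MapValuesSketch is a hypothetical module, and scaling each count by value is an assumption, since the examples above only show a value of 1.

```elixir
defmodule MapValuesSketch do
  # Hypothetical stand-in for SimpleBayes.Tokenizer.map_values/2.
  def map_values(list, value) do
    list
    # Count how often each element occurs, e.g. %{cat: 2, dog: 1}.
    |> Enum.frequencies()
    # Assumption: every occurrence contributes `value` to its key.
    |> Map.new(fn {key, count} -> {key, count * value} end)
  end
end

IO.inspect(MapValuesSketch.map_values([:cat, :cat, :dog], 1))
# => %{cat: 2, dog: 1}
```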

tokenize(string)

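Converts a string into a list of words. As the examples show, the string is lowercased and split on whitespace, commas and periods are dropped, and hyphens, apostrophes, underscores, and double quotes within a word are kept.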

Examples

iex> SimpleBayes.Tokenizer.tokenize("foobar")
["foobar"]

iex> SimpleBayes.Tokenizer.tokenize("foo bar")
["foo", "bar"]

iex> SimpleBayes.Tokenizer.tokenize(",foo  bar  .")
["foo", "bar"]

iex> SimpleBayes.Tokenizer.tokenize("Foo bAr")
["foo", "bar"]

iex> SimpleBayes.Tokenizer.tokenize("foo, bar")
["foo", "bar"]

iex> SimpleBayes.Tokenizer.tokenize("foo bar.")
["foo", "bar"]

iex> SimpleBayes.Tokenizer.tokenize(~s(fo-o's ba_r"ed.))
~w(fo-o's ba_r"ed)
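
A tokenizer consistent with the examples above could be sketched as: lowercase, delete disallowed characters, then split on whitespace. This is a guess at the rules, not the library's source. TokenizeSketch is a hypothetical module, and the set of kept characters (ASCII letters, digits, whitespace, underscore, hyphen, apostrophe, double quote) is an assumption inferred from the examples. How the real function treats other characters, such as non-ASCII letters, is not shown here.

```elixir
defmodule TokenizeSketch do
  # Assumed character whitelist; anything else is deleted.
  @disallowed ~r/[^a-z0-9\s_\-'"]/

  # Hypothetical stand-in for SimpleBayes.Tokenizer.tokenize/1.
  def tokenize(string) do
    string
    |> String.downcase()
    # Remove punctuation such as commas and periods.
    |> String.replace(@disallowed, "")
    # String.split/1 splits on runs of whitespace and drops empty pieces.
    |> String.split()
  end
end

IO.inspect(TokenizeSketch.tokenize(",Foo  bAr  ."))
# => ["foo", "bar"]
```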