ExAequo.RegexTokenizer (ExAequo v0.6.0)
Allows tokenizing text by means of prioritized regular expressions.
Functions
A simple example first
iex(1)> tokens = [
...(1)>   { "\\d+", &String.to_integer/1 },
...(1)>   { "[\\s,]+", &nil_fn/1 },  # from ExAequo.Fn
...(1)>   { "\\w+", &String.to_atom/1 } ]
...(1)> tokenize("42, and 43", tokens)
{:ok, [42, nil, :and, nil, 43]}
If we want to ignore nil (or other values), we can use the ignores: option
iex(2)> tokens = [
...(2)>   { "\\d+", &String.to_integer/1 },
...(2)>   { "[\\s,]+", &nil_fn/1 },  # from ExAequo.Fn
...(2)>   { "\\w+", &String.to_atom/1 } ]
...(2)> tokenize("42, and 43", tokens, ignores: [nil])
{:ok, [42, :and, 43]}
And a slightly more complex example, as used in this library
iex(3)> tokens = [
...(3)>   "\\\\(.)",  # a bare pattern is the same as {"\\\\(.)", &(&1)}
...(3)>   "\\.\\s+",
...(3)>   { "\\.(\\w+)\\.", &String.to_atom/1 },
...(3)>   ".[^\\\\.]+" ]
...(3)> [
...(3)>   tokenize!(".red.hello", tokens),
...(3)>   tokenize!(". \\.red.blue\\..green.", tokens)]
[
[:red, "hello"],
[". ", ".", "red", ".blue", ".", :green]
]
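Priority here simply means: at each position in the input, the token regexes are tried in list order, anchored at the start of the remaining text, and the first one that matches wins. The following is a minimal, self-contained sketch of that idea, not ExAequo's actual implementation; `TinyTokenizer` is an illustrative name, the sketch only supports `{pattern, action}` pairs, and unlike the library it always passes the full match (not a capture group) to the action function.

```elixir
defmodule TinyTokenizer do
  # A hypothetical sketch of prioritized regex tokenization,
  # NOT ExAequo.RegexTokenizer's implementation.

  def tokenize("", _tokens), do: {:ok, []}

  def tokenize(input, tokens) do
    case first_match(input, tokens) do
      nil ->
        {:error, "no token matches at: #{input}"}

      {matched, rest, action} ->
        with {:ok, more} <- tokenize(rest, tokens),
             do: {:ok, [action.(matched) | more]}
    end
  end

  # Try each {pattern, action} pair in list order; the first pattern
  # that matches at the start of the input wins.
  defp first_match(input, tokens) do
    Enum.find_value(tokens, fn {pattern, action} ->
      regex = Regex.compile!("\\A(?:" <> pattern <> ")")

      case Regex.run(regex, input) do
        nil -> nil
        [matched | _] -> {matched, String.replace_prefix(input, matched, ""), action}
      end
    end)
  end
end

TinyTokenizer.tokenize("42, and 43", [
  {"\\d+", &String.to_integer/1},
  {"[\\s,]+", fn _ -> nil end},
  {"\\w+", &String.to_atom/1}
])
# → {:ok, [42, nil, :and, nil, 43]}
```

Note how "42" is claimed by the first pattern before the catch-all word pattern ever sees it; reordering the list changes the result, which is why the list order expresses priority.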