Stripper v1.0.0 Stripper.Whitespace
This module exists for dealing with whitespace. A space is a space is a space, right? Wrong. Many Unicode characters represent whitespace: tabs, newlines, carriage returns, form feeds, and a slew of lesser-known characters that are technically distinct but could all be referred to as "space".
Sometimes, too many distinctions are a bad thing. A human might be able to read text peppered with a dozen different space characters, but some processes cannot. This module offers functions that strip away all the nonsense and leave bare the simple spaces as nature intended.
Summary
Functions
normalize/1
The normalize/1 function works the same way as the normalize!/1 function, but it returns its output as an :ok tuple.
normalize!/1
Strip out any redundant spaces or other whitespace characters and normalize them to simple spaces (i.e. " "). Multiple spaces are collapsed down to one space. Newlines, carriage returns, tabs, form feeds, and other whitespace characters are all replaced with a regular space character.
Functions
normalize/1
The normalize/1 function works the same way as the normalize!/1 function, but it returns its output as an :ok tuple.
It is a convenience function, provided so that the module follows the idiomatic Elixir pairing of a tuple-returning function with a bang (!) variant.
Usage Examples
iex> normalize("a \t\tbunch\n of \f nonsense\n")
{:ok, "a bunch of nonsense"}
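The relationship between the two functions can be sketched roughly as follows. This is a minimal illustration, not Stripper's actual source: the module name WhitespaceSketch is hypothetical, and normalize!/1 is filled in with the regex-based equivalent given in the normalize!/1 documentation.

```elixir
defmodule WhitespaceSketch do
  # Hypothetical stand-in for Stripper.Whitespace, for illustration only.

  # Bang variant: returns the cleaned string directly.
  # Assumption: behaves like the regex-based equivalent from the docs.
  def normalize!(string) when is_binary(string) do
    ~r/\s+/u
    |> Regex.replace(string, " ")
    |> String.trim()
  end

  # Tuple variant: same result, wrapped in an :ok tuple.
  def normalize(string) when is_binary(string) do
    {:ok, normalize!(string)}
  end
end

# The tagged tuple slots into `with` or `case` pipelines.
with {:ok, clean} <- WhitespaceSketch.normalize(" trim me please ") do
  IO.puts(clean)
end
```

Because the result is a tagged tuple, the function can sit alongside other {:ok, value} steps in a with chain without special handling.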
normalize!/1
Strip out any redundant spaces or other whitespace characters and normalize them to simple spaces (i.e. " "). Multiple spaces are collapsed down to one space. Newlines, carriage returns, tabs, form feeds, and other whitespace characters are all replaced with a regular space character.
Functionally, this is equivalent to something like the following:
iex> value = "your value here"
iex> String.trim(Regex.replace(~r/\s+/u, value, " "))
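The u modifier on the pattern above is what lets \s match the lesser-known Unicode space characters. The following sketch compares the same pattern with and without it; the variable names are illustrative only, and the comparison reflects how Elixir's Regex module is documented to behave, not anything specific to Stripper.

```elixir
# Thin space, punctuation space, and em space around two words.
input = "\u2009unicode\u2008space\u2003"

# Without `u`, the pattern is matched byte by byte, so the multi-byte
# space characters in this input are left untouched.
bytewise = Regex.replace(~r/\s+/, input, " ")

# With `u`, \s also matches Unicode whitespace, so each run is
# replaced by a single regular space.
unicode_aware = Regex.replace(~r/\s+/u, input, " ")

IO.inspect(bytewise == input, label: "unchanged without u")
IO.inspect(String.trim(unicode_aware), label: "with u, trimmed")
```

Dropping the modifier would therefore leave exactly the characters this module is meant to clean up.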
Examples
iex> normalize!("a \t\tbunch\n of \f nonsense\n")
"a bunch of nonsense"
iex> normalize!(" trim me please ")
"trim me please"
iex> normalize!("foo\n\n\nbar")
"foo bar"
iex> normalize!("\u2009unicode\u2008space\u2003")
"unicode space"