Unicode Set v0.7.0 Unicode.Regex
Preprocesses a binary regular expression to expand any
Unicode Sets it contains. The expanded sets are interpolated back into
the regular expression, which is then compiled with
Regex.compile/2 or Regex.compile!/2.
Summary
Functions
compile/2: Compiles a binary regular expression after interpolating any Unicode Sets.
compile!/2: Compiles a binary regular expression after interpolating any Unicode Sets, raising an exception on failure.
Functions
Compiles a binary regular expression after interpolating any Unicode Sets.
Arguments
- string is a regular expression in string form.
- options is a string or a list which is passed unchanged to Regex.compile/2. The default is "u", meaning the regular expression will operate in Unicode mode.
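Because options is handed to Regex.compile/2 unchanged, both forms that function accepts should be usable here. The sketch below uses only the standard library's Regex.compile/2 (not Unicode.Regex itself) to show the string form and the list form side by side; the pattern and input are illustrative.

```elixir
# Options as a string of modifier letters: "i" is caseless, "u" is Unicode mode.
{:ok, from_string} = Regex.compile("é", "iu")

# The same options expressed as a list of atoms.
{:ok, from_list} = Regex.compile("é", [:caseless, :unicode])

# In Unicode mode, caseless matching also applies to non-ASCII letters.
Regex.match?(from_string, "É")
Regex.match?(from_list, "É")
```

If options are supplied explicitly, "u" (or :unicode) presumably still needs to be included to keep Unicode mode, since the default is replaced rather than merged.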
Returns
- {:ok, regex} or
- {:error, {exception, message}}
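The two return shapes can be handled with ordinary pattern matching. The following is a minimal sketch: CompileResultSketch and matches?/2 are hypothetical names for illustration, not part of Unicode.Regex, and the module only assumes the tuple shapes listed above.

```elixir
defmodule CompileResultSketch do
  # Hypothetical helper, not part of Unicode.Regex.
  # On success, runs the compiled regex against the input.
  def matches?({:ok, regex}, input) do
    {:ok, Regex.match?(regex, input)}
  end

  # On failure, folds the exception module and message into one string.
  def matches?({:error, {exception, message}}, _input) do
    {:error, "#{inspect(exception)}: #{message}"}
  end
end

# With the unicode_set package available, usage might look like:
#   "[:Zs:]" |> Unicode.Regex.compile() |> CompileResultSketch.matches?(" ")
```

Keeping the exception module in the error tuple lets a caller decide whether to raise, log, or return the message.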
Notes
This function operates by splitting the string at the boundaries of Unicode Set markers which are:
- Posix style: [: and :]
- Perl style: \p{ and }
This parsing is naive, meaning that it does not take any character escaping into account when splitting the string.
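The effect of marker-based splitting can be sketched with the standard library alone. This is not the library's actual implementation: NaiveSplitSketch, its split/1 function, and the marker regex are assumptions made for illustration.

```elixir
defmodule NaiveSplitSketch do
  # Hypothetical helper for illustration; not the library's implementation.
  # Matches a Posix-style pair ([: ... :]) or a Perl-style pair (\p{ ... }).
  defp markers, do: ~r/\[:.*?:\]|\\p\{.*?\}/

  # Splits the pattern at marker boundaries, keeping the marked sets themselves.
  def split(string) do
    Regex.split(markers(), string, include_captures: true, trim: true)
  end
end

NaiveSplitSketch.split("^[:Zs:]+\\p{Lu}$")
# => ["^", "[:Zs:]", "+", "\\p{Lu}", "$"]

# An escaped opening bracket is not recognised as escaped,
# so the set is still split out and the backslash is left behind:
NaiveSplitSketch.split("\\[:Zs:]")
# => ["\\", "[:Zs:]"]
```

The second call shows why escaping matters: a pattern meant to match a literal "[:" would still be treated as containing a Unicode Set under this kind of splitting.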
Example
iex> Unicode.Regex.compile("[:Zs:]")
{:ok, ~r/[\x{20}\x{A0}\x{1680}\x{2000}-\x{200A}\x{202F}\x{205F}\x{3000}]/u}
iex> Unicode.Regex.compile("\\p{Zs}")
{:ok, ~r/[\x{20}\x{A0}\x{1680}\x{2000}-\x{200A}\x{202F}\x{205F}\x{3000}]/u}
iex> Unicode.Regex.compile("[:ZZZZ:]")
{:error, {Unicode.Set.ParseError,
"Unable to parse \"[:ZZZZ:]\". The unicode script, category or property \"zzzz\" is not known."}}
compile!/2
Compiles a binary regular expression after interpolating any Unicode Sets, raising an exception on failure.
Arguments
- string is a regular expression in string form.
- options is a string or a list which is passed unchanged to Regex.compile/2. The default is "u", meaning the regular expression will operate in Unicode mode.
Returns
- regex or
- raises an exception
Example
iex> Unicode.Regex.compile!("[:Zs:]")
~r/[\x{20}\x{A0}\x{1680}\x{2000}-\x{200A}\x{202F}\x{205F}\x{3000}]/u