Unicode Set v0.7.0 Unicode.Regex

Preprocesses a binary regular expression to expand any Unicode Sets it contains. The expanded sets are interpolated back into the regular expression, which is then compiled with Regex.compile/2 or Regex.compile!/2.

Summary

Functions

compile(string, options \\ "u")

Compiles a binary regular expression after interpolating any Unicode Sets.

compile!(string, opts \\ "u")

Compiles a binary regular expression after interpolating any Unicode Sets.

Functions

compile(string, options \\ "u")

Compiles a binary regular expression after interpolating any Unicode Sets.

Arguments

  • string is a regular expression in string form

  • options is a string or a list that is passed unchanged to Regex.compile/2. The default is "u", meaning the regular expression operates in Unicode mode.

Returns

  • {:ok, regex} or

  • {:error, {exception, message}}
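
A minimal sketch of handling both return shapes listed above. It assumes the unicode_set package is available as a dependency; the module CompileResultSketch and its describe/2 function are hypothetical names used only for illustration.

```elixir
# Assumes the unicode_set package (which provides Unicode.Regex) is installed.
defmodule CompileResultSketch do
  # Hypothetical wrapper, not part of the library: turns the documented
  # return values of Unicode.Regex.compile/2 into a simple summary.
  def describe(pattern, options \\ "u") do
    case Unicode.Regex.compile(pattern, options) do
      {:ok, regex} ->
        # Regex.source/1 returns the expanded pattern as a string.
        {:ok, Regex.source(regex)}

      {:error, {exception, message}} ->
        # The error tuple carries an exception module and a message.
        {:error, "#{inspect(exception)}: #{message}"}
    end
  end
end

CompileResultSketch.describe("[:Zs:]")
CompileResultSketch.describe("[:ZZZZ:]")
```

Matching on the tagged tuple keeps an unknown set name from crashing the caller, which may be preferable when the pattern comes from user input.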

Notes

This function operates by splitting the string at the boundaries of Unicode Set markers which are:

  • POSIX style: [: and :]
  • Perl style: \p{ and }

This parsing is naive, meaning that it does not take any character escaping into account when splitting the string.
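
The consequence of naive splitting can be sketched in plain Elixir. The code below is not the library's implementation: NaiveSplitSketch and its marker pattern are assumptions made for illustration, showing only how a splitter that ignores escaping treats an escaped bracket as the start of a set.

```elixir
defmodule NaiveSplitSketch do
  # Hypothetical helper, not part of Unicode.Regex. It matches the two
  # marker styles described above: [: ... :] and \p{ ... }.
  defp set_marker, do: ~r/\[:.*?:\]|\\p\{.*?\}/

  # Splits a pattern into set markers and the text between them,
  # without checking whether a marker character was escaped.
  def split(pattern) when is_binary(pattern) do
    Regex.split(set_marker(), pattern, include_captures: true, trim: true)
  end
end

NaiveSplitSketch.split("^[:Zs:]+foo\\p{Lu}$")
# → ["^", "[:Zs:]", "+foo", "\\p{Lu}", "$"]

# The opening bracket is escaped here, yet the sketch still sees a set marker.
NaiveSplitSketch.split("\\[:Zs:]")
# → ["\\", "[:Zs:]"]
```

In the second call the backslash is left behind as ordinary text and `[:Zs:]` is still picked out as a set, which is the kind of surprise the note above warns about.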

Example

iex> Unicode.Regex.compile("[:Zs:]")
{:ok, ~r/[\x{20}\x{A0}\x{1680}\x{2000}-\x{200A}\x{202F}\x{205F}\x{3000}]/u}

iex> Unicode.Regex.compile("\\p{Zs}")
{:ok, ~r/[\x{20}\x{A0}\x{1680}\x{2000}-\x{200A}\x{202F}\x{205F}\x{3000}]/u}

iex> Unicode.Regex.compile("[:ZZZZ:]")
{:error, {Unicode.Set.ParseError,
  "Unable to parse \"[:ZZZZ:]\". The unicode script, category or property \"zzzz\" is not known."}}

compile!(string, opts \\ "u")

Compiles a binary regular expression after interpolating any Unicode Sets.

Arguments

  • string is a regular expression in string form.

  • opts is a string or a list that is passed unchanged to Regex.compile/2. The default is "u", meaning the regular expression operates in Unicode mode.

Returns

  • regex or

  • raises an exception

Example

iex> Unicode.Regex.compile!("[:Zs:]")
~r/[\x{20}\x{A0}\x{1680}\x{2000}-\x{200A}\x{202F}\x{205F}\x{3000}]/u
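
As a usage sketch, the compiled regex can be passed to standard library functions such as String.split/2. This assumes the unicode_set package is installed; the sample input string is made up for illustration.

```elixir
# Assumes the unicode_set package (which provides Unicode.Regex) is installed.
regex = Unicode.Regex.compile!("[:Zs:]")

# U+00A0 (no-break space) and U+0020 (space) both appear in the
# expanded Zs set shown in the example above.
String.split("a\u00A0b c", regex)
# → ["a", "b", "c"]

# compile!/2 raises on an unknown set, so rescue when the pattern is untrusted.
result =
  try do
    Unicode.Regex.compile!("[:ZZZZ:]")
  rescue
    error -> {:error, Exception.message(error)}
  end
```

When the pattern is a literal known to be valid, compile!/2 keeps the call site short; for patterns supplied at runtime, compile/2 and its tagged tuples avoid the need for rescue.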