unicode v1.0.0 Unicode
Provides functionality to efficiently check properties of Unicode codepoints, graphemes and strings.
The current implementation is based on Unicode version 8.0.0.
Summary
Functions
alphabetic?/1
Checks whether a single Unicode codepoint (or every codepoint in the given binary string) adheres to the Derived Core Property Alphabetic.
alphanumeric?/1
True for alphanumeric characters, but much more performant than an :alnum: regexp checking the same thing.
lowercase?/1
Checks whether a single Unicode codepoint (or every codepoint in the given binary string) adheres to the Derived Core Property Lowercase.
math?/1
Checks whether a single Unicode codepoint (or every codepoint in the given binary string) adheres to the Derived Core Property Math.
numeric?/1
True for the digits [0-9], but much more performant than a regexp checking the same thing.
uppercase?/1
Checks whether a single Unicode codepoint (or every codepoint in the given binary string) adheres to the Derived Core Property Uppercase.
Functions
alphabetic?/1
Checks whether a single Unicode codepoint (or every codepoint in the given binary string) adheres to the Derived Core Property Alphabetic.
These are all characters that are usually used as representations of letters or syllables in words and sentences. The function takes a Unicode codepoint or a string as input.
For the string version, the result is true only if all codepoints in the string adhere to the property.
Examples
iex>Unicode.alphabetic?(?a)
true
iex>Unicode.alphabetic?("A")
true
iex>Unicode.alphabetic?("Elixir")
true
iex>Unicode.alphabetic?("الإكسير")
true
iex>Unicode.alphabetic?("foo, bar") # comma and whitespace
false
iex>Unicode.alphabetic?("42")
false
iex>Unicode.alphabetic?("龍王")
true
iex>Unicode.alphabetic?("∑") # Summation, ∑
false
iex>Unicode.alphabetic?("Σ") # Greek capital letter sigma, Σ
true
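The "all codepoints must adhere" rule for string input can be sketched with the standard library alone. The module `AllCodepointsSketch`, its helper `all_codepoints?/2`, and the stand-in predicate `ascii_letter?/1` are hypothetical names used for illustration only. They are not part of the Unicode module, and the source does not say how the library implements this internally.

```elixir
defmodule AllCodepointsSketch do
  # Hypothetical helper (not part of the Unicode library): lifts a
  # codepoint-level predicate to a string-level check.
  # Note: this sketch returns true for "" because Enum.all?/2 is true on an
  # empty list; the library's behaviour for empty strings is not stated here.
  def all_codepoints?(string, predicate)
      when is_binary(string) and is_function(predicate, 1) do
    string
    |> String.to_charlist()
    |> Enum.all?(predicate)
  end

  # Stand-in predicate for illustration only: ASCII letters, which is far
  # narrower than the Alphabetic property.
  def ascii_letter?(codepoint) when is_integer(codepoint) do
    codepoint in ?a..?z or codepoint in ?A..?Z
  end
end

AllCodepointsSketch.all_codepoints?("Elixir", &AllCodepointsSketch.ascii_letter?/1)
# => true
AllCodepointsSketch.all_codepoints?("foo, bar", &AllCodepointsSketch.ascii_letter?/1)
# => false (the comma and the space fail the predicate)
```

A single failing codepoint is enough to make the whole string fail, which is why "foo, bar" is rejected in the examples above.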
alphanumeric?/1
True for alphanumeric characters, but much more performant than an :alnum: regexp checking the same thing.
Returns true if Unicode.alphabetic?(x) or Unicode.numeric?(x).
Derived from http://www.unicode.org/reports/tr18/#alnum
Examples
iex> Unicode.alphanumeric? "1234"
true
iex> Unicode.alphanumeric? "KeyserSöze1995"
true
iex> Unicode.alphanumeric? "3段"
true
iex> Unicode.alphanumeric? "dragon@example.com"
false
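The "alphabetic or numeric" composition described for alphanumeric?/1 can be sketched as follows. `AlnumSketch` and its functions are hypothetical names. The letter check approximates the Alphabetic property with the general category `\p{L}`, which is narrower than Alphabetic, so treat this as an illustration of the structure rather than the library's implementation.

```elixir
defmodule AlnumSketch do
  # Approximation (assumption): general category L stands in for the
  # Alphabetic property, which covers more characters than \p{L}.
  def letter?(cp) when is_integer(cp) do
    Regex.match?(~r/\A\p{L}\z/u, <<cp::utf8>>)
  end

  # The digits [0-9], as described for the numeric check.
  def digit?(cp) when is_integer(cp), do: cp in ?0..?9

  # A codepoint is alphanumeric if either predicate holds.
  def alphanumeric?(cp) when is_integer(cp), do: letter?(cp) or digit?(cp)

  # String version: every codepoint must pass (true for "" in this sketch).
  def alphanumeric?(string) when is_binary(string) do
    string
    |> String.to_charlist()
    |> Enum.all?(&alphanumeric?/1)
  end
end

AlnumSketch.alphanumeric?("1234")               # => true
AlnumSketch.alphanumeric?("3段")                 # => true
AlnumSketch.alphanumeric?("dragon@example.com") # => false ("@" and "." fail)
```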
lowercase?/1
Checks whether a single Unicode codepoint (or every codepoint in the given binary string) adheres to the Derived Core Property Lowercase.
Note that many writing systems do not distinguish between cases. Their characters are not included in this group.
The function takes a Unicode codepoint or a string as input.
For the string version, the result is true only if all codepoints in the string adhere to the property.
Examples
iex>Unicode.lowercase?(?a)
true
iex>Unicode.lowercase?("A")
false
iex>Unicode.lowercase?("Elixir")
false
iex>Unicode.lowercase?("léon")
true
iex>Unicode.lowercase?("foo, bar")
false
iex>Unicode.lowercase?("42")
false
iex>Unicode.lowercase?("Σ")
false
iex>Unicode.lowercase?("σ")
true
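The remark about caseless characters is easy to miss, so here is a small sketch contrasting two notions of "lowercase". `CaseSketch` and its functions are hypothetical names. The second check approximates the Lowercase property with the general category `\p{Ll}`, which is an assumption made for illustration and slightly narrower than the derived property.

```elixir
defmodule CaseSketch do
  # Naive notion: "downcasing changes nothing". This is also true for
  # digits and for scripts without case, which is usually not what is meant.
  def unchanged_by_downcase?(string) when is_binary(string) do
    String.downcase(string) == string
  end

  # Property-style notion (approximated with general category Ll):
  # every codepoint must itself be a lowercase letter.
  def lowercase_letters?(string) when is_binary(string) do
    Regex.match?(~r/\A\p{Ll}+\z/u, string)
  end
end

CaseSketch.unchanged_by_downcase?("42")   # => true
CaseSketch.lowercase_letters?("42")       # => false
CaseSketch.unchanged_by_downcase?("龍王") # => true
CaseSketch.lowercase_letters?("龍王")     # => false
CaseSketch.lowercase_letters?("σ")        # => true
```

The property-style check matches the documented results for "42" and for caseless text: both are rejected because their characters are not in the lowercase group at all.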
math?/1
Checks whether a single Unicode codepoint (or every codepoint in the given binary string) adheres to the Derived Core Property Math.
These are all characters whose primary usage is in mathematical concepts (and not in alphabets).
Note that the numerical digits are not part of this group. Use Unicode.numeric?/1 instead.
The function takes a Unicode codepoint or a string as input.
For the string version, the result is true only if all codepoints in the string adhere to the property.
Examples
iex>Unicode.math?(?=)
true
iex>Unicode.math?("=")
true
iex>Unicode.math?("1+1=2") # Note that digits themselves are not part of `Math`.
false
iex>Unicode.math?("परिस")
false
iex>Unicode.math?("∑") # Summation, ∑
true
iex>Unicode.math?("Σ") # Greek capital letter sigma, Σ
false
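To see why a string such as "1+1=2" is rejected, it can help to list the codepoints that fail the check. The sketch below uses the hypothetical module `MathSketch` and approximates the Math property with the general category `\p{Sm}` (math symbols). That approximation is an assumption for illustration and covers fewer characters than the derived property.

```elixir
defmodule MathSketch do
  # Approximation (assumption): general category Sm stands in for the
  # Math property, which includes additional characters.
  def math_symbol?(cp) when is_integer(cp) do
    Regex.match?(~r/\A\p{Sm}\z/u, <<cp::utf8>>)
  end

  # Returns the codepoints that would make an all-codepoints check fail.
  def offending(string) when is_binary(string) do
    string
    |> String.to_charlist()
    |> Enum.reject(&math_symbol?/1)
  end
end

MathSketch.offending("1+1=2")  # => [?1, ?1, ?2], the digits are not math symbols
MathSketch.offending("\u2211") # => [], U+2211 (summation) passes
MathSketch.offending("\u03A3") # => [0x03A3], Greek capital sigma is a letter
```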
numeric?/1
True for the digits [0-9], but much more performant than a regexp checking the same thing.
Derived from http://www.unicode.org/reports/tr18/#digit
Examples
iex> Unicode.numeric?("65535")
true
iex> Unicode.numeric?("42")
true
iex> Unicode.numeric?("lapis philosophorum")
false
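One common way to check for the digits [0-9] without involving the regex engine is binary pattern matching in function heads. The sketch below shows that technique next to the equivalent regexp. `DigitSketch` and its functions are hypothetical names, and the source does not state that the library is implemented this way. Treating the empty string as false is also an assumption of this sketch.

```elixir
defmodule DigitSketch do
  # Assumption: an empty string counts as "not all digits".
  def ascii_digits?(""), do: false
  def ascii_digits?(string) when is_binary(string), do: do_digits?(string)

  # Consume one byte at a time; continue while each byte is an ASCII digit.
  defp do_digits?(<<c, rest::binary>>) when c in ?0..?9, do: do_digits?(rest)
  # Reached the end without meeting a non-digit.
  defp do_digits?(""), do: true
  # Any other leading byte fails the check.
  defp do_digits?(_other), do: false

  # The regexp that checks the same thing, for comparison.
  def ascii_digits_regex?(string) when is_binary(string) do
    Regex.match?(~r/\A[0-9]+\z/, string)
  end
end

DigitSketch.ascii_digits?("65535")                     # => true
DigitSketch.ascii_digits?("lapis philosophorum")       # => false
DigitSketch.ascii_digits_regex?("65535")               # => true
DigitSketch.ascii_digits_regex?("lapis philosophorum") # => false
```

Both functions agree on these inputs. Which one is faster in practice depends on the runtime and input, so measure before relying on either.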
uppercase?/1
Checks whether a single Unicode codepoint (or every codepoint in the given binary string) adheres to the Derived Core Property Uppercase.
Note that many writing systems do not distinguish between cases. Their characters are not included in this group.
The function takes a Unicode codepoint or a string as input.
For the string version, the result is true only if all codepoints in the string adhere to the property.
Examples
iex>Unicode.uppercase?(?a)
false
iex>Unicode.uppercase?("A")
true
iex>Unicode.uppercase?("Elixir")
false
iex>Unicode.uppercase?("CAMEMBERT")
true
iex>Unicode.uppercase?("foo, bar")
false
iex>Unicode.uppercase?("42")
false
iex>Unicode.uppercase?("Σ")
true
iex>Unicode.uppercase?("σ")
false