Unicode v1.5.0 Unicode.GeneralCategory.Derived

Certain operations and transformations (especially in Unicode Sets) expect some derived general categories to exist even though they are not defined in the Unicode Character Database.

These categories are:

  • :any which is the full Unicode codepoint range 0x0..0x10ffff

  • :assigned which is the set of codepoints that are assigned and is therefore equivalent to :any - :Cn. That is exactly how it is calculated, using unicode_set, and the results are statically copied here so that there is no mutual dependency.

  • :ascii which is the US ASCII character set range 0x0..0x7f
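The set arithmetic behind :assigned can be sketched in a few lines of Elixir. This is a minimal illustration only, not the library's implementation: the DerivedSketch module, the representation of a category as a sorted list of {first, last} tuples, and the toy_cn data are all assumptions made for the example (the real :Cn set is far larger).

```elixir
defmodule DerivedSketch do
  # Hypothetical helper module, not part of the Unicode library.
  # A category is assumed to be a sorted list of non-overlapping
  # {first, last} codepoint tuples.

  @any [{0x0, 0x10FFFF}]
  @ascii [{0x0, 0x7F}]

  def any, do: @any
  def ascii, do: @ascii

  # Removes every range in `remove` from the ranges in `keep`.
  def difference(keep, remove) do
    Enum.reduce(remove, keep, fn removed, acc ->
      Enum.flat_map(acc, fn range -> subtract(range, removed) end)
    end)
  end

  # True when the codepoint falls inside one of the ranges.
  def member?(ranges, codepoint) do
    Enum.any?(ranges, fn {first, last} ->
      codepoint >= first and codepoint <= last
    end)
  end

  # No overlap: the range is kept unchanged.
  defp subtract({f, l}, {rf, rl}) when rl < f or rf > l, do: [{f, l}]

  # Overlap: keep whatever lies to the left and right of the removed range.
  defp subtract({f, l}, {rf, rl}) do
    left = if rf > f, do: [{f, rf - 1}], else: []
    right = if rl < l, do: [{rl + 1, l}], else: []
    left ++ right
  end
end

# Toy stand-in for :Cn, chosen for illustration only (not real Unicode data).
toy_cn = [{0x10, 0x1F}, {0x100, 0x1FF}]

# :assigned is :any minus :Cn.
assigned = DerivedSketch.difference(DerivedSketch.any(), toy_cn)
IO.inspect(assigned)
# => [{0, 15}, {32, 255}, {512, 1114111}]
```

Storing the computed result as a literal, as the description above says the library does, means the set difference never has to run at compile time or run time.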

In addition there are derived categories not part of the Unicode specification that support additional use cases. These include:

  • Categories related to recognising quotation marks. See the module Unicode.Category.QuoteMarks.

  • :printable which implements the same semantics as String.printable?/1

  • :visible which includes the characters in the set [[:L:][:N:][:M:][:P:][:S:][:Zs:]]
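The semantics of the last two categories can be approximated with the Elixir standard library alone. The sketch below is not how this module computes them: String.printable?/1 is the standard function named above, while the visible regex is an assumption that stands in for the set expression by using the regex engine's own Unicode property tables, which may differ in Unicode version from this library.

```elixir
# :printable follows String.printable?/1.
IO.inspect(String.printable?("abc"))
# => true
IO.inspect(String.printable?("abc" <> <<0>>))
# => false

# Approximation of :visible: letters, numbers, marks, punctuation,
# symbols and space separators. `visible` is a name made up for this sketch.
visible = ~r/^[\p{L}\p{N}\p{M}\p{P}\p{S}\p{Zs}]+$/u

IO.inspect(Regex.match?(visible, "héllo wörld"))
# => true

# A tab is a control character (Cc), so it falls outside the set.
IO.inspect(Regex.match?(visible, "tab\there"))
# => false
```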

Summary

Functions

Returns a map of the aliases for the derived General Categories

Returns a map of the derived General Categories

Functions

Returns a map of the aliases for the derived General Categories

categories()

categories() :: map()

Returns a map of the derived General Categories
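A short usage sketch, assuming the Unicode library documented here is available as a project dependency. Only the map() return type comes from the spec above; the shape of the keys and values is not described in this section, so the example inspects them rather than assuming a format.

```elixir
# Fetch the derived categories. Per the spec this is a map.
derived = Unicode.GeneralCategory.Derived.categories()
true = is_map(derived)

# List the category names without assuming anything about the values.
derived
|> Map.keys()
|> IO.inspect(label: "derived categories")
```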