Cldr.Collation.Numeric (Cldr Collation v1.1.0)


Numeric collation support (kn=true / numeric=true).

When enabled, each run of decimal digits is compared by its numeric value at the primary level, so that "file2" sorts before "file10".

The numeric value is encoded in the primary weights as a length prefix followed by the significant digits, most significant first (big-endian).
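To illustrate the problem that numeric collation addresses, the following sketch compares a plain binary sort with a sort that treats the trailing digits as a number. The `key` function is a hypothetical stand-in written for this example; it is not part of Cldr.Collation and it converts the digits to an integer rather than using the length-prefixed encoding described above.

```elixir
names = ["file10", "file2", "file1"]

# Plain binary comparison looks at "1" versus "2" and puts "file10" first.
Enum.sort(names)
# => ["file1", "file10", "file2"]

# Hypothetical key: split off the trailing digit run and compare it as an integer.
key = fn name ->
  [_full, prefix, digits] = Regex.run(~r/^(\D*)(\d+)$/, name)
  {prefix, String.to_integer(digits)}
end

Enum.sort_by(names, key)
# => ["file1", "file2", "file10"]
```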

Functions

encode_numeric_value(codepoints)

Encode a sequence of digit codepoints as numeric collation elements.

Follows ICU's approach: converts each digit codepoint to its numeric value, strips leading zeros, then emits a length-prefix CE followed by one CE per remaining digit.

Arguments

  • codepoints - a list of integer codepoints representing decimal digits.

Returns

A list of %Cldr.Collation.Element{} structs: one length-prefix CE followed by one CE per significant digit.

Examples

iex> result = Cldr.Collation.Numeric.encode_numeric_value([0x31, 0x30])
iex> length(result)
3
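The shape of the result (one length prefix plus one entry per significant digit) can be sketched without the library. `NumericSketch` below is a hypothetical module for illustration only: it returns plain integers instead of `%Cldr.Collation.Element{}` structs, it assumes ASCII digits, and the real primary weights will differ.

```elixir
defmodule NumericSketch do
  # Hypothetical module for illustration; not part of Cldr.Collation.
  # Assumes every codepoint is an ASCII digit (?0..?9).
  def encode(codepoints) do
    digits =
      codepoints
      |> Enum.map(&(&1 - ?0))          # digit codepoint -> numeric value
      |> Enum.drop_while(&(&1 == 0))   # strip leading zeros

    # Length prefix first, then one entry per significant digit.
    [length(digits) | digits]
  end
end

NumericSketch.encode([0x31, 0x30])
# => [2, 1, 0]

NumericSketch.encode([0x30, 0x30, 0x32])
# => [1, 2]

# The length prefix makes the shorter number compare lower.
NumericSketch.encode(~c"2") < NumericSketch.encode(~c"10")
# => true
```

Because the prefix is compared first, a number with fewer significant digits always sorts before one with more, and numbers of equal length fall back to digit-by-digit comparison. In this sketch an all-zero input yields `[0]`; how the library itself represents zero is not covered here.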

process_elements(ce_pairs)

@spec process_elements([{[non_neg_integer()], [Cldr.Collation.Element.t()]}]) :: [
  Cldr.Collation.Element.t()
]

Process codepoint/element pairs, replacing digit sequence CEs with numeric-value-based CEs.

Groups consecutive decimal digit codepoints into runs and replaces their collation elements with length-prefixed numeric encodings so that "2" sorts before "10".

Arguments

  • ce_pairs - a list of {codepoints, [%Cldr.Collation.Element{}]} pairs.

Returns

A flat list of %Cldr.Collation.Element{} structs with digit sequences replaced by numeric collation elements.

Examples

iex> pairs = [{[0x31], [{0x21E7, 0x0020, 0x0002, false}]}, {[0x30], [{0x21E6, 0x0020, 0x0002, false}]}]
iex> result = Cldr.Collation.Numeric.process_elements(pairs)
iex> length(result)
3
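The grouping step described above can be approximated with `Enum.chunk_by/2`. This is a minimal sketch under stated assumptions: `DigitRunSketch` is a hypothetical module, only ASCII digits are recognised, and placeholder atoms stand in for collation elements. It shows how consecutive digit pairs collapse into a single run, not how the library builds the replacement CEs.

```elixir
defmodule DigitRunSketch do
  # Hypothetical module for illustration; not part of Cldr.Collation.
  # Simplification: only ASCII digits count as decimal digits here.
  def digit?(cp), do: cp in ?0..?9

  # Takes {codepoints, elements} pairs and merges consecutive digit pairs
  # into one {:digits, codepoints} run; other pairs keep their elements.
  def group(pairs) do
    pairs
    |> Enum.chunk_by(fn {cps, _ces} -> Enum.all?(cps, &digit?/1) end)
    |> Enum.map(fn [{cps, _ces} | _rest] = chunk ->
      if Enum.all?(cps, &digit?/1) do
        {:digits, Enum.flat_map(chunk, fn {c, _ces} -> c end)}
      else
        {:other, Enum.flat_map(chunk, fn {_cps, ces} -> ces end)}
      end
    end)
  end
end

pairs = [{[?a], [:ce_a]}, {[?1], [:ce_1]}, {[?0], [:ce_0]}, {[?b], [:ce_b]}]

DigitRunSketch.group(pairs) ==
  [{:other, [:ce_a]}, {:digits, [?1, ?0]}, {:other, [:ce_b]}]
# => true
```

Each `{:digits, run}` entry is the point at which the original elements for the run would be discarded and replaced by a numeric encoding of the whole run.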