Script reordering for collation (kr= / reorder option).
Remaps primary weights to change the relative order of scripts.
For example, reorder: [:Grek, :Latn] would sort Greek characters
before Latin characters.
Script boundaries are determined from the fractional lead bytes in FractionalUCA.txt, which cleanly partition scripts. Because the integer primary weights in the CLDR allkeys data interleave scripts within a lead byte, each primary weight is looked up individually to identify its script before the reorder permutation is applied.
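The lead-byte permutation idea can be sketched as follows. This is a minimal illustration, not the library's implementation: the module ReorderSketch, the function lead_byte_permutation/2, and the byte ranges are all hypothetical, and the sketch assumes the reordered scripts occupy one contiguous run of lead bytes.

```elixir
defmodule ReorderSketch do
  # Hypothetical lead-byte ranges; real values come from FractionalUCA.txt.
  @ranges %{"latn" => {0x2A, 0x2D}, "grek" => {0x2E, 0x2F}}

  # Lay the scripts out in the requested order, starting at the lowest
  # lead byte any of them occupied, and record old_byte => new_byte.
  def lead_byte_permutation(order, ranges \\ @ranges) do
    first =
      order
      |> Enum.map(fn script -> elem(Map.fetch!(ranges, script), 0) end)
      |> Enum.min()

    {mapping, _next} =
      Enum.reduce(order, {%{}, first}, fn script, {acc, next} ->
        {lo, hi} = Map.fetch!(ranges, script)

        # Pair each old lead byte with its new position.
        acc =
          lo..hi
          |> Enum.with_index(next)
          |> Enum.reduce(acc, fn {old, new}, m -> Map.put(m, old, new) end)

        {acc, next + (hi - lo + 1)}
      end)

    mapping
  end
end
```

With these sample ranges, ReorderSketch.lead_byte_permutation(["grek", "latn"]) moves the two Greek lead bytes to the front of the run and shifts the four Latin lead bytes up by two.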
Functions
@spec apply_mapping((non_neg_integer() -> non_neg_integer()) | nil, non_neg_integer()) :: non_neg_integer()
Apply a reorder mapping to a primary weight.
Arguments
- mapping_fn - a reorder mapping function from build_mapping/1, or nil.
- primary - the primary weight to remap.
Returns
The remapped primary weight, or the original if mapping_fn is nil.
Examples
iex> Cldr.Collation.Reorder.apply_mapping(nil, 0x2A00)
0x2A00
iex> mapping = Cldr.Collation.Reorder.build_mapping([:Grek, :Latn])
iex> remapped = Cldr.Collation.Reorder.apply_mapping(mapping, 0x2A00)
iex> is_integer(remapped)
true
@spec build_mapping([atom()]) :: (non_neg_integer() -> non_neg_integer()) | nil
Build a reorder mapping function from the given script codes.
Creates a function that remaps primary weights to reorder scripts. Core codes (space, punct, symbol, currency, digit) that are not explicitly listed are prepended automatically.
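The automatic prepending of core codes could look roughly like the sketch below. The module CoreCodesSketch and the function with_core_codes/1 are hypothetical names, and the order in which missing core codes are prepended is an assumption rather than something the documentation states.

```elixir
defmodule CoreCodesSketch do
  # Core codes named in the documentation; their order here is assumed.
  @core [:space, :punct, :symbol, :currency, :digit]

  # Prepend every core code the caller did not list explicitly,
  # leaving the caller's own ordering untouched.
  def with_core_codes(codes) do
    missing = Enum.reject(@core, fn code -> code in codes end)
    missing ++ codes
  end
end
```

Under these assumptions, with_core_codes([:Grek, :Latn]) yields the five core codes followed by :Grek and :Latn, while a list that already names :digit keeps it where the caller placed it.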
Arguments
- reorder_codes - a list of script code atoms (e.g., [:Grek, :Latn]). Supports ISO 15924 codes (:Latn, :Grek, :Cyrl) and special codes (:space, :punct, :symbol, :currency, :digit, :others).
Returns
- A function (primary :: integer()) -> integer() that remaps primary weights.
- nil if the list is empty or no valid mappings were found.
Examples
iex> Cldr.Collation.Reorder.build_mapping([])
nil
iex> mapping = Cldr.Collation.Reorder.build_mapping([:Grek, :Latn])
iex> is_function(mapping, 1)
true
@spec load_primary_to_fractional_lead() :: %{ required(integer() | {:sub, integer()}) => non_neg_integer() }
Load a mapping from allkeys integer primary weights to their fractional lead bytes and sub-bytes.
Parses FractionalUCA.txt data lines to extract both the fractional CE (which gives the lead byte and sub-byte) and the allkeys integer primary weight (from the comment portion).
The returned map has two types of entries:
- primary_weight => fractional_lead_byte - the script-identifying lead byte.
- {:sub, primary_weight} => fractional_sub_byte - the within-script sub-byte for preserving relative ordering during remapping.
Returns
A map %{integer() | {:sub, integer()} => non_neg_integer()}.
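Because a single map holds both kinds of entries, a caller reads the lead byte under the plain integer key and the sub-byte under the {:sub, primary} key. The helper below is a hypothetical sketch (FractionalLookupSketch and lead_and_sub/2 are not part of the library), and the sample values are made up to match the documented shape.

```elixir
defmodule FractionalLookupSketch do
  # Returns {lead_byte, sub_byte} for an allkeys primary weight,
  # or :error when either entry is missing from the map.
  def lead_and_sub(map, primary) do
    with {:ok, lead} <- Map.fetch(map, primary),
         {:ok, sub} <- Map.fetch(map, {:sub, primary}) do
      {lead, sub}
    end
  end
end

# Hypothetical sample with the same shape as the loaded map.
sample = %{0x2A00 => 0x2E, {:sub, 0x2A00} => 0x04}
FractionalLookupSketch.lead_and_sub(sample, 0x2A00)
# => {0x2E, 0x04}
```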
@spec load_script_ranges() :: %{ required(String.t()) => {non_neg_integer(), non_neg_integer()} }
Load the script-to-lead-byte-range mapping from FractionalUCA.txt.
Parses [top_byte ...] entries from the data file. Falls back to
hardcoded defaults if the file is not found.
Returns
A map %{String.t() => {start_byte, end_byte}} where keys are lowercase
script/group names and values are fractional lead byte range tuples.
Examples
iex> ranges = Cldr.Collation.Reorder.load_script_ranges()
iex> is_map(ranges)
true
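Given a map of this shape, finding which script or group owns a fractional lead byte is a range search. The following is a sketch only: ScriptRangeSketch and script_for/2 are hypothetical names, and the ranges in the usage lines are invented sample values rather than data from FractionalUCA.txt.

```elixir
defmodule ScriptRangeSketch do
  # Returns the name whose {start_byte, end_byte} range contains
  # lead_byte, or nil when no range matches.
  def script_for(ranges, lead_byte) do
    Enum.find_value(ranges, fn {name, {lo, hi}} ->
      if lead_byte >= lo and lead_byte <= hi, do: name
    end)
  end
end

# Hypothetical ranges with the documented shape.
ranges = %{"latn" => {0x2A, 0x2D}, "grek" => {0x2E, 0x2F}}
ScriptRangeSketch.script_for(ranges, 0x2E)
# => "grek"
```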