Unicode.Unihan (Unicode v0.1.0)
Functions to introspect the Unicode Unihan character database.
Summary
Functions
Filter the Unihan database returning selected codepoints.
Filter the Unihan database returning selected codepoints that are not rejected by the provided function.
Takes an integer codepoint, a Unihan codepoint map, or list of maps and returns the grapheme (or list of graphemes) of the codepoint.
Returns the Unihan database as a mapping of a codepoint to its metadata.
Returns the Unihan database metadata for a given codepoint.
Returns the field information for the data in the Unihan database.
Functions
Filter the Unihan database returning selected codepoints.
Arguments
- fun is a 1-arity function that is passed the attribute map for a given codepoint. If the function returns a truthy value then the codepoint is included in the returned data. If the return value is falsy then the codepoint is omitted from the returned data.
Returns
- a map of the filtered codepoints mapped to their attributes.
Example
iex> Unicode.Unihan.filter(&(&1.kTotalStrokes[:"Hans"] > 30))
...> |> Enum.count()
238
iex> Unicode.Unihan.filter(&(&1.kTotalStrokes[:"Hans"] != &1.kTotalStrokes[:"Hant"]))
...> |> Enum.count
3
iex> Unicode.Unihan.filter(&(&1[:kGradeLevel] <= 6))
...> |> Enum.count
2632
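The selection rule described above can be mimicked on a small in-memory map. This is a minimal sketch, not the library's implementation: the module UnihanFilterSketch, its toy_db/0 function and every attribute value in it are invented for illustration, and it assumes the database is a plain map of codepoint to attribute map.

```elixir
defmodule UnihanFilterSketch do
  # Toy stand-in for the database: codepoint => attribute map.
  # Codepoints and values are invented for illustration only.
  def toy_db do
    %{
      1 => %{codepoint: 1, kTotalStrokes: %{Hans: 8, Hant: 8}, kGradeLevel: 2},
      2 => %{codepoint: 2, kTotalStrokes: %{Hans: 31, Hant: 32}},
      3 => %{codepoint: 3, kTotalStrokes: %{Hans: 12, Hant: 12}, kGradeLevel: 7}
    }
  end

  # Keep the entries whose attribute map makes `fun` return a truthy value.
  def filter(db, fun) when is_function(fun, 1) do
    db
    |> Enum.filter(fn {_codepoint, attrs} -> fun.(attrs) end)
    |> Map.new()
  end
end

db = UnihanFilterSketch.toy_db()

many_strokes = UnihanFilterSketch.filter(db, &(&1.kTotalStrokes[:Hans] > 30))
graded = UnihanFilterSketch.filter(db, &(&1[:kGradeLevel] <= 6))

IO.inspect(Map.keys(many_strokes))
IO.inspect(Map.keys(graded))
```

In this toy data only entry 2 has more than 30 strokes, and only entry 1 has a grade level of 6 or less. Entry 2 has no kGradeLevel at all, so the predicate compares nil with 6. In Elixir's term ordering atoms sort after numbers, so nil <= 6 is false and the entry is left out. The reverse comparison, nil > 30, is true, so a predicate that reads a field some entries lack may select more than intended.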
Filter the Unihan database returning selected codepoints that are not rejected by the provided function.
Arguments
- fun is a 1-arity function that is passed the attribute map for a given codepoint. If the function returns a falsy value then the codepoint is included in the returned data. If the return value is truthy then the codepoint is omitted from the returned data.
Returns
- a map of the codepoints that are not rejected mapped to their attributes.
Example
iex> Unicode.Unihan.reject(&(&1.kTotalStrokes[:"Hans"] > 30))
...> |> Enum.count()
97822
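Rejecting is the complement of filtering: for the same function, every entry ends up in exactly one of the two results. The following is a minimal sketch of that relationship on invented toy data. UnihanRejectSketch is a hypothetical module, not part of the library, and it assumes the database is a plain map of codepoint to attribute map.

```elixir
defmodule UnihanRejectSketch do
  # Keep the entries whose attribute map makes `fun` return a falsy value.
  def reject(db, fun) when is_function(fun, 1) do
    db
    |> Enum.reject(fn {_codepoint, attrs} -> fun.(attrs) end)
    |> Map.new()
  end
end

# Invented toy data standing in for the real database.
db = %{
  1 => %{codepoint: 1, kTotalStrokes: %{Hans: 8, Hant: 8}},
  2 => %{codepoint: 2, kTotalStrokes: %{Hans: 31, Hant: 32}},
  3 => %{codepoint: 3, kTotalStrokes: %{Hans: 12, Hant: 12}}
}

over_30 = &(&1.kTotalStrokes[:Hans] > 30)

# Entries that survive rejection, and entries the same function would select.
remaining = UnihanRejectSketch.reject(db, over_30)
selected = db |> Enum.filter(fn {_cp, attrs} -> over_30.(attrs) end) |> Map.new()

IO.inspect(Map.keys(remaining))
IO.inspect(map_size(selected) + map_size(remaining) == map_size(db))
```

The counts in the documented examples fit this pattern: the same predicate selects 238 entries with filter and leaves 97822 with reject, 98060 in total, which is what one would expect if both calls partition the same database.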
Takes an integer codepoint, a Unihan codepoint map, or list of maps and returns the grapheme (or list of graphemes) of the codepoint.
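As a rough illustration of the three accepted input shapes, the conversion could be sketched with Elixir's UTF-8 binary syntax. UnihanGraphemeSketch and grapheme/1 are hypothetical names, not the library's code, and the sketch assumes a codepoint map carries its integer codepoint under the :codepoint key. The values 25342 and 14192 are the codepoints that appear in this page's examples.

```elixir
defmodule UnihanGraphemeSketch do
  # An integer codepoint is encoded as a UTF-8 binary.
  def grapheme(codepoint) when is_integer(codepoint), do: <<codepoint::utf8>>

  # A map is assumed to carry its codepoint under the :codepoint key.
  def grapheme(%{codepoint: codepoint}), do: grapheme(codepoint)

  # A list is converted element by element.
  def grapheme(list) when is_list(list), do: Enum.map(list, &grapheme/1)
end

single = UnihanGraphemeSketch.grapheme(25342)
several = UnihanGraphemeSketch.grapheme([%{codepoint: 25342}, %{codepoint: 14192}])

IO.inspect(single)
IO.inspect(several)
```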
Examples
iex> Unicode.Unihan.to_string(25342)
"拾"
iex> Unicode.Unihan.unihan("拾")
...> |> Unicode.Unihan.to_string()
"拾"
Returns the Unihan database as a mapping of a codepoint to its metadata.
Returns the Unihan database metadata for a given codepoint.
The codepoint can be expressed as an integer or a grapheme.
Examples
iex> Unicode.Unihan.unihan(171339)
%{
  codepoint: 171339,
  kCantonese: %{coda: "", final: "u", jyutping: "ju4", nucleus: "u", onset: "j", tone: "4"},
  kDefinition: ["(J) nonstandard variant of 魚 U+9B5A, fish"],
  kHanYu: %{page: 4674, position: 9, virtual: false, volume: 7},
  kIRGHanyuDaZidian: %{page: 4674, position: 9, virtual: false, volume: 7},
  kIRGKangXi: %{page: 1465, position: 1, virtual: true},
  kIRG_GSource: %{mapping: ["74674.09"], source: "GHZ"},
  kIRG_TSource: %{mapping: "3043", source: "T4"},
  kIRG_VSource: %{mapping: "29D4B", source: "VN"},
  kJapaneseKun: ["UO", "SAKANA", "SUNADORU"],
  kJapaneseOn: "GYO",
  kKangXi: %{page: 1465, position: 1, virtual: true},
  kNelson: 692,
  kPhonetic: %{class: 1605},
  kRSAdobe_Japan1_6: [
    %{cid: 13717, code: "C", kangxi: 195, strokes_radical: 10, strokes_residue: 0},
    %{cid: 13718, code: "V", kangxi: 195, strokes_radical: 10, strokes_residue: 0}
  ],
  kRSKangXi: %{radical: 195, strokes: 0},
  kRSUnicode: %{radical: 195, simplified_radical: false, strokes: 0},
  kTotalStrokes: %{Hans: 11, Hant: 11}
}
iex> Unicode.Unihan.unihan("㝰")
%{
  codepoint: 14192,
  kCangjie: ["J", "H", "U", "S"],
  kCantonese: %{coda: "n", final: "in", jyutping: "min4", nucleus: "i", onset: "m", tone: "4"},
  kDefinition: ["unable to meet, empty room"],
  kHanYu: %{page: 957, position: 3, virtual: false, volume: 2},
  kHanyuPinyin: %{location: [%{page: 20957, position: 3, virtual: false}], readings: ["mián"]},
  kIRGHanyuDaZidian: %{page: 957, position: 3, virtual: false, volume: 2},
  kIRGKangXi: %{page: 293, position: 1, virtual: false},
  kIRG_GSource: %{mapping: ["3E3C"], source: "G5"},
  kIRG_KSource: %{mapping: "236A", source: "K3"},
  kIRG_TSource: %{mapping: "5A7D", source: "T4"},
  kKangXi: %{page: 293, position: 1, virtual: false},
  kMandarin: "mián",
  kRSUnicode: %{radical: 40, simplified_radical: false, strokes: 15},
  kSBGY: %{page: 135, position: 35},
  kTotalStrokes: %{Hans: 18, Hant: 18}
}
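The returned value is an ordinary nested map, so standard map functions apply to it. The sketch below works on a hand-copied subset of the map shown above for codepoint 14192 rather than calling the library. The to_codepoint helper is hypothetical and only illustrates one way an "integer or grapheme" argument could be normalised to an integer.

```elixir
# Subset of the attribute map shown above for codepoint 14192.
entry = %{
  codepoint: 14192,
  kMandarin: "mián",
  kRSUnicode: %{radical: 40, simplified_radical: false, strokes: 15},
  kTotalStrokes: %{Hans: 18, Hant: 18}
}

# Nested fields can be read with get_in/2.
hans_strokes = get_in(entry, [:kTotalStrokes, :Hans])
radical = get_in(entry, [:kRSUnicode, :radical])

# A field that is absent from the map reads as nil.
grade = get_in(entry, [:kGradeLevel])

# Hypothetical helper: accept an integer as is, or decode a
# single-codepoint UTF-8 binary into its integer codepoint.
to_codepoint = fn
  codepoint when is_integer(codepoint) -> codepoint
  <<codepoint::utf8>> -> codepoint
end

IO.inspect({hans_strokes, radical, grade})
IO.inspect(to_codepoint.("㝰"))
```

Not every entry carries every field, as the two maps above show (for example, only the second has kMandarin), so code that reads a field may need to handle nil.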
Returns the field information for the data in the Unihan database.