Language tag utilities used across the package.
Every function in text that takes a "language" option accepts:
- an atom (:fr, :zh),
- a string ("fr", "fr-CA", "zh-Hans-CN"), or
- a Localize.LanguageTag struct, when the optional localize dependency is available.
This module provides the normalisation helpers that unify those shapes so the call sites remain simple.
normalize/1 — to a language-subtag atom
Most internal lookups (sentiment lexicons, classifier outputs, …)
key on the bare ISO 639-1 language subtag. normalize/1 extracts
that subtag from any of the accepted shapes:
iex> Text.Language.normalize(:fr)
:fr
iex> Text.Language.normalize("fr-CA")
:fr
iex> Text.Language.normalize("ZH-Hans-CN")
:zh
to_locale_string/1 — to a BCP-47 string
Some downstream APIs (CLDR-aware tokenisation, locale-aware
formatting) want the full BCP-47 form. to_locale_string/1 produces
a normalised string suitable for passing to unicode_string,
localize, etc.
iex> Text.Language.to_locale_string(:fr)
"fr"
iex> Text.Language.to_locale_string("fr_CA")
"fr-CA"
Types
Anything normalize/1 and to_locale_string/1 accept.
When :localize is available, also includes Localize.LanguageTag
structs.
Functions
normalize(input)
Returns the language subtag of input as a lowercase atom.
Arguments
- input is one of the accepted shapes: atom, string, or (when :localize is loaded) a Localize.LanguageTag struct.
Returns
- An atom: the language subtag of the input (e.g. :fr for "fr-CA" or a LanguageTag whose language is :fr).
Examples
iex> Text.Language.normalize(:fr)
:fr
iex> Text.Language.normalize("fr-CA")
:fr
iex> Text.Language.normalize("FR")
:fr
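The atom and string cases above can be sketched as follows. This is a minimal illustration, not the package's implementation: the module name NormalizeSketch is hypothetical, Localize.LanguageTag structs are left out, and splitting on _ as well as - is an assumption borrowed from to_locale_string/1.

```elixir
defmodule NormalizeSketch do
  # Hypothetical stand-in for illustration only; not the real Text.Language.

  # Atoms are converted to strings and handled by the string clause.
  def normalize(input) when is_atom(input) do
    input |> Atom.to_string() |> normalize()
  end

  # Strings: keep everything before the first separator, then lowercase it.
  # Accepting "_" as a separator is an assumption, not documented behaviour.
  def normalize(input) when is_binary(input) do
    input
    |> String.split(["-", "_"], parts: 2)
    |> hd()
    |> String.downcase()
    # String.to_atom/1 creates new atoms; for untrusted input a lookup
    # against a fixed list of known subtags would be safer.
    |> String.to_atom()
  end
end
```

Routing atoms through the string clause keeps the lowercasing and splitting logic in one place.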
to_locale_string(input)
Returns a normalised BCP-47 locale string for input.
Splits on _ (Java-style separator) as well as - and joins the
subtags with -. The language subtag is lowercased; subsequent
subtags are passed through unchanged. For a Localize.LanguageTag
the canonical id is used when present, otherwise the
language/script/territory triple is composed.
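The split-and-join behaviour described above might look roughly like this for atoms and strings. It is a sketch under stated assumptions: LocaleStringSketch is a hypothetical module name, and the Localize.LanguageTag branch is omitted because its fields are not shown here.

```elixir
defmodule LocaleStringSketch do
  # Hypothetical stand-in for illustration only; not the real Text.Language.

  # Atoms are converted to strings and handled by the string clause.
  def to_locale_string(input) when is_atom(input) do
    input |> Atom.to_string() |> to_locale_string()
  end

  def to_locale_string(input) when is_binary(input) do
    # Split on both "-" and "_"; the first subtag is the language.
    [language | rest] = String.split(input, ["-", "_"])

    # Lowercase only the language subtag; the rest pass through unchanged.
    Enum.join([String.downcase(language) | rest], "-")
  end
end
```

Because only the first subtag is lowercased, the script and territory subtags keep whatever case the caller supplied.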
Arguments
- input is one of the accepted shapes.
Returns
- A String.t/0.
Examples
iex> Text.Language.to_locale_string(:fr)
"fr"
iex> Text.Language.to_locale_string("fr_CA")
"fr-CA"
iex> Text.Language.to_locale_string("ZH-Hans-CN")
"zh-Hans-CN"