Lingua (lingua v0.1.0)
Lingua wraps Peter M. Stahl's lingua-rs language detection library. This wrapper follows the lingua-rs API closely, so consult the lingua-rs documentation for more information.
Summary
Functions
Detect the language of the given input text. By default, all supported languages are considered and the minimum relative distance is 0.0.
Like detect, but returns the result value or raises an error.
Initialize the detector. Calling this is optional, but it is useful when you want lingua-rs to load the language corpora up front so that subsequent calls to detect are fast. The first time the detector runs, it can take some time to load (~12 seconds on my MacBook Pro).
Functions
detect(text, options \\ [])
Specs
Detect the language of the given input text. By default, all supported languages are considered and the minimum relative distance is 0.0.
Returns the detected language, or a list of languages and their confidence values, or :no_match if the given text doesn't match a language.
Options:
- builder_option: can be one of the following (defaults to :all_languages):
  - :all_languages - consider every supported language
  - :all_spoken_languages - consider only currently spoken languages
  - :all_languages_with_arabic_script - consider only languages written in Arabic script
  - :all_languages_with_cyrillic_script - consider only languages written in Cyrillic script
  - :all_languages_with_devanagari_script - consider only languages written in Devanagari script
  - :all_languages_with_latin_script - consider only languages written in Latin script
  - :with_languages - consider only the languages supplied in the languages option. Two or more are required. (see below)
  - :without_languages - consider all languages except those supplied in the languages option. Two or more are required. (see below)
- languages: specify two or more languages to consider or to exclude, depending on the builder_option (defaults to [])
- with_minimum_relative_distance: specify the minimum relative distance (0.0 - 0.99) required for a language to be considered a match for the input. See the lingua-rs documentation for details. (defaults to 0.0)
- compute_language_confidence_values: return the full list of language matches for the input and their confidence values (defaults to false)
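The constraints in the option list (two or more languages for :with_languages and :without_languages, and a minimum relative distance between 0.0 and 0.99) can be checked before calling detect. The following is a minimal sketch only: LinguaOptions and validate/1 are hypothetical names, not part of Lingua, and the error atoms are made up for illustration. How Lingua itself reacts to invalid options is not covered here.

```elixir
defmodule LinguaOptions do
  # Hypothetical pre-flight check; NOT part of the Lingua API.
  # It only mirrors the constraints stated in the option list above.

  # Builder options that rely on the :languages option.
  @needs_languages [:with_languages, :without_languages]

  def validate(options) do
    # Defaults match the documented ones.
    builder = Keyword.get(options, :builder_option, :all_languages)
    languages = Keyword.get(options, :languages, [])
    distance = Keyword.get(options, :with_minimum_relative_distance, 0.0)

    cond do
      # Two or more languages are required for these builder options.
      builder in @needs_languages and length(languages) < 2 ->
        {:error, :too_few_languages}

      # The documented range is 0.0 - 0.99.
      not is_number(distance) or distance < 0.0 or distance > 0.99 ->
        {:error, :invalid_minimum_relative_distance}

      true ->
        {:ok, options}
    end
  end
end
```

On success the options come back unchanged, so the result can be pattern matched and passed straight to Lingua.detect/2.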
Example
iex> Lingua.detect("this is definitely English")
{:ok, :english}
iex> Lingua.detect("וזה בעברית")
{:ok, :hebrew}
iex> Lingua.detect("państwowych", builder_option: :with_languages, languages: [:english, :russian, :polish])
{:ok, :polish}
iex> Lingua.detect("ѕидови", builder_option: :all_languages_with_cyrillic_script)
{:ok, :macedonian}
iex> Lingua.detect("כלב", builder_option: :with_languages, languages: [:english, :russian, :polish])
{:ok, :no_match}
iex> Lingua.detect("what in the world is this", builder_option: :with_languages, languages: [:english, :russian, :hebrew], compute_language_confidence_values: true)
{:ok, [english: 1.0]}
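The examples above show three result shapes: a single language atom, :no_match, and a keyword list of confidence values. Callers that want one best guess regardless of the options used could normalize them roughly as follows. LinguaResult and best/1 are hypothetical names, not part of Lingua, and the {:error, reason} clause is an assumption about how failures are reported.

```elixir
defmodule LinguaResult do
  # Hypothetical helper; NOT part of the Lingua API. It reduces the result
  # shapes shown in the examples above to a single best-guess language.

  # No language matched the input.
  def best({:ok, :no_match}), do: :no_match

  # Default mode: a single language atom such as :english.
  def best({:ok, language}) when is_atom(language), do: {:ok, language}

  # Confidence mode with nothing in the list: treat it as no match.
  def best({:ok, []}), do: :no_match

  # Confidence mode: a keyword list like [english: 1.0]. Pick the entry
  # with the highest confidence value.
  def best({:ok, confidences}) when is_list(confidences) do
    {language, _confidence} = Enum.max_by(confidences, fn {_lang, conf} -> conf end)
    {:ok, language}
  end

  # Assumption: failures arrive as {:error, reason}; pass them through.
  def best({:error, _reason} = error), do: error
end
```

With this in place, a call such as `text |> Lingua.detect(options) |> LinguaResult.best()` would yield {:ok, language}, :no_match, or the error tuple.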
detect!(text, options \\ [])
Specs
Like detect, but returns the result value or raises an error.
init()
Specs
init() :: any()
Initialize the detector. Calling this is optional, but it is useful when you want lingua-rs to load the language corpora up front so that subsequent calls to detect are fast. The first time the detector runs, it can take some time to load (~12 seconds on my MacBook Pro).
Example
iex> Lingua.init()
:ok
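Since the initial load can take several seconds, it may help to measure it once at startup. The sketch below is one possible way to do that: LinguaWarmup and run/1 are hypothetical names, not part of Lingua. The init function is passed in as an argument, defaulting to Lingua.init/0, so the helper can also be exercised with a stub.

```elixir
defmodule LinguaWarmup do
  # Hypothetical helper; NOT part of the Lingua API. Runs a zero-arity init
  # function once and returns its result with the elapsed time in milliseconds.
  def run(init_fun \\ &Lingua.init/0) do
    # :timer.tc/1 returns {elapsed_microseconds, result_of_fun}.
    {micros, result} = :timer.tc(init_fun)
    {result, div(micros, 1000)}
  end
end
```

Given that Lingua.init() returns :ok in the example above, `LinguaWarmup.run()` would be expected to return a tuple of the form {:ok, elapsed_ms}. Calling it from your application's start callback, or from a background task, keeps the loading cost out of the first real call to detect.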