Lingua (lingua v0.1.0)

Lingua wraps Peter M. Stahl's lingua-rs language detection library. This wrapper follows the lingua-rs API closely, so consult the lingua-rs documentation for more information.

Summary

Functions

detect(text, options \\ [])

Detect the language of the given input text. By default, all supported languages will be considered and the minimum relative distance is 0.0.

detect!(text, options \\ [])

Like detect, but returns the result value directly or raises an error.

init()

Initialize the detector. Calling this is optional, but it is useful when you want lingua-rs to load the language corpora up front so that subsequent calls to detect are fast. The first time the detector runs, loading can take some time (~12 seconds on my MacBook Pro).

Functions

detect(text, options \\ [])

Specs

detect(any(), keyword()) :: any()

Detect the language of the given input text. By default, all supported languages will be considered and the minimum relative distance is 0.0.

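Returns an {:ok, result} tuple, where result is the detected language, a list of languages with their confidence values (when compute_language_confidence_values: is true), or :no_match if the given text doesn't match any of the considered languages.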

Options:

  • builder_option: - can be one of the following (defaults to :all_languages):

    • :all_languages - consider every supported language
    • :all_spoken_languages - consider only currently spoken languages
    • :all_languages_with_arabic_script - consider only languages written in Arabic script
    • :all_languages_with_cyrillic_script - consider only languages written in Cyrillic script
    • :all_languages_with_devanagari_script - consider only languages written in Devanagari script
    • :all_languages_with_latin_script - consider only languages written in Latin script
    • :with_languages - consider only the languages supplied in the languages option. Two or more are required. (see below)
    • :without_languages - consider all languages except those supplied in the languages option. Two or more are required. (see below)
  • languages: - the languages to consider or to exclude, depending on builder_option:. Two or more are required with :with_languages and :without_languages. (defaults to [])

  • with_minimum_relative_distance: - specify the minimum relative distance (0.0 - 0.99) required for a language to be considered a match for the input. See the lingua-rs documentation for details. (defaults to 0.0)

  • compute_language_confidence_values: - when true, return the full list of language matches for the input along with their confidence values instead of a single language. (defaults to false)
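
The constraints listed in the options above can be checked before calling detect. The following is a minimal sketch of such a check; DetectOptions and its error atoms are hypothetical names and not part of Lingua, and the sketch only encodes the two rules stated in this section (two or more languages for :with_languages and :without_languages, and a minimum relative distance between 0.0 and 0.99).

```elixir
defmodule DetectOptions do
  # Hypothetical pre-flight check, not part of Lingua. It only encodes the
  # constraints documented above and does not call the detector.
  @language_builders [:with_languages, :without_languages]

  def validate(options) do
    # Fall back to the documented defaults when an option is absent.
    builder = Keyword.get(options, :builder_option, :all_languages)
    languages = Keyword.get(options, :languages, [])
    distance = Keyword.get(options, :with_minimum_relative_distance, 0.0)

    cond do
      # :with_languages and :without_languages need two or more languages.
      builder in @language_builders and length(languages) < 2 ->
        {:error, :too_few_languages}

      # The documented range for the minimum relative distance is 0.0 - 0.99.
      not is_number(distance) or distance < 0.0 or distance > 0.99 ->
        {:error, :invalid_minimum_relative_distance}

      true ->
        :ok
    end
  end
end
```

A caller might run DetectOptions.validate(options) first and only pass the options on to Lingua.detect(text, options) when it returns :ok.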

Example

iex> Lingua.detect("this is definitely English")
{:ok, :english}

iex> Lingua.detect("וזה בעברית")
{:ok, :hebrew}

iex> Lingua.detect("państwowych", builder_option: :with_languages, languages: [:english, :russian, :polish])
{:ok, :polish}

iex> Lingua.detect("ѕидови", builder_option: :all_languages_with_cyrillic_script)
{:ok, :macedonian}

iex> Lingua.detect("כלב", builder_option: :with_languages, languages: [:english, :russian, :polish])
{:ok, :no_match}

iex> Lingua.detect("what in the world is this", builder_option: :with_languages, languages: [:english, :russian, :hebrew], compute_language_confidence_values: true)
{:ok, [english: 1.0]}
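
The examples above show three result shapes: a single language, :no_match, and a keyword list of confidence values. As a rough sketch, they can be told apart with pattern matching. DetectResult is a hypothetical helper, not part of Lingua, and it assumes the result shapes are exactly those shown in the examples.

```elixir
defmodule DetectResult do
  # Hypothetical helper, not part of Lingua: turns the result shapes shown
  # in the examples above into a human-readable string.

  # :no_match must be handled before the general atom clause below.
  def describe({:ok, :no_match}), do: "no matching language"

  # A single detected language, e.g. {:ok, :english}.
  def describe({:ok, language}) when is_atom(language), do: "detected #{language}"

  # A keyword list of confidence values, e.g. {:ok, [english: 1.0]}.
  def describe({:ok, confidences}) when is_list(confidences) do
    Enum.map_join(confidences, ", ", fn {language, confidence} ->
      "#{language}=#{confidence}"
    end)
  end

  # Anything else is reported rather than guessed at.
  def describe(other), do: "unexpected result: #{inspect(other)}"
end
```

For instance, the result of Lingua.detect("this is definitely English") could be piped straight into DetectResult.describe/1.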

detect!(text, options \\ [])

Specs

detect!(any(), keyword()) :: any()

Like detect, but returns the result value directly (rather than wrapped in an {:ok, result} tuple) or raises an error.
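
Because detect! raises instead of returning an error value, callers that prefer a fallback need to rescue. The sketch below is one way to do that. SafeDetect is a hypothetical helper, not part of Lingua; the detector is passed in as a function so the helper makes no assumption about Lingua itself, and since this page does not say which exception detect! raises, the sketch rescues every exception.

```elixir
defmodule SafeDetect do
  # Hypothetical helper, not part of Lingua: calls a raising detector
  # function and returns `default` if it raises. The exception type raised
  # by detect! is not documented here, so all exceptions are rescued.
  def language_or(text, default, detector) when is_function(detector, 1) do
    detector.(text)
  rescue
    _error -> default
  end
end
```

A call might look like SafeDetect.language_or(text, :unknown, &Lingua.detect!/1). Rescuing everything hides unrelated failures too, so a narrower rescue is preferable once the actual exception type is known.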

init()

Specs

init() :: any()

Initialize the detector. Calling this is optional, but it is useful when you want lingua-rs to load the language corpora up front so that subsequent calls to detect are fast. The first time the detector runs, loading can take some time (~12 seconds on my MacBook Pro).

Example

iex> Lingua.init()
:ok
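
Since the first load can take several seconds, one option is to run init in a separate process, for example at application start, so the caller is not blocked. The following is a minimal sketch using Task from the Elixir standard library. DetectorWarmup is a hypothetical name, and the sketch assumes that the corpora loaded by one process also benefit later detect calls made from other processes, which this page does not state.

```elixir
defmodule DetectorWarmup do
  # Hypothetical helper, not part of Lingua: runs the given zero-arity
  # initializer in a separate process so the caller is not blocked.
  def start(init_fun) when is_function(init_fun, 0) do
    Task.async(init_fun)
  end

  # Blocks until the warm-up has finished and returns the initializer's
  # result, waiting at most `timeout` milliseconds.
  def await(task, timeout \\ 60_000), do: Task.await(task, timeout)
end
```

Usage would be along the lines of task = DetectorWarmup.start(&Lingua.init/0), followed later by DetectorWarmup.await(task) at the point where detection must be ready.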