paasaa v0.2.2 Paasaa

Provides language detection functions.

Examples

iex> Paasaa.detect "Detect this!"
"eng"

Summary

Functions

all(str, options) - Detects a language. Returns a list of languages scored by probability.

detect(str, options) - Detects a language. Returns a string with an ISO 639-3 language code (e.g. "eng").

Types

options()
options() :: [min_length: integer, max_length: integer, whitelist: [String.t], blacklist: [String.t]]

result()
result() :: [{language :: String.t, score :: number}]

Functions

all(str, options \\ [min_length: 10, max_length: 2048, whitelist: [], blacklist: []])
all(str :: String.t, options) :: result

Detects a language. Returns a list of languages scored by probability.

Parameters

  • str - a text string
  • options - a keyword list with options, see detect/2 for details.

Examples

Detect language and limit results to 5:

iex> Paasaa.all("Detect this!") |> Enum.take(5)
[
  {"eng", 1.0},
  {"sco", 0.8668304668304668},
  {"nob", 0.6054054054054054},
  {"swe", 0.5921375921375922},
  {"nno", 0.5518427518427518}
]
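
The scored list returned by all/2 can be post-processed with ordinary Enum functions. As a minimal sketch, the hypothetical ScoreFilter module below (not part of Paasaa) keeps only the candidates whose score reaches a given threshold. The module name, the function name, and the 0.8 threshold in the usage comment are assumptions made for illustration.

```elixir
defmodule ScoreFilter do
  # Hypothetical helper, not part of Paasaa.
  # Takes a result() list of {language, score} tuples and keeps
  # the entries whose score is at least `threshold`.
  def above(results, threshold) when is_list(results) and is_number(threshold) do
    Enum.filter(results, fn {_language, score} -> score >= threshold end)
  end
end

# Possible usage, assuming Paasaa is available as a dependency:
#   "Detect this!" |> Paasaa.all() |> ScoreFilter.above(0.8)
```

Filtering on the score rather than taking a fixed number of entries avoids keeping weak candidates when only one language is a plausible match.
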
detect(str, options \\ [min_length: 10, max_length: 2048, whitelist: [], blacklist: []])
detect(str :: String.t, options) :: language :: String.t

Detects a language. Returns a string with an ISO 639-3 language code (e.g. "eng").

Parameters

  • str - a text string
  • options - a keyword list with options:

    • :min_length - Minimum length of text to analyze. If the text is shorter than :min_length, "und" is returned. Default: 10.
    • :max_length - Maximum length of text to analyze. Default: 2048.
    • :whitelist - Languages to allow, as ISO 639-3 codes. Default: [].
    • :blacklist - Languages to disallow, as ISO 639-3 codes. Default: [].
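
Because detect/2 signals an undetermined result with the plain string "und", callers may prefer to turn that sentinel into a tagged value before pattern matching on it. The sketch below shows one way to do so. LanguageResult and normalize/1 are hypothetical names introduced here for illustration and are not part of Paasaa.

```elixir
defmodule LanguageResult do
  # Hypothetical helper, not part of Paasaa.
  # Converts the string returned by detect/2 into a tagged value:
  # "und" becomes :error, any other code becomes {:ok, code}.
  def normalize("und"), do: :error
  def normalize(code) when is_binary(code), do: {:ok, code}
end

# Possible usage, assuming Paasaa is available as a dependency:
#   "Detect this!" |> Paasaa.detect() |> LanguageResult.normalize()
```
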

Examples

Detect a string:

iex> Paasaa.detect "Detect this!"
"eng"

With the :blacklist option:

iex> Paasaa.detect "Detect this!", blacklist: ["eng"]
"sco"

With the :min_length option:

iex> Paasaa.detect "Привет", min_length: 6
"rus"

It returns "und" when the language cannot be determined:

iex> Paasaa.detect "1234567890"
"und"