mecab v1.0.0 Mecab

Elixir bindings for MeCab, a Japanese morphological analyzer.

Each parser function returns a list of maps. The meaning of each map key is as follows.

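  • surface_form: surface form (表層形)
  • part_of_speech: part of speech (品詞)
  • part_of_speech_subcategory1: part-of-speech subcategory 1 (品詞細分類1)
  • part_of_speech_subcategory2: part-of-speech subcategory 2 (品詞細分類2)
  • part_of_speech_subcategory3: part-of-speech subcategory 3 (品詞細分類3)
  • conjugation_form: conjugation form (活用形)
  • conjugation: conjugation type (活用型)
  • lexical_form: base form (原形)
  • yomi: reading (読み)
  • pronunciation: pronunciation (発音)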
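
As a small illustration of how these keys are read from a single result, the sketch below uses one token map copied from the `Mecab.parse/2` example on this page. It assumes the keys are strings, as that example output shows; the variable names are only illustrative.

```elixir
# A single token map, copied from the Mecab.parse/2 example output.
token = %{
  "surface_form" => "です",
  "part_of_speech" => "助動詞",
  "part_of_speech_subcategory1" => "",
  "part_of_speech_subcategory2" => "",
  "part_of_speech_subcategory3" => "",
  "conjugation_form" => "特殊・デス",
  "conjugation" => "基本形",
  "lexical_form" => "です",
  "yomi" => "デス",
  "pronunciation" => "デス"
}

# The keys are strings, so use Access syntax or pattern matching.
# The `token.surface_form` dot syntax only works for atom keys.
surface = token["surface_form"]
%{"part_of_speech" => pos, "lexical_form" => lemma} = token

IO.puts("#{surface} (#{pos}), lexical form: #{lemma}")
```

Fields that do not apply to a token appear as empty strings in the example output, so a check such as `token["conjugation"] == ""` may be more useful than testing for `nil`.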

Note: To avoid ambiguity, this documentation refers to this module as “Mecab” and to the mecab command as either “MeCab” or “mecab”.

Summary

Functions

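parse(str, option \\ [])

Parses the given string and returns a list of maps

read(path, option \\ [])

Parses the given file and returns {:ok, maps} or {:error, reason}

read!(path, option \\ [])

Parses the given file and returns a list of maps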

Functions

parse(str, option \\ [])
parse(String.t, Keyword.t) :: [Map.t, ...]

Parses the given string and returns a list of maps.

Options can also be supplied:

  • :mecab_option — specify MeCab options
    (e.g. "-d /usr/local/lib/mecab/dic/ipadic")

Examples

iex> Mecab.parse("今日は晴れです")
[%{"conjugation" => "",
   "conjugation_form" => "",
   "lexical_form" => "今日",
   "part_of_speech" => "名詞",
   "part_of_speech_subcategory1" => "副詞可能",
   "part_of_speech_subcategory2" => "",
   "part_of_speech_subcategory3" => "",
   "pronunciation" => "キョー",
   "surface_form" => "今日",
   "yomi" => "キョウ"},
 %{"conjugation" => "",
   "conjugation_form" => "",
   "lexical_form" => "は",
   "part_of_speech" => "助詞",
   "part_of_speech_subcategory1" => "係助詞",
   "part_of_speech_subcategory2" => "",
   "part_of_speech_subcategory3" => "",
   "pronunciation" => "ワ",
   "surface_form" => "は",
   "yomi" => "ハ"},
 %{"conjugation" => "",
   "conjugation_form" => "",
   "lexical_form" => "晴れ",
   "part_of_speech" => "名詞",
   "part_of_speech_subcategory1" => "一般",
   "part_of_speech_subcategory2" => "",
   "part_of_speech_subcategory3" => "",
   "pronunciation" => "ハレ",
   "surface_form" => "晴れ",
   "yomi" => "ハレ"},
 %{"conjugation" => "基本形",
   "conjugation_form" => "特殊・デス",
   "lexical_form" => "です",
   "part_of_speech" => "助動詞",
   "part_of_speech_subcategory1" => "",
   "part_of_speech_subcategory2" => "",
   "part_of_speech_subcategory3" => "",
   "pronunciation" => "デス",
   "surface_form" => "です",
   "yomi" => "デス"},
 %{"conjugation" => "",
   "conjugation_form" => "",
   "lexical_form" => "",
   "part_of_speech" => "",
   "part_of_speech_subcategory1" => "",
   "part_of_speech_subcategory2" => "",
   "part_of_speech_subcategory3" => "",
   "pronunciation" => "",
   "surface_form" => "EOS",
   "yomi" => ""}]
read(path, option \\ [])
read(String.t, Keyword.t) ::
  {:ok, [Map.t, ...]} |
  {:error, String.t}

Parses the given file and returns {:ok, maps} on success or {:error, reason} on failure.

This function does the same as Mecab.parse(path, from_file: true) and additionally checks whether the given path exists.

Options can also be supplied:

  • :mecab_option — specify MeCab options
    (e.g. "-d /usr/local/lib/mecab/dic/ipadic")

Examples

iex> File.read!("sample.txt")
"今日は晴れです。\n明日は雨でしょう。\n"

iex> Mecab.read("sample.txt")
{:ok,
 [%{"surface_form" => "今日", "part_of_speech" => "名詞", ...},
  %{"surface_form" => "は", "part_of_speech" => "助詞", ...},
  %{"surface_form" => "晴れ", "part_of_speech" => "名詞", ...},
  %{"surface_form" => "です", "part_of_speech" => "助動詞", ...},
  %{"surface_form" => "。", "part_of_speech" => "記号", ...},
  %{"surface_form" => "EOS", ...},
  %{"surface_form" => "明日", "part_of_speech" => "名詞", ...},
  %{"surface_form" => "は", "part_of_speech" => "助詞", ...},
  %{"surface_form" => "雨", "part_of_speech" => "名詞", ...},
  %{"surface_form" => "でしょ", "part_of_speech" => "助動詞", ...},
  %{"surface_form" => "う", "part_of_speech" => "助動詞", ...},
  %{"surface_form" => "。", "part_of_speech" => "記号", ...},
  %{"surface_form" => "EOS", ...}]}

iex> Mecab.read("not_found.txt")
{:error, "no such a file or directory: not_found.txt"}
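
Because `read/2` returns a tagged tuple, callers would typically branch on it with pattern matching. The sketch below shows one way to do that; `ReadResult.summarize/1` is a hypothetical helper, and the two sample values are stand-ins shaped like the results above, so the snippet does not need MeCab or `sample.txt`.

```elixir
defmodule ReadResult do
  # Hypothetical helper for illustration; not part of the Mecab module.
  # Accepts a value shaped like the return of Mecab.read/2 and turns it
  # into a short human-readable summary.
  def summarize({:ok, tokens}) when is_list(tokens) do
    "parsed #{length(tokens)} tokens"
  end

  def summarize({:error, reason}) when is_binary(reason) do
    "could not parse: #{reason}"
  end
end

# Stand-in values; in real use the argument would be the result of
# Mecab.read("sample.txt").
ok_result = {:ok, [%{"surface_form" => "今日"}, %{"surface_form" => "EOS"}]}
error_result = {:error, "no such a file or directory: not_found.txt"}

IO.puts(ReadResult.summarize(ok_result))
IO.puts(ReadResult.summarize(error_result))
```
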
read!(path, option \\ [])
read!(String.t, Keyword.t) :: [Map.t, ...]

Parses the given file and returns a list of maps.

This function does the same as Mecab.parse(path, from_file: true) and additionally checks whether the given path exists.

Options can also be supplied:

  • :mecab_option — specify MeCab options
    (e.g. "-d /usr/local/lib/mecab/dic/ipadic")

Examples

iex> File.read!("sample.txt")
"今日は晴れです。\n明日は雨でしょう。\n"

iex> Mecab.read!("sample.txt")
[%{"surface_form" => "今日", "part_of_speech" => "名詞", ...},
 %{"surface_form" => "は", "part_of_speech" => "助詞", ...},
 %{"surface_form" => "晴れ", "part_of_speech" => "名詞", ...},
 %{"surface_form" => "です", "part_of_speech" => "助動詞", ...},
 %{"surface_form" => "。", "part_of_speech" => "記号", ...},
 %{"surface_form" => "EOS", ...},
 %{"surface_form" => "明日", "part_of_speech" => "名詞", ...},
 %{"surface_form" => "は", "part_of_speech" => "助詞", ...},
 %{"surface_form" => "雨", "part_of_speech" => "名詞", ...},
 %{"surface_form" => "でしょ", "part_of_speech" => "助動詞", ...},
 %{"surface_form" => "う", "part_of_speech" => "助動詞", ...},
 %{"surface_form" => "。", "part_of_speech" => "記号", ...},
 %{"surface_form" => "EOS", ...}]
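
In the output above, each input line ends with an "EOS" entry, so the flat list can be regrouped into one list per sentence. This is a minimal sketch under that assumption; `Sentences.split/1` is a hypothetical helper, not part of the Mecab module, and the sample data is a trimmed stand-in for real output.

```elixir
defmodule Sentences do
  # Hypothetical helper for illustration; not part of the Mecab module.
  # Splits a flat token list into one list per sentence, treating every
  # token whose "surface_form" is "EOS" as a separator.
  # Caveat: a literal word "EOS" in the text would also act as a separator.
  def split(tokens) do
    chunk_fun = fn token, acc ->
      if token["surface_form"] == "EOS" do
        # Emit the sentence collected so far and start a new one.
        {:cont, Enum.reverse(acc), []}
      else
        {:cont, [token | acc]}
      end
    end

    after_fun = fn
      # Nothing left after the last separator.
      [] -> {:cont, []}
      # Trailing tokens without a closing separator form a last sentence.
      acc -> {:cont, Enum.reverse(acc), []}
    end

    Enum.chunk_while(tokens, [], chunk_fun, after_fun)
  end
end

# Trimmed stand-in for a list shaped like the Mecab.read!/2 result.
tokens = [
  %{"surface_form" => "今日"},
  %{"surface_form" => "は"},
  %{"surface_form" => "EOS"},
  %{"surface_form" => "明日"},
  %{"surface_form" => "EOS"}
]

surfaces =
  tokens
  |> Sentences.split()
  |> Enum.map(fn sentence ->
    Enum.map(sentence, fn token -> token["surface_form"] end)
  end)

IO.inspect(surfaces)
```
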