mecab v0.1.0 Mecab
Elixir bindings for MeCab, a morphological analyzer.
Each parser function returns a list of maps. The keys of each map have the following meanings.
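- surface_form: surface form (表層形), the token as it appears in the input
- part_of_speech: part of speech (品詞)
- part_of_speech_subcategory1: part-of-speech subcategory 1 (品詞細分類1)
- part_of_speech_subcategory2: part-of-speech subcategory 2 (品詞細分類2)
- part_of_speech_subcategory3: part-of-speech subcategory 3 (品詞細分類3)
- conjugation_form: conjugation form (活用形)
- conjugation: conjugation type (活用型)
- lexical_form: base form (原形)
- yomi: reading (読み)
- pronunciation: pronunciation (発音)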
Note: To keep the two apart, we call this module “Mecab” and the mecab command either “MeCab” or “mecab”.
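As a minimal sketch of reading these fields: the example output of Mecab.parse in this document shows the keys as strings, so the sketch assumes string keys and uses `Map.get(token, "surface_form")` rather than `token.surface_form`. The module `TokenFields` and its `describe/1` function are hypothetical names used for illustration only, and the token is hard-coded so the snippet runs without the mecab command installed.

```elixir
# Hypothetical helper module; not part of the Mecab library.
defmodule TokenFields do
  # Formats one token map as "surface (part of speech / subcategory 1)".
  # Assumes string keys, as in the example output of Mecab.parse.
  def describe(token) do
    surface = Map.get(token, "surface_form", "")
    pos = Map.get(token, "part_of_speech", "")
    sub = Map.get(token, "part_of_speech_subcategory1", "")
    "#{surface} (#{pos} / #{sub})"
  end
end

# A token with values copied from the Mecab.parse example output.
token = %{
  "surface_form" => "は",
  "part_of_speech" => "助詞",
  "part_of_speech_subcategory1" => "係助詞",
  "yomi" => "ハ",
  "pronunciation" => "ワ"
}

IO.puts(TokenFields.describe(token))
```

Using `Map.get/3` with a default keeps the helper from crashing on a map that lacks one of the keys.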
Summary
Functions
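parse: Parses the given string and returns a list of maps
read: Parses the given file and returns the list of maps in an {:ok, list} tuple, or an {:error, message} tuple
read!: Parses the given file and returns the list of maps directly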
Functions
Parses the given string and returns a list of maps.
Examples
iex> Mecab.parse("今日は晴れです")
[%{"conjugation" => "*", "conjugation_form" => "*", "lexical_form" => "今日",
"part_of_speech" => "名詞", "part_of_speech_subcategory1" => "副詞可能",
"part_of_speech_subcategory2" => "*", "part_of_speech_subcategory3" => "*",
"pronunciation" => "キョー", "surface_form" => "今日", "yomi" => "キョウ"},
%{"conjugation" => "*", "conjugation_form" => "*", "lexical_form" => "は",
"part_of_speech" => "助詞", "part_of_speech_subcategory1" => "係助詞",
"part_of_speech_subcategory2" => "*", "part_of_speech_subcategory3" => "*",
"pronunciation" => "ワ", "surface_form" => "は", "yomi" => "ハ"},
%{"conjugation" => "*", "conjugation_form" => "*", "lexical_form" => "晴れ",
"part_of_speech" => "名詞", "part_of_speech_subcategory1" => "一般",
"part_of_speech_subcategory2" => "*", "part_of_speech_subcategory3" => "*",
"pronunciation" => "ハレ", "surface_form" => "晴れ", "yomi" => "ハレ"},
%{"conjugation" => "基本形", "conjugation_form" => "特殊・デス", "lexical_form" => "です",
"part_of_speech" => "助動詞", "part_of_speech_subcategory1" => "*",
"part_of_speech_subcategory2" => "*", "part_of_speech_subcategory3" => "*",
"pronunciation" => "デス", "surface_form" => "です", "yomi" => "デス"},
%{"conjugation" => "", "conjugation_form" => "", "lexical_form" => "",
"part_of_speech" => "", "part_of_speech_subcategory1" => "",
"part_of_speech_subcategory2" => "", "part_of_speech_subcategory3" => "",
"pronunciation" => "", "surface_form" => "EOS", "yomi" => ""}]
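The result is an ordinary list, so it can be post-processed with `Enum`. The sketch below drops the end-of-sentence marker and joins the "yomi" field to get the reading of the whole sentence. `Reading` is a hypothetical module, not part of Mecab. The tokens are an abbreviated, hard-coded copy of the output above, and the marker is assumed to be the token whose surface form is "EOS" and whose part of speech is empty, as in that output.

```elixir
# Hypothetical helper module; not part of the Mecab library.
defmodule Reading do
  # Drops the EOS marker and joins the "yomi" field of the remaining tokens.
  def of(tokens) do
    tokens
    |> Enum.reject(&eos?/1)
    |> Enum.map_join(fn token -> token["yomi"] end)
  end

  # Assumption: the marker has surface form "EOS" and an empty part of speech.
  defp eos?(token) do
    token["surface_form"] == "EOS" and token["part_of_speech"] == ""
  end
end

# Abbreviated copy of the Mecab.parse("今日は晴れです") output.
tokens = [
  %{"surface_form" => "今日", "part_of_speech" => "名詞", "yomi" => "キョウ"},
  %{"surface_form" => "は", "part_of_speech" => "助詞", "yomi" => "ハ"},
  %{"surface_form" => "晴れ", "part_of_speech" => "名詞", "yomi" => "ハレ"},
  %{"surface_form" => "です", "part_of_speech" => "助動詞", "yomi" => "デス"},
  %{"surface_form" => "EOS", "part_of_speech" => "", "yomi" => ""}
]

IO.puts(Reading.of(tokens))
# prints "キョウハハレデス"
```

Checking the part of speech as well as the surface form keeps a literal word "EOS" in the input from being mistaken for the marker.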
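Parses the given file and returns a list of maps, wrapped in an {:ok, list} tuple on success.
This function behaves like Mecab.parse(filepath, from_file: true), but also checks whether the given filepath exists. If it does not, an {:error, message} tuple is returned instead.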
Examples
iex> File.read!("sample.txt")
"今日は晴れです。\n明日は雨でしょう。\n"
iex> Mecab.read("sample.txt")
{:ok,
[%{"surface_form" => "今日", "part_of_speech" => "名詞", ...},
%{"surface_form" => "は", "part_of_speech" => "助詞", ...},
%{"surface_form" => "晴れ", "part_of_speech" => "名詞", ...},
%{"surface_form" => "です", "part_of_speech" => "助動詞", ...},
%{"surface_form" => "。", "part_of_speech" => "記号", ...},
%{"surface_form" => "EOS", ...},
%{"surface_form" => "明日", "part_of_speech" => "名詞", ...},
%{"surface_form" => "は", "part_of_speech" => "助詞", ...},
%{"surface_form" => "雨", "part_of_speech" => "名詞", ...},
%{"surface_form" => "でしょ", "part_of_speech" => "助動詞", ...},
%{"surface_form" => "う", "part_of_speech" => "助動詞", ...},
%{"surface_form" => "。", "part_of_speech" => "記号", ...},
%{"surface_form" => "EOS", ...}]}
iex> Mecab.read("not_found.txt")
{:error, "no such a file or directory: not_found.txt"}
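Because the result is a tagged tuple, callers would typically pattern match on it. The following is a sketch under the assumption that the two shapes shown above, `{:ok, tokens}` and `{:error, message}`, are the only ones returned. `ReadResult` and `summarize/1` are hypothetical names, and hand-written tuples stand in for real Mecab.read results so the snippet runs on its own.

```elixir
# Hypothetical helper module; not part of the Mecab library.
defmodule ReadResult do
  # Success case: report how many tokens were returned.
  def summarize({:ok, tokens}) when is_list(tokens) do
    "parsed #{length(tokens)} tokens"
  end

  # Failure case: pass the error message through.
  def summarize({:error, message}) when is_binary(message) do
    "could not parse: #{message}"
  end
end

# Hand-written tuples in the shapes shown in the examples above.
ok_result = {:ok, [%{"surface_form" => "今日"}, %{"surface_form" => "EOS"}]}
error_result = {:error, "no such a file or directory: not_found.txt"}

IO.puts(ReadResult.summarize(ok_result))
IO.puts(ReadResult.summarize(error_result))
```

In real use the argument would be the return value of Mecab.read, for example `"sample.txt" |> Mecab.read() |> ReadResult.summarize()`.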
Parses the given file and returns a list of maps. As the example shows, the list is returned directly rather than inside an {:ok, list} tuple.
Examples
iex> File.read!("sample.txt")
"今日は晴れです。\n明日は雨でしょう。\n"
iex> Mecab.read!("sample.txt")
[%{"surface_form" => "今日", "part_of_speech" => "名詞", ...},
%{"surface_form" => "は", "part_of_speech" => "助詞", ...},
%{"surface_form" => "晴れ", "part_of_speech" => "名詞", ...},
%{"surface_form" => "です", "part_of_speech" => "助動詞", ...},
%{"surface_form" => "。", "part_of_speech" => "記号", ...},
%{"surface_form" => "EOS", ...},
%{"surface_form" => "明日", "part_of_speech" => "名詞", ...},
%{"surface_form" => "は", "part_of_speech" => "助詞", ...},
%{"surface_form" => "雨", "part_of_speech" => "名詞", ...},
%{"surface_form" => "でしょ", "part_of_speech" => "助動詞", ...},
%{"surface_form" => "う", "part_of_speech" => "助動詞", ...},
%{"surface_form" => "。", "part_of_speech" => "記号", ...},
%{"surface_form" => "EOS", ...}]
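In the output above, the tokens of every line end up in one flat list, with an EOS token after each line. A minimal sketch of regrouping that list into one token list per line follows. `Sentences` is a hypothetical module, not part of Mecab, and the tokens are hard-coded with only the "surface_form" key so that the snippet is self-contained.

```elixir
# Hypothetical helper module; not part of the Mecab library.
defmodule Sentences do
  # Splits a flat token list into one list per line, using EOS tokens as
  # separators. The EOS tokens themselves are dropped.
  def split(tokens) do
    tokens
    |> Enum.chunk_by(&eos?/1)
    |> Enum.reject(fn chunk -> Enum.all?(chunk, &eos?/1) end)
  end

  # Assumption: a token whose surface form is "EOS" marks the end of a line.
  defp eos?(token), do: token["surface_form"] == "EOS"
end

# Surface forms copied from the Mecab.read!("sample.txt") output.
surfaces = ~w(今日 は 晴れ です 。 EOS 明日 は 雨 でしょ う 。 EOS)
tokens = Enum.map(surfaces, fn surface -> %{"surface_form" => surface} end)

tokens
|> Sentences.split()
|> Enum.map(fn sentence -> Enum.map_join(sentence, fn t -> t["surface_form"] end) end)
|> IO.inspect()
# prints ["今日は晴れです。", "明日は雨でしょう。"]
```

This check looks only at the surface form, so a literal "EOS" in the input text would also split a line. Comparing "part_of_speech" with the empty string as well, as in the marker token of the Mecab.parse output, is one way to tighten it.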