Penelope v0.5.0 Penelope.NLP.POSTagger
The part-of-speech tagger transforms a tokenized sentence into a list of
{token, pos_tag} tuples. The tagger takes no responsibility for
tokenization, so callers must use the same tokenization scheme for
training and evaluation to get the best results.
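One way to keep tokenization consistent is to route both training data and tagger input through a single shared function. The sketch below is only an illustration: the TokenizerSketch module and its regex are hypothetical and not part of Penelope.

```elixir
defmodule TokenizerSketch do
  # Hypothetical tokenizer (not part of Penelope): runs of word
  # characters become tokens, and each punctuation mark is its own token.
  def tokenize(text) do
    ~r/\w+|[^\w\s]/u
    |> Regex.scan(text)
    |> List.flatten()
  end
end

# Training side: token lists come from the shared tokenizer.
train_tokens = TokenizerSketch.tokenize("Judy saw her.")

# Tagging side: the same function prepares the tagger's input, so the
# model sees tokens split exactly as they were during training.
input_tokens = TokenizerSketch.tokenize("Judy saw her.")
```

Because both sides call the same function, a later change to the tokenization rules cannot affect one side without the other.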
As this tagger does not ship with a pretrained model, it is both
language- and tagset-agnostic, though the default feature set used
(see POSFeaturizer) was designed for English.
See POSTaggerTrainer.train/2 for an example of how to train a new
POS tagger model.
Functions
Imports parameters from a serialized model.
Exports a runtime model to a serializable data structure.
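A serializable structure can be persisted with Erlang's external term format. The sketch below is an assumption-laden illustration: the map stands in for whatever the export function returns, and its keys and values are invented, since the real structure is not described here.

```elixir
# Hypothetical stand-in for an exported model. The actual structure
# returned by the tagger is not documented in this section.
exported = %{"weights" => %{"NNP" => [0.1, 0.2]}, "version" => 1}

# Encode to a binary that can be written to disk or sent over the wire.
binary = :erlang.term_to_binary(exported)

# Decode it again. The :safe option refuses to create new atoms,
# which matters when the binary comes from an untrusted source.
restored = :erlang.binary_to_term(binary, [:safe])
```

The restored term would then be handed to the import function to rebuild a runtime model.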
Fits the tagger model. Custom featurizers may be supplied.
Attaches part of speech tags to a list of tokens.
Example:
iex> POSTagger.tag(model, %{}, ["Judy", "saw", "her"])
[{"Judy", "NNP"}, {"saw", "VBD"}, {"her", "PRP$"}]
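The returned list of {token, pos_tag} tuples can be processed with ordinary list functions. The sketch below reuses the output shown in the example and assumes the tags follow Penn Treebank conventions, where noun tags begin with "NN". That is an assumption about the trained model, not a property of the tagger.

```elixir
# Output shape copied from the example above.
tagged = [{"Judy", "NNP"}, {"saw", "VBD"}, {"her", "PRP$"}]

# Split the tuples into parallel lists of tokens and tags.
{tokens, tags} = Enum.unzip(tagged)

# Keep only tokens whose tag starts with "NN". Tuples that do not
# match the pattern are skipped by the comprehension.
nouns = for {token, "NN" <> _rest} <- tagged, do: token
```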