Gibran.Counter
A set of functions that retrieve statistics on a list of tokens.
Summary
Functions
Returns a float of the average characters per token
Returns an integer of the number of characters in a list
Returns a List of two-element tuples of the longest tokens. Each tuple contains the token and its length, respectively
Returns a List of two-element tuples of the tokens with the highest frequency. Each tuple consists of the token and its frequency
Returns an integer of the number of tokens in a list
Returns a HashDict of tokens and their densities as floats
Returns a HashDict of tokens and the number of times they occur
Returns an unordered HashDict of tokens and their lengths
Returns an integer of the number of unique tokens in a list
Returns a list of the unique tokens in a given list of tokens
Functions
Returns a float of the average characters per token.
Examples
iex> Gibran.Counter.average_chars_per_token(["twenty", "drawings"])
7.0
iex> Gibran.Counter.average_chars_per_token(["The", "Treasured", "Writings", "of", "Kahlil", "Gibran"], precision: 4)
5.6667
Options
:precision
The maximum number of decimal digits that will be returned. The precision must be an integer.
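The behaviour described above can be approximated with the standard library alone. The following is a minimal sketch, not Gibran's actual implementation: the module name `AverageSketch` is hypothetical, and the default precision of 2 is an assumption that the examples above do not confirm.

```elixir
defmodule AverageSketch do
  # Hypothetical helper, not part of Gibran.
  # Assumes a non-empty list; an empty list would raise ArithmeticError
  # because of the division by zero.
  def average_chars_per_token(tokens, opts \\ []) do
    # Assumed default of 2; the real default is not stated in the docs.
    precision = Keyword.get(opts, :precision, 2)

    total_chars =
      tokens
      |> Enum.map(&String.length/1)
      |> Enum.sum()

    # `/` always returns a float, which Float.round/2 requires.
    Float.round(total_chars / length(tokens), precision)
  end
end
```

`String.length/1` counts graphemes rather than bytes, so tokens with multi-byte characters are measured by visible characters.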
Returns an integer of the number of characters in a list.
Examples
iex> Gibran.Counter.char_count(["the", "wanderer"])
11
Returns a List of two-element tuples of the longest tokens. Each tuple contains the token and its length, respectively.
Examples
iex> Gibran.Counter.longest_tokens(["kingdom", "of", "the", "imagination"])
[{"imagination", 11}]
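One way to obtain this result is to pair each token with its length and keep only the pairs that match the maximum. This is a sketch under assumptions: `LongestSketch` is a hypothetical module, and deduplicating tokens before measuring them is a guess at the intended behaviour.

```elixir
defmodule LongestSketch do
  # Hypothetical helper, not part of Gibran.
  def longest_tokens([]), do: []

  def longest_tokens(tokens) do
    # Assumption: repeated tokens are reported only once.
    lengths =
      tokens
      |> Enum.uniq()
      |> Enum.map(fn token -> {token, String.length(token)} end)

    max_length =
      lengths
      |> Enum.map(fn {_token, len} -> len end)
      |> Enum.max()

    # Keep every token that ties for the maximum length.
    Enum.filter(lengths, fn {_token, len} -> len == max_length end)
  end
end
```

Because all ties are kept, a list such as `["ab", "cd"]` yields both tokens.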
Returns a List of two-element tuples of the tokens with the highest frequency. Each tuple consists of the token and its frequency.
Examples
iex> Gibran.Counter.most_frequent_tokens(["the", "prophet", "eye", "of", "the", "prophet"])
[{"prophet", 2}, {"the", 2}]
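The example shows that every token tied for the highest count is returned. A minimal sketch of that idea follows; `FrequentSketch` is a hypothetical module, `Enum.frequencies/1` requires Elixir 1.10 or later, and the explicit sort is an assumption added only to make the output order deterministic.

```elixir
defmodule FrequentSketch do
  # Hypothetical helper, not part of Gibran.
  def most_frequent_tokens([]), do: []

  def most_frequent_tokens(tokens) do
    # Map of token => number of occurrences.
    frequencies = Enum.frequencies(tokens)
    highest = frequencies |> Map.values() |> Enum.max()

    frequencies
    |> Enum.filter(fn {_token, count} -> count == highest end)
    # Assumption: sort the tuples so ties come back in a stable order.
    |> Enum.sort()
  end
end
```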
Returns an integer of the number of tokens in a list.
Examples
iex> Gibran.Counter.token_count(["the", "madman"])
2
Returns a HashDict of tokens and their densities as floats.
Examples
iex> Gibran.Counter.token_density(["the", "prophet", "eye", "of", "the", "prophet"])
#HashDict<[{"the", 0.33}, {"eye", 0.17}, {"of", 0.17}, {"prophet", 0.33}]>
iex> Gibran.Counter.token_density(["the", "prophet", "eye", "of", "the", "prophet"], 4)
#HashDict<[{"the", 0.3333}, {"eye", 0.1667}, {"of", 0.1667}, {"prophet", 0.3333}]>
Options
:precision
The maximum number of decimal digits that will be included in each density. The precision must be an integer.
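Density here is a token's count divided by the total number of tokens, rounded to the requested precision. The sketch below illustrates that calculation. It is not Gibran's code: `DensitySketch` is a hypothetical module, it returns a plain `Map` rather than a `HashDict`, and it takes the precision as a positional argument with an assumed default of 2, following the second example above.

```elixir
defmodule DensitySketch do
  # Hypothetical helper, not part of Gibran.
  # Precision is positional here, mirroring the `token_density(list, 4)` example.
  def token_density(tokens, precision \\ 2) do
    total = length(tokens)

    tokens
    |> Enum.frequencies()
    |> Map.new(fn {token, count} ->
      # count / total is each token's share of the whole list.
      {token, Float.round(count / total, precision)}
    end)
  end
end
```

With the six-token example list, "the" and "prophet" each account for 2/6 of the tokens, which rounds to 0.33 at the default precision.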
Returns a HashDict of tokens and the number of times they occur.
Examples
iex> Gibran.Counter.token_frequency(["the", "prophet", "eye", "of", "the", "prophet"])
#HashDict<[{"the", 2}, {"eye", 1}, {"of", 1}, {"prophet", 2}]>
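Counting occurrences is a single fold over the list. The sketch below uses a plain `Map`, since `HashDict` has since been deprecated in Elixir in favor of `Map`; the module name `FrequencySketch` is hypothetical and the code is an illustration rather than Gibran's implementation.

```elixir
defmodule FrequencySketch do
  # Hypothetical helper, not part of Gibran.
  def token_frequency(tokens) do
    Enum.reduce(tokens, %{}, fn token, acc ->
      # Start a token at 1 on first sight, otherwise increment its count.
      Map.update(acc, token, 1, fn count -> count + 1 end)
    end)
  end
end
```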
Returns an unordered HashDict of tokens and their lengths.
Examples
iex> Gibran.Counter.token_lengths(["voice", "and", "master"])
#HashDict<[{"and", 3}, {"master", 6}, {"voice", 5}]>
Returns an integer of the number of unique tokens in a list.
Examples
iex> Gibran.Counter.uniq_token_count(["the", "prophet", "eye", "of", "the", "prophet"])
4
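The count of unique tokens is the size of the deduplicated list. A minimal sketch, with `UniqSketch` as a hypothetical module name:

```elixir
defmodule UniqSketch do
  # Hypothetical helper, not part of Gibran.
  def uniq_token_count(tokens) do
    tokens
    |> Enum.uniq()
    |> length()
  end
end
```

Comparison is exact, so "The" and "the" count as two different tokens unless the list is normalized first.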