Gibran.Counter

A set of functions that retrieve statistics on a list of tokens.

Summary

Functions

average_chars_per_token(list, opts \\ [])
Returns the average number of characters per token as a float

char_count(list)
Returns the total number of characters across all tokens in a list, as an integer

longest_tokens(list)
Returns a list of two-element tuples for the longest tokens. Each tuple contains the token and its length

most_frequent_tokens(list)
Returns a list of two-element tuples for the tokens with the highest frequency. Each tuple contains the token and its frequency

token_count(list)
Returns the number of tokens in a list as an integer

token_frequency(list)
Returns a HashDict mapping each token to the number of times it occurs

token_lengths(list)
Returns an unordered HashDict mapping each token to its length

uniq_token_count(list)
Returns the number of unique tokens in a list as an integer

uniq_tokens(list)
Given a list of tokens, returns the list with duplicates removed

Functions

average_chars_per_token(list, opts \\ [])

Returns the average number of characters per token as a float.

Examples

iex> Gibran.Counter.average_chars_per_token(["twenty", "drawings"])
7.0
iex> Gibran.Counter.average_chars_per_token(["The", "Treasured", "Writings", "of", "Kahlil", "Gibran"], precision: 4)
5.6667

Options

  • :precision - the maximum number of digits after the decimal point in the returned value. The precision must be an integer.
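
The calculation behind the two examples above can be sketched as follows. This is an illustration only, not Gibran's actual implementation: the module `AverageSketch`, the function `average_chars/2`, and the default precision of 2 are all assumptions made for the example. It sums the token lengths, divides by the token count, and rounds with `Float.round/2`.

```elixir
defmodule AverageSketch do
  # Hypothetical helper, not part of Gibran.
  # The default precision of 2 is an assumption made for this sketch.
  def average_chars(tokens, opts \\ []) do
    precision = Keyword.get(opts, :precision, 2)

    # Total characters across all tokens.
    total = tokens |> Enum.map(&String.length/1) |> Enum.sum()

    # `/` always returns a float; Float.round/2 keeps at most
    # `precision` digits after the decimal point.
    Float.round(total / length(tokens), precision)
  end
end

AverageSketch.average_chars(["twenty", "drawings"])
# => 7.0

AverageSketch.average_chars(
  ["The", "Treasured", "Writings", "of", "Kahlil", "Gibran"],
  precision: 4
)
# => 5.6667
```

The second call divides 34 characters by 6 tokens, and 5.666... rounds to 5.6667 at four digits. The sketch does not guard against an empty list, where the division would raise an `ArithmeticError`.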

char_count(list)

Returns the total number of characters across all tokens in a list, as an integer.

Examples

iex> Gibran.Counter.char_count(["the", "wanderer"])
11

longest_tokens(list)

Returns a list of two-element tuples for the longest tokens. Each tuple contains the token and its length.

Examples

iex> Gibran.Counter.longest_tokens(["kingdom", "of", "the", "imagination"])
[{"imagination", 11}]
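
One way to arrive at this result is to find the maximum token length and keep every token that reaches it. The sketch below assumes that ties return all longest tokens and that duplicates are collapsed; the source does not confirm either behavior, and `LongestSketch.longest/1` is a hypothetical name, not part of Gibran.

```elixir
defmodule LongestSketch do
  # Hypothetical helper, not part of Gibran.
  def longest(tokens) do
    # Length of the longest token (raises Enum.EmptyError on an empty list).
    max = tokens |> Enum.map(&String.length/1) |> Enum.max()

    tokens
    |> Enum.uniq()                                # assumption: duplicates collapse
    |> Enum.filter(&(String.length(&1) == max))   # keep only the longest tokens
    |> Enum.map(&{&1, max})                       # pair each token with its length
  end
end

LongestSketch.longest(["kingdom", "of", "the", "imagination"])
# => [{"imagination", 11}]
```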

most_frequent_tokens(list)

Returns a list of two-element tuples for the tokens with the highest frequency. Each tuple contains the token and its frequency.

Examples

iex> Gibran.Counter.most_frequent_tokens(["the", "prophet", "eye", "of", "the", "prophet"])
[{"prophet", 2}, {"the", 2}]
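
The same result can be reproduced with the standard library alone. The sketch below is an assumption about the approach, not Gibran's code: it counts tokens with `Enum.frequencies/1` (available since Elixir 1.10, and returning a plain map rather than a HashDict), keeps the entries that share the highest count, and sorts them so the output order is deterministic. `FrequencySketch.most_frequent/1` is a hypothetical name.

```elixir
defmodule FrequencySketch do
  # Hypothetical helper, not part of Gibran.
  def most_frequent(tokens) do
    # Map of token => occurrence count.
    freqs = Enum.frequencies(tokens)

    # Highest count (raises Enum.EmptyError on an empty list).
    max = freqs |> Map.values() |> Enum.max()

    freqs
    |> Enum.filter(fn {_token, count} -> count == max end)
    |> Enum.sort()   # sorted so that ties come back in a stable order
  end
end

FrequencySketch.most_frequent(["the", "prophet", "eye", "of", "the", "prophet"])
# => [{"prophet", 2}, {"the", 2}]
```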

token_count(list)

Returns the number of tokens in a list as an integer.

Examples

iex> Gibran.Counter.token_count(["the", "madman"])
2

token_frequency(list)

Returns a HashDict mapping each token to the number of times it occurs.

Examples

iex> Gibran.Counter.token_frequency(["the", "prophet", "eye", "of", "the", "prophet"])
#HashDict<[{"the", 2}, {"eye", 1}, {"of", 1}, {"prophet", 2}]>

token_lengths(list)

Returns an unordered HashDict mapping each token to its length.

Examples

iex> Gibran.Counter.token_lengths(["voice", "and", "master"])
#HashDict<[{"and", 3}, {"master", 6}, {"voice", 5}]>

uniq_token_count(list)

Returns the number of unique tokens in a list as an integer.

Examples

iex> Gibran.Counter.uniq_token_count(["the", "prophet", "eye", "of", "the", "prophet"])
4

uniq_tokens(list)

Given a list of tokens, returns the list with duplicates removed.

Examples

iex> Gibran.Counter.uniq_tokens(["the", "prophet", "eye", "of", "the", "prophet"])
["the", "prophet", "eye", "of"]
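
For comparison, the standard library's `Enum.uniq/1` keeps the first occurrence of each element, which matches the order shown in the example above. Whether Gibran calls it internally is an assumption; the snippet only shows how the results of `uniq_tokens/1` and `uniq_token_count/1` relate to each other.

```elixir
tokens = ["the", "prophet", "eye", "of", "the", "prophet"]

# Enum.uniq/1 drops later duplicates and preserves first-occurrence order.
unique = Enum.uniq(tokens)
# => ["the", "prophet", "eye", "of"]

# The unique token count is the length of the deduplicated list.
unique_count = length(unique)
# => 4
```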