Plurality.Engine (Plurality v0.3.0)

Copy Markdown View Source

Three-tier resolution engine for English noun inflection.

This module is the core of Plurality. It loads irregular pairs, uncountable words, and suffix rules at compile time, then exposes functions that resolve any English noun through three tiers in order:

Resolution tiers

Tier 1 — Uncountables (MapSet, O(1) membership test)

Words like "sheep", "software", and "news" that have no distinct plural form. Returned unchanged by both pluralize/2 and singularize/1.

Built from ~1,022 words curated from multiple sources and verified against Oxford and Merriam-Webster dictionaries.

Tier 2 — Irregulars (Map, O(1) lookup)

Direct singular↔plural mappings for words whose plural form cannot be predicted by suffix rules (e.g., "child""children", "person""people").

Built from ~1,110 pairs curated from multiple sources, with modern English forms preferred over classical Latin (e.g., "schema""schemas" instead of "schemata").

Two maps are maintained:

  • singular → plural — used by pluralize/2
  • plural → singular — used by singularize/1, built from ALL sources (including overridden entries) so both old and new plural forms resolve

Tier 3 — Suffix rules (last-byte dispatch, O(1))

Pattern-based transformation using Plurality.Rules. Extracts the last byte of the word, dispatches via BEAM select_val jump table, then confirms the full suffix with a sized-skip binary match. See Plurality.Rules for details.

Compile-time data pipeline

All data is loaded from TSV and TXT files in priv/ at compile time via module attributes. The pipeline:

  1. Load pre-merged data from priv/data/irregulars.tsv and priv/data/uncountables.txt
  2. Build forward map (singular → plural) and reverse map (plural → singular)
  3. Auto-exclude words from uncountables if they appear in irregulars with a different plural form (e.g., "data" is uncountable but "data""datum" exists in irregulars)
  4. Apply force overrides (@force_uncountable, @force_countable)
  5. Build downcased lookup maps with lowercase-entry priority

There is zero runtime file I/O, zero regex, and zero ETS usage.

Case-insensitive matching

All lookups are performed against downcased maps. When case-variant entries exist in the source data (e.g., "jerry""jerries" AND "Jerry""Jerrys"), the lowercase entry takes priority since it represents the common noun form. This is implemented by sorting entries lowercase-first and using Map.put_new/3.

Singularize ordering

singularize/1 checks the irregular reverse map before the uncountables set. This is intentional: words like "data", "graffiti", and "testes" appear in both sets, and singularization should resolve them to their base forms ("datum", "graffito", "testis").

Usage

This module is not typically called directly. Use the public API in Plurality instead, which delegates to this module.

Summary

Functions

Inflects a word based on count.

Returns true if the word is in plural form or is uncountable.

Converts a word to its plural form.

Returns true if the word is in singular form or is uncountable.

Converts a word to its singular form.

Functions

inflect(word, arg2, opts)

@spec inflect(
  word :: String.t(),
  count :: integer(),
  opts :: Plurality.pluralize_opts()
) :: String.t()

Inflects a word based on count.

See Plurality.inflect/2 for full documentation.

plural?(word)

@spec plural?(word :: String.t()) :: boolean()

Returns true if the word is in plural form or is uncountable.

See Plurality.plural?/1 for full documentation.

pluralize(word, opts \\ [])

@spec pluralize(word :: String.t(), opts :: Plurality.pluralize_opts()) :: String.t()

Converts a word to its plural form.

See Plurality.pluralize/2 for full documentation.

singular?(word)

@spec singular?(word :: String.t()) :: boolean()

Returns true if the word is in singular form or is uncountable.

See Plurality.singular?/1 for full documentation.

singularize(word)

@spec singularize(word :: String.t()) :: String.t()

Converts a word to its singular form.

See Plurality.singularize/1 for full documentation.