Snowball.Grouping (snowball v0.1.1)


Compile-time helpers for building Snowball grouping bit-tables.

A grouping in canonical Snowball is a set of codepoints. Generated code emits each grouping as a compact bit-table for O(1) membership testing. This module produces the {min_cp, bits, max_cp} tuple consumed by Snowball.Runtime.in_grouping/2 and the related runtime grouping primitives.

Generated stemmer modules call from_string/1 or from_codepoints/1 inside a module attribute so the table is computed at compile time.
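The bit-table idea and the compile-time pattern described above can be illustrated with a minimal, self-contained sketch. GroupingSketch and VowelSketch are hypothetical modules, not part of Snowball, and the bit layout used here (one bit per codepoint offset from the minimum, least significant bit first within each byte) is an assumption; the layout that Snowball.Grouping actually emits may differ.

```elixir
defmodule GroupingSketch do
  @moduledoc false
  import Bitwise

  # Build a {min_cp, bits, max_cp} tuple. Bit (cp - min_cp) is set for
  # every member codepoint. LSB-first bit order is an assumption made
  # for this sketch, not Snowball's documented layout.
  def build(codepoints) do
    {min_cp, max_cp} = Enum.min_max(codepoints)
    nbytes = div(max_cp - min_cp, 8) + 1

    bytes =
      Enum.reduce(codepoints, List.duplicate(0, nbytes), fn cp, acc ->
        offset = cp - min_cp
        List.update_at(acc, div(offset, 8), &(&1 ||| (1 <<< rem(offset, 8))))
      end)

    {min_cp, :binary.list_to_bin(bytes), max_cp}
  end

  # Membership test: a range check, then a single byte lookup and mask.
  def member?({min_cp, bits, max_cp}, cp) when cp >= min_cp and cp <= max_cp do
    offset = cp - min_cp
    byte = :binary.at(bits, div(offset, 8))
    (byte &&& (1 <<< rem(offset, 8))) != 0
  end

  def member?(_table, _cp), do: false
end

defmodule VowelSketch do
  # The attribute value is evaluated while this module compiles, so the
  # table is embedded in the compiled module as a constant.
  @vowels GroupingSketch.build(String.to_charlist("aeiou"))

  def vowel?(cp), do: GroupingSketch.member?(@vowels, cp)
end
```

With this sketch, VowelSketch.vowel?(?e) is true and VowelSketch.vowel?(?b) is false. A generated stemmer would presumably follow the same shape, with Snowball.Grouping.from_string/1 or from_codepoints/1 in place of GroupingSketch.build/1.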

Summary

Functions

from_codepoints(codepoints)

Build a grouping table from a list of codepoints.

from_string(string)

Build a grouping table from a UTF-8 string whose codepoints are the members of the group.

Functions

from_codepoints(codepoints)

@spec from_codepoints([integer()]) :: {integer(), binary(), integer()}

Build a grouping table from a list of codepoints.

Arguments

  • codepoints is a list of integer codepoints, each a member of the group.

Returns

  • A {min_cp, bits, max_cp} tuple.

Examples

iex> {97, _bits, 99} = Snowball.Grouping.from_codepoints([?a, ?b, ?c])

from_string(string)

@spec from_string(binary()) :: {integer(), binary(), integer()}

Build a grouping table from a UTF-8 string whose codepoints are the members of the group.

Arguments

  • string is a UTF-8 binary listing each member codepoint.

Returns

  • A {min_cp, bits, max_cp} tuple suitable for the runtime grouping primitives.

Examples

iex> {97, _bits, 117} = Snowball.Grouping.from_string("aeiou")
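To see where 97 and 117 in the example come from, the following sketch decodes the string with the standard library only. It assumes from_string/1 treats the string's codepoints the same way String.to_charlist/1 does, which the description suggests but this page does not state outright.

```elixir
# Decode the UTF-8 string into its integer codepoints.
codepoints = String.to_charlist("aeiou")

# The smallest and largest codepoints bound the table:
# ?a is 97 and ?u is 117.
{min_cp, max_cp} = Enum.min_max(codepoints)
```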