Loads and caches tiktoken rank files from priv/ranks/.
Rank files contain base64-encoded token bytes mapped to integer ranks.
Loaded ranks are cached in :persistent_term for fast repeated access.
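The load-and-cache flow described above can be sketched as follows. This is a minimal illustration, not the module's actual implementation. It assumes each line of a rank file has the form `<base64 token> <rank>` (the layout used by tiktoken's published rank files), and the names `RankFileSketch`, `parse/1`, and `cached/2` are hypothetical.

```elixir
defmodule RankFileSketch do
  # Hypothetical helper module; names and file layout are assumptions.

  # Parses rank-file contents with one "<base64 token> <rank>" pair per line
  # into a map of raw token bytes => integer rank.
  def parse(contents) do
    contents
    |> String.split("\n", trim: true)
    |> Map.new(fn line ->
      [b64, rank] = String.split(line, " ", parts: 2)
      {Base.decode64!(b64), String.to_integer(rank)}
    end)
  end

  # Returns the cached map for `key`, parsing and storing it on first access.
  def cached(key, contents) do
    case :persistent_term.get(key, nil) do
      nil ->
        ranks = parse(contents)
        :persistent_term.put(key, ranks)
        ranks

      ranks ->
        ranks
    end
  end
end
```

Reads from `:persistent_term` do not copy the map into the calling process, which suits large, rarely updated data such as rank tables. Writes are comparatively expensive, so the map is stored once and never updated.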
Functions
@spec inverse(atom()) :: %{required(non_neg_integer()) => binary()}
Returns the inverse rank map (rank -> token bytes) for decoding.
Raises ArgumentError for unsupported encodings.
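As a rough illustration of how an inverse map supports decoding, the sketch below inverts a small hand-written rank map and joins the looked-up bytes. The module `DecodeSketch` and its functions are hypothetical, and the two-entry map stands in for a real rank table.

```elixir
defmodule DecodeSketch do
  # Hypothetical helper; not part of the documented module.

  # Swaps keys and values: %{bytes => rank} becomes %{rank => bytes}.
  def invert(ranks) do
    Map.new(ranks, fn {bytes, rank} -> {rank, bytes} end)
  end

  # Looks up each rank in the inverse map and concatenates the token bytes.
  # Raises KeyError if a rank is not present in the map.
  def decode(inverse, token_ranks) do
    token_ranks
    |> Enum.map(&Map.fetch!(inverse, &1))
    |> IO.iodata_to_binary()
  end
end

inverse = DecodeSketch.invert(%{"hello" => 0, " world" => 1})
DecodeSketch.decode(inverse, [0, 1])
# => "hello world"
```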
@spec load(atom()) :: %{required(binary()) => non_neg_integer()}
Returns the rank map for the given encoding.
The rank map is %{binary() => non_neg_integer()} where keys are raw
token bytes and values are their BPE merge ranks.
Results are cached in :persistent_term after first load.
Raises ArgumentError for unsupported encodings.
@spec supported_encodings() :: [atom()]
Returns the list of supported encodings.
@spec warmup() :: :ok
Pre-loads rank maps for all supported encodings into :persistent_term.
Call this at application startup to avoid first-call latency and concurrent parsing races.
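One way to wire this into startup is to call it from the application callback before the supervision tree starts. The sketch below is an assumption-laden illustration: `RanksStub` is a stand-in for this module (whose real name is not shown here), `:example_encoding` is a made-up encoding, and `WarmupSketch.Application` is a hypothetical application module.

```elixir
defmodule RanksStub do
  # Stand-in for the documented module; names and bodies are hypothetical.
  def supported_encodings, do: [:example_encoding]

  def load(encoding) do
    key = {__MODULE__, encoding}

    case :persistent_term.get(key, nil) do
      nil ->
        # A real implementation would parse the rank file here.
        ranks = %{"a" => 0}
        :persistent_term.put(key, ranks)
        ranks

      ranks ->
        ranks
    end
  end

  # Loads every supported encoding once so later calls hit the cache.
  def warmup do
    Enum.each(supported_encodings(), &load/1)
    :ok
  end
end

defmodule WarmupSketch.Application do
  use Application

  @impl true
  def start(_type, _args) do
    # Warm the cache in a single process before any workers start.
    :ok = RanksStub.warmup()
    Supervisor.start_link([], strategy: :one_for_one)
  end
end
```

Running the warm-up in a single process before workers start means no two processes parse the same rank file at the same time on first use.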