Tiktokenex.Ranks (Tiktokenex v0.1.0)


Loads and caches tiktoken rank files from priv/ranks/.

Rank files contain base64-encoded token bytes mapped to integer ranks. Loaded ranks are cached in :persistent_term for fast repeated access.
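As a rough illustration of what such a file holds, the sketch below parses rank data in the usual tiktoken layout, assuming one `<base64 token> <rank>` pair per line. The module name `RankFileSketch` and its `parse/1` function are hypothetical and not part of Tiktokenex; the library's own parser may differ.

```elixir
defmodule RankFileSketch do
  # Hypothetical helper for illustration only; not part of Tiktokenex.
  # Assumes each non-empty line looks like "<base64 token> <rank>".
  @spec parse(String.t()) :: %{required(binary()) => non_neg_integer()}
  def parse(contents) do
    contents
    |> String.split("\n", trim: true)
    |> Map.new(fn line ->
      [b64, rank] = String.split(line, " ", parts: 2)
      # Keys are the raw token bytes, values are the integer ranks.
      {Base.decode64!(b64), String.to_integer(rank)}
    end)
  end
end

# "IQ==", "Ig==" and "Iw==" are the base64 encodings of "!", "\"" and "#".
ranks = RankFileSketch.parse("IQ== 0\nIg== 1\nIw== 2\n")
IO.inspect(ranks["#"])
# prints 2
```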

Summary

Functions

inverse(encoding): Returns the inverse rank map (rank -> token bytes) for decoding.

load(encoding): Returns the rank map for the given encoding.

supported_encodings(): Returns the list of supported encodings.

warmup(): Pre-loads rank maps for all supported encodings into :persistent_term.

Functions

inverse(encoding)

@spec inverse(atom()) :: %{required(non_neg_integer()) => binary()}

Returns the inverse rank map (rank -> token bytes) for decoding.

Raises ArgumentError for unsupported encodings.
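To make the shape of the inverse map concrete, here is a minimal sketch that builds one from a toy three-token rank map and uses it to turn ranks back into bytes. The toy map stands in for what `load/1` returns; how Tiktokenex actually builds or stores the inverse is not documented here.

```elixir
# Toy rank map standing in for the result of load/1 (three tokens only).
ranks = %{"!" => 0, "\"" => 1, "#" => 2}

# Swap keys and values to get rank -> token bytes.
inverse = Map.new(ranks, fn {bytes, rank} -> {rank, bytes} end)

# Decoding: look up each rank, then concatenate the token bytes.
decoded =
  [2, 0, 0]
  |> Enum.map(&Map.fetch!(inverse, &1))
  |> IO.iodata_to_binary()

IO.inspect(decoded)
# prints "#!!"
```

With the library loaded, the same lookup would presumably go through the map returned by `Tiktokenex.Ranks.inverse(encoding)` instead of the toy `inverse` above.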

load(encoding)

@spec load(atom()) :: %{required(binary()) => non_neg_integer()}

Returns the rank map for the given encoding.

The rank map is %{binary() => non_neg_integer()} where keys are raw token bytes and values are their BPE merge ranks.

Results are cached in :persistent_term after first load.

Raises ArgumentError for unsupported encodings.
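The caching behaviour described above can be sketched with a read-through lookup on `:persistent_term`. This is an illustration under assumptions, not the library's implementation: the module `RankCacheSketch`, the `{module, encoding}` key shape, and the `loader` callback are all hypothetical.

```elixir
defmodule RankCacheSketch do
  # Hypothetical read-through cache; the key Tiktokenex really uses is unknown.
  def fetch(encoding, loader) when is_atom(encoding) and is_function(loader, 0) do
    key = {__MODULE__, encoding}

    case :persistent_term.get(key, :not_cached) do
      :not_cached ->
        # First call: run the expensive loader, then store the result.
        ranks = loader.()
        :persistent_term.put(key, ranks)
        ranks

      ranks ->
        # Later calls: return the cached map without re-parsing.
        ranks
    end
  end
end

# The loader notifies the calling process so we can count how often it runs.
loader = fn ->
  send(self(), :parsed)
  %{"!" => 0}
end

first = RankCacheSketch.fetch(:toy_encoding, loader)
second = RankCacheSketch.fetch(:toy_encoding, loader)
```

Note that a lookup written this way does not stop two processes that miss the cache at the same moment from both running the loader, which is the kind of duplicated work that pre-loading at startup avoids.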

supported_encodings()

@spec supported_encodings() :: [atom()]

Returns the list of supported encodings.

warmup()

@spec warmup() :: :ok

Pre-loads rank maps for all supported encodings into :persistent_term.

Call this at application startup so that the first call does not pay the file-parsing cost and concurrent first callers do not race to parse the same rank file.
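One place to make that call is the host application's `start/2` callback, before the supervision tree comes up. The sketch below assumes a hypothetical application named `MyApp` with no children yet; only `Tiktokenex.Ranks.warmup/0` and its `:ok` return value come from this page.

```elixir
defmodule MyApp.Application do
  # Hypothetical host application, shown only to place the warmup call.
  use Application

  @impl true
  def start(_type, _args) do
    # Parse and cache every supported rank file before any worker starts,
    # so no request handler triggers the first load.
    :ok = Tiktokenex.Ranks.warmup()

    children = []
    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```

Running the warmup synchronously delays boot by the time it takes to parse the rank files; whether that trade-off is acceptable depends on the application.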