Modules
Pure Elixir BPE tokenizer compatible with OpenAI's tiktoken.
Core Byte-Pair Encoding merge algorithm.
Regex-based pre-tokenization that splits text into chunks before BPE.
Loads and caches tiktoken rank files from priv/ranks/.
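
As a rough illustration of the merge step described above, the following is a minimal sketch of greedy byte-pair merging, not this library's actual implementation. The module name `BPESketch`, the function `merge/2`, and the shape of the `ranks` map (byte sequence to merge rank, lower rank merged first) are all assumptions made for the example. It takes one pre-tokenized chunk, splits it into single bytes, and repeatedly joins the adjacent pair with the lowest rank until no adjacent pair appears in the rank table.

```elixir
defmodule BPESketch do
  @moduledoc false
  # Hypothetical sketch; names and data shapes are assumptions, not the library's API.

  # `chunk` is one pre-tokenized piece of text (a binary).
  # `ranks` maps a byte sequence to its merge rank; lower ranks merge first.
  def merge(chunk, ranks) when is_binary(chunk) and is_map(ranks) do
    # Start from single-byte parts.
    parts = for <<byte <- chunk>>, do: <<byte>>
    do_merge(parts, ranks)
  end

  defp do_merge(parts, ranks) do
    case best_pair(parts, ranks) do
      # No adjacent pair is in the rank table, so merging is finished.
      nil -> parts
      {_rank, index} -> parts |> merge_at(index) |> do_merge(ranks)
    end
  end

  # Returns {rank, index} for the lowest-ranked adjacent pair
  # (leftmost on ties), or nil when no pair can be merged.
  defp best_pair(parts, ranks) do
    parts
    |> Enum.chunk_every(2, 1, :discard)
    |> Enum.with_index()
    |> Enum.flat_map(fn {[a, b], i} ->
      case Map.fetch(ranks, a <> b) do
        {:ok, rank} -> [{rank, i}]
        :error -> []
      end
    end)
    |> Enum.min(fn -> nil end)
  end

  # Joins the parts at positions `index` and `index + 1`.
  defp merge_at(parts, index) do
    {left, [a, b | rest]} = Enum.split(parts, index)
    left ++ [a <> b | rest]
  end
end
```

With a toy rank table such as `%{"ab" => 0, "abc" => 1}`, calling `BPESketch.merge("abcab", ranks)` first merges both `"ab"` pairs and then merges `"ab"` with `"c"`, leaving the parts `["abc", "ab"]`. A real tokenizer would then look up each part's token id in the loaded rank file. This sketch rescans the whole list on every pass for clarity and makes no attempt at efficiency.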