Agentix.Tokenizer behaviour (Agentix v0.1.0)


Approximate token counting for budgeting the assembled context.

Exact token counts are model-specific and come back from the provider only after a call, and ReqLLM exposes no public count API, so Agentix owns a pre-send estimate. The estimate is defined as a behaviour whose default implementation, Agentix.Tokenizer.Heuristic, is a byte-count estimate. A real tokenizer (a tiktoken NIF, etc.) is a later optional adapter behind the same behaviour, selected via config :agentix, :tokenizer, MyTokenizer.
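To make "byte-count estimate" concrete, here is a minimal sketch of what such a heuristic could look like. The module name ByteEstimateSketch and the ratio of 4 bytes per token are assumptions for illustration; the ratio that Agentix.Tokenizer.Heuristic actually uses is not stated here.

```elixir
defmodule ByteEstimateSketch do
  # Hypothetical ratio: the real Agentix.Tokenizer.Heuristic may use another.
  @bytes_per_token 4

  @spec count(String.t()) :: non_neg_integer()
  def count(text) when is_binary(text) do
    # Ceiling division, so any non-empty string counts as at least one token.
    div(byte_size(text) + @bytes_per_token - 1, @bytes_per_token)
  end
end

# "hello world" is 11 bytes, so the estimate is 3.
IO.inspect(ByteEstimateSketch.count("hello world"))
# "héllo" is 6 bytes in UTF-8 (é takes two), so the estimate is 2.
IO.inspect(ByteEstimateSketch.count("héllo"))
```

Counting bytes rather than graphemes makes multi-byte text count higher, which errs on the conservative side for budgeting.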

Budget conservatively: an over-budget context is a hard provider failure, while over-eager compaction is only mild waste. Counting covers text content parts, which make up the bulk of a context; it intentionally under-counts structured tool-call arguments, so set working_budget comfortably below the model window.
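As a worked example of leaving headroom, the sketch below derives a working budget from a model window. The module BudgetSketch, its function names, the 128,000-token window, the 4,096-token output reserve, and the 15% margin are all illustrative assumptions, not Agentix settings or recommendations.

```elixir
defmodule BudgetSketch do
  # Hypothetical helper: subtract the space reserved for the model's reply,
  # then keep a percentage margin to absorb under-counted tool-call arguments.
  @spec working_budget(pos_integer(), non_neg_integer(), 0..100) :: non_neg_integer()
  def working_budget(model_window, reserved_output, margin_pct) do
    available = max(model_window - reserved_output, 0)
    div(available * (100 - margin_pct), 100)
  end

  # Compaction is due once the estimate exceeds the working budget.
  @spec compact?(non_neg_integer(), non_neg_integer()) :: boolean()
  def compact?(estimated_tokens, budget), do: estimated_tokens > budget
end

# (128_000 - 4_096) * 85 / 100, rounded down, is 105_318.
budget = BudgetSketch.working_budget(128_000, 4_096, 15)
IO.inspect(budget)
# An estimate of 110_000 tokens is over that budget, so this prints true.
IO.inspect(BudgetSketch.compact?(110_000, budget))
```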

Summary

Callbacks

count(t)

Estimated token count for a string.

Functions

count(text)

Estimated token count for a string, via the configured tokenizer.

count_context(context)

Estimated token count for an assembled context (sum over message text parts).

impl()

The configured tokenizer module (default Agentix.Tokenizer.Heuristic).

Callbacks

count(t)

@callback count(String.t()) :: non_neg_integer()

Estimated token count for a string.
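A custom adapter only has to implement this one callback. The following is a minimal sketch, assuming :agentix is a dependency so that the behaviour module is available; the word-based ratio (about 4 tokens per 3 words) is a hypothetical rule of thumb chosen for illustration, not a property of any model.

```elixir
defmodule MyTokenizer do
  # Assumes :agentix is a dependency so the behaviour module is available.
  @behaviour Agentix.Tokenizer

  # Hypothetical rule of thumb: about 4 tokens for every 3
  # whitespace-separated words.
  @impl true
  def count(text) when is_binary(text) do
    words = text |> String.split() |> length()
    # Integer ceiling of words * 4 / 3, so partial tokens round up.
    div(words * 4 + 2, 3)
  end
end
```

With the module compiled into the application, it would be selected with config :agentix, :tokenizer, MyTokenizer in the application's config file.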

Functions

count(text)

@spec count(String.t()) :: non_neg_integer()

Estimated token count for a string, via the configured tokenizer.

count_context(context)

@spec count_context(ReqLLM.Context.t()) :: non_neg_integer()

Estimated token count for an assembled context (sum over message text parts).
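The "sum over message text parts" can be sketched as follows. Plain maps stand in for the context here: the field names (messages, content, type, text) are an assumed shape for illustration and are not confirmed against the ReqLLM.Context struct. The estimator is likewise a stand-in (about 4 bytes per token), not Agentix's own.

```elixir
defmodule ContextCountSketch do
  # Stand-in estimator (assumed ~4 bytes per token), not Agentix's own.
  defp estimate(text), do: div(byte_size(text) + 3, 4)

  # Sums estimates over text parts only. Parts of any other type, such as
  # tool calls, do not match the pattern and so contribute nothing.
  def count_context(%{messages: messages}) do
    for %{content: parts} <- messages,
        %{type: :text, text: text} <- parts,
        reduce: 0 do
      total -> total + estimate(text)
    end
  end
end

# Hypothetical context shape, for illustration only.
context = %{
  messages: [
    %{role: :user, content: [%{type: :text, text: "What is the weather?"}]},
    %{role: :assistant, content: [%{type: :tool_call, name: "get_weather", arguments: %{city: "Oslo"}}]},
    %{role: :tool, content: [%{type: :text, text: "12C, cloudy"}]}
  ]
}

# 20 bytes -> 5 and 11 bytes -> 3; the tool call adds 0, so this prints 8.
IO.inspect(ContextCountSketch.count_context(context))
```

The skipped tool-call part is exactly the under-count that the budgeting advice above is meant to absorb.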

impl()

@spec impl() :: module()

The configured tokenizer module (default Agentix.Tokenizer.Heuristic).
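One plausible way to resolve a configured module with a default is Application.get_env/3. The sketch below shows that pattern with stand-in modules; the application name :dispatch_sketch and every module in it are hypothetical, and this is not taken from the Agentix source.

```elixir
defmodule DispatchSketch.Default do
  # Stand-in default: a byte-count estimate with an assumed 4 bytes per token.
  def count(text), do: div(byte_size(text) + 3, 4)
end

defmodule DispatchSketch.Fixed do
  # Toy alternative, used only to show that the configured module is picked up.
  def count(_text), do: 42
end

defmodule DispatchSketch do
  # Hypothetical app name and key. Reads the configured module, falling back
  # to the default when nothing is set.
  def impl, do: Application.get_env(:dispatch_sketch, :tokenizer, DispatchSketch.Default)

  # Delegates to whichever module is currently configured.
  def count(text), do: impl().count(text)
end

# Nothing configured yet, so this prints DispatchSketch.Default.
IO.inspect(DispatchSketch.impl())

# After configuring the alternative, count/1 dispatches to it and prints 42.
Application.put_env(:dispatch_sketch, :tokenizer, DispatchSketch.Fixed)
IO.inspect(DispatchSketch.count("anything"))
```

Reading the setting at call time, rather than at compile time, lets tests swap the tokenizer without recompiling.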