ExLLM.Context.Strategies (ex_llm v0.5.0)
Message truncation strategies for context window management.
Provides different algorithms for fitting messages within token limits:
- Sliding window: Remove oldest messages first
- Smart: Preserve system messages and prioritize recent messages
- Summary: Replace old messages with summaries (future)
Functions
@spec calculate_distribution( non_neg_integer(), keyword() ) :: map()
Calculate token distribution for different message types.
Returns suggested token allocations for system, conversation, and response.
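As a rough illustration of what such a distribution could look like, the sketch below splits a budget into the three allocations named above. The module name, the option names (`:system_percent`, `:response_percent`), and the default shares are all assumptions made for this example; this page does not state the ratios or options the library actually uses.

```elixir
defmodule DistributionSketch do
  # Hypothetical default shares, expressed as integer percentages.
  # These values are assumptions, not the library's documented defaults.
  @defaults [system_percent: 20, response_percent: 30]

  def calculate_distribution(max_tokens, opts \\ []) do
    opts = Keyword.merge(@defaults, opts)

    system = div(max_tokens * opts[:system_percent], 100)
    response = div(max_tokens * opts[:response_percent], 100)

    # Whatever is left after system and response goes to the conversation.
    %{
      system: system,
      conversation: max_tokens - system - response,
      response: response
    }
  end
end
```

Using integer percentages avoids floating-point rounding, and the three allocations always add up to `max_tokens`.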
Group consecutive messages by role for better token efficiency.
Some models handle grouped messages more efficiently.
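One way to picture the grouping is to merge runs of adjacent messages that share a role into a single message. The sketch below assumes messages are maps with atom keys `:role` and `:content` and joins the contents with a newline; the module name, function name, and joining rule are hypothetical and may differ from the library's behavior.

```elixir
defmodule GroupSketch do
  # Merge consecutive messages with the same role into one message.
  # Assumes each message is a map with :role and :content keys.
  def group_by_role(messages) do
    messages
    |> Enum.chunk_by(& &1.role)
    |> Enum.map(fn [first | _] = chunk ->
      %{role: first.role, content: Enum.map_join(chunk, "\n", & &1.content)}
    end)
  end
end
```

Only adjacent messages are merged, so the order of turns in the conversation is preserved.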
@spec sliding_window([map()], non_neg_integer()) :: [map()]
Sliding window truncation - keep most recent messages.
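A minimal sketch of the idea, not the library's implementation: walk the list from newest to oldest and keep messages until the budget runs out. The token estimator (about four characters per token) and the atom-keyed message shape are assumptions for this example.

```elixir
defmodule SlidingWindowSketch do
  # Hypothetical estimator: roughly 4 characters per token.
  # The real library may count tokens differently.
  def estimate_tokens(%{content: content}) when is_binary(content) do
    div(String.length(content), 4) + 1
  end

  # Visit messages newest-first, keep each one while it fits in the
  # budget, and stop at the first message that does not fit.
  def sliding_window(messages, max_tokens) do
    messages
    |> Enum.reverse()
    |> Enum.reduce_while({[], 0}, fn msg, {kept, used} ->
      cost = estimate_tokens(msg)

      if used + cost <= max_tokens do
        # Prepending while walking newest-first restores chronological order.
        {:cont, {[msg | kept], used + cost}}
      else
        {:halt, {kept, used}}
      end
    end)
    |> elem(0)
  end
end
```

Stopping at the first message that does not fit keeps the retained window contiguous, so no turn is dropped from the middle of the conversation.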
@spec smart_truncate([map()], non_neg_integer()) :: [map()]
Smart truncation - preserve system messages and recent conversation.
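The behavior described here can be sketched as: set the system messages aside, charge their cost against the budget, then fill the remainder with the most recent conversation messages. This is an illustrative sketch under assumptions (system messages carry the role `"system"`, messages use atom keys, tokens are estimated at about four characters each); the library's actual prioritization may be more involved.

```elixir
defmodule SmartTruncateSketch do
  def smart_truncate(messages, max_tokens) do
    # Assumption: system messages are identified by role "system".
    {system, rest} = Enum.split_with(messages, &(&1.role == "system"))

    system_cost = system |> Enum.map(&estimate_tokens/1) |> Enum.sum()
    remaining = max(max_tokens - system_cost, 0)

    # System messages are always kept; the rest competes for what is left.
    system ++ keep_recent(rest, remaining)
  end

  # Hypothetical estimator: roughly 4 characters per token.
  defp estimate_tokens(%{content: content}) when is_binary(content) do
    div(String.length(content), 4) + 1
  end

  # Keep the newest messages that fit in the budget, in original order.
  defp keep_recent(messages, budget) do
    messages
    |> Enum.reverse()
    |> Enum.reduce_while({[], 0}, fn msg, {kept, used} ->
      cost = estimate_tokens(msg)

      if used + cost <= budget do
        {:cont, {[msg | kept], used + cost}}
      else
        {:halt, {kept, used}}
      end
    end)
    |> elem(0)
  end
end
```

Note that this sketch moves all system messages to the front of the result, which changes the order if system messages appear mid-conversation.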
Split messages into system and non-system messages.
@spec truncate([map()], non_neg_integer(), atom()) :: [map()]
Apply a truncation strategy to fit messages within token limit.
Strategies
- :sliding_window - Remove messages from the beginning
- :smart - Keep system messages and recent messages
- :fifo - First in, first out (alias for sliding_window)
- :lifo - Last in, first out (keep oldest messages)
Examples
Strategies.truncate(messages, 1000, :sliding_window)
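To make the difference between the strategies concrete, here is a hedged sketch of how a dispatcher over the strategy atom might look. It covers only `:sliding_window`, `:fifo`, and `:lifo`; the module name, the token estimator (about four characters per token), and the atom-keyed message shape are assumptions, and the library's own dispatch may differ.

```elixir
defmodule TruncateSketch do
  # :sliding_window and :fifo drop the oldest messages first,
  # so they keep the newest messages that fit.
  def truncate(messages, max_tokens, strategy)
      when strategy in [:sliding_window, :fifo] do
    messages
    |> Enum.reverse()
    |> take_within(max_tokens)
    |> Enum.reverse()
  end

  # :lifo drops the newest messages first, so it keeps the oldest that fit.
  def truncate(messages, max_tokens, :lifo) do
    take_within(messages, max_tokens)
  end

  # Hypothetical estimator: roughly 4 characters per token.
  defp estimate_tokens(%{content: content}) when is_binary(content) do
    div(String.length(content), 4) + 1
  end

  # Take messages from the front of the list while they fit in the budget.
  defp take_within(messages, budget) do
    messages
    |> Enum.reduce_while({[], 0}, fn msg, {kept, used} ->
      cost = estimate_tokens(msg)

      if used + cost <= budget do
        {:cont, {[msg | kept], used + cost}}
      else
        {:halt, {kept, used}}
      end
    end)
    |> elem(0)
    |> Enum.reverse()
  end
end
```

With three messages of equal size and room for two, `:fifo` keeps the last two while `:lifo` keeps the first two.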