erllama_cache_policy (erllama v0.1.0)


Pure-Erlang policy decisions for the erllama_cache subsystem.

Two responsibilities:

1. Boundary trim: cold saves persist a *trimmed, aligned prefix* of the prompt rather than the full live token list. Appending text can change how the last few tokens of the old prompt are tokenised, so saving only the stable prefix means that a later request whose prompt is a textual extension of this one still lands on the saved cache key after BPE retokenisation. The algorithm drops a fixed number of tokens from the tail, then rounds the remaining length down to a multiple of the configured alignment chunk.

2. Save-reason gating: cold/continued/finish saves each have a simple guard (token-count thresholds and intervals). Eviction and shutdown saves are unconditional and do not pass through this module.

This module has no side effects; everything is testable as plain data transformations.

Summary

Types

config/0

-type config() ::
          #{min_tokens := non_neg_integer(),
            cold_min_tokens := non_neg_integer(),
            cold_max_tokens := non_neg_integer(),
            continued_interval := pos_integer(),
            boundary_trim_tokens := non_neg_integer(),
            boundary_align_tokens := pos_integer(),
            session_resume_wait_ms => non_neg_integer()}.
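
As an illustration of the shape this type describes, a map along the following lines would satisfy `config()`. The module name and every numeric value are hypothetical, chosen only for the example; they are not defaults taken from erllama.

```erlang
-module(policy_config_example).
-export([example_config/0]).

%% Hypothetical values: they only illustrate the shape of config().
example_config() ->
    #{min_tokens => 64,
      cold_min_tokens => 256,
      cold_max_tokens => 8192,
      continued_interval => 512,       % pos_integer(): must be >= 1
      boundary_trim_tokens => 16,
      boundary_align_tokens => 32,     % pos_integer(): must be >= 1
      %% Optional key (declared with => in the type), may be omitted.
      session_resume_wait_ms => 250}.
```

The six keys declared with `:=` are mandatory; `session_resume_wait_ms` is the only optional one.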

token/0

-type token() :: non_neg_integer().

Functions

cold_save_split(Tokens, Cfg)

-spec cold_save_split([token()], config()) -> {trim, [token()], [token()]} | no_save.

should_continued_save(LiveCount, LastSavedAtCount, Cfg)

-spec should_continued_save(non_neg_integer(), non_neg_integer(), config()) -> boolean().

should_finish_save(LiveCount, Cfg)

-spec should_finish_save(non_neg_integer(), config()) -> boolean().
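
The module overview describes these guards only as token-count thresholds and intervals, so the exact rules are not stated here. The sketch below shows one plausible reading under two loud assumptions: that a continued save fires once `continued_interval` tokens have accumulated since the last save, and that a finish save is gated on `min_tokens`. The module `save_gate_sketch` is hypothetical and may not match the real implementation.

```erlang
-module(save_gate_sketch).
-export([should_continued_save/3, should_finish_save/2]).

%% ASSUMPTION: save again once at least continued_interval tokens
%% have been produced since the last saved count.
should_continued_save(LiveCount, LastSavedAtCount,
                      #{continued_interval := Interval}) ->
    LiveCount - LastSavedAtCount >= Interval.

%% ASSUMPTION: a finish save is only worthwhile at or above min_tokens.
should_finish_save(LiveCount, #{min_tokens := Min}) ->
    LiveCount >= Min.
```

Both functions are pure: they take counts and a config map and return a boolean, which is what makes this kind of gating easy to unit-test.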

trim_boundary(Tokens, Trim, Align)

-spec trim_boundary([token()], non_neg_integer(), pos_integer()) -> {ok, [token()]} | {skip, too_short}.
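
The boundary-trim description in the module overview (drop a fixed number of tail tokens, then align down to a chunk multiple) can be sketched as follows. This is a minimal re-implementation for illustration, not the module's source. It assumes that an aligned length of zero is what produces `{skip, too_short}`; the module name `trim_sketch` is hypothetical.

```erlang
-module(trim_sketch).
-export([trim_boundary/3]).

%% Drop Trim tokens from the tail, then round the remaining length
%% down to a multiple of Align and keep that many leading tokens.
trim_boundary(Tokens, Trim, Align)
  when is_list(Tokens), is_integer(Trim), Trim >= 0,
       is_integer(Align), Align > 0 ->
    Remaining = length(Tokens) - Trim,
    Keep = case Remaining > 0 of
               true  -> (Remaining div Align) * Align;
               false -> 0
           end,
    case Keep of
        %% ASSUMPTION: nothing left after trimming and aligning means skip.
        0 -> {skip, too_short};
        _ -> {ok, lists:sublist(Tokens, Keep)}
    end.
```

With 13 tokens, `Trim = 3` and `Align = 4`, the sketch keeps 8 tokens: 13 - 3 = 10, rounded down to the nearest multiple of 4.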

validate_config(Cfg)

-spec validate_config(map()) -> ok | {error, term()}.
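
A validator consistent with the `config()` type could check that each mandatory key is present and holds an integer in the declared range. The sketch below is one way to do that. The module name and the error terms (`not_a_map`, `missing_key`, `bad_value`) are hypothetical, since the spec only promises `{error, term()}`, and the optional `session_resume_wait_ms` key is left unchecked for brevity.

```erlang
-module(validate_sketch).
-export([validate_config/1]).

validate_config(Cfg) when is_map(Cfg) ->
    %% {Key, MinimumAllowed}: 0 for non_neg_integer(), 1 for pos_integer().
    Required = [{min_tokens, 0}, {cold_min_tokens, 0}, {cold_max_tokens, 0},
                {continued_interval, 1}, {boundary_trim_tokens, 0},
                {boundary_align_tokens, 1}],
    check(Required, Cfg);
validate_config(_) ->
    {error, not_a_map}.   % hypothetical error term

%% Stop at the first key that is missing or out of range.
check([], _Cfg) ->
    ok;
check([{Key, Min} | Rest], Cfg) ->
    case maps:find(Key, Cfg) of
        {ok, V} when is_integer(V), V >= Min -> check(Rest, Cfg);
        {ok, V} -> {error, {bad_value, Key, V}};
        error   -> {error, {missing_key, Key}}
    end.
```

Returning on the first failure keeps the error term small; a caller that wants every problem reported at once would accumulate the failures into a list instead.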