ExAthena.Compactor behaviour (ExAthena v0.4.1)


Behaviour for context-window compaction.

When the conversation's estimated token footprint crosses :compact_at (a fraction of the provider's max_tokens), the loop asks the Compactor to reduce the size of the history. The Compactor's job is to preserve intent and pinned rules while replacing the middle of the history with a summary.

Contract

  • Pinned prefix: the first N messages (:pinned_prefix_count) are never dropped. System prompts and CLAUDE.md-style pinned rules live there.
  • Live suffix: the last K messages (:live_suffix_count) are never dropped. This is the recent context the model needs to keep reasoning.
  • Middle: everything in between is the Compactor's to replace. It may emit zero or more summary messages, which sit where the dropped messages used to be.
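
The three-way split in the contract can be sketched as below. This is a minimal illustration, not the library's code: the module name CompactionSketch and both function names are hypothetical, and messages are treated as opaque list elements.

```elixir
defmodule CompactionSketch do
  @moduledoc false

  # Splits a message list into {pinned_prefix, middle, live_suffix}.
  # `pinned` and `live` stand in for :pinned_prefix_count and :live_suffix_count.
  def partition(messages, pinned, live) when pinned >= 0 and live >= 0 do
    {prefix, rest} = Enum.split(messages, pinned)
    # Never take more suffix than remains after the prefix.
    keep = min(live, length(rest))
    {middle, suffix} = Enum.split(rest, length(rest) - keep)
    {prefix, middle, suffix}
  end

  # Rebuilds the history with `summaries` (zero or more messages)
  # sitting where the middle used to be.
  def replace_middle(messages, pinned, live, summaries) do
    {prefix, _middle, suffix} = partition(messages, pinned, live)
    prefix ++ summaries ++ suffix
  end
end
```

When the history is shorter than N + K, the middle is empty and nothing is dropped.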

The default implementation, ExAthena.Compactors.Summary, uses the session's own provider to generate a terse summary message and substitutes it for the middle. Consumers can swap in any module via config :ex_athena, compactor: MyApp.MyCompactor.
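
For example, the swap could look like this in an application's config file. The file path is an assumption about a conventional Mix project layout, and MyApp.MyCompactor is the placeholder module name used above.

```elixir
# config/config.exs
import Config

# Use a custom module in place of ExAthena.Compactors.Summary.
config :ex_athena, compactor: MyApp.MyCompactor
```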

Why

Research (Anthropic compact_20260112 beta, Cline, Claude Agent SDK) suggests that proactive compaction at ~60% of the context limit beats reactive truncation at 95%: the model never notices a sudden loss of continuity, and pinned rules survive every compaction cycle.

Types

decision()

@type decision() ::
  {:compact, messages :: [ExAthena.Messages.Message.t()], metadata :: map()}
  | :skip
  | {:error, term()}

estimate()

@type estimate() :: %{tokens: non_neg_integer(), max_tokens: non_neg_integer()}

Callbacks

compact(t, estimate)

@callback compact(ExAthena.Loop.State.t(), estimate()) :: decision()

Run compaction against the current state. Return one of:

  • {:compact, new_messages, metadata} — the kernel swaps state.messages for new_messages and emits a {:compaction, …} event with metadata.
  • :skip — do nothing this cycle (e.g. compactor judged compaction not yet necessary). The kernel emits no event.
  • {:error, reason} — terminate the run with :error_compaction_failed.
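
A minimal sketch of a module returning these values might look as follows. It makes several assumptions: the state is matched only on its messages field, messages are plain maps with :role and :content keys (the real ExAthena.Messages.Message struct may differ), the pinned and live counts are hard-coded, and the metadata key :dropped is invented for illustration. A real compactor would also declare `@behaviour ExAthena.Compactor`, which is left out here so the snippet compiles without the library.

```elixir
defmodule MyApp.DropMiddleCompactor do
  # Hypothetical fixed counts; the real values come from
  # :pinned_prefix_count and :live_suffix_count.
  @pinned 1
  @live 2

  # Replaces the middle of the history with a one-line placeholder note.
  def compact(%{messages: messages}, %{tokens: _tokens, max_tokens: _max}) do
    {prefix, rest} = Enum.split(messages, @pinned)
    {middle, suffix} = Enum.split(rest, max(length(rest) - @live, 0))

    case middle do
      [] ->
        # Nothing between prefix and suffix, so there is nothing to compact.
        :skip

      dropped ->
        count = length(dropped)
        # Assumed message shape; not the library's struct.
        note = %{role: :user, content: "[#{count} earlier messages omitted]"}
        {:compact, prefix ++ [note] ++ suffix, %{dropped: count}}
    end
  end
end
```

This sketch never returns {:error, reason}; a compactor that calls out to a provider for a summary would return it when that call fails.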

should_compact?(t, estimate)

(optional)
@callback should_compact?(ExAthena.Loop.State.t(), estimate()) :: boolean()

Whether compaction should run this turn. The kernel calls this before compact/2 so the compactor can defer cheaply without having to build a summary.
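
One cheap way to implement this check is to compare the estimate against a threshold fraction. In the sketch below the module name is hypothetical, the threshold is passed as a third argument for illustration (the real callback takes two), and 0.6 is an illustrative default rather than a documented one.

```elixir
defmodule ThresholdSketch do
  # Returns true once the estimated usage reaches `compact_at`,
  # expressed as a fraction of max_tokens.
  def should_compact?(_state, %{tokens: tokens, max_tokens: max_tokens}, compact_at \\ 0.6)
      when max_tokens > 0 do
    tokens / max_tokens >= compact_at
  end
end
```

Because the check only reads the estimate map, it avoids any provider call on turns where compaction is not needed.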

Functions

estimate_tokens(messages)

@spec estimate_tokens([ExAthena.Messages.Message.t()]) :: non_neg_integer()

Best-effort token estimator. Counts ~4 chars per token for text content, plus a small fixed cost per tool-call to cover the JSON envelope. Good enough for compaction triggers; not a billing number.
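
The heuristic can be sketched like this. It is not the library's implementation: messages are assumed to be plain maps with optional :content and :tool_calls keys, and the per-tool-call cost of 20 tokens is an invented placeholder, since the actual value is not stated.

```elixir
defmodule EstimateSketch do
  @chars_per_token 4
  # Assumed flat cost per tool call to cover the JSON envelope.
  @tool_call_cost 20

  def estimate_tokens(messages) do
    Enum.reduce(messages, 0, fn message, acc ->
      text = Map.get(message, :content) || ""
      tool_calls = Map.get(message, :tool_calls) || []

      # Round the character count up to whole tokens.
      text_tokens = div(String.length(text) + @chars_per_token - 1, @chars_per_token)
      acc + text_tokens + length(tool_calls) * @tool_call_cost
    end)
  end
end
```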