Pressure-triggered context compaction trims older turns from a multi-turn agent's LLM-input message list once turn count or estimated token usage crosses a threshold. Recent turns stay intact so the agent always sees the raw breadcrumb trail of what it most recently did.

Compaction is opt-in. The library default is compaction: false.

When to enable it

You probably don't need compaction unless you're seeing one of the following:

  • A multi-turn agent that runs long enough to push toward the model's context window.
  • LLM cost dominated by repeatedly re-sending earlier turn history.
  • Agents that stall because old context is drowning out the recent error trail.

For short multi-turn runs (under ~8 turns or under a few thousand tokens), the cost of compaction outweighs the benefit. Leave it off.

Quick start

SubAgent.run(prompt, llm: llm, max_turns: 20, compaction: true)

compaction: true selects sensible defaults — the :trim strategy with trigger: [turns: 8], keep_recent_turns: 3, keep_initial_user: true.
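
As a rough illustration of how the three accepted shapes (false, true, or a keyword list) could resolve to one configuration, here is a minimal sketch. The module and function names are hypothetical and not part of the library. It also assumes that keys missing from an explicit keyword list fall back to the defaults listed above, which the text does not confirm.

```elixir
defmodule CompactionOptsSketch do
  # Defaults as listed in the text above. This module is a hypothetical
  # illustration, not the library's own normalization code.
  @defaults [
    strategy: :trim,
    trigger: [turns: 8],
    keep_recent_turns: 3,
    keep_initial_user: true,
    token_counter: nil
  ]

  # `false` (the library default) leaves compaction off.
  def normalize(false), do: :disabled

  # `true` selects every default.
  def normalize(true), do: {:ok, @defaults}

  # Assumption: explicit options override defaults key by key
  # (a shallow merge, so a given :trigger replaces the default one whole).
  def normalize(opts) when is_list(opts), do: {:ok, Keyword.merge(@defaults, opts)}
end
```

Under these assumptions, CompactionOptsSketch.normalize(keep_recent_turns: 5) keeps strategy: :trim and trigger: [turns: 8] while raising the recent window to 5 turns.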

Explicit configuration

SubAgent.run(prompt,
  llm: llm,
  max_turns: 20,
  compaction: [
    strategy: :trim,
    trigger: [turns: 8, tokens: 12_000],
    keep_recent_turns: 3,
    keep_initial_user: true,
    token_counter: nil
  ]
)

| Option | Default | Meaning |
| --- | --- | --- |
| strategy | :trim | The only Phase 1 strategy. Custom modules and :summarize are deferred. |
| trigger | [turns: 8] | Fires when state.turn > N (turns) or estimated total tokens >= N (tokens). Set both for OR semantics. |
| keep_recent_turns | 3 | The most recent N × 2 messages stay verbatim. |
| keep_initial_user | true | Keep the first user message (the original prompt) at the head of the trimmed list. |
| token_counter | nil (uses default) | 1-arity function from message content to estimated token count. |
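
The trigger rule in the table can be sketched as a small predicate. This is an illustration only: the module name is hypothetical, and checking turns before tokens (which decides the reported reason when both thresholds are crossed) is an assumption, not documented behaviour.

```elixir
defmodule TriggerSketch do
  # Returns :turn_pressure, :token_pressure, or nil when no threshold is crossed.
  # `trigger` is a keyword list such as [turns: 8, tokens: 12_000].
  def pressure(turn, estimated_tokens, trigger) do
    turn_limit = Keyword.get(trigger, :turns)
    token_limit = Keyword.get(trigger, :tokens)

    cond do
      # Turn pressure uses a strict comparison: turn > N.
      is_integer(turn_limit) and turn > turn_limit -> :turn_pressure
      # Token pressure uses an inclusive comparison: tokens >= N.
      is_integer(token_limit) and estimated_tokens >= token_limit -> :token_pressure
      true -> nil
    end
  end
end
```

With trigger: [turns: 8], turn 8 returns nil and turn 9 returns :turn_pressure. Adding tokens: 12_000 makes turn 3 with an estimate of 12_000 tokens return :token_pressure.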

What :trim does

When pressure is detected, :trim:

  1. Optionally keeps the first user message (when keep_initial_user: true and the head of the list actually has role :user).
  2. Keeps the last keep_recent_turns × 2 messages.
  3. Drops everything in between.

If slicing would produce a recent slice that begins with an :assistant message (for example, when the cut falls in the middle of a user/assistant pair), :trim drops one more message from the front so the slice begins with :user.
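
The steps above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: the module name is hypothetical, and it assumes messages are maps with :role and :content keys and that the recent slice is taken from the messages after the retained first user message.

```elixir
defmodule TrimSketch do
  def trim(messages, keep_recent_turns, keep_initial_user) do
    # Step 1: optionally split off the first message when it has role :user.
    {head, rest} =
      case messages do
        [%{role: :user} = first | tail] when keep_initial_user -> {[first], tail}
        _ -> {[], messages}
      end

    # Step 2: keep the last keep_recent_turns * 2 messages of what remains.
    recent = Enum.take(rest, -(keep_recent_turns * 2))

    # Boundary fix: drop one leading :assistant message so the slice starts with :user.
    recent =
      case recent do
        [%{role: :assistant} | tail] -> tail
        _ -> recent
      end

    # Step 3: everything between the head and the recent slice is dropped.
    head ++ recent
  end
end
```

For ten alternating messages m0 (user) through m9 (assistant) with keep_recent_turns: 2, this sketch returns m0, m6, m7, m8, m9. For nine messages m0 through m8, the four-message slice would start at the assistant message m5, so m5 is dropped and the result is m0, m6, m7, m8.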

What you'll see in step.usage.compaction

Triggered:

%{
  enabled: true,
  triggered: true,
  strategy: "trim",
  reason: :turn_pressure,        # | :token_pressure
  messages_before: 13,
  messages_after: 7,
  estimated_tokens_before: 31_200,
  estimated_tokens_after: 12_400,
  kept_initial_user?: true,
  kept_recent_turns: 3,
  over_budget?: false
}

Not triggered (compaction was active but didn't fire):

%{
  enabled: true,
  triggered: false,
  strategy: "trim",
  reason: nil,
  messages_before: 4,
  messages_after: 4,
  estimated_tokens_before: 320,
  estimated_tokens_after: 320,
  kept_initial_user?: false,
  kept_recent_turns: 3,
  over_budget?: false
}

The shape is the same in both cases — every field is always present, so consumers can read usage.compaction.messages_before without Map.has_key? guards.
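
Because the shape is fixed, a consumer can pattern-match on triggered and read the remaining fields directly. The following sketch turns a compaction map into a log line. The module name is hypothetical, and it only handles maps of the shape shown above (what step.usage.compaction holds when compaction is disabled is not covered here).

```elixir
defmodule CompactionReportSketch do
  # Compaction was active but did not fire.
  def summarize(%{triggered: false}), do: "compaction idle"

  # Compaction fired: report how much was dropped.
  def summarize(%{triggered: true} = c) do
    dropped = c.messages_before - c.messages_after
    saved = c.estimated_tokens_before - c.estimated_tokens_after
    "#{c.strategy} (#{c.reason}): dropped #{dropped} messages, ~#{saved} tokens"
  end
end
```

Applied to the triggered example above, this returns "trim (turn_pressure): dropped 6 messages, ~18800 tokens".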

over_budget?: true flags the case where a single retained message exceeds the configured trigger[:tokens] budget. :trim does not split content; you'll need to handle this at a higher level (smaller messages, larger budget, or Phase 2's summarization once it ships).
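
One way to picture the over_budget? condition is a check over the retained messages. This is a sketch under assumptions: the module name is hypothetical, messages are assumed to be maps with a :content key, and a missing trigger[:tokens] budget is assumed to mean the flag stays false.

```elixir
defmodule OverBudgetSketch do
  # Without a token budget there is nothing to exceed (assumption).
  def over_budget?(_messages, nil, _counter), do: false

  # True when any single retained message is estimated above the budget.
  def over_budget?(messages, budget, counter) when is_integer(budget) do
    Enum.any?(messages, fn %{content: content} -> counter.(content) > budget end)
  end
end
```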

Token estimation

The default counter approximates tokens as max(1, String.length(content) / 4) — a pressure heuristic, not a model-accurate count. If you need adapter-specific counting, supply your own:

compaction: [
  # :tiktoken.encode/1 stands in for your tokenizer binding; substitute its actual encode call.
  token_counter: fn content -> content |> :tiktoken.encode() |> length() end
]

The counter must be a 1-arity function returning a non-negative integer.
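
For reference, the default heuristic and a per-list estimate could look like the sketch below. The module and function names are hypothetical. Using integer division (div/2) is an assumption made so that the result is an integer, as the contract above requires.

```elixir
defmodule TokenCounterSketch do
  # Roughly one token per four characters, never less than 1.
  def default(content) when is_binary(content) do
    max(1, div(String.length(content), 4))
  end

  # Sums per-message estimates, using the supplied 1-arity counter
  # or falling back to the default heuristic when it is nil.
  def estimate(messages, counter \\ nil) do
    counter = counter || (&default/1)
    Enum.reduce(messages, 0, fn %{content: content}, acc -> acc + counter.(content) end)
  end
end
```

Under this sketch an empty string still counts as 1 token and "abcdefgh" counts as 2, so a two-message list with those contents is estimated at 3 tokens.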

Limits

  • :trim only. No summarization in Phase 1.
  • No custom strategy modules. Phase 2 will add a behaviour for that.
  • Not applied to text mode. output: :text rejects compaction at validation time.
  • Not applied to single-shot or single-shot+retry. Gate is agent.max_turns > 1.
  • Token counter is a heuristic. Adapter-aware counting is deferred to Phase 2.

For the deferred items, see the Phase 2 design stub at docs/plans/pressure-triggered-context-compaction-phase-2.md in the repo.

See Also