Jido.AI.Reasoning.TRM.Strategy (Jido AI v2.2.0)


TRM (Tiny-Recursive-Model) execution strategy for Jido agents.

This strategy implements recursive reasoning: it improves an answer iteratively through a reason-supervise-improve cycle. Each iteration has three phases:

  1. Reasoning: Generate insights about the current answer
  2. Supervision: Evaluate the answer and provide feedback
  3. Improvement: Apply feedback to generate a better answer
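The three phases above can be sketched as a plain loop. This is a minimal illustration, not the strategy's implementation: `TRMLoopSketch` and the three function arguments are hypothetical stand-ins for the LLM calls, and the sketch runs a fixed number of iterations with no early stopping.

```elixir
defmodule TRMLoopSketch do
  @moduledoc "Illustrative only; not part of Jido.AI."

  # Runs `max_steps` iterations of reason -> supervise -> improve.
  # `reason`, `supervise`, and `improve` are caller-supplied functions
  # standing in for LLM calls. Assumes max_steps >= 1.
  def run(question, initial_answer, max_steps, reason, supervise, improve) do
    Enum.reduce(1..max_steps, initial_answer, fn _step, answer ->
      insights = reason.(question, answer)
      feedback = supervise.(question, answer, insights)
      improve.(question, answer, feedback)
    end)
  end
end

# Toy stand-ins: each improvement appends "!" to the answer.
reason = fn _q, answer -> "length=#{String.length(answer)}" end
supervise = fn _q, _answer, insights -> "too short (" <> insights <> ")" end
improve = fn _q, answer, _feedback -> answer <> "!" end

TRMLoopSketch.run("q", "draft", 3, reason, supervise, improve)
# => "draft!!!"
```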

Overview

TRM applies a small network recursively to refine its answer over several passes, which keeps the parameter count low; the original TRM work reports it outperforming much larger models on some complex reasoning tasks. Key features:

  • Recursive reasoning loop with iterative answer improvement
  • Deep supervision with multiple feedback steps
  • Adaptive Computation Time (ACT) for early stopping
  • Latent state management across recursion steps

Architecture

This strategy uses a pure state machine (Jido.AI.Reasoning.TRM.Machine) for all state transitions. The strategy acts as a thin adapter that:

  • Converts instructions to machine messages
  • Converts machine directives to SDK-specific directive structs
  • Manages the machine state within the agent
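The split between a pure machine and a thin adapter might look like the following sketch. Everything here is hypothetical (`MachineSketch`, `AdapterSketch`, the status names, the `:call_llm` directive, and the `:query`/`:text` parameter keys); it only illustrates the pattern, not the actual Jido.AI.Reasoning.TRM.Machine API.

```elixir
defmodule MachineSketch do
  # Machine state; field and status names are illustrative.
  defstruct status: :idle, question: nil, answer: nil

  # Pure transition: (state, message) -> {new_state, directives}.
  # No side effects; directives describe what the caller should do next.
  def update(%__MODULE__{status: :idle} = m, {:start, question}) do
    {%{m | status: :reasoning, question: question}, [{:call_llm, question}]}
  end

  def update(%__MODULE__{status: :reasoning} = m, {:llm_result, text}) do
    {%{m | status: :completed, answer: text}, []}
  end

  # Unexpected messages leave the machine unchanged.
  def update(%__MODULE__{} = m, _msg), do: {m, []}
end

defmodule AdapterSketch do
  # Converts an instruction (action atom plus params) to a machine message.
  def to_message(:trm_start, %{query: q}), do: {:start, q}
  def to_message(:trm_llm_result, %{text: t}), do: {:llm_result, t}

  # Feeds the converted message to the machine and returns its result.
  def handle(machine, action, params) do
    MachineSketch.update(machine, to_message(action, params))
  end
end

{machine, directives} =
  AdapterSketch.handle(struct(MachineSketch), :trm_start, %{query: "2+2?"})
```

Because the machine never performs I/O itself, its transitions can be tested without an LLM or a running agent.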

Configuration

Configure via strategy options when defining your agent:

use Jido.Agent,
  name: "my_trm_agent",
  strategy: {
    Jido.AI.Reasoning.TRM.Strategy,
    model: "anthropic:claude-sonnet-4-20250514",
    max_supervision_steps: 5,
    act_threshold: 0.9
  }

Options

  • :model (optional) - Model alias or direct model spec, defaults to :fast (resolved via Jido.AI.resolve_model/1)
  • :max_supervision_steps (optional) - Maximum iterations before termination, defaults to 5
  • :act_threshold (optional) - Confidence threshold for early stopping, defaults to 0.9
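As a rough sketch of how these options could interact, the module below merges caller options over the documented defaults and checks a stop condition. `TRMOptionsSketch` is hypothetical, and the exact comparison (`>=` for both limits) is an assumption rather than something this page specifies.

```elixir
defmodule TRMOptionsSketch do
  # Defaults as documented in the Options list.
  @defaults [model: :fast, max_supervision_steps: 5, act_threshold: 0.9]

  # Caller-supplied options override the defaults.
  def resolve(opts), do: Keyword.merge(@defaults, opts)

  # Stop once confidence reaches the ACT threshold or the step budget
  # is used up. The use of >= here is an assumption.
  def stop?(step, confidence, opts) do
    confidence >= opts[:act_threshold] or step >= opts[:max_supervision_steps]
  end
end

opts = TRMOptionsSketch.resolve(max_supervision_steps: 3)
TRMOptionsSketch.stop?(1, 0.95, opts)
# => true (confidence is above the default threshold of 0.9)
```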

Signal Routing

This strategy implements signal_routes/1 which AgentServer uses to automatically route these signals to strategy commands:

  • "ai.trm.query" -> :trm_start
  • "ai.llm.response" -> :trm_llm_result
  • "ai.llm.delta" -> :trm_llm_partial
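Conceptually, the routing above is a lookup from signal type to strategy command. The sketch below models it with a plain map; `RoutesSketch` is hypothetical, and the real return shape of signal_routes/1 is not shown on this page, so treat this as an illustration of the mapping only.

```elixir
defmodule RoutesSketch do
  # Signal type -> strategy command, as listed above.
  @routes %{
    "ai.trm.query" => :trm_start,
    "ai.llm.response" => :trm_llm_result,
    "ai.llm.delta" => :trm_llm_partial
  }

  # Returns {:ok, command} for a known signal type, :error otherwise.
  def route(signal_type), do: Map.fetch(@routes, signal_type)
end

RoutesSketch.route("ai.trm.query")
# => {:ok, :trm_start}
```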

State

State is stored under agent.state.__strategy__ with a TRM-specific structure.
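The sketch below shows what reading that location could look like, using a plain map in place of a real agent. `StateSketch` and the `:confidence` and `:current_answer` keys are hypothetical; in application code, prefer the getter functions documented on this page over reaching into the state directly.

```elixir
defmodule StateSketch do
  # Returns the strategy state, or an empty map if none is set.
  def strategy_state(agent), do: Map.get(agent.state, :__strategy__, %{})

  # Reads a hypothetical :confidence key, defaulting to 0.0.
  def confidence(agent), do: Map.get(strategy_state(agent), :confidence, 0.0)
end

# A plain map standing in for an agent; key names are illustrative.
agent = %{state: %{__strategy__: %{current_answer: "4", confidence: 0.92}}}

StateSketch.confidence(agent)
# => 0.92
```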

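Summary

Functions

  • default_improvement_prompt() - Returns the default system prompt for the improvement phase.
  • default_reasoning_prompt() - Returns the default system prompt for the reasoning phase.
  • default_supervision_prompt() - Returns the default system prompt for the supervision phase.
  • get_answer_history(agent) - Gets the answer history from the agent's TRM state.
  • get_best_answer(agent) - Gets the best answer found so far.
  • get_best_score(agent) - Gets the best score achieved.
  • get_confidence(agent) - Gets the current confidence score from the agent's TRM state.
  • get_current_answer(agent) - Gets the current answer from the agent's TRM state.
  • get_supervision_step(agent) - Gets the current supervision step from the agent's TRM state.
  • llm_partial_action() - Returns the action atom for handling streaming LLM partial tokens.
  • llm_result_action() - Returns the action atom for handling LLM results.
  • request_error_action() - Returns the action atom for handling request rejection events.
  • start_action() - Returns the action atom for starting TRM reasoning.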

Types

config()

@type config() :: %{
  model: String.t(),
  max_supervision_steps: pos_integer(),
  act_threshold: float()
}

Functions

default_improvement_prompt()

@spec default_improvement_prompt() :: String.t()

Returns the default system prompt for the improvement phase.

This is provided for reference only; prompts are managed internally by the Supervision module. See Jido.AI.Reasoning.TRM.Supervision for details.

default_reasoning_prompt()

@spec default_reasoning_prompt() :: String.t()

Returns the default system prompt for the reasoning phase.

This is provided for reference only; prompts are managed internally by the Reasoning module. See Jido.AI.Reasoning.TRM.Reasoning for details.

default_supervision_prompt()

@spec default_supervision_prompt() :: String.t()

Returns the default system prompt for the supervision phase.

This is provided for reference only; prompts are managed internally by the Supervision module. See Jido.AI.Reasoning.TRM.Supervision for details.

get_answer_history(agent)

@spec get_answer_history(Jido.Agent.t()) :: [String.t()]

Gets the answer history from the agent's TRM state.

get_best_answer(agent)

@spec get_best_answer(Jido.Agent.t()) :: String.t() | nil

Gets the best answer found so far.

get_best_score(agent)

@spec get_best_score(Jido.Agent.t()) :: float()

Gets the best score achieved.

get_confidence(agent)

@spec get_confidence(Jido.Agent.t()) :: float()

Gets the current confidence score from the agent's TRM state.

get_current_answer(agent)

@spec get_current_answer(Jido.Agent.t()) :: String.t() | nil

Gets the current answer from the agent's TRM state.

get_supervision_step(agent)

@spec get_supervision_step(Jido.Agent.t()) :: non_neg_integer()

Gets the current supervision step from the agent's TRM state.

llm_partial_action()

@spec llm_partial_action() :: :trm_llm_partial

Returns the action atom for handling streaming LLM partial tokens.

llm_result_action()

@spec llm_result_action() :: :trm_llm_result

Returns the action atom for handling LLM results.

request_error_action()

@spec request_error_action() :: :trm_request_error

Returns the action atom for handling request rejection events.

start_action()

@spec start_action() :: :trm_start

Returns the action atom for starting TRM reasoning.