Prompt for deriving progressive environment state from a single step.
Implements s_t = f(s_{t-1}, a_{t-1}, o_t): each step's state is derived
from the previous state, the action taken, and the new observation.
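
The recurrence above can be sketched as a simple fold over a trajectory. This is a minimal illustration, not the actual implementation: the names `StepFn`, `build_step_prompt`, `derive_states`, and `toy_f` are hypothetical, the prompt wording is invented for the example, and states are assumed to be plain strings. In practice `f` would presumably be a model call fed by the prompt; here a deterministic stand-in is used so the sketch runs on its own.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical type for f: (s_{t-1}, a_{t-1}, o_t) -> s_t.
StepFn = Callable[[str, Optional[str], str], str]


def build_step_prompt(prev_state: str, action: Optional[str], observation: str) -> str:
    """Format the three inputs of f into one prompt (illustrative wording only)."""
    return (
        "Previous state:\n" + prev_state + "\n\n"
        "Action taken:\n" + (action or "(none)") + "\n\n"
        "New observation:\n" + observation + "\n\n"
        "Updated state:"
    )


def derive_states(
    f: StepFn,
    initial_state: str,
    steps: Iterable[Tuple[Optional[str], str]],
) -> List[str]:
    """Apply s_t = f(s_{t-1}, a_{t-1}, o_t) once per (action, observation) pair."""
    states: List[str] = []
    state = initial_state
    for action, observation in steps:
        # Each call sees only the previous state plus the current step.
        state = f(state, action, observation)
        states.append(state)
    return states


def toy_f(prev_state: str, action: Optional[str], observation: str) -> str:
    """Stand-in for a model call: appends the step to the running state."""
    return f"{prev_state} | {action}->{observation}"


if __name__ == "__main__":
    trajectory = [("open_door", "door is open"), ("walk", "in hallway")]
    for s in derive_states(toy_f, "start", trajectory):
        print(s)
```

Because each call receives only the previous state and the current step, the cost per step stays bounded regardless of trajectory length, at the price of relying on the state to carry forward everything that matters.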