viva_math/free_energy
Free Energy Principle (FEP) calculations.
Based on Karl Friston’s work (2010, 2019). Free Energy bounds surprise (negative log evidence) and can be decomposed as:
F = Π · (μ - o)² + D_KL(q || p)
The first term, Π · (μ - o)², is the accuracy term (precision-weighted prediction error). The second term, D_KL(q || p), is the complexity term (deviation from priors).
In VIVA, this is used for interoception - sensing internal state and minimizing “surprise” through prediction.
References:
- Friston (2010) “The free-energy principle: a unified brain theory?”
- Parr & Friston (2019) “Generalised free energy and active inference”
- Validated by DeepSeek R1 671B (2025)
Types
Expected Free Energy components.
In planning, an agent selects actions that minimise G = epistemic + pragmatic. Splitting the components lets you steer exploration (epistemic) vs exploitation (pragmatic) by reweighting them.
pub type ExpectedFreeEnergy {
ExpectedFreeEnergy(
epistemic: Float,
pragmatic: Float,
total: Float,
)
}
Constructors
- ExpectedFreeEnergy(epistemic: Float, pragmatic: Float, total: Float)
Arguments
- epistemic: Information gain from observing the outcome (exploration).
- pragmatic: Expected divergence from preferred outcomes (exploitation).
- total: G = epistemic + pragmatic.
Qualitative feeling based on free energy level.
pub type Feeling {
Homeostatic
Surprised
Alarmed
Overwhelmed
}
Constructors
- Homeostatic: Low free energy - predictions match reality (F < μ - σ)
- Surprised: Moderate free energy - slight mismatch (μ - σ ≤ F < μ)
- Alarmed: High free energy - significant mismatch (μ ≤ F < μ + σ)
- Overwhelmed: Very high free energy - system overwhelmed (F ≥ μ + σ)
Thresholds for feeling classification. Based on system-specific statistics (mean and standard deviation).
pub type FeelingThresholds {
FeelingThresholds(mean: Float, std_dev: Float)
}
Constructors
- FeelingThresholds(mean: Float, std_dev: Float)
Arguments
- mean: Mean free energy (baseline)
- std_dev: Standard deviation of free energy
Free Energy state for a system.
pub type FreeEnergyState {
FreeEnergyState(
free_energy: Float,
prediction_error: Float,
complexity: Float,
precision: Float,
feeling: Feeling,
)
}
Constructors
- FreeEnergyState(free_energy: Float, prediction_error: Float, complexity: Float, precision: Float, feeling: Feeling)
Arguments
- free_energy: The free energy value (lower is better)
- prediction_error: Prediction error component (precision-weighted)
- complexity: Complexity/KL divergence component
- precision: Precision used for weighting
- feeling: Qualitative feeling based on normalized free energy
Posterior over a Gaussian belief: mean and precision (inverse variance).
Bayesian predictive coding (BPC) tracks the full posterior over hidden states instead of only MAP (maximum a posteriori) estimates. Closed-form Hebbian updates (Vasilescu & Friston 2025, arXiv:2503.24016) preserve the locality of predictive coding (PC) while quantifying epistemic uncertainty.
pub type GaussianBelief {
GaussianBelief(mean: vector.Vec3, precision: Float)
}
Constructors
- GaussianBelief(mean: vector.Vec3, precision: Float)
A hierarchical predictive coding network: a stack of layers from sensory (head) to abstract (tail). Used for active inference planning at multiple scales (S-HAI 2026).
pub type Hierarchical {
Hierarchical(layers: List(HierarchicalLayer))
}
Constructors
- Hierarchical(layers: List(HierarchicalLayer))
A single layer of a hierarchical predictive-coding network.
Stores the layer’s state estimate mu, the precision (inverse variance)
of its prediction errors, and the precision of the prior over the layer
state. Higher layers send top-down predictions; bottom-up prediction
errors travel upward. See Friston (2010), Bogacz (2017), and the 2026
Meta-PCN framework for the modern formulation.
pub type HierarchicalLayer {
HierarchicalLayer(
mu: vector.Vec3,
precision: Float,
prior_precision: Float,
)
}
Constructors
- HierarchicalLayer(mu: vector.Vec3, precision: Float, prior_precision: Float)
Arguments
- mu: Posterior mean at this layer (the latent state estimate).
- precision: Precision of prediction errors flowing up from this layer.
- prior_precision: Precision of the prior over this layer’s state.
Values
pub fn active_inference_delta(
current: vector.Vec3,
target: vector.Vec3,
rate: Float,
) -> vector.Vec3
Active Inference: compute action that minimizes expected free energy.
This returns the delta to apply to current state to move toward target. Rate controls how quickly to move (0 = no movement, 1 = instant).
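The description above suggests a simple proportional step. The following is a minimal sketch in Python, used here for illustration only and not calling the Gleam library. It assumes the delta is `rate * (target - current)` per component, which matches the stated behaviour at rate 0 and rate 1 but is not confirmed by the source.

```python
def active_inference_delta(current, target, rate):
    """Assumed form: step of size rate * (target - current), per component."""
    return tuple(rate * (t - c) for c, t in zip(current, target))


current = (0.0, 0.5, -0.5)
target = (1.0, 0.5, 0.5)

# A quarter of the way toward the target on each axis.
delta = active_inference_delta(current, target, 0.25)

# rate = 1 jumps straight to the target; rate = 0 does not move.
jumped = tuple(c + d for c, d in zip(current, active_inference_delta(current, target, 1.0)))
stayed = active_inference_delta(current, target, 0.0)
```

Applying the returned delta repeatedly with a rate between 0 and 1 moves the state toward the target in geometrically shrinking steps.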
pub fn belief_update(
prior: Float,
observation: Float,
precision_prior: Float,
precision_likelihood: Float,
) -> Float
Bayesian belief update: combine prior with likelihood.
posterior ∝ likelihood × prior
Using a precision-weighted combination:
new_belief = (Π_prior × prior + Π_likelihood × observation) / (Π_prior + Π_likelihood)
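The precision-weighted combination can be checked numerically. This is a small Python sketch of the formula as written, for illustration only; it is not the Gleam implementation, and it assumes the two precisions do not sum to zero.

```python
def belief_update(prior, observation, precision_prior, precision_likelihood):
    """Precision-weighted average of a prior belief and an observation."""
    total_precision = precision_prior + precision_likelihood
    weighted_sum = precision_prior * prior + precision_likelihood * observation
    return weighted_sum / total_precision


# The observation is three times as precise as the prior,
# so the posterior lands three quarters of the way toward it.
posterior = belief_update(prior=0.0, observation=1.0,
                          precision_prior=1.0, precision_likelihood=3.0)

# Equal precisions give the midpoint.
midpoint = belief_update(2.0, 4.0, 1.0, 1.0)
```

The result always lies between the prior and the observation, pulled toward whichever has the higher precision.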
pub fn bpc_precision_update(
current_precision: Float,
error_squared: Float,
observation_count: Int,
) -> Float
Hebbian variance update: the BPC weight-rule equivalent of synaptic plasticity. Updates precision based on prediction-error magnitude.
new_precision = (count · prior_precision + 1) / (count · variance + |error|²)
Higher errors → lower precision; consistent observations → higher precision. Equivalent to a conjugate Normal-Gamma update.
pub fn bpc_update(
prior: GaussianBelief,
observation: vector.Vec3,
likelihood_precision: Float,
) -> GaussianBelief
Precision-weighted Bayesian update for a Gaussian belief from a single observation under Gaussian likelihood.
posterior_precision = prior_precision + likelihood_precision
posterior_mean = (prior_precision · prior_mean + likelihood_precision · observation) / posterior_precision
Returns the new belief. This is the closed-form variant central to BPC.
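To show how precision accumulates over several observations, here is a minimal Python sketch of the two update equations. It is for illustration only: beliefs are represented as a hypothetical `(mean, precision)` pair with a 3-tuple mean rather than the library's `GaussianBelief` and `vector.Vec3` types.

```python
def bpc_update(belief, observation, likelihood_precision):
    """One closed-form Gaussian update; belief is a (mean, precision) pair."""
    prior_mean, prior_precision = belief
    posterior_precision = prior_precision + likelihood_precision
    posterior_mean = tuple(
        (prior_precision * m + likelihood_precision * o) / posterior_precision
        for m, o in zip(prior_mean, observation)
    )
    return posterior_mean, posterior_precision


belief = ((0.0, 0.0, 0.0), 1.0)
observations = [(2.0, 0.0, 0.0), (4.0, 0.0, 0.0)]

# Fold the observations in one at a time; precision grows by the
# likelihood precision on every step.
for obs in observations:
    belief = bpc_update(belief, obs, likelihood_precision=1.0)

final_mean, final_precision = belief
```

With unit precisions the posterior mean is the running average of the prior mean and all observations seen so far, and the posterior precision counts them.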
pub fn classify_feeling(free_energy: Float) -> Feeling
Legacy version of classify_feeling with fixed thresholds, calibrated for PAD (Pleasure-Arousal-Dominance) space (max distance ~3.46).
pub fn classify_feeling_normalized(
free_energy: Float,
thresholds: FeelingThresholds,
) -> Feeling
Classify feeling using normalized thresholds.
- Homeostatic: F < μ - σ (better than expected)
- Surprised: μ - σ ≤ F < μ (slightly worse)
- Alarmed: μ ≤ F < μ + σ (worse than average)
- Overwhelmed: F ≥ μ + σ (much worse)
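The four bands above translate directly into a chain of comparisons. The sketch below is in Python for illustration only; it uses plain strings in place of the `Feeling` constructors and passes `mean` and `std_dev` as separate arguments instead of a `FeelingThresholds` value.

```python
def classify_feeling_normalized(free_energy, mean, std_dev):
    """Map a free energy value to one of the four bands around the mean."""
    if free_energy < mean - std_dev:
        return "Homeostatic"
    if free_energy < mean:
        return "Surprised"
    if free_energy < mean + std_dev:
        return "Alarmed"
    return "Overwhelmed"


# With mean 1.0 and std_dev 0.5 the band edges are 0.5, 1.0 and 1.5.
labels = [classify_feeling_normalized(f, mean=1.0, std_dev=0.5)
          for f in (0.4, 0.5, 1.0, 1.5)]
```

Each lower edge is inclusive, so a value exactly at μ - σ is already `Surprised` and a value exactly at μ + σ is `Overwhelmed`.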
pub fn complexity(
current: vector.Vec3,
baseline: vector.Vec3,
prior_variance: Float,
) -> Float
Compute complexity term using KL divergence.
Complexity = D_KL(q(θ) || p(θ))
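Here q is the posterior belief (centered on current) and p is the prior belief (centered on baseline, the homeostatic setpoint). prior_variance sets the regularization strength: a smaller prior variance penalizes deviations from the baseline more heavily.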
pub fn complexity_weighted(
current: vector.Vec3,
baseline: vector.Vec3,
weight: Float,
) -> Float
Legacy complexity function for backwards compatibility.
pub fn compute_state(
expected: vector.Vec3,
actual: vector.Vec3,
baseline: vector.Vec3,
precision: Float,
prior_variance: Float,
thresholds: FeelingThresholds,
) -> FreeEnergyState
Compute free energy and return full state with feeling. Uses normalized thresholds for feeling classification.
pub fn compute_state_simple(
expected: vector.Vec3,
actual: vector.Vec3,
baseline: vector.Vec3,
complexity_weight: Float,
) -> FreeEnergyState
Simplified compute_state with default thresholds and legacy interface. For backwards compatibility.
pub fn default_thresholds() -> FeelingThresholds
Default thresholds calibrated for PAD space. Mean and std_dev derived from typical emotional dynamics.
pub fn estimate_precision(errors: List(Float)) -> Float
Estimate precision from recent prediction errors.
Precision = 1 / variance of errors
Higher precision means more reliable predictions.
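A short Python sketch of the inverse-variance estimate follows, for illustration only. The source does not say whether the population or sample variance is used, or how zero variance is handled. This sketch assumes the population variance and a hypothetical `min_variance` floor to avoid division by zero.

```python
def estimate_precision(errors, min_variance=1e-9):
    """Inverse of the population variance of recent prediction errors."""
    if not errors:
        raise ValueError("need at least one prediction error")
    mean = sum(errors) / len(errors)
    variance = sum((e - mean) ** 2 for e in errors) / len(errors)
    # min_variance is an assumed safeguard, not part of the documented API.
    return 1.0 / max(variance, min_variance)


steady = estimate_precision([1.0, -1.0, 1.0, -1.0])  # variance 1
noisy = estimate_precision([2.0, -2.0])              # variance 4
```

Doubling the spread of the errors quarters the precision, so noisier prediction histories contribute less weight downstream.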
pub fn expected_free_energy(
predicted_outcome: vector.Vec3,
preferred_outcome: vector.Vec3,
predictive_uncertainty: Float,
) -> ExpectedFreeEnergy
Decompose Expected Free Energy.
- predicted_outcome: agent’s expectation of the future state under action a.
- preferred_outcome: agent’s goal state (homeostatic setpoint).
- predictive_uncertainty: entropy of the predictive distribution (epistemic).
pub fn free_energy(
expected: vector.Vec3,
actual: vector.Vec3,
baseline: vector.Vec3,
precision: Float,
prior_variance: Float,
) -> Float
Compute full Free Energy: F = Π·(μ-o)² + D_KL(q||p)
Parameters
- expected: predicted/expected state (μ)
- actual: observed/actual state (o)
- baseline: prior baseline state (p) - e.g., personality/homeostatic setpoint
- precision: inverse variance of predictions (Π)
- prior_variance: variance of prior beliefs (for KL term)
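The two-term decomposition can be worked through by hand. The Python sketch below is for illustration only and is not the library's code. It assumes the squared error is the squared Euclidean distance, that the KL term is the equal-variance Gaussian form |·|² / (2 · prior_variance), and, as a loudly labelled guess, that the observed state plays the role of the posterior mean in the KL term.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two 3-component states."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def free_energy(expected, actual, baseline, precision, prior_variance):
    """F = accuracy + complexity under the assumptions stated above."""
    accuracy = precision * squared_distance(expected, actual)
    # Assumption: the KL term compares the observed state with the baseline.
    complexity = squared_distance(actual, baseline) / (2.0 * prior_variance)
    return accuracy + complexity


f = free_energy(
    expected=(0.0, 0.0, 0.0),
    actual=(1.0, 0.0, 0.0),
    baseline=(0.0, 0.0, 0.0),
    precision=2.0,
    prior_variance=0.5,
)
# accuracy = 2.0 * 1.0 = 2.0, complexity = 1.0 / (2 * 0.5) = 1.0
```

Raising `precision` makes the same mismatch cost more, while raising `prior_variance` makes drifting from the baseline cost less.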
pub fn gaussian_kl_divergence(
posterior_mean: vector.Vec3,
prior_mean: vector.Vec3,
variance: Float,
) -> Float
Compute KL divergence between Gaussian distributions (closed form).
CORRECTED per DeepSeek R1 validation. Full KL for univariate Gaussians:
D_KL(N(μ₁,σ₁²) || N(μ₂,σ₂²)) = ln(σ₂/σ₁) + (σ₁² + (μ₁ - μ₂)²)/(2σ₂²) - 1/2
When variances are equal (σ₁ = σ₂), reduces to: (μ₁ - μ₂)²/(2σ²)
This measures how much the posterior (current belief) diverges from prior.
pub fn gaussian_kl_divergence_full(
posterior_mean: vector.Vec3,
prior_mean: vector.Vec3,
posterior_variance: Float,
prior_variance: Float,
) -> Float
Full KL divergence between multivariate isotropic Gaussians with different (scalar) variances in d=3 dimensions (Vec3).
D_KL(N(μ₁, σ₁² I_d) || N(μ₂, σ₂² I_d)) = (d/2) · ln(σ₂² / σ₁²) + (d·σ₁² + |μ₁-μ₂|²) / (2σ₂²) - d/2
For equal variances (σ₁ = σ₂ = σ) the log term vanishes and the d·σ² terms cancel in any dimension d, so this reduces to |μ₁-μ₂|² / (2σ²), matching gaussian_kl_divergence/3.
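The isotropic formula can be evaluated directly to confirm the equal-variance reduction. This Python sketch is for illustration only; it takes the dimension d from the length of the mean tuples and assumes both variances are strictly positive.

```python
import math


def gaussian_kl_divergence_full(posterior_mean, prior_mean,
                                posterior_variance, prior_variance):
    """KL(N(mu1, s1^2 I) || N(mu2, s2^2 I)) for isotropic Gaussians."""
    d = len(posterior_mean)
    sq_dist = sum((a - b) ** 2 for a, b in zip(posterior_mean, prior_mean))
    log_term = 0.5 * d * math.log(prior_variance / posterior_variance)
    quad_term = (d * posterior_variance + sq_dist) / (2.0 * prior_variance)
    return log_term + quad_term - 0.5 * d


# Equal variances: only the mean difference contributes, 9 / (2 * 2) = 2.25.
equal_var = gaussian_kl_divergence_full((1.0, 2.0, 2.0), (0.0, 0.0, 0.0), 2.0, 2.0)

# Identical distributions have zero divergence.
identical = gaussian_kl_divergence_full((1.0, 1.0, 1.0), (1.0, 1.0, 1.0), 1.0, 1.0)

# Same mean but a narrower posterior than prior still costs something.
narrower = gaussian_kl_divergence_full((0.0, 0.0, 0.0), (0.0, 0.0, 0.0), 1.0, 4.0)
```

The divergence is not symmetric: swapping the posterior and prior variances in the last call gives a different value.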
pub fn generalized_free_energy(
expected_state: vector.Vec3,
preferred_state: vector.Vec3,
uncertainty: Float,
) -> Float
Generalized Free Energy (expected free energy for planning).
G = ambiguity + risk
- ambiguity: expected surprise under model (epistemic value)
- risk: KL divergence from preferred outcomes (pragmatic value)
Used for action selection in active inference.
pub fn hierarchical_errors(h: Hierarchical) -> List(vector.Vec3)
Per-layer prediction error: e_l = mu_l - g(mu_{l+1}).
In the simplest linear PC model, g is the identity. For richer models
pass a custom decoder via hierarchical_errors_with.
pub fn hierarchical_errors_with(
h: Hierarchical,
decoder: fn(vector.Vec3) -> vector.Vec3,
) -> List(vector.Vec3)
Hierarchical prediction errors with custom top-down decoder.
pub fn hierarchical_free_energy(h: Hierarchical) -> Float
Hierarchical free energy summed across layers.
F_total = Σ_l Π_l · |e_l|²
where e_l is the prediction error between layer l and the top-down prediction from layer l+1. This is the variant Meta-PCN (ICLR 2026) regularises with weight-variance normalisation to avoid exploding errors in deep networks.
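The per-layer errors and their precision-weighted sum can be sketched together. The Python below is for illustration only and does not use the library's `Hierarchical` type. Layers are given as parallel lists of means and precisions, ordered from sensory to abstract, and the top layer is assumed to contribute no error term because nothing predicts it from above.

```python
def hierarchical_errors(mus, decoder=lambda v: v):
    """e_l = mu_l - g(mu_{l+1}) for every layer that has a layer above it."""
    return [
        tuple(a - b for a, b in zip(lower, decoder(upper)))
        for lower, upper in zip(mus, mus[1:])
    ]


def hierarchical_free_energy(mus, precisions):
    """Sum over layers of precision times squared error norm."""
    errors = hierarchical_errors(mus)
    # zip stops at the shorter list, so the top layer's precision is unused.
    return sum(p * sum(c * c for c in e) for p, e in zip(precisions, errors))


mus = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
precisions = [2.0, 0.5, 1.0]

errors = hierarchical_errors(mus)
total = hierarchical_free_energy(mus, precisions)
# Layer 0: 2.0 * 1.0 = 2.0; layer 1: 0.5 * 4.0 = 2.0.
```

Passing a different `decoder` replaces the identity mapping g, which is the role `hierarchical_errors_with` plays in the API.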
pub fn hierarchical_infer(
h: Hierarchical,
lr: Float,
n: Int,
) -> Hierarchical
Run n inference steps. Convenience wrapper around
hierarchical_inference_step.
pub fn hierarchical_inference_step(
h: Hierarchical,
lr: Float,
) -> Hierarchical
One inference step of gradient descent on the hierarchical free energy.
For each non-top layer l, updates the latent state μ_l along the descent
direction -∂F/∂μ_l, where:
∂F/∂μ_l = Π_l · (μ_l - μ_{l+1}) + Π_{l-1} · (μ_l - μ_{l-1})
The first term is the top-down prior fit; the second is the bottom-up evidence fit.
lr is the learning rate (step size); typical values 0.01–0.1 for stable
inference. The bottom layer’s μ is left untouched — it represents the
sensory observation and is fixed during inference.
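A minimal Python sketch of one descent step and its n-step wrapper follows, for illustration only. It uses parallel lists of means and precisions instead of the library's types, and it assumes that both the bottom (sensory) layer and the top layer are held fixed, so only interior layers move. The source fixes the bottom layer explicitly; the treatment of the top layer is this sketch's reading of "non-top layer" and may differ from the implementation.

```python
def hierarchical_inference_step(mus, precisions, lr):
    """Move each interior layer one step down the free energy gradient."""
    updated = list(mus)
    for l in range(1, len(mus) - 1):
        grad = tuple(
            precisions[l] * (m - above) + precisions[l - 1] * (m - below)
            for m, above, below in zip(mus[l], mus[l + 1], mus[l - 1])
        )
        updated[l] = tuple(m - lr * g for m, g in zip(mus[l], grad))
    return updated


def hierarchical_infer(mus, precisions, lr, n):
    """Apply the single step n times."""
    for _ in range(n):
        mus = hierarchical_inference_step(mus, precisions, lr)
    return mus


mus = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
precisions = [1.0, 1.0, 1.0]

one_step = hierarchical_inference_step(mus, precisions, lr=0.1)
settled = hierarchical_infer(mus, precisions, lr=0.1, n=200)
```

With equal precisions the middle layer settles halfway between its neighbours, here at 2.0 on the first axis, which is where the two gradient terms cancel.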
pub fn meta_prediction_errors(
h: Hierarchical,
) -> List(vector.Vec3)
Meta-prediction error: prediction error of the prediction error.
Meta-PCN (Lin et al. ICLR 2026) shows that minimising “PEs of PEs” linearises the otherwise non-linear PCN equilibrium dynamics, yielding dramatically more stable inference at depth.
meta_e_l = e_l - h(e_{l+1})
where h is the identity in the simplest case.
pub fn policy_posterior(
policies: List(#(a, vector.Vec3, Float)),
preferred_outcome: vector.Vec3,
beta: Float,
) -> List(#(a, Float))
Softmax over policies: probability of selecting each action given its Expected Free Energy. Lower G → higher probability (β controls sharpness).
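The softmax step can be sketched on its own. The Python below is for illustration only. It takes precomputed Expected Free Energy values as a plain list, whereas the library computes them from each policy's predicted outcome and uncertainty, and it assumes the usual form P(π) ∝ exp(-β · G).

```python
import math


def policy_posterior(g_values, beta):
    """Softmax over negative expected free energy, scaled by beta."""
    logits = [-beta * g for g in g_values]
    # Subtract the largest logit before exponentiating for numerical stability.
    top = max(logits)
    weights = [math.exp(x - top) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


g_values = [1.0, 2.0, 3.0]

soft = policy_posterior(g_values, beta=1.0)
uniform = policy_posterior(g_values, beta=0.0)
sharp = policy_posterior(g_values, beta=50.0)
```

At β = 0 every policy is equally likely; as β grows the distribution concentrates on the policy with the lowest G, approaching the hard choice made by `select_policy`.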
pub fn precision_weighted_error_vec(
expected: vector.Vec3,
actual: vector.Vec3,
precisions: vector.Vec3,
) -> Float
Precision-weighted prediction error for Vec3.
Each dimension can have different precision. Returns weighted sum of squared errors.
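A short Python sketch of the per-dimension weighting follows, for illustration only and using 3-tuples in place of `vector.Vec3`. It assumes the result is the sum over dimensions of precision times squared error, as the description states.

```python
def precision_weighted_error_vec(expected, actual, precisions):
    """Sum of squared errors, each weighted by that dimension's precision."""
    return sum(
        p * (e - a) ** 2
        for e, a, p in zip(expected, actual, precisions)
    )


# The third dimension has zero precision, so its large error is ignored.
weighted = precision_weighted_error_vec(
    expected=(1.0, 2.0, 3.0),
    actual=(0.0, 0.0, 0.0),
    precisions=(1.0, 0.5, 0.0),
)
# 1.0 * 1 + 0.5 * 4 + 0.0 * 9 = 3.0
```

Using the same precision in all three dimensions recovers the scalar-precision form Π · |expected - actual|².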
pub fn precision_weighted_prediction_error(
expected: vector.Vec3,
actual: vector.Vec3,
precision: Float,
) -> Float
Compute precision-weighted prediction error.
F_accuracy = Π · (expected - actual)²
Precision (Π) = 1/variance. Higher precision = more weight on prediction errors. This is critical for biological systems where uncertainty should attenuate errors.
pub fn prediction_error(
expected: vector.Vec3,
actual: vector.Vec3,
) -> Float
Compute raw prediction error between expected and actual state. Uses squared Euclidean distance (L2 loss).
pub fn select_policy(
policies: List(#(a, vector.Vec3, Float)),
preferred_outcome: vector.Vec3,
) -> Result(#(a, ExpectedFreeEnergy), Nil)
Select the action with minimum Expected Free Energy.
policies is a list of (action_label, predicted_outcome, predictive_uncertainty).
Returns the best policy or Error(Nil) if the list is empty.
pub fn surprise(
expected: Float,
observed: Float,
sigma: Float,
) -> Float
Compute surprise for a single dimension.
Surprise = -log(p(observation | model))
Under a Gaussian approximation, with x the observed value and μ the expected value: surprise = (x - μ)² / (2σ²) + const
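The Gaussian form can be written out in a few lines. This Python sketch is for illustration only. Whether the library includes the normalisation constant ½ · ln(2πσ²) is not stated, so the sketch exposes it through a hypothetical `include_constant` flag and drops it by default.

```python
import math


def surprise(expected, observed, sigma, include_constant=False):
    """Negative log-likelihood of one observation under N(expected, sigma^2)."""
    quadratic = (observed - expected) ** 2 / (2.0 * sigma ** 2)
    if include_constant:
        # Full negative log density, including the normaliser.
        return quadratic + 0.5 * math.log(2.0 * math.pi * sigma ** 2)
    return quadratic


near = surprise(expected=0.0, observed=0.0, sigma=1.0)
far = surprise(expected=0.0, observed=2.0, sigma=1.0)
tolerant = surprise(expected=0.0, observed=2.0, sigma=2.0)
```

The same two-unit miss is four times less surprising when sigma doubles, which is the sense in which uncertainty attenuates prediction errors.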
pub fn update_thresholds(
current: FeelingThresholds,
observed_fe: Float,
alpha: Float,
) -> FeelingThresholds
Update thresholds based on observed free energy history. Uses exponential moving average for online learning.
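An exponential moving average update might look like the Python sketch below, which is for illustration only. The source confirms only that an exponential moving average is used. How `std_dev` is tracked is an assumption here: the sketch keeps an exponential moving average of the squared deviation from the updated mean and takes its square root.

```python
import math


def update_thresholds(mean, std_dev, observed_fe, alpha):
    """EMA update of the baseline mean and (assumed) EMA of the variance."""
    new_mean = (1.0 - alpha) * mean + alpha * observed_fe
    # Assumption: variance follows an EMA of the squared deviation.
    deviation_sq = (observed_fe - new_mean) ** 2
    new_variance = (1.0 - alpha) * std_dev ** 2 + alpha * deviation_sq
    return new_mean, math.sqrt(new_variance)


# alpha = 0 ignores the new observation entirely.
frozen = update_thresholds(mean=1.0, std_dev=0.5, observed_fe=3.0, alpha=0.0)

# An observation exactly at the mean leaves the mean alone and shrinks the spread.
calm_mean, calm_std = update_thresholds(mean=1.0, std_dev=0.5, observed_fe=1.0, alpha=0.5)

# A high observation pulls the mean upward.
shifted_mean, _ = update_thresholds(mean=1.0, std_dev=0.5, observed_fe=3.0, alpha=0.5)
```

A small alpha gives slowly adapting thresholds that smooth over brief spikes; a large alpha tracks recent free energy closely.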
pub fn variational_bound(
observation_likelihood: Float,
kl_divergence: Float,
) -> Float
Variational Free Energy bound.
F = -log p(o) + D_KL(q||p) ≥ -log p(o)
Because the KL term is non-negative, the free energy is an upper bound on the negative log evidence (surprise).
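The relationship between free energy and surprise can be checked numerically. The Python sketch below is for illustration only. It assumes `observation_likelihood` is a probability in (0, 1] rather than a log-probability, and that the function returns the negative log-likelihood plus the KL term.

```python
import math


def variational_bound(observation_likelihood, kl_divergence):
    """Negative log-likelihood plus a non-negative KL penalty."""
    if observation_likelihood <= 0.0:
        raise ValueError("likelihood must be positive")
    return -math.log(observation_likelihood) + kl_divergence


likelihood = 0.25
surprise_value = -math.log(likelihood)

tight = variational_bound(likelihood, kl_divergence=0.0)
loose = variational_bound(likelihood, kl_divergence=0.5)
```

The bound is tight only when the KL term is zero; any mismatch between q and p raises the free energy above the surprise.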