viva_math/transport
Optimal transport distances for empirical affective distributions.
1D empirical Wasserstein distances and PAD componentwise aggregation following Villani (2008), Optimal Transport, and Peyré & Cuturi (2019), Computational Optimal Transport.
Complexity: O((n+m)·log(n+m)) dominated by sort. The post-sort
integral walks both quantile sequences in linear time without random
indexing — see walk_quantile_* helpers below.
Values
pub fn wasserstein_1_empirical(
p: List(Float),
q: List(Float),
) -> Result(Float, Nil)
Wasserstein-1 (Earth Mover) between 1D empirical samples.
W_1(P, Q) = ∫_0^1 |F_P⁻¹(u) − F_Q⁻¹(u)| du.
Equivalent to ∫_ℝ |F_P(x) − F_Q(x)| dx (the W_1 duality holds via
integration by parts; the absolute-value identity is symmetric). For
equal sample sizes reduces to (1/n)·Σ |p_(i) − q_(i)|. For unequal sizes
integrates |p_i − q_j|·Δu across the union of quantile breakpoints
{i/n} ∪ {j/m} in linear time after sorting.
pub fn wasserstein_2_empirical(
p: List(Float),
q: List(Float),
) -> Result(Float, Nil)
Wasserstein-2 between 1D empirical samples.
W_2²(P, Q) = ∫_0^1 (F_P⁻¹(u) − F_Q⁻¹(u))² du.
For equal sample sizes the inverse-CDF integral reduces to the sorted
pairwise mean-square difference. For unequal sizes we integrate over the
union of quantile breakpoints {i/n} ∪ {j/m} in linear time: in each
slab [u_prev, u_next] both inverse CDFs are constant, contributing
(p_sorted[i] − q_sorted[j])² · (u_next − u_prev).
Returns W_2, not W_2². References: Villani (2008); Peyré & Cuturi (2019).
Note: a CDF-based path ∫(F_P−F_Q)² dx is incorrect for unequal
sample sizes — the W_1/W_2 duality via integration by parts only
holds for p=1 (absolute value), not for the quadratic kernel.
pub fn wasserstein_2_gaussian(
g1: distributions.Gaussian,
g2: distributions.Gaussian,
) -> Float
Closed form W_2 for scalar Gaussians.
W_2(N(μ₁, σ₁²), N(μ₂, σ₂²)) = √((μ₁ − μ₂)² + (σ₁ − σ₂)²).
stddev is normalised via abs before subtracting — a negative stddev
has no Gaussian meaning (N(μ, σ²) depends only on σ²), and silently
using the signed value would yield W_2(N(0, 1), N(0, 1)) of 2.0 for
Gaussian(0, -1) vs Gaussian(0, 1), which is wrong.
pub fn wasserstein_2_multivariate(
xs: List(vector.Vec3),
ys: List(vector.Vec3),
epsilon: Float,
max_iter: Int,
) -> Result(Float, Nil)
Multivariate Wasserstein-2 distance via Sinkhorn-Knopp entropic regularization (Cuturi 2013).
Solves the entropic OT problem
min_π ⟨π, C⟩ + ε·H(π) subject to π·1 = a, πᵀ·1 = b,
where C[i,j] = ‖x_i − y_j‖² and π is the transport plan.
Returns √(⟨π, C⟩), the W₂ distance induced by the regularized plan,
dropping the entropic term itself.
xs,ys: empirical PAD samples with uniform weights.epsilon: regularization strength, typically0.01to1.0.max_iter: upper bound on Sinkhorn iterations, typically100to1000; convergence may stop earlier once marginal violation is below the internal tolerance.- Returns
Error(Nil)if either list is empty.
pub fn wasserstein_pad(
p: List(vector.Vec3),
q: List(vector.Vec3),
) -> Result(Float, Nil)
Component-wise (marginal) Wasserstein-2 over PAD dimensions.
D(P, Q) = √( W_2²(P_P, Q_P) + W_2²(P_A, Q_A) + W_2²(P_D, Q_D) ).
This is not the multivariate W_2 (which requires solving a full
optimal-transport assignment with cost ‖x − y‖²). It is the Euclidean
norm of the per-axis marginal Wasserstein distances — equivalent to the
Sliced Wasserstein along the canonical PAD basis.
Triangle inequality holds (Minkowski over the marginal vector), so this
is a pseudo-metric on PAD distributions: D(P, Q) = 0 does not
imply P = Q as joints — two distributions with identical marginals but
different correlations are tied. Useful as a fast lower bound on the true
multivariate W_2; tight when marginals are product distributions.