viva_math/transport

Optimal transport distances for empirical affective distributions.

1D empirical Wasserstein distances and PAD componentwise aggregation following Villani (2008), Optimal Transport, and Peyré & Cuturi (2019), Computational Optimal Transport.

Complexity: O((n+m)·log(n+m)) dominated by sort. The post-sort integral walks both quantile sequences in linear time without random indexing — see walk_quantile_* helpers below.

Values

pub fn wasserstein_1_empirical(
  p: List(Float),
  q: List(Float),
) -> Result(Float, Nil)

Wasserstein-1 (Earth Mover) between 1D empirical samples.

W_1(P, Q) = ∫_0^1 |F_P⁻¹(u) − F_Q⁻¹(u)| du.

Equivalent to ∫_ℝ |F_P(x) − F_Q(x)| dx (the W_1 duality holds via integration by parts; the absolute-value identity is symmetric). For equal sample sizes reduces to (1/n)·Σ |p_(i) − q_(i)|. For unequal sizes integrates |p_i − q_j|·Δu across the union of quantile breakpoints {i/n} ∪ {j/m} in linear time after sorting.

pub fn wasserstein_2_empirical(
  p: List(Float),
  q: List(Float),
) -> Result(Float, Nil)

Wasserstein-2 between 1D empirical samples.

W_2²(P, Q) = ∫_0^1 (F_P⁻¹(u) − F_Q⁻¹(u))² du.

For equal sample sizes the inverse-CDF integral reduces to the sorted pairwise mean-square difference. For unequal sizes we integrate over the union of quantile breakpoints {i/n} ∪ {j/m} in linear time: in each slab [u_prev, u_next] both inverse CDFs are constant, contributing (p_sorted[i] − q_sorted[j])² · (u_next − u_prev).

Returns W_2, not W_2². References: Villani (2008); Peyré & Cuturi (2019).

Note: a CDF-based path ∫(F_P−F_Q)² dx is incorrect for unequal sample sizes — the W_1/W_2 duality via integration by parts only holds for p=1 (absolute value), not for the quadratic kernel.

pub fn wasserstein_2_gaussian(
  g1: distributions.Gaussian,
  g2: distributions.Gaussian,
) -> Float

Closed form W_2 for scalar Gaussians.

W_2(N(μ₁, σ₁²), N(μ₂, σ₂²)) = √((μ₁ − μ₂)² + (σ₁ − σ₂)²).

stddev is normalised via abs before subtracting — a negative stddev has no Gaussian meaning (N(μ, σ²) depends only on σ²), and silently using the signed value would yield W_2(N(0, 1), N(0, 1)) of 2.0 for Gaussian(0, -1) vs Gaussian(0, 1), which is wrong.

pub fn wasserstein_2_multivariate(
  xs: List(vector.Vec3),
  ys: List(vector.Vec3),
  epsilon: Float,
  max_iter: Int,
) -> Result(Float, Nil)

Multivariate Wasserstein-2 distance via Sinkhorn-Knopp entropic regularization (Cuturi 2013).

Solves the entropic OT problem min_π ⟨π, C⟩ + ε·H(π) subject to π·1 = a, πᵀ·1 = b, where C[i,j] = ‖x_i − y_j‖² and π is the transport plan.

Returns √(⟨π, C⟩), the W₂ distance induced by the regularized plan, dropping the entropic term itself.

  • xs, ys: empirical PAD samples with uniform weights.
  • epsilon: regularization strength, typically 0.01 to 1.0.
  • max_iter: upper bound on Sinkhorn iterations, typically 100 to 1000; convergence may stop earlier once marginal violation is below the internal tolerance.
  • Returns Error(Nil) if either list is empty.
pub fn wasserstein_pad(
  p: List(vector.Vec3),
  q: List(vector.Vec3),
) -> Result(Float, Nil)

Component-wise (marginal) Wasserstein-2 over PAD dimensions.

D(P, Q) = √( W_2²(P_P, Q_P) + W_2²(P_A, Q_A) + W_2²(P_D, Q_D) ).

This is not the multivariate W_2 (which requires solving a full optimal-transport assignment with cost ‖x − y‖²). It is the Euclidean norm of the per-axis marginal Wasserstein distances — equivalent to the Sliced Wasserstein along the canonical PAD basis.

Triangle inequality holds (Minkowski over the marginal vector), so this is a pseudo-metric on PAD distributions: D(P, Q) = 0 does not imply P = Q as joints — two distributions with identical marginals but different correlations are tied. Useful as a fast lower bound on the true multivariate W_2; tight when marginals are product distributions.

Search Document