Statwise is a one-dimensional statistics library with results checked against NumPy, SciPy, and Statsmodels fixtures. The public API is Elixir-native; Python libraries are behavioral references, not API templates.
Shared Input Rules
- Raw samples are finite numeric lists or one-dimensional Nx.Tensors.
- Integers are cast to f64.
- Multidimensional tensors raise ArgumentError.
- Infinite values (:infinity and :neg_infinity from Nx special values) raise ArgumentError.
- NaN behavior is controlled with nan_policy.
The inferential test APIs also accept dataframe-style column inputs through
columns: and pairs: options:
- A map of columns may be passed directly, with atom or string column keys.
- An Explorer.DataFrame may be passed when Explorer is loaded by the caller's application. Explorer is optional and is not a Statwise dependency.
- Extracted columns must contain raw sample values supported by Statwise.
- nil column values are treated as :nan and then handled by the selected nan_policy.
- Column extraction defaults to input: :list. Pass input: :tensor to convert map columns to one-dimensional f64 tensors or, for Explorer columns, to call Explorer.Series.to_tensor/2 when available.
For two-sample tests, columns: [:x, :y] returns one result. Passing
pairs: [x: :y, before: :after] returns a map keyed by
{left_column, right_column}. For one-sample t-tests, columns: :x returns
one result and columns: [:x, :y] returns a map keyed by column.
Tensor-native reductions are opt-in with backend: :tensor. Without this
option, tensor inputs are normalized through the same scalar path as lists,
which is currently faster for many small and mid-sized operations on
Nx.BinaryBackend.
NaN Policy
Supported values:
- :raise rejects NaN inputs. This is the default.
- :propagate returns NaN statistics/p-values for inferential tests or NaN values for descriptive/ranking operations where applicable.
- :omit removes NaNs before computing.
Paired t-tests apply :omit pairwise: a pair is removed when either side is
NaN. Independent tests and Mann-Whitney U apply :omit per sample.
If omission leaves too few observations, the function raises the same insufficient-sample error it would raise for a too-small original sample.
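The difference between pairwise and per-sample omission can be sketched in plain Elixir. This is an illustration only: the module NanOmitSketch and its functions are hypothetical and are not part of Statwise, and NaN is assumed to be represented by the :nan atom in list inputs.

```elixir
defmodule NanOmitSketch do
  # Hypothetical helpers for illustration; not Statwise's implementation.

  # Pairwise omission: drop a pair when either side is :nan.
  def omit_pairwise(xs, ys) when length(xs) == length(ys) do
    xs
    |> Enum.zip(ys)
    |> Enum.reject(fn {x, y} -> x == :nan or y == :nan end)
    |> Enum.unzip()
  end

  # Per-sample omission: each sample is filtered independently.
  def omit_per_sample(xs), do: Enum.reject(xs, &(&1 == :nan))
end

pre = [1.0, :nan, 3.0, 4.0]
post = [2.0, 5.0, :nan, 6.0]

# Pairs 2 and 3 are dropped because one side of each is :nan.
IO.inspect(NanOmitSketch.omit_pairwise(pre, post))

# Each sample keeps three values; the samples no longer align by position.
IO.inspect({NanOmitSketch.omit_per_sample(pre), NanOmitSketch.omit_per_sample(post)})
```

Pairwise omission keeps the two samples aligned, which a paired test requires; per-sample omission keeps more data but is only appropriate when observations are not matched.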
Descriptive Statistics
Reference: NumPy 2.3.0.
Functions:
- Statwise.Descriptive.count/2
- Statwise.Descriptive.sum/2
- Statwise.Descriptive.mean/2
- Statwise.Descriptive.variance/2
- Statwise.Descriptive.stddev/2
- Statwise.Descriptive.standard_error/2
Variance defaults to sample variance with correction: 1. Population variance
is available with correction: 0.
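As a rough sketch of what the correction option means, the divisor is n - correction. The module VarianceSketch below is hypothetical and self-contained; it does not call Statwise and is not its implementation.

```elixir
defmodule VarianceSketch do
  # Hypothetical helper for illustration; not Statwise's implementation.
  # Sum of squared deviations divided by (n - correction).
  def variance(xs, correction \\ 1) do
    n = length(xs)
    mean = Enum.sum(xs) / n

    sum_sq =
      xs
      |> Enum.map(fn x -> (x - mean) * (x - mean) end)
      |> Enum.sum()

    sum_sq / (n - correction)
  end
end

xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Sample variance (correction 1): 32 / 7
IO.inspect(VarianceSketch.variance(xs))

# Population variance (correction 0): 32 / 8 = 4.0
IO.inspect(VarianceSketch.variance(xs, 0))
```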
T-Tests
References:
- Statsmodels 0.14.6 for independent t-tests.
- SciPy 1.16.0 for one-sample and paired t-tests.
Functions:
- Statwise.TTest.one_sample/2
- Statwise.TTest.paired/2 for dataframe-style column inputs
- Statwise.TTest.paired/3
- Statwise.TTest.independent/2 for dataframe-style column inputs
- Statwise.TTest.independent/3
Supported alternatives are :two_sided, :greater, and :less.
Independent tests support:
- variance: :welch
- variance: :pooled
- null_difference: float
- confidence_level: float, defaulting to 0.95
- effect_size: boolean, defaulting to false
Confidence intervals are returned in result.confidence_interval.
- One-sample t-tests report intervals for the sample mean, matching SciPy's TtestResult.confidence_interval.
- Paired t-tests report intervals for the mean paired difference.
- Independent t-tests report intervals for mean_x - mean_y.
- One-sided alternatives use one infinite bound, represented as :infinity or :neg_infinity.
When effect_size: true, t-test results include:
- cohens_d
- hedges_g
One-sample and paired tests use the sample standard deviation as the
standardizer. Independent tests use the pooled standard deviation as the
standardizer for both Welch and pooled tests. Hedges' g uses the small-sample
correction 1 - 3 / (4 * df - 1).
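The one-sample case can be sketched as follows. The module EffectSizeSketch is hypothetical, and the choice df = n - 1 for the one-sample correction is an assumption made for this example; the text above only gives the correction formula itself.

```elixir
defmodule EffectSizeSketch do
  # Hypothetical helpers for illustration; not Statwise's implementation.

  # Cohen's d for one sample: (mean - reference) / sample standard deviation.
  def cohens_d_one_sample(xs, reference) do
    n = length(xs)
    mean = Enum.sum(xs) / n

    sum_sq =
      xs
      |> Enum.map(fn x -> (x - mean) * (x - mean) end)
      |> Enum.sum()

    sd = :math.sqrt(sum_sq / (n - 1))
    (mean - reference) / sd
  end

  # Hedges' g: Cohen's d scaled by the correction 1 - 3 / (4 * df - 1).
  def hedges_g(d, df), do: d * (1 - 3 / (4 * df - 1))
end

xs = [2.0, 4.0, 6.0]
d = EffectSizeSketch.cohens_d_one_sample(xs, 3.0)

# Assumption: df = n - 1 for a one-sample test.
g = EffectSizeSketch.hedges_g(d, length(xs) - 1)

IO.inspect({d, g})
```

With only three observations the correction shrinks d noticeably (by a factor of 4/7 here); as df grows the factor approaches 1 and the two measures converge.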
Zero standard-error cases are explicit:
- If the observed difference is zero, statistic, p_value, and Welch df values that are undefined are returned as :nan.
- If the observed difference is positive with zero standard error, the statistic is :infinity.
- If the observed difference is negative with zero standard error, the statistic is :neg_infinity.
- Pooled independent t-tests keep their finite degrees of freedom in this case. Welch independent t-tests return df: :nan when both samples have zero variance, matching Statsmodels' degenerate-output shape.
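The statistic rules above amount to a small case analysis. The sketch below uses a hypothetical module, ZeroSeSketch, and covers only the statistic; it does not model p-values or degrees of freedom and is not Statwise's implementation.

```elixir
defmodule ZeroSeSketch do
  # Hypothetical helper for illustration; not Statwise's implementation.

  # Zero standard error: the sign of the observed difference decides the result.
  def t_statistic(diff, se) when se == 0 do
    cond do
      diff > 0 -> :infinity
      diff < 0 -> :neg_infinity
      true -> :nan
    end
  end

  # Regular case: observed difference divided by its standard error.
  def t_statistic(diff, se), do: diff / se
end

IO.inspect(ZeroSeSketch.t_statistic(0.0, 0.0))
IO.inspect(ZeroSeSketch.t_statistic(1.5, 0.0))
IO.inspect(ZeroSeSketch.t_statistic(-1.5, 0.0))
IO.inspect(ZeroSeSketch.t_statistic(1.5, 0.5))
```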
Ranking
Reference: SciPy 1.16.0 rankdata(method="average").
Function:
Only average tie ranking is currently supported. Other tie methods are intentionally deferred.
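For readers unfamiliar with average tie ranking, here is a minimal sketch: tied values share the mean of the 1-based positions they occupy in sorted order. The module RankSketch is hypothetical and does not use the Statwise ranking function.

```elixir
defmodule RankSketch do
  # Hypothetical helper for illustration; not Statwise's implementation.
  def average_ranks(xs) do
    # Cast to floats first so that 1 and 1.0 are treated as the same value.
    floats = Enum.map(xs, &(&1 * 1.0))

    # Map each distinct value to the mean of its sorted positions.
    rank_of =
      floats
      |> Enum.sort()
      |> Enum.with_index(1)
      |> Enum.group_by(fn {x, _pos} -> x end, fn {_x, pos} -> pos end)
      |> Map.new(fn {x, positions} -> {x, Enum.sum(positions) / length(positions)} end)

    # Return ranks in the original input order.
    Enum.map(floats, &Map.fetch!(rank_of, &1))
  end
end

# The two 10s occupy sorted positions 1 and 2, so each gets rank 1.5.
IO.inspect(RankSketch.average_ranks([10, 20, 10, 30]))
```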
Mann-Whitney U
Reference: SciPy 1.16.0 mannwhitneyu.
Function:
- Statwise.MannWhitney.test/2 for dataframe-style column inputs
- Statwise.MannWhitney.test/3
Supported alternatives are :two_sided, :greater, and :less.
Supported methods:
- :asymptotic
- :exact
- :auto
Like SciPy, explicit method: :exact does not apply a tie correction. :auto
uses exact p-values when there are no ties and the smaller sample has at most 8
observations; otherwise it uses the asymptotic normal approximation.
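The :auto rule can be sketched as a small decision function. The module AutoMethodSketch is hypothetical, and it assumes that ties are detected across the combined samples; the text above does not say whether ties are checked within or across samples.

```elixir
defmodule AutoMethodSketch do
  # Hypothetical helper for illustration; not Statwise's implementation.
  @max_exact_n 8

  def choose_method(xs, ys) do
    combined = xs ++ ys

    # Assumption: a tie is any repeated value in the combined data.
    has_ties = length(Enum.uniq(combined)) != length(combined)
    smaller_n = min(length(xs), length(ys))

    if not has_ties and smaller_n <= @max_exact_n do
      :exact
    else
      :asymptotic
    end
  end
end

# No ties and the smaller sample has 3 observations.
IO.inspect(AutoMethodSketch.choose_method([1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0]))

# The value 3.0 appears in both samples, so this falls back to the approximation.
IO.inspect(AutoMethodSketch.choose_method([1.0, 2.0, 3.0], [3.0, 5.0, 6.0]))
```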
The returned statistic is U1, the U statistic for the first sample. U1 and
U2 are also available in result metadata.
Mann-Whitney U results include:
- effect_size.common_language, computed as U1 / (n_x * n_y).
- effect_size.rank_biserial, computed as 2 * common_language - 1.
- effect_size.cliffs_delta, an alias of rank_biserial.
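The effect-size formulas can be checked by hand with a small sketch. The module MannWhitneySketch is hypothetical; it computes U1 by direct pair counting (one point when x > y, half a point for a tie), which is equivalent to the rank-sum definition but is not necessarily how Statwise computes it.

```elixir
defmodule MannWhitneySketch do
  # Hypothetical helpers for illustration; not Statwise's implementation.

  # U1 by pair counting: 1 for each pair with x > y, 0.5 for each tied pair.
  def u1(xs, ys) do
    for x <- xs, y <- ys, reduce: 0.0 do
      acc ->
        cond do
          x > y -> acc + 1.0
          x == y -> acc + 0.5
          true -> acc
        end
    end
  end

  def effect_sizes(xs, ys) do
    common_language = u1(xs, ys) / (length(xs) * length(ys))
    rank_biserial = 2 * common_language - 1

    %{
      common_language: common_language,
      rank_biserial: rank_biserial,
      # Alias of rank_biserial, as described in the text.
      cliffs_delta: rank_biserial
    }
  end
end

xs = [3.0, 4.0, 5.0]
ys = [1.0, 2.0, 4.0]

# U1 = 2 + 2.5 + 3 = 7.5 out of 9 pairs.
IO.inspect(MannWhitneySketch.u1(xs, ys))
IO.inspect(MannWhitneySketch.effect_sizes(xs, ys))
```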
Deferred Compatibility Areas
- Weighted tests.
- Multidimensional axis behavior.
- Missing-data policies beyond nan_policy for the current functions.
- Masked arrays.
- Permutation tests.
- Additional rank tie methods.