Statwise is an Elixir statistics library that aims for idiomatic Elixir APIs with results checked against well-known Python references.
This first milestone includes:
- Descriptive statistics for lists and one-dimensional Nx tensors.
- Normal and Student's t distribution helpers.
- One-sample, paired, Welch, and pooled t-tests.
- Average-rank utilities.
- Asymptotic and exact Mann-Whitney U tests.
- Dataframe-style column wrappers for running tests from maps or Explorer dataframes.
- Visualization builders with Vega-Lite-compatible output: histograms, ECDFs, QQ plots, box plots, scatter plots, line plots, summary bars and points with intervals, count plots, strip plots, and heatmaps.
- Committed JSONL fixtures generated from pinned Python references.
Examples
Statwise.Descriptive.mean([1, 2, 3])
#=> 2.0
Statwise.TTest.independent([1.2, 1.9, 2.4], [2.2, 3.0, 3.4],
  variance: :welch
)
#=> %Statwise.TestResult{}
Statwise.MannWhitney.test([1, 3, 5], [2, 4],
  alternative: :two_sided,
  method: :asymptotic
)
#=> %Statwise.TestResult{}
Statwise.Visualization.histogram([1, 2, 2, 3], bins: 10)
|> Statwise.Visualization.to_vega_lite()
#=> %{"$schema" => "https://vega.github.io/schema/vega-lite/v5.json", ...}
# In Livebook with :jason, :vega_lite, and :kino_vega_lite installed:
Statwise.Visualization.histogram([1, 2, 2, 3], bins: 10)
|> Statwise.Visualization.with_style(width: 420, color: "#2563eb")
|> Statwise.Visualization.show()
rows = [
  %{site: :north, treatment: :control, time: 1, score: 1.2},
  %{site: :north, treatment: :control, time: 2, score: 1.8},
  %{site: :south, treatment: :treated, time: 1, score: 2.4},
  %{site: :south, treatment: :treated, time: 2, score: 2.9}
]
rows
|> Statwise.Visualization.plot(x: :time, y: :score, color: :treatment)
|> Statwise.Visualization.add(:point)
|> Statwise.Visualization.add(:line)
|> Statwise.Visualization.facet(column: :site)
|> Statwise.Visualization.show()
rows
|> Statwise.Visualization.box_plot(x: :treatment, y: :score)
|> Statwise.Visualization.with_test(:t_test, groups: {:control, :treated})
|> Statwise.Visualization.show()
T-Tests
Statwise.TTest.one_sample([2.5, 3.1, 3.6, 4.0], mean: 3.0)
Statwise.TTest.paired(
  [10.2, 11.5, 12.1, 13.8],
  [9.9, 10.8, 11.2, 12.6],
  alternative: :greater
)
Statwise.TTest.independent(
  [1.2, 1.9, 2.4, 2.9],
  [2.2, 3.0, 3.4, 4.1, 4.8],
  variance: :welch,
  alternative: :less,
  null_difference: 0.0,
  confidence_level: 0.95,
  effect_size: true
)
The test APIs can also pull samples from dataframe-like column data. Statwise
does not depend on Explorer, but if your application has Explorer loaded,
Explorer.DataFrame columns are accepted. Maps of columns work too:
df = %{
  before: [10.2, 11.5, 12.1, 13.8],
  after: [9.9, 10.8, 11.2, 12.6],
  control: [1.2, 1.9, 2.4, 2.9],
  treatment: [2.2, 3.0, 3.4, 4.1]
}
Statwise.TTest.one_sample(df, columns: [:before, :after], mean: 10.0)
#=> %{before: %Statwise.TestResult{}, after: %Statwise.TestResult{}}
Statwise.TTest.paired(df, columns: [:before, :after])
#=> %Statwise.TestResult{}
Statwise.TTest.independent(df, columns: [:control, :treatment], variance: :welch)
#=> %Statwise.TestResult{}
Column extraction defaults to ordinary lists. Pass input: :tensor to extract
map or Explorer columns as one-dimensional f64 tensors. With Explorer loaded,
Statwise uses Explorer.Series.to_tensor/2 when it is available:
Statwise.TTest.one_sample(df,
  columns: [:before, :after],
  mean: 10.0,
  input: :tensor,
  backend: :tensor
)
Use pairs: to run several two-sample tests in one call:
Statwise.TTest.paired(df,
  pairs: [
    before: :after,
    control: :treatment
  ]
)
#=> %{{:before, :after} => %Statwise.TestResult{}, ...}
Supported alternatives are :two_sided, :greater, and :less. Independent
t-tests support variance: :welch and variance: :pooled.
T-test results include confidence intervals by default. Pass effect_size: true
to include Cohen's d and Hedges' g.
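To make the quantities named above concrete, here is a minimal plain-Elixir sketch of a Welch t statistic with Welch-Satterthwaite degrees of freedom, plus Cohen's d and Hedges' g. The module `TTestSketch` and its function names are hypothetical and not part of Statwise. The sketch assumes a pooled standard deviation for Cohen's d and the common approximate correction for Hedges' g; Statwise's own conventions may differ.

```elixir
defmodule TTestSketch do
  # Illustrative only: these helpers are not part of Statwise.

  def mean(xs), do: Enum.sum(xs) / length(xs)

  # Unbiased sample variance (divides by n - 1).
  def variance(xs) do
    m = mean(xs)
    Enum.sum(Enum.map(xs, fn x -> (x - m) * (x - m) end)) / (length(xs) - 1)
  end

  # Welch's t statistic with Welch-Satterthwaite degrees of freedom.
  def welch(xs, ys) do
    {nx, ny} = {length(xs), length(ys)}
    {vx, vy} = {variance(xs) / nx, variance(ys) / ny}

    %{
      statistic: (mean(xs) - mean(ys)) / :math.sqrt(vx + vy),
      df: (vx + vy) * (vx + vy) / (vx * vx / (nx - 1) + vy * vy / (ny - 1))
    }
  end

  # Cohen's d with a pooled standard deviation (assumed convention), and
  # Hedges' g using the approximation J = 1 - 3 / (4 * df - 1), df = nx + ny - 2.
  def effect_sizes(xs, ys) do
    df = length(xs) + length(ys) - 2
    pooled = ((length(xs) - 1) * variance(xs) + (length(ys) - 1) * variance(ys)) / df
    d = (mean(xs) - mean(ys)) / :math.sqrt(pooled)
    %{cohens_d: d, hedges_g: d * (1 - 3 / (4 * df - 1))}
  end
end
```

Calling `TTestSketch.welch([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])` gives 4 degrees of freedom because both samples have equal size and variance, the case where Welch and pooled degrees of freedom coincide.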
Nonparametric Tests
Statwise.Nonparametric.Rank.ranks([10, 20, 20, 30])
#=> [1.0, 2.5, 2.5, 4.0]
Statwise.MannWhitney.test(
  [1.0, 3.0, 5.0],
  [2.0, 4.0],
  alternative: :two_sided,
  method: :auto,
  continuity: true
)
Dataframe columns are supported with the same columns: and pairs: options:
Statwise.MannWhitney.test(df, columns: [:control, :treatment], method: :auto)
Statwise.MannWhitney.test(df,
  pairs: [
    control: :treatment,
    before: :after
  ],
  method: :auto
)
Ranking currently supports SciPy-compatible average ranks for ties. Mann-Whitney
U supports method: :asymptotic, method: :exact, and method: :auto.
Like SciPy, explicit method: :exact does not apply a tie correction. :auto
uses exact p-values when there are no ties and the smaller sample has at most 8
observations; otherwise it uses the asymptotic normal approximation.
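The :auto rule described above can be sketched in a few lines of plain Elixir. `MethodSketch.resolve/2` is a hypothetical helper written only to illustrate the stated rule (no ties and a smaller sample of at most 8 observations); it is not Statwise's internal function.

```elixir
defmodule MethodSketch do
  # Illustrative only: mirrors the :auto rule as described in the text.
  def resolve(xs, ys) do
    sorted = Enum.sort(xs ++ ys)

    # After sorting, any tie shows up as two equal neighbors.
    ties? =
      sorted
      |> Enum.chunk_every(2, 1, :discard)
      |> Enum.any?(fn [a, b] -> a == b end)

    smaller = min(length(xs), length(ys))

    if not ties? and smaller <= 8, do: :exact, else: :asymptotic
  end
end

MethodSketch.resolve([1, 3, 5], [2, 4])
#=> :exact
MethodSketch.resolve([1, 2, 2], [3, 4])
#=> :asymptotic
```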
Mann-Whitney results include common-language and rank-biserial effect sizes.
effect_size.cliffs_delta is also provided as an alias of rank-biserial.
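For intuition, the following plain-Elixir sketch computes average ranks, the U statistic of the first sample, and the two effect sizes mentioned above. `RankSketch` is a hypothetical module, not Statwise code, and the sign convention assumed here (rank-biserial = 2U/(n1·n2) − 1, with U taken from the first sample) may not match the one Statwise uses.

```elixir
defmodule RankSketch do
  # Average ranks: tied values share the mean of the 1-based positions they occupy.
  def average_ranks(values) do
    values
    |> Enum.with_index()
    |> Enum.sort_by(fn {v, _i} -> v end)
    |> Enum.with_index(1)
    # `v + 0.0` makes 1 and 1.0 (and 0.0 and -0.0) fall into the same tie group.
    |> Enum.chunk_by(fn {{v, _i}, _pos} -> v + 0.0 end)
    |> Enum.flat_map(fn group ->
      positions = Enum.map(group, fn {_entry, pos} -> pos end)
      rank = Enum.sum(positions) / length(positions)
      Enum.map(group, fn {{_v, i}, _pos} -> {i, rank} end)
    end)
    |> Enum.sort_by(fn {i, _rank} -> i end)
    |> Enum.map(fn {_i, rank} -> rank end)
  end

  # U for the first sample from its rank sum, plus derived effect sizes.
  def mann_whitney_u(xs, ys) do
    {n1, n2} = {length(xs), length(ys)}
    r1 = (xs ++ ys) |> average_ranks() |> Enum.take(n1) |> Enum.sum()
    u1 = r1 - n1 * (n1 + 1) / 2

    %{
      u: u1,
      common_language: u1 / (n1 * n2),
      rank_biserial: 2 * u1 / (n1 * n2) - 1
    }
  end
end

RankSketch.average_ranks([10, 20, 20, 30])
#=> [1.0, 2.5, 2.5, 4.0]
RankSketch.mann_whitney_u([1, 3, 5], [2, 4])
#=> %{u: 3.0, common_language: 0.5, rank_biserial: 0.0}
```

In the second call, three of the six cross-sample pairs have the first value larger, which is why U is 3 and the rank-biserial effect is zero.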
Stage-one behavior is intentionally strict: raw samples must be finite numeric
lists or one-dimensional Nx tensors. Test APIs can also extract raw samples
from dataframe-style columns with columns: or pairs:. Tensor-native Nx
reductions are opt-in with backend: :tensor; the default path still favors
the fastest scalar implementation for the current Nx binary backend. NaN
behavior is controlled with
nan_policy: :raise | :propagate | :omit; see
docs/compatibility.md.
Degenerate t-tests with zero standard error return explicit :nan,
:infinity, or :neg_infinity statistics according to the compatibility
contract.
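Elixir floats cannot represent NaN or infinity, which is why atoms appear in these results. The sketch below shows one plausible mapping, assuming the IEEE 754 convention that a nonzero value divided by zero is a signed infinity and zero divided by zero is NaN. `DegenerateSketch.t_statistic/2` is hypothetical; the authoritative rules are the ones in the compatibility contract.

```elixir
defmodule DegenerateSketch do
  # Illustrative only: `diff` is the observed minus hypothesized difference,
  # `se` is the standard error. Assumes IEEE-style handling of division by zero.
  def t_statistic(diff, se) when se == 0 do
    cond do
      diff > 0 -> :infinity
      diff < 0 -> :neg_infinity
      true -> :nan
    end
  end

  def t_statistic(diff, se), do: diff / se
end

DegenerateSketch.t_statistic(1.5, 0.0)
#=> :infinity
DegenerateSketch.t_statistic(0.0, 0.0)
#=> :nan
```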
Python Compatibility
The Elixir tests use committed fixtures from:
- NumPy 2.3.0 for descriptive statistics.
- SciPy 1.16.0 for distributions and Mann-Whitney U.
- Statsmodels 0.14.6 for independent t-tests.
Python is not required for the normal test suite. To intentionally refresh fixtures:
cd reference/python
uv sync
uv run python generate_fixtures.py
cd ../..
mix test
Review fixture diffs before committing refreshed values.
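As an illustration of how a JSONL fixture can drive a tolerance-based check, here is a minimal sketch. The fixture fields `"input"` and `"expected"`, the module `FixtureSketch`, and the relative tolerance are all assumptions made for the example; the actual fixture schema is not described in this README. `JSON.decode!/1` requires Elixir 1.18 or later; on older versions a JSON library such as Jason would be needed instead.

```elixir
defmodule FixtureSketch do
  # Hypothetical fixture shape: one JSON object per line with "input" and "expected".
  def check(jsonl, fun, rel_tol \\ 1.0e-12) do
    jsonl
    |> String.split("\n", trim: true)
    |> Enum.map(&JSON.decode!/1)
    |> Enum.all?(fn %{"input" => input, "expected" => expected} ->
      close?(fun.(input), expected, rel_tol)
    end)
  end

  # Relative comparison with an absolute floor of 1.0 for values near zero.
  defp close?(actual, expected, rel_tol) do
    abs(actual - expected) <= rel_tol * max(abs(expected), 1.0)
  end
end

fixtures = """
{"input": [1, 2, 3], "expected": 2.0}
{"input": [2.5, 3.5], "expected": 3.0}
"""

mean = fn xs -> Enum.sum(xs) / length(xs) end
FixtureSketch.check(fixtures, mean)
#=> true
```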
For randomized pre-release checks against Python references:
cd reference/python
uv sync
uv run python differential_check.py --cases 250 --seed 202607
See docs/release_checklist.md for the release
readiness checklist.
For runnable tutorials, see
docs/statistical_tests_gallery.livemd
and docs/visualization_gallery.livemd.
CI
Run:
mix format --check-formatted
mix compile --warnings-as-errors
mix test