Section
This Livebook is a runnable tour of Statwise statistical tests. It covers the available test families, common variants, result fields, dataframe-style inputs, visual annotations, and practical guidance for choosing a test.
Setup
Mix.install([
  {:statwise, path: Path.expand("..", __DIR__)},
  {:jason, "~> 1.4"},
  {:vega_lite, "~> 0.1"},
  {:kino_vega_lite, "~> 0.1"}
])

alias Statwise.{MannWhitney, TTest, Visualization}

Quick Selection Guide
Use a one-sample t-test when you have one numeric sample and want to compare its mean to a fixed reference value.
Use a paired t-test when each observation has a natural before/after or matched-pair relationship.
Use an independent t-test when you have two independent groups and want to
compare means. Prefer variance: :welch by default because it does not assume
equal variances. Use variance: :pooled only when equal variances are part of
the study design or a deliberate modeling assumption.
Use Mann-Whitney U when you have two independent groups and want a rank-based nonparametric comparison. It is useful when the mean is not the right summary target, sample sizes are small, data are ordinal, or the distribution is strongly skewed. It is not a drop-in paired-test replacement.
Use rank utilities when you want to inspect or reuse average ranks directly.
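The Welch versus pooled choice in the independent t-test guidance above comes down to how the standard error and degrees of freedom are formed. The following is a minimal hand-computed sketch in plain Elixir using the textbook formulas; `WelchSketch` is a hypothetical module for illustration, not part of Statwise, and the sign convention (first group minus second) is an assumption.

```elixir
defmodule WelchSketch do
  # Hypothetical helper module for illustration; not part of Statwise.

  def mean(xs), do: Enum.sum(xs) / length(xs)

  # Unbiased sample variance (divides by n - 1).
  def variance(xs) do
    m = mean(xs)
    Enum.sum(Enum.map(xs, fn x -> (x - m) * (x - m) end)) / (length(xs) - 1)
  end

  # Welch: each group contributes its own variance to the standard error.
  def welch(xs, ys) do
    {n1, n2} = {length(xs), length(ys)}
    {a, b} = {variance(xs) / n1, variance(ys) / n2}
    se = :math.sqrt(a + b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (a + b) * (a + b) / (a * a / (n1 - 1) + b * b / (n2 - 1))
    %{t: (mean(xs) - mean(ys)) / se, df: df}
  end

  # Pooled: one shared variance estimate, with n1 + n2 - 2 degrees of freedom.
  def pooled(xs, ys) do
    {n1, n2} = {length(xs), length(ys)}
    sp2 = ((n1 - 1) * variance(xs) + (n2 - 1) * variance(ys)) / (n1 + n2 - 2)
    se = :math.sqrt(sp2 * (1 / n1 + 1 / n2))
    %{t: (mean(xs) - mean(ys)) / se, df: n1 + n2 - 2}
  end
end

control = [1.2, 1.9, 2.4, 2.9, 2.7, 2.2]
treatment = [2.2, 3.0, 3.4, 4.1, 4.8, 3.6]

IO.inspect(WelchSketch.welch(control, treatment), label: "welch")
IO.inspect(WelchSketch.pooled(control, treatment), label: "pooled")
```

Because both groups here have six observations, the two t statistics coincide and only the degrees of freedom differ. With unequal group sizes and unequal variances the statistics themselves diverge, which is why Welch is the safer default.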
Tutorial Data
baseline = [9.8, 10.1, 10.4, 9.9, 10.2, 10.3]
before = [10.2, 11.5, 12.1, 13.8, 12.9, 11.7]
after_values = [9.9, 10.8, 11.2, 12.6, 12.1, 10.9]
control = [1.2, 1.9, 2.4, 2.9, 2.7, 2.2]
treatment = [2.2, 3.0, 3.4, 4.1, 4.8, 3.6]
ordinal_control = [1, 2, 2, 3, 3, 4]
ordinal_treatment = [3, 4, 4, 5, 5, 6]
df = %{
  baseline: baseline,
  before: before,
  after: after_values,
  control: control,
  treatment: treatment,
  ordinal_control: ordinal_control,
  ordinal_treatment: ordinal_treatment
}
rows =
  Enum.map(control, &%{group: :control, score: &1, site: :north}) ++
    Enum.map(treatment, &%{group: :treated, score: &1, site: :north}) ++
    Enum.map([1.0, 1.4, 1.7, 2.0], &%{group: :control, score: &1, site: :south}) ++
    Enum.map([1.8, 2.2, 2.5, 2.9], &%{group: :treated, score: &1, site: :south})

Reading Test Results
All inferential tests return %Statwise.TestResult{}. The most commonly used
fields are:
- test: which test was run.
- statistic: the t statistic, U statistic, or other test statistic.
- p_value: the p-value for the configured alternative.
- alternative: :two_sided, :greater, or :less.
- estimate: estimated means, differences, and standard errors where available.
- confidence_interval: confidence interval metadata where available.
- effect_size: optional or built-in effect sizes.
- n: sample sizes used by the test.
result = TTest.independent(control, treatment, variance: :welch, effect_size: true)
Map.take(result, [
  :test,
  :statistic,
  :p_value,
  :alternative,
  :method,
  :estimate,
  :confidence_interval,
  :effect_size,
  :n
])

One-Sample T-Test
Use this when one sample should be compared with a reference mean.
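For intuition, the statistic behind a one-sample t-test can be computed by hand: the distance between the sample mean and the reference value, divided by the standard error of the mean. The sketch below uses plain Elixir and the textbook formula; `OneSampleSketch` is a hypothetical module, and Statwise's own implementation may differ in detail.

```elixir
defmodule OneSampleSketch do
  # Hypothetical helper for illustration; not part of Statwise.
  def t_statistic(xs, reference) do
    n = length(xs)
    mean = Enum.sum(xs) / n

    # Unbiased sample variance (divides by n - 1).
    variance =
      xs
      |> Enum.map(fn x -> (x - mean) * (x - mean) end)
      |> Enum.sum()
      |> Kernel./(n - 1)

    standard_error = :math.sqrt(variance / n)

    %{
      mean: mean,
      standard_error: standard_error,
      t: (mean - reference) / standard_error,
      df: n - 1
    }
  end
end

baseline = [9.8, 10.1, 10.4, 9.9, 10.2, 10.3]
IO.inspect(OneSampleSketch.t_statistic(baseline, 10.0))
```

The p-value then comes from a t distribution with `n - 1` degrees of freedom, which is the part the library call takes care of.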
TTest.one_sample(baseline,
  mean: 10.0,
  alternative: :two_sided,
  confidence_level: 0.95,
  effect_size: true
)

Use alternative: :greater when the scientific question is specifically
whether the sample mean is greater than the reference value:
TTest.one_sample(baseline,
  mean: 10.0,
  alternative: :greater
)

Paired T-Test
Use this for before/after, matched subjects, repeated measures, or other paired observations. The test is performed on within-pair differences.
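Since the test operates on within-pair differences, it reduces to a one-sample t-test of those differences against zero. The following sketch makes that reduction explicit in plain Elixir; `PairedSketch` is a hypothetical module, and the difference direction (first minus second) is an assumption chosen to match the visualization in this section.

```elixir
defmodule PairedSketch do
  # Hypothetical helper for illustration; not part of Statwise.
  def t_statistic(first, second) do
    # One difference per pair; the pairing is what the test relies on.
    differences =
      first
      |> Enum.zip(second)
      |> Enum.map(fn {a, b} -> a - b end)

    n = length(differences)
    mean = Enum.sum(differences) / n

    sum_sq =
      differences
      |> Enum.map(fn d -> (d - mean) * (d - mean) end)
      |> Enum.sum()

    # Standard error of the mean difference, using the n - 1 variance.
    standard_error = :math.sqrt(sum_sq / (n - 1) / n)

    %{mean_difference: mean, t: mean / standard_error, df: n - 1}
  end
end

before = [10.2, 11.5, 12.1, 13.8, 12.9, 11.7]
after_values = [9.9, 10.8, 11.2, 12.6, 12.1, 10.9]

IO.inspect(PairedSketch.t_statistic(before, after_values))
```

Note that the degrees of freedom count pairs, not individual observations: six pairs give five degrees of freedom.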
TTest.paired(before, after_values,
  alternative: :greater,
  confidence_level: 0.95,
  effect_size: true
)

Visualize paired data as differences when the pairing is the central question:
paired_rows =
  before
  |> Enum.zip(after_values)
  |> Enum.with_index(1)
  |> Enum.map(fn {{before_value, after_value}, subject} ->
    %{subject: subject, difference: before_value - after_value}
  end)

Visualization.point_plot(paired_rows,
  x: :subject,
  y: :difference,
  stat: :mean
)
|> Visualization.with_style(width: 520, height: 260)
|> Visualization.show()

Independent T-Test
Prefer Welch's independent t-test by default:
TTest.independent(control, treatment,
  variance: :welch,
  alternative: :two_sided,
  confidence_level: 0.95,
  effect_size: true
)

Use the pooled variant only when equal variances are an intentional assumption:
TTest.independent(control, treatment,
  variance: :pooled,
  effect_size: true
)

Use null_difference: when the comparison is against a non-zero difference:
TTest.independent(control, treatment,
  variance: :welch,
  null_difference: -1.0
)

Annotate an ordinary plot with a computed t-test:
rows
|> Visualization.box_plot(x: :group, y: :score)
|> Visualization.with_test(:t_test, groups: {:control, :treated})
|> Visualization.with_style(width: 420, height: 260)
|> Visualization.show()

Mann-Whitney U Test
Use Mann-Whitney for two independent groups when a rank-based comparison is more appropriate than a mean comparison.
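To see what "rank-based" means in practice, the U statistic can be computed by direct counting: over every cross-group pair, count how often a value from the first group exceeds a value from the second, with ties counting one half. This is one common textbook definition; `USketch` is a hypothetical module, and which of the two U values Statwise reports as `statistic` is not assumed here.

```elixir
defmodule USketch do
  # Hypothetical helper for illustration; not part of Statwise.
  # Counts, over all cross-group pairs, how often x exceeds y; ties add 0.5.
  def u_statistic(xs, ys) do
    for x <- xs, y <- ys, reduce: 0.0 do
      acc ->
        cond do
          x > y -> acc + 1.0
          x == y -> acc + 0.5
          true -> acc
        end
    end
  end
end

ordinal_control = [1, 2, 2, 3, 3, 4]
ordinal_treatment = [3, 4, 4, 5, 5, 6]

u_control = USketch.u_statistic(ordinal_control, ordinal_treatment)
u_treatment = USketch.u_statistic(ordinal_treatment, ordinal_control)

# The two counts always add up to the number of cross-group pairs.
IO.inspect({u_control, u_treatment, length(ordinal_control) * length(ordinal_treatment)})
```

Only the ordering of values matters in this count, which is why the test suits ordinal or strongly skewed data.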
MannWhitney.test(ordinal_control, ordinal_treatment,
  alternative: :two_sided,
  method: :auto,
  continuity: true
)

Choose the method deliberately:
- method: :auto uses exact p-values when there are no ties and the smaller sample has at most 8 observations; otherwise it uses the asymptotic approximation.
- method: :exact computes exact p-values. Like SciPy, explicit exact mode does not apply tie correction.
- method: :asymptotic uses the normal approximation and optional continuity correction.
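The `:auto` rule described above can be restated as a small predicate. This is only a sketch of the documented rule, not Statwise's internal code: `MethodSketch.resolve_auto/2` is a hypothetical name, and treating a tie as any repeated value in the combined samples is an assumption.

```elixir
defmodule MethodSketch do
  # Hypothetical helper restating the documented :auto rule; not part of Statwise.
  def resolve_auto(xs, ys) do
    combined = xs ++ ys

    # Assumption: a tie is any value that appears more than once overall.
    ties? = length(Enum.uniq(combined)) < length(combined)
    smaller = min(length(xs), length(ys))

    if not ties? and smaller <= 8, do: :exact, else: :asymptotic
  end
end

# No ties and small samples: the exact method applies.
IO.inspect(MethodSketch.resolve_auto([1, 3, 5], [2, 4, 6]))

# Repeated values force the asymptotic approximation.
IO.inspect(MethodSketch.resolve_auto([1, 2, 2, 3, 3, 4], [3, 4, 4, 5, 5, 6]))
```

Under this reading, the tutorial's ordinal samples fall through to the asymptotic method under `:auto` because they contain repeated values.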
MannWhitney.test([1, 3, 5], [2, 4, 6], method: :exact)

MannWhitney.test(ordinal_control, ordinal_treatment,
  method: :asymptotic,
  continuity: false
)

Mann-Whitney results include rank-based effect sizes:
MannWhitney.test(ordinal_control, ordinal_treatment).effect_size

Annotate a plot with a computed Mann-Whitney test:
rows
|> Visualization.box_plot(x: :group, y: :score)
|> Visualization.with_test(:mann_whitney, groups: {:control, :treated})
|> Visualization.with_style(width: 420, height: 260)
|> Visualization.show()

Faceted Computed Tests
When a plot is faceted, computed test annotations run independently inside each facet panel.
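One way to picture per-panel computation is to split the rows by the facet column first and by group second, so that each panel ends up with its own pair of samples. The sketch below does this with plain `Enum.group_by`; `FacetSketch` is a hypothetical module and only illustrates the data split, not how Statwise builds its annotations.

```elixir
defmodule FacetSketch do
  # Hypothetical helper for illustration; not part of Statwise.
  # Splits rows by facet, then by group, so each panel has its own samples.
  def samples_by_panel(rows, facet, group, value) do
    rows
    |> Enum.group_by(&Map.fetch!(&1, facet))
    |> Map.new(fn {panel, panel_rows} ->
      {panel, Enum.group_by(panel_rows, &Map.fetch!(&1, group), &Map.fetch!(&1, value))}
    end)
  end
end

# A small subset of the tutorial rows, repeated here to keep the sketch standalone.
rows = [
  %{group: :control, score: 1.2, site: :north},
  %{group: :control, score: 1.9, site: :north},
  %{group: :treated, score: 2.2, site: :north},
  %{group: :treated, score: 3.0, site: :north},
  %{group: :control, score: 1.0, site: :south},
  %{group: :treated, score: 1.8, site: :south}
]

IO.inspect(FacetSketch.samples_by_panel(rows, :site, :group, :score))
```

Each inner map holds exactly the two samples a test would compare within that panel, so a p-value shown in one facet never uses observations from another.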
rows
|> Visualization.box_plot(x: :group, y: :score, facet: :site)
|> Visualization.with_test(:t_test, groups: {:control, :treated}, show: [:p_value])
|> Visualization.with_style(width: 260, height: 220)
|> Visualization.show()

rows
|> Visualization.box_plot(x: :group, y: :score, facet: :site)
|> Visualization.with_test(:mann_whitney, groups: {:control, :treated}, show: [:p_value])
|> Visualization.with_style(width: 260, height: 220)
|> Visualization.show()

Dataframe-Style Inputs
Maps of columns and dataframe-like values can be passed directly with
columns: or pairs:.
TTest.one_sample(df, columns: [:baseline, :before], mean: 10.0)

TTest.paired(df, columns: [:before, :after])

TTest.independent(df, columns: [:control, :treatment], variance: :welch)

MannWhitney.test(df, columns: [:ordinal_control, :ordinal_treatment], method: :auto)

Run several two-sample tests at once with the pairs: option:
TTest.independent(df,
  pairs: [
    control: :treatment,
    before: :after
  ],
  variance: :welch
)

MannWhitney.test(df,
  pairs: [
    ordinal_control: :ordinal_treatment,
    control: :treatment
  ],
  method: :auto
)

Missing Values And NaN Policy
Statwise makes NaN handling explicit through the nan_policy option:
- nan_policy: :raise rejects NaN values.
- nan_policy: :propagate returns NaN-like results when NaNs are present.
- nan_policy: :omit removes NaNs before testing where supported.
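The three policies can be pictured as a preprocessing step applied before any statistic is computed. The sketch below is a hypothetical illustration in plain Elixir: `NanPolicySketch.apply_policy/2`, its return shapes, and the error message are assumptions, and only the `:nan` atom convention is taken from this notebook.

```elixir
defmodule NanPolicySketch do
  # Hypothetical helper for illustration; Statwise's internal handling may differ.

  # :omit drops NaN markers and keeps the remaining observations.
  def apply_policy(values, :omit) do
    {:ok, Enum.reject(values, &(&1 == :nan))}
  end

  # :propagate lets a single NaN turn the whole result into NaN.
  def apply_policy(values, :propagate) do
    if :nan in values, do: :nan, else: {:ok, values}
  end

  # :raise refuses to continue when any NaN is present.
  def apply_policy(values, :raise) do
    if :nan in values do
      raise ArgumentError, "input contains NaN"
    else
      {:ok, values}
    end
  end
end

sample_with_nan = [1.0, 2.0, :nan, 3.0]

IO.inspect(NanPolicySketch.apply_policy(sample_with_nan, :omit))
IO.inspect(NanPolicySketch.apply_policy(sample_with_nan, :propagate))
```

Keep in mind that `:omit` reduces the sample size, so the degrees of freedom of the resulting test shrink accordingly.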
sample_with_nan = [1.0, 2.0, :nan, 3.0]
TTest.one_sample(sample_with_nan,
  mean: 2.0,
  nan_policy: :omit
)

TTest.one_sample(sample_with_nan,
  mean: 2.0,
  nan_policy: :propagate
)

Rank Utilities
Ranks use average-rank handling for ties, matching SciPy behavior.
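Average-rank handling means that tied values share the mean of the sorted positions they occupy. A minimal sketch in plain Elixir, assuming all values are of a single numeric type; `RankSketch` is a hypothetical module and not the Statwise implementation.

```elixir
defmodule RankSketch do
  # Hypothetical helper for illustration; not part of Statwise.
  # Tied values share the mean of the 1-based sorted positions they occupy.
  def average_ranks(values) do
    rank_of =
      values
      |> Enum.sort()
      |> Enum.with_index(1)
      |> Enum.group_by(fn {value, _position} -> value end, fn {_value, position} -> position end)
      |> Map.new(fn {value, positions} ->
        {value, Enum.sum(positions) / length(positions)}
      end)

    # Look each original value up so the output keeps the input order.
    Enum.map(values, &Map.fetch!(rank_of, &1))
  end
end

IO.inspect(RankSketch.average_ranks([10, 20, 20, 30]))
# prints [1.0, 2.5, 2.5, 4.0]
```

The two 20s occupy sorted positions 2 and 3, so each receives rank 2.5.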
Statwise.Nonparametric.Rank.ranks([10, 20, 20, 30])

Rank plots are useful when inspecting nonparametric comparisons:
Visualization.rank_plot(ordinal_control, ordinal_treatment,
  x_label: :control,
  y_label: :treated,
  title: "Average Ranks"
)
|> Visualization.with_style(width: 420, height: 260)
|> Visualization.show()

Practical Checklist
Before choosing a test, ask:
- Is the comparison one sample against a reference, paired, or independent?
- Is the target a mean difference, or is a rank-based comparison more appropriate?
- Is the alternative directional (:greater or :less) or two-sided?
- Are missing values expected, and should they raise, propagate, or be omitted?
- Is an effect size needed for interpretation?
- Should the result be inspected directly or annotated on the plot where the comparison is visible?