Visualization Roadmap

This roadmap sketches how Statwise.Visualization can grow toward a mature, seaborn-inspired statistical visualization API while staying idiomatic Elixir and keeping renderer dependencies optional.

North Star

The long-term goal is a small statistical plotting grammar:

rows
|> Statwise.Visualization.plot(x: :treatment, y: :score, color: :site)
|> Statwise.Visualization.add(:box_plot)
|> Statwise.Visualization.facet(column: :site)
|> Statwise.Visualization.with_theme(:minimal)
|> Statwise.Visualization.show()

Statwise should continue to support simple direct constructors:

Statwise.Visualization.box_plot(rows, x: :treatment, y: :score, facet: :site)

The direct constructors should remain easy for common use, while the grammar API can support composition and advanced charts.

Design Principles

  • Keep chart content separate from presentation.
  • Keep runtime visualization dependencies optional.
  • Prefer tidy row data and semantic mappings.
  • Export plain Vega-Lite-compatible maps as the stable renderer contract.
  • Test generated specs and statistical transformations, not screenshots.
  • Preserve existing APIs through aliases or a deprecation period.

Phase 1: Normalize Semantic Mappings

Make every chart accept consistent field mappings.

Current shape:

Statwise.Visualization.box_plot(rows,
  value: :score,
  group: :treatment,
  facet: :site
)

Target shape:

Statwise.Visualization.box_plot(rows,
  x: :treatment,
  y: :score,
  color: :treatment,
  facet: :site
)

Semantic channels to support:

  • :x
  • :y
  • :color
  • :facet
  • :row
  • :column
  • :size
  • :shape
  • :detail
  • :tooltip

Compatibility aliases:

  • value: :score maps to y: :score
  • group: :treatment maps to x: :treatment
  • facet: :site maps to a column/wrapped facet

Deliverables:

  • Add Statwise.Visualization.Mapping
  • Normalize aliases into semantic channels
  • Share row extraction across plot types
  • Support atom and string map keys
  • Preserve old API behavior
  • Add tests for old and new option names
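
A minimal sketch of the alias normalization described above, using a hypothetical MappingSketch module (the eventual Statwise.Visualization.Mapping may look quite different). It assumes that an explicit channel such as y: wins over its legacy alias value: when both are given; the roadmap does not state that rule. The fetch_field/2 helper is also hypothetical and illustrates atom-or-string key lookup.

```elixir
defmodule MappingSketch do
  @moduledoc "Hypothetical sketch; not the actual Statwise.Visualization.Mapping."

  # Legacy option names and the semantic channel each one maps to.
  @aliases %{value: :y, group: :x}

  @channels [:x, :y, :color, :facet, :row, :column, :size, :shape, :detail, :tooltip]

  @doc "Splits options into {channel_map, remaining_opts}."
  def normalize(opts) when is_list(opts) do
    {mapped, rest} =
      Enum.reduce(opts, {%{}, []}, fn {key, value}, {mapped, rest} ->
        channel = Map.get(@aliases, key, key)

        cond do
          # Not a channel (for example bins: 20): pass it through untouched.
          channel not in @channels -> {mapped, [{key, value} | rest]}
          # Assumption: an alias never overrides a channel that is already set.
          key != channel and Map.has_key?(mapped, channel) -> {mapped, rest}
          true -> {Map.put(mapped, channel, value), rest}
        end
      end)

    {mapped, Enum.reverse(rest)}
  end

  @doc "Reads a field from a row whose keys may be atoms or strings."
  def fetch_field(row, field) when is_map(row) and is_atom(field) do
    case row do
      %{^field => value} -> value
      _ -> Map.get(row, Atom.to_string(field))
    end
  end
end
```

With this sketch, MappingSketch.normalize(value: :score, group: :treatment, facet: :site) yields the same channel map as the new-style options x: :treatment, y: :score, facet: :site, which is the property the old-and-new option tests would check.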

Phase 2: Make Row Data First-Class

Seaborn works best with tidy data. Statwise should make tidy rows the primary shape while still supporting lists and maps of columns.

Supported inputs:

[%{group: :a, value: 1.2}]
%{group: [:a, :b], value: [1.2, 2.4]}
Explorer.DataFrame

Potential internal representation:

%Statwise.Visualization.Dataset{
  rows: [%{}],
  fields: %{...},
  source: :rows | :columns | :explorer
}

Conversion APIs:

Statwise.Visualization.Dataset.from_rows(rows)
Statwise.Visualization.Dataset.from_columns(columns)
Statwise.Visualization.Dataset.from_explorer(df)

Explorer should remain optional and be detected with Code.ensure_loaded?/1.

Deliverables:

  • Direct Explorer support using Explorer.DataFrame.to_rows/2
  • Map-of-columns support
  • Row validation
  • Shared missing-value policy
  • Dataset tests
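
The conversion functions above could be sketched roughly as follows. DatasetSketch is a hypothetical stand-in for Statwise.Visualization.Dataset, and fields is simplified to a list of keys here. The Explorer branch relies only on Code.ensure_loaded?/1 and Explorer.DataFrame.to_rows/2, called through apply/3 so the module compiles when Explorer is absent; the new/1 dispatcher and the error messages are assumptions.

```elixir
defmodule DatasetSketch do
  @moduledoc "Hypothetical sketch; not the actual Statwise.Visualization.Dataset."
  defstruct rows: [], fields: [], source: :rows

  # Dispatch on input shape: list of rows, struct (assumed DataFrame), or map of columns.
  def new(rows) when is_list(rows), do: from_rows(rows)
  def new(%{__struct__: _} = df), do: from_explorer(df)
  def new(columns) when is_map(columns), do: from_columns(columns)

  def from_rows(rows) when is_list(rows) do
    unless Enum.all?(rows, &is_map/1) do
      raise ArgumentError, "every row must be a map"
    end

    fields = rows |> Enum.flat_map(&Map.keys/1) |> Enum.uniq()
    %__MODULE__{rows: rows, fields: fields, source: :rows}
  end

  def from_columns(columns) when is_map(columns) do
    {fields, values} = columns |> Map.to_list() |> Enum.unzip()

    if values |> Enum.map(&length/1) |> Enum.uniq() |> length() > 1 do
      raise ArgumentError, "all columns must have the same length"
    end

    # Zip the columns together so each tuple holds one row's values.
    rows =
      values
      |> Enum.zip()
      |> Enum.map(fn tuple -> fields |> Enum.zip(Tuple.to_list(tuple)) |> Map.new() end)

    %__MODULE__{rows: rows, fields: fields, source: :columns}
  end

  def from_explorer(df) do
    if Code.ensure_loaded?(Explorer.DataFrame) do
      rows = apply(Explorer.DataFrame, :to_rows, [df, [atom_keys: true]])
      %{from_rows(rows) | source: :explorer}
    else
      raise ArgumentError, "Explorer is not available"
    end
  end
end
```

Keeping a single row-based representation means every plot type can share one extraction path regardless of how the data arrived.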

Phase 3: Expand Core Plot Types

Seaborn organizes plots into relational, distribution, categorical, regression, and matrix families. Statwise should prioritize statistical usefulness.

Relational plots:

Statwise.Visualization.scatter(data, x: :height, y: :weight)
Statwise.Visualization.line(data, x: :time, y: :value)

Distribution plots:

Statwise.Visualization.histogram(data, x: :score)
Statwise.Visualization.ecdf(data, x: :score)
Statwise.Visualization.density(data, x: :score)
Statwise.Visualization.qq_plot(data, x: :score)

Categorical plots:

Statwise.Visualization.box_plot(data, x: :group, y: :score)
Statwise.Visualization.violin_plot(data, x: :group, y: :score)
Statwise.Visualization.strip_plot(data, x: :group, y: :score)
Statwise.Visualization.swarm_plot(data, x: :group, y: :score)
Statwise.Visualization.bar_plot(data, x: :group, y: :score, stat: :mean)
Statwise.Visualization.point_plot(data, x: :group, y: :score, interval: :confidence)
Statwise.Visualization.count_plot(data, x: :category)

Matrix plots:

Statwise.Visualization.heatmap(matrix)
Statwise.Visualization.correlation_heatmap(data, columns: [:a, :b, :c])

Recommended initial additions:

  • scatter/2
  • line/2
  • bar_plot/2
  • count_plot/2
  • strip_plot/2
  • heatmap/2

Defer until the statistical transformation story is clear:

  • density/2
  • violin_plot/2
  • swarm_plot/2
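
To illustrate what a statistical transformation means for one of the distribution plots listed above, here is a sketch of the standard empirical CDF computation that an ecdf/2 plot would need before rendering. EcdfSketch and its output keys are hypothetical; this is not a description of how Statwise computes it.

```elixir
defmodule EcdfSketch do
  @moduledoc "Hypothetical sketch of an ECDF transformation."

  @doc "Returns one row per distinct value with its cumulative proportion."
  def points([]), do: []

  def points(values) when is_list(values) do
    n = length(values)

    values
    |> Enum.sort()
    |> Enum.with_index(1)
    |> Enum.chunk_by(fn {value, _rank} -> value end)
    |> Enum.map(fn chunk ->
      # For tied values, keep the highest rank so the step reaches the full proportion.
      {value, rank} = List.last(chunk)
      %{value: value, proportion: rank / n}
    end)
  end
end
```

The resulting rows are already tidy, so they could be handed to a line or step mark without further reshaping.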

Phase 4: Improve Faceting

Current support:

facet: :site
facet_columns: 2

Target support:

facet: :site
facet: [column: :site]
facet: [row: :sex, column: :site]
columns: 3
share_x: true
share_y: false

Potential internal representation:

%{
  row: channel | nil,
  column: channel | nil,
  columns: integer | nil,
  share_x: boolean,
  share_y: boolean
}

Deliverables:

  • Row facets
  • Column facets
  • Wrapped facets
  • Shared-axis controls through Vega-Lite resolve
  • Livebook examples
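
One way the facet options could be normalized into the internal representation and then turned into a Vega-Lite facet specification is sketched below. FacetSketch is hypothetical, and the defaults of share_x: true and share_y: true are an assumption (they mirror Vega-Lite's shared scales for facets). The output uses Vega-Lite's facet, columns, and resolve properties; data is expected to be attached at the top level by the caller, and field types are fixed to "nominal" for brevity.

```elixir
defmodule FacetSketch do
  @moduledoc "Hypothetical sketch of facet normalization and Vega-Lite output."

  # Assumed defaults; the roadmap does not specify them.
  @defaults %{row: nil, column: nil, columns: nil, share_x: true, share_y: true}

  def normalize(opts) when is_list(opts) do
    facet =
      case Keyword.get(opts, :facet) do
        nil -> []
        # facet: :site is shorthand for a column (or wrapped) facet.
        field when is_atom(field) -> [column: field]
        list when is_list(list) -> Keyword.take(list, [:row, :column])
      end

    # The existing facet_columns: option is accepted as an alias of columns:.
    columns = Keyword.get(opts, :columns) || Keyword.get(opts, :facet_columns)

    @defaults
    |> Map.merge(Map.new(facet))
    |> Map.merge(Map.new(Keyword.take(opts, [:share_x, :share_y])))
    |> Map.put(:columns, columns)
  end

  # No facet requested: return the inner spec unchanged.
  def to_vega_lite(%{row: nil, column: nil}, spec), do: spec

  def to_vega_lite(facet, spec) do
    %{"facet" => facet_def(facet), "spec" => spec}
    |> put_columns(facet)
    |> put_resolve(facet)
  end

  # A column facet with a column count becomes a wrapped facet.
  defp facet_def(%{row: nil, column: field, columns: n}) when is_integer(n), do: field_def(field)

  defp facet_def(facet) do
    [:row, :column]
    |> Enum.filter(&Map.get(facet, &1))
    |> Map.new(fn key -> {Atom.to_string(key), field_def(Map.fetch!(facet, key))} end)
  end

  defp field_def(field), do: %{"field" => to_string(field), "type" => "nominal"}

  defp put_columns(spec, %{row: nil, columns: n}) when is_integer(n), do: Map.put(spec, "columns", n)
  defp put_columns(spec, _facet), do: spec

  # Axes that are not shared get an independent scale.
  defp put_resolve(spec, %{share_x: share_x, share_y: share_y}) do
    scale = for {axis, false} <- [{"x", share_x}, {"y", share_y}], into: %{}, do: {axis, "independent"}
    if map_size(scale) == 0, do: spec, else: Map.put(spec, "resolve", %{"scale" => scale})
  end
end
```

Because the output is a plain map, tests can compare it directly against an expected specification instead of inspecting rendered output.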

Phase 5: Style And Theme System

The current with_style/2 supports friendly style keys. Extend it with theme presets, palettes, and direct Vega-Lite pass-through.

Theme presets:

Statwise.Visualization.with_theme(plot, :default)
Statwise.Visualization.with_theme(plot, :minimal)
Statwise.Visualization.with_theme(plot, :paper)
Statwise.Visualization.with_theme(plot, :dark)
Statwise.Visualization.with_theme(plot, :livebook)

Palette support:

Statwise.Visualization.with_palette(plot, :category10)
Statwise.Visualization.with_palette(plot, ["#2563eb", "#dc2626", "#16a34a"])

Vega-Lite escape hatches:

Statwise.Visualization.with_style(plot,
  vega_lite: [...],
  mark: [...],
  encoding: [...],
  facet: [...],
  spec: [...],
  config: [...]
)

Precedence rules should be explicit:

  1. Plot defaults
  2. Theme
  3. Attached style
  4. Export-time style

Deliverables:

  • Add Statwise.Visualization.Theme
  • Add Statwise.Visualization.Palette
  • Add full Vega-Lite pass-through support
  • Document merge precedence
  • Add tests for faceted and layered style routing
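
The merge precedence could be expressed as a deep merge applied in list order. The sketch below assumes that later sources override earlier ones (so export-time style wins over everything else), which is an interpretation of the list above rather than a documented rule. StyleMergeSketch and its function names are hypothetical.

```elixir
defmodule StyleMergeSketch do
  @moduledoc "Hypothetical sketch of style precedence as an ordered deep merge."

  # Recursively merges nested maps; for any other value the right-hand side wins.
  def deep_merge(left, right) when is_map(left) and is_map(right) do
    Map.merge(left, right, fn
      _key, %{} = l, %{} = r -> deep_merge(l, r)
      _key, _l, r -> r
    end)
  end

  # Assumed order, lowest to highest precedence:
  # plot defaults, theme, attached style, export-time style.
  def resolve(plot_defaults, theme, attached_style, export_style) do
    Enum.reduce(
      [plot_defaults, theme, attached_style, export_style],
      %{},
      fn layer, acc -> deep_merge(acc, layer) end
    )
  end
end
```

A deep merge keeps unrelated nested keys from lower-precedence sources, so a theme can change one axis setting without discarding the plot's other config defaults.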

Phase 6: Plot Object And Composition API

This layer is inspired by seaborn's objects interface (seaborn.objects).

Target API:

Statwise.Visualization.plot(rows, x: :score, color: :group)
|> Statwise.Visualization.add(:histogram, bins: 20)
|> Statwise.Visualization.add(:rug)
|> Statwise.Visualization.facet(column: :site)
|> Statwise.Visualization.label(title: "Scores by Site")
|> Statwise.Visualization.show()

Potential structs:

%Statwise.Visualization.Figure{
  data: dataset,
  mappings: %{x: :score, y: nil, color: :group},
  layers: [%Statwise.Visualization.Layer{}],
  facet: nil,
  labels: %{},
  theme: nil,
  style: %{}
}

Layer examples:

add(:point)
add(:line)
add(:bar)
add(:box_plot)
add(:histogram)
add(:rule)

Deliverables:

  • plot/2
  • add/3
  • facet/2
  • label/2
  • show/1
  • Vega-Lite conversion for layered and faceted figures

This should happen after the direct constructors and semantic mappings are stable.
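
A rough sketch of how the composition functions might accumulate state in a struct and convert it to a layered Vega-Lite-shaped map. FigureSketch is hypothetical and deliberately smaller than the Figure struct above: it handles only primitive marks, omits faceting and themes, and leaves out field types. Derived layers such as :histogram would need a statistical transformation first and are not covered.

```elixir
defmodule FigureSketch do
  @moduledoc "Hypothetical sketch of a composable figure; not the actual Figure struct."
  defstruct data: [], mappings: %{}, layers: [], labels: %{}

  def plot(rows, mappings) when is_list(rows) do
    %__MODULE__{data: rows, mappings: Map.new(mappings)}
  end

  # Layers are appended so they render in the order they were added.
  def add(%__MODULE__{} = figure, mark, opts \\ []) when is_atom(mark) do
    %{figure | layers: figure.layers ++ [%{mark: mark, opts: Map.new(opts)}]}
  end

  def label(%__MODULE__{} = figure, labels) do
    %{figure | labels: Map.merge(figure.labels, Map.new(labels))}
  end

  def to_vega_lite(%__MODULE__{} = figure) do
    # Shared encoding: one entry per mapped channel, skipping unset channels.
    encoding =
      for {channel, field} <- figure.mappings, field != nil, into: %{} do
        {Atom.to_string(channel), %{"field" => to_string(field)}}
      end

    layers =
      Enum.map(figure.layers, fn %{mark: mark, opts: opts} ->
        mark_def = Map.new(opts, fn {key, value} -> {to_string(key), value} end)
        %{"mark" => Map.put(mark_def, "type", vega_mark(mark))}
      end)

    spec = %{"data" => %{"values" => figure.data}, "encoding" => encoding, "layer" => layers}

    case figure.labels do
      %{title: title} -> Map.put(spec, "title", title)
      _ -> spec
    end
  end

  defp vega_mark(:box_plot), do: "boxplot"
  defp vega_mark(mark) when mark in [:point, :line, :bar, :rule], do: Atom.to_string(mark)
end
```

Since every step returns a plain struct, a pipeline can be inspected or tested at any stage before conversion.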

Phase 7: Statistical Summaries And Intervals

Implemented. Seaborn-like estimate plots compute a summary per group and can draw an interval around it.

Examples:

Statwise.Visualization.bar_plot(data, x: :group, y: :score, stat: :mean)

Statwise.Visualization.point_plot(data,
  x: :group,
  y: :score,
  stat: :mean,
  interval: :confidence,
  confidence_level: 0.95
)

Supported summaries:

  • :count
  • :mean
  • :median
  • :sum

Supported intervals:

  • nil
  • :standard_error
  • :confidence
  • :percentile

Deliverables:

  • Add Statwise.Visualization.Summary
  • Grouped summaries
  • Confidence, standard-error, and percentile intervals
  • point_plot/2
  • Optional bootstrap intervals later
  • Tests for summary correctness
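
For illustration, grouped summaries with a standard-error interval could be computed along these lines. SummarySketch and its output keys are hypothetical and do not describe the actual Statwise.Visualization.Summary. Only the nil and :standard_error intervals are sketched; :confidence would additionally need a t-distribution quantile and :percentile a resampling step, both omitted here. The standard error is that of the mean, so pairing it with another stat is shown only for completeness.

```elixir
defmodule SummarySketch do
  @moduledoc "Hypothetical sketch of grouped summaries with an optional interval."

  def summarize(rows, opts) do
    x = Keyword.fetch!(opts, :x)
    y = Keyword.fetch!(opts, :y)
    stat = Keyword.get(opts, :stat, :mean)
    interval = Keyword.get(opts, :interval)

    rows
    |> Enum.group_by(&Map.fetch!(&1, x), &Map.fetch!(&1, y))
    |> Enum.map(fn {group, values} ->
      estimate = estimate(stat, values)
      {lower, upper} = interval(interval, estimate, values)
      %{group: group, estimate: estimate, lower: lower, upper: upper}
    end)
    |> Enum.sort_by(& &1.group)
  end

  defp estimate(:count, values), do: length(values)
  defp estimate(:sum, values), do: Enum.sum(values)
  defp estimate(:mean, values), do: Enum.sum(values) / length(values)

  defp estimate(:median, values) do
    sorted = Enum.sort(values)
    n = length(sorted)
    mid = div(n, 2)

    if rem(n, 2) == 1 do
      Enum.at(sorted, mid)
    else
      (Enum.at(sorted, mid - 1) + Enum.at(sorted, mid)) / 2
    end
  end

  defp interval(nil, _estimate, _values), do: {nil, nil}
  # The sample variance needs at least two observations.
  defp interval(:standard_error, _estimate, values) when length(values) < 2, do: {nil, nil}

  defp interval(:standard_error, estimate, values) do
    n = length(values)
    mean = Enum.sum(values) / n
    squares = Enum.map(values, fn v -> (v - mean) * (v - mean) end)
    # Standard error of the mean: sqrt(sample variance / n).
    se = :math.sqrt(Enum.sum(squares) / (n - 1) / n)
    {estimate - se, estimate + se}
  end
end
```

Returning tidy summary rows keeps the rendering step identical to any other bar or point layer; only the data differs.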

Phase 8: Statistical Result Annotations

Implemented. Result-specific plots remain available for direct inspection:

Statwise.Visualization.t_test(result)
Statwise.Visualization.mann_whitney(result)
Statwise.Visualization.confidence_interval(result)

The primary workflow now shows statistical results directly on ordinary plots, for example a box plot with a comparison bracket and p-value/effect-size annotation:

rows
|> Statwise.Visualization.box_plot(x: :group, y: :score)
|> Statwise.Visualization.with_test(result, groups: {:control, :treated})

Tests can also be computed from the plotted rows. When the plot is faceted, the test is computed independently inside each facet panel:

rows
|> Statwise.Visualization.box_plot(x: :group, y: :score, facet: :site)
|> Statwise.Visualization.with_test(:mann_whitney, groups: {:control, :treated})

Deliverables:

  • Completed: test-result annotation data model
  • Completed: comparison brackets for categorical plots
  • Completed: p-value, statistic, and effect-size labels for t-test and Mann-Whitney results
  • Completed: per-facet test computation when tests are computed from plotted rows
  • Completed: facet-aware annotation placement
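
The per-facet behaviour can be pictured as grouping rows by the facet field and running the test once per panel. The sketch below is an illustration only, not the Statwise implementation: FacetTestSketch is hypothetical, and it computes just the Mann-Whitney U statistic for the first group by direct pair counting, without a p-value or effect size.

```elixir
defmodule FacetTestSketch do
  @moduledoc "Hypothetical sketch of computing a test independently per facet panel."

  @doc "U statistic for sample a: pairs with a > b count 1, ties count 0.5."
  def mann_whitney_u(a, b) do
    for x <- a, y <- b, reduce: 0.0 do
      acc ->
        cond do
          x > y -> acc + 1.0
          x == y -> acc + 0.5
          true -> acc
        end
    end
  end

  @doc "Returns a map from facet value to the U statistic inside that panel."
  def per_facet(rows, opts) do
    facet = Keyword.fetch!(opts, :facet)
    x = Keyword.fetch!(opts, :x)
    y = Keyword.fetch!(opts, :y)
    {first, second} = Keyword.fetch!(opts, :groups)

    rows
    |> Enum.group_by(&Map.fetch!(&1, facet))
    |> Map.new(fn {panel, panel_rows} ->
      # Only rows from this panel contribute to its statistic.
      values = Enum.group_by(panel_rows, &Map.fetch!(&1, x), &Map.fetch!(&1, y))
      {panel, mann_whitney_u(Map.get(values, first, []), Map.get(values, second, []))}
    end)
  end
end
```

Keying the results by facet value gives annotation placement a direct lookup from panel to result.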

Future extensions:

  • Optional confidence interval overlays from %Statwise.TestResult{}

Phase 9: Documentation And Gallery

The Livebook should become the canonical tutorial.

Recommended structure:

  1. Quickstart
  2. Data shapes
  3. Semantic mappings
  4. Distribution plots
  5. Categorical plots
  6. Relational plots
  7. Faceting
  8. Styling and themes
  9. Statistical result plots
  10. Exporting
  11. Vega-Lite escape hatches

Also keep the README compact:

df
|> Statwise.Visualization.box_plot(x: :treatment, y: :score, facet: :site)
|> Statwise.Visualization.show()

Deliverables:

  • Keep docs/visualization_gallery.livemd current
  • Update docs/visualization.md
  • Add small README examples
  • Add generated Vega-Lite examples in tests

Phase 10: Compatibility And Stability

Before calling the visualization API mature:

  • Require no visualization runtime dependency.
  • Keep VegaLite, Kino, Jason, and Explorer optional.
  • Keep old APIs through aliases or a deprecation period.
  • Add changelog entries.
  • Ensure chart constructors return plain %Plot{} or %Figure{} structs.
  • Test generated Vega-Lite specs, not screenshots.

Suggested Implementation Order

  1. Semantic mappings: x, y, color, facet
  2. Dataset normalization for rows, columns, and Explorer
  3. Scatter, line, bar, count, and strip plots
  4. Better faceting: row/column facets and shared axes
  5. Vega-Lite escape hatches in with_style/2
  6. Themes and palettes
  7. Statistical summary plots
  8. Composition API: plot |> add |> facet
  9. Matrix and correlation heatmaps
  10. Violin, density, and swarm plots once transformations are solid

The highest-leverage starting point is Phase 1 plus Phase 2. A seaborn-like API lives or dies by tidy data and semantic mappings. Once those are clean, every new chart becomes easier to add.