All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[0.2.0] - 2025-05-03

Added

  • ExSystolic.Backend.Partitioned -- tile-based parallel backend using Task.Supervisor (default) or Poolex worker pool (dispatch: :pool)
  • ExSystolic.Backend.PoolexWorker -- GenServer worker for Poolex-based dispatch
  • ExSystolic.Backend.LinkOps -- shared link buffer operations (inject_streams/2, drain_links/2, write_pe_outputs/2), eliminating triple-duplicated code across Clock, Interpreted, and Partitioned
  • ExSystolic.Tile -- tile data structure for partitioned execution with documented invariants
  • ExSystolic.TilePartitioner -- splits arrays into rectangular tiles with validation (tile_rows > 0, tile_cols > 0)
  • ExSystolic.Application -- supervision tree with ExSystolic.TaskSupervisor and :systolic_pool Poolex pool
  • ExSystolic.Backend behaviour -- documents the 6-step BSP tick contract (INJECT, READ, EXECUTE, COLLECT, WRITE, RECORD)
  • Space.links/2 callback -- Space modules now produce boundary + internal links for a given direction; Grid2D implements it :west_to_east and :north_to_south
  • Array.result_map/1 -- returns coord => state for any Space implementation
  • PE.value/2 and PE.present?/2 -- helpers for handling :empty / nil PE inputs idiomatically
  • Array.input/3 -- accepts any atom port (not just :west/:north); raises ArgumentError on duplicate {coord, port} keys
  • Architecture diagram in ExSystolic moduledoc showing BSP dataflow
  • Determinism contract in PE moduledoc forbidding side effects (Process, Agent, ETS, IO, timers)
  • :empty documented as reserved atom in Link and PE moduledocs
  • Backend conformance test (test/ex_systolic/backend/conformance_test.exs) -- result parity and trace parity across backends
  • Cross-backend trace parity test (sorted {tick, coord, state_before, state_after} comparison)
  • Link FIFO interleaved property test
  • Capacity > 1 link tests (multiple writes, FIFO order, :full on overflow)
  • Poolex pool dispatch test (dispatch: :pool parity with interpreted)
  • Application supervision tests (TaskSupervisor alive, Poolex pool accepts calls)
  • Custom space end-to-end execution test through Clock
  • Full PE state parity test beyond result_matrix
  • Doctest directives in 6 test files (34 doctests, up from 12)

Changed

  • Array.connect/2 delegates to Space.links/2; Grid2D-specific link functions removed from array.ex
  • Array.result_matrix/1 raises ArgumentError for non-Grid2D spaces (was silently producing nil cells)
  • Partitioned backend uses Task.Supervisor.async_stream with ordered: true and timeout: :infinity; trace events sorted by {tick, coord} after collection
  • Partitioned accepts dispatch: :tasks | :pool option (default :tasks)

  • Clock.run/2 raises ArgumentError for unknown backends (was CaseClauseError)
  • Clock.step/1 uses LinkOps helpers instead of inline link operations
  • Interpreted.write_outputs/3 delegates to LinkOps.write_pe_outputs + LinkOps.inject_streams
  • MAC.step/4 uses PE.value/2 and PE.present?/2 instead of inline :empty checks
  • Trace.at/2 and Trace.for_coord/2 return events in chronological order (filter-then-reverse); ordering documented
  • Trace.Event has a single canonical typespec with @type event :: Event.t() convenience alias
  • All ranges use //1 step syntax for Elixir 1.20-rc compatibility (15 occurrences)
  • Link.read/1 typespec changed to {term() | :empty, t()}

  • Link.write/2 and Link.read/1 doctests fixed (bind-then-assert pattern, no bare _)
  • Trace.record/2 doctest now verifies event content (state_after, tick)
  • Link.tick/1 doctest now demonstrates buffer preservation
  • Grid.member?/2 doctest includes negative-coordinate edge case
  • TilePartitioner moduledoc accurately describes boundary-link handling (excluded from tiles, managed globally)
  • Application has full @moduledoc and @doc on start/2
  • Backend table in ExSystolic moduledoc lists all 19 modules

Removed

  • Interpreted.read_inputs/2 -- dead code that violated buffer drainage contract; tests removed
  • Array.input/3 dead case direction branch (identity mapping :west -> :west, :north -> :north)
  • Application.start/1 dead Code.ensure_loaded?(Poolex) conditional (Poolex is always a dependency)
  • Tile.boundary_inputs field -- dead struct field never read by Partitioned
  • Grid2D-specific build_direction_links clauses and helper functions from array.ex (~90 LOC)

Fixed

  • Trace event ordering was non-deterministic in partitioned backend (ordered: false in Task.async_stream)
  • Link doctests for write/2 and read/2 never executed (bare _ in expected output)
  • result_matrix/1 silently returned nil cells for non-Grid2D spaces
  • Array.input/3 silently overwrote duplicate {coord, port} streams
  • Range syntax incompatibility with Elixir 1.20-rc (bare 0..(n-1) where n could be 0)
  • PoolexWorker.handle_call timeout risk -- callers now pass :infinity
  • Dialyzer opaque type violation in TilePartitioner (MapSet.member? replaced with Map.has_key?)

Performance

  • LinkOps.drain_links/2 builds endpoint index map for O(1) lookup (mutation still list-based; see review item 2.1 for full indexed-map refactoring)

Known Limitations

  • Trace.events list grows unbounded (no max_events or sampling)
  • Link mutation still uses List.replace_at (O(n)); indexed map storage not yet implemented
  • Latency > 1 links are not yet implemented in the backend
  • No sparse data support; every PE executes every tick
  • No semi-ring abstraction; MAC hard-codes arithmetic operations

[0.1.0] - 2025-05-01

Added

  • ExSystolic.Grid -- rectangular coordinate geometry with neighbour lookups (north, south, east, west)
  • ExSystolic.Link -- FIFO communication channels between PE ports with configurable capacity and latency
  • ExSystolic.PE -- behaviour for processing elements with init/1 and step/4 callbacks
  • ExSystolic.PE.MAC -- multiply-accumulate PE implementing the classic Kung-Leiserson systolic GEMM PE
  • ExSystolic.Array -- array construction API: new, fill, connect, input, trace, result_matrix
  • ExSystolic.Clock -- tick-by-tick execution driver with run/2 and step/1
  • ExSystolic.Trace -- optional execution trace recording with per-tick and per-coordinate querying
  • ExSystolic.Backend.Interpreted -- single-process reference backend implementing the strict tick order (inject, read, execute, write, record)
  • ExSystolic.Examples.GEMM -- general matrix multiply using systolic wavefront with skewed input streams
  • Tutorial livebook: notebooks/introduction_to_systolic_arrays.livemd
  • 86 unit tests + 9 property-based tests (95.4% coverage)
  • CI pipeline: Elixir 1.18/OTP 27, 1.19/OTP 28, 1.20/OTP 29 (experimental)
  • Publish workflow: Hex.pm release on v* tag push
  • Quality toolchain: credo, dialyxir, excoveralls, stream_data, mix_audit

Design Principles

  • Everything is pure Elixir; no GenServer, no Nx, no NIFs
  • Execution is deterministic and reproducible
  • All data structures are immutable
  • All public functions have @doc, typespecs, and doctests

Known Limitations

  • Interpreted backend has no parallelism; large arrays will be slow
  • No sparse data support; every PE executes every tick
  • Latency > 1 links are not yet implemented in the backend
  • Trace is held entirely in memory
  • No semi-ring abstraction; MAC hard-codes arithmetic operations

Performance

No benchmarks have been run. The interpreted backend is chosen for architectural clarity and correctness, not for speed. Performance claims will be made only after benchmarks exist.