ExZarr.Codecs.PipelineV3 (ExZarr v1.1.0)

Zarr v3 codec pipeline implementation.

The v3 specification introduces a unified codec pipeline with strict ordering requirements to ensure predictable data transformations.

Codec Ordering

The pipeline must follow this exact order:

  1. Array → Array codecs (zero or more): Transforms that operate on array data, such as transpose, delta encoding, or bit rounding.

  2. Array → Bytes codec (exactly one, required): Serializes array data to bytes. The standard codec is "bytes", which handles endianness and packing.

  3. Bytes → Bytes codecs (zero or more): Codecs that operate on byte streams, typically compressors such as gzip, zstd, or blosc.
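
The ordering rule above can be sketched in plain Elixir. This is a minimal illustration, not ExZarr's implementation: the module name StageOrderSketch and its functions are hypothetical, and the stage assigned to each codec name is taken from how the names are described on this page, which the library itself may classify differently. The sketch checks ordering only; it does not check how many codecs each stage contains.

```elixir
defmodule StageOrderSketch do
  # Hypothetical lookup table: stage per codec name, as described on this page.
  @stages %{
    "transpose" => :array_to_array,
    "delta" => :array_to_array,
    "shuffle" => :array_to_array,
    "bytes" => :array_to_bytes,
    "gzip" => :bytes_to_bytes,
    "zstd" => :bytes_to_bytes,
    "blosc" => :bytes_to_bytes
  }

  # Position of each stage within a valid pipeline.
  @rank %{array_to_array: 0, array_to_bytes: 1, bytes_to_bytes: 2}

  # Maps codec names to their stages; raises KeyError on an unknown name.
  def stages(names), do: Enum.map(names, &Map.fetch!(@stages, &1))

  # True when no codec appears before a codec of an earlier stage.
  def ordered?(names) do
    ranks = names |> stages() |> Enum.map(&Map.fetch!(@rank, &1))
    ranks == Enum.sort(ranks)
  end
end
```

For instance, StageOrderSketch.ordered?(["shuffle", "delta", "bytes", "gzip"]) holds, while a list that places "gzip" before "bytes" does not.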

Example Pipeline

codecs = [
  # Array → Array: shuffle bytes for better compression
  %{name: "shuffle", configuration: %{elementsize: 8}},
  # Array → Array: delta encoding
  %{name: "delta", configuration: %{dtype: "int64"}},
  # Array → Bytes: serialize to bytes (required)
  %{name: "bytes", configuration: %{}},
  # Bytes → Bytes: compress with gzip
  %{name: "gzip", configuration: %{level: 5}}
]

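# `data` is the chunk contents as a binary; here, three int64 values
# (little-endian layout is assumed for illustration only)
data = <<1::signed-little-64, 2::signed-little-64, 3::signed-little-64>>

{:ok, pipeline} = ExZarr.Codecs.PipelineV3.parse_codecs(codecs)
{:ok, encoded} = ExZarr.Codecs.PipelineV3.encode(data, pipeline)
{:ok, decoded} = ExZarr.Codecs.PipelineV3.decode(encoded, pipeline)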

Specification

Zarr v3 Core Specification - Codecs: https://zarr-specs.readthedocs.io/en/latest/v3/core/index.html#codecs

Summary

Functions

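decode(data, pipeline, opts \\ [])
Decodes data through the codec pipeline.

encode(data, pipeline, opts \\ [])
Encodes data through the codec pipeline.

from_v2(filters, compressor)
Converts v2-style filters and a compressor to a v3 codec list.

parse_codecs(codec_specs)
Parses and validates a list of codec specifications.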

Types

codec_spec()

@type codec_spec() :: %{:name => String.t(), optional(:configuration) => map()}
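
Because :configuration is optional in codec_spec(), code that consumes these maps may find it convenient to fill in an empty configuration first. The following is a small sketch of that idea; CodecSpecSketch and normalize/1 are hypothetical names and are not part of ExZarr.

```elixir
defmodule CodecSpecSketch do
  # Adds an empty :configuration map when the key is absent,
  # leaving any existing configuration untouched.
  def normalize(%{name: name} = spec) when is_binary(name) do
    Map.put_new(spec, :configuration, %{})
  end
end
```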

codec_stage()

@type codec_stage() :: :array_to_array | :array_to_bytes | :bytes_to_bytes

Functions

decode(data, pipeline, opts \\ [])

@spec decode(binary(), ExZarr.Codecs.PipelineV3.Pipeline.t(), keyword()) ::
  {:ok, binary()} | {:error, term()}

Decodes data through the codec pipeline.

Applies codecs in reverse order:

  1. Bytes→Bytes decompression
  2. Bytes→Array deserialization
  3. Array→Array reverse transformations

Parameters

  • data - Binary data to decode
  • pipeline - Validated pipeline struct
  • opts - Additional options (shape, dtype, etc.)

Returns

  • {:ok, decoded_data} - Successfully decoded binary
  • {:error, reason} - Decoding failure
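
Why decoding has to run in reverse can be shown with a toy two-step pipeline. This is only a sketch under stated assumptions: ReverseOrderSketch is a hypothetical module, the byte-shift step is an invented stand-in for a real transform, and compression uses Erlang's standard :zlib module rather than any ExZarr codec.

```elixir
defmodule ReverseOrderSketch do
  # Toy pipeline: each entry is an {encode_fun, decode_fun} pair over binaries.
  defp steps do
    [
      {&shift_up/1, &shift_down/1},
      {&:zlib.gzip/1, &:zlib.gunzip/1}
    ]
  end

  # Encoding walks the list front to back.
  def encode(data) do
    Enum.reduce(steps(), data, fn {enc, _dec}, acc -> enc.(acc) end)
  end

  # Decoding walks the same list back to front, undoing the last step first.
  def decode(data) do
    steps()
    |> Enum.reverse()
    |> Enum.reduce(data, fn {_enc, dec}, acc -> dec.(acc) end)
  end

  # Adds 1 to every byte, wrapping at 256.
  defp shift_up(bin), do: for(<<b <- bin>>, into: <<>>, do: <<rem(b + 1, 256)>>)

  # Subtracts 1 from every byte, wrapping at 0.
  defp shift_down(bin), do: for(<<b <- bin>>, into: <<>>, do: <<rem(b + 255, 256)>>)
end
```

Running the decoders in list order instead would apply the byte shift to the compressed bytes first, so :zlib.gunzip/1 would no longer receive valid gzip data.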

encode(data, pipeline, opts \\ [])

@spec encode(binary(), ExZarr.Codecs.PipelineV3.Pipeline.t(), keyword()) ::
  {:ok, binary()} | {:error, term()}

Encodes data through the codec pipeline.

Applies codecs in forward order:

  1. Array→Array transformations
  2. Array→Bytes serialization
  3. Bytes→Bytes compression

Parameters

  • data - Binary data to encode
  • pipeline - Validated pipeline struct
  • opts - Additional options (shape, dtype, etc.)

Returns

  • {:ok, encoded_data} - Successfully encoded binary
  • {:error, reason} - Encoding failure
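
The {:ok, _} | {:error, _} return shape suggests that a failure in any step ends the whole pass. One way to get that behaviour is Enum.reduce_while/3, sketched below. ForwardPassSketch and the step functions are hypothetical and only illustrate the pattern; how ExZarr threads errors internally is not documented here.

```elixir
defmodule ForwardPassSketch do
  # Runs each step in list order and stops at the first {:error, reason}.
  def run(data, steps) do
    Enum.reduce_while(steps, {:ok, data}, fn step, {:ok, acc} ->
      case step.(acc) do
        {:ok, next} -> {:cont, {:ok, next}}
        {:error, _reason} = error -> {:halt, error}
      end
    end)
  end
end

# Two illustrative steps: append a marker byte, then gzip the result.
steps = [
  fn bin -> {:ok, bin <> <<0>>} end,
  fn bin -> {:ok, :zlib.gzip(bin)} end
]

{:ok, _encoded} = ForwardPassSketch.run("abc", steps)
```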

from_v2(filters, compressor)

@spec from_v2(list() | nil, atom()) :: [codec_spec()]

Converts v2-style filters and a compressor to a v3 codec list.

Parameters

  • filters - List of v2 filter tuples {:filter_id, opts}
  • compressor - v2 compressor atom (:zlib, :zstd, etc.)

Returns

  • List of v3 codec specifications

Examples

iex> filters = [{:shuffle, [elementsize: 8]}, {:delta, [dtype: :int64]}]
iex> codecs = ExZarr.Codecs.PipelineV3.from_v2(filters, :zlib)
iex> length(codecs)
4
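
In the example above, two filters plus one compressor yield four codecs, which is consistent with the conversion also adding the required array→bytes codec. A conversion with that shape might look like the sketch below. Everything in it is an assumption: FromV2Sketch and convert/2 are hypothetical, and deriving v3 codec names directly from the v2 atoms is a simplification that may not match the names ExZarr actually emits.

```elixir
defmodule FromV2Sketch do
  # Hypothetical conversion: filters first, then "bytes", then the compressor.
  def convert(filters, compressor) when is_atom(compressor) do
    filter_codecs =
      for {id, opts} <- (filters || []) do
        # Keyword options become the codec's configuration map.
        %{name: Atom.to_string(id), configuration: Map.new(opts)}
      end

    filter_codecs ++ [%{name: "bytes"}] ++ [%{name: Atom.to_string(compressor)}]
  end
end
```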

parse_codecs(codec_specs)

@spec parse_codecs([codec_spec()]) ::
  {:ok, ExZarr.Codecs.PipelineV3.Pipeline.t()} | {:error, term()}

Parses and validates a list of codec specifications.

Validates:

  • At least one codec is present
  • Exactly one array→bytes codec exists
  • Codecs are in the correct order
  • All codec names are recognized

Parameters

  • codec_specs - List of codec specification maps

Returns

  • {:ok, pipeline} - Validated pipeline struct
  • {:error, reason} - Validation failure

Examples

iex> codecs = [
...>   %{name: "bytes"},
...>   %{name: "gzip", configuration: %{level: 5}}
...> ]
iex> {:ok, _pipeline} = ExZarr.Codecs.PipelineV3.parse_codecs(codecs)

iex> codecs = [%{name: "gzip"}]  # Missing required bytes codec
iex> ExZarr.Codecs.PipelineV3.parse_codecs(codecs)
{:error, :missing_array_to_bytes_codec}
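
The structural checks listed under "Validates" can be sketched as a function over the stage of each codec in list order. This is a hedged illustration rather than the library's code: ValidateSketch is a hypothetical module, and every error atom except :missing_array_to_bytes_codec (which appears in the example above) is invented for the sketch. Recognizing codec names is left out.

```elixir
defmodule ValidateSketch do
  # An empty pipeline is rejected (error atom is hypothetical).
  def validate([]), do: {:error, :empty_pipeline}

  def validate(stages) do
    case Enum.count(stages, &(&1 == :array_to_bytes)) do
      0 -> {:error, :missing_array_to_bytes_codec}
      1 -> check_order(stages)
      # Hypothetical error atom for more than one array→bytes codec.
      _ -> {:error, :multiple_array_to_bytes_codecs}
    end
  end

  # Everything before the array→bytes codec must be array→array,
  # and everything after it must be bytes→bytes.
  defp check_order(stages) do
    {head, [:array_to_bytes | tail]} =
      Enum.split_while(stages, &(&1 != :array_to_bytes))

    if Enum.all?(head, &(&1 == :array_to_array)) and
         Enum.all?(tail, &(&1 == :bytes_to_bytes)) do
      :ok
    else
      # Hypothetical error atom for misordered codecs.
      {:error, :invalid_codec_order}
    end
  end
end
```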