ExZarr.Metadata (ExZarr v1.1.0)

Metadata management for Zarr arrays.

Handles the .zarray metadata file that describes an array's structure, data type, compression, and other attributes according to the Zarr v2 specification.

Metadata Structure

Each Zarr array has associated metadata containing:

  • shape: The dimensions of the array
  • chunks: The size of each chunk
  • dtype: The data type of elements
  • compressor: The compression codec used
  • fill_value: Default value for uninitialized elements
  • order: Memory layout order ("C" for row-major)
  • zarr_format: Version of the Zarr specification (2)
  • filters: Optional list of filters applied before compression (nil when none are configured)

Zarr v2 Specification

The metadata is stored as JSON in a .zarray file. ExZarr reads and writes this format for compatibility with other Zarr implementations such as zarr-python (Python) and Zarr.jl (Julia).
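
As a rough illustration of what such a file holds, the sketch below shows a decoded .zarray document as an Elixir map, using the field names from the Zarr v2 specification. The concrete values (the "<f8" dtype string for a little-endian float64, the zlib level) are illustrative assumptions, and how ExZarr itself encodes or decodes these fields is not shown here.

```elixir
# Hypothetical decoded form of a .zarray document (Zarr v2 field names).
# The values are illustrative, not output captured from ExZarr.
zarray = %{
  "zarr_format" => 2,
  "shape" => [1000, 1000],
  "chunks" => [100, 100],
  "dtype" => "<f8",
  "compressor" => %{"id" => "zlib", "level" => 1},
  "fill_value" => 0.0,
  "order" => "C",
  "filters" => nil
}

# JSON has no tuple type, so shape and chunks arrive as lists and
# must be converted if tuples are wanted on the Elixir side.
shape = List.to_tuple(zarray["shape"])
chunks = List.to_tuple(zarray["chunks"])

IO.inspect({shape, chunks})
```

The list-to-tuple step matters because the Metadata struct types shape and chunks as tuples, while JSON can only carry arrays.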

Examples

# Create metadata
config = %{
  shape: {1000, 1000},
  chunks: {100, 100},
  dtype: :float64,
  compressor: :zlib,
  fill_value: 0.0
}
{:ok, metadata} = ExZarr.Metadata.create(config)

# Calculate chunk information
ExZarr.Metadata.num_chunks(metadata)    # => {10, 10}
ExZarr.Metadata.total_chunks(metadata)  # => 100
ExZarr.Metadata.chunk_size_bytes(metadata)  # => 80000 (100*100*8)

Summary

Functions

chunk_size_bytes(metadata)

Returns the size of a chunk in bytes (uncompressed).

create(config)

Creates metadata from a configuration map.

num_chunks(metadata)

Returns the number of chunks along each dimension.

total_chunks(metadata)

Returns the total number of chunks in the array.

validate(metadata)

Validates metadata structure.

Types

t()

@type t() :: %ExZarr.Metadata{
  chunks: tuple(),
  compressor: ExZarr.compressor(),
  dtype: ExZarr.dtype(),
  fill_value: number(),
  filters: list() | nil,
  order: String.t(),
  shape: tuple(),
  zarr_format: integer()
}

Functions

chunk_size_bytes(metadata)

@spec chunk_size_bytes(t()) :: non_neg_integer()

Returns the size of a chunk in bytes (uncompressed).

Calculates the number of bytes required to store one chunk in memory. This is the product of chunk dimensions multiplied by the size of the data type.

Examples

metadata = %ExZarr.Metadata{
  shape: {1000, 1000},
  chunks: {100, 100},
  dtype: :float64,
  # ... other fields
}
ExZarr.Metadata.chunk_size_bytes(metadata)
# => 80000 (100 * 100 * 8 bytes per float64)

metadata = %ExZarr.Metadata{
  shape: {1000},
  chunks: {100},
  dtype: :int32,
  # ... other fields
}
ExZarr.Metadata.chunk_size_bytes(metadata)
# => 400 (100 * 4 bytes per int32)

Returns

Non-negative integer representing uncompressed chunk size in bytes.

Note

Compressed chunks on disk may be smaller depending on the compression codec and data compressibility.
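
The calculation described above can be sketched in plain Elixir. The module name ChunkSizeSketch and its item-size table are hypothetical and cover only a few dtypes; this is an illustration of the arithmetic, not ExZarr's implementation.

```elixir
defmodule ChunkSizeSketch do
  # Byte widths for a few dtypes. This table is an illustrative
  # assumption and is not taken from ExZarr.
  @item_sizes %{int8: 1, int16: 2, int32: 4, int64: 8, float32: 4, float64: 8}

  # Uncompressed size = product of the chunk dimensions * bytes per element.
  def chunk_size_bytes(chunks, dtype) when is_tuple(chunks) do
    elements = chunks |> Tuple.to_list() |> Enum.reduce(1, &(&1 * &2))
    elements * Map.fetch!(@item_sizes, dtype)
  end
end

IO.inspect(ChunkSizeSketch.chunk_size_bytes({100, 100}, :float64))
# => 80000
IO.inspect(ChunkSizeSketch.chunk_size_bytes({100}, :int32))
# => 400
```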

create(config)

@spec create(map()) :: {:ok, t()}

Creates metadata from a configuration map.

Takes a configuration map and creates a Metadata struct with all necessary fields for a Zarr array. Uses defaults for optional fields.

Parameters

  • config - Map with keys :shape, :chunks, :dtype, :compressor, :fill_value, and optionally :filters

Examples

config = %{
  shape: {1000, 1000},
  chunks: {100, 100},
  dtype: :float64,
  compressor: :zlib,
  fill_value: 0.0
}
{:ok, metadata} = ExZarr.Metadata.create(config)

# With filters
config = %{
  shape: {1000},
  chunks: {100},
  dtype: :int64,
  compressor: :zlib,
  filters: [{:delta, [dtype: :int64]}]
}
{:ok, metadata} = ExZarr.Metadata.create(config)

Returns

{:ok, metadata}, where metadata is the initialized Metadata struct.
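
One way to picture "uses defaults for optional fields" is a merge of the caller's map over a map of defaults. The sketch below is a guess at that pattern only: the module CreateSketch is hypothetical, it returns a plain map rather than a Metadata struct, and the default values shown (order "C", zarr_format 2, filters nil) are assumptions drawn from the field list at the top of this page, not confirmed ExZarr behavior.

```elixir
defmodule CreateSketch do
  # Assumed defaults; the values ExZarr actually applies may differ.
  @defaults %{order: "C", zarr_format: 2, filters: nil}

  # Keys supplied by the caller win over the defaults.
  def create(config) when is_map(config) do
    {:ok, Map.merge(@defaults, config)}
  end
end

{:ok, meta} =
  CreateSketch.create(%{shape: {1000}, chunks: {100}, dtype: :int64, compressor: :zlib})

IO.inspect(meta.order)
# => "C"
```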

num_chunks(metadata)

@spec num_chunks(t()) :: tuple()

Returns the number of chunks along each dimension.

Calculates how many chunks are needed in each dimension to cover the entire array. Uses ceiling division to handle arrays that don't divide evenly by chunk size.

Examples

metadata = %ExZarr.Metadata{
  shape: {1000, 1000},
  chunks: {100, 100},
  # ... other fields
}
ExZarr.Metadata.num_chunks(metadata)
# => {10, 10}

metadata = %ExZarr.Metadata{
  shape: {1000, 950},
  chunks: {100, 100},
  # ... other fields
}
ExZarr.Metadata.num_chunks(metadata)
# => {10, 10} (the last chunk along the second dimension covers only 50 elements)

Returns

Tuple with the number of chunks in each dimension.
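
The ceiling division mentioned above can be written with integer arithmetic as div(dim + chunk - 1, chunk). The following is a minimal sketch under that assumption; NumChunksSketch is a hypothetical module operating on bare tuples, not the library's code.

```elixir
defmodule NumChunksSketch do
  # Integer ceiling division per dimension: div(dim + chunk - 1, chunk).
  # The guard requires shape and chunks to have the same number of dimensions.
  def num_chunks(shape, chunks) when tuple_size(shape) == tuple_size(chunks) do
    shape
    |> Tuple.to_list()
    |> Enum.zip(Tuple.to_list(chunks))
    |> Enum.map(fn {dim, chunk} -> div(dim + chunk - 1, chunk) end)
    |> List.to_tuple()
  end
end

IO.inspect(NumChunksSketch.num_chunks({1000, 950}, {100, 100}))
# => {10, 10}
IO.inspect(NumChunksSketch.num_chunks({1000, 1001}, {100, 100}))
# => {10, 11}
```

The second call shows why plain div/2 is not enough: a single extra element along a dimension requires one more chunk.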

total_chunks(metadata)

@spec total_chunks(t()) :: non_neg_integer()

Returns the total number of chunks in the array.

Calculates the product of chunks in all dimensions. This is the total number of chunk files that will be created when the array is fully populated.

Examples

metadata = %ExZarr.Metadata{
  shape: {1000, 1000},
  chunks: {100, 100},
  # ... other fields
}
ExZarr.Metadata.total_chunks(metadata)
# => 100 (10 * 10)

metadata = %ExZarr.Metadata{
  shape: {1000, 1000, 1000},
  chunks: {100, 100, 100},
  # ... other fields
}
ExZarr.Metadata.total_chunks(metadata)
# => 1000 (10 * 10 * 10)

Returns

Non-negative integer representing total chunk count.
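
The product described above can be sketched by multiplying the per-dimension chunk counts in a single pass. TotalChunksSketch is a hypothetical module that works on bare shape and chunks tuples and recomputes the ceiling division inline so that it stands alone.

```elixir
defmodule TotalChunksSketch do
  # Multiplies the per-dimension chunk counts, each obtained by
  # integer ceiling division of the dimension by its chunk length.
  def total_chunks(shape, chunks) when tuple_size(shape) == tuple_size(chunks) do
    shape
    |> Tuple.to_list()
    |> Enum.zip(Tuple.to_list(chunks))
    |> Enum.reduce(1, fn {dim, chunk}, acc -> acc * div(dim + chunk - 1, chunk) end)
  end
end

IO.inspect(TotalChunksSketch.total_chunks({1000, 1000, 1000}, {100, 100, 100}))
# => 1000
IO.inspect(TotalChunksSketch.total_chunks({1000, 950}, {100, 100}))
# => 100
```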

validate(metadata)

@spec validate(t()) :: :ok | {:error, term()}

Validates metadata structure.

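Checks that shape is non-empty, that chunks and shape have the same number of dimensions, that zarr_format is 2, and that every configured filter is registered and correctly configured.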

Examples

ExZarr.Metadata.validate(metadata)
# => :ok

Returns

  • :ok if metadata is valid
  • {:error, :invalid_shape} if shape is empty
  • {:error, :chunks_shape_mismatch} if chunks and shape have a different number of dimensions
  • {:error, :unsupported_zarr_format} if zarr_format is not 2
  • {:error, {:invalid_filter_config, filter_id, reason}} if a filter configuration is invalid
  • {:error, {:unknown_filter, filter_id}} if a filter is not registered
  • {:error, :invalid_filter_format} if filters list is malformed
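
The structural checks in the list above can be sketched with a cond expression. ValidateSketch is a hypothetical module that works on any map with the three relevant keys. It leaves out the filter checks, since those depend on ExZarr's filter registry, and the order in which the checks run is an assumption.

```elixir
defmodule ValidateSketch do
  # Mirrors the first three error cases listed above.
  # Filter validation is deliberately omitted from this sketch.
  def validate(%{shape: shape, chunks: chunks, zarr_format: format}) do
    cond do
      tuple_size(shape) == 0 -> {:error, :invalid_shape}
      tuple_size(shape) != tuple_size(chunks) -> {:error, :chunks_shape_mismatch}
      format != 2 -> {:error, :unsupported_zarr_format}
      true -> :ok
    end
  end
end

IO.inspect(ValidateSketch.validate(%{shape: {1000, 1000}, chunks: {100, 100}, zarr_format: 2}))
# => :ok
IO.inspect(ValidateSketch.validate(%{shape: {1000, 1000}, chunks: {100}, zarr_format: 2}))
# => {:error, :chunks_shape_mismatch}
```

Callers would typically match on the result, for example with a case expression that proceeds on :ok and reports the reason on {:error, reason}.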