ExZarr.Codecs (ExZarr v1.1.0)

Compression and decompression codecs for Zarr arrays.

Supports both built-in and custom codecs through an extensible codec registry.

Built-in Codecs

Provides compression and decompression operations for chunk data using the following codecs:

  • :none - No compression (fastest, but largest storage size)
  • :zlib - Standard zlib compression using Erlang's built-in :zlib module
  • :zstd - Zstandard compression using ezstd (optional dependency)
  • :lz4 - LZ4 compression using nimble_lz4 (optional dependency)
  • :snappy - Snappy compression using snappyer (optional dependency)
  • :blosc - Blosc meta-compressor using Zig NIF (optional)
  • :bzip2 - Bzip2 compression using Zig NIF (optional)
  • :crc32c - CRC32C checksum codec (bytes-to-bytes, adds 4-byte checksum)
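
To make the "adds 4-byte checksum" behaviour concrete, here is a minimal standalone sketch of a bytes-to-bytes checksum codec. It is not ExZarr's implementation: the module name ChecksumSketch, the error atoms, and the little-endian trailer are assumptions, and :erlang.crc32/1 computes plain CRC-32 rather than CRC32C, so it only stands in for the real checksum.

```elixir
defmodule ChecksumSketch do
  # Hypothetical illustration of a checksum codec; not part of ExZarr.

  # Append a 4-byte checksum of `data` to the payload.
  # NOTE: :erlang.crc32/1 is CRC-32 (IEEE), used as a stand-in for CRC32C.
  def encode(data) when is_binary(data) do
    {:ok, <<data::binary, :erlang.crc32(data)::little-unsigned-32>>}
  end

  # Split off the trailing 4 bytes and verify them against the payload.
  def decode(framed) when is_binary(framed) and byte_size(framed) >= 4 do
    payload_size = byte_size(framed) - 4
    payload = binary_part(framed, 0, payload_size)
    <<stored::little-unsigned-32>> = binary_part(framed, payload_size, 4)

    if :erlang.crc32(payload) == stored do
      {:ok, payload}
    else
      {:error, :checksum_mismatch}
    end
  end

  # Anything shorter than 4 bytes cannot contain a checksum.
  def decode(_framed), do: {:error, :too_short}
end
```

The encoded binary is always exactly 4 bytes longer than the input, which is the overhead the list above refers to.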

Compression Performance

Different codecs offer different trade-offs between compression speed, decompression speed, and compression ratio:

  • :none - Fastest (no CPU overhead), largest files
  • :lz4 - Very fast compression and decompression, moderate compression ratio
  • :snappy - Very fast compression, good for real-time data
  • :zlib - Balanced performance and compression ratio (default level 6)
  • :zstd - Best compression ratio, fast decompression, configurable levels
  • :blosc - Meta-compressor with SIMD acceleration, excellent for numerical data
  • :bzip2 - High compression ratio, slower speed
  • :crc32c - Checksum codec for data integrity (not compression, adds 4-byte overhead)
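
The size side of this trade-off can be explored with Erlang's built-in :zlib module alone. The sketch below is an illustration under assumptions, not ExZarr code: LevelDemo is a hypothetical module that deflates a binary at a chosen level (0 stores the data, 9 compresses hardest) and reports the resulting size.

```elixir
defmodule LevelDemo do
  # Hypothetical helper; uses only Erlang's :zlib, not ExZarr.

  # Deflate `data` at the given zlib level (0..9) and return the output size in bytes.
  def deflated_size(data, level) when is_binary(data) and level in 0..9 do
    z = :zlib.open()
    :ok = :zlib.deflateInit(z, level)
    # :finish flushes the whole stream, so the returned iodata is the complete output.
    compressed = :zlib.deflate(z, data, :finish)
    :ok = :zlib.deflateEnd(z)
    :ok = :zlib.close(z)
    IO.iodata_length(compressed)
  end
end

# Compare a few levels on repetitive input.
data = String.duplicate("zarr chunk ", 5_000)

for level <- [0, 1, 6, 9] do
  IO.puts("level #{level}: #{LevelDemo.deflated_size(data, level)} bytes")
end
```

Wrapping the call in :timer.tc/1 gives the matching speed measurement; actual numbers depend heavily on the data and the machine.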

Optional Dependencies

Some codecs require optional dependencies. To use them, add to your mix.exs:

def deps do
  [
    {:ex_zarr, "~> 1.1"},
    {:ezstd, "~> 1.1"},        # For :zstd
    {:nimble_lz4, "~> 0.1.3"}, # For :lz4
    {:snappyer, "~> 1.2"}      # For :snappy
  ]
end

Check codec availability at runtime with codec_available?/1.
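
One way to use such a runtime check is to walk a preference list and settle on the first codec that is usable. The sketch below assumes a hypothetical helper module, CodecChoice, that takes the availability predicate as an argument so it can run without ExZarr.

```elixir
defmodule CodecChoice do
  # Hypothetical helper; not part of ExZarr.

  # Return the first codec in `preferred` for which `available?` returns true,
  # or `default` when none qualifies.
  def pick(preferred, available?, default \\ :zlib)
      when is_list(preferred) and is_function(available?, 1) do
    Enum.find(preferred, default, available?)
  end
end

# Standalone usage with a stand-in predicate:
installed = MapSet.new([:none, :zlib, :lz4])
IO.inspect(CodecChoice.pick([:zstd, :lz4, :snappy], &MapSet.member?(installed, &1)))

# With ExZarr loaded, the predicate would be the documented function:
#   CodecChoice.pick([:zstd, :lz4, :snappy], &ExZarr.Codecs.codec_available?/1)
```

Defaulting to :zlib follows the statement that :zlib is always available.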

Custom Codecs

ExZarr supports custom codecs through the ExZarr.Codecs.Codec behavior. Implement the behavior and register your codec:

defmodule MyApp.CustomCodec do
  @behaviour ExZarr.Codecs.Codec

  def codec_id, do: :my_codec
  def codec_info, do: %{name: "My Codec", version: "1.0", type: :compression}
  def available?, do: true
  def encode(data, _opts), do: {:ok, data}
  def decode(data, _opts), do: {:ok, data}
end

# Register the codec
ExZarr.Codecs.register_codec(MyApp.CustomCodec)

# Use it like built-in codecs
{:ok, encoded} = ExZarr.Codecs.compress(data, :my_codec)

See ExZarr.Codecs.Codec for full documentation on implementing custom codecs.
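
For a codec that does more than pass data through, the following sketch implements byte-wise run-length encoding with the same five functions shown above. Everything here is illustrative: RunLengthSketch, the :rle_sketch ID, and the :invalid_run_length_data error are hypothetical names, and the @behaviour line is left out so the module compiles without ExZarr.

```elixir
defmodule RunLengthSketch do
  # Hypothetical example codec. When ExZarr is a dependency, add
  # `@behaviour ExZarr.Codecs.Codec` here before registering it.

  def codec_id, do: :rle_sketch

  def codec_info do
    %{
      name: "Run-length sketch",
      version: "0.1",
      type: :compression,
      description: "Byte-wise run-length encoding (illustration only)"
    }
  end

  def available?, do: true

  # Encode as a sequence of <<run_length, byte>> pairs, with runs capped at 255.
  def encode(data, _opts) when is_binary(data), do: {:ok, pack(data, <<>>)}

  # Expand each <<run_length, byte>> pair; reject truncated or zero-length runs.
  def decode(data, _opts) when is_binary(data), do: unpack(data, <<>>)

  defp pack(<<>>, acc), do: acc

  defp pack(<<byte, rest::binary>>, acc) do
    {run, remaining} = take_run(rest, byte, 1)
    pack(remaining, <<acc::binary, run, byte>>)
  end

  # Count how many following bytes repeat `byte`, stopping at a run of 255.
  defp take_run(<<next, rest::binary>>, byte, run) when next == byte and run < 255,
    do: take_run(rest, byte, run + 1)

  defp take_run(rest, _byte, run), do: {run, rest}

  defp unpack(<<>>, acc), do: {:ok, acc}

  defp unpack(<<run, byte, rest::binary>>, acc) when run > 0,
    do: unpack(rest, <<acc::binary, :binary.copy(<<byte>>, run)::binary>>)

  defp unpack(_malformed, _acc), do: {:error, :invalid_run_length_data}
end
```

Returning {:error, reason} from decode on malformed input, rather than raising, mirrors the tagged-tuple results that compress and decompress are documented to return.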

Examples

# Compress data with zlib (always available)
{:ok, compressed} = ExZarr.Codecs.compress("hello world", :zlib)

# Decompress data
{:ok, original} = ExZarr.Codecs.decompress(compressed, :zlib)

# Use zstd with compression level
{:ok, compressed} = ExZarr.Codecs.compress("hello world", :zstd, level: 5)

# Use blosc (excellent for numerical data)
{:ok, compressed} = ExZarr.Codecs.compress(float_data, :blosc, level: 9)

# Use CRC32C checksum (adds 4-byte checksum for data integrity)
{:ok, checksummed} = ExZarr.Codecs.compress(data, :crc32c)

# No compression
{:ok, data} = ExZarr.Codecs.compress("hello", :none)
# data == "hello"

# Check codec availability
ExZarr.Codecs.codec_available?(:zlib)   # => true (always)
ExZarr.Codecs.codec_available?(:zstd)   # => true if libzstd is installed
ExZarr.Codecs.codec_available?(:blosc)  # => true if libblosc is installed
ExZarr.Codecs.codec_available?(:crc32c) # => true (always available)

Compatibility Notes

  • All codecs are compatible with zarr-python, provided both sides use the same codec
  • :none, :zlib, and :crc32c are always available (:zlib uses Erlang's built-in module)
  • Other codecs require optional dependencies, or system libraries and compiled Zig NIFs
  • If a codec is not available, compress/3 and decompress/2 return {:error, {:unsupported_codec, codec}}
  • Use codec_available?/1 to check availability before use

Summary

Functions

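available_codecs()

Returns the list of available codecs.

codec_available?(codec)

Checks if a codec is available for use.

codec_info(codec_id)

Gets information about a codec.

compress(data, codec, opts \\ [])

Compresses data using the specified codec.

decompress(data, codec)

Decompresses data using the specified codec.

list_codecs()

Lists all registered codec IDs.

register_codec(codec_module, opts \\ [])

Registers a custom codec.

unregister_codec(codec_id)

Unregisters a custom codec.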

Types

codec()

@type codec() :: atom()

codec_module()

@type codec_module() :: module()

Functions

available_codecs()

@spec available_codecs() :: [codec(), ...]

Returns the list of available codecs.

This function checks which codecs are actually available at runtime. :none and :zlib are always available. Other codecs require the Zig NIFs to be compiled with the system libraries installed.

Examples

ExZarr.Codecs.available_codecs()
# => [:none, :zlib, :zstd, :lz4, :snappy, :blosc, :bzip2]
# (if all system libraries are installed)

Returns

List of codec atoms that can be used with compress/3 and decompress/2.

codec_available?(codec)

@spec codec_available?(codec()) :: boolean()

Checks if a codec is available for use.

Returns true if the codec can be used with compress/3 and decompress/2, false otherwise.

The check is performed at runtime: it tests whether the functions the codec needs are exported from the ZigCodecs module.

Examples

ExZarr.Codecs.codec_available?(:zlib)
# => true

ExZarr.Codecs.codec_available?(:zstd)
# => true (if libzstd is installed and NIFs are compiled)

ExZarr.Codecs.codec_available?(:unknown)
# => false

Parameters

  • codec - Codec atom to check

Returns

Boolean indicating codec availability.

codec_info(codec_id)

@spec codec_info(codec()) :: {:ok, map()} | {:error, :not_found}

Gets information about a codec.

Returns metadata including name, version, type, and description.

Examples

{:ok, info} = ExZarr.Codecs.codec_info(:zstd)
# info => %{
#   name: "Zstandard",
#   version: "1.0.0",
#   type: :compression,
#   description: "Zstandard compression algorithm"
# }

Returns

  • {:ok, map()} - Codec information
  • {:error, :not_found} - Codec not registered

compress(data, codec, opts \\ [])

@spec compress(binary(), codec(), keyword()) :: {:ok, binary()} | {:error, term()}

Compresses data using the specified codec.

Takes binary data and encodes it with the chosen codec. The :none codec returns the data unchanged, :crc32c appends a 4-byte checksum, and the remaining codecs return the compressed binary.

Parameters

  • data - Binary data to compress
  • codec - Compression codec (:none, :zlib, :zstd, :lz4, :snappy, :blosc, :bzip2, or :crc32c)
  • opts - Optional keyword list with compression options:
    • :level - Compression level (codec-specific, typically 1-9)

Examples

# Compress with zlib
{:ok, compressed} = ExZarr.Codecs.compress("hello world", :zlib)

# Compress with zstd at level 5
{:ok, compressed} = ExZarr.Codecs.compress("hello world", :zstd, level: 5)

# No compression
{:ok, same} = ExZarr.Codecs.compress("hello", :none)
# same == "hello"

# Compress arbitrary binary data (random bytes are essentially incompressible)
data = :crypto.strong_rand_bytes(1000)
{:ok, compressed} = ExZarr.Codecs.compress(data, :zlib)

Returns

  • {:ok, compressed_binary} on success
  • {:error, {:unsupported_codec, codec}} if codec is not available
  • {:error, {:compression_failed, reason}} if compression fails
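
These return shapes lend themselves to pattern matching. As a sketch only, the hypothetical helper below retries with a fallback codec when the requested one is unsupported, and reports which codec was actually used so the caller can pass the same one to decompress. It takes the compress function as an argument; CompressFallback and compress_with_fallback are assumed names, not ExZarr API.

```elixir
defmodule CompressFallback do
  # Hypothetical helper; not part of ExZarr.
  # `compress` is any 2-arity function returning the documented result tuples,
  # for example &ExZarr.Codecs.compress/2 when ExZarr is loaded.
  def compress_with_fallback(data, codec, compress, fallback \\ :zlib)
      when is_function(compress, 2) do
    case compress.(data, codec) do
      {:ok, compressed} ->
        {:ok, codec, compressed}

      {:error, {:unsupported_codec, ^codec}} when codec != fallback ->
        # Retry once with the fallback and report which codec produced the bytes.
        with {:ok, compressed} <- compress.(data, fallback) do
          {:ok, fallback, compressed}
        end

      {:error, _reason} = error ->
        error
    end
  end
end
```

Any other error, including {:error, {:compression_failed, reason}}, is passed through unchanged, because retrying with a different codec would hide a real failure.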

decompress(data, codec)

@spec decompress(binary(), codec()) :: {:ok, binary()} | {:error, term()}

Decompresses data using the specified codec.

Takes compressed binary data and decompresses it using the chosen algorithm. The codec must match the one used for compression. The :none codec returns the data unchanged.

Parameters

  • data - Compressed binary data
  • codec - Compression codec (:none, :zlib, :zstd, :lz4, :snappy, :blosc, :bzip2, or :crc32c)

Examples

# Compress and decompress
{:ok, compressed} = ExZarr.Codecs.compress("hello world", :zlib)
{:ok, original} = ExZarr.Codecs.decompress(compressed, :zlib)
# original == "hello world"

# No decompression needed
{:ok, same} = ExZarr.Codecs.decompress("hello", :none)
# same == "hello"

Returns

  • {:ok, decompressed_binary} on success
  • {:error, {:unsupported_codec, codec}} if codec is not available
  • {:error, {:decompression_failed, reason}} if decompression fails

Errors

Decompression will fail if:

  • The data is not valid compressed output for the specified codec
  • The data is corrupted
  • The wrong codec is specified

Notes

For :lz4 and :bzip2, the original size is stored in the first 8 bytes of the compressed data (prepended during compression).
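
A size prefix of this kind can be sketched with binary pattern matching. The module below is hypothetical, and the byte order is an assumption (little-endian here), since the note does not state it. The payload in the usage lines is produced with :zlib purely as a stand-in for an LZ4 or bzip2 payload.

```elixir
defmodule SizePrefixSketch do
  # Hypothetical illustration of an 8-byte size prefix; not ExZarr's code.

  # Prepend the original byte size as a 64-bit unsigned integer (8 bytes).
  # ASSUMPTION: little-endian; the documentation does not specify byte order.
  def frame(original, compressed) when is_binary(original) and is_binary(compressed) do
    <<byte_size(original)::little-unsigned-64, compressed::binary>>
  end

  # Split a framed binary back into the original size and the compressed payload.
  def unframe(<<size::little-unsigned-64, payload::binary>>), do: {:ok, size, payload}
  def unframe(_too_short), do: {:error, :missing_size_prefix}
end

data = String.duplicate("abc", 100)
framed = SizePrefixSketch.frame(data, :zlib.compress(data))
{:ok, size, _payload} = SizePrefixSketch.unframe(framed)
IO.puts("original size recorded in prefix: #{size}")
```

One practical consequence is that :lz4 and :bzip2 output written this way carries 8 extra bytes, so a reader has to strip the prefix before handing the payload to a plain LZ4 or bzip2 decoder.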

list_codecs()

@spec list_codecs() :: [codec()]

Lists all registered codec IDs.

Includes both built-in and custom codecs, regardless of availability.

Examples

ExZarr.Codecs.list_codecs()
# => [:none, :zlib, :crc32c, :zstd, :lz4, :my_codec]

register_codec(codec_module, opts \\ [])

@spec register_codec(
  codec_module(),
  keyword()
) :: :ok | {:error, term()}

Registers a custom codec.

The codec module must implement the ExZarr.Codecs.Codec behavior.

Examples

defmodule MyApp.CustomCodec do
  @behaviour ExZarr.Codecs.Codec

  def codec_id, do: :my_codec
  def codec_info, do: %{name: "My Codec", version: "1.0", type: :compression, description: "..."}
  def available?, do: true
  def encode(data, _opts), do: {:ok, my_encode(data)}
  def decode(data, _opts), do: {:ok, my_decode(data)}

  # Placeholder transforms so the example compiles; replace with real logic.
  defp my_encode(data), do: data
  defp my_decode(data), do: data
end

ExZarr.Codecs.register_codec(MyApp.CustomCodec)
{:ok, encoded} = ExZarr.Codecs.compress(data, :my_codec)

Options

  • :force - Overwrite existing codec with same ID (default: false)

Returns

  • :ok - Codec registered successfully
  • {:error, :already_registered} - Codec ID already in use
  • {:error, :invalid_codec} - Module doesn't implement Codec behavior

unregister_codec(codec_id)

@spec unregister_codec(codec()) :: :ok | {:error, term()}

Unregisters a custom codec.

Built-in codecs cannot be unregistered.

Examples

ExZarr.Codecs.unregister_codec(:my_codec)

Returns

  • :ok - Codec unregistered successfully
  • {:error, :not_found} - Codec not registered
  • {:error, :cannot_unregister_builtin} - Cannot unregister built-in codec
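
The registration rules documented for register_codec/2 and unregister_codec/1 can be summarised in a small pure-functional model. This is a sketch of the semantics only: MiniRegistry is hypothetical, it threads an explicit map instead of holding state (so it returns {:ok, registry} where the real functions return :ok), and its built-in list is shortened for brevity.

```elixir
defmodule MiniRegistry do
  # Hypothetical model of the documented semantics; not ExZarr's registry.
  @builtin [:none, :zlib]

  # Start with the built-in codec IDs already present.
  def new, do: Map.new(@builtin, &{&1, :builtin})

  # Refuse to overwrite an existing ID unless `force: true` is given.
  def register(registry, id, module, opts \\ []) do
    if Map.has_key?(registry, id) and not Keyword.get(opts, :force, false) do
      {:error, :already_registered}
    else
      {:ok, Map.put(registry, id, module)}
    end
  end

  # Built-ins are protected; unknown IDs report :not_found.
  def unregister(registry, id) do
    cond do
      id in @builtin -> {:error, :cannot_unregister_builtin}
      Map.has_key?(registry, id) -> {:ok, Map.delete(registry, id)}
      true -> {:error, :not_found}
    end
  end
end
```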