Preprocessing pipelines for ML model inputs.
Provides standardized preprocessing for common input types: images, text, and audio. The transformation functions return Nx tensors ready for model consumption; the loaders return tagged tuples, and to_f32_binary returns a binary.
Image Preprocessing
# Standard ImageNet preprocessing
# load_image/1 returns {:ok, tensor} on success, so unwrap it before piping.
{:ok, image} = Dala.ML.Preprocess.load_image(image_path)
tensor =
  image
  |> Dala.ML.Preprocess.resize({224, 224})
  |> Dala.ML.Preprocess.normalize(:imagenet)
  |> Dala.ML.Preprocess.to_batch()
Audio Preprocessing
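# load_audio/1 returns {:ok, {samples, sample_rate}}; mel_spectrogram/2 takes the samples tensor.
{:ok, {samples, _sample_rate}} = Dala.ML.Preprocess.load_audio(audio_path)
spectrogram = Dala.ML.Preprocess.mel_spectrogram(samples, sample_rate: 16000)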
Summary
Functions
load_audio/1: Loads audio from a file path.
load_image/1: Loads an image from a file path and returns an Nx tensor of shape {height, width, 3} with values 0..255.
mel_spectrogram/2: Computes a mel spectrogram from audio samples.
normalize/2: Normalizes a tensor with standard normalization schemes.
resize/2: Resizes an image tensor to the target size.
to_batch/1: Adds a batch dimension to a tensor (shape {...} → {1, ...}).
to_f32_binary/1: Converts an Nx tensor to a binary of f32 values for ONNX input.
Functions
@spec load_audio(String.t()) :: {:ok, {Nx.Tensor.t(), pos_integer()}} | {:error, term()}
Loads audio from a file path.
Returns {:ok, {samples_tensor, sample_rate}}.
@spec load_image(String.t()) :: {:ok, Nx.Tensor.t()} | {:error, term()}
Loads an image from a file path and returns a tensor.
On success, returns {:ok, tensor}, where tensor is an Nx tensor of shape {height, width, 3} with values 0..255.
@spec mel_spectrogram(Nx.Tensor.t(), keyword()) :: Nx.Tensor.t() | {:ok, Nx.Tensor.t()}
Computes a mel spectrogram from audio samples.
Options
:sample_rate - Audio sample rate (default: 16000)
:n_fft - FFT size (default: 400)
:n_mels - Number of mel bands (default: 80)
:hop_length - Hop length (default: 160)
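To make the defaults concrete, the sketch below works out the window and hop durations and the number of spectrogram frames they imply. The module name MelFrames and its functions are hypothetical helpers, not part of Dala.ML.Preprocess, and whether the library pads the signal before framing is not stated here, so both common conventions are shown as assumptions.

```elixir
defmodule MelFrames do
  # Defaults copied from the documented options of mel_spectrogram.
  @defaults [sample_rate: 16_000, n_fft: 400, n_mels: 80, hop_length: 160]

  # Returns {window_ms, hop_ms}: the duration of one FFT window and one hop.
  def durations_ms(opts \\ []) do
    opts = Keyword.merge(@defaults, opts)
    sr = opts[:sample_rate]
    {opts[:n_fft] * 1000 / sr, opts[:hop_length] * 1000 / sr}
  end

  # Frame count ASSUMING no padding: only windows that fit entirely are counted.
  def frames_unpadded(n_samples, opts \\ []) do
    opts = Keyword.merge(@defaults, opts)

    if n_samples < opts[:n_fft] do
      0
    else
      div(n_samples - opts[:n_fft], opts[:hop_length]) + 1
    end
  end

  # Frame count ASSUMING centered padding of n_fft / 2 samples on each side.
  def frames_centered(n_samples, opts \\ []) do
    opts = Keyword.merge(@defaults, opts)
    div(n_samples, opts[:hop_length]) + 1
  end
end

# One second of 16 kHz audio with the default options.
IO.inspect(MelFrames.durations_ms())          # {25.0, 10.0}
IO.inspect(MelFrames.frames_unpadded(16_000)) # 98
IO.inspect(MelFrames.frames_centered(16_000)) # 101
```

With the defaults, each frame covers 25 ms of audio and consecutive frames start 10 ms apart; the spectrogram then has :n_mels values per frame.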
@spec normalize(Nx.Tensor.t(), atom() | {list(), list()}) :: Nx.Tensor.t()
Normalizes a tensor with standard normalization schemes.
Schemes
:imagenet - ImageNet mean/std normalization
:minmax - Scale to [0, 1]
:standard - Zero mean, unit variance
{mean, std} - Custom normalization
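As a rough illustration of what these schemes compute, here is a minimal sketch written directly against Nx. It is not the library's implementation: the module name NormalizeSketch is hypothetical, the mean and std values are the commonly used ImageNet channel statistics, and scaling 0..255 pixels to 0..1 before applying them is an assumption.

```elixir
Mix.install([{:nx, "~> 0.7"}])

defmodule NormalizeSketch do
  # Commonly used ImageNet per-channel statistics (RGB, for values in 0..1).
  @imagenet_mean [0.485, 0.456, 0.406]
  @imagenet_std [0.229, 0.224, 0.225]

  # ASSUMPTION: pixels arrive as 0..255 and are scaled to 0..1 first.
  def normalize(tensor, :imagenet) do
    tensor
    |> Nx.divide(255.0)
    |> normalize({@imagenet_mean, @imagenet_std})
  end

  # Scales values to [0, 1]; a constant tensor would divide by zero.
  def normalize(tensor, :minmax) do
    min = Nx.reduce_min(tensor)
    max = Nx.reduce_max(tensor)
    Nx.divide(Nx.subtract(tensor, min), Nx.subtract(max, min))
  end

  # Zero mean, unit variance over the whole tensor.
  def normalize(tensor, :standard) do
    centered = Nx.subtract(tensor, Nx.mean(tensor))
    Nx.divide(centered, Nx.standard_deviation(tensor))
  end

  # Custom per-channel statistics, broadcast over the trailing channel axis.
  def normalize(tensor, {mean, std}) when is_list(mean) and is_list(std) do
    Nx.divide(Nx.subtract(tensor, Nx.tensor(mean)), Nx.tensor(std))
  end
end

# A single RGB pixel, shape {1, 1, 3}.
pixels = Nx.tensor([[[0, 128, 255]]], type: :u8)
scaled = NormalizeSketch.normalize(pixels, :minmax)
imagenet = NormalizeSketch.normalize(pixels, :imagenet)
IO.inspect(Nx.shape(imagenet))
```

The per-channel lists broadcast because the channel axis is last, which matches the {height, width, 3} layout described for load_image.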
@spec resize(Nx.Tensor.t(), {pos_integer(), pos_integer()}) :: Nx.Tensor.t()
Resizes an image tensor to the target size.
size is a tuple {height, width}.
@spec to_batch(Nx.Tensor.t()) :: Nx.Tensor.t()
Adds a batch dimension to a tensor (shape {...} → {1, ...}).
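The shape change can be reproduced with plain Nx. This sketch assumes the batch axis is added at position 0, as the shape notation above suggests, and uses Nx.new_axis for illustration; the library may implement it differently.

```elixir
Mix.install([{:nx, "~> 0.7"}])

# A placeholder image tensor with the {height, width, 3} layout.
image = Nx.iota({224, 224, 3}, type: :f32)

# Insert a new leading axis of size 1: {224, 224, 3} -> {1, 224, 224, 3}.
batched = Nx.new_axis(image, 0)

IO.inspect(Nx.shape(batched))
```

Models that expect a batch of inputs generally reject an unbatched tensor, so this step comes last in the image pipeline.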
@spec to_f32_binary(Nx.Tensor.t()) :: binary()
Converts an Nx tensor to a binary of f32 values for ONNX input.
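For orientation, a conversion of this kind can be sketched with Nx.as_type and Nx.to_binary. This is an assumption about the general approach, not the library's code; note that Nx.to_binary writes values in the machine's native byte order.

```elixir
Mix.install([{:nx, "~> 0.7"}])

# An integer tensor with four elements.
tensor = Nx.tensor([[1, 2], [3, 4]])

# Cast to 32-bit floats, then dump the elements in row-major order.
binary = tensor |> Nx.as_type(:f32) |> Nx.to_binary()

# Four elements at 4 bytes each.
IO.inspect(byte_size(binary))

# Decode the first element back out of the binary.
<<first::float-32-native, _rest::binary>> = binary
IO.inspect(first)
```

The binary carries no shape information, so the caller has to pass the tensor's shape to the ONNX runtime separately.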