Kernel execution and registry for GPU compute.
Kernels are identified by atoms and executed on the GPU via CubeCL. This module provides a registry for custom kernels and helpers for common operations.
Built-in Kernels
| Kernel | Description |
|---|---|
| `:elementwise_add` | Elementwise addition (2 inputs) |
| `:elementwise_mul` | Elementwise multiplication (2 inputs) |
| `:scalar_mul` | Scalar multiplication (1 input + scalar param) |
| `:relu` | ReLU activation (1 input) |
| `:sigmoid` | Sigmoid activation (1 input) |
| `:matmul` | Matrix multiplication (2 inputs) |
| `:blur` | Gaussian blur (1 input + radius/sigma params) |
| `:sharpen` | Sharpen filter (1 input) |
| `:grayscale` | RGB to grayscale (1 input) |
| `:lut` | Color LUT transform (1 input + lut param) |
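To make the arithmetic kernels in the table concrete, here is a minimal CPU reference in plain Elixir that operates on flat lists of floats. The module name `KernelReference` is hypothetical and not part of Dala; it only illustrates what the corresponding kernels are expected to compute and says nothing about how the GPU versions are implemented.

```elixir
defmodule KernelReference do
  # Hypothetical CPU reference for a few built-in kernels.
  # Inputs are flat lists of floats standing in for GPU buffers.

  # :elementwise_add, 2 inputs of equal length
  def elementwise_add(a, b) when length(a) == length(b),
    do: Enum.zip_with(a, b, &(&1 + &2))

  # :elementwise_mul, 2 inputs of equal length
  def elementwise_mul(a, b) when length(a) == length(b),
    do: Enum.zip_with(a, b, &(&1 * &2))

  # :scalar_mul, 1 input plus a scalar parameter
  def scalar_mul(a, scalar), do: Enum.map(a, &(&1 * scalar))

  # :relu, max(x, 0) applied to every element
  def relu(a), do: Enum.map(a, &max(&1, 0.0))

  # :sigmoid, 1 / (1 + e^-x) applied to every element
  def sigmoid(a), do: Enum.map(a, &(1.0 / (1.0 + :math.exp(-&1))))
end
```

A reference like this can serve as an oracle when comparing GPU results in tests, provided the comparison allows for floating-point tolerance.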
Custom Kernels
Register custom kernels at compile time:
    defmodule MyKernels do
      use Dala.Gpu.Compute.Kernel

      kernel :custom_blur do
        """
        // CubeCL kernel code
        fn input: Tensor<f32>, output: Tensor<f32>, params: Map {
          // ...
        }
        """
      end
    end

Execution Model
Kernels run on a dirty CPU scheduler to avoid blocking the BEAM. On iOS, kernels compile to Metal shaders. On Android, they compile to OpenGL ES compute shaders. On desktop (dev), a CPU fallback is used.
EXCubeCL 0.3+ Compatibility
Dala uses atom kernel names (:elementwise_add) internally and
translates to EXCubeCL string names ("elementwise_add") at the
boundary. The run/4 and async_run/4 functions accept both
atoms and strings.
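The translation at the boundary can be sketched as a small normalization function. The module and function names below (`KernelName.to_backend/1`) are hypothetical and chosen for illustration; the actual helper Dala uses is not shown in this documentation.

```elixir
defmodule KernelName do
  # Hypothetical helper: normalize a kernel name to the string form
  # that the backend is described as expecting.

  # Atom names such as :elementwise_add become "elementwise_add".
  def to_backend(name) when is_atom(name), do: Atom.to_string(name)

  # String names are already in backend form and pass through unchanged.
  def to_backend(name) when is_binary(name), do: name
end
```

Converting atoms to strings in this direction is safe; the reverse direction would need care (for example `String.to_existing_atom/1`) to avoid creating atoms from untrusted input.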
Types
@type kernel_spec() :: %{
  name: atom(),
  source: String.t(),
  inputs: non_neg_integer(),
  params: [atom()]
}
Functions
@spec async_run(
  atom(),
  [Dala.Gpu.Compute.Buffer.t()],
  Dala.Gpu.Compute.Buffer.t(),
  map()
) :: non_neg_integer()
Run a kernel asynchronously. Returns a command ID.
@spec clear_registry() :: :ok
Clear all registered kernels.
@spec init_registry() :: :ok
Initialize the kernel registry ETS table.
@spec list() :: [atom()]
List all registered kernel names.
@spec lookup(atom()) :: {:ok, kernel_spec()} | :error
Look up a registered kernel.
Register a custom kernel at runtime.
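The registry functions above map naturally onto a named ETS table. The following is a minimal sketch of that pattern, not Dala's implementation: the module `RegistrySketch`, the table name, and the `register/1` signature (taking a `kernel_spec()` map) are all assumptions made for illustration, since the source does not give a spec for the runtime registration function.

```elixir
defmodule RegistrySketch do
  # Hypothetical stand-in for an ETS-backed kernel registry.
  @table :kernel_registry_sketch

  # Create the named table once; later calls are no-ops.
  def init_registry do
    if :ets.whereis(@table) == :undefined do
      :ets.new(@table, [:named_table, :public, :set, read_concurrency: true])
    end

    :ok
  end

  # Assumed signature: store a kernel_spec() map keyed by its :name.
  def register(%{name: name} = spec) when is_atom(name) do
    true = :ets.insert(@table, {name, spec})
    :ok
  end

  # Mirrors the documented return shape: {:ok, spec} | :error.
  def lookup(name) when is_atom(name) do
    case :ets.lookup(@table, name) do
      [{^name, spec}] -> {:ok, spec}
      [] -> :error
    end
  end

  # Return only the keys (kernel names); order is unspecified for a :set table.
  def list, do: :ets.select(@table, [{{:"$1", :_}, [], [:"$1"]}])

  # Remove every entry but keep the table itself.
  def clear_registry do
    true = :ets.delete_all_objects(@table)
    :ok
  end
end
```

In this sketch the table is owned by whichever process first calls `init_registry/0` and disappears when that process exits, so a long-lived process such as an application supervisor would be the usual place to call it.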
@spec run(atom(), [Dala.Gpu.Compute.Buffer.t()], Dala.Gpu.Compute.Buffer.t(), map()) :: :ok | {:error, term()}
Run a named kernel synchronously.
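A caller will usually want to handle both return shapes of `run/4`. The wrapper below is a sketch under a few assumptions: that `run/4` lives in `Dala.Gpu.Compute.Kernel` (inferred from the `use` line in the custom kernel example), that `:relu` takes an empty params map, and that the input and output buffers have already been created elsewhere. `RunSketch` and `relu!/3` are hypothetical names. The kernel module is passed as an argument so the wrapper can be exercised with a stub in tests.

```elixir
defmodule RunSketch do
  # Assumed location of run/4; not confirmed by the spec listing itself.
  @default_kernel_mod Dala.Gpu.Compute.Kernel

  # Run :relu synchronously and return the output buffer,
  # raising if the kernel reports an error.
  def relu!(input, output, kernel_mod \\ @default_kernel_mod) do
    case kernel_mod.run(:relu, [input], output, %{}) do
      :ok -> output
      {:error, reason} -> raise "kernel :relu failed: #{inspect(reason)}"
    end
  end
end
```

Raising on `{:error, reason}` is a design choice for this sketch; code that needs to recover from kernel failures would return the tagged tuple to its caller instead.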