Dala.Gpu.Compute.Kernel (dala v0.5.0)


Kernel execution and registry for GPU compute.

Kernels are identified by atoms and executed on the GPU via CubeCL. This module provides a registry for custom kernels and helpers for common operations.

Built-in Kernels

| Kernel | Description |
|---|---|
| :elementwise_add | Elementwise addition (2 inputs) |
| :elementwise_mul | Elementwise multiplication (2 inputs) |
| :scalar_mul | Scalar multiplication (1 input + scalar param) |
| :relu | ReLU activation (1 input) |
| :sigmoid | Sigmoid activation (1 input) |
| :matmul | Matrix multiplication (2 inputs) |
| :blur | Gaussian blur (1 input + radius/sigma params) |
| :sharpen | Sharpen filter (1 input) |
| :grayscale | RGB to grayscale (1 input) |
| :lut | Color LUT transform (1 input + lut param) |

Custom Kernels

Register custom kernels at compile time:

defmodule MyKernels do
  use Dala.Gpu.Compute.Kernel

  kernel :custom_blur do
    """
    // CubeCL kernel code
    fn input: Tensor<f32>, output: Tensor<f32>, params: Map {
      // ...
    }
    """
  end
end

Execution Model

Kernels run on a dirty CPU scheduler so that long-running work does not block the BEAM's normal schedulers. On iOS, kernels compile to Metal shaders; on Android, they compile to OpenGL ES compute shaders. On desktop (dev), a CPU fallback is used.

EXCubeCL 0.3+ Compatibility

Dala uses atom kernel names (:elementwise_add) internally and translates them to EXCubeCL string names ("elementwise_add") at the boundary. The run/4 and async_run/4 functions accept both atoms and strings.
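The atom-to-string translation described above can be illustrated with a minimal sketch. The module and function names here (`KernelNameSketch`, `to_backend_name/1`) are hypothetical and are not part of Dala; the sketch only assumes that the backend wants a plain string and that callers may pass either form.

```elixir
defmodule KernelNameSketch do
  # Hypothetical helper, not Dala's actual implementation.
  # Normalizes a kernel name to the string form used at the EXCubeCL boundary.

  @spec to_backend_name(atom() | String.t()) :: String.t()
  def to_backend_name(name) when is_atom(name), do: Atom.to_string(name)
  def to_backend_name(name) when is_binary(name), do: name
end

KernelNameSketch.to_backend_name(:elementwise_add)
# => "elementwise_add"
KernelNameSketch.to_backend_name("elementwise_add")
# => "elementwise_add"
```

Accepting both forms in one place keeps atoms as the internal identifier while letting string names pass through unchanged.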

Summary

Functions

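async_run(kernel, inputs, output, params \\ %{})

Run a kernel asynchronously. Returns a command ID.

clear_registry()

Clear all registered kernels.

init_registry()

Initialize the kernel registry ETS table.

list()

List all registered kernel names.

lookup(name)

Look up a registered kernel.

register(name, source, opts \\ [])

Register a custom kernel at runtime.

run(kernel, inputs, output, params \\ %{})

Run a named kernel synchronously.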

Types

kernel_spec()

@type kernel_spec() :: %{
  name: atom(),
  source: String.t(),
  inputs: non_neg_integer(),
  params: [atom()]
}
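As an illustration of the shape this type describes, the sketch below builds one spec map and checks it against the four documented fields. `KernelSpecSketch` is a hypothetical helper, and the `:scalar` parameter name and the placeholder source string are assumptions made for the example, not values taken from Dala.

```elixir
defmodule KernelSpecSketch do
  # Hypothetical validator: checks the four documented keys and their types.
  # Extra keys in the map are not rejected.
  def valid?(%{name: name, source: source, inputs: inputs, params: params})
      when is_atom(name) and is_binary(source) and is_integer(inputs) and
             inputs >= 0 and is_list(params) do
    Enum.all?(params, &is_atom/1)
  end

  def valid?(_other), do: false
end

# Assumed example values; the real :scalar_mul spec may differ.
spec = %{
  name: :scalar_mul,
  source: "// kernel source goes here",
  inputs: 1,
  params: [:scalar]
}

KernelSpecSketch.valid?(spec)
# => true
```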

Functions

async_run(kernel, inputs, output, params \\ %{})

Run a kernel asynchronously. Returns a command ID.

clear_registry()

@spec clear_registry() :: :ok

Clear all registered kernels.

init_registry()

@spec init_registry() :: :ok

Initialize the kernel registry ETS table.

list()

@spec list() :: [atom()]

List all registered kernel names.

lookup(name)

@spec lookup(atom()) :: {:ok, kernel_spec()} | :error

Look up a registered kernel.

register(name, source, opts \\ [])

@spec register(atom(), String.t(), keyword()) :: :ok | {:error, term()}

Register a custom kernel at runtime.
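To make the documented return shapes of the registry functions above concrete, here is a minimal ETS-backed sketch with the same function names and arities. It is a stand-in under stated assumptions, not Dala's implementation: the table name, the ETS options, and the `:inputs` and `:params` option keys (with their defaults) are all hypothetical.

```elixir
defmodule RegistrySketch do
  # Hypothetical stand-in for the kernel registry; the table name is an assumption.
  @table :kernel_registry_sketch

  # Creates the named table once. The table is owned by the calling process
  # and disappears when that process exits.
  def init_registry do
    if :ets.whereis(@table) == :undefined do
      :ets.new(@table, [:named_table, :set, :public, read_concurrency: true])
    end

    :ok
  end

  # The :inputs and :params option keys and their defaults are assumptions.
  def register(name, source, opts \\ []) when is_atom(name) and is_binary(source) do
    spec = %{
      name: name,
      source: source,
      inputs: Keyword.get(opts, :inputs, 1),
      params: Keyword.get(opts, :params, [])
    }

    true = :ets.insert(@table, {name, spec})
    :ok
  end

  def lookup(name) do
    case :ets.lookup(@table, name) do
      [{^name, spec}] -> {:ok, spec}
      [] -> :error
    end
  end

  # Returns only the keys; order is unspecified for a :set table.
  def list do
    :ets.select(@table, [{{:"$1", :_}, [], [:"$1"]}])
  end

  def clear_registry do
    true = :ets.delete_all_objects(@table)
    :ok
  end
end

RegistrySketch.init_registry()
RegistrySketch.register(:custom_blur, "// CubeCL kernel code", inputs: 1, params: [:radius])
RegistrySketch.lookup(:custom_blur)
RegistrySketch.lookup(:missing)
# => :error
```

Because `lookup/1` returns `{:ok, spec} | :error` rather than raising, callers can pattern match on the result to handle unknown kernel names.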

run(kernel, inputs, output, params \\ %{})

@spec run(atom(), [Dala.Gpu.Compute.Buffer.t()], Dala.Gpu.Compute.Buffer.t(), map()) ::
  :ok | {:error, term()}

Run a named kernel synchronously.
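A caller will usually want to branch on the `:ok | {:error, term()}` result from the spec above. The sketch below wraps that call in a hypothetical `RunSketch.run_checked/5`; the kernel module is passed as an argument so the example can run without a GPU, and `FakeKernel` is a stand-in that only mimics the documented return shape. How buffers are created is not covered here, so plain atoms stand in for `Dala.Gpu.Compute.Buffer.t()` values.

```elixir
defmodule RunSketch do
  # Hypothetical wrapper. `kernel_mod` defaults to the documented module but
  # can be replaced by any module exporting run/4 with the same return shape.
  def run_checked(kernel, inputs, output, params \\ %{}, kernel_mod \\ Dala.Gpu.Compute.Kernel) do
    case kernel_mod.run(kernel, inputs, output, params) do
      :ok -> {:ok, output}
      {:error, reason} -> {:error, {:kernel_failed, kernel, reason}}
    end
  end
end

defmodule FakeKernel do
  # Stand-in for illustration only; it performs no computation.
  def run(:elementwise_add, [_a, _b], _output, _params), do: :ok
  def run(_kernel, _inputs, _output, _params), do: {:error, :unsupported}
end

# Atoms stand in for real buffers in this sketch.
RunSketch.run_checked(:elementwise_add, [:buf_a, :buf_b], :buf_out, %{}, FakeKernel)
# => {:ok, :buf_out}
```

Tagging the error with the kernel name makes failures easier to trace when several kernels are chained.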