Public API for ONNX Runtime inference.
Provides cross-platform ML inference via ONNX Runtime with hardware acceleration on both iOS and Android:
- iOS: CoreML Execution Provider → Apple Neural Engine
- Android: NNAPI Execution Provider → Qualcomm Hexagon / MediaTek APU
- Fallback: CPU execution on all platforms
All NIF functions run on dirty CPU schedulers, so long-running inference calls do not block the BEAM's normal schedulers.
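Dirty schedulers keep the VM's normal schedulers free, but the calling process still waits until the NIF returns. A minimal sketch of one way to bound that wait from the caller's side, using the standard `Task` module. `InferenceTask` and `run_with_timeout/2` are hypothetical names and are not part of `Dala.ML.ONNX`.

```elixir
defmodule InferenceTask do
  # Hypothetical helper, not part of Dala.ML.ONNX.
  # `infer` is any zero-arity function, for example:
  #   fn -> Dala.ML.ONNX.run(session_id, input_binary) end
  def run_with_timeout(infer, timeout_ms) when is_function(infer, 0) do
    task = Task.async(infer)

    # Wait up to timeout_ms; if there is no reply yet, shut the task down.
    case Task.yield(task, timeout_ms) || Task.shutdown(task) do
      {:ok, result} -> result
      {:exit, reason} -> {:error, reason}
      nil -> {:error, :timeout}
    end
  end
end
```

The `{:error, :timeout}` shape is an assumption chosen for this sketch; the module's own error terms are not documented here.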
Usage
# Load model from file
{:ok, session_id} = Dala.ML.ONNX.load_model_from_file("model.onnx")
# Or from binary data
{:ok, session_id} = Dala.ML.ONNX.create_session(model_binary)
# Run inference
{:ok, output} = Dala.ML.ONNX.run(session_id, input_binary)
# Clean up
:ok = Dala.ML.ONNX.destroy_session(session_id)
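The usage above destroys the session explicitly. A sketch of a wrapper that guarantees cleanup even when the inference code raises might look like the following. `SessionHelper` and `with_session/3` are hypothetical names, and the shape of error results from `load_model_from_file` is an assumption, since only the `{:ok, session_id}` case is documented.

```elixir
defmodule SessionHelper do
  # Hypothetical helper, not part of Dala.ML.ONNX.
  # `onnx` is the module to call; it defaults to Dala.ML.ONNX and can be
  # swapped for a stub in tests.
  def with_session(path, fun, onnx \\ Dala.ML.ONNX) when is_function(fun, 1) do
    case onnx.load_model_from_file(path) do
      {:ok, session_id} ->
        try do
          fun.(session_id)
        after
          # Runs even if `fun` raises, so the session is not leaked.
          onnx.destroy_session(session_id)
        end

      # Assumption: failures come back as some term other than {:ok, _},
      # for example {:error, reason}. Pass it through unchanged.
      other ->
        other
    end
  end
end
```

A call would then look like `SessionHelper.with_session("model.onnx", fn id -> Dala.ML.ONNX.run(id, input_binary) end)`.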
Summary
Functions
available?()
Check if ONNX Runtime NIF is available.
create_session(model_binary)
Create an ONNX inference session from model binary data.
destroy_session(session_id)
Destroy an ONNX session and free associated resources.
load_model_from_file(path)
Load an ONNX model from a file path and create a session.
run(session_id, input_binary)
Run inference on a session with the given input binary data.
runtime_available?()
Check if ONNX Runtime is available and initialized on this platform.
session_count()
Return the number of active ONNX sessions.
Functions
@spec available?() :: boolean()
Check if ONNX Runtime NIF is available.
create_session(model_binary)
Create an ONNX inference session from model binary data.
Returns {:ok, session_id} on success.
destroy_session(session_id)
Destroy an ONNX session and free associated resources.
Returns :ok on success.
load_model_from_file(path)
Load an ONNX model from a file path and create a session.
Returns {:ok, session_id} on success.
run(session_id, input_binary)
Run inference on a session with the given input binary data.
Input must be a binary of packed f32 values laid out in the shape the model expects.
Returns {:ok, output_binary}, where output_binary is a binary of f32 values.
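Because inputs and outputs are raw f32 binaries, callers need to pack and unpack floats themselves. A minimal sketch using Elixir bitstring syntax follows. The `F32` module is a hypothetical helper, and native byte order is an assumption, since the documentation does not state the expected endianness.

```elixir
defmodule F32 do
  # Hypothetical helper, not part of Dala.ML.ONNX.

  # Pack a flat list of numbers into consecutive 32-bit floats.
  # Assumption: native byte order; the source does not state endianness.
  def encode(values) when is_list(values) do
    for v <- values, into: <<>>, do: <<v::float-32-native>>
  end

  # Unpack a binary of 32-bit floats back into a list of Elixir floats.
  # The guard rejects binaries whose size is not a multiple of 4 bytes.
  def decode(binary) when is_binary(binary) and rem(byte_size(binary), 4) == 0 do
    for <<v::float-32-native <- binary>>, do: v
  end
end
```

The number of values must equal the product of the model's input dimensions. For a hypothetical 1x3x224x224 input, that is 150,528 floats, or 602,112 bytes.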
@spec runtime_available?() :: boolean() | :not_supported
Check if ONNX Runtime is available and initialized on this platform.
Per the spec, this may return :not_supported instead of a boolean.
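Since the spec allows three results (`true`, `false`, or `:not_supported`), a plain `if` would treat `:not_supported` as truthy. A sketch of a helper that makes each case explicit is shown below. `Availability` and its error atoms are hypothetical names chosen for illustration.

```elixir
defmodule Availability do
  # Hypothetical helper, not part of Dala.ML.ONNX.
  # Maps the three possible results of runtime_available?/0 onto
  # an :ok / {:error, reason} shape. The reason atoms are assumptions.
  def check(true), do: :ok
  def check(false), do: {:error, :runtime_unavailable}
  def check(:not_supported), do: {:error, :not_supported}
end
```

It would be called as `Availability.check(Dala.ML.ONNX.runtime_available?())` before loading a model.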
@spec session_count() :: integer() | :not_supported
Return the number of active ONNX sessions.
Per the spec, this may return :not_supported instead of a count.