Status: Early development. Not yet ready for production use.
ExCubecl
ExCubecl is an Nx backend powered by CubeCL via Rust NIFs. Tensor operations run on the CPU today; GPU acceleration via CubeCL is planned but not yet available (see GPU Support below).
Features
- Nx Backend: Full integration with the Nx tensor library
- Rust NIFs: High-performance tensor operations via Rust
- Mobile Support: C FFI layer for iOS (Objective-C/Swift) and Android (JNI)
- Graceful Fallback: Operations not yet implemented in the NIF layer fall back to Nx.BinaryBackend
- Type Support: f32, f64, s32, s64, u32, u8
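As a minimal sketch of the type support listed above, the loop below creates one small tensor per supported type and prints the element type Nx reports for it. It assumes `ExCubecl.Backend` is installed as described in this README and uses only the standard `Nx.tensor/2` and `Nx.type/1` functions; the variable names are illustrative.

```elixir
# Sketch only: assumes :ex_cubecl is installed and ExCubecl.Backend loads.
# The types are the ones listed above, written as Nx type tuples.
supported_types = [{:f, 32}, {:f, 64}, {:s, 32}, {:s, 64}, {:u, 32}, {:u, 8}]

for type <- supported_types do
  tensor = Nx.tensor([1, 2, 3], type: type, backend: ExCubecl.Backend)
  # Nx.type/1 returns the element type tuple of the tensor.
  IO.inspect(Nx.type(tensor), label: "created tensor of type")
end
```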
Installation
Add ex_cubecl to your list of dependencies in mix.exs:
def deps do
[
{:ex_cubecl, "~> 0.1.0"}
]
end
Quick Start
# Create tensors
a = Nx.tensor([1.0, 2.0, 3.0], backend: ExCubecl.Backend)
b = Nx.tensor([4.0, 5.0, 6.0], backend: ExCubecl.Backend)
# Basic operations
Nx.add(a, b) # [5.0, 7.0, 9.0]
Nx.multiply(a, b) # [4.0, 10.0, 18.0]
Nx.sum(a) # 6.0
# Shape operations
Nx.reshape(a, {3, 1})
Nx.transpose(Nx.tensor([[1.0, 2.0], [3.0, 4.0]], backend: ExCubecl.Backend))
# Reductions
Nx.sum(a, axes: [0])
Nx.argmax(a)
# Type conversion
Nx.as_type(a, {:s, 32})
# Transfer to/from other backends
binary = Nx.to_binary(a)
Nx.from_binary(binary, {:f, 32}, backend: ExCubecl.Backend)
Supported Operations
| Category | Operations |
|---|---|
| Binary | add, subtract, multiply, divide, pow, remainder, atan2, min, max, quotient, bitwise_and, bitwise_or, bitwise_xor, left_shift, right_shift |
| Comparison | equal, not_equal, greater, less, greater_equal, less_equal, logical_and, logical_or, logical_xor |
| Unary | negate, abs, exp, log, sqrt, sin, cos, tan, sigmoid, relu, expm1, log1p, cosh, sinh, tanh, acos, asin, atan, acosh, asinh, atanh, rsqrt, cbrt, erf, erfc, erf_inv, bitwise_not, ceil, floor, round, sign, conjugate, count_leading_zeros, population_count, real, imag, is_nan, is_infinity |
| Shape | reshape, squeeze, broadcast, transpose, pad, reverse, slice, concatenate, stack, select |
| Reductions | sum, product, reduce_max, reduce_min, all, any, argmax, argmin |
| Window | window_sum, window_max, window_min |
| LinAlg | dot, conv |
| Sorting | sort, argsort |
| Type | as_type, bitcast, constant, eye, iota |
| Indexed | indexed_add, indexed_put, gather, put_slice |
Operations not yet implemented in the NIF layer (e.g., fft, ifft, triangular_solve) automatically fall back to Nx.BinaryBackend.
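The fallback described above might be exercised as in the sketch below. It assumes `ExCubecl.Backend` is installed and that `fft` is still unimplemented in the NIF layer, as stated above. `Nx.fft/2` and `Nx.backend_copy/2` are standard Nx functions. This README does not say which backend the fallback result is allocated on, so the sketch prints both results for comparison and makes no claim about that.

```elixir
# Sketch only: assumes :ex_cubecl is installed and ExCubecl.Backend loads.
signal = Nx.tensor([1.0, 0.0, 0.0, 0.0], backend: ExCubecl.Backend)

# fft is listed above as not implemented in the NIF layer, so this call
# is expected to be served by the Nx.BinaryBackend fallback.
via_fallback = Nx.fft(signal)

# Reference: the same computation run explicitly on Nx.BinaryBackend.
reference =
  signal
  |> Nx.backend_copy(Nx.BinaryBackend)
  |> Nx.fft()

# Print both results so they can be compared by eye.
IO.inspect(via_fallback, label: "via ExCubecl fallback")
IO.inspect(reference, label: "via Nx.BinaryBackend")
```

`Nx.backend_copy/2` is used here rather than `Nx.backend_transfer/2` so that the original tensor stays usable on its own backend after the copy.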
Mobile Integration (iOS / Android)
ExCubecl includes a C FFI layer for mobile platform integration.
iOS (Objective-C / Swift)
#include "ex_cubecl.h"
// Create tensors
float data[] = {1.0f, 2.0f, 3.0f};
size_t shape[] = {3};
ex_cubecl_tensor_handle_t a = ex_cubecl_new_tensor((const uint8_t*)data, shape, 1, EX_CUBECL_DTYPE_F32);
ex_cubecl_tensor_handle_t b = ex_cubecl_new_tensor((const uint8_t*)data, shape, 1, EX_CUBECL_DTYPE_F32);
// Add
ex_cubecl_tensor_handle_t result = ex_cubecl_add(a, b);
// Read result
float out[3];
ex_cubecl_read_tensor(result, (uint8_t*)out, sizeof(out));
// Cleanup
ex_cubecl_deallocate_tensor(a);
ex_cubecl_deallocate_tensor(b);
ex_cubecl_deallocate_tensor(result);
Android (JNI)
#include "ex_cubecl.h"
#include <jni.h>
JNIEXPORT jlong JNICALL
Java_com_example_excubecl_ExCubeclTensor_add(
JNIEnv *env, jobject thiz, jlong a_handle, jlong b_handle) {
return (jlong)ex_cubecl_add((ex_cubecl_tensor_handle_t)a_handle,
(ex_cubecl_tensor_handle_t)b_handle);
}
See native/ex_cubecl_nif/include/ex_cubecl.h for the full API reference.
Architecture
┌─────────────────────────────────────────────┐
│ Elixir / Nx │
│ Nx.add(a, b) → ExCubecl.Backend.add/3 │
├─────────────────────────────────────────────┤
│ ExCubecl.Backend │
│ - Type conversion, broadcasting, fallback │
├─────────────────────────────────────────────┤
│ ExCubecl.NIF (Elixir) │
│ - NIF function stubs │
├─────────────────────────────────────────────┤
│ Rust NIF (lib.rs) │
│ - Tensor operations on CPU │
│ - Integer-aware paths (no f64 roundtrip) │
├─────────────────────────────────────────────┤
│ C FFI (ffi.rs + ex_cubecl.h) │
│ - Mobile platform interface │
│ - Handle-based tensor management │
└─────────────────────────────────────────────┘
GPU Support (Coming Soon)
GPU acceleration via CubeCL is prepared but requires the CubeCL crate to be published with the needed features. When available, uncomment the cubecl dependency in native/ex_cubecl_nif/Cargo.toml and enable the gpu feature:
mix compile --features gpu
License
Apache 2.0 - See LICENSE for details.