NxTfliteMob (nx_tflite_mob v0.0.3)

TensorFlow Lite NIF for Mob apps.

Loads .tflite model bytes and runs inference through either TFLite's bundled XNNPACK CPU path or the NNAPI delegate. On MediaTek devices such as the Moto G Power 5G 2024, NNAPI can route to the mtk-gpu_shim GPU accelerator, which runs YOLOv8n inference in about 150 ms.

Example

tflite = File.read!("priv/yolov8n_full_integer_quant.tflite")
{:ok, m} = NxTfliteMob.load_module(tflite,
             delegate: "nnapi",
             accelerator: "mtk-gpu_shim",
             allow_fp16: true)

input_int8 = File.read!("priv/input_int8.bin")  # 1x640x640x3 INT8
{:ok, [out_bin]} = NxTfliteMob.call(m, [input_int8])

:ok = NxTfliteMob.release_module(m)

Delegates

  • :delegate — "xnnpack" (default; CPU, INT8/FP32) or "nnapi" (vendor NN HAL; uses the GPU or NPU when present)
  • :accelerator — the accelerator name to request; only meaningful with delegate: "nnapi". A discovery function, NxTfliteMob.NIF.list_nnapi_devices/0, is planned. Known values on the Moto BXM-8-256:
    • "mtk-gpu_shim" — PowerVR GPU through MediaTek NNAPI HAL (best result for YOLOv8n)
    • "mtk-neuron_shim" — APU NPU; only partial op coverage for YOLO, so part of the model falls back
    • "nnapi-reference" — NNAPI's CPU emulation (slow)
  • :num_threads — XNNPACK CPU thread count (default 6)
  • :allow_fp16 — NNAPI may run FP32 ops in FP16 (default true)
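
The options above suggest a common pattern: request an NNAPI accelerator first and drop back to the default XNNPACK CPU path if that fails. The following is a minimal sketch of that idea, not part of nx_tflite_mob. DelegateFallbackSketch and load_with_fallback/3 are hypothetical names, and the sketch assumes load_module/2 returns {:error, reason} when a delegate or accelerator cannot be used, which the documentation does not confirm. The loader is passed in as a function so the sketch runs without the NIF.

```elixir
defmodule DelegateFallbackSketch do
  # Hypothetical helper, not part of nx_tflite_mob.
  # `loader` is any 2-arity function shaped like NxTfliteMob.load_module/2:
  # (binary, keyword) -> {:ok, handle} | {:error, reason}.
  def load_with_fallback(model_bytes, loader, preferred_opts) do
    case loader.(model_bytes, preferred_opts) do
      {:ok, handle} ->
        {:ok, handle, :preferred}

      {:error, _reason} ->
        # Assumption: an unavailable delegate surfaces as {:error, _}.
        # Retry on the default CPU path.
        case loader.(model_bytes, [delegate: "xnnpack"]) do
          {:ok, handle} -> {:ok, handle, :fallback}
          {:error, reason} -> {:error, reason}
        end
    end
  end
end
```

In an app this might be called with &NxTfliteMob.load_module/2 as the loader and the NNAPI options from the example as preferred_opts. The third tuple element reports which path was taken, which is useful when logging latency.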

Summary

Types

module_handle()

@type module_handle() :: reference()

Functions

call(handle, inputs)

@spec call(module_handle(), [binary()]) :: {:ok, [binary()]} | {:error, String.t()}
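
Because call/2 takes and returns raw binaries, the caller is responsible for tensor layout. The sketch below shows one way to check an INT8 input's size and to convert between binaries and lists of signed 8-bit integers. Int8TensorSketch and its functions are hypothetical helpers, not part of nx_tflite_mob, and treating the output as INT8 is an assumption that depends on the model.

```elixir
defmodule Int8TensorSketch do
  # Hypothetical helpers, not part of nx_tflite_mob.

  # Expected byte size of an INT8 tensor: one byte per element.
  def byte_size_for(shape), do: Enum.reduce(shape, 1, &(&1 * &2))

  # Pack a flat list of integers in -128..127 into a binary.
  def pack(values) do
    for v <- values, into: <<>>, do: <<v::signed-integer-size(8)>>
  end

  # Unpack a binary into a flat list of signed 8-bit integers.
  def unpack(bin) do
    for <<v::signed-integer-size(8) <- bin>>, do: v
  end

  # Verify that an input binary matches the given shape before inference.
  def check_input(bin, shape) do
    expected = byte_size_for(shape)

    if byte_size(bin) == expected do
      :ok
    else
      {:error, {:bad_size, byte_size(bin), expected}}
    end
  end
end
```

For the 1x640x640x3 INT8 input in the example, byte_size_for([1, 640, 640, 3]) is 1,228,800, so check_input/2 can catch a wrongly sized file before it reaches the NIF.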

load_module(model_bytes, opts \\ [])

@spec load_module(
  binary(),
  keyword()
) :: {:ok, module_handle()} | {:error, String.t()}

release_module(handle)

@spec release_module(module_handle()) :: :ok
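
Since a handle is released explicitly, it can help to pair load_module/2 and release_module/1 so the release also happens when inference raises. The following is a minimal sketch under that assumption. WithModelSketch and with_model/4 are hypothetical and not part of nx_tflite_mob; whether an unreleased handle is also cleaned up by garbage collection is not stated in the documentation. The backend module is a parameter so the sketch can run against a stub.

```elixir
defmodule WithModelSketch do
  # Hypothetical wrapper, not part of nx_tflite_mob.
  # `backend` is any module exporting load_module/2 and release_module/1
  # with the shapes given in the specs; it defaults to NxTfliteMob.
  def with_model(model_bytes, opts, fun, backend \\ NxTfliteMob) do
    case backend.load_module(model_bytes, opts) do
      {:ok, handle} ->
        try do
          fun.(handle)
        after
          # Runs whether fun returns, raises, or throws.
          :ok = backend.release_module(handle)
        end

      {:error, _reason} = error ->
        error
    end
  end
end
```

The function returns whatever fun returns, so a caller might pass fn m -> NxTfliteMob.call(m, [input_int8]) end and receive the {:ok, outputs} or {:error, reason} tuple directly.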