System capability detection for llama.cpp precompiled binary selection.
Inspects the current OS, CPU architecture and available GPU hardware to select the most appropriate precompiled binary from the llama.cpp releases.
Detection strategy
- OS and architecture are read via Apero.OS.
- GPU detection tries, in order: NVIDIA (nvidia-smi), AMD (rocminfo), Apple Metal (via OS type), and Intel Arc (sycl-ls).
- The detected combination is mapped to the llama.cpp asset name pattern.
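The probe order above can be sketched as follows. This is an illustration only, not the module's actual implementation: `GpuProbeSketch` is a hypothetical module, the `:macos` atom is an assumed value of `Apero.OS.os_type()`, and the sketch only checks that each tool is on the PATH, whereas the real detection may also run the tool and inspect its output.

```elixir
defmodule GpuProbeSketch do
  # Hypothetical module for illustration; not part of the documented API.

  # `find` defaults to System.find_executable/1, which returns the
  # executable's path or nil. It is injectable so the order can be tested.
  def detect_gpu(os_type, find \\ &System.find_executable/1) do
    cond do
      find.("nvidia-smi") -> :cuda
      find.("rocminfo") -> :rocm
      # Assumption: :macos is how the OS type for Apple machines is spelled.
      os_type == :macos -> :metal
      find.("sycl-ls") -> :sycl
      # Nothing matched: fall back to the CPU build.
      true -> :cpu
    end
  end
end
```

Because the first matching branch wins, a machine with both `nvidia-smi` and `sycl-ls` installed resolves to `:cuda` in this sketch.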
Asset naming
llama.cpp release assets follow this pattern:
llama-<version>-bin-<platform>-<variant>-<arch>.zip
For example:
llama-b4561-bin-linux-cuda-cu12.4.1-x64.zip
llama-b4561-bin-ubuntu-x64.zip
llama-b4561-bin-macos-arm64.zip
llama-b4561-bin-win-cuda-cu12.4.1-x64.zip
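The naming pattern can be reproduced with a small helper. This is a sketch under assumptions: `AssetNameSketch.asset_name/4` is a hypothetical function, and passing `nil` for the variant reflects the examples above in which the `<variant>` segment is absent.

```elixir
defmodule AssetNameSketch do
  # Hypothetical helper for illustration; not part of the documented API.

  # Joins the name segments with "-", skipping a nil variant,
  # and appends the ".zip" extension.
  def asset_name(version, platform, variant, arch) do
    base =
      ["llama", version, "bin", platform, variant, arch]
      |> Enum.reject(&is_nil/1)
      |> Enum.join("-")

    base <> ".zip"
  end
end

AssetNameSketch.asset_name("b4561", "linux", "cuda-cu12.4.1", "x64")
# => "llama-b4561-bin-linux-cuda-cu12.4.1-x64.zip"

AssetNameSketch.asset_name("b4561", "macos", nil, "arm64")
# => "llama-b4561-bin-macos-arm64.zip"
```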
Types
@type detection() :: %{
  os: Apero.OS.os_type(),
  arch: Apero.OS.arch(),
  gpu: gpu_backend(),
  cuda_version: binary() | nil,
  asset_pattern: binary()
}
@type gpu_backend() :: :cuda | :rocm | :metal | :vulkan | :sycl | :cpu
Functions
Returns the download URL for the best-matching asset in the given release, based on the current system's detection.
Pass :latest as version to resolve the latest release automatically.
@spec detect() :: detection()
Detects OS, architecture and GPU backend.
Returns a detection map with an :asset_pattern that can be used to select
the right binary from a GitHub release.
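One way the :asset_pattern might be applied to a release is sketched below. The exact form of the pattern is not specified here, so the sketch assumes it is a plain substring of the asset file name. `AssetSelectSketch` is a hypothetical module, and the asset maps use the `"name"` and `"browser_download_url"` fields of the decoded GitHub release JSON.

```elixir
defmodule AssetSelectSketch do
  # Hypothetical helper for illustration; not part of the documented API.

  # `assets` is the decoded "assets" list of a GitHub release.
  # Assumption: `asset_pattern` is a substring of the wanted file name.
  def download_url(assets, asset_pattern) do
    match =
      Enum.find(assets, fn %{"name" => name} ->
        String.contains?(name, asset_pattern)
      end)

    case match do
      %{"browser_download_url" => url} -> {:ok, url}
      nil -> {:error, :no_matching_asset}
    end
  end
end
```

Returning a tagged tuple keeps the "no asset for this platform" case explicit, so callers can fall back to another variant.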
@spec detect_gpu(Apero.OS.os_type()) :: {gpu_backend(), binary() | nil}
Returns the GPU backend detected on the current machine.
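The spec returns a tuple whose second element is `binary() | nil`. Assuming that element carries the CUDA version (mirroring the `cuda_version` field of `detection()`), it could be read from the `nvidia-smi` banner, which includes a `CUDA Version: X.Y` entry. `CudaVersionSketch` is a hypothetical module and the approach is a guess, not the documented behaviour.

```elixir
defmodule CudaVersionSketch do
  # Hypothetical helper for illustration; not part of the documented API.

  # Extracts "X.Y" from the "CUDA Version: X.Y" entry in nvidia-smi output.
  # Returns nil when the entry is missing.
  def parse(output) when is_binary(output) do
    case Regex.run(~r/CUDA Version:\s*(\d+\.\d+)/, output) do
      [_full, version] -> version
      nil -> nil
    end
  end
end
```

With such a helper, a call on an NVIDIA machine might yield a result like `{:cuda, "12.4"}`, and non-CUDA backends would pair the backend atom with `nil`.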
Returns the latest llama.cpp release tag from GitHub, or {:error, reason}
if the API is unreachable.