Download and installation utilities for llama.cpp binaries and GGUF models.
Handles:
- Detecting the right precompiled llama.cpp asset for the current machine.
- Downloading and extracting it to the engine's binary_dir.
- Downloading GGUF model files from any HTTP/HTTPS URL.
- Resuming interrupted downloads via HTTP Range requests where supported.
All downloads stream to disk — files are never loaded fully into memory.
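As an illustration of the resume behaviour described above, here is a minimal sketch of how a partial file on disk could be turned into a Range request. The module ResumeSketch and its functions are hypothetical names introduced for this example; they are not part of the documented API, and the actual implementation may differ.

```elixir
defmodule ResumeSketch do
  # Hypothetical helper, not part of the documented API.
  # Returns the extra request headers needed to continue a partial download.
  # If the partial file is missing or empty, no Range header is sent.
  def resume_headers(partial_path) do
    case File.stat(partial_path) do
      {:ok, %File.Stat{size: size}} when size > 0 ->
        [{"range", "bytes=#{size}-"}]

      _ ->
        []
    end
  end

  # Chooses the file open mode from the HTTP status code.
  # 206 (Partial Content): the server honoured the range, so append.
  # 200 (OK): the server sent the whole body, so start the file over.
  def write_mode(206), do: [:append, :binary]
  def write_mode(200), do: [:write, :binary]
end
```

A server that does not support ranges typically answers 200 with the full body, which is why the sketch truncates the file in that case instead of appending.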
Summary
Functions
download_engine/1
Downloads and installs the appropriate llama.cpp precompiled binary for the given engine.
download_model/1
Downloads a GGUF model file from model.download_url to
model.model_dir/model.filename.
Functions
@spec download_engine(Candil.Engine.t()) :: :ok | {:error, binary()}
Downloads and installs the appropriate llama.cpp precompiled binary for the given engine.
The binary is extracted to engine.binary_dir (or ~/.apero/llm/bin by
default). Existing binaries are overwritten only if the version differs.
Steps
- Resolve the release tag (:latest → real tag via GitHub API).
- Detect OS/arch/GPU and select the matching asset URL.
- Download the .zip archive to a temp file.
- Extract llama-server (and llama-cli) from the archive.
- Make the binary executable.
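The detection step above could be sketched roughly as follows. The module AssetSketch and the platform tags it returns are assumptions made for illustration; the real asset naming scheme and the GPU detection logic are not given in this documentation. Windows and GPU variants are left out to keep the sketch short.

```elixir
defmodule AssetSketch do
  # Hypothetical mapping from {OS type, architecture string} to a platform tag.
  # The tag strings are placeholders, not the real release asset names.
  def platform_tag({:unix, :darwin}, "aarch64" <> _), do: {:ok, "macos-arm64"}
  def platform_tag({:unix, :darwin}, "x86_64" <> _), do: {:ok, "macos-x64"}
  def platform_tag({:unix, :linux}, "aarch64" <> _), do: {:ok, "linux-arm64"}
  def platform_tag({:unix, :linux}, "x86_64" <> _), do: {:ok, "linux-x64"}

  # Anything else is reported as an error string, mirroring the
  # {:error, binary()} shape used by the specs in this module.
  def platform_tag(os, arch) do
    {:error, "unsupported platform: #{inspect(os)} #{arch}"}
  end

  # Detects the current machine using functions from the Erlang runtime.
  def current do
    arch = :erlang.system_info(:system_architecture) |> List.to_string()
    platform_tag(:os.type(), arch)
  end
end
```

Calling AssetSketch.current() returns a tag for the running machine, or an error tuple on platforms the sketch does not cover.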
@spec download_model(Candil.Model.t()) :: {:ok, binary()} | {:error, binary()}
Downloads a GGUF model file from model.download_url to
model.model_dir/model.filename.
Returns {:ok, dest_path} on success. Returns immediately without
downloading if the file already exists.
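The "return immediately if the file exists" behaviour can be pictured with the sketch below. ModelFetchSketch and the injected fetch function are hypothetical and exist only for this example; the map keys mirror the fields named above (download_url, model_dir, filename), but the real function takes a Candil.Model struct and performs the HTTP download itself.

```elixir
defmodule ModelFetchSketch do
  # Hypothetical sketch, not the documented implementation.
  # `fetch` is an assumed 2-arity function: (url, dest) -> :ok | {:error, reason}.
  def ensure_model(%{download_url: url, model_dir: dir, filename: name}, fetch) do
    dest = Path.join(dir, name)

    if File.exists?(dest) do
      # File already present: skip the download entirely.
      {:ok, dest}
    else
      File.mkdir_p!(dir)

      case fetch.(url, dest) do
        :ok -> {:ok, dest}
        {:error, reason} when is_binary(reason) -> {:error, reason}
        {:error, reason} -> {:error, inspect(reason)}
      end
    end
  end
end
```

Passing the fetch step in as a function keeps the existence check easy to exercise without network access.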