Local llama.cpp engine definition and lifecycle management.
An engine represents a llama-server binary that serves a model over an
OpenAI-compatible HTTP API. Multiple models can be configured to use the
same engine definition, but only one model can be loaded per running server
instance.
Fields
:alias - unique atom identifier for the engine
:binary_dir - directory where the llama-server binary lives or will be installed (default: "~/.apero/llm/bin")
:use_precompiled - if true, Apero will download the official precompiled binary from the llama.cpp GitHub releases (default: true)
:precompiled_version - :latest or a specific release tag such as "b4561" (default: :latest)
:host - host the server listens on (default: "127.0.0.1")
:port - base port; each running instance uses this port plus an offset (default: 8080)
:start_args - extra CLI arguments passed to llama-server at startup (e.g. ["--n-gpu-layers", "35"])
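As an illustration of the fields, the sketch below defines a stand-in struct with the documented defaults. The module name `EngineSketch`, the empty default for `:start_args`, and the `instance_port/2` helper are assumptions made for this example; they are not part of the documented API.

```elixir
defmodule EngineSketch do
  # Hypothetical stand-in for the engine struct. Field names and defaults
  # follow the documented list; the [] default for :start_args is assumed.
  @enforce_keys [:alias]
  defstruct alias: nil,
            binary_dir: "~/.apero/llm/bin",
            use_precompiled: true,
            precompiled_version: :latest,
            host: "127.0.0.1",
            port: 8080,
            start_args: []

  # Port of one running instance: the base port plus an offset.
  # How offsets are assigned is not documented; callers pass it in here.
  def instance_port(%__MODULE__{port: base}, offset)
      when is_integer(offset) and offset >= 0 do
    base + offset
  end
end

engine = %EngineSketch{alias: :local_gpu, start_args: ["--n-gpu-layers", "35"]}
IO.inspect(EngineSketch.instance_port(engine, 1))
```

Keeping the port as a base value plus an offset lets several models share one engine definition while each server instance still gets its own port.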
Types
Functions
Returns the base URL for the engine serving model_alias, or nil if not
running.
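The lookup described here could work roughly as follows, assuming the registry behaves like Elixir's built-in `Registry` with unique keys. The module `BaseUrlSketch`, its registry name, and the `{host, port}` value shape are hypothetical; the source does not say how `Candil.Registry` stores its entries.

```elixir
defmodule BaseUrlSketch do
  # Hypothetical registry name; the real one is Candil.Registry.
  @registry BaseUrlSketch.Registry

  def start_registry do
    Registry.start_link(keys: :unique, name: @registry)
  end

  # Called by the process that owns the server. The {host, port} value
  # is an assumed shape for the registered metadata.
  def register(model_alias, host, port) do
    Registry.register(@registry, model_alias, {host, port})
  end

  # Returns the base URL, or nil when nothing is registered for the alias.
  def base_url(model_alias) do
    case Registry.lookup(@registry, model_alias) do
      [{_pid, {host, port}}] -> "http://#{host}:#{port}"
      [] -> nil
    end
  end
end

{:ok, _registry} = BaseUrlSketch.start_registry()
{:ok, _owner} = BaseUrlSketch.register(:qwen, "127.0.0.1", 8081)
IO.inspect(BaseUrlSketch.base_url(:qwen))
```

Because `Registry` drops an entry when its owning process exits, a lookup of this kind returns nil again once the server process is gone.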
Returns the effective binary directory for an engine.
Falls back to ~/.apero/llm/bin when binary_dir is nil.
Returns true if the engine binary exists on disk.
Returns the full path to the llama-server binary for this engine.
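The three binary helpers above can be pictured with the following minimal sketch. It assumes that `~` is expanded with `Path.expand/1` and that the executable is named exactly `llama-server`; the module `BinaryPathSketch` is hypothetical and takes the directory directly instead of an engine struct.

```elixir
defmodule BinaryPathSketch do
  @default_dir "~/.apero/llm/bin"

  # Fall back to the default directory when binary_dir is nil.
  def binary_dir(nil), do: Path.expand(@default_dir)
  def binary_dir(dir) when is_binary(dir), do: Path.expand(dir)

  # Full path to the executable inside the effective directory.
  def binary_path(dir), do: Path.join(binary_dir(dir), "llama-server")

  # True only when a regular file exists at that path.
  def binary_exists?(dir), do: File.regular?(binary_path(dir))
end

IO.inspect(BinaryPathSketch.binary_path("/opt/llm/bin"))
```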
Returns true if the engine serving model_alias is running and responding
to the /health endpoint.
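A health probe of this kind might look like the sketch below, which uses Erlang's bundled `:httpc` client. The module `HealthSketch`, the timeout value, and the choice to treat only HTTP 200 as healthy are assumptions; the source does not describe which client or status codes the real implementation uses.

```elixir
defmodule HealthSketch do
  # Returns true only when GET <base_url>/health answers with HTTP 200.
  # Connection errors, timeouts, and other status codes all count as false.
  def healthy?(base_url, timeout_ms \\ 2_000) do
    {:ok, _started} = Application.ensure_all_started(:inets)
    url = String.to_charlist(base_url <> "/health")
    http_opts = [timeout: timeout_ms, connect_timeout: timeout_ms]

    case :httpc.request(:get, {url, []}, http_opts, []) do
      {:ok, {{_http_version, 200, _reason}, _headers, _body}} -> true
      _other -> false
    end
  end
end

IO.inspect(HealthSketch.healthy?("http://127.0.0.1:8080"))
```

Checking the endpoint rather than only the OS process distinguishes a server that is still starting up from one that is ready to accept requests.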
@spec start(t(), Candil.Model.t()) :: {:ok, pid()} | {:error, binary()}
Starts a llama-server process loaded with model.
If engine.use_precompiled is true and the binary does not exist,
this function automatically downloads it before starting.
Registers the running server in Candil.Registry under the model alias.
Returns {:ok, pid} or {:error, reason}.
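The startup decision and the command line could be sketched as follows. Everything in `StartSketch` is hypothetical, including the error message and the `model_path` argument, since the fields of `Candil.Model` are not documented here. The `--model`, `--host`, and `--port` flags are standard llama-server options.

```elixir
defmodule StartSketch do
  # Decide what to do about the binary before launching the server.
  def binary_action(%{use_precompiled: use_precompiled}, binary_exists?) do
    cond do
      binary_exists? -> :use_existing
      use_precompiled -> :download
      true -> {:error, "llama-server binary not found"}
    end
  end

  # Build the argument list: fixed flags first, then the engine's start_args.
  def server_args(engine, model_path, port) do
    ["--model", model_path, "--host", engine.host, "--port", Integer.to_string(port)] ++
      engine.start_args
  end
end

engine = %{host: "127.0.0.1", use_precompiled: true, start_args: ["--n-gpu-layers", "35"]}
IO.inspect(StartSketch.binary_action(engine, false))
IO.inspect(StartSketch.server_args(engine, "/models/example.gguf", 8081))
```

Appending `start_args` last keeps user-supplied flags together at the end of the command line, after the flags derived from the engine and model.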
@spec stop(atom()) :: :ok | {:error, :not_running}
Stops the engine server running the given model alias.
Returns :ok or {:error, :not_running}.