Behaviour for the model I/O the manager performs on the write path.
The default implementation is LlamaCppEx.ModelManager.ModelIO, which delegates
to LlamaCppEx.Hub, LlamaCppEx.Model, and LlamaCppEx.Server. Tests inject a
fake via the :io start option to exercise load/unload lifecycle without real
GGUF files.
Inference dispatch (generate/stream/chat/embed) does NOT go through this
behaviour. The caller's own process reads the ETS table directly and calls the
relevant module, which keeps the manager process off the hot path.
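The split between write path and read path can be illustrated with a minimal sketch. The table name, the `{id, mode, ref}` entry shape, and the `ReadPathSketch` module are assumptions made for illustration; the real table layout is not shown in this section.

```elixir
defmodule ReadPathSketch do
  # Hypothetical table name and entry shape, not taken from LlamaCppEx.
  @table :model_registry_sketch

  # Write path: in the real system only the manager process would own and
  # mutate the table. :protected lets any process read but only the owner write.
  def init_table do
    :ets.new(@table, [:named_table, :set, :protected, read_concurrency: true])
  end

  def put(id, mode, ref), do: :ets.insert(@table, {id, mode, ref})

  # Read path: runs in the caller's process and sends no message to the manager.
  def fetch(id) do
    case :ets.lookup(@table, id) do
      [{^id, mode, ref}] -> {:ok, mode, ref}
      [] -> {:error, :not_loaded}
    end
  end
end
```

Because `fetch/1` is a plain ETS lookup, concurrent inference callers never queue behind a load or unload that the manager is handling.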
Callbacks
@callback load_model(String.t(), keyword()) :: {:ok, LlamaCppEx.Model.t()} | {:error, term()}
Loads a model directly (for :direct mode).
@callback resolve_source(LlamaCppEx.ModelManager.Entry.source(), keyword()) :: {:ok, String.t(), non_neg_integer()} | {:error, term()}
Resolves a source to a local file path and its byte size, downloading from the Hub if needed.
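As a rough illustration of the `{:ok, path, size}` return shape, the sketch below handles only the simplest case, a source that is already a local file path. The `LocalSourceSketch` module and `resolve_local/1` are hypothetical names; the actual shape of `Entry.source()` and the Hub download step are not shown here and are left out.

```elixir
defmodule LocalSourceSketch do
  # Hypothetical helper, not part of LlamaCppEx. Assumes the source is a
  # plain path string; Hub sources would need a download step first.
  @spec resolve_local(String.t()) ::
          {:ok, String.t(), non_neg_integer()} | {:error, term()}
  def resolve_local(path) when is_binary(path) do
    expanded = Path.expand(path)

    # File.stat/1 follows symlinks and reports the size in bytes.
    case File.stat(expanded) do
      {:ok, %File.Stat{type: :regular, size: size}} -> {:ok, expanded, size}
      {:ok, %File.Stat{type: other}} -> {:error, {:not_a_regular_file, other}}
      {:error, reason} -> {:error, reason}
    end
  end
end
```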
@callback start_server(id :: term(), path :: String.t(), keyword()) :: {:ok, pid()} | {:error, term()}
Starts a backing LlamaCppEx.Server for id (for :server mode).
@callback stop_server(pid()) :: :ok
Stops a backing server, dropping its context and model refs.
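A test fake covering the four callbacks might look like the sketch below. `FakeModelIO`, the made-up path and size, the `{:fake_model, path}` placeholder, and the use of an `Agent` as a stand-in server are all assumptions for illustration. The sketch also assumes the manager treats the loaded model as an opaque term.

```elixir
defmodule FakeModelIO do
  # A real fake would declare `@behaviour` with the behaviour module documented
  # here and mark each function `@impl true`. Both are omitted so the sketch
  # compiles without LlamaCppEx on the code path.

  # Pretends every source is already on disk; path and size are made up.
  def resolve_source(_source, _opts), do: {:ok, "/tmp/fake-model.gguf", 1_024}

  # Returns an opaque placeholder rather than a real LlamaCppEx.Model struct.
  def load_model(path, _opts) when is_binary(path), do: {:ok, {:fake_model, path}}

  # An Agent stands in for LlamaCppEx.Server so the caller gets a live pid.
  def start_server(id, path, _opts) when is_binary(path) do
    Agent.start_link(fn -> %{id: id, path: path} end)
  end

  # Agent.stop/1 is synchronous and returns :ok once the process has exited.
  def stop_server(pid) when is_pid(pid), do: Agent.stop(pid)
end
```

Such a module would be passed through the `:io` start option described above. The exact start function that accepts it is not shown in this section.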