LlamaCppEx.ModelManager.Backend behaviour (LlamaCppEx v0.8.24)


Behaviour for the model I/O the manager performs on the write path.

The default implementation is LlamaCppEx.ModelManager.ModelIO, which delegates to LlamaCppEx.Hub, LlamaCppEx.Model, and LlamaCppEx.Server. Tests inject a fake via the :io start option to exercise load/unload lifecycle without real GGUF files.
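A fake backend for tests might look roughly like the sketch below. Everything in it is illustrative rather than taken from the library: the module name `FakeBackend`, the made-up path and byte size, the bare reference returned in place of a real `LlamaCppEx.Model` struct, and the idle `Agent` standing in for a `LlamaCppEx.Server` are all assumptions. Only the four callback names and their return shapes come from the specs on this page.

```elixir
defmodule FakeBackend do
  # Hypothetical test double; not part of LlamaCppEx.
  @behaviour LlamaCppEx.ModelManager.Backend

  @impl true
  def resolve_source(_source, _opts) do
    # Pretend every source is already on disk. Path and size are made up.
    {:ok, "/tmp/fake-model.gguf", 1_024}
  end

  @impl true
  def load_model(_path, _opts) do
    # A real backend returns a LlamaCppEx.Model struct. A bare reference
    # stands in here (assumption: the test never inspects the model).
    {:ok, make_ref()}
  end

  @impl true
  def start_server(_id, _path, _opts) do
    # An idle, unlinked Agent plays the role of the backing server process.
    Agent.start(fn -> :idle end)
  end

  @impl true
  def stop_server(pid) do
    # Agent.stop/1 returns :ok, matching the callback spec.
    Agent.stop(pid)
  end
end

# Exercising the fake directly, without the manager:
{:ok, path, _size} = FakeBackend.resolve_source(:any_source, [])
{:ok, pid} = FakeBackend.start_server(:my_model, path, [])
:ok = FakeBackend.stop_server(pid)
```

Such a module would then be passed as the `:io` start option. The manager's start function is not documented on this page, so the exact call is left out here.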

Inference dispatch (generate/stream/chat/embed) does NOT go through this behaviour. The calling process reads the ETS table directly and calls the relevant module itself, which keeps the manager process off the hot path.
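The general pattern behind that read path can be sketched as follows. This is not the manager's actual code: the module `RegistrySketch`, the table name, and the entry shape are hypothetical, and the real table layout is not described on this page. The sketch only shows writes being serialized through one process while lookups run in the caller.

```elixir
defmodule RegistrySketch do
  use GenServer

  # Hypothetical table name, chosen for this example only.
  @table :registry_sketch_models

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  # Write path: serialized through the GenServer, which owns the table.
  def put(id, entry), do: GenServer.call(__MODULE__, {:put, id, entry})

  # Read path: runs entirely in the calling process, with no message
  # sent to the GenServer.
  def fetch(id) do
    case :ets.lookup(@table, id) do
      [{^id, entry}] -> {:ok, entry}
      [] -> {:error, :not_loaded}
    end
  end

  @impl true
  def init(_opts) do
    # :protected lets any process read while only the owner may write.
    :ets.new(@table, [:named_table, :protected, :set, read_concurrency: true])
    {:ok, %{}}
  end

  @impl true
  def handle_call({:put, id, entry}, _from, state) do
    :ets.insert(@table, {id, entry})
    {:reply, :ok, state}
  end
end

{:ok, _pid} = RegistrySketch.start_link()
:ok = RegistrySketch.put(:tiny, %{mode: :direct})
{:ok, %{mode: :direct}} = RegistrySketch.fetch(:tiny)
{:error, :not_loaded} = RegistrySketch.fetch(:missing)
```

Because lookups never enter the owner's mailbox, a slow load or download on the write path does not delay callers that only need an already-registered entry.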

Summary

Callbacks

load_model(t, keyword)

Loads a model directly (for :direct mode).

resolve_source(source, keyword)

Resolves a source to a local file path and its byte size, downloading from the Hub if needed.

start_server(id, path, keyword)

Starts a backing LlamaCppEx.Server for id (for :server mode).

stop_server(pid)

Stops a backing server, dropping its context and model refs.

Callbacks

load_model(t, keyword)

@callback load_model(
  String.t(),
  keyword()
) :: {:ok, LlamaCppEx.Model.t()} | {:error, term()}

Loads a model directly (for :direct mode).

resolve_source(source, keyword)

@callback resolve_source(
  LlamaCppEx.ModelManager.Entry.source(),
  keyword()
) :: {:ok, String.t(), non_neg_integer()} | {:error, term()}

Resolves a source to a local file path and its byte size, downloading from the Hub if needed.

start_server(id, path, keyword)

@callback start_server(id :: term(), path :: String.t(), keyword()) ::
  {:ok, pid()} | {:error, term()}

Starts a backing LlamaCppEx.Server for id (for :server mode).

stop_server(pid)

@callback stop_server(pid()) :: :ok

Stops a backing server, dropping its context and model refs.