ExLLM.Providers.Bumblebee.ModelLoader (ex_llm v0.8.1)
Handles loading and caching of Bumblebee models for local inference.
This GenServer manages the lifecycle of loaded models, ensuring efficient memory usage and providing fast access to cached models.
Features
- Automatic model downloading from HuggingFace
- Model caching to avoid reloading
- Memory management with model unloading
- Hardware acceleration detection
- Support for local model paths
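The caching behaviour described above can be illustrated with a minimal sketch. This is not ExLLM's implementation: `SketchModelCache`, its function names, and the injected `loader_fun` are all hypothetical, chosen only to show how a GenServer can keep loaded values in its state so that a repeated load returns the cached entry instead of doing the work again.

```elixir
defmodule SketchModelCache do
  # Hypothetical module, not part of ExLLM. It keeps a map of
  # key => loaded value in the GenServer state.
  use GenServer

  # `loader_fun` is an assumed 1-arity function that returns
  # {:ok, model} or {:error, reason} for a given key.
  def start_link(loader_fun) when is_function(loader_fun, 1) do
    GenServer.start_link(__MODULE__, loader_fun)
  end

  # Loading may be slow, so the call does not time out.
  def load(pid, key), do: GenServer.call(pid, {:load, key}, :infinity)
  def list(pid), do: GenServer.call(pid, :list)
  def unload(pid, key), do: GenServer.call(pid, {:unload, key})

  @impl true
  def init(loader_fun), do: {:ok, %{loader: loader_fun, models: %{}}}

  @impl true
  def handle_call({:load, key}, _from, state) do
    case Map.fetch(state.models, key) do
      {:ok, model} ->
        # Cache hit: reply without calling the loader again.
        {:reply, {:ok, model}, state}

      :error ->
        case state.loader.(key) do
          {:ok, model} ->
            models = Map.put(state.models, key, model)
            {:reply, {:ok, model}, %{state | models: models}}

          {:error, _reason} = error ->
            # Failed loads are not cached.
            {:reply, error, state}
        end
    end
  end

  def handle_call(:list, _from, state) do
    {:reply, Map.keys(state.models), state}
  end

  def handle_call({:unload, key}, _from, state) do
    # Dropping the entry lets the value be garbage collected.
    {:reply, :ok, %{state | models: Map.delete(state.models, key)}}
  end
end
```

Because every request goes through one process, concurrent callers asking for the same key are serialized, so the loader runs at most once per cached key. The trade-off is that a slow load blocks other requests to the same process while it runs.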
Functions
Returns a specification to start this module under a supervisor.
See Supervisor.
Get information about hardware acceleration.
Get information about a loaded model.
Callback implementation for GenServer.init/1.
List all loaded models.
Load a model by name or path. Returns {:ok, model_info} or {:error, reason}.
Examples
{:ok, model} = ModelLoader.load_model("microsoft/phi-2")
{:ok, model} = ModelLoader.load_model("/path/to/local/model")
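Since the documented return value is either {:ok, model_info} or {:error, reason}, callers will usually want to handle both shapes rather than match only on success. The sketch below shows one way to do that. `LoadResultSketch` and `describe/2` are hypothetical names, and `load_fun` stands in for any 1-arity function with the documented return shape, so the block runs without ExLLM installed.

```elixir
defmodule LoadResultSketch do
  # Hypothetical helper, not part of ExLLM. `load_fun` is assumed to
  # return {:ok, model_info} or {:error, reason}, as documented above.
  def describe(load_fun, name_or_path) when is_function(load_fun, 1) do
    case load_fun.(name_or_path) do
      {:ok, _model_info} ->
        "loaded #{name_or_path}"

      {:error, reason} ->
        # inspect/1 is used because the shape of `reason` is not documented.
        "could not load #{name_or_path}: #{inspect(reason)}"
    end
  end
end
```

Assuming the one-argument call shown in the examples, the real function could be passed in as `LoadResultSketch.describe(&ModelLoader.load_model/1, "microsoft/phi-2")`.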
Unload a model from memory.