ExLLM.Adapters.LMStudio (ex_llm v0.5.0)
LM Studio adapter for local LLM inference.
This adapter provides integration with LM Studio, a desktop application for running local LLMs with an OpenAI-compatible API. LM Studio supports models from Hugging Face and provides both GUI and server modes for local inference.
Configuration
LM Studio runs a local server with OpenAI-compatible endpoints. By default, it listens on http://localhost:1234 with the API key "lm-studio".
# Basic usage
{:ok, response} = ExLLM.chat(:lmstudio, messages)
# With custom endpoint
{:ok, response} = ExLLM.chat(:lmstudio, messages,
  host: "192.168.1.100",
  port: 8080
)
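You can also target a specific loaded model by passing the standard model option; the identifier below is illustrative and should match whatever name LM Studio reports for a model you have loaded.
# With a specific loaded model (identifier is illustrative)
{:ok, response} = ExLLM.chat(:lmstudio, messages,
  model: "llama-3.2-3b-instruct"
)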
Features
- OpenAI-compatible API (/v1/chat/completions, /v1/models, /v1/embeddings)
- Native LM Studio REST API (/api/v0/*) with enhanced model information
- Model loading status and quantization details
- TTL (Time-To-Live) parameter for automatic model unloading
- Support for both llama.cpp and MLX engines on Apple Silicon
- Streaming chat completions (see the sketch after this list)
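As a rough sketch of the streaming and TTL features above, the example below assumes ExLLM exposes a stream_chat/3 function analogous to chat/3 and that the adapter forwards a :ttl option (in seconds) to LM Studio; both names are assumptions, so check the API of your ExLLM version.
# Sketch: stream a completion and auto-unload the model after 5 minutes.
# Assumes stream_chat/3 yields chunks with a :content field and that the
# adapter forwards :ttl to LM Studio; adjust to your ExLLM version.
messages = [%{role: "user", content: "Explain quantization in one sentence."}]

{:ok, stream} = ExLLM.stream_chat(:lmstudio, messages, ttl: 300)

stream
|> Stream.each(fn chunk -> IO.write(chunk.content || "") end)
|> Stream.run()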
Requirements
- Install LM Studio from https://lmstudio.ai
- Download and load at least one model in LM Studio
- Start the local server (usually localhost:1234)
- Ensure the server is running when using this adapter (a quick connectivity check is sketched below)
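Before issuing requests, it can help to confirm the server is actually reachable. The following sketch uses Erlang's built-in :httpc (not part of the ExLLM API) against the default endpoint; adjust the host and port if you changed them in LM Studio.
# Quick connectivity check against the default LM Studio endpoint.
# Uses Erlang's built-in :httpc; this is not part of the ExLLM API.
:inets.start()

case :httpc.request(:get, {~c"http://localhost:1234/v1/models", []}, [], []) do
  {:ok, {{_, 200, _}, _headers, _body}} ->
    IO.puts("LM Studio server is reachable")

  {:ok, {{_, status, _}, _headers, _body}} ->
    IO.puts("LM Studio server responded with status #{status}")

  {:error, reason} ->
    IO.puts("Could not reach LM Studio server: #{inspect(reason)}")
end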
API Endpoints
This adapter uses both OpenAI-compatible and native LM Studio endpoints:
- OpenAI Compatible: /v1/chat/completions, /v1/models, /v1/embeddings
- Native API: /api/v0/models, /api/v0/chat/completions (enhanced features)
The native API provides additional information like model loading status, quantization details, architecture information, and performance metrics.
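To see what the adapter reports for locally available models, a minimal sketch is shown below. It assumes ExLLM exposes list_models/1 returning {:ok, models}; the output format depends on the model struct in your ExLLM version, so each entry is simply inspected.
# Sketch: list models known to LM Studio through the adapter.
# Assumes list_models/1 returns {:ok, models}; struct fields vary by version.
{:ok, models} = ExLLM.list_models(:lmstudio)

Enum.each(models, fn model ->
  IO.puts(inspect(model))
end)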