Talks to a local llama.cpp server's OpenAI-compatible HTTP API for chat-time helpers.
list_models/1 sends GET /v1/models and returns the IDs of the loaded models,
sorted alphabetically. The llama.cpp server (llama-server) typically serves one model
at a time, so the list usually has a single entry.
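
A minimal sketch of the same request in Python, for illustration only; it is
not this module's implementation. It assumes the server listens on
http://localhost:8080 and that the response uses the OpenAI list shape
({"object": "list", "data": [{"id": ...}, ...]}). The names extract_model_ids
and list_models, the base_url default, and the timeout are hypothetical.

```python
import json
import urllib.request


def extract_model_ids(payload: dict) -> list[str]:
    """Pull the model IDs out of an OpenAI-style /v1/models payload, sorted.

    Assumes the IDs live under payload["data"][i]["id"]; entries without
    an "id" key are skipped rather than raising.
    """
    entries = payload.get("data", [])
    # sorted() orders by code point, so uppercase IDs sort before lowercase.
    return sorted(entry["id"] for entry in entries if "id" in entry)


def list_models(base_url: str = "http://localhost:8080",
                timeout: float = 5.0) -> list[str]:
    """GET {base_url}/v1/models and return the sorted model IDs.

    base_url and timeout are assumptions for this sketch; adjust them to
    match how the server was started.
    """
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=timeout) as resp:
        payload = json.load(resp)
    return extract_model_ids(payload)
```

With a server running, list_models() would normally return a one-element list.
Keeping the parsing in extract_model_ids separate from the HTTP call lets it be
checked against a canned payload without a live server.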