Inference execution for Candil.
Handles chat completions and embeddings for both local engines and remote providers. Normalises the request/response format across the supported provider APIs (OpenAI, Anthropic, Ollama, OpenAI-compatible).
Message format
All messages are plain maps with :role and :content keys, each holding a string:
%{role: "system", content: "You are a helpful assistant."}
%{role: "user", content: "Hello!"}
%{role: "assistant", content: "Hi there!"}
Response format
On success, all chat functions return {:ok, response()}, where response() is a map of the form:
%{
  content: "Hello, how can I help?",
  role: "assistant",
  model: "llama-3-8b",
  finish_reason: "stop",
  usage: %{prompt_tokens: 12, completion_tokens: 8, total_tokens: 20}
}
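As an illustration of consuming this shape, the sketch below pattern-matches on a chat result tuple. The ResponseExample module and its describe/1 function are hypothetical helpers, not part of Candil, and the result value is hard-coded from the example above rather than produced by a real call.

```elixir
defmodule ResponseExample do
  # Hypothetical helper (not part of Candil): turns a chat result into a
  # one-line description by destructuring the response map.
  def describe({:ok, %{content: content, finish_reason: reason, usage: %{total_tokens: total}}}) do
    "#{content} [finish_reason=#{reason}, total_tokens=#{total}]"
  end

  # Errors are passed through as a readable string.
  def describe({:error, reason}), do: "chat failed: #{inspect(reason)}"
end

# Hard-coded stand-in for the return value of a chat function.
result =
  {:ok,
   %{
     content: "Hello, how can I help?",
     role: "assistant",
     model: "llama-3-8b",
     finish_reason: "stop",
     usage: %{prompt_tokens: 12, completion_tokens: 8, total_tokens: 20}
   }}

IO.puts(ResponseExample.describe(result))
```

Matching only on the keys that are needed keeps the caller tolerant of any extra keys a response might carry.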
Summary
Functions
chat_local/3: Runs a chat completion against a local llama-server.
chat_remote/4: Runs a chat completion against a remote provider.
embed_local/3: Generates embeddings for a list of texts against a local engine.
embed_remote/4: Generates embeddings for a list of texts against a remote provider.
Types
@type embed_response() :: [[float()]]
@type usage() :: %{ prompt_tokens: non_neg_integer(), completion_tokens: non_neg_integer(), total_tokens: non_neg_integer() }
Functions
chat_local/3
Runs a chat completion against a local llama-server.
The engine must be running and healthy. Resolves the server URL from the registry via the model alias.
Options
:temperature - sampling temperature, 0.0–2.0 (default: 0.7)
:max_tokens - maximum tokens to generate (default: 512)
:stop - list of stop sequences (default: [])
:system - system prompt string (prepended to messages if set)
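One way the options above could be applied is sketched below: caller options override the documented defaults, and a :system string becomes a leading system message. The OptionsExample module and prepare/2 are hypothetical names for illustration; this is not Candil's actual implementation.

```elixir
defmodule OptionsExample do
  # Defaults taken from the option list above.
  @defaults [temperature: 0.7, max_tokens: 512, stop: []]

  # Hypothetical helper: returns {messages, opts} with defaults applied and
  # the :system prompt, if given, prepended as a system message.
  def prepare(messages, opts \\ []) do
    {system, opts} = Keyword.pop(opts, :system)
    # Values in opts win over @defaults for duplicate keys.
    opts = Keyword.merge(@defaults, opts)

    messages =
      if is_binary(system) do
        [%{role: "system", content: system} | messages]
      else
        messages
      end

    {messages, opts}
  end
end

{messages, opts} =
  OptionsExample.prepare(
    [%{role: "user", content: "Hello!"}],
    system: "Be brief.",
    max_tokens: 64
  )

IO.inspect(messages, label: "messages")
IO.inspect(opts, label: "opts")
```

Here :max_tokens is overridden to 64 while :temperature and :stop keep their defaults.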
@spec chat_remote(Candil.Model.t(), Candil.Provider.t(), [message()], keyword()) :: {:ok, response()} | {:error, any()}
Runs a chat completion against a remote provider.
Dispatches to the appropriate protocol based on provider.type.
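Dispatch on provider.type can be pictured as one function clause per protocol. The sketch below is an assumption-laden illustration: the type atoms (:openai, :openai_compatible, :anthropic, :ollama), the DispatchExample module, and the error tuple are all hypothetical, and plain maps stand in for Candil.Provider structs. The paths shown are the public chat endpoints of the respective APIs.

```elixir
defmodule DispatchExample do
  # The type atoms below are assumed for illustration; the actual values of
  # provider.type are not specified in this document.
  def chat_path(%{type: :openai}), do: {:ok, "/v1/chat/completions"}
  def chat_path(%{type: :openai_compatible}), do: {:ok, "/v1/chat/completions"}
  def chat_path(%{type: :anthropic}), do: {:ok, "/v1/messages"}
  def chat_path(%{type: :ollama}), do: {:ok, "/api/chat"}

  # Catch-all clause: unknown provider types yield an error tuple.
  def chat_path(%{type: other}), do: {:error, {:unsupported_provider, other}}
end

IO.inspect(DispatchExample.chat_path(%{type: :anthropic}))
IO.inspect(DispatchExample.chat_path(%{type: :unknown}))
```

Pattern matching in function heads keeps each protocol's handling isolated, so adding a provider type means adding a clause rather than editing a conditional.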
Options
Same as chat_local/3.
@spec embed_local(atom(), [binary()], keyword()) :: {:ok, embed_response()} | {:error, any()}
Generates embeddings for a list of texts against a local engine.
The model must have :embeddings in its usage list.
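A capability check of this kind might look like the sketch below. It assumes the model exposes its usage list under a :usage key and uses a made-up error atom; both are illustrative guesses, and a plain map stands in for the real model.

```elixir
defmodule CapabilityExample do
  # Hypothetical check: the :usage key and the error atom are assumptions.
  def check_embeddings(%{usage: usage}) when is_list(usage) do
    if :embeddings in usage do
      :ok
    else
      {:error, :embeddings_not_supported}
    end
  end
end

IO.inspect(CapabilityExample.check_embeddings(%{usage: [:chat, :embeddings]}))
IO.inspect(CapabilityExample.check_embeddings(%{usage: [:chat]}))
```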
@spec embed_remote(Candil.Model.t(), Candil.Provider.t(), [binary()], keyword()) :: {:ok, embed_response()} | {:error, any()}
Generates embeddings for a list of texts against a remote provider.
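Because embed_response() is a list of float vectors, one per input text, a common next step is comparing two vectors. The sketch below computes cosine similarity; the SimilarityExample module is a hypothetical helper, and the vectors are made-up stand-ins rather than real model output.

```elixir
defmodule SimilarityExample do
  # Cosine similarity of two equal-length float vectors.
  # Note: divides by zero if either vector has zero norm.
  def cosine(a, b) when length(a) == length(b) do
    dot = Enum.zip(a, b) |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
    dot / (norm(a) * norm(b))
  end

  # Euclidean norm of a vector.
  defp norm(v) do
    v |> Enum.reduce(0.0, fn x, acc -> acc + x * x end) |> :math.sqrt()
  end
end

# Made-up stand-in for an {:ok, embed_response()} result with two vectors.
{:ok, [first, second]} = {:ok, [[3.0, 4.0], [3.0, 4.0]]}
IO.inspect(SimilarityExample.cosine(first, second))
```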