Ollama (Ollama v0.4.1)
Ollama is a nifty little tool for running large language models locally, and this is a nifty little library for working with Ollama in Elixir.
- API client fully implementing the Ollama API.
- Stream API responses to any Elixir process.
Installation
The package can be installed by adding ollama to your list of dependencies in mix.exs.
def deps do
  [
    {:ollama, "0.4.1"}
  ]
end
Quickstart
API change
The Ollama.API module has been deprecated in favour of the top level Ollama
module. Apologies for the namespace change. Ollama.API will be removed in
version 1.
Assuming you have Ollama running on localhost, and that you have installed a
model, use completion/2 or chat/2 to interact with the model.
1. Generate a completion
iex> client = Ollama.init()
iex> Ollama.completion(client, [
...> model: "llama2",
...> prompt: "Why is the sky blue?",
...> ])
{:ok, %{"response" => "The sky is blue because it is the color of the sky.", ...}}
2. Generate the next message in a chat
iex> client = Ollama.init()
iex> messages = [
...> %{role: "system", content: "You are a helpful assistant."},
...> %{role: "user", content: "Why is the sky blue?"},
...> %{role: "assistant", content: "Due to rayleigh scattering."},
...> %{role: "user", content: "How is that different than mie scattering?"},
...> ]
iex> Ollama.chat(client, [
...> model: "llama2",
...> messages: messages,
...> ])
{:ok, %{"message" => %{
"role" => "assistant",
"content" => "Mie scattering affects all wavelengths similarly, while Rayleigh favors shorter ones."
}, ...}}
Streaming
By default, all endpoints are called with streaming disabled, blocking until
the HTTP request completes and the response body is returned. For endpoints
where streaming is supported, the :stream option can be set to true or a
pid/0. When streaming is enabled, the function returns a Task.t/0, which
asynchronously sends messages back to either the calling process, or the
process associated with the given pid/0.
iex> Ollama.completion(client, [
...> model: "llama2",
...> prompt: "Why is the sky blue?",
...> stream: true,
...> ])
{:ok, %Task{}}
iex> Ollama.chat(client, [
...> model: "llama2",
...> messages: messages,
...> stream: true,
...> ])
{:ok, %Task{}}
Messages will be sent in the following format, allowing the receiving process to pattern match against the pid of the async task if known:
{request_pid, {:data, data}}
The data is a map from the Ollama JSON message. See Ollama API docs.
You could manually create a receive block to handle messages.
receive do
  {^current_message_pid, {:data, %{"done" => true} = data}} ->
    # handle last message
    data

  {^current_message_pid, {:data, data}} ->
    # handle message
    data

  {_pid, _data} ->
    # this message was not expected!
    nil
end
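To collect a full completion outside of a GenServer, a small recursive receive loop can accumulate the streamed chunks into a single string. This is an illustrative sketch only: StreamCollector is a hypothetical helper, and it assumes each completion chunk carries a "response" fragment with the final chunk flagged by "done" => true (see the Ollama API docs).
defmodule StreamCollector do
  # Accumulate "response" fragments until a chunk arrives with "done" => true
  def collect(task_pid, acc \\ "") do
    receive do
      {^task_pid, {:data, %{"done" => true} = data}} ->
        acc <> (data["response"] || "")

      {^task_pid, {:data, data}} ->
        collect(task_pid, acc <> (data["response"] || ""))
    after
      60_000 -> {:error, :timeout}
    end
  end
end

{:ok, task} = Ollama.completion(client, model: "llama2", prompt: "Why is the sky blue?", stream: true)
full_response = StreamCollector.collect(task.pid)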
In most cases you will probably use GenServer.handle_info/2. The following
example shows how a LiveView process may be constructed to both create the
streaming request and receive the streaming messages.
defmodule Ollama.ChatLive do
  use Phoenix.LiveView

  # When the client invokes the "prompt" event, create a streaming request
  # and optionally store the request task into the assigns
  def handle_event("prompt", %{"message" => prompt}, socket) do
    client = Ollama.init()
    {:ok, task} = Ollama.completion(client, [
      model: "llama2",
      prompt: prompt,
      stream: true,
    ])
    {:noreply, assign(socket, current_request: task)}
  end

  # The request task streams messages back to the LiveView process
  def handle_info({_request_pid, {:data, _data}} = message, socket) do
    pid = socket.assigns.current_request.pid

    case message do
      {^pid, {:data, %{"done" => false} = data}} ->
        # handle each streaming chunk
        {:noreply, socket}

      {^pid, {:data, %{"done" => true} = data}} ->
        # handle the final streaming chunk
        {:noreply, socket}

      {_pid, _data} ->
        # this message was not expected!
        {:noreply, socket}
    end
  end
end
Summary
Functions
Generates the next message in a chat using the specified model. Optionally streamable.
Checks whether a blob exists in Ollama by its digest or binary data.
Generates a completion for the given prompt using the specified model. Optionally streamable.
Creates a model with another name from an existing model.
Creates a blob from its binary data.
Creates a model using the given name and model file. Optionally streamable.
Deletes a model and its data.
Generates embeddings from a model for the given prompt.
Creates a new Ollama API client. Accepts either a base URL for the Ollama API,
a keyword list of options passed to Req.new/1, or an existing Req.Request.t/0 struct.
Lists all models that Ollama has available.
Downloads a model from the ollama library. Optionally streamable.
Uploads a model to a model library. Requires registering for ollama.ai and adding a public key first. Optionally streamable.
Shows all information for a specific model.
Types
@type client() :: %Ollama{req: Req.Request.t()}
Client struct
@type message() :: map()
Chat message
A chat message is a map/0 with the following fields:
- :role (String.t/0) - Required. The role of the message, either system, user or assistant.
- :content (String.t/0) - Required. The content of the message.
- :images (list of String.t/0) - (optional) List of Base64 encoded images (for multimodal models only).
Client response
Functions
Generates the next message in a chat using the specified model. Optionally streamable.
Options
- :model (String.t/0) - Required. The ollama model name.
- :messages (list of map/0) - Required. List of messages - used to keep a chat memory.
- :template (String.t/0) - Prompt template, overriding the model default.
- :format (String.t/0) - Set the expected format of the response (json).
- :stream - See section on streaming. The default value is false.
- :keep_alive - How long to keep the model loaded.
- :options - Additional advanced model parameters.
Message structure
Each message is a map with the following fields:
- :role (String.t/0) - Required. The role of the message, either system, user or assistant.
- :content (String.t/0) - Required. The content of the message.
- :images (list of String.t/0) - (optional) List of Base64 encoded images (for multimodal models only).
Examples
iex> messages = [
...> %{role: "system", content: "You are a helpful assistant."},
...> %{role: "user", content: "Why is the sky blue?"},
...> %{role: "assistant", content: "Due to rayleigh scattering."},
...> %{role: "user", content: "How is that different than mie scattering?"},
...> ]
iex> Ollama.chat(client, [
...> model: "llama2",
...> messages: messages,
...> ])
{:ok, %{"message" => %{
"role" => "assistant",
"content" => "Mie scattering affects all wavelengths similarly, while Rayleigh favors shorter ones."
}, ...}}
# Passing true to the :stream option initiates an async streaming request.
iex> Ollama.chat(client, [
...> model: "llama2",
...> messages: messages,
...> stream: true,
...> ])
{:ok, %Task{}}
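Because a message may include an :images list of Base64 encoded images, multimodal models can be prompted with a picture. A minimal sketch, assuming a local file and the "llava" model (both the file path and the model name are placeholders):
# Hypothetical image file; any multimodal model can be used
image = Base.encode64(File.read!("photo.jpg"))

Ollama.chat(client, [
  model: "llava",
  messages: [
    %{role: "user", content: "What is in this picture?", images: [image]}
  ],
])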
@spec check_blob(client(), Ollama.Blob.digest() | binary()) :: response()
Checks whether a blob exists in Ollama by its digest or binary data.
Examples
iex> Ollama.check_blob(client, "sha256:fe938a131f40e6f6d40083c9f0f430a515233eb2edaa6d72eb85c50d64f2300e")
{:ok, true}
iex> Ollama.check_blob(client, "this should not exist")
{:ok, false}
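To derive the digest for local data before checking or uploading it, Erlang's :crypto module can be used. A hedged sketch: the "adapter.bin" path is a placeholder, the digest follows the "sha256:..." format shown above, and only check_blob/2 and create_blob/2 from this module are used.
data = File.read!("adapter.bin")  # hypothetical blob data
digest = "sha256:" <> Base.encode16(:crypto.hash(:sha256, data), case: :lower)

case Ollama.check_blob(client, digest) do
  {:ok, true} -> :ok                               # blob already exists in Ollama
  {:ok, false} -> Ollama.create_blob(client, data) # upload it first
end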
Generates a completion for the given prompt using the specified model. Optionally streamable.
Options
- :model (String.t/0) - Required. The ollama model name.
- :prompt (String.t/0) - Required. Prompt to generate a response for.
- :images (list of String.t/0) - A list of Base64 encoded images to be included with the prompt (for multimodal models only).
- :system (String.t/0) - System prompt, overriding the model default.
- :template (String.t/0) - Prompt template, overriding the model default.
- :context - The context parameter returned from a previous completion/2 call (enabling short conversational memory).
- :format (String.t/0) - Set the expected format of the response (json).
- :stream - See section on streaming. The default value is false.
- :keep_alive - How long to keep the model loaded.
- :options - Additional advanced model parameters.
Examples
iex> Ollama.completion(client, [
...> model: "llama2",
...> prompt: "Why is the sky blue?",
...> ])
{:ok, %{"response": "The sky is blue because it is the color of the sky.", ...}}
# Passing true to the :stream option initiates an async streaming request.
iex> Ollama.completion(client, [
...> model: "llama2",
...> prompt: "Why is the sky blue?",
...> stream: true,
...> ])
{:ok, %Task{}}
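The :context option enables short conversational memory between completion/2 calls. A minimal sketch, assuming the previous non-streaming response map includes a "context" value as described in the Ollama API docs:
{:ok, first} = Ollama.completion(client, model: "llama2", prompt: "Why is the sky blue?")

# Pass the returned context into the next request to continue the conversation
{:ok, followup} = Ollama.completion(client, [
  model: "llama2",
  prompt: "Explain that like I'm five.",
  context: first["context"],
])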
Creates a model with another name from an existing model.
Options
- :source (String.t/0) - Required. Name of the model to copy from.
- :destination (String.t/0) - Required. Name of the model to copy to.
Example
iex> Ollama.copy_model(client, [
...> source: "llama2",
...> destination: "llama2-backup"
...> ])
{:ok, true}
Creates a blob from its binary data.
Example
iex> Ollama.create_blob(client, data)
{:ok, true}
Creates a model using the given name and model file. Optionally streamable.
Any dependent blobs referenced in the modelfile, such as FROM and ADAPTER
instructions, must exist first. See check_blob/2 and create_blob/2.
Options
- :name (String.t/0) - Required. Name of the model to create.
- :modelfile (String.t/0) - Required. Contents of the Modelfile.
- :stream - See section on streaming. The default value is false.
Example
iex> modelfile = "FROM llama2\nSYSTEM \"You are mario from Super Mario Bros.\""
iex> Ollama.create_model(client, [
...> name: "mario",
...> modelfile: modelfile,
...> stream: true,
...> ])
{:ok, %Task{}}
Deletes a model and its data.
Options
- :name (String.t/0) - Required. Name of the model to delete.
Example
iex> Ollama.delete_model(client, name: "llama2")
{:ok, true}
Generates embeddings from a model for the given prompt.
Options
- :model (String.t/0) - Required. The name of the model used to generate the embeddings.
- :prompt (String.t/0) - Required. The prompt used to generate the embedding.
- :keep_alive - How long to keep the model loaded.
- :options - Additional advanced model parameters.
Example
iex> Ollama.embeddings(client, [
...> model: "llama2",
...> prompt: "Here is an article about llamas..."
...> ])
{:ok, %{"embedding" => [
0.5670403838157654, 0.009260174818336964, 0.23178744316101074, -0.2916173040866852, -0.8924556970596313,
0.8785552978515625, -0.34576427936553955, 0.5742510557174683, -0.04222835972905159, -0.137906014919281
]}}
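Embeddings are typically compared with a similarity measure such as cosine similarity. A minimal sketch in plain Elixir; the prompts are illustrative only.
{:ok, %{"embedding" => a}} = Ollama.embeddings(client, model: "llama2", prompt: "Llamas are camelids.")
{:ok, %{"embedding" => b}} = Ollama.embeddings(client, model: "llama2", prompt: "Alpacas are also camelids.")

# Cosine similarity: dot product divided by the product of the vector norms
dot = Enum.zip(a, b) |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
norm = fn v -> :math.sqrt(Enum.reduce(v, 0.0, fn x, acc -> acc + x * x end)) end
similarity = dot / (norm.(a) * norm.(b))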
@spec init(Req.url() | keyword() | Req.Request.t()) :: client()
Creates a new Ollama API client. Accepts either a base URL for the Ollama API,
a keyword list of options passed to Req.new/1, or an existing Req.Request.t/0 struct.
If no arguments are given, the client is initiated with the default options:
@default_req_opts [
  base_url: "http://localhost:11434/api",
  receive_timeout: 60_000,
]
Examples
iex> client = Ollama.init("https://ollama.service.ai:11434/api")
%Ollama{}
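Because the keyword list is passed through to Req.new/1, standard Req options can also be set. A sketch assuming you want a longer receive timeout (base_url and receive_timeout are Req options):
iex> client = Ollama.init(base_url: "http://localhost:11434/api", receive_timeout: 120_000)
%Ollama{}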
Lists all models that Ollama has available.
Example
iex> Ollama.list_models(client)
{:ok, %{"models" => [
%{"name" => "codellama:13b", ...},
%{"name" => "llama2:latest", ...},
]}}
Downloads a model from the ollama library. Optionally streamable.
Options
- :name (String.t/0) - Required. Name of the model to pull.
- :stream - See section on streaming. The default value is false.
Example
iex> Ollama.pull_model(client, name: "llama2")
{:ok, %{"status" => "success"}}
# Passing true to the :stream option initiates an async streaming request.
iex> Ollama.pull_model(client, name: "llama2", stream: true)
{:ok, %Task{}}
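When pulling with stream: true, each streamed chunk reports download progress. A hedged sketch of logging that progress, assuming chunks carry a "status" field and, while downloading, "completed" and "total" byte counts as described in the Ollama API docs (PullLogger is a hypothetical helper):
defmodule PullLogger do
  def loop(pid) do
    receive do
      {^pid, {:data, %{"status" => "success"}}} ->
        IO.puts("pull complete")

      {^pid, {:data, %{"completed" => completed, "total" => total}}} ->
        IO.puts("downloaded #{completed} of #{total} bytes")
        loop(pid)

      {^pid, {:data, %{"status" => status}}} ->
        IO.puts(status)
        loop(pid)
    end
  end
end

{:ok, task} = Ollama.pull_model(client, name: "llama2", stream: true)
PullLogger.loop(task.pid)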
Uploads a model to a model library. Requires registering for ollama.ai and adding a public key first. Optionally streamable.
Options
- :name (String.t/0) - Required. Name of the model to push.
- :stream - See section on streaming. The default value is false.
Example
iex> Ollama.push_model(client, name: "mattw/pygmalion:latest")
{:ok, %{"status" => "success"}}
# Passing true to the :stream option initiates an async streaming request.
iex> Ollama.push_model(client, name: "mattw/pygmalion:latest", stream: true)
{:ok, %Task{}}
Shows all information for a specific model.
Options
- :name (String.t/0) - Required. Name of the model to show.
Example
iex> Ollama.show_model(client, name: "llama2")
{:ok, %{
"details" => %{
"families" => ["llama", "clip"],
"family" => "llama",
"format" => "gguf",
"parameter_size" => "7B",
"quantization_level" => "Q4_0"
},
"modelfile" => "...",
"parameters" => "...",
"template" => "..."
}}