Ollama (Ollama v0.5.2)
Ollama is a nifty little tool for running large language models locally, and this is a nifty little library for working with Ollama in Elixir.
- 🦙 API client fully implementing the Ollama API
- 🛜 Streaming API requests
  - Stream to an Enumerable
  - Or stream messages to any Elixir process
Installation
The package can be installed by adding ollama to your list of dependencies in mix.exs.
def deps do
  [
    {:ollama, "0.5.2"}
  ]
end
Quickstart
API change
The last two minor versions have introduced breaking API changes. We're close to an API that feels nice, so hopefully no more breaking changes 🙏🏻.
0.5.0
- Streaming requests continue to return a Task.t/0 when the :stream option is a pid/0, but now return an Enumerable.t/0 when :stream is true. Refer to the section on Streaming.
0.4.0
- The Ollama.API module has been deprecated in favour of the top level Ollama module. Ollama.API will be removed in version 1.
Assuming you have Ollama running on localhost, and that you have installed a model, use completion/2 or chat/2 to interact with the model.
1. Generate a completion
iex> client = Ollama.init()
iex> Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "Why is the sky blue?"
...> ])
{:ok, %{"response" => "The sky is blue because it is the color of the sky.", ...}}
2. Generate the next message in a chat
iex> client = Ollama.init()
iex> messages = [
...>   %{role: "system", content: "You are a helpful assistant."},
...>   %{role: "user", content: "Why is the sky blue?"},
...>   %{role: "assistant", content: "Due to rayleigh scattering."},
...>   %{role: "user", content: "How is that different than mie scattering?"}
...> ]
iex> Ollama.chat(client, [
...>   model: "llama2",
...>   messages: messages
...> ])
{:ok, %{"message" => %{
  "role" => "assistant",
  "content" => "Mie scattering affects all wavelengths similarly, while Rayleigh favors shorter ones."
}, ...}}
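To keep a conversation going, the assistant's reply has to be appended to the message list before the next call to chat/2. The sketch below shows one way to do that, assuming the response has the shape shown in the example above. ChatHistory and its functions are hypothetical helpers for illustration and are not part of the library.

```elixir
defmodule ChatHistory do
  # Hypothetical helper module, not part of the Ollama library.

  # Appends the assistant's reply from a chat response to the message list.
  # The response is assumed to use string keys, as in the example output.
  def append_reply(messages, %{"message" => %{"role" => role, "content" => content}}) do
    messages ++ [%{role: role, content: content}]
  end

  # Appends the next user turn to the message list.
  def append_user(messages, content) when is_binary(content) do
    messages ++ [%{role: "user", content: content}]
  end
end

# A hand-written response standing in for the result of a real chat request.
response = %{
  "message" => %{
    "role" => "assistant",
    "content" => "Mie scattering affects all wavelengths similarly."
  }
}

history =
  [%{role: "user", content: "How is that different than mie scattering?"}]
  |> ChatHistory.append_reply(response)
  |> ChatHistory.append_user("Which one makes sunsets red?")
```

The resulting history list can then be passed as the :messages option of the next chat request.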
Streaming
On endpoints where streaming is supported, a streaming request can be initiated by setting the :stream option to true or a pid/0.
When :stream is true, a lazy Enumerable.t/0 is returned, which can be used with any Stream functions.
iex> Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "Why is the sky blue?",
...>   stream: true
...> ])
{:ok, stream}
iex> is_function(stream, 2)
true
iex> stream
...> |> Stream.each(&Process.send(pid, &1, []))
...> |> Stream.run()
:ok
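Because the stream is an ordinary Enumerable, the chunks can also be folded into a single string. The following is a minimal sketch that uses a hand-written list in place of a real stream. It assumes each completion chunk carries its text fragment under the "response" key, as in the non-streaming example. StreamText is a hypothetical helper, not part of the library.

```elixir
defmodule StreamText do
  # Hypothetical helper, not part of the Ollama library.
  # Joins the "response" fragment of every chunk into one string.
  # Chunks without a "response" key contribute an empty string.
  def collect(stream) do
    stream
    |> Stream.map(&Map.get(&1, "response", ""))
    |> Enum.join()
  end
end

# Simulated chunks standing in for the Enumerable returned with stream: true.
chunks = [
  %{"response" => "The sky ", "done" => false},
  %{"response" => "is blue.", "done" => false},
  %{"response" => "", "done" => true}
]

text = StreamText.collect(chunks)
```

With a real request, the stream from the {:ok, stream} tuple would be passed to StreamText.collect/1 in the same way.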
Because the above approach builds the Enumerable.t/0 by calling receive, using this approach inside GenServer callbacks may cause the GenServer to misbehave. Instead of setting the :stream option to true, you can set it to a pid/0. A Task.t/0 is returned which will send messages to the specified process.
The example below demonstrates making a streaming request in a LiveView event and sending each of the streaming messages back to the same LiveView process.
defmodule MyApp.ChatLive do
  use Phoenix.LiveView

  # When the client invokes the "prompt" event, create a streaming request and
  # asynchronously send messages back to self.
  def handle_event("prompt", %{"message" => prompt}, socket) do
    {:ok, task} = Ollama.completion(Ollama.init(), [
      model: "llama2",
      prompt: prompt,
      stream: self()
    ])

    {:noreply, assign(socket, current_request: task)}
  end

  # The streaming request sends messages back to the LiveView process.
  def handle_info({_request_pid, {:data, _data}} = message, socket) do
    pid = socket.assigns.current_request.pid

    case message do
      {^pid, {:data, %{"done" => false} = _data}} ->
        # handle each streaming chunk
        {:noreply, socket}

      {^pid, {:data, %{"done" => true} = _data}} ->
        # handle the final streaming chunk
        {:noreply, socket}

      {_pid, _data} ->
        # this message was not expected!
        {:noreply, socket}
    end
  end

  # Tidy up when the request is finished
  def handle_info({ref, {:ok, %Req.Response{status: 200}}}, socket) do
    Process.demonitor(ref, [:flush])
    {:noreply, assign(socket, current_request: nil)}
  end
end
Regardless of which approach to streaming you use, each of the streaming messages is a plain map/0. Refer to the Ollama API docs for the schema.
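Outside of LiveView or GenServer callbacks, for example in a script or a plain process, the same messages can be collected with a receive loop. The sketch below simulates the sender with spawn/1 and assumes the message shape {pid, {:data, map}} used in the LiveView example above. StreamReceiver is a hypothetical helper, not part of the library.

```elixir
defmodule StreamReceiver do
  # Hypothetical helper, not part of the Ollama library.
  # Collects chunks sent as {pid, {:data, chunk}} until a chunk with
  # "done" => true arrives, then returns all chunks in arrival order.
  def collect(pid, acc \\ [], timeout \\ 5_000) do
    receive do
      {^pid, {:data, %{"done" => true} = chunk}} ->
        Enum.reverse([chunk | acc])

      {^pid, {:data, chunk}} ->
        collect(pid, [chunk | acc], timeout)
    after
      timeout -> {:error, :timeout}
    end
  end
end

# Simulated sender standing in for the process behind the returned Task.
parent = self()

sender =
  spawn(fn ->
    send(parent, {self(), {:data, %{"response" => "Hi", "done" => false}}})
    send(parent, {self(), {:data, %{"response" => "", "done" => true}}})
  end)

chunks = StreamReceiver.collect(sender)
```

With a real request, the pid to match on would be the pid field of the returned Task, as in the LiveView example.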
Summary
Functions
chat/2
Generates the next message in a chat using the specified model. Optionally streamable.
check_blob/2
Checks whether a blob exists in ollama by its digest or binary data.
completion/2
Generates a completion for the given prompt using the specified model. Optionally streamable.
copy_model/2
Creates a model with another name from an existing model.
create_blob/2
Creates a blob from its binary data.
create_model/2
Creates a model using the given name and model file. Optionally streamable.
delete_model/2
Deletes a model and its data.
embeddings/2
Generate embeddings from a model for the given prompt.
init/1
Creates a new Ollama API client. Accepts either a base URL for the Ollama API, a keyword list of options passed to Req.new/1, or an existing Req.Request.t/0 struct.
list_models/1
Lists all models that Ollama has available.
pull_model/2
Downloads a model from the ollama library. Optionally streamable.
push_model/2
Upload a model to a model library. Requires registering for ollama.ai and adding a public key first. Optionally streamable.
show_model/2
Shows all information for a specific model.
Types
@type client() :: %Ollama{req: Req.Request.t()}
Client struct
@type message() :: map()
Chat message
A chat message is a map/0 with the following fields:
- :role (String.t/0) - Required. The role of the message, either system, user or assistant.
- :content (String.t/0) - Required. The content of the message.
- :images (list of String.t/0) - (optional) List of Base64 encoded images (for multimodal models only).
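As an illustration of the :images field, the sketch below builds a user message from raw image binaries using Base.encode64/1 from the standard library. ImageMessage is a hypothetical helper, not part of the library, and the four bytes used here are only a stand-in for real image data.

```elixir
defmodule ImageMessage do
  # Hypothetical helper, not part of the Ollama library.
  # Builds a user message whose :images field holds Base64 encoded binaries.
  def build(content, image_binaries)
      when is_binary(content) and is_list(image_binaries) do
    %{
      role: "user",
      content: content,
      images: Enum.map(image_binaries, &Base.encode64/1)
    }
  end
end

# Placeholder bytes; in practice these would come from something like File.read!/1.
fake_image = <<137, 80, 78, 71>>

message = ImageMessage.build("What is in this picture?", [fake_image])
```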
@type response() :: {:ok, map() | boolean() | Enumerable.t() | Task.t()} | {:error, term()}
Client response
Functions
Generates the next message in a chat using the specified model. Optionally streamable.
Options
- :model (String.t/0) - Required. The ollama model name.
- :messages (list of map/0) - Required. List of messages - used to keep a chat memory.
- :template (String.t/0) - Prompt template, overriding the model default.
- :format (String.t/0) - Set the expected format of the response (json).
- :stream - See section on streaming. The default value is false.
- :keep_alive - How long to keep the model loaded.
- :options - Additional advanced model parameters.
Message structure
Each message is a map with the following fields:
- :role (String.t/0) - Required. The role of the message, either system, user or assistant.
- :content (String.t/0) - Required. The content of the message.
- :images (list of String.t/0) - (optional) List of Base64 encoded images (for multimodal models only).
Examples
iex> messages = [
...>   %{role: "system", content: "You are a helpful assistant."},
...>   %{role: "user", content: "Why is the sky blue?"},
...>   %{role: "assistant", content: "Due to rayleigh scattering."},
...>   %{role: "user", content: "How is that different than mie scattering?"}
...> ]
iex> Ollama.chat(client, [
...>   model: "llama2",
...>   messages: messages
...> ])
{:ok, %{"message" => %{
  "role" => "assistant",
  "content" => "Mie scattering affects all wavelengths similarly, while Rayleigh favors shorter ones."
}, ...}}
# Passing true to the :stream option initiates an async streaming request.
iex> Ollama.chat(client, [
...>   model: "llama2",
...>   messages: messages,
...>   stream: true
...> ])
{:ok, %Ollama.Streaming{}}
@spec check_blob(client(), Ollama.Blob.digest() | binary()) :: response()
Checks whether a blob exists in ollama by its digest or binary data.
Examples
iex> Ollama.check_blob(client, "sha256:fe938a131f40e6f6d40083c9f0f430a515233eb2edaa6d72eb85c50d64f2300e")
{:ok, true}
iex> Ollama.check_blob(client, "this should not exist")
{:ok, false}
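The digest in the first example has the form sha256: followed by 64 lowercase hex characters. A string of that form can be computed from binary data with the Erlang :crypto module, as sketched below. BlobDigest is a hypothetical helper, and whether the library derives its digests in exactly this way is an assumption.

```elixir
defmodule BlobDigest do
  # Hypothetical helper, not part of the Ollama library.
  # Returns "sha256:" followed by the lowercase hex SHA-256 hash of the data.
  def digest(data) when is_binary(data) do
    hash = :crypto.hash(:sha256, data)
    "sha256:" <> Base.encode16(hash, case: :lower)
  end
end

digest = BlobDigest.digest("hello")
```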
Generates a completion for the given prompt using the specified model. Optionally streamable.
Options
- :model (String.t/0) - Required. The ollama model name.
- :prompt (String.t/0) - Required. Prompt to generate a response for.
- :images (list of String.t/0) - A list of Base64 encoded images to be included with the prompt (for multimodal models only).
- :system (String.t/0) - System prompt, overriding the model default.
- :template (String.t/0) - Prompt template, overriding the model default.
- :context - The context parameter returned from a previous completion/2 call (enabling short conversational memory).
- :format (String.t/0) - Set the expected format of the response (json).
- :raw (boolean/0) - Set true if specifying a fully templated prompt. (:template is ignored)
- :stream - See section on streaming. The default value is false.
- :keep_alive - How long to keep the model loaded.
- :options - Additional advanced model parameters.
Examples
iex> Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "Why is the sky blue?"
...> ])
{:ok, %{"response" => "The sky is blue because it is the color of the sky.", ...}}
# Passing true to the :stream option initiates an async streaming request.
iex> Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "Why is the sky blue?",
...>   stream: true
...> ])
{:ok, %Ollama.Streaming{}}
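The :context option described above can be filled from the previous response to give completion/2 a short conversational memory. The sketch below shows the idea with a hand-written response. It assumes the response map carries the value under a "context" key, and the list used here is a placeholder. CompletionContext is a hypothetical helper, not part of the library.

```elixir
defmodule CompletionContext do
  # Hypothetical helper, not part of the Ollama library.

  # Copies "context" from a previous response into the next request params.
  def carry(params, %{"context" => context}) when is_list(params) do
    Keyword.put(params, :context, context)
  end

  # Leaves the params unchanged when the previous response has no context.
  def carry(params, _previous) when is_list(params), do: params
end

# Hand-written response; the "context" value here is a placeholder.
previous = %{"response" => "Due to Rayleigh scattering.", "context" => [1, 2, 3]}

params =
  CompletionContext.carry([model: "llama2", prompt: "And at sunset?"], previous)
```

The resulting params keyword list can be passed as the second argument of the next completion request.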
Creates a model with another name from an existing model.
Options
- :source (String.t/0) - Required. Name of the model to copy from.
- :destination (String.t/0) - Required. Name of the model to copy to.
Example
iex> Ollama.copy_model(client, [
...> source: "llama2",
...> destination: "llama2-backup"
...> ])
{:ok, true}
Creates a blob from its binary data.
Example
iex> Ollama.create_blob(client, data)
{:ok, true}
Creates a model using the given name and model file. Optionally streamable.
Any dependent blobs referenced in the modelfile, such as FROM and ADAPTER instructions, must exist first. See check_blob/2 and create_blob/2.
Options
- :name (String.t/0) - Required. Name of the model to create.
- :modelfile (String.t/0) - Required. Contents of the Modelfile.
- :stream - See section on streaming. The default value is false.
Example
iex> modelfile = "FROM llama2\nSYSTEM \"You are mario from Super Mario Bros.\""
iex> Ollama.create_model(client, [
...>   name: "mario",
...>   modelfile: modelfile,
...>   stream: true
...> ])
{:ok, %Ollama.Streaming{}}
Deletes a model and its data.
Options
- :name (String.t/0) - Required. Name of the model to delete.
Example
iex> Ollama.delete_model(client, name: "llama2")
{:ok, true}
Generate embeddings from a model for the given prompt.
Options
- :model (String.t/0) - Required. The name of the model used to generate the embeddings.
- :prompt (String.t/0) - Required. The prompt used to generate the embedding.
- :keep_alive - How long to keep the model loaded.
- :options - Additional advanced model parameters.
Example
iex> Ollama.embeddings(client, [
...> model: "llama2",
...> prompt: "Here is an article about llamas..."
...> ])
{:ok, %{"embedding" => [
0.5670403838157654, 0.009260174818336964, 0.23178744316101074, -0.2916173040866852, -0.8924556970596313,
0.8785552978515625, -0.34576427936553955, 0.5742510557174683, -0.04222835972905159, -0.137906014919281
]}}
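A common use for the returned "embedding" list is comparing two texts by cosine similarity. The following is a minimal sketch in plain Elixir with short hand-written vectors in place of real embeddings. Similarity is a hypothetical helper, not part of the library, and it does not guard against zero vectors.

```elixir
defmodule Similarity do
  # Hypothetical helper, not part of the Ollama library.
  # Cosine similarity of two equal-length lists of numbers.
  # Raises ArithmeticError if either vector has zero length (norm 0).
  def cosine(a, b) when length(a) == length(b) do
    dot =
      a
      |> Enum.zip(b)
      |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)

    dot / (norm(a) * norm(b))
  end

  # Euclidean norm of a vector.
  defp norm(v) do
    v
    |> Enum.reduce(0.0, fn x, acc -> acc + x * x end)
    |> :math.sqrt()
  end
end

same = Similarity.cosine([1.0, 0.0], [1.0, 0.0])
orthogonal = Similarity.cosine([1.0, 0.0], [0.0, 1.0])
```

With real responses, the two lists would be taken from the "embedding" key of two separate embeddings results.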
@spec init(Req.url() | keyword() | Req.Request.t()) :: client()
Creates a new Ollama API client. Accepts either a base URL for the Ollama API, a keyword list of options passed to Req.new/1, or an existing Req.Request.t/0 struct.
If no arguments are given, the client is initiated with the default options:
@default_req_opts [
  base_url: "http://localhost:11434/api",
  receive_timeout: 60_000
]
Examples
iex> client = Ollama.init("https://ollama.service.ai:11434/api")
%Ollama{}
Lists all models that Ollama has available.
Example
iex> Ollama.list_models(client)
{:ok, %{"models" => [
%{"name" => "codellama:13b", ...},
%{"name" => "llama2:latest", ...},
]}}
Downloads a model from the ollama library. Optionally streamable.
Options
- :name (String.t/0) - Required. Name of the model to pull.
- :stream - See section on streaming. The default value is false.
Example
iex> Ollama.pull_model(client, name: "llama2")
{:ok, %{"status" => "success"}}
# Passing true to the :stream option initiates an async streaming request.
iex> Ollama.pull_model(client, name: "llama2", stream: true)
{:ok, %Ollama.Streaming{}}
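When pulling with streaming enabled, each message can be turned into a progress figure. The sketch below assumes that download chunks carry integer "total" and "completed" fields alongside "status", which is how the Ollama API docs describe pull progress; treat those field names as an assumption and check them against the schema. PullProgress is a hypothetical helper, not part of the library.

```elixir
defmodule PullProgress do
  # Hypothetical helper, not part of the Ollama library.

  # Returns the download progress as an integer percentage.
  def percent(%{"total" => total, "completed" => completed})
      when is_integer(total) and is_integer(completed) and total > 0 do
    div(completed * 100, total)
  end

  # Chunks without usable progress fields (e.g. status-only) yield nil.
  def percent(_chunk), do: nil
end

# Hand-written chunks standing in for streamed pull messages.
downloading = %{"status" => "downloading", "total" => 200, "completed" => 50}
finished = %{"status" => "success"}

progress = PullProgress.percent(downloading)
```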
Upload a model to a model library. Requires registering for ollama.ai and adding a public key first. Optionally streamable.
Options
- :name (String.t/0) - Required. Name of the model to push.
- :stream - See section on streaming. The default value is false.
Example
iex> Ollama.push_model(client, name: "mattw/pygmalion:latest")
{:ok, %{"status" => "success"}}
# Passing true to the :stream option initiates an async streaming request.
iex> Ollama.push_model(client, name: "mattw/pygmalion:latest", stream: true)
{:ok, %Ollama.Streaming{}}
Shows all information for a specific model.
Options
- :name (String.t/0) - Required. Name of the model to show.
Example
iex> Ollama.show_model(client, name: "llama2")
{:ok, %{
"details" => %{
"families" => ["llama", "clip"],
"family" => "llama",
"format" => "gguf",
"parameter_size" => "7B",
"quantization_level" => "Q4_0"
},
"modelfile" => "...",
"parameters" => "...",
"template" => "..."
}}