Ollama (Ollama v0.4.1)

Ollama-ex


Ollama is a nifty little tool for running large language models locally, and this is a nifty little library for working with Ollama in Elixir.

  • API client fully implementing the Ollama API.
  • Stream API responses to any Elixir process.

Installation

The package can be installed by adding ollama to your list of dependencies in mix.exs.

def deps do
  [
    {:ollama, "~> 0.4"}
  ]
end

Quickstart

API change

The Ollama.API module has been deprecated in favour of the top level Ollama module. Apologies for the namespace change. Ollama.API will be removed in version 1.

Assuming you have Ollama running on localhost and have installed a model, use completion/2 or chat/2 to interact with the model.

1. Generate a completion

iex> client = Ollama.init()

iex> Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "Why is the sky blue?"
...> ])
{:ok, %{"response" => "The sky is blue because it is the color of the sky.", ...}}

2. Generate the next message in a chat

iex> client = Ollama.init()
iex> messages = [
...>   %{role: "system", content: "You are a helpful assistant."},
...>   %{role: "user", content: "Why is the sky blue?"},
...>   %{role: "assistant", content: "Due to rayleigh scattering."},
...>   %{role: "user", content: "How is that different than mie scattering?"},
...> ]

iex> Ollama.chat(client, [
...>   model: "llama2",
...>   messages: messages
...> ])
{:ok, %{"message" => %{
  "role" => "assistant",
  "content" => "Mie scattering affects all wavelengths similarly, while Rayleigh favors shorter ones."
}, ...}}

Streaming

By default, all endpoints are called with streaming disabled, blocking until the HTTP request completes and the response body is returned. For endpoints where streaming is supported, the :stream option can be set to true or a pid/0. When streaming is enabled, the function returns a Task.t/0, which asynchronously sends messages back to either the calling process, or the process associated with the given pid/0.

iex> Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "Why is the sky blue?",
...>   stream: true
...> ])
{:ok, %Task{}}

iex> Ollama.chat(client, [
...>   model: "llama2",
...>   messages: messages,
...>   stream: true
...> ])
{:ok, %Task{}}
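
Because the :stream option also accepts a pid/0, chunks can be delivered to a process other than the caller. A minimal sketch (the ChunkPrinter module is illustrative, and it assumes completion chunks carry a "response" field, as described below):

defmodule ChunkPrinter do
  # Prints each streamed completion chunk until the final "done" chunk arrives.
  def loop do
    receive do
      {_pid, {:data, %{"done" => true}}} ->
        :done

      {_pid, {:data, chunk}} ->
        IO.write(chunk["response"])
        loop()
    end
  end
end

printer = spawn(&ChunkPrinter.loop/0)

{:ok, _task} = Ollama.completion(client, [
  model: "llama2",
  prompt: "Why is the sky blue?",
  stream: printer
])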

Messages will be sent in the following format, allowing the receiving process to pattern match against the pid of the async task if known:

{request_pid, {:data, data}}

The data is a map from the Ollama JSON message. See Ollama API docs.

You could manually create a receive block to handle messages.

receive do
  {^current_message_pid, {:data, %{"done" => true} = data}} ->
    # handle last message
    IO.inspect(data, label: "final chunk")

  {^current_message_pid, {:data, data}} ->
    # handle message
    IO.inspect(data, label: "chunk")

  {_pid, _data} ->
    # this message was not expected!
    :ignored
end
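
To collect a full response in the calling process, the same receive pattern can be looped until the final chunk arrives. A minimal sketch (the module and function names are illustrative; it assumes completion chunks carry a "response" field, as in the examples above):

defmodule CompletionCollector do
  # Receives streamed chunks from the request task and concatenates their
  # "response" fields, returning the full text once the final chunk arrives.
  def collect(%Task{pid: pid} = task, acc \\ "") do
    receive do
      {^pid, {:data, %{"done" => true} = chunk}} ->
        acc <> Map.get(chunk, "response", "")

      {^pid, {:data, chunk}} ->
        collect(task, acc <> Map.get(chunk, "response", ""))
    end
  end
end

{:ok, task} = Ollama.completion(client, [
  model: "llama2",
  prompt: "Why is the sky blue?",
  stream: true
])

text = CompletionCollector.collect(task)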

In most cases you will probably handle the messages in a GenServer.handle_info/2 callback. The following example shows how a LiveView process may be constructed to both create the streaming request and receive the streaming messages.

defmodule Ollama.ChatLive do
  use Phoenix.LiveView

  # When the client invokes the "prompt" event, create a streaming request
  # and optionally store the request task into the assigns
  def handle_event("prompt", %{"message" => prompt}, socket) do
    client = Ollama.init()
    {:ok, task} = Ollama.completion(client, [
      model: "llama2",
      prompt: prompt,
      stream: true
    ])

    {:noreply, assign(socket, current_request: task)}
  end

  # The request task streams messages back to the LiveView process
  def handle_info({_request_pid, {:data, _data}} = message, socket) do
    pid = socket.assigns.current_request.pid

    case message do
      {^pid, {:data, %{"done" => false} = data}} ->
        # handle each streaming chunk
        {:noreply, socket}

      {^pid, {:data, %{"done" => true} = data}} ->
        # handle the final streaming chunk
        {:noreply, socket}

      {_pid, _data} ->
        # this message was not expected!
        {:noreply, socket}
    end
  end
end
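
Since the streaming request is an ordinary Task.t/0, standard Task functions apply to it. For example, an in-flight request could be stopped when the user cancels or sends a new prompt (a sketch using Task.shutdown/2 from Elixir's standard library, not a library-specific API):

# Stop an in-flight streaming request, e.g. when the user sends a new prompt.
if task = socket.assigns[:current_request] do
  Task.shutdown(task, :brutal_kill)
end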

Summary

Types

  • client() - Client struct
  • message() - Chat message
  • response() - Client response

Functions

  • chat(client, params) - Generates the next message in a chat using the specified model. Optionally streamable.
  • check_blob(client, digest) - Checks whether a blob exists in Ollama by its digest or binary data.
  • completion(client, params) - Generates a completion for the given prompt using the specified model. Optionally streamable.
  • copy_model(client, params) - Creates a model with another name from an existing model.
  • create_blob(client, blob) - Creates a blob from its binary data.
  • create_model(client, params) - Creates a model using the given name and model file. Optionally streamable.
  • delete_model(client, params) - Deletes a model and its data.
  • embeddings(client, params) - Generates embeddings from a model for the given prompt.
  • init(opts \\ []) - Creates a new Ollama API client. Accepts either a base URL for the Ollama API, a keyword list of options passed to Req.new/1, or an existing Req.Request.t/0 struct.
  • list_models(client) - Lists all models that Ollama has available.
  • pull_model(client, params) - Downloads a model from the ollama library. Optionally streamable.
  • push_model(client, params) - Uploads a model to a model library. Requires registering for ollama.ai and adding a public key first. Optionally streamable.
  • show_model(client, params) - Shows all information for a specific model.

Types

@type client() :: %Ollama{req: Req.Request.t()}

Client struct

@type message() :: map()

Chat message

A chat message is a map/0 with the following fields:

  • :role (String.t/0) - Required. The role of the message, either system, user or assistant.
  • :content (String.t/0) - Required. The content of the message.
  • :images (list of String.t/0) - (optional) List of Base64 encoded images (for multimodal models only).
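
For example, a multimodal user message might look like this (the image path is illustrative):

%{
  role: "user",
  content: "Describe this image for me",
  images: [Base.encode64(File.read!("path/to/photo.jpg"))]
}
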
@type response() :: {:ok, Task.t() | map() | boolean()} | {:error, term()}

Client response

Functions

chat(client, params)

@spec chat(
  client(),
  keyword()
) :: response()

Generates the next message in a chat using the specified model. Optionally streamable.

Options

  • :model (String.t/0) - Required. The ollama model name.
  • :messages (list of map/0) - Required. List of messages - used to keep a chat memory.
  • :template (String.t/0) - Prompt template, overriding the model default.
  • :format (String.t/0) - Set the expected format of the response (json).
  • :stream - See section on streaming. The default value is false.
  • :keep_alive - How long to keep the model loaded.
  • :options - Additional advanced model parameters.

Message structure

Each message is a map with the following fields:

  • :role (String.t/0) - Required. The role of the message, either system, user or assistant.
  • :content (String.t/0) - Required. The content of the message.
  • :images (list of String.t/0) - (optional) List of Base64 encoded images (for multimodal models only).

Examples

iex> messages = [
...>   %{role: "system", content: "You are a helpful assistant."},
...>   %{role: "user", content: "Why is the sky blue?"},
...>   %{role: "assistant", content: "Due to rayleigh scattering."},
...>   %{role: "user", content: "How is that different than mie scattering?"},
...> ]

iex> Ollama.chat(client, [
...>   model: "llama2",
...>   messages: messages
...> ])
{:ok, %{"message" => %{
  "role" => "assistant",
  "content" => "Mie scattering affects all wavelengths similarly, while Rayleigh favors shorter ones."
}, ...}}

# Passing true to the :stream option initiates an async streaming request.
iex> Ollama.chat(client, [
...>   model: "llama2",
...>   messages: messages,
...>   stream: true
...> ])
{:ok, %Task{}}

check_blob(client, digest)

@spec check_blob(client(), Ollama.Blob.digest() | binary()) :: response()

Checks whether a blob exists in Ollama by its digest or binary data.

Examples

iex> Ollama.check_blob(client, "sha256:fe938a131f40e6f6d40083c9f0f430a515233eb2edaa6d72eb85c50d64f2300e")
{:ok, true}

iex> Ollama.check_blob(client, "this should not exist")
{:ok, false}

completion(client, params)

@spec completion(
  client(),
  keyword()
) :: response()

Generates a completion for the given prompt using the specified model. Optionally streamable.

Options

  • :model (String.t/0) - Required. The ollama model name.
  • :prompt (String.t/0) - Required. Prompt to generate a response for.
  • :images (list of String.t/0) - A list of Base64 encoded images to be included with the prompt (for multimodal models only).
  • :system (String.t/0) - System prompt, overriding the model default.
  • :template (String.t/0) - Prompt template, overriding the model default.
  • :context - The context parameter returned from a previous completion/2 call, enabling short conversational memory (see the example below).
  • :format (String.t/0) - Set the expected format of the response (json).
  • :stream - See section on streaming. The default value is false.
  • :keep_alive - How long to keep the model loaded.
  • :options - Additional advanced model parameters.

Examples

iex> Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "Why is the sky blue?"
...> ])
{:ok, %{"response": "The sky is blue because it is the color of the sky.", ...}}

# Passing true to the :stream option initiates an async streaming request.
iex> Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "Why is the sky blue?",
...>   stream: true
...> ])
{:ok, %Task{}}
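
As noted in the options above, the :context value returned from one completion can be threaded into the next call to give the model short conversational memory. A rough sketch, assuming the non-streamed response includes the "context" field described in the Ollama API docs:

iex> {:ok, first} = Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "Why is the sky blue?"
...> ])

iex> Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "How is that different than mie scattering?",
...>   context: first["context"]
...> ])
{:ok, %{"response" => "...", ...}}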

copy_model(client, params)

@spec copy_model(
  client(),
  keyword()
) :: response()

Creates a model with another name from an existing model.

Options

  • :source (String.t/0) - Required. Name of the model to copy from.
  • :destination (String.t/0) - Required. Name of the model to copy to.

Example

iex> Ollama.copy_model(client, [
...>   source: "llama2",
...>   destination: "llama2-backup"
...> ])
{:ok, true}

create_blob(client, blob)

@spec create_blob(client(), binary()) :: response()

Creates a blob from its binary data.

Example

iex> Ollama.create_blob(client, data)
{:ok, true}

create_model(client, params)

@spec create_model(
  client(),
  keyword()
) :: response()

Creates a model using the given name and model file. Optionally streamable.

Any dependent blobs referenced in the modelfile, such as in FROM and ADAPTER instructions, must exist first. See check_blob/2 and create_blob/2.
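
For example, a dependent blob could be checked and uploaded before the model is created (a sketch; the file path is illustrative and the digest format follows the check_blob/2 example above):

data = File.read!("path/to/adapter.bin")
digest = "sha256:" <> Base.encode16(:crypto.hash(:sha256, data), case: :lower)

case Ollama.check_blob(client, digest) do
  {:ok, true} -> :ok                                # blob already exists on the server
  {:ok, false} -> Ollama.create_blob(client, data)  # upload it first
end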

Options

  • :name (String.t/0) - Required. Name of the model to create.
  • :modelfile (String.t/0) - Contents of the Modelfile.
  • :stream - See section on streaming. The default value is false.

Example

iex> modelfile = "FROM llama2\nSYSTEM \"You are mario from Super Mario Bros.\""
iex> Ollama.create_model(client, [
...>   name: "mario",
...>   modelfile: modelfile,
...>   stream: true
...> ])
{:ok, %Task{}}

delete_model(client, params)

@spec delete_model(
  client(),
  keyword()
) :: response()

Deletes a model and its data.

Options

  • :name (String.t/0) - Required. Name of the model to delete.

Example

iex> Ollama.delete_model(client, name: "llama2")
{:ok, true}

embeddings(client, params)

@spec embeddings(
  client(),
  keyword()
) :: response()

Generates embeddings from a model for the given prompt.

Options

  • :model (String.t/0) - Required. The name of the model used to generate the embeddings.
  • :prompt (String.t/0) - Required. The prompt used to generate the embedding.
  • :keep_alive - How long to keep the model loaded.
  • :options - Additional advanced model parameters.

Example

iex> Ollama.embeddings(client, [
...>   model: "llama2",
...>   prompt: "Here is an article about llamas..."
...> ])
{:ok, %{"embedding" => [
  0.5670403838157654, 0.009260174818336964, 0.23178744316101074, -0.2916173040866852, -0.8924556970596313,
  0.8785552978515625, -0.34576427936553955, 0.5742510557174683, -0.04222835972905159, -0.137906014919281
]}}
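
The returned embedding is a plain list of floats, so it can be compared with ordinary Elixir code. A minimal sketch (the Similarity module is illustrative, not part of this library) computing cosine similarity between two embeddings:

defmodule Similarity do
  # Cosine similarity between two equal-length lists of floats.
  def cosine(a, b) do
    dot =
      a
      |> Enum.zip(b)
      |> Enum.map(fn {x, y} -> x * y end)
      |> Enum.sum()

    norm = fn v -> v |> Enum.map(&(&1 * &1)) |> Enum.sum() |> :math.sqrt() end

    dot / (norm.(a) * norm.(b))
  end
end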

init(opts \\ [])

@spec init(Req.url() | keyword() | Req.Request.t()) :: client()

Creates a new Ollama API client. Accepts either a base URL for the Ollama API, a keyword list of options passed to Req.new/1, or an existing Req.Request.t/0 struct.

If no arguments are given, the client is initiated with the default options:

@default_req_opts [
  base_url: "http://localhost:11434/api",
  receive_timeout: 60_000
]

Examples

iex> client = Ollama.init("https://ollama.service.ai:11434/api")
%Ollama{}
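
Options can also be passed through to Req.new/1, for example to raise the receive timeout for slow models (receive_timeout is a standard Req option, shown in the defaults above):

iex> client = Ollama.init(base_url: "http://localhost:11434/api", receive_timeout: 120_000)
%Ollama{}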

list_models(client)

@spec list_models(client()) :: response()

Lists all models that Ollama has available.

Example

iex> Ollama.list_models(client)
{:ok, %{"models" => [
  %{"name" => "codellama:13b", ...},
  %{"name" => "llama2:latest", ...},
]}}

pull_model(client, params)

@spec pull_model(
  client(),
  keyword()
) :: response()

Downloads a model from the ollama library. Optionally streamable.

Options

  • :name (String.t/0) - Required. Name of the model to pull.
  • :stream - See section on streaming. The default value is false.

Example

iex> Ollama.pull_model(client, name: "llama2")
{:ok, %{"status" => "success"}}

# Passing true to the :stream option initiates an async streaming request.
iex> Ollama.pull_model(client, name: "llama2", stream: true)
{:ok, %Task{}}

push_model(client, params)

@spec push_model(
  client(),
  keyword()
) :: response()

Uploads a model to a model library. Requires registering for ollama.ai and adding a public key first. Optionally streamable.

Options

  • :name (String.t/0) - Required. Name of the model to push.
  • :stream - See section on streaming. The default value is false.

Example

iex> Ollama.push_model(client, name: "mattw/pygmalion:latest")
{:ok, %{"status" => "success"}}

# Passing true to the :stream option initiates an async streaming request.
iex> Ollama.push_model(client, name: "mattw/pygmalion:latest", stream: true)
{:ok, %Task{}}

show_model(client, params)

@spec show_model(
  client(),
  keyword()
) :: response()

Shows all information for a specific model.

Options

  • :name (String.t/0) - Required. Name of the model to show.

Example

iex> Ollama.show_model(client, name: "llama2")
{:ok, %{
  "details" => %{
    "families" => ["llama", "clip"],
    "family" => "llama",
    "format" => "gguf",
    "parameter_size" => "7B",
    "quantization_level" => "Q4_0"
  },
  "modelfile" => "...",
  "parameters" => "...",
  "template" => "..."
}}