Ollama (Ollama v0.6.0)

Ollama-ex


Ollama is a nifty little tool for running large language models locally, and this is a nifty little library for working with Ollama in Elixir.

  • 🦙 API client fully implementing the Ollama API
  • 🛜 Streaming API requests
    • Stream to an Enumerable
    • Or stream messages to any Elixir process

Installation

The package can be installed by adding ollama to your list of dependencies in mix.exs.

def deps do
  [
    {:ollama, "0.6.0"}
  ]
end

Quickstart

Assuming you have Ollama running on localhost, and that you have installed a model, use completion/2 or chat/2 to interact with the model.

1. Generate a completion

iex> client = Ollama.init()

iex> Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "Why is the sky blue?"
...> ])
{:ok, %{"response" => "The sky is blue because it is the color of the sky.", ...}}

2. Generate the next message in a chat

iex> client = Ollama.init()
iex> messages = [
...>   %{role: "system", content: "You are a helpful assistant."},
...>   %{role: "user", content: "Why is the sky blue?"},
...>   %{role: "assistant", content: "Due to rayleigh scattering."},
...>   %{role: "user", content: "How is that different than mie scattering?"},
...> ]

iex> Ollama.chat(client, [
...>   model: "llama2",
...>   messages: messages
...> ])
{:ok, %{"message" => %{
  "role" => "assistant",
  "content" => "Mie scattering affects all wavelengths similarly, while Rayleigh favors shorter ones."
}, ...}}

Streaming

On endpoints where streaming is supported, a streaming request can be initiated by setting the :stream option to true or a pid/0.

When :stream is true, a lazy Enumerable.t/0 is returned, which can be used with any Stream functions.

iex> Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "Why is the sky blue?",
...>   stream: true
...> ])
{:ok, stream}

iex> is_function(stream, 2)
true

iex> stream
...> |> Stream.each(& Process.send(pid, &1, []))
...> |> Stream.run()
:ok
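
You can also reduce the stream into the full response text. A sketch, assuming each chunk is a map carrying a "response" key, as returned by the completion endpoint:

iex> {:ok, stream} = Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "Why is the sky blue?",
...>   stream: true
...> ])

iex> stream
...> |> Stream.map(&Map.get(&1, "response", ""))
...> |> Enum.join()
"The sky is blue because it is the color of the sky."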

Because this approach builds the Enumerable.t/0 by calling receive under the hood, using it inside GenServer callbacks may cause the GenServer to misbehave. Instead of setting the :stream option to true, you can set it to a pid/0. A Task.t/0 is then returned which will send messages to the specified process.

The example below demonstrates making a streaming request in a LiveView event, and sends each of the streaming messages back to the same LiveView process.

defmodule MyApp.ChatLive do
  use Phoenix.LiveView

  # When the client invokes the "prompt" event, create a streaming request and
  # asynchronously send messages back to self.
  def handle_event("prompt", %{"message" => prompt}, socket) do
    {:ok, task} = Ollama.completion(Ollama.init(), [
      model: "llama2",
      prompt: prompt,
      stream: self()
    ])

    {:noreply, assign(socket, current_request: task)}
  end

  # The streaming request sends messages back to the LiveView process.
  def handle_info({_request_pid, {:data, _data}} = message, socket) do
    pid = socket.assigns.current_request.pid

    socket =
      case message do
        {^pid, {:data, %{"done" => false} = _data}} ->
          # handle each streaming chunk
          socket

        {^pid, {:data, %{"done" => true} = _data}} ->
          # handle the final streaming chunk
          socket

        {_pid, _data} ->
          # this message was not expected!
          socket
      end

    {:noreply, socket}
  end

  # Tidy up when the request is finished
  def handle_info({ref, {:ok, %Req.Response{status: 200}}}, socket) do
    Process.demonitor(ref, [:flush])
    {:noreply, assign(socket, current_request: nil)}
  end
end

Regardless of which approach to streaming you use, each streaming message is a plain map/0. Refer to the Ollama API docs for the schema.
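
For example, a completion chunk decodes to something roughly like the map below (field names per the Ollama API; the exact keys vary by endpoint and Ollama version). The final chunk has "done" set to true along with summary fields such as timing statistics.

%{"model" => "llama2", "response" => "The", "done" => false}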

Summary

Types

client() - Client struct

message() - Chat message

response() - Client response

Functions

chat(client, params) - Generates the next message in a chat using the specified model. Optionally streamable.

check_blob(client, digest) - Checks whether a blob exists in Ollama by its digest or binary data.

completion(client, params) - Generates a completion for the given prompt using the specified model. Optionally streamable.

copy_model(client, params) - Creates a model with another name from an existing model.

create_blob(client, blob) - Creates a blob from its binary data.

create_model(client, params) - Creates a model using the given name and model file. Optionally streamable.

delete_model(client, params) - Deletes a model and its data.

embeddings(client, params) - Generates embeddings from a model for the given prompt.

init(opts) - Creates a new Ollama API client. Accepts either a base URL for the Ollama API, a keyword list of options passed to Req.new/1, or an existing Req.Request.t/0 struct.

list_models(client) - Lists all models that Ollama has available.

pull_model(client, params) - Downloads a model from the ollama library. Optionally streamable.

push_model(client, params) - Uploads a model to a model library. Requires registering for ollama.ai and adding a public key first. Optionally streamable.

show_model(client, params) - Shows all information for a specific model.

Types

@type client() :: %Ollama{req: Req.Request.t()}

Client struct

@type message() :: map()

Chat message

A chat message is a map/0 with the following fields:

  • :role (String.t/0) - Required. The role of the message, either system, user or assistant.
  • :content (String.t/0) - Required. The content of the message.
  • :images (list of String.t/0) - (optional) List of Base64 encoded images (for multimodal models only).
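
For example, a message for a multimodal model might look like this (the Base64 image data is truncated for illustration):

%{
  role: "user",
  content: "Describe this image for me.",
  images: ["iVBORw0KGgoAAAANSUhEUg..."]
}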
@type response() ::
  {:ok, map() | boolean() | Enumerable.t() | Task.t()} | {:error, term()}

Client response

Functions

chat(client, params)

@spec chat(
  client(),
  keyword()
) :: response()

Generates the next message in a chat using the specified model. Optionally streamable.

Options

  • :model (String.t/0) - Required. The ollama model name.
  • :messages (list of map/0) - Required. List of messages - used to keep a chat memory.
  • :template (String.t/0) - Prompt template, overriding the model default.
  • :format (String.t/0) - Set the expected format of the response (json).
  • :stream - See section on streaming. The default value is false.
  • :keep_alive - How long to keep the model loaded.
  • :options - Additional advanced model parameters.

Message structure

Each message is a map with the following fields:

  • :role (String.t/0) - Required. The role of the message, either system, user or assistant.
  • :content (String.t/0) - Required. The content of the message.
  • :images (list of String.t/0) - (optional) List of Base64 encoded images (for multimodal models only).

Examples

iex> messages = [
...>   %{role: "system", content: "You are a helpful assistant."},
...>   %{role: "user", content: "Why is the sky blue?"},
...>   %{role: "assistant", content: "Due to rayleigh scattering."},
...>   %{role: "user", content: "How is that different than mie scattering?"},
...> ]

iex> Ollama.chat(client, [
...>   model: "llama2",
...>   messages: messages
...> ])
{:ok, %{"message" => %{
  "role" => "assistant",
  "content" => "Mie scattering affects all wavelengths similarly, while Rayleigh favors shorter ones."
}, ...}}

# Passing true to the :stream option initiates an async streaming request.
iex> Ollama.chat(client, [
...>   model: "llama2",
...>   messages: messages,
...>   stream: true
...> ])
{:ok, %Ollama.Streaming{}}

check_blob(client, digest)

@spec check_blob(client(), Ollama.Blob.digest() | binary()) :: response()

Checks whether a blob exists in Ollama by its digest or binary data.

Examples

iex> Ollama.check_blob(client, "sha256:fe938a131f40e6f6d40083c9f0f430a515233eb2edaa6d72eb85c50d64f2300e")
{:ok, true}

iex> Ollama.check_blob(client, "this should not exist")
{:ok, false}
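
If you only have the binary data, the digest can be derived with the Erlang :crypto module. A sketch (the file path is hypothetical):

iex> data = File.read!("/path/to/model.gguf")
iex> digest = "sha256:" <> Base.encode16(:crypto.hash(:sha256, data), case: :lower)
iex> Ollama.check_blob(client, digest)
{:ok, false}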

completion(client, params)

@spec completion(
  client(),
  keyword()
) :: response()

Generates a completion for the given prompt using the specified model. Optionally streamable.

Options

  • :model (String.t/0) - Required. The ollama model name.
  • :prompt (String.t/0) - Required. Prompt to generate a response for.
  • :images (list of String.t/0) - A list of Base64 encoded images to be included with the prompt (for multimodal models only).
  • :system (String.t/0) - System prompt, overriding the model default.
  • :template (String.t/0) - Prompt template, overriding the model default.
  • :context - The context parameter returned from a previous completion/2 call (enabling short conversational memory).
  • :format (String.t/0) - Set the expected format of the response (json).
  • :raw (boolean/0) - Set true if specifying a fully templated prompt (:template is ignored).
  • :stream - See section on streaming. The default value is false.
  • :keep_alive - How long to keep the model loaded.
  • :options - Additional advanced model parameters.

Examples

iex> Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "Why is the sky blue?"
...> ])
{:ok, %{"response": "The sky is blue because it is the color of the sky.", ...}}

# Passing true to the :stream option initiates an async streaming request.
iex> Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "Why is the sky blue?",
...>   stream: true
...> ])
{:ok, %Ollama.Streaming{}}
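
The :context option can carry short conversational memory between calls. A sketch, assuming the non-streaming response includes a "context" value as described in the Ollama API docs:

iex> {:ok, first} = Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "Why is the sky blue?"
...> ])

iex> Ollama.completion(client, [
...>   model: "llama2",
...>   prompt: "Explain that to me like I'm five.",
...>   context: first["context"]
...> ])
{:ok, %{"response" => "...", ...}}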

copy_model(client, params)

@spec copy_model(
  client(),
  keyword()
) :: response()

Creates a model with another name from an existing model.

Options

  • :source (String.t/0) - Required. Name of the model to copy from.
  • :destination (String.t/0) - Required. Name of the model to copy to.

Example

iex> Ollama.copy_model(client, [
...>   source: "llama2",
...>   destination: "llama2-backup"
...> ])
{:ok, true}

create_blob(client, blob)

@spec create_blob(client(), binary()) :: response()

Creates a blob from its binary data.

Example

iex> Ollama.create_blob(client, data)
{:ok, true}

create_model(client, params)

@spec create_model(
  client(),
  keyword()
) :: response()

Creates a model using the given name and model file. Optionally streamable.

Any dependent blobs referenced in the modelfile, such as FROM and ADAPTER instructions, must exist first. See check_blob/2 and create_blob/2.

Options

Example

iex> modelfile = "FROM llama2\nSYSTEM \"You are mario from Super Mario Bros.\""
iex> Ollama.create_model(client, [
...>   name: "mario",
...>   modelfile: modelfile,
...>   stream: true
...> ])
{:ok, %Ollama.Streaming{}}

delete_model(client, params)

@spec delete_model(
  client(),
  keyword()
) :: response()

Deletes a model and its data.

Options

  • :name (String.t/0) - Required. Name of the model to delete.

Example

iex> Ollama.delete_model(client, name: "llama2")
{:ok, true}

embeddings(client, params)

@spec embeddings(
  client(),
  keyword()
) :: response()

Generates embeddings from a model for the given prompt.

Options

  • :model (String.t/0) - Required. The name of the model used to generate the embeddings.
  • :prompt (String.t/0) - Required. The prompt used to generate the embedding.
  • :keep_alive - How long to keep the model loaded.
  • :options - Additional advanced model parameters.

Example

iex> Ollama.embeddings(client, [
...>   model: "llama2",
...>   prompt: "Here is an article about llamas..."
...> ])
{:ok, %{"embedding" => [
  0.5670403838157654, 0.009260174818336964, 0.23178744316101074, -0.2916173040866852, -0.8924556970596313,
  0.8785552978515625, -0.34576427936553955, 0.5742510557174683, -0.04222835972905159, -0.137906014919281
]}}
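
The embedding is a plain list of floats, so it can be worked with directly in Elixir. As a sketch outside this library (the module name is hypothetical), cosine similarity between two embeddings:

defmodule MyApp.Similarity do
  # Cosine similarity between two equal-length embedding vectors.
  def cosine(a, b) do
    dot = Enum.zip(a, b) |> Enum.map(fn {x, y} -> x * y end) |> Enum.sum()
    norm = fn v -> v |> Enum.map(&(&1 * &1)) |> Enum.sum() |> :math.sqrt() end
    dot / (norm.(a) * norm.(b))
  end
end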
init(opts)

@spec init(Req.url() | keyword() | Req.Request.t()) :: client()

Creates a new Ollama API client. Accepts either a base URL for the Ollama API, a keyword list of options passed to Req.new/1, or an existing Req.Request.t/0 struct.

If no arguments are given, the client is created with the default options:

@default_req_opts [
  base_url: "http://localhost:11434/api",
  receive_timeout: 60_000
]

Examples

iex> client = Ollama.init("https://ollama.service.ai:11434/api")
%Ollama{}
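
A keyword list can be passed instead; the options are handed to Req.new/1. A sketch that sets the base URL and a longer timeout explicitly:

iex> client = Ollama.init(
...>   base_url: "http://localhost:11434/api",
...>   receive_timeout: :timer.minutes(2)
...> )
%Ollama{}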
list_models(client)

@spec list_models(client()) :: response()

Lists all models that Ollama has available.

Example

iex> Ollama.list_models(client)
{:ok, %{"models" => [
  %{"name" => "codellama:13b", ...},
  %{"name" => "llama2:latest", ...},
]}}

pull_model(client, params)

@spec pull_model(
  client(),
  keyword()
) :: response()

Downloads a model from the ollama library. Optionally streamable.

Options

Example

iex> Ollama.pull_model(client, name: "llama2")
{:ok, %{"status" => "success"}}

# Passing true to the :stream option initiates an async streaming request.
iex> Ollama.pull_model(client, name: "llama2", stream: true)
{:ok, %Ollama.Streaming{}}

push_model(client, params)

@spec push_model(
  client(),
  keyword()
) :: response()

Uploads a model to a model library. Requires registering for ollama.ai and adding a public key first. Optionally streamable.

Options

Example

iex> Ollama.push_model(client, name: "mattw/pygmalion:latest")
{:ok, %{"status" => "success"}}

# Passing true to the :stream option initiates an async streaming request.
iex> Ollama.push_model(client, name: "mattw/pygmalion:latest", stream: true)
{:ok, %Ollama.Streaming{}}

show_model(client, params)

@spec show_model(
  client(),
  keyword()
) :: response()

Shows all information for a specific model.

Options

  • :name (String.t/0) - Required. Name of the model to show.

Example

iex> Ollama.show_model(client, name: "llama2")
{:ok, %{
  "details" => %{
    "families" => ["llama", "clip"],
    "family" => "llama",
    "format" => "gguf",
    "parameter_size" => "7B",
    "quantization_level" => "Q4_0"
  },
  "modelfile" => "...",
  "parameters" => "...",
  "template" => "..."
}}