ReqLLM.Providers.Ollama (ReqLLM v1.13.0)


Ollama provider — local LLM inference via Ollama's OpenAI-compatible API.

Routes to Ollama's /v1 endpoint (port 11434 by default). No API key required.

Usage

# In jido_ai model alias config
config :jido_ai, model_aliases: [default: "ollama:gemma4:27b"]

# Direct usage
ReqLLM.generate_text("ollama:llama3", "Hello!")
ReqLLM.generate_object("ollama:llama3", "Extract the name", schema)
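The model strings above can contain more than one colon ("ollama:gemma4:27b"), which suggests that only the first segment names the provider and the rest is the Ollama model tag. The sketch below illustrates that reading in plain Elixir. ModelSpecSketch is a hypothetical helper written for this example, and ReqLLM's own parser may behave differently.

```elixir
defmodule ModelSpecSketch do
  # Hypothetical helper for illustration; not part of ReqLLM.
  # Splits on the first colon only, so a tag such as "27b" stays in the model name.
  def split(spec) when is_binary(spec) do
    case String.split(spec, ":", parts: 2) do
      [provider, model] -> {provider, model}
      [_no_colon] -> :error
    end
  end
end

ModelSpecSketch.split("ollama:gemma4:27b")
# => {"ollama", "gemma4:27b"}
```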

Configuration

# Optional — defaults to http://localhost:11434/v1
config :req_llm, :ollama, base_url: "http://my-ollama-host:11434/v1"
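If the Ollama host differs between environments, one option is to set the same key at runtime. This is a minimal sketch, assuming the configuration is read from config/runtime.exs; the OLLAMA_BASE_URL variable name is hypothetical and chosen only for this example.

```elixir
# config/runtime.exs
import Config

# OLLAMA_BASE_URL is a hypothetical variable name used for this sketch.
# When it is unset, the documented default URL is used.
config :req_llm, :ollama,
  base_url: System.get_env("OLLAMA_BASE_URL", "http://localhost:11434/v1")
```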

Ollama-Specific Options

Pass these via the provider_options keyword:

  • num_ctx — context window size in tokens (Ollama options.num_ctx)
  • keep_alive — how long to keep model loaded, e.g. "30m" or 0 to unload immediately

Examples

ReqLLM.generate_text("ollama:gemma4:27b", "Hello",
  provider_options: [num_ctx: 16_384, keep_alive: "30m"]
)

Summary

Functions

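  • attach(request, model_input, user_opts): Attaches Ollama-specific pipeline steps.
  • attach_stream(model, context, opts, finch_name): Default implementation of attach_stream/4.
  • build_body(request): Builds the Ollama request body.
  • decode_response(request_response): Default implementation of decode_response/1.
  • decode_stream_event(event, model): Default implementation of decode_stream_event/2.
  • encode_body(request): Default implementation of encode_body/1.
  • extract_usage(body, model): Default implementation of extract_usage/2.
  • prepare_request(operation, model_spec, input, opts): Default implementation of prepare_request/4.
  • translate_options(operation, model, opts): Default implementation of translate_options/3.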

Functions

attach(request, model_input, user_opts)

Attaches Ollama-specific pipeline steps.

Unlike other OpenAI-compatible providers, Ollama does not require authentication. This override omits the Authorization: Bearer header entirely, so users do not need to set an API key environment variable.

attach_stream(model, context, opts, finch_name)

Default implementation of attach_stream/4.

Builds complete streaming requests using OpenAI-compatible format.

base_url()

build_body(request)

Builds the Ollama request body.

Extends the standard OpenAI-compat body with two Ollama-specific fields:

  • options.num_ctx — nested under the options map (Ollama model parameter)
  • keep_alive — top-level field controlling how long the model stays loaded
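The placement of the two fields described above can be sketched with plain maps. This is an illustration only, not ReqLLM's implementation: OllamaBodySketch and put_ollama_fields/2 are hypothetical names, and the use of string keys in the body is an assumption.

```elixir
defmodule OllamaBodySketch do
  # Hypothetical helper, not ReqLLM's implementation: merges the two
  # Ollama-specific fields into an OpenAI-style body map.
  def put_ollama_fields(body, provider_options) do
    body
    |> maybe_put_num_ctx(Keyword.get(provider_options, :num_ctx))
    |> maybe_put_keep_alive(Keyword.get(provider_options, :keep_alive))
  end

  defp maybe_put_num_ctx(body, nil), do: body

  defp maybe_put_num_ctx(body, num_ctx) do
    # num_ctx is nested under "options"; any options already present are kept.
    Map.update(body, "options", %{"num_ctx" => num_ctx}, &Map.put(&1, "num_ctx", num_ctx))
  end

  # keep_alive is a top-level field. A value of 0 is not nil, so it is sent as-is.
  defp maybe_put_keep_alive(body, nil), do: body
  defp maybe_put_keep_alive(body, keep_alive), do: Map.put(body, "keep_alive", keep_alive)
end

body = %{"model" => "llama3", "messages" => [%{"role" => "user", "content" => "Hello"}]}
OllamaBodySketch.put_ollama_fields(body, num_ctx: 16_384, keep_alive: "30m")
```

When neither option is given, the body is returned unchanged, which matches the idea that these fields extend the standard body and do not replace it.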

decode_response(request_response)

Default implementation of decode_response/1.

Handles success/error responses with standard ReqLLM.Response creation.

decode_stream_event(event, model)

Default implementation of decode_stream_event/2.

Decodes SSE events using OpenAI-compatible format.

default_base_url()

encode_body(request)

Default implementation of encode_body/1.

Encodes request body using OpenAI-compatible format for chat and embedding operations.

extract_usage(body, model)

Default implementation of extract_usage/2.

Extracts usage data from standard usage field in response body.

prepare_request(operation, model_spec, input, opts)

Default implementation of prepare_request/4.

Handles :chat, :object, and :embedding operations using OpenAI-compatible patterns.

provider_extended_generation_schema()

provider_id()

provider_schema()

supported_provider_options()

translate_options(operation, model, opts)

Default implementation of translate_options/3.

Pass-through implementation that returns options unchanged.