Candil.Stream (Candil v1.0.0)


Server-Sent Events (SSE) streaming for LLM inference.

Streams tokens from local engines and remote providers as they are generated, calling a user-supplied callback for each chunk.
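For readers unfamiliar with the wire format, the sketch below shows how the `data:` lines of an SSE body can be split into payloads. It is an illustration of the SSE framing only, not Candil's actual parser; the module name `SSESketch` and the function `data_payloads/1` are hypothetical, and the sketch assumes the whole body is already in memory rather than arriving incrementally.

```elixir
defmodule SSESketch do
  # Illustrative only: a tiny parser for the "data:" lines of an SSE body.
  # Hypothetical module, not part of Candil. Other SSE fields
  # (event:, id:, retry:) and comment lines are ignored.

  # Splits a complete SSE body into the data payload of each event.
  # Events are separated by a blank line; events without data are dropped.
  def data_payloads(body) when is_binary(body) do
    body
    |> String.split(~r/\r?\n\r?\n/, trim: true)
    |> Enum.map(&event_data/1)
    |> Enum.reject(&(&1 == ""))
  end

  # Joins all data lines of one event with newlines, removing the field
  # name and at most one leading space after the colon.
  defp event_data(event) do
    event
    |> String.split(~r/\r?\n/)
    |> Enum.filter(&String.starts_with?(&1, "data:"))
    |> Enum.map(fn
      "data: " <> rest -> rest
      "data:" <> rest -> rest
    end)
    |> Enum.join("\n")
  end
end
```

A real streaming client would buffer partial events between network reads before splitting; this sketch leaves that out to keep the framing rules visible.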

Usage

Candil.Stream.chat(:llama3, [
  %{role: "user", content: "Write a haiku about Elixir"}
], fn chunk ->
  IO.write(chunk.content)
end)

The callback receives a chunk() map:

%{content: "token", finish_reason: nil | "stop" | "length", done: false}

When streaming ends, the callback is called one final time with done: true.
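A callback often needs the full response as well as the individual tokens. The sketch below collects chunk contents in an Agent and treats the done: true call as the end of the stream. `ChunkCollector` and its functions are hypothetical helpers, not part of Candil, and the sketch assumes the final chunk may carry content, which the text above does not specify.

```elixir
defmodule ChunkCollector do
  # Hypothetical helper, not part of Candil: builds a stream callback that
  # accumulates streamed content in an Agent.

  # Starts an Agent holding the contents received so far (newest first).
  def start do
    Agent.start_link(fn -> [] end)
  end

  # Returns a one-argument function matching the stream_callback() shape.
  # It returns :done for the final chunk and :cont otherwise; the return
  # value only serves this sketch, since the type allows any().
  def callback(agent) do
    fn
      %{done: true} = chunk ->
        # Assumption: the final chunk may still carry content, so keep it.
        append(agent, chunk)
        :done

      %{done: false} = chunk ->
        append(agent, chunk)
        :cont
    end
  end

  # Returns everything received so far as a single binary.
  def text(agent) do
    agent |> Agent.get(& &1) |> Enum.reverse() |> IO.iodata_to_binary()
  end

  defp append(agent, %{content: content}) when is_binary(content) do
    Agent.update(agent, &[content | &1])
  end

  defp append(_agent, _chunk), do: :ok
end

# Possible usage with the documented API (requires a running engine):
#   {:ok, agent} = ChunkCollector.start()
#   Candil.Stream.chat(:llama3, messages, ChunkCollector.callback(agent))
#   ChunkCollector.text(agent)
```

Keeping the state in a separate process means the callback itself stays a plain function, so it works regardless of which process the stream invokes it from.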

Provider support

Supported providers: OpenAI, Anthropic, Ollama, OpenAI-compatible endpoints, and a local llama-server.

Summary

Functions

chat(model_alias, messages, callback, opts \\ [])

Streams a chat completion from a running local engine identified by alias.

chat(model, provider, messages, callback, opts)

Streams a chat completion from a remote provider.

Types

chunk()

@type chunk() :: %{content: binary(), finish_reason: binary() | nil, done: boolean()}

stream_callback()

@type stream_callback() :: (chunk() -> any())

Functions

chat(model_alias, messages, callback, opts \\ [])

@spec chat(atom(), [Candil.Inference.message()], stream_callback(), keyword()) ::
  :ok | {:error, any()}

Streams a chat completion from a running local engine identified by alias.

chat(model, provider, messages, callback, opts)

@spec chat(
  Candil.Model.t(),
  Candil.Provider.t(),
  [Candil.Inference.message()],
  stream_callback(),
  keyword()
) :: :ok | {:error, any()}

Streams a chat completion from a remote provider.
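Both variants of chat return :ok | {:error, any()}, so callers should match on the result rather than assume success. The sketch below shows one way to do that; `StreamResult.describe/1` is a hypothetical helper, not part of Candil, and because the shape of the error reason is not documented here it is only inspected, never matched.

```elixir
defmodule StreamResult do
  # Hypothetical helper, not part of Candil: turns the documented return
  # value of chat/4 or chat/5 into a log-friendly message.

  def describe(:ok), do: "stream finished"

  # The reason is typed as any(), so it is rendered with inspect/1.
  def describe({:error, reason}), do: "stream failed: " <> inspect(reason)
end

# Possible usage with the documented API (requires a running engine):
#   :llama3
#   |> Candil.Stream.chat(messages, callback)
#   |> StreamResult.describe()
#   |> IO.puts()
```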