Canonical request struct for all inference calls: chat, streaming, embeddings, and media generation.
Chat / Streaming
%Intent{
  model: "gpt-4o",
  messages: [
    %{role: :user, content: [%{type: :text, text: "Hello"}]}
  ]
}

Media Generation
%Intent{
  model: "gpt-image-1",
  prompt: "A cat wearing a wizard hat",
  size: "1024x1024",
  quality: "auto",
  n: 1,
  format: "png"
}

Content is always a list of typed blocks, never a raw string.
Content Blocks
%{
  role: :user,
  content: [
    %{type: :text, text: "What's in this image?"},
    %{type: :image_url, url: "https://..."},
    %{type: :image_base64, media_type: "image/png", data: "iVBOR..."}
  ]
}
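The block shapes above lend themselves to pattern matching. The following is a minimal sketch, assuming a hypothetical helper module `ContentBlockSketch` (not part of Arcanum) that checks whether a message uses only the three input block types shown here. The `example.com` URL is a placeholder.

```elixir
defmodule ContentBlockSketch do
  # Hypothetical helper, not part of Arcanum: illustrates matching on block shapes.

  # Returns true when the value has the shape of one of the three input blocks.
  def input_block?(%{type: :text, text: text}) when is_binary(text), do: true
  def input_block?(%{type: :image_url, url: url}) when is_binary(url), do: true

  def input_block?(%{type: :image_base64, media_type: media_type, data: data})
      when is_binary(media_type) and is_binary(data),
      do: true

  def input_block?(_other), do: false

  # A message passes when its role is an atom, its content is a list,
  # and every entry in that list is an input block.
  def valid_message?(%{role: role, content: content})
      when is_atom(role) and is_list(content) do
    Enum.all?(content, &input_block?/1)
  end

  def valid_message?(_other), do: false
end

message = %{
  role: :user,
  content: [
    %{type: :text, text: "What's in this image?"},
    %{type: :image_url, url: "https://example.com/cat.png"}
  ]
}

IO.inspect(ContentBlockSketch.valid_message?(message))
IO.inspect(ContentBlockSketch.valid_message?(%{role: :user, content: "raw string"}))
```

The second call returns `false` because the content is a raw string rather than a list of blocks, which is the case the rule above excludes.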
Types
@type content_block() ::
        %{type: :text, text: String.t()}
        | %{type: :image_url, url: String.t()}
        | %{type: :image_base64, media_type: String.t(), data: String.t()}
        | %{
            type: :image,
            data: binary() | nil,
            url: String.t() | nil,
            content_type: String.t(),
            revised_prompt: String.t() | nil
          }
        | %{
            type: :video,
            data: binary() | nil,
            url: String.t() | nil,
            content_type: String.t(),
            revised_prompt: String.t() | nil
          }
@type message() :: %{role: atom(), content: [content_block()]}
@type t() :: %Arcanum.Intent{
        context_length: pos_integer() | nil,
        format: String.t(),
        max_tokens: pos_integer() | nil,
        messages: [message()] | nil,
        model: String.t(),
        n: pos_integer(),
        negative_prompt: String.t() | nil,
        prompt: String.t() | nil,
        quality: String.t() | nil,
        size: String.t(),
        style: String.t() | nil,
        temperature: float() | nil,
        tools: [tool()] | nil
      }
Functions
@spec text(String.t()) :: [content_block()]
Wraps a plain string in a list containing a single text content block.
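As an illustration of what the spec implies, here is a minimal sketch. `IntentTextSketch` is a hypothetical stand-in written only from the spec and description above; it is not the library's actual implementation.

```elixir
defmodule IntentTextSketch do
  # Hypothetical stand-in for text/1, written only from its spec and description.
  @spec text(String.t()) :: [map()]
  def text(string) when is_binary(string) do
    # One-element list holding a single :text block.
    [%{type: :text, text: string}]
  end
end

# The result can be dropped straight into a message's content field.
message = %{role: :user, content: IntentTextSketch.text("Hello")}
IO.inspect(message)
```

Returning a list rather than a bare block lets the result serve directly as a message's `content`, in line with the rule that content is always a list of blocks.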
@spec to_text([content_block()]) :: String.t()
Extracts all text from content blocks, joined by newline.
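A sketch of how such an extraction might look follows. `IntentToTextSketch` is a hypothetical stand-in, and it assumes that blocks other than `:text` are skipped; the description above does not say how non-text blocks are handled.

```elixir
defmodule IntentToTextSketch do
  # Hypothetical stand-in for to_text/1.
  # Assumption: blocks whose type is not :text contribute nothing to the output.
  @spec to_text([map()]) :: String.t()
  def to_text(blocks) when is_list(blocks) do
    blocks
    |> Enum.flat_map(fn
      %{type: :text, text: text} when is_binary(text) -> [text]
      _other -> []
    end)
    |> Enum.join("\n")
  end
end

blocks = [
  %{type: :text, text: "first"},
  %{type: :image_url, url: "https://example.com/cat.png"},
  %{type: :text, text: "second"}
]

IO.puts(IntentToTextSketch.to_text(blocks))
```

Under that assumption, the image block is dropped and the two text blocks are joined by a single newline. An empty list yields an empty string.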