ExLLM.Vision (ex_llm v0.5.0)
Vision and multimodal support for ExLLM.
Provides utilities for handling images in LLM requests, including:
- Image format validation
- Base64 encoding/decoding
- URL validation
- Provider-specific formatting
Supported Image Formats
- JPEG/JPG
- PNG
- GIF (static only for most providers)
- WebP
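One way to check that a binary really is one of the formats listed above is to inspect its leading signature bytes. The following is a minimal sketch, not ExLLM's own validation code; the module name `ImageSniffSketch` and the `:unsupported_format` error atom are assumptions made for illustration.

```elixir
defmodule ImageSniffSketch do
  # Hypothetical helper (not part of ExLLM): guess the media type of an
  # image from the well-known signature bytes at the start of the binary.
  @spec media_type(binary()) :: {:ok, String.t()} | {:error, :unsupported_format}
  # JPEG files start with FF D8 FF.
  def media_type(<<0xFF, 0xD8, 0xFF, _rest::binary>>), do: {:ok, "image/jpeg"}

  # PNG files start with the 8-byte signature 89 "PNG" 0D 0A 1A 0A.
  def media_type(<<0x89, "PNG", 0x0D, 0x0A, 0x1A, 0x0A, _rest::binary>>),
    do: {:ok, "image/png"}

  # GIF files start with "GIF87a" or "GIF89a".
  def media_type(<<"GIF87a", _rest::binary>>), do: {:ok, "image/gif"}
  def media_type(<<"GIF89a", _rest::binary>>), do: {:ok, "image/gif"}

  # WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
  def media_type(<<"RIFF", _size::binary-size(4), "WEBP", _rest::binary>>),
    do: {:ok, "image/webp"}

  def media_type(_other), do: {:error, :unsupported_format}
end
```

A signature check like this does not tell static GIFs from animated ones, so it cannot enforce the "static only" restriction by itself.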
Usage
# With image URL
messages = [
  %{
    role: "user",
    content: [
      %{type: "text", text: "What's in this image?"},
      %{type: "image_url", image_url: %{url: "https://example.com/image.jpg"}}
    ]
  }
]

# With base64 image
image_data = File.read!("photo.jpg") |> Base.encode64()
messages = [
  %{
    role: "user",
    content: [
      %{type: "text", text: "Describe this photo"},
      %{type: "image", image: %{data: image_data, media_type: "image/jpeg"}}
    ]
  }
]

{:ok, response} = ExLLM.chat(:anthropic, messages)
Functions
@spec build_message(String.t(), String.t(), [String.t()], keyword()) :: {:ok, ExLLM.Types.message()} | {:error, term()}
Build a vision message with text and images.
Examples
{:ok, message} = ExLLM.Vision.build_message("user", "What's in these images?", [
  "https://example.com/image1.jpg",
  "/path/to/local/image2.png"
])
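The example mixes a remote URL with a local file path, so the builder has to tell the two apart somehow. How ExLLM does this is not documented here. The sketch below shows one plausible approach using `URI.parse/1` from the standard library; `ImageSourceSketch.classify/1` is a hypothetical name.

```elixir
defmodule ImageSourceSketch do
  # Hypothetical helper (not part of ExLLM): decide whether an image
  # reference is a remote http(s) URL or should be treated as a file path.
  @spec classify(String.t()) :: {:url, String.t()} | {:file, String.t()}
  def classify(source) when is_binary(source) do
    case URI.parse(source) do
      %URI{scheme: scheme, host: host}
      when scheme in ["http", "https"] and is_binary(host) and host != "" ->
        {:url, source}

      # Anything without an http(s) scheme and a host is assumed to be a path.
      _other ->
        {:file, source}
    end
  end
end
```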
@spec count_images(ExLLM.Types.message()) :: non_neg_integer()
Count images in a message.
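As a rough illustration of what counting involves, the sketch below walks a message's content list and counts the parts whose type is "image" or "image_url", the two shapes shown under Usage. It is an assumption about the behaviour, not the library's implementation, and `ImageCountSketch` is a hypothetical module.

```elixir
defmodule ImageCountSketch do
  # Content-part types treated as images; taken from the Usage examples.
  @image_types ["image", "image_url"]

  # Hypothetical helper (not part of ExLLM): count the image parts of a
  # message whose content is a list of typed parts.
  @spec count(map()) :: non_neg_integer()
  def count(%{content: content}) when is_list(content) do
    Enum.count(content, fn
      %{type: type} -> type in @image_types
      _other -> false
    end)
  end

  # Messages with plain-string content (or no content) carry no images.
  def count(_message), do: 0
end
```

With a helper like this, a presence check reduces to testing whether the count is greater than zero.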
@spec format_for_provider([ExLLM.Types.message()], atom()) :: [ExLLM.Types.message()]
Format messages for a specific provider.
Some providers have specific requirements for vision content.
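The exact per-provider rules are not spelled out here. As one hedged example of the kind of rewrite such a function might perform, the sketch below converts base64 "image" parts into "image_url" parts carrying a data URL, a form that OpenAI-style chat APIs accept. The module `ProviderFormatSketch` and its behaviour are assumptions, not ExLLM's actual mapping.

```elixir
defmodule ProviderFormatSketch do
  # Hypothetical transformation (not part of ExLLM): rewrite every base64
  # image part as an image_url part whose URL is a data URL.
  @spec to_data_urls([map()]) :: [map()]
  def to_data_urls(messages) when is_list(messages) do
    Enum.map(messages, &convert_message/1)
  end

  defp convert_message(%{content: content} = message) when is_list(content) do
    %{message | content: Enum.map(content, &convert_part/1)}
  end

  # Plain-text messages are passed through untouched.
  defp convert_message(message), do: message

  defp convert_part(%{type: "image", image: %{data: data, media_type: media_type}}) do
    %{type: "image_url", image_url: %{url: "data:#{media_type};base64,#{data}"}}
  end

  # Text parts and existing image_url parts are left as they are.
  defp convert_part(part), do: part
end
```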
@spec has_vision_content?(ExLLM.Types.message()) :: boolean()
Check if a message contains vision content.
Create an image URL content part.
Options
- :detail - Image detail level (:auto, :low, or :high)
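To make the :detail option concrete, here is a minimal sketch of a builder that produces the "image_url" part shape shown under Usage and attaches the detail level only when one is given. The module name, the tagged return tuples, and the placement of the `detail` key inside `image_url` are assumptions for illustration.

```elixir
defmodule ImageUrlPartSketch do
  # Detail levels named in the option description.
  @details [:auto, :low, :high]

  # Hypothetical builder (not part of ExLLM): create an image_url content
  # part, validating the optional :detail setting.
  @spec build(String.t(), keyword()) :: {:ok, map()} | {:error, term()}
  def build(url, opts \\ []) when is_binary(url) and is_list(opts) do
    case Keyword.get(opts, :detail) do
      nil ->
        {:ok, %{type: "image_url", image_url: %{url: url}}}

      detail when detail in @details ->
        {:ok, %{type: "image_url", image_url: %{url: url, detail: detail}}}

      other ->
        {:error, {:invalid_detail, other}}
    end
  end
end
```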
Load an image from file and encode it for API use.
Returns a content part ready to be included in a message.
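The loading step can be sketched with the standard library alone: read the file, Base64-encode it, and infer the media type from the extension. The version below is an assumption about the general approach, not the library's code. `LoadImageSketch` and its error tuples are hypothetical, and the returned part mirrors the "image" shape from the Usage section.

```elixir
defmodule LoadImageSketch do
  # Extension-to-media-type table for the supported formats.
  @media_types %{
    ".jpg" => "image/jpeg",
    ".jpeg" => "image/jpeg",
    ".png" => "image/png",
    ".gif" => "image/gif",
    ".webp" => "image/webp"
  }

  # Hypothetical loader (not part of ExLLM): read a file and wrap it in
  # a base64 image content part.
  @spec load(String.t()) :: {:ok, map()} | {:error, term()}
  def load(path) when is_binary(path) do
    ext = path |> Path.extname() |> String.downcase()

    # A failed step (unknown extension, unreadable file) is returned as-is.
    with {:ok, media_type} <- fetch_media_type(ext),
         {:ok, bytes} <- File.read(path) do
      {:ok, %{type: "image", image: %{data: Base.encode64(bytes), media_type: media_type}}}
    end
  end

  defp fetch_media_type(ext) do
    case Map.fetch(@media_types, ext) do
      {:ok, media_type} -> {:ok, media_type}
      :error -> {:error, {:unsupported_extension, ext}}
    end
  end
end
```

Checking the extension before reading avoids loading a large file that would be rejected anyway.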
@spec normalize_messages([ExLLM.Types.message()]) :: {:ok, [ExLLM.Types.message()]} | {:error, term()}
Validate and normalize messages containing images.
Returns {:ok, normalized_messages} or {:error, reason}.
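The tagged-tuple contract can be illustrated with a small validator that stops at the first invalid image part. This sketch covers only the validation half and applies one assumed rule (the media type must be one of the supported formats). `NormalizeSketch` and the error reason it returns are hypothetical, not what ExLLM actually checks.

```elixir
defmodule NormalizeSketch do
  # Media types matching the supported image formats.
  @allowed ["image/jpeg", "image/png", "image/gif", "image/webp"]

  # Hypothetical validator (not part of ExLLM): return the messages
  # unchanged if every base64 image part has an allowed media type.
  @spec validate([map()]) :: {:ok, [map()]} | {:error, term()}
  def validate(messages) when is_list(messages) do
    result =
      Enum.reduce_while(messages, {:ok, []}, fn message, {:ok, acc} ->
        case validate_message(message) do
          :ok -> {:cont, {:ok, [message | acc]}}
          {:error, reason} -> {:halt, {:error, reason}}
        end
      end)

    # The accumulator was built in reverse, so restore the original order.
    case result do
      {:ok, reversed} -> {:ok, Enum.reverse(reversed)}
      error -> error
    end
  end

  # Returns the first part-level error, or :ok when none is found.
  defp validate_message(%{content: content}) when is_list(content) do
    Enum.find_value(content, :ok, &part_error/1)
  end

  defp validate_message(_message), do: :ok

  defp part_error(%{type: "image", image: %{media_type: media_type}})
       when media_type not in @allowed do
    {:error, {:unsupported_media_type, media_type}}
  end

  defp part_error(_part), do: nil
end
```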
Check if a provider supports vision/multimodal inputs.
Create a text content part.