ExLLM.Core.Vision (ex_llm v0.8.1)

Vision and multimodal support for ExLLM.

Provides utilities for handling images in LLM requests, including:

  • Image format validation
  • Base64 encoding/decoding
  • URL validation
  • Provider-specific formatting

Supported Image Formats

  • JPEG/JPG
  • PNG
  • GIF (static only for most providers)
  • WebP
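
One way to validate these formats is to inspect a file's leading "magic" bytes rather than trusting its extension. The sketch below is an illustration only: `ImageFormatSketch` and `detect/1` are hypothetical names, and this is not necessarily how ExLLM performs its validation.

```elixir
defmodule ImageFormatSketch do
  # Hypothetical helper, not part of ExLLM.
  # Detects an image's media type from the signature bytes at the start of the binary.
  @spec detect(binary()) :: {:ok, String.t()} | {:error, :unsupported_format}
  def detect(<<0xFF, 0xD8, 0xFF, _::binary>>), do: {:ok, "image/jpeg"}
  def detect(<<0x89, "PNG", 0x0D, 0x0A, 0x1A, 0x0A, _::binary>>), do: {:ok, "image/png"}
  def detect(<<"GIF87a", _::binary>>), do: {:ok, "image/gif"}
  def detect(<<"GIF89a", _::binary>>), do: {:ok, "image/gif"}
  # WebP is a RIFF container: "RIFF", a 4-byte size field, then "WEBP".
  def detect(<<"RIFF", _size::binary-size(4), "WEBP", _::binary>>), do: {:ok, "image/webp"}
  def detect(_other), do: {:error, :unsupported_format}
end
```

Calling `ImageFormatSketch.detect(File.read!("photo.jpg"))` would then yield the media type to pass alongside the base64 data.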

Usage

# With image URL
messages = [
  %{
    role: "user",
    content: [
      %{type: "text", text: "What's in this image?"},
      %{type: "image_url", image_url: %{url: "https://example.com/image.jpg"}}
    ]
  }
]

# With base64 image
image_data = File.read!("photo.jpg") |> Base.encode64()
messages = [
  %{
    role: "user",
    content: [
      %{type: "text", text: "Describe this photo"},
      %{type: "image", image: %{data: image_data, media_type: "image/jpeg"}}
    ]
  }
]

{:ok, response} = ExLLM.chat(:anthropic, messages)

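Summary

Functions

build_message(role, text_content, image_sources, opts \\ [])
  Build a vision message with text and images.

count_images(arg1)
  Count images in a message.

format_for_provider(messages, provider)
  Format messages for a specific provider.

has_vision_content?(arg1)
  Check if a message contains vision content.

image_url(url, opts \\ [])
  Create an image URL content part.

load_image(file_path, opts \\ [])
  Load an image from file and encode it for API use.

normalize_messages(messages)
  Validate and normalize messages containing images.

supports_vision?(provider)
  Check if a provider supports vision/multimodal inputs.

text(content)
  Create a text content part.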
Functions

build_message(role, text_content, image_sources, opts \\ [])

@spec build_message(String.t(), String.t(), [String.t()], keyword()) ::
  {:ok, ExLLM.Types.message()} | {:error, term()}

Build a vision message with text and images.

Examples

{:ok, message} = ExLLM.Core.Vision.build_message("user", "What's in these images?", [
  "https://example.com/image1.jpg",
  "/path/to/local/image2.png"
])
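
The example mixes a remote URL and a local path in the same list, so each source has to be classified before it can be turned into a content part. A minimal sketch of such a classification follows; `SourceKindSketch.classify/1` is a hypothetical helper, and the rule it uses (an http or https scheme with a host means URL, anything else means file path) is an assumption, not ExLLM's documented behavior.

```elixir
defmodule SourceKindSketch do
  # Hypothetical helper, not part of ExLLM.
  # Classifies a source string as a remote URL or a local file path.
  @spec classify(String.t()) :: {:url, String.t()} | {:file, String.t()}
  def classify(source) when is_binary(source) do
    case URI.parse(source) do
      %URI{scheme: scheme, host: host}
      when scheme in ["http", "https"] and is_binary(host) and host != "" ->
        {:url, source}

      _ ->
        # Assumption: everything that is not an http(s) URL is treated as a path.
        {:file, source}
    end
  end
end
```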

count_images(arg1)

@spec count_images(ExLLM.Types.message()) :: non_neg_integer()

Count images in a message.
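
To make the idea concrete, here is a rough sketch of counting image parts, assuming messages use the content-part shapes from the Usage section (`"image"` and `"image_url"` types). `CountImagesSketch` is a hypothetical module and may differ from the library's actual logic.

```elixir
defmodule CountImagesSketch do
  # Assumption: these are the only part types that carry an image.
  @image_types ["image", "image_url"]

  # A message whose content is a list may mix text and image parts.
  @spec count(map()) :: non_neg_integer()
  def count(%{content: parts}) when is_list(parts) do
    Enum.count(parts, fn
      %{type: type} -> type in @image_types
      _ -> false
    end)
  end

  # Plain-string content (or any other shape) carries no images.
  def count(_message), do: 0
end
```

Under the same assumptions, a vision-content check could be expressed as `CountImagesSketch.count(message) > 0`.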

format_for_provider(messages, provider)

@spec format_for_provider([ExLLM.Types.message()], atom()) :: [ExLLM.Types.message()]

Format messages for a specific provider.

Some providers have specific requirements for vision content.
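
As an illustration of what such requirements can look like, the sketch below converts the generic base64 image part from the Usage section into two provider wire shapes: a `source` block as used by Anthropic's Messages API, and a data-URL `image_url` part as used by OpenAI's chat API. `ProviderFormatSketch` is hypothetical; the mapping is an assumption about the kind of work this function does, not a copy of ExLLM's implementation.

```elixir
defmodule ProviderFormatSketch do
  # Hypothetical module, not part of ExLLM.

  # Anthropic-style: base64 data goes in a nested `source` map.
  def format_part(%{type: "image", image: %{data: data, media_type: media_type}}, :anthropic) do
    %{type: "image", source: %{type: "base64", media_type: media_type, data: data}}
  end

  # OpenAI-style: base64 data is embedded in a data URL.
  def format_part(%{type: "image", image: %{data: data, media_type: media_type}}, :openai) do
    %{type: "image_url", image_url: %{url: "data:#{media_type};base64,#{data}"}}
  end

  # Text parts and anything unrecognized pass through unchanged.
  def format_part(part, _provider), do: part

  # Apply the per-part conversion to every part of a list-content message.
  def format_message(%{content: parts} = message, provider) when is_list(parts) do
    %{message | content: Enum.map(parts, &format_part(&1, provider))}
  end

  def format_message(message, _provider), do: message
end
```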

has_vision_content?(arg1)

@spec has_vision_content?(ExLLM.Types.message()) :: boolean()

Check if a message contains vision content.

image_url(url, opts \\ [])

@spec image_url(
  String.t(),
  keyword()
) :: map()

Create an image URL content part.

Options

  • :detail - Image detail level (:auto, :low, :high)
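
A sketch of how the `:detail` option might be folded into the content part, assuming the part follows the `image_url` shape from the Usage section and that the detail level is serialized as a string next to the URL. `ImageUrlSketch` is a hypothetical stand-in; the real function's output shape is not documented here.

```elixir
defmodule ImageUrlSketch do
  # Hypothetical module, not part of ExLLM.
  @detail_levels [:auto, :low, :high]

  def image_url(url, opts \\ []) when is_binary(url) do
    case Keyword.get(opts, :detail) do
      # No detail requested: emit only the URL.
      nil ->
        %{type: "image_url", image_url: %{url: url}}

      # Assumption: the atom is sent to the provider as a string.
      detail when detail in @detail_levels ->
        %{type: "image_url", image_url: %{url: url, detail: Atom.to_string(detail)}}

      other ->
        raise ArgumentError, "unsupported :detail value: #{inspect(other)}"
    end
  end
end
```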

load_image(file_path, opts \\ [])

@spec load_image(
  String.t(),
  keyword()
) :: {:ok, map()} | {:error, term()}

Load an image from file and encode it for API use.

Returns a content part ready to be included in a message.
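
The load-and-encode step can be approximated with the standard library alone. The sketch below infers the media type from the file extension, base64-encodes the bytes, and returns a part in the `"image"` shape shown in the Usage section. `LoadImageSketch.load/1` and its error tuples are hypothetical and only illustrate the flow.

```elixir
defmodule LoadImageSketch do
  # Hypothetical module, not part of ExLLM.
  @media_types %{
    ".jpg" => "image/jpeg",
    ".jpeg" => "image/jpeg",
    ".png" => "image/png",
    ".gif" => "image/gif",
    ".webp" => "image/webp"
  }

  @spec load(String.t()) :: {:ok, map()} | {:error, term()}
  def load(path) when is_binary(path) do
    ext = path |> Path.extname() |> String.downcase()

    # Check the extension first so unsupported files are never read.
    with {:ok, media_type} <- Map.fetch(@media_types, ext),
         {:ok, bytes} <- File.read(path) do
      {:ok, %{type: "image", image: %{data: Base.encode64(bytes), media_type: media_type}}}
    else
      :error -> {:error, {:unsupported_extension, ext}}
      {:error, reason} -> {:error, reason}
    end
  end
end
```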

normalize_messages(messages)

@spec normalize_messages([ExLLM.Types.message()]) ::
  {:ok, [ExLLM.Types.message()]} | {:error, term()}

Validate and normalize messages containing images.

Returns {:ok, normalized_messages} or {:error, reason}.
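
The tagged-tuple contract implies that normalization stops at the first invalid message. A minimal sketch of that pattern with `Enum.reduce_while/3` follows. The specific rules (wrapping string content in a text part, requiring a string `:type` on every part) are assumptions chosen for illustration, and `NormalizeSketch` is a hypothetical module.

```elixir
defmodule NormalizeSketch do
  # Hypothetical module, not part of ExLLM.

  # Walk the messages in order and halt on the first invalid one.
  def normalize(messages) when is_list(messages) do
    messages
    |> Enum.reduce_while({:ok, []}, fn message, {:ok, acc} ->
      case normalize_message(message) do
        {:ok, normalized} -> {:cont, {:ok, [normalized | acc]}}
        {:error, reason} -> {:halt, {:error, reason}}
      end
    end)
    |> case do
      {:ok, reversed} -> {:ok, Enum.reverse(reversed)}
      error -> error
    end
  end

  # Assumption: plain-string content is wrapped in a single text part.
  defp normalize_message(%{content: content} = message) when is_binary(content) do
    {:ok, %{message | content: [%{type: "text", text: content}]}}
  end

  defp normalize_message(%{content: parts} = message) when is_list(parts) do
    if Enum.all?(parts, &valid_part?/1) do
      {:ok, message}
    else
      {:error, {:invalid_content_part, message}}
    end
  end

  defp normalize_message(other), do: {:error, {:invalid_message, other}}

  # Assumption: a valid part is a map with a string :type key.
  defp valid_part?(%{type: type}) when is_binary(type), do: true
  defp valid_part?(_), do: false
end
```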

supports_vision?(provider)

@spec supports_vision?(atom()) :: boolean()

Check if a provider supports vision/multimodal inputs.

text(content)

@spec text(String.t()) :: map()

Create a text content part.