Google Gemini native image-out adapter — implements ALLM.ImageAdapter
against generateContent with responseModalities: ["TEXT", "IMAGE"]
on the Gemini-native image preview models (gemini-3.1-flash-image-preview
/ "Nano Banana 2", gemini-3-pro-image-preview / "Nano Banana Pro").
Layer B: runtime. Consumed through the `ALLM.generate_image/3` façade. Keys resolve via `ALLM.Keys.fetch!(:gemini, opts)` at request-build time per the documented contract; no key ever lives on the engine.

## Single translator

Image generation is `generateContent` with `responseModalities` set to `["TEXT", "IMAGE"]`. The request body is built by `ALLM.Providers.Gemini.to_gemini_request_body/2` (the same translator the chat adapter uses). The image adapter then overrides `generationConfig.responseModalities` and adds `generationConfig.imageConfig.aspectRatio` from the documented contract's size-mapping table.

The `:edit` operation reuses `ALLM.Providers.Gemini`'s `part_to_block/1` for source-image translation by synthesizing a user-role message with `[%TextPart{}, %ImagePart{}, ...]` content.

## Aspect-ratio mapping

| ALLM `ImageRequest.size` | Gemini `imageConfig.aspectRatio` |
|--------------------------|----------------------------------|
| `"1024x1024"`, `"512x512"`, `"256x256"`, any square | `"1:1"` |
| `"1792x1024"`, any 16:9 | `"16:9"` |
| `"1024x1792"`, any 9:16 | `"9:16"` |
| `"1024x768"`, any 4:3 | `"4:3"` |
| `"768x1024"`, any 3:4 | `"3:4"` |
| `nil` | omit `imageConfig` (Gemini default) |
| anything else | `{:error, %ImageAdapterError{reason: :invalid_request}}` |

Pixel sizing (`imageSize: "1K"|"2K"|"4K"`) is not exposed in v0.2's `ImageRequest.size` field; it is deferred. Aspect ratio is the only knob.

## Operation gate

`supported_operations/0` returns `[:generate, :edit]`. `:variation` is rejected with `:unsupported_operation` BEFORE any HTTP I/O per `ImageAdapter` invariant 4.

## Test-injection escape hatch

`opts[:adapter_opts][:image_script]`, when present, delegates to `ALLM.Providers.FakeImages.generate/2` BEFORE any pre-flight gate runs. Mirrors the OpenAI.Images precedent at `lib/allm/providers/openai/images.ex:251`.

## Shared response decoder (Cross-function invariant)

Response bodies are decoded via `ALLM.Providers.Gemini.Decode.candidate_parts/1`, the same helper `Gemini.generate/2` calls (see `lib/allm/providers/gemini.ex:991` post-Phase-16.5 refactor). The image adapter consumes the `image_parts` element of the returned tuple while the chat adapter consumes `text` + `tool_calls`; both walk the parts list once. Per cross-function invariants, lines 217-219.
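
To make the single-pass decode concrete, here is a minimal sketch of a parts walker. It is not the library's `candidate_parts/1`: the module name `DecodeSketch`, the three-element return tuple, and the shape of each image entry are assumptions for illustration. Only the JSON keys (`"text"`, `"functionCall"`, `"inlineData"`) follow the Gemini REST response format.

```elixir
defmodule DecodeSketch do
  @moduledoc false

  # Hypothetical stand-in for a shared decoder. Walks the first candidate's
  # parts list exactly once and splits it into text, tool calls, and images.
  # The {text, tool_calls, image_parts} tuple layout is an assumption.
  def candidate_parts(%{"candidates" => [%{"content" => %{"parts" => parts}} | _]}) do
    {texts, calls, images} =
      Enum.reduce(parts, {[], [], []}, fn
        %{"text" => text}, {ts, cs, imgs} when is_binary(text) ->
          {[text | ts], cs, imgs}

        %{"functionCall" => call}, {ts, cs, imgs} ->
          {ts, [call | cs], imgs}

        %{"inlineData" => %{"mimeType" => mime, "data" => b64}}, {ts, cs, imgs} ->
          # Inline image data arrives base64-encoded in the JSON body.
          {ts, cs, [%{mime_type: mime, bytes: Base.decode64!(b64)} | imgs]}

        _other, acc ->
          acc
      end)

    {texts |> Enum.reverse() |> Enum.join(), Enum.reverse(calls), Enum.reverse(images)}
  end

  # Bodies without candidates decode to empty results.
  def candidate_parts(_body), do: {"", [], []}
end
```

In this sketch a chat caller would pattern-match the first two tuple elements and an image caller the third, so neither needs a second traversal.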
Functions
Return the Gemini endpoint path (relative to the API base URL) for the image-generation operation.
Both :generate and :edit route through generateContent (the
request body shape differs, the URL path does not). :variation is
rejected pre-flight by gate_operation/2.
Examples
iex> ALLM.Providers.Gemini.Images.endpoint_for("gemini-3.1-flash-image-preview")
"/models/gemini-3.1-flash-image-preview:generateContent"
@spec generate( ALLM.ImageRequest.t(), keyword() ) :: {:ok, ALLM.ImageResponse.t()} | {:error, ALLM.Error.ImageAdapterError.t()}
Execute an image-generation or edit request synchronously.
Pre-flight gates (per ImageAdapter invariant 4)
Before any HTTP I/O, generate/2 checks (in order):
- Test-injection escape hatch. When opts[:adapter_opts][:image_script] is non-nil, the call delegates to ALLM.Providers.FakeImages.generate/2.
- Operation gate. request.operation in supported_operations. Failure → :unsupported_operation with metadata: %{operation: op}.
- Aspect-ratio gate. request.size, when non-nil, must map to one of "1:1" | "16:9" | "9:16" | "4:3" | "3:4". Failure → :invalid_request.
Key resolution (ALLM.Keys.fetch!/2) runs AFTER the gates — a request
rejected pre-flight does not require a valid key.
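
The gate ordering above can be sketched with a `with` chain. This is an illustration under stated assumptions, not the adapter's code: `GateSketch`, its `run/3` function, the plain-map request, the literal size lookup, and the `fetch_key` callback are all hypothetical. The point it shows is that the key callback is only invoked once every gate has passed.

```elixir
defmodule GateSketch do
  @moduledoc false

  @supported [:generate, :edit]

  # Literal sizes taken from the aspect-ratio table; a real mapper would
  # also accept "any square", "any 16:9", and so on.
  @known_sizes %{
    "1024x1024" => "1:1",
    "1792x1024" => "16:9",
    "1024x1792" => "9:16",
    "1024x768" => "4:3",
    "768x1024" => "3:4"
  }

  # `request` is a plain map standing in for %ImageRequest{} (assumption).
  # `fetch_key` is a zero-arity function standing in for key resolution.
  def run(request, opts, fetch_key) when is_function(fetch_key, 0) do
    case get_in(opts, [:adapter_opts, :image_script]) do
      nil ->
        with :ok <- gate_operation(request.operation),
             :ok <- gate_size(request.size) do
          # Key resolution happens last, after both gates.
          {:ok, fetch_key.()}
        end

      script ->
        # Escape hatch: short-circuits before any gate runs.
        {:scripted, script}
    end
  end

  defp gate_operation(op) when op in @supported, do: :ok
  defp gate_operation(op), do: {:error, {:unsupported_operation, %{operation: op}}}

  defp gate_size(nil), do: :ok

  defp gate_size(size) do
    if Map.has_key?(@known_sizes, size), do: :ok, else: {:error, :invalid_request}
  end
end
```

Passing a `fetch_key` that raises is a convenient way to check the ordering: a rejected request returns its error tuple without ever triggering the raise.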
Request-id / metadata round-trip (invariants 5 + 6)
opts[:request_id] is reflected onto response.request_id.
request.metadata round-trips onto response.metadata unchanged.
@spec prepare_request( ALLM.ImageRequest.t(), keyword() ) :: {:ok, Req.Request.t()} | {:error, ALLM.Error.ImageAdapterError.t()}
Return an unfired Req.Request configured exactly as generate/2
would fire it.
Same gate ordering as generate/2. Returns {:error, %ImageAdapterError{}}
for any pre-flight failure.
@spec resolve_image_bytes( ALLM.Image.t(), keyword() ) :: {:ok, binary(), String.t()} | {:error, ALLM.Error.ImageAdapterError.t()}
Resolve an %Image{} source to raw bytes. Mirrors the OpenAI seam at
lib/allm/providers/openai/images.ex:858.
For Gemini, this helper exists for parity with the OpenAI image-adapter
testing surface. The actual :edit request build delegates source
translation to Gemini.part_to_block/1 via the chat
translator, which handles :binary, :base64, and :file sources;
:url is rejected by Gemini.reject_unsupported_image_sources/1.
@spec supported_operations() :: [:generate | :edit]
Return the closed list of operations Gemini's image adapter supports.
Returns [:generate, :edit], per the documented contract. :variation is not supported
by the Gemini-native image models and is rejected pre-flight.
Examples
iex> ALLM.Providers.Gemini.Images.supported_operations()
[:generate, :edit]
@spec to_aspect_ratio(ALLM.ImageRequest.size() | nil) :: {:ok, String.t()} | :omit | {:error, :invalid_size}
Map ImageRequest.size to Gemini's imageConfig.aspectRatio per
the documented contract. Returns the raw aspect-ratio string, :omit for nil,
or {:error, :invalid_size} for an unmappable size.
Square sizes ("NxN" or {n, n}) collapse to "1:1". Non-square
sizes use exact ratio comparison rather than substring matching so
"768x1024" (3:4) and "1024x1792" (~9:16) are disambiguated.
Examples
iex> ALLM.Providers.Gemini.Images.to_aspect_ratio("1024x1024")
{:ok, "1:1"}
iex> ALLM.Providers.Gemini.Images.to_aspect_ratio({1792, 1024})
{:ok, "16:9"}
iex> ALLM.Providers.Gemini.Images.to_aspect_ratio(nil)
:omit
iex> ALLM.Providers.Gemini.Images.to_aspect_ratio("999x111")
{:error, :invalid_size}
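
One way such a mapper could work is sketched below. It is an assumption-laden illustration, not the adapter's implementation: `AspectSketch` and the 2% tolerance are hypothetical, chosen because 1792x1024 is close to but not exactly 16:9. The comparison uses integer cross-multiplication so no floating-point rounding is involved.

```elixir
defmodule AspectSketch do
  @moduledoc false

  # Supported ratios as {width_units, height_units, label}.
  @ratios [{1, 1, "1:1"}, {16, 9, "16:9"}, {9, 16, "9:16"}, {4, 3, "4:3"}, {3, 4, "3:4"}]

  # Assumed tolerance: accept sizes within 2% of a supported ratio.
  @tolerance_pct 2

  def to_aspect_ratio(nil), do: :omit

  def to_aspect_ratio(size) do
    with {:ok, {w, h}} <- parse(size),
         {_rw, _rh, label} <- Enum.find(@ratios, fn {rw, rh, _} -> close?(w, h, rw, rh) end) do
      {:ok, label}
    else
      _ -> {:error, :invalid_size}
    end
  end

  # Relative deviation of w/h from rw/rh is |w*rh - h*rw| / (h*rw).
  defp close?(w, h, rw, rh), do: abs(w * rh - h * rw) * 100 <= @tolerance_pct * h * rw

  defp parse({w, h}) when is_integer(w) and is_integer(h) and w > 0 and h > 0,
    do: {:ok, {w, h}}

  defp parse(size) when is_binary(size) do
    with [ws, hs] <- String.split(size, "x"),
         {w, ""} <- Integer.parse(ws),
         {h, ""} <- Integer.parse(hs),
         true <- w > 0 and h > 0 do
      {:ok, {w, h}}
    else
      _ -> :error
    end
  end

  defp parse(_other), do: :error
end
```

Under these assumptions the sketch reproduces the four doctest results above; a stricter or looser tolerance would change which near-miss sizes are accepted.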
@spec to_image_request_body( ALLM.ImageRequest.t(), keyword() ) :: {:ok, map()} | {:error, ALLM.Error.ImageAdapterError.t()}
Build the JSON request body for an image request.
Synthesizes a chat-equivalent %Request{} (single user message
whose content is the prompt for :generate, or
[%TextPart{}, %ImagePart{},...] for :edit) and delegates to
Gemini.to_gemini_request_body/2 per the documented contract. Then overrides
generationConfig.responseModalities = ["TEXT", "IMAGE"] and (when the
size maps to a known aspect ratio) adds
generationConfig.imageConfig.aspectRatio. :n > 1 adds
generationConfig.candidateCount: n.
Returns {:error, %ImageAdapterError{reason: :invalid_request}} for
unmappable sizes per the documented contract's closed table.
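
The override step can be pictured as a small map transformation applied to whatever the chat translator returns. The sketch below assumes a string-keyed body map and takes the already-mapped aspect ratio as input; `BodySketch` and `apply_image_config/3` are hypothetical names, not part of the library.

```elixir
defmodule BodySketch do
  @moduledoc false

  # `base` stands in for the chat translator's output (assumed string keys).
  # `ratio` is `{:ok, "16:9"}`-style or `:omit`; `n` is the requested count.
  def apply_image_config(base, ratio, n) do
    config =
      base
      |> Map.get("generationConfig", %{})
      |> Map.put("responseModalities", ["TEXT", "IMAGE"])
      |> put_ratio(ratio)
      |> put_count(n)

    Map.put(base, "generationConfig", config)
  end

  # A nil size leaves imageConfig out so Gemini applies its default.
  defp put_ratio(config, :omit), do: config

  defp put_ratio(config, {:ok, ratio}),
    do: Map.put(config, "imageConfig", %{"aspectRatio" => ratio})

  # candidateCount is only added when more than one image is requested.
  defp put_count(config, n) when is_integer(n) and n > 1,
    do: Map.put(config, "candidateCount", n)

  defp put_count(config, _n), do: config
end
```

Existing `generationConfig` entries from the chat translator (a temperature, for example) survive the override in this sketch, because the config map is updated rather than replaced.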