ReqLLM.Providers.Google
(ReqLLM v1.12.0)
Google Gemini provider – built on the OpenAI baseline defaults with Gemini-specific customizations.
Implementation
Uses built-in defaults with custom encoding/decoding to translate between OpenAI format and Gemini API format.
Google-Specific Extensions
Beyond standard OpenAI parameters, Google supports:
google_api_version - Select API version ("v1" or "v1beta"). Defaults to "v1" for production stability. Set to "v1beta" to enable beta features like Google Search grounding.
google_safety_settings - List of safety filter configurations
google_candidate_count - Number of response candidates to generate (default: 1)
google_grounding - Enable Google Search grounding (built-in web search). Requires google_api_version: "v1beta"
google_thinking_budget - Thinking token budget for Gemini 2.5 models (cannot be combined with google_thinking_level)
google_thinking_level - Thinking level for Gemini 3+ models (:minimal, :low, :medium, :high). Cannot be combined with google_thinking_budget
cached_content - Reference to cached content for 90% cost savings (see Context Caching below)
dimensions - Number of dimensions for embedding vectors
task_type - Task type for embeddings (e.g., RETRIEVAL_QUERY)
response_modalities - Control output modalities for image generation (e.g., ["IMAGE"] for image-only)
See provider_schema/0 for the complete Google-specific schema and
ReqLLM.Provider.Options for inherited OpenAI parameters.
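The options above are passed under provider_options, as in the other examples on this page. The following is a minimal sketch of a local pre-flight check for the rule that google_thinking_budget and google_thinking_level cannot be combined. The GoogleOpts module, its validate/1 function, and the error atoms are hypothetical and not part of ReqLLM; the budget value 1024 is purely illustrative.

```elixir
defmodule GoogleOpts do
  @moduledoc """
  Hypothetical helper (not part of ReqLLM): checks Google provider options
  locally before they are handed to ReqLLM.generate_text/3.
  """

  @levels [:minimal, :low, :medium, :high]

  # google_thinking_budget and google_thinking_level are mutually exclusive.
  def validate(opts) when is_list(opts) do
    budget? = Keyword.has_key?(opts, :google_thinking_budget)
    level? = Keyword.has_key?(opts, :google_thinking_level)

    cond do
      budget? and level? ->
        {:error, :thinking_budget_and_level_are_exclusive}

      level? and opts[:google_thinking_level] not in @levels ->
        {:error, :invalid_thinking_level}

      true ->
        {:ok, opts}
    end
  end
end

# Thinking budget for a Gemini 2.5 model (illustrative value).
{:ok, provider_options} =
  GoogleOpts.validate(google_thinking_budget: 1024, google_candidate_count: 1)

# The validated list would then be passed along like this:
# ReqLLM.generate_text("google:gemini-2.5-flash", "Hello", provider_options: provider_options)
```

Checking locally only saves a round trip; the provider schema remains the authority on which options and values are accepted.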
Context Caching
Gemini models support explicit context caching to reduce costs by up to 90% when reusing large amounts of content:
# Create a cache with large context
{:ok, cache} = ReqLLM.Providers.Google.CachedContent.create(
  provider: :google,
  model: "gemini-2.5-flash",
  api_key: System.get_env("GOOGLE_API_KEY"),
  contents: [%{role: "user", parts: [%{text: large_document}]}],
  system_instruction: "You are a helpful assistant.",
  ttl: "3600s"
)

# Use the cache in requests (90% discount on cached tokens!)
{:ok, response} = ReqLLM.generate_text(
  "google:gemini-2.5-flash",
  "Question about the document?",
  provider_options: [cached_content: cache.name]
)

# Check token usage - note the cached_tokens field
IO.inspect(response.usage)
# %{input_tokens: 50, cached_tokens: 10000, output_tokens: 100, ...}

See ReqLLM.Providers.Google.CachedContent for full API documentation.
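To make the discount concrete, here is a rough arithmetic sketch based on the usage map shown above. It assumes cached tokens are billed at 10% of the normal input rate (the 90% figure quoted on this page); actual billing depends on Google's pricing. The CacheSavings module is hypothetical and not part of ReqLLM.

```elixir
defmodule CacheSavings do
  # Hypothetical helper (not part of ReqLLM).
  # Returns how many full-price input tokens the cache discount is worth,
  # assuming cached tokens cost (100 - discount_percent)% of the normal rate.
  def saved_token_equivalents(%{cached_tokens: cached}, discount_percent \\ 90) do
    div(cached * discount_percent, 100)
  end
end

# Same shape as the usage map printed in the example above.
usage = %{input_tokens: 50, cached_tokens: 10_000, output_tokens: 100}

saved = CacheSavings.saved_token_equivalents(usage)
IO.puts("Cache discount is worth about #{saved} full-price input tokens")
```

Integer arithmetic is used so the result stays exact for whole-number percentages.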
API Version Selection
The provider defaults to Google's v1beta API, which supports all features, including function calling
(tools) and Google Search grounding. For legacy compatibility, you can force v1 by setting
google_api_version: "v1", but v1 does not support function calling or grounding. The following request uses grounding on the default API version:
ReqLLM.generate_text(
  "google:gemini-2.5-flash",
  "What are today's tech headlines?",
  provider_options: [
    google_grounding: %{enable: true}
  ]
)

Note: Setting google_api_version: "v1" with function calling (tools) or grounding will return an error.
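The incompatibility described in the note can also be caught before a request is sent. The sketch below is one possible pre-flight check; the ApiVersionCheck module and its error atom are hypothetical, and it assumes tools are passed under a top-level :tools option while the Google settings sit under provider_options.

```elixir
defmodule ApiVersionCheck do
  # Hypothetical pre-flight check (not part of ReqLLM).
  # Rejects google_api_version: "v1" combined with tools or grounding.
  def check(opts) when is_list(opts) do
    provider_options = Keyword.get(opts, :provider_options, [])
    version = Keyword.get(provider_options, :google_api_version)

    # Assumption: tools are supplied as a list under the :tools key.
    tools? = Keyword.get(opts, :tools, []) != []
    grounding? = Keyword.has_key?(provider_options, :google_grounding)

    if version == "v1" and (tools? or grounding?) do
      {:error, :v1_does_not_support_tools_or_grounding}
    else
      :ok
    end
  end
end

# Grounding without forcing v1 passes the check.
:ok = ApiVersionCheck.check(provider_options: [google_grounding: %{enable: true}])

# Forcing v1 together with grounding is rejected locally.
{:error, _reason} =
  ApiVersionCheck.check(
    provider_options: [google_api_version: "v1", google_grounding: %{enable: true}]
  )
```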
Configuration
# Add to .env file (automatically loaded)
GOOGLE_API_KEY=AIza...
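A quick way to confirm the key is visible to the running application is to read it from the environment at startup. This is a minimal sketch; the GoogleKeyCheck module and its error atom are hypothetical and not part of ReqLLM.

```elixir
defmodule GoogleKeyCheck do
  # Hypothetical helper (not part of ReqLLM).
  # Treats both an unset and an empty GOOGLE_API_KEY as missing.
  def fetch_api_key do
    case System.get_env("GOOGLE_API_KEY") do
      nil -> {:error, :missing_google_api_key}
      "" -> {:error, :missing_google_api_key}
      key -> {:ok, key}
    end
  end
end

case GoogleKeyCheck.fetch_api_key() do
  {:ok, _key} -> IO.puts("GOOGLE_API_KEY is set")
  {:error, reason} -> IO.puts("Configuration problem: #{reason}")
end
```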
Functions
Default implementation of attach/3.
Sets up Bearer token authentication and standard pipeline steps.
Default implementation of attach_stream/4.
Builds complete streaming requests using OpenAI-compatible format.
Default implementation of build_body/1.
Builds request body using OpenAI-compatible format for chat and embedding operations.
Default implementation of decode_response/1.
Handles success/error responses with standard ReqLLM.Response creation.
Default implementation of decode_stream_event/2.
Decodes SSE events using OpenAI-compatible format.
Callback implementation for ReqLLM.Provider.default_env_key/0.
Default implementation of encode_body/1.
Encodes request body using OpenAI-compatible format for chat and embedding operations.
Default implementation of extract_usage/2.
Extracts usage data from standard usage field in response body.
Custom prepare_request for chat operations to use Google's specific endpoints.
Uses Google's :generateContent and :streamGenerateContent endpoints instead of the standard OpenAI /chat/completions endpoint.
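The endpoint choice can be pictured as picking an action suffix for the model path. The sketch below is an illustration only, not ReqLLM's implementation: the GeminiEndpoint module is hypothetical, and the "/models/<model>:<action>" path layout is an assumption based on the endpoint names mentioned above.

```elixir
defmodule GeminiEndpoint do
  # Illustration only (not ReqLLM code).
  # Assumed path layout: /models/<model>:<action>
  def path(model, stream?) when is_binary(model) and is_boolean(stream?) do
    action = if stream?, do: "streamGenerateContent", else: "generateContent"
    "/models/#{model}:#{action}"
  end
end

IO.puts(GeminiEndpoint.path("gemini-2.5-flash", false))
IO.puts(GeminiEndpoint.path("gemini-2.5-flash", true))
```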
Default implementation of translate_options/3.
Pass-through implementation that returns options unchanged.