# Google Vertex AI Access Claude, Gemini, and MaaS models through Google Cloud's Vertex AI platform. ## Configuration Vertex AI uses Google Cloud OAuth2 authentication with service accounts. ### Service Account (Recommended) **Environment Variables:** ```bash GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json" GOOGLE_CLOUD_PROJECT="your-project-id" GOOGLE_CLOUD_REGION="global" ``` **Provider Options:** ```elixir ReqLLM.generate_text( "google_vertex:claude-sonnet-4-5@20250929", "Hello", provider_options: [ service_account_json: "/path/to/service-account.json", project_id: "your-project-id", region: "global" ] ) ``` ## Model Specs For the full model-spec workflow, see [Model Specs](model-specs.md). Use exact Vertex model IDs from [LLMDB.xyz](https://llmdb.xyz) when possible. For MaaS and other OpenAI-compatible Vertex models that are not in the registry yet, build a full explicit model spec with `ReqLLM.model!/1`. Some MaaS model IDs also need `extra.family` when the family cannot be inferred from the ID alone. ## Provider Options Passed via `:provider_options` keyword: ### `service_account_json` - **Type**: String (file path) - **Purpose**: Path to Google Cloud service account JSON file - **Fallback**: `GOOGLE_APPLICATION_CREDENTIALS` env var - **Example**: `provider_options: [service_account_json: "/path/to/credentials.json"]` ### `access_token` - **Type**: String - **Purpose**: Use an existing OAuth2 access token generated outside ReqLLM (e.g., via Goth or gcloud) - **Behavior**: Bypasses the service account JSON flow and internal token management - **Example**: `provider_options: [access_token: "your-access-token"]` ### `project_id` - **Type**: String - **Purpose**: Google Cloud project ID - **Fallback**: `GOOGLE_CLOUD_PROJECT` env var - **Example**: `provider_options: [project_id: "my-project-123"]` - **Required**: Yes ### `region` - **Type**: String - **Default**: `"global"` - **Purpose**: GCP region for Vertex AI endpoint - **Example**: `provider_options: [region: "us-central1"]` - **Note**: Use `"global"` for newest models, specific regions for regional deployment ### `additional_model_request_fields` - **Type**: Map - **Purpose**: Model-specific request fields (e.g., thinking configuration) - **Example**: ```elixir provider_options: [ additional_model_request_fields: %{ thinking: %{type: "enabled", budget_tokens: 4096} } ] ``` ### Claude-Specific Options Vertex AI supports the same Claude options as native Anthropic: #### `anthropic_top_k` - **Type**: `1..40` - **Purpose**: Sample from top K options per token - **Example**: `provider_options: [anthropic_top_k: 20]` #### `stop_sequences` - **Type**: List of strings - **Purpose**: Custom stop sequences - **Example**: `provider_options: [stop_sequences: ["END", "STOP"]]` #### `anthropic_metadata` - **Type**: Map - **Purpose**: Request metadata for tracking - **Example**: `provider_options: [anthropic_metadata: %{user_id: "123"}]` #### `thinking` - **Type**: Map - **Purpose**: Enable extended thinking/reasoning - **Example**: `provider_options: [thinking: %{type: "enabled", budget_tokens: 4096}]` - **Access**: `ReqLLM.Response.thinking(response)` #### `anthropic_prompt_cache` - **Type**: Boolean - **Purpose**: Enable prompt caching - **Example**: `provider_options: [anthropic_prompt_cache: true]` #### `anthropic_prompt_cache_ttl` - **Type**: String (e.g., `"1h"`) - **Purpose**: Cache TTL (default ~5min if omitted) - **Example**: `provider_options: [anthropic_prompt_cache_ttl: "1h"]` ## Supported Models ### Claude 4.5 Family - **Haiku 4.5**: `google_vertex:claude-haiku-4-5@20251001` - Fast, cost-effective - Full tool calling and reasoning support - **Sonnet 4.5**: `google_vertex:claude-sonnet-4-5@20250929` - Balanced performance and capability - Extended thinking support - **Opus 4.1**: `google_vertex:claude-opus-4-1@20250805` - Highest capability - Advanced reasoning ### Claude 4.0 & Earlier - **Sonnet 4.0**: `google_vertex:claude-sonnet-4@20250514` - **Opus 4.0**: `google_vertex:claude-opus-4@20250514` - **Sonnet 3.7**: `google_vertex:claude-3-7-sonnet@20250219` - **Sonnet 3.5 v2**: `google_vertex:claude-3-5-sonnet@20241022` - **Haiku 3.5**: `google_vertex:claude-3-5-haiku@20241022` ### Model ID Format Vertex uses the `@` symbol for versioning: - Format: `claude-{tier}-{version}@{date}` - Example: `claude-sonnet-4-5@20250929` ## Wire Format Notes - **Authentication**: OAuth2 with service account tokens (auto-refreshed) - **Endpoint**: Model-specific paths under `aiplatform.googleapis.com` - **API**: Uses Anthropic's raw message format (compatible with native API) - **Streaming**: Standard Server-Sent Events (SSE) - **Region routing**: Global endpoint for newest models, regional for specific deployments All differences handled automatically by ReqLLM. ## Resources - [Vertex AI Documentation](https://cloud.google.com/vertex-ai/docs) - [Claude on Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-claude) - [Service Account Setup](https://cloud.google.com/iam/docs/service-accounts-create)