# Zenmux Enterprise-grade LLM aggregation platform with dual-protocol support (OpenAI and Anthropic compatible), intelligent routing, and insurance-backed quality guarantees. ## Configuration ```bash ZENMUX_API_KEY=sk-ai-v1-... ``` ## Model Specs For the full model-spec workflow, see [Model Specs](model-specs.md). Zenmux is another good fit for the full explicit model specification path because protocol and `base_url` can vary by deployment. Use exact IDs from [LLMDB.xyz](https://llmdb.xyz) when possible, and use `ReqLLM.model!/1` when you need to pin custom routing metadata. ## Provider Options Passed via `:provider_options` keyword: ### Provider Routing Configure multi-provider routing and failover strategies: #### `provider` - **Type**: Map - **Purpose**: Advanced provider routing configuration - **Structure**: ```elixir provider_options: [ provider: %{ routing: %{ type: "priority", # "priority" | "round_robin" | "least_latency" primary_factor: "cost", # "cost" | "speed" | "quality" providers: ["openai", "anthropic", "google"] }, fallback: "anthropic" # true | false | specific provider name } ] ``` - **Routing Types**: - `"priority"`: Try providers in order with fallback - `"round_robin"`: Distribute requests evenly across providers - `"least_latency"`: Select fastest provider based on metrics - **Primary Factors**: - `"cost"`: Prefer lower-cost providers - `"speed"`: Prefer faster-response providers - `"quality"`: Prefer higher-quality providers - **Fallback**: - `true`: Enable automatic failover to any available provider - `false`: Disable failover (return error on failure) - `"provider_name"`: Specific fallback provider (e.g., `"anthropic"`) ### Model Routing Configure model selection within the same provider: #### `model_routing_config` - **Type**: Map - **Purpose**: Intelligent model selection based on task requirements - **Structure**: ```elixir provider_options: [ model_routing_config: %{ available_models: ["openai/gpt-4o", "openai/gpt-4-turbo", "anthropic/claude-sonnet-4"], preference: "openai/gpt-4o", task_info: %{ task_type: "chat", # "chat" | "completion" | "embedding" complexity: "medium", # "low" | "medium" | "high" additional_properties: %{} } } ] ``` - **Task Complexity**: - `"low"`: Simple tasks (short answers, basic rewrites) - `"medium"`: Moderate complexity (general Q&A, basic code) - `"high"`: Complex tasks (long documents, complex programming) ### Reasoning Configuration #### `reasoning` - **Type**: Map - **Purpose**: Configure reasoning process behavior - **Keys**: - `enable`: Boolean to enable/disable reasoning - `depth`: Control reasoning depth - `expose`: Whether to expose reasoning in response - **Example**: ```elixir provider_options: [ reasoning: %{ enable: true, depth: "high", expose: true } ] ``` ### Web Search #### `web_search_options` - **Type**: Map - **Purpose**: Enable proactive web search for real-time information - **Example**: ```elixir provider_options: [ web_search_options: %{ enabled: true, max_results: 5 } ] ``` ### Output Control #### `max_completion_tokens` - **Type**: Positive Integer - **Purpose**: Maximum number of tokens to generate (including reasoning tokens). - **Note**: Replaces `max_tokens`. If `max_tokens` is provided, it is automatically aliased to `max_completion_tokens`. - **Example**: `max_completion_tokens: 4096` #### `verbosity` - **Type**: `"low"` | `"medium"` | `"high"` - **Default**: `"medium"` - **Purpose**: Control output detail level - **Example**: `provider_options: [verbosity: "high"]` #### `reasoning_effort` - **Type**: `:none` | `:minimal` | `:low` | `:medium` | `:high` | `:xhigh` - **Purpose**: Control reasoning model effort level - **Example**: `provider_options: [reasoning_effort: :high]` ## Dual Protocol Support Zenmux supports both OpenAI and Anthropic API protocols: ### OpenAI Protocol (Default) ```elixir # Standard OpenAI-compatible endpoint ReqLLM.generate_text("zenmux:openai/gpt-4o", "Hello!") ``` ### Anthropic Protocol For Anthropic-compatible tools and integrations, use the Anthropic base URL: ```elixir # Configure with Anthropic protocol endpoint model = ReqLLM.model!(%{ provider: :zenmux, id: "anthropic/claude-sonnet-4.5", base_url: "https://zenmux.ai/api/anthropic" }) ReqLLM.generate_text(model, "Hello!") ``` ## Usage Examples ### Basic Text Generation ```elixir {:ok, response} = ReqLLM.generate_text( "zenmux:openai/gpt-4o", "What is the capital of France?" ) ``` ### Advanced Routing with Fallback ```elixir {:ok, response} = ReqLLM.generate_text( "zenmux:openai/gpt-4o", "Complex analysis task", provider_options: [ provider: %{ routing: %{ type: "priority", primary_factor: "quality", providers: ["openai", "anthropic", "google"] }, fallback: "anthropic" } ] ) ``` ### Model Selection by Task Complexity ```elixir {:ok, response} = ReqLLM.generate_text( "zenmux:openai/gpt-4o", "Write a complex technical document", provider_options: [ model_routing_config: %{ available_models: ["openai/gpt-4o", "anthropic/claude-sonnet-4"], task_info: %{ task_type: "completion", complexity: "high" } } ] ) ``` ### Web Search Integration ```elixir {:ok, response} = ReqLLM.generate_text( "zenmux:openai/gpt-4o", "What are the latest AI developments in 2026?", provider_options: [ web_search_options: %{ enabled: true, max_results: 10 } ] ) ``` ## Model Discovery Browse available models: - [Zenmux Models](https://zenmux.ai/models) - Filter by protocol: OpenAI API Compatible or Anthropic API Compatible ## Key Benefits - **Dual Protocol Support**: Compatible with both OpenAI and Anthropic SDKs - **Enterprise Reliability**: High capacity reserves with automatic failover - **Quality Assurance**: Regular degradation checks (HLE tests) with transparent results - **AI Model Insurance**: Automated detection and payouts for quality issues - **Intelligent Routing**: Optimize for cost, speed, or quality automatically - **Global Edge Network**: Low-latency service via Cloudflare infrastructure - **Single API Key**: Access all providers with unified billing ## Pricing Transparent usage-based pricing with unified billing: ```elixir {:ok, response} = ReqLLM.generate_text("zenmux:openai/gpt-4o", "Hello") IO.puts("Cost: $#{response.usage.total_cost}") IO.puts("Tokens: #{response.usage.total_tokens}") ``` ## Resources - [Zenmux Documentation](https://docs.zenmux.ai) - [Model List](https://zenmux.ai/models) - [Benchmarks](https://zenmux.ai/benchmark) - [API Reference](https://docs.zenmux.ai/api/openai/create-chat-completion-new)