OpenRouter completion client for making AI API calls.
This module handles the actual HTTP requests to OpenRouter's chat completions
and other endpoints. It's used internally by PhoenixKitAI public functions.
Supported Endpoints
- /chat/completions - Text and vision completions
- /embeddings - Text embeddings
- /images/generations - Image generation (planned)
Logging conventions
- Logger.warning - expected/recoverable external failures (non-2xx HTTP responses, transport errors, rate limits). Callers see a user-facing error.
- Logger.error - unexpected internal failures (unknown error shapes, parse failures).
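The split between the two levels can be sketched as a small classifier. This is only an illustration: the module name `LogLevelSketch` and the failure shapes (`{:http_error, status}`, `{:transport_error, reason}`, `:rate_limited`) are hypothetical and may not match the terms the client actually produces.

```elixir
defmodule LogLevelSketch do
  require Logger

  # Hypothetical failure shapes, assumed for illustration only.
  # Expected, recoverable external failures map to :warning.
  def level({:http_error, _status}), do: :warning
  def level({:transport_error, _reason}), do: :warning
  def level(:rate_limited), do: :warning
  # Anything else is treated as an unexpected internal failure.
  def level(_other), do: :error

  # Logs the failure at the level chosen by level/1.
  def log(failure) do
    case level(failure) do
      :warning -> Logger.warning("OpenRouter request failed: #{inspect(failure)}")
      :error -> Logger.error("Unexpected failure: #{inspect(failure)}")
    end
  end
end
```

Keeping the classification in one pure function makes the convention easy to check without capturing log output.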
Functions
Makes a chat completion request to OpenRouter.
Parameters
- endpoint - The AI endpoint struct with API key and model
- messages - List of message maps with :role and :content
- opts - Additional options (temperature, max_tokens, etc.)
Options
- :temperature - Sampling temperature (0-2)
- :max_tokens - Maximum tokens in response
- :top_p - Nucleus sampling parameter
- :top_k - Top-k sampling parameter
- :frequency_penalty - Frequency penalty (-2 to 2)
- :presence_penalty - Presence penalty (-2 to 2)
- :repetition_penalty - Repetition penalty (0 to 2)
- :stop - Stop sequences (list of strings)
- :seed - Random seed for reproducibility
- :stream - Enable streaming (default: false)
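As a rough illustration of the inputs described above, the following sketch builds a message list and an options keyword list, and checks the documented numeric ranges before a request would be sent. The `OptsSketch` module and its `validate/1` function are hypothetical; the source does not say whether the client validates options itself.

```elixir
defmodule OptsSketch do
  # Ranges copied from the option list above. This validator is hypothetical
  # and not part of the documented client.
  @ranges %{
    temperature: {0, 2},
    frequency_penalty: {-2, 2},
    presence_penalty: {-2, 2},
    repetition_penalty: {0, 2}
  }

  # Returns :ok, or {:error, {key, value}} for the first out-of-range option.
  def validate(opts) do
    Enum.find_value(opts, :ok, fn {key, value} ->
      case Map.get(@ranges, key) do
        {min, max} when is_number(value) and (value < min or value > max) ->
          {:error, {key, value}}

        _ ->
          nil
      end
    end)
  end
end

# Messages are maps with :role and :content, as described in Parameters.
messages = [
  %{role: "system", content: "You are a concise assistant."},
  %{role: "user", content: "Say hello."}
]

opts = [temperature: 0.7, max_tokens: 100, stop: ["\n\n"], seed: 42]

:ok = OptsSketch.validate(opts)
{:error, {:temperature, 3.5}} = OptsSketch.validate(temperature: 3.5)
```

Options without a documented range (such as :max_tokens or :seed) pass through unchecked in this sketch.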
Returns
- {:ok, response} - Successful response with completion
- {:error, reason} - Error atom or tagged tuple. See PhoenixKitAI.Errors for the full reason vocabulary and translation.
Response Structure
    %{
      "id" => "gen-...",
      "model" => "anthropic/claude-3-haiku",
      "choices" => [
        %{
          "message" => %{
            "role" => "assistant",
            "content" => "Hello! How can I help you today?"
          },
          "finish_reason" => "stop"
        }
      ],
      "usage" => %{
        "prompt_tokens" => 10,
        "completion_tokens" => 15,
        "total_tokens" => 25
      }
    }
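Given the return values and response structure above, a caller might pull the first choice's text out with pattern matching. This is a minimal sketch: `ResponseSketch`, `first_content/1`, and the `:no_content` atom are hypothetical names, not part of the documented API or of PhoenixKitAI.Errors.

```elixir
defmodule ResponseSketch do
  # Hypothetical helper: takes a result tuple shaped like the documented
  # return values and yields the first choice's text content.
  def first_content({:ok, %{"choices" => [%{"message" => %{"content" => content}} | _]}})
      when is_binary(content),
      do: {:ok, content}

  # A successful response without usable text (the :no_content atom is an assumption).
  def first_content({:ok, _response}), do: {:error, :no_content}

  # Errors are passed through unchanged.
  def first_content({:error, _reason} = error), do: error
end

response = %{
  "id" => "gen-...",
  "choices" => [
    %{"message" => %{"role" => "assistant", "content" => "Hello!"}, "finish_reason" => "stop"}
  ],
  "usage" => %{"prompt_tokens" => 10, "completion_tokens" => 15, "total_tokens" => 25}
}

{:ok, "Hello!"} = ResponseSketch.first_content({:ok, response})
```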
Makes an embeddings request to OpenRouter.
Parameters
- endpoint - The AI endpoint struct with API key and model
- input - Text or list of texts to embed
- opts - Additional options
Options
- :dimensions - Output dimensions (model-specific)
Returns
- {:ok, response} - Response with embeddings
- {:error, reason} - Error atom or tagged tuple. See PhoenixKitAI.Errors for the full reason vocabulary and translation.
Extracts the text content from a chat completion response.
Extracts the reasoning / chain-of-thought from a chat completion response, for reasoning models (DeepSeek-R1, Mistral Magistral, OpenAI o-series, etc.).
Different providers put the chain-of-thought in different fields:
- OpenRouter (and most providers it proxies): message.reasoning
- DeepSeek native API: message.reasoning_content
- Some providers may use message.thinking
Returns the first non-empty string found, or nil if no reasoning is
present (i.e. for non-reasoning models or when the operator opted out
of returning reasoning via reasoning_exclude: true).
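The lookup described above can be sketched as follows. This is not the module's actual implementation: it assumes the three fields live on the first choice's message map, that they are checked in the order listed, and that "non-empty" means any string other than "". The module name `ReasoningSketch` is hypothetical.

```elixir
defmodule ReasoningSketch do
  # Field order is an assumption based on the list above.
  @fields ["reasoning", "reasoning_content", "thinking"]

  # Returns the first non-empty string among the three fields, else nil.
  def extract(%{"choices" => [%{"message" => message} | _]}) when is_map(message) do
    Enum.find_value(@fields, fn field ->
      case Map.get(message, field) do
        text when is_binary(text) and text != "" -> text
        _ -> nil
      end
    end)
  end

  # Any other response shape carries no reasoning.
  def extract(_response), do: nil
end
```

Because `Enum.find_value/2` returns nil when no element yields a truthy value, non-reasoning models fall through to nil without a separate clause.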
reasoning_exclude: true and buggy providers
The endpoint's reasoning_exclude flag controls the REQUEST payload —
it tells the provider not to send reasoning back. A correctly-behaving
provider then returns a response without any of the three reasoning
fields and extract_reasoning/1 returns nil.
A buggy provider (or one that doesn't honour the flag) might still
include reasoning. We deliberately extract it anyway rather than
gating the response-side capture on reasoning_exclude — the
reasoning in metadata.response_reasoning then doubles as a
breadcrumb that lets operators correlate "request asked for no
reasoning but provider sent it anyway" against a specific provider,
model, and request id (PR #6 review finding #9). PII / data-retention
concerns are covered by the capture_request_content?/0
application-config gate: when content capture is off, the
response_reasoning metadata is dropped too.
If you specifically want "discard reasoning when reasoning_exclude: true regardless of what the provider sent", that's a separate
faithfulness-mode opt the caller would have to set on top of this
helper. The helper itself stays transparent.
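A caller-side wrapper of that kind might look like the sketch below. Everything here is hypothetical: the module name, the function name, and the assumption that the endpoint exposes the flag under a :reasoning_exclude key. The extraction function is passed in as an argument so the sketch stays self-contained.

```elixir
defmodule FaithfulnessSketch do
  # `extractor` stands in for extract_reasoning/1; `endpoint` is any map or
  # struct with a :reasoning_exclude key. Both shapes are assumptions.
  def reasoning_or_nil(endpoint, response, extractor) when is_function(extractor, 1) do
    if Map.get(endpoint, :reasoning_exclude, false) do
      # Discard whatever the provider sent, honouring the caller's request.
      nil
    else
      extractor.(response)
    end
  end
end

extractor = fn _response -> "provider sent this anyway" end

nil = FaithfulnessSketch.reasoning_or_nil(%{reasoning_exclude: true}, %{}, extractor)
"provider sent this anyway" =
  FaithfulnessSketch.reasoning_or_nil(%{reasoning_exclude: false}, %{}, extractor)
```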
Extracts usage information from a response.
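Returns a map with token counts and cost (if available from OpenRouter). Cost is stored as an integer in what the module calls nanodollars, defined here as 1/1,000,000 of a dollar (strictly speaking, that scale is a microdollar), to preserve precision for cheap API calls. The value is kept in the cost_cents field for backward compatibility, so despite its name that field does not hold cents.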
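To make the unit concrete, here is a small conversion sketch that assumes the 1/1,000,000 scale stated above. The module `CostSketch` and its function names are hypothetical; the real client may convert differently.

```elixir
defmodule CostSketch do
  # Scale taken from the text above: one stored unit is 1/1,000,000 of a dollar.
  @units_per_dollar 1_000_000

  # Converts a dollar amount to stored integer units, rounding to the nearest unit.
  def to_units(dollars) when is_number(dollars), do: round(dollars * @units_per_dollar)

  # Converts stored integer units back to a float dollar amount.
  def to_dollars(units) when is_integer(units), do: units / @units_per_dollar
end

# A call costing $0.000375 is stored as the integer 375.
375 = CostSketch.to_units(0.000375)
```

Storing an integer avoids accumulating floating-point error when many small per-call costs are summed.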