API Reference ReqLLM v1.12.0
Modules
Main API facade for Req AI.
Application-layer response cache hooks for generation requests.
Context represents a conversation history as a collection of messages.
Centralized debug logging for ReqLLM development and troubleshooting.
Embedding functionality for ReqLLM.
Error handling system for ReqLLM using Splode.
Error class for API-related failures and HTTP errors.
Error for JSON responses that cannot be parsed.
Error for API request failures, HTTP errors, and network issues.
Error for provider response parsing failures and unexpected response formats.
Error for when generated objects don't match the expected schema.
Error for stream processing failures.
Error class for invalid input parameters and configurations.
Error for unsupported model capabilities.
Error for invalid message content.
Error for invalid message structures or validation failures.
Error for invalid message list structures.
Error for unimplemented functionality.
Error for invalid or missing parameters.
Error for unknown or unsupported providers.
Error for providers that exist but have no implementation (metadata-only).
Error for invalid message roles.
Error for invalid schema definitions.
Error class for unexpected or unhandled errors.
Error for unexpected or unhandled errors.
Error class for validation failures and parameter errors.
Error for parameter validation failures.
Behaviour for transforming Finch.Request structs just before a streaming
request is sent.
Text generation functionality for ReqLLM.
Image generation functionality for ReqLLM.
Handles API key lookup with the following precedence
Message represents a single conversation message with multi-modal content support.
ContentPart represents a single piece of content within a message.
Normalized reasoning/thinking data from LLM providers.
Helper functions for querying LLMDB.Model capabilities.
Optical Character Recognition for ReqLLM.
Experimental low-level Realtime WebSocket client for OpenAI.
An experimental OpenAI Realtime WebSocket session.
Bridges ReqLLM request lifecycle telemetry into OpenTelemetry GenAI client spans.
Behaviour the OpenTelemetry bridge uses to talk to a tracer.
Builds the scalar gen_ai.* / server.* / error.* attribute maps
emitted on GenAI client spans, from ReqLLM request lifecycle metadata.
Shapes ReqLLM request and response payloads into the GenAI content
attributes — gen_ai.input.messages, gen_ai.system_instructions,
gen_ai.tool.definitions, gen_ai.output.messages.
Builds histogram records for the four OpenTelemetry GenAI client metrics:
gen_ai.client.operation.duration, gen_ai.client.token.usage,
gen_ai.client.operation.time_to_first_chunk, and
gen_ai.client.operation.time_per_output_chunk.
Default ReqLLM.OpenTelemetry.Adapter implementation, backed by the
Erlang OpenTelemetry SDK (:otel_tracer, :otel_span, :otel_meter).
Spec name tables for the OpenTelemetry GenAI semantic conventions —
gen_ai.provider.name, gen_ai.operation.name, gen_ai.output.type,
and the canonical span name.
Cross-cutting helpers shared by ReqLLM.OpenTelemetry and
ReqLLM.Telemetry.OpenTelemetry — option parsing, Langfuse cost-details
merging, error rendering.
Composable parameter transformation engine for applying model-specific rules to options.
Behaviour for LLM provider implementations.
Shared streaming-chunk reducer used by both ReqLLM.StreamServer (the
hot path, one chunk at a time) and
ReqLLM.Provider.Defaults.ResponseBuilder (batch, full chunk list at
end-of-stream).
Default implementations for common provider behavior patterns.
Default ResponseBuilder implementation for OpenAI-compatible providers.
Runtime generation options processing for ReqLLM providers.
Behaviour for provider-specific Response assembly from StreamChunks.
Shared utilities for provider implementations.
Provider discovery and dispatch via introspection.
Alibaba Cloud Bailian (DashScope) provider – international endpoint.
Shared logic for Alibaba Cloud Bailian (DashScope) providers.
Alibaba Cloud Bailian (DashScope) provider – China/Beijing endpoint.
AWS Bedrock provider implementation using the Provider behaviour.
Parser for the AWS Event Stream protocol, specialized for Amazon Bedrock.
Anthropic model family support for AWS Bedrock.
Cohere model family support for AWS Bedrock.
AWS Bedrock Converse API support for unified tool calling across models.
Meta Llama model family support for AWS Bedrock.
OpenAI model family support for AWS Bedrock.
Shared utilities for unwrapping AWS Bedrock response formats.
AWS Security Token Service (STS) integration for AssumeRole.
Provider implementation for Anthropic Claude models.
Shared helper functions for Anthropic model adapters (Bedrock, Vertex).
Anthropic-specific context encoding for the Messages API format.
Shared extended thinking/reasoning support for Anthropic models on third-party platforms.
Anthropic-specific response decoding for the Messages API format.
Anthropic-specific ResponseBuilder implementation.
Azure AI provider implementation.
Anthropic model family support for Azure.
OpenAI model family support for Azure OpenAI Service.
Azure Responses API adapter.
Cerebras provider – OpenAI-compatible Chat Completions API with ultra-fast inference.
Cohere provider implementation for reranking operations.
DeepSeek AI provider – OpenAI-compatible Chat Completions API.
ElevenLabs provider for text-to-speech and speech-to-text transcription.
Fireworks AI provider – OpenAI-compatible Chat Completions API.
Google Gemini provider – built on the OpenAI baseline defaults with Gemini-specific customizations.
Shared functionality for Google's Context Caching API.
Google/Gemini-specific ResponseBuilder implementation.
Google Vertex AI provider implementation.
Anthropic model family support for Google Vertex AI.
Google Cloud OAuth2 authentication for Vertex AI.
Gemini model family support for Google Vertex AI.
OpenAI-compatible model family support for Google Vertex AI.
OAuth2 token cache for Google Vertex AI.
Groq provider – 100% OpenAI Chat Completions compatible with Groq's high-performance hardware.
Generic Meta Llama provider implementing Meta's native prompt format.
MiniMax provider using the OpenAI-compatible Chat Completions API.
MiniMax-specific ResponseBuilder implementation.
Mistral provider built on the official Mistral chat completions and embeddings APIs.
NEAR AI Cloud provider using the OpenAI-compatible Chat Completions API.
OpenAI provider implementation with multi-driver architecture for Chat, Responses, and Images APIs.
Behaviour for OpenAI API endpoint drivers.
Shared helper functions for OpenAI-compatible model adapters (Azure, etc.).
OpenAI Chat Completions API driver.
OpenAI Images API driver.
Defines reusable parameter transformation profiles for OpenAI models.
OpenAI Responses API driver for reasoning models.
OpenAI Responses API-specific ResponseBuilder implementation.
OpenAI Codex provider backed by the ChatGPT Codex responses endpoint.
OpenRouter provider – OpenAI Chat Completions compatible with OpenRouter's unified API.
vLLM provider – self-hosted OpenAI-compatible Chat Completions API.
Venice AI provider – OpenAI-compatible Chat Completions API with privacy-first inference.
xAI (Grok) provider – OpenAI Chat Completions compatible with xAI's models and features.
xAI Images API driver.
Z.AI provider – OpenAI-compatible Chat Completions API (Standard Endpoint).
Z.AI Coder provider – OpenAI-compatible Chat Completions API (Coding Endpoint).
Z.AI Coding Plan provider – alias for zai_coder.
Zenmux provider – OpenAI Chat Completions compatible with Zenmux's unified API.
Reranking functionality for ReqLLM.
Canonical reranking response for ReqLLM.
High-level representation of an LLM turn.
Stream processing utilities for ReqLLM responses.
Single schema authority for NimbleOptions ↔ JSON Schema conversion.
Text-to-speech generation functionality for ReqLLM.
Result of a text-to-speech generation operation.
Req step that integrates with Splode error handling.
Req step that attaches test fixture functionality when running in test environments.
Req step that handles automatic retries for transient network errors.
Req step that emits request and reasoning telemetry for Req-backed flows.
Centralized Req step that extracts token usage from provider responses, normalizes usage values across providers, computes costs, and emits telemetry.
Represents a single chunk in a streaming response.
A streaming response container that provides both real-time streaming and asynchronous metadata.
Asynchronous metadata cache that allows multiple awaiters to share the same result.
GenServer that manages streaming LLM sessions with backpressure and SSE parsing.
Main orchestration for ReqLLM streaming operations.
Finch HTTP client for ReqLLM streaming operations.
Lightweight HTTP context for streaming operations.
Retry wrapper for Finch streaming requests.
Provider-agnostic Server-Sent Events (SSE) parsing utilities.
Native :telemetry emitter for ReqLLM request and reasoning lifecycle.
Dependency-free helpers for mapping ReqLLM telemetry metadata to OpenTelemetry GenAI span data.
Normalizes caller-facing inference options into the compact request_options
map exposed on [:req_llm, :request, *] telemetry metadata.
Extracts the upstream server (address, port, path) from a request
representation for the telemetry context's server map.
Tool definition for AI model function calling.
Represents a single tool call from an assistant message.
Tool call ID compatibility helpers for cross-provider conversations.
ToolResult represents structured and multi-part tool outputs.
Speech-to-text transcription functionality for ReqLLM.
Result of an audio transcription operation.
Usage normalization helpers.
Mix Tasks
Generate text or structured objects from any supported AI model with unified interface.
Install and configure ReqLLM for use in an application.
Validate ReqLLM model coverage using the fixture system.
Validates livebook files by extracting Elixir code blocks.
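The livebook validation task above is described as extracting Elixir code blocks. As a rough illustration of that idea only, the sketch below pulls fenced Elixir blocks out of Livebook-style Markdown and checks that each one parses. The module name `LivebookBlocks` and its functions are hypothetical and are not part of ReqLLM; the real task may work quite differently.

```elixir
defmodule LivebookBlocks do
  @moduledoc """
  Hypothetical helper (not part of ReqLLM): extracts fenced Elixir code
  blocks from Livebook (.livemd) source text and syntax-checks them.
  """

  # Matches ```elixir ... ``` fences. The `m` flag anchors ^ and $ to line
  # boundaries, `s` lets `.` span newlines, and the lazy `.*?` stops at the
  # first closing fence.
  defp fence, do: ~r/^```elixir[ \t]*\n(.*?)^```[ \t]*$/ms

  @doc "Returns the body of every Elixir code block, in document order."
  def extract(source) when is_binary(source) do
    fence()
    |> Regex.scan(source, capture: :all_but_first)
    |> List.flatten()
  end

  @doc """
  Returns :ok when every block parses as Elixir, otherwise
  {:error, indexes} with the 1-based positions of the blocks that failed.
  """
  def validate(source) when is_binary(source) do
    failures =
      source
      |> extract()
      |> Enum.with_index(1)
      |> Enum.filter(fn {code, _index} ->
        # Parse only; the code is never evaluated.
        match?({:error, _}, Code.string_to_quoted(code))
      end)
      |> Enum.map(fn {_code, index} -> index end)

    if failures == [], do: :ok, else: {:error, failures}
  end
end
```

This sketch only checks syntax with `Code.string_to_quoted/1`; it does not compile or run the blocks, so it cannot catch undefined functions or runtime errors.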