API Reference ex_llm v0.8.1
Modules
ExLLM - Unified Elixir client library for Large Language Models.
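As a quick orientation to the unified client, a minimal usage sketch might look like the following. This assumes the `ExLLM.chat/2` entry point taking a provider atom and a message list and returning `{:ok, response}` with a `content` field, as described in the project's own documentation; treat the exact shapes as illustrative.

```elixir
# Hedged sketch: assumes ExLLM.chat/2 accepts a provider atom plus a list of
# role/content message maps, and returns {:ok, response} on success.
messages = [%{role: "user", content: "Hello, how are you?"}]

case ExLLM.chat(:anthropic, messages) do
  {:ok, response} -> IO.puts(response.content)
  {:error, reason} -> IO.inspect(reason, label: "chat failed")
end
```

Running this requires provider credentials (e.g. an Anthropic API key) to be configured via one of the configuration providers listed below.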
Unified capability querying with automatic normalization.
Core chat functionality for ExLLM.
Context management for LLM conversations.
Message truncation strategies for context window management.
Cost calculation functionality for ExLLM.
Utilities for displaying cost information in various formats.
Session-level cost tracking and aggregation functionality.
Text embeddings generation across LLM providers.
Function calling support for ExLLM.
Represents a callable function/tool.
Represents a function call request from the LLM.
Represents the result of executing a function.
Model discovery and management across LLM providers.
Session management for ExLLM - handles conversation sessions with LLM providers.
Streaming error recovery and resumption support for ExLLM.
Represents a partial streaming response that can be resumed.
Structured output support for ExLLM using instructor_ex.
Vision and multimodal support for ExLLM.
Unified caching system for ExLLM providing both runtime performance caching and optional disk persistence for development/testing.
Cache statistics.
Behaviour for cache storage backends.
ETS-based storage backend for ExLLM cache.
Specialized storage backend for timestamp-based test response caching.
High-performance circuit breaker implementation using ETS for concurrent access.
Adaptive circuit breaker that automatically adjusts failure thresholds based on error patterns.
Bulkhead pattern implementation for circuit breaker concurrency limiting.
GenServer that manages bulkhead state for a single circuit.
Public API for circuit breaker configuration management.
Configuration management system for circuit breakers.
Health check and monitoring system for circuit breakers.
Comprehensive metrics integration for circuit breakers.
Dashboard and visualization helpers for circuit breaker metrics.
Prometheus metrics endpoint for circuit breaker monitoring.
StatsD metrics reporter for circuit breakers.
Telemetry instrumentation for circuit breaker operations.
Model capability discovery and management for ExLLM.
Represents a model capability or feature.
Complete information about a model's capabilities.
Model configuration loader for ExLLM.
Dynamic model loader for ExLLM adapters.
Provider-level capability tracking for ExLLM.
Represents provider-level information and capabilities.
Behaviour for configuration providers.
Default configuration provider that reads from environment variables. This is an alias for the Env provider for backward compatibility.
Environment-based configuration provider.
Static configuration provider for testing and library usage.
Standardized error types and utilities for ExLLM.
Unified logging for ExLLM with automatic context and security features.
Request retry logic with exponential backoff for ExLLM.
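The retry entry above names a standard technique; a self-contained sketch of exponential backoff in Elixir (illustrative only, not ExLLM's actual implementation, which is policy-driven per provider) could read:

```elixir
# Illustrative exponential backoff: retry a zero-arity function up to
# `attempts` times, doubling the delay after each failure.
defmodule BackoffSketch do
  def retry(fun, attempts \\ 3, base_ms \\ 100) do
    case fun.() do
      {:ok, result} ->
        {:ok, result}

      {:error, _reason} when attempts > 1 ->
        Process.sleep(base_ms)
        retry(fun, attempts - 1, base_ms * 2)

      error ->
        # Attempts exhausted (or a non-retryable shape): surface the error.
        error
    end
  end
end
```

A real implementation would additionally distinguish retryable errors (timeouts, 429/5xx) from permanent ones and add jitter to the delay.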
Defines a retry policy for a specific provider or operation.
Intelligent chunk batching for optimized streaming output.
Advanced flow control for streaming LLM responses.
Efficient circular buffer implementation for stream chunk management.
Telemetry instrumentation for ExLLM.
Helper module for adding telemetry instrumentation to ExLLM operations.
Telemetry metrics definitions for ExLLM (Stub).
OpenTelemetry integration for ExLLM (Stub).
Behaviour for LLM backend providers.
Anthropic Claude API adapter for ExLLM.
AWS Bedrock adapter for ExLLM. Supports multiple providers including Claude, Titan, Llama, Cohere, AI21, and Mistral through Bedrock.
Bumblebee adapter for on-device LLM inference.
Handles loading and caching of Bumblebee models for local inference.
Google Gemini API adapter for ExLLM.
OAuth2 authentication helper for Google Gemini APIs that require user authentication.
Base HTTP request functionality for Gemini API modules.
Google Gemini Context Caching API implementation.
Represents cached content that can be reused across requests.
Token usage information for cached content.
Chunk management for Gemini's Semantic Retrieval API.
Response from batch operations containing a list of chunks.
Extracted data that represents the Chunk content.
User provided metadata stored as key-value pairs.
Response from listing chunks with pagination support.
User provided string values assigned to a single metadata key.
Google Gemini Content Generation API implementation.
A generated candidate response.
Represents content with a role and parts.
Request structure for content generation.
Response from content generation.
Configuration for content generation.
Represents a content part which can be text, inline data, or function call/response.
Safety settings for content generation.
Tool definitions for function calling.
Configuration for tool usage.
Token usage information.
Google Gemini Corpus Management API implementation.
Filter condition for metadata values.
Information about a corpus.
Request structure for creating a corpus.
Request structure for listing corpora.
Response from listing corpora.
Filter for chunk and document metadata.
Request structure for querying a corpus.
Response from querying a corpus.
A chunk relevant to a query with its relevance score.
Request structure for updating a corpus.
Document management for Gemini's Semantic Retrieval API.
Filter condition applicable to a single key.
User provided metadata stored as key-value pairs.
Response from listing documents with pagination support.
User provided filter to limit retrieval based on Chunk or Document level metadata values.
Response from document query containing a list of relevant chunks.
The information for a chunk relevant to a query.
User provided string values assigned to a single metadata key.
Google Gemini Embeddings API implementation.
A list of floats representing an embedding.
Request containing the Content for the model to embed.
Google Gemini Files API implementation.
Represents an uploaded file in the Gemini API.
Error status information.
Metadata for video files.
Google Gemini Live API implementation using WebSockets.
Client content message.
Content structure for the Live API.
Function call details.
Generation configuration for the Live API.
Server disconnect notification.
Real-time input message.
Server content message.
Initial session setup message.
Tool call message from the server.
Tool response message.
Google Gemini Models API implementation.
Represents a Gemini model with its capabilities and metadata.
Google Gemini Permissions API implementation.
Response from listing permissions.
Permission resource that grants access to a tuned model or corpus.
Request to transfer ownership of a tuned model.
Google Gemini Question Answering API implementation.
Request structure for generating grounded answers.
Response structure for grounded answers.
A single passage included inline with a grounding configuration.
A list of passages provided inline with the request.
Feedback related to the input data used to answer the question.
Configuration for retrieving grounding content from a Corpus or Document created using the Semantic Retriever API.
Google Gemini Token Counting API implementation.
Request structure for counting tokens.
Response from the token counting API.
Token count breakdown by modality (TEXT, IMAGE, AUDIO, VIDEO).
Google Gemini Fine-tuning API implementation.
Dataset for training or validation.
Hyperparameters controlling the tuning process.
Response from listing tuned models.
A fine-tuned model created using the tuning API.
Tuned model as a source for training a new model.
A single example for tuning.
A set of tuning examples.
Record for a single tuning step.
Tuning task that creates tuned models.
Groq adapter for ExLLM.
LM Studio adapter for local LLM inference.
Mistral AI API adapter for ExLLM.
Mock adapter for testing ExLLM integrations.
Ollama API adapter for ExLLM - provides local model inference via Ollama server.
OpenAI GPT API adapter for ExLLM.
Base implementation for OpenAI-compatible API providers.
OpenRouter API adapter for ExLLM.
Perplexity AI API adapter for ExLLM.
Enhanced streaming coordinator with advanced flow control, intelligent batching, and sophisticated buffering strategies.
Shared HTTP client utilities for ExLLM adapters.
Shared message formatting utilities for ExLLM adapters.
Unified behaviour and utilities for fetching models from LLM provider APIs.
Unified request building for LLM providers.
Shared utilities for building standardized responses across adapters.
Shared streaming behaviour and utilities for ExLLM adapters.
Unified streaming coordinator for all LLM adapters.
Unified vision/multimodal content formatting for LLM providers.
Adapter for X.AI's Grok models.
Interceptor module for automatically caching provider responses.
Response caching system for collecting and storing real provider responses.
Test helper functions for managing and working with the automatic test cache.
Maintain index of timestamped cache entries with metadata.
Intelligent matching of requests to cached responses.
Track cache performance and cost savings with timestamp-based metrics.
Automatically intercept and cache responses during tests.
Shared type definitions used across ExLLM modules.
Represents an available embedding model.
Represents an embedding response from an LLM provider.
Standard response format from LLM providers with integrated cost calculation.
Represents an available LLM model.
Represents a conversation session with message history and metadata.
Represents a chunk from a streaming LLM response.
Mix Tasks
Mix tasks for managing the ExLLM test response cache.
Clean up old cache entries.
Clear the test cache.
Show test cache statistics.
Circuit breaker configuration management tasks.