ExLLM.Cache (ex_llm v0.5.0)
Unified caching system for ExLLM, providing both runtime performance caching and optional disk persistence for development and testing.
Features
- Fast ETS-based runtime caching with TTL
- Optional one-way disk persistence for test scenario collection
- Automatic cache expiration and cleanup
- Cache statistics and monitoring
- Mock adapter integration via persisted responses
Usage
# Runtime caching only (default)
{:ok, response} = ExLLM.chat(:anthropic, messages, cache: true)

# With custom TTL
{:ok, response} = ExLLM.chat(:anthropic, messages,
  cache: true,
  cache_ttl: :timer.minutes(30)
)

# Skip cache for this request
{:ok, response} = ExLLM.chat(:anthropic, messages, cache: false)
Disk Persistence
Enable disk persistence to automatically save cached responses for testing:
# Environment variable
export EX_LLM_CACHE_PERSIST=true
export EX_LLM_CACHE_DIR="/path/to/cache"  # Optional

# Or application config
config :ex_llm,
  cache_persist_disk: true,
  cache_disk_path: "/tmp/ex_llm_cache"
When enabled, all cached responses are also written to disk and can be replayed by the Mock adapter for realistic testing without live API calls.
Configuration
config :ex_llm, :cache,
  enabled: true,
  storage: {ExLLM.Cache.Storage.ETS, []},
  default_ttl: :timer.minutes(15),
  cleanup_interval: :timer.minutes(5),
  persist_disk: false,
  disk_path: "/tmp/ex_llm_cache"
Summary
Functions
Returns a specification to start this module under a supervisor.
Clear all cache entries.
Update disk persistence configuration at runtime.
Delete a specific cache entry.
Generate a cache key for a chat request.
Get a cached response if available and not expired.
Store a response in the cache with TTL.
Check if caching should be used for this request.
Get cache statistics.
Wrap a cache-aware function execution.
Functions
Returns a specification to start this module under a supervisor.
See Supervisor.
@spec clear() :: :ok
Clear all cache entries.
Update disk persistence configuration at runtime.
@spec delete(String.t()) :: :ok
Delete a specific cache entry.
Generate a cache key for a chat request.
The key is based on:
- Provider
- Model
- Messages content
- Relevant options (temperature, max_tokens, etc.)
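A key built from these inputs could be derived by hashing the canonical term. This is a minimal illustrative sketch, not the library's actual implementation; the option list and hash choice are assumptions:

```elixir
def generate_cache_key(provider, messages, options) do
  # Only options that affect the response participate in the key
  # (assumed list; the real set may differ).
  relevant_opts = Keyword.take(options, [:model, :temperature, :max_tokens])

  {provider, messages, relevant_opts}
  |> :erlang.term_to_binary()
  |> then(&:crypto.hash(:sha256, &1))
  |> Base.encode16(case: :lower)
end
```

Hashing a serialized tuple ensures that any change to the provider, model, message content, or relevant options produces a different key.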
Get a cached response if available and not expired.
Store a response in the cache with TTL.
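Together, get and put support a manual read-through pattern. A sketch of that flow, assuming `get/1` returns `{:ok, response}` or `:miss` and `put/3` accepts a `:ttl` option (both return shapes are assumptions, not confirmed signatures):

```elixir
key = ExLLM.Cache.generate_cache_key(:anthropic, messages, options)

case ExLLM.Cache.get(key) do
  {:ok, response} ->
    # Cache hit: reuse the stored response.
    response

  :miss ->
    # Cache miss: call the API and store the result with a TTL.
    {:ok, response} = ExLLM.chat(:anthropic, messages, options)
    ExLLM.Cache.put(key, response, ttl: :timer.minutes(15))
    response
end
```

In practice, `with_cache/3` (below) wraps this pattern so callers do not repeat it.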
Check if caching should be used for this request.
Returns false for:
- Streaming requests
- Requests with functions/tools
- Explicitly disabled caching
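The rules above amount to a small predicate over the request options. A hedged sketch of equivalent logic (option names such as `:stream`, `:functions`, and `:tools` are assumptions about ExLLM's option keys):

```elixir
def should_cache?(options) do
  cond do
    # Streaming responses arrive incrementally and are not cached.
    Keyword.get(options, :stream, false) -> false
    # Function/tool calls can have side effects, so they bypass the cache.
    Keyword.has_key?(options, :functions) or Keyword.has_key?(options, :tools) -> false
    # Caching must be opted into explicitly.
    Keyword.get(options, :cache, false) != true -> false
    true -> true
  end
end
```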
@spec stats() :: ExLLM.Cache.Stats.t()
Get cache statistics.
Wrap a cache-aware function execution.
This is the main integration point for ExLLM modules.
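As an illustration of that integration point, a caller might wrap its API call in a function that is only executed on a cache miss. The argument order and arity of `with_cache` here are assumptions, and `do_api_call/3` is a hypothetical helper:

```elixir
def chat_with_cache(provider, messages, options) do
  cache_key = ExLLM.Cache.generate_cache_key(provider, messages, options)

  ExLLM.Cache.with_cache(cache_key, options, fn ->
    # Runs only on a cache miss; the result is stored under cache_key.
    do_api_call(provider, messages, options)
  end)
end
```

This keeps the hit/miss branching and TTL handling inside the cache module rather than in each adapter.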