LLM is a lightweight Elixir client that provides a unified interface for interacting with Large Language Model APIs. It normalizes requests and responses across multiple providers, so your application code stays provider-agnostic.

Design philosophy

LLM follows three core principles:

  1. Normalized data model — Messages, tools, responses, and usage are represented by the same structs regardless of which provider you use. Switch from OpenAI to Anthropic by changing a single option.

  2. Adapter pattern — Each provider's wire format (JSON encoding, SSE streaming, auth headers) lives in a dedicated adapter module. The core library never references provider-specific JSON keys.

  3. Streaming first: generate/2 is built on top of stream/2. Every request goes through the streaming path, and the final response is assembled by collecting chunks. As a result, tool calls, thinking blocks, and error handling behave identically whether you stream or not.
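
The adapter and streaming-first principles can be illustrated with a small, self-contained sketch. It does not use the library's own code: the Sketch modules, the :alpha and :beta providers, their JSON shapes, and the callback signatures are all hypothetical and exist only to show the pattern. Raw events are canned lists standing in for a network response.

```elixir
defmodule Sketch.Adapter do
  # Hypothetical behaviour (not the real LLM.Adapter callbacks):
  # encode a prompt into a provider-specific body, decode raw events into text chunks.
  @callback build_request(prompt :: String.t(), opts :: keyword()) :: map()
  @callback decode_chunk(raw :: map(), state :: term()) :: {[String.t()], term()}
end

defmodule Sketch.Adapter.Alpha do
  @behaviour Sketch.Adapter

  @impl true
  def build_request(prompt, opts) do
    %{"model" => Keyword.get(opts, :model, "alpha-1"), "input" => prompt}
  end

  @impl true
  def decode_chunk(%{"delta" => text}, state), do: {[text], state}
  def decode_chunk(_other, state), do: {[], state}
end

defmodule Sketch.Adapter.Beta do
  @behaviour Sketch.Adapter

  @impl true
  def build_request(prompt, opts) do
    %{
      "model" => Keyword.get(opts, :model, "beta-1"),
      "messages" => [%{"role" => "user", "content" => prompt}]
    }
  end

  @impl true
  def decode_chunk(%{"content" => %{"text" => text}}, state), do: {[text], state}
  def decode_chunk(_other, state), do: {[], state}
end

defmodule Sketch do
  @adapters %{alpha: Sketch.Adapter.Alpha, beta: Sketch.Adapter.Beta}

  # Stand-in for the network: canned raw events in each provider's own shape.
  defp raw_events(:alpha), do: [%{"delta" => "Hel"}, %{"delta" => "lo"}, %{"done" => true}]
  defp raw_events(:beta), do: [%{"content" => %{"text" => "Hel"}}, %{"content" => %{"text" => "lo"}}]

  # Lazily decodes provider-specific events into normalized text chunks.
  def stream(prompt, opts) do
    provider = Keyword.fetch!(opts, :provider)
    adapter = Map.fetch!(@adapters, provider)
    # Built only to show where encoding happens; the canned events ignore it.
    _request = adapter.build_request(prompt, opts)

    provider
    |> raw_events()
    |> Stream.flat_map(fn raw ->
      {chunks, _state} = adapter.decode_chunk(raw, nil)
      chunks
    end)
  end

  # The non-streaming call is the same stream, collected into one value.
  def generate(prompt, opts), do: prompt |> stream(opts) |> Enum.join()
end
```

In this sketch, calling Sketch.generate("hi", provider: :alpha) and Sketch.generate("hi", provider: :beta) yields the same string even though the two adapters parse different event shapes, and only the adapter modules know any provider-specific keys.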

Architecture

                        
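                     +--------------+
                     | LLM.generate |
                     | LLM.stream   |
                     +------+-------+
                            |
                     +------v-------+
                     |  LLM.Stream  |  orchestration layer
                     +------+-------+
                            |
          +-----------------+------------------+
          |                 |                  |
  +-------v-------+  +------v-------+  +-------v--------+
  | LLM.Provider. |  | LLM.Adapter  |  | LLM.HTTPClient |
  | Resolver      |  | .OpenAI      |  +----------------+
  +---------------+  | .Anthropic   |
                     | .Gemini      |
                     | .OpenAIResp  |
                     +--------------+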

Request flow

  1. LLM.generate/2 or LLM.stream/2 builds an LLM.Context from the prompt and options.
  2. LLM.Provider.Resolver resolves the provider option into a full config map with adapter, base URL, and API key.
  3. LLM.Stream.start/2 calls the adapter's build_request/2 to encode the context into provider-specific JSON, then sends an HTTP POST via LLM.HTTPClient.
  4. LLM.Stream.next/1 receives SSE events from the HTTP response and delegates to the adapter's decode_chunk/2 to produce normalized chunk structs.
  5. LLM.Stream.collect/2 accumulates chunks, auto-executes tool calls, and loops until the stream ends or max_rounds is reached.
  6. The final LLM.Response contains a normalized LLM.Message, usage stats, and the stop reason.
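
Step 5 is the least obvious part of the flow, so here is a rough, self-contained sketch of a collect loop with automatic tool execution and a round limit. Everything in it is an assumption made for illustration: the FlowSketch module, the chunk tuples, the fake_stream/1 stand-in for a provider round trip, and the way the round limit is reported. The library's actual structs and max_rounds semantics may differ.

```elixir
defmodule FlowSketch do
  # Hypothetical chunk shapes: {:text, binary}, {:tool_call, name, args}, {:stop, reason}.

  # Stand-in for one provider round trip. The first turn requests a tool;
  # once a tool result is present in the history, the model answers in text.
  def fake_stream(messages) do
    if Enum.any?(messages, &match?({:tool_result, _, _}, &1)) do
      [{:text, "It is "}, {:text, "4."}, {:stop, :end_turn}]
    else
      [{:tool_call, "add", %{"a" => 2, "b" => 2}}, {:stop, :tool_use}]
    end
  end

  # Accumulates one turn's chunks, runs any requested tools, and loops
  # until the model stops asking for tools or max_rounds is reached.
  def collect(messages, tools, max_rounds, round \\ 1) do
    chunks = fake_stream(messages)
    text = for {:text, t} <- chunks, into: "", do: t
    calls = for {:tool_call, name, args} <- chunks, do: {name, args}
    {:stop, reason} = List.last(chunks)

    cond do
      calls == [] ->
        %{text: text, stop_reason: reason, rounds: round}

      round >= max_rounds ->
        %{text: text, stop_reason: :max_rounds, rounds: round}

      true ->
        results =
          for {name, args} <- calls do
            {:tool_result, name, Map.fetch!(tools, name).(args)}
          end

        collect(messages ++ results, tools, max_rounds, round + 1)
    end
  end
end
```

With a tools map such as %{"add" => fn %{"a" => a, "b" => b} -> a + b end}, the loop takes two rounds: one in which the tool is called and one in which the text answer is assembled. Passing a max_rounds of 1 stops after the first round without executing the tool.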

Key modules

Module          Role
--------------  ----------------------------------------------------
LLM             Public API: generate/2, stream/2, providers/0, models/1; accepts the structured_output: option for typed JSON responses
LLM.Context     Request context: system prompt, messages, tools
LLM.Message     Normalized message across all providers
LLM.Tool        Tool definition behaviour and inline creation
LLM.Response    Normalized response with message, usage, stop reason
LLM.Usage       Token usage information
LLM.Stream      Streaming orchestration, chunk collection, tool loop
LLM.Adapter     Behaviour for wire format translation
LLM.Provider    Provider configuration behaviour
LLM.HTTPClient  HTTP client behaviour (swappable for tests)
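
To show why a behaviour-based HTTP client is convenient in tests, here is a minimal sketch of the general technique. The ClientSketch modules and the post/2 callback are hypothetical and are not the real LLM.HTTPClient contract; the point is only that code depending on a behaviour can be handed a stub that returns canned SSE lines without touching the network.

```elixir
defmodule ClientSketch.HTTPClient do
  # Hypothetical callback; the real LLM.HTTPClient callbacks may look different.
  @callback post(url :: String.t(), body :: map()) ::
              {:ok, [String.t()]} | {:error, term()}
end

defmodule ClientSketch.StubClient do
  @behaviour ClientSketch.HTTPClient

  # Returns canned SSE lines instead of performing a request.
  @impl true
  def post(_url, _body), do: {:ok, ["data: one", "data: two"]}
end

defmodule ClientSketch do
  # The client module is passed in, so a test can inject the stub.
  # On {:error, reason} the with expression returns the error unchanged.
  def fetch_events(url, body, client) do
    with {:ok, lines} <- client.post(url, body) do
      {:ok, Enum.map(lines, &String.replace_prefix(&1, "data: ", ""))}
    end
  end
end
```

A test can then call ClientSketch.fetch_events("http://example.invalid", %{}, ClientSketch.StubClient) and assert on the decoded events directly.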

What's next