In a multi-tenant SaaS — every customer brings their own LLM API key — the engine must NOT hold a key. Engines round-trip through ETF and JSON, so a key on the engine becomes a key in your job queue, your session store, your audit log. ALLM's resolution chain pushes credentials to call time and lets you swap per request.
This guide covers ALLM.Keys's five-level resolution chain, the
per-call :api_key opt, app config, environment variables, custom
resolvers, and the BYOK pattern in practice.
Resolution order
When an adapter needs an API key, ALLM.Keys.get/2 walks five
sources in priority order. The first that returns a value wins:
- Per-call: ALLM.generate(engine, request, api_key: "sk-...")
- Engine :keys resolver: function or map on the engine
- ALLM.Keys.put/2 runtime store: global Agent (use sparingly)
- Application config: config :allm, :keys, [openai: "sk-..."]
- Environment variable: provider-specific default
If none match, the adapter raises ALLM.Error.AdapterError{reason: :authentication}.
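The first-match-wins walk can be pictured with a small standalone sketch. This is not ALLM.Keys's implementation: KeyChainSketch and the source functions are hypothetical stand-ins for the five levels, and treating nil or an empty string as "no value" is an assumption.

```elixir
defmodule KeyChainSketch do
  # Hypothetical illustration of a first-match-wins chain.
  # Each source is a zero-arity function returning a binary key or nil.
  def resolve(sources) do
    Enum.find_value(sources, :error, fn source ->
      case source.() do
        key when is_binary(key) and key != "" -> {:ok, key}
        _ -> nil
      end
    end)
  end
end

sources = [
  fn -> nil end,                   # 1. per-call opt (absent here)
  fn -> nil end,                   # 2. engine resolver (absent here)
  fn -> nil end,                   # 3. runtime store (empty here)
  fn -> "sk-from-app-config" end,  # 4. application config
  fn -> "sk-from-env" end          # 5. environment variable (never reached)
]

{:ok, "sk-from-app-config"} = KeyChainSketch.resolve(sources)
:error = KeyChainSketch.resolve([fn -> nil end])
```

Because the walk stops at the first hit, prepending a per-call source to the list shadows everything below it, which is the property BYOK relies on.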
Per-call (the BYOK primitive)
The highest-priority source is the per-call :api_key opt:
engine = ALLM.Engine.new(adapter: ALLM.Providers.OpenAI, model: "gpt-4.1-mini")
{:ok, response} = ALLM.generate(engine, request, api_key: tenant.openai_key)

The engine itself never sees the key. Cache the engine, share it across processes, persist it; the key flows in per request.
Available on every entry point: generate/3, stream_generate/3,
step/3, stream_step/3, chat/3, stream/3, Session.start/3,
Session.reply/4, Session.continue/3, generate_image/3,
edit_image/4, image_variations/3.
Engine resolver
For static deployments where one engine maps to one provider with one key, set the resolver at engine construction:
engine = ALLM.Engine.new(
  adapter: ALLM.Providers.OpenAI,
  model: "gpt-4.1-mini",
  keys: %{openai: System.fetch_env!("OPENAI_API_KEY")}
)

Or with a function (re-evaluated per call, which is useful for rotating credentials):
engine = ALLM.Engine.new(
  adapter: ALLM.Providers.OpenAI,
  model: "gpt-4.1-mini",
  keys: fn :openai -> MyApp.Vault.fetch!(:openai_key) end
)

The resolver receives the provider's key tag (:openai, :anthropic,
:gemini, or whatever a custom adapter declares) and must return a
binary key.
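To see why a function resolver suits rotation, here is a minimal standalone sketch. RotatingVault is a hypothetical Agent-backed stand-in for MyApp.Vault, and the resolver is invoked by hand rather than by ALLM.

```elixir
defmodule RotatingVault do
  # Hypothetical secret store; a real one would talk to Vault, a KMS, etc.
  use Agent

  def start_link(initial), do: Agent.start_link(fn -> initial end, name: __MODULE__)
  def fetch!(:openai_key), do: Agent.get(__MODULE__, & &1)
  def rotate(new_key), do: Agent.update(__MODULE__, fn _ -> new_key end)
end

{:ok, _pid} = RotatingVault.start_link("sk-old")

# Same shape as the :keys function above: provider tag in, binary key out.
resolver = fn :openai -> RotatingVault.fetch!(:openai_key) end

"sk-old" = resolver.(:openai)
RotatingVault.rotate("sk-new")
# The closure is unchanged, but the next invocation sees the rotated key.
"sk-new" = resolver.(:openai)
```

A map captures the key value when the engine is built, so rotating it means rebuilding the engine; a function captures only the lookup.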
Application config
Library-wide defaults belong in config/runtime.exs:
config :allm, :keys,
  openai: System.fetch_env!("OPENAI_API_KEY"),
  anthropic: System.fetch_env!("ANTHROPIC_API_KEY"),
  gemini: System.fetch_env!("GEMINI_API_KEY")

This is the shape you want for single-tenant apps where every call uses the same key. Multi-tenant apps should NOT use it; the per-call override is the right primitive there.
Environment variables
Each provider has a default env var:
- OpenAI → OPENAI_API_KEY
- Anthropic → ANTHROPIC_API_KEY
- Gemini → GEMINI_API_KEY
If nothing higher in the chain matches, ALLM.Keys reads the env var
at call time. That is adequate for scripts and one-shot tools, but insufficient for
production multi-tenant deployments.
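A call-time lookup of this kind might look like the sketch below. EnvKeySketch is a hypothetical module, not ALLM.Keys; the variable names come from the list above, and treating an empty value as missing is an assumption.

```elixir
defmodule EnvKeySketch do
  # Provider tag -> env var name, taken from the list above.
  @env_vars %{
    openai: "OPENAI_API_KEY",
    anthropic: "ANTHROPIC_API_KEY",
    gemini: "GEMINI_API_KEY"
  }

  # Reads the variable when called, so changes made after boot are visible.
  def fetch(provider) do
    with {:ok, name} <- Map.fetch(@env_vars, provider),
         key when is_binary(key) and key != "" <- System.get_env(name) do
      {:ok, key}
    else
      _ -> :error
    end
  end
end

System.put_env("GEMINI_API_KEY", "sk-demo")
{:ok, "sk-demo"} = EnvKeySketch.fetch(:gemini)

System.delete_env("GEMINI_API_KEY")
:error = EnvKeySketch.fetch(:gemini)
```

The environment is shared by the whole OS process, so there is exactly one value per provider. That is why this level cannot express per-tenant keys.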
Custom resolver behaviour
For non-trivial cases — Vault integration, dynamic key rotation,
per-tenant override on a shared engine — implement the
ALLM.Keys.Resolver behaviour:
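defmodule MyApp.LLMKeys do
  @behaviour ALLM.Keys.Resolver

  @impl true
  # Per-tenant OpenAI key, looked up from the calling process's tenant.
  def fetch(:openai, _opts) do
    case Process.get(:current_tenant) do
      nil -> :error
      tenant -> {:ok, MyApp.Vault.openai_key(tenant)}
    end
  end

  # Shared Anthropic key for all tenants.
  def fetch(:anthropic, _opts), do: {:ok, System.fetch_env!("ANTHROPIC_API_KEY")}

  # Any other provider falls through to the next chain link.
  def fetch(_provider, _opts), do: :error
end

Wire it on the engine: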
engine = ALLM.Engine.new(
  adapter: ALLM.Providers.OpenAI,
  model: "gpt-4.1-mini",
  keys: MyApp.LLMKeys
)

fetch/2 returns {:ok, binary} on hit or :error to fall through to
the next chain link.
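One thing to watch with the Process.get/1 approach: the process dictionary is local to each BEAM process and is not inherited by spawned tasks. The sketch below uses a hypothetical TenantKeys module with the same fetch/2 shape (without the behaviour declaration, so it runs standalone) and a hard-coded map in place of MyApp.Vault. Whether ALLM invokes the resolver in the caller's process is not stated here, so treat that as something to verify.

```elixir
defmodule TenantKeys do
  # Hypothetical stand-in mirroring the fetch/2 shape above.
  @keys %{"acme" => "sk-acme", "globex" => "sk-globex"}

  def fetch(:openai, _opts) do
    case Process.get(:current_tenant) do
      nil -> :error
      tenant -> Map.fetch(@keys, tenant)
    end
  end

  def fetch(_provider, _opts), do: :error
end

Process.put(:current_tenant, "acme")
{:ok, "sk-acme"} = TenantKeys.fetch(:openai, [])

# A Task is a separate process; :current_tenant is not set in its dictionary.
:error = Task.await(Task.async(fn -> TenantKeys.fetch(:openai, []) end))

# Passing the tenant explicitly into the task restores the lookup.
tenant = Process.get(:current_tenant)

{:ok, "sk-acme"} =
  Task.await(
    Task.async(fn ->
      Process.put(:current_tenant, tenant)
      TenantKeys.fetch(:openai, [])
    end)
  )
```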
The BYOK pattern in practice
A canonical multi-tenant SaaS using ALLM looks like this:
defmodule MyApp.Chat do
  @engine ALLM.Engine.new(
            adapter: ALLM.Providers.OpenAI,
            model: "gpt-4.1-mini"
          )

  def ask(tenant_id, message) do
    tenant = MyApp.Tenants.get!(tenant_id)
    ALLM.chat(@engine, [ALLM.user(message)], api_key: tenant.openai_key)
  end
end

The engine is module-level (built once, cached in BEAM memory). The key is supplied per call. Because the engine holds no credential, crash dumps and ETF dumps of the engine won't carry keys, and logging the engine won't accidentally print them.
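The engine is key-free in this pattern, but the tenant record still holds the key, so inspecting or logging the tenant could print it. One common Elixir guard is to derive Inspect with the key field excluded. TenantSketch below is a hypothetical struct shape, not part of ALLM or of MyApp as described here.

```elixir
defmodule TenantSketch do
  # Hypothetical tenant struct; :openai_key is left out of inspect/1 output.
  @derive {Inspect, except: [:openai_key]}
  defstruct [:id, :openai_key]
end

tenant = %TenantSketch{id: 42, openai_key: "sk-secret"}

# The field is still readable for the per-call opt...
"sk-secret" = tenant.openai_key

# ...but it no longer appears when the struct is inspected.
false = String.contains?(inspect(tenant), "sk-secret")
```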
What NOT to do
# DON'T put per-tenant keys on the engine.
engine = ALLM.Engine.new(
  adapter: ALLM.Providers.OpenAI,
  keys: %{openai: tenant.openai_key} # leaks into ETF, JSON, crash dumps
)

# DON'T use ALLM.Keys.put/2 for BYOK.
ALLM.Keys.put(:openai, tenant.openai_key)
# ^^ this is a globally-named Agent. Two concurrent requests for two
# different tenants race: request B reads request A's key.

ALLM.Keys.put/2 is for development and single-tenant scripts. For
multi-tenant production, ALWAYS use the per-call opt or a custom
resolver.
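The race is easy to reproduce with any globally named store. The sketch below uses a hypothetical GlobalKeys Agent as a stand-in (it is not ALLM.Keys itself) and interleaves two requests by hand so the outcome is deterministic.

```elixir
defmodule GlobalKeys do
  # Hypothetical global store: one slot per provider, shared by every process.
  use Agent

  def start_link(_), do: Agent.start_link(fn -> %{} end, name: __MODULE__)
  def put(provider, key), do: Agent.update(__MODULE__, &Map.put(&1, provider, key))
  def get(provider), do: Agent.get(__MODULE__, &Map.get(&1, provider))
end

{:ok, _pid} = GlobalKeys.start_link([])

# Request A stores its tenant's key, then is descheduled before calling the LLM.
GlobalKeys.put(:openai, "sk-tenant-a")

# Request B runs in between and overwrites the single global slot.
GlobalKeys.put(:openai, "sk-tenant-b")

# Request A resumes and reads the key that belongs to the other tenant.
"sk-tenant-b" = GlobalKeys.get(:openai)
```

With real concurrency the interleaving is nondeterministic, so the same bug can run in either direction and may only show up under load.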
Verifying keys aren't on engines
ALLM's tests verify this invariant — if you persist an engine, no key material appears in the binary. You can verify locally:
iex> engine = ALLM.Engine.new(
...>   adapter: ALLM.Providers.Fake,
...>   adapter_opts: [script: [{:text, "ok"}, {:finish, :stop}]]
...> )
iex> binary = :erlang.term_to_binary(engine)
iex> :binary.match(binary, "sk-")
:nomatch

(With Fake there's no key to leak. With a real provider, do the same check after constructing the engine; there should be no key material in the term.)
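For a check like this to mean anything, it has to fail when a key is present. The sketch below uses plain maps as hypothetical stand-ins for engines and a hypothetical LeakCheck helper. It searches the raw ETF bytes with :binary.match/2, because inspect/1 renders a non-printable binary as a list of byte values in which the key never appears as text.

```elixir
defmodule LeakCheck do
  # Returns true if `needle` occurs anywhere in the ETF encoding of `term`.
  def leaks?(term, needle) when is_binary(needle) do
    term
    |> :erlang.term_to_binary()
    |> :binary.match(needle)
    |> Kernel.!=(:nomatch)
  end
end

# Plain maps stand in for engines here (hypothetical shapes).
clean = %{adapter: :openai, model: "gpt-4.1-mini"}
dirty = Map.put(clean, :keys, %{openai: "sk-tenant-secret"})

false = LeakCheck.leaks?(clean, "sk-")
true = LeakCheck.leaks?(dirty, "sk-")

# Searching the inspected form misses the key even though it is present.
false = String.contains?(inspect(:erlang.term_to_binary(dirty)), "sk-")
```

Searching for a known test key rather than the generic "sk-" prefix makes the check stricter, since not every provider's keys share that prefix.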
Where to next
- getting_started.md: the quick install + first-call tour.
- errors_and_retries.md: the :authentication reason and recovery.
- examples/README.md § "SaaS bring-your-own-key (BYOK)": runnable pattern.
- ALLM.Keys module docs for the full API reference.