ALLM.Providers.Fake is the deterministic, scripted adapter that ships
with the library. It's the canonical test vehicle — fast (~50µs per
call), serializable, requires no network, and passes every conformance
suite that real provider adapters do.
This guide consolidates the script-entry vocabulary, the cursor model,
and the test-only :usage / :record opts. Reach for it whenever you
write a test against ALLM's orchestration layer.
When to reach for it
Use ALLM.Providers.Fake for every orchestration test:
- chat/3 / step/3 flows including tool execution.
- Streaming tests (stream/3, stream_step/3).
- Session state transitions (:idle → :awaiting_tools → :completed).
- Error-path tests (rate limits, content filters, mid-stream failures).
- Multi-turn loop bound tests (:max_turns, :halt_when, ask-user).
Use real-provider wire tests (@tag :wire, Bypass/Plug.Test) ONLY
when you're testing request/response byte-shape. For everything else,
the Fake is faster, deterministic, and decoupled from provider quirks.
Script-entry vocabulary
A script is a list of tagged tuples — each tuple describes one event the Fake will produce. Two disjoint vocabularies exist; the leading tag disambiguates.
Spec entries (user-facing)
| Tag | Shape | Emits |
|---|---|---|
| {:text, s} | binary | :text_delta (streaming) / accumulates text (non-streaming) |
| {:tool_call, kw} | keyword with :id, :name, :arguments | :tool_call_completed + sets finish_reason: :tool_calls |
| {:tool_call_delta, kw} | keyword with :id, :arguments_delta | :tool_call_delta |
| {:usage, map} | map of %Usage{} fields | sets response.usage (non-streaming) / metadata.usage on :message_completed (streaming) |
| {:raw_chunk, term} | opaque | :raw_chunk |
| {:finish, reason} | atom | terminal :message_completed |
| {:error, term} | atom (legal reason) or any term | :error event (mid-stream) |
| {:delay, ms} | non-neg int | Process.sleep(ms); no event |
| {:sleep, ms} | non-neg int | deprecated alias of :delay |
Conformance-harness entries
| Tag | Shape | Notes |
|---|---|---|
| {:ok, map} | a %Response{}-shaped map | one entry per call |
| {:error, reason, opts} | 3-tuple | hands off to AdapterError.new/2 |
| {:text_delta, s} | streaming-only | identical to {:text, s} |
| {:preflight_error, reason, opts} | streaming-only | synchronous {:error, _} from stream/2 |
| {:error_event, reason, opts} | streaming-only | mid-stream :error event |
| {:stream_error, reason, opts} | streaming-only | %StreamError{} mid-stream |
The full grammar lives in ALLM.Providers.Fake.Script's moduledoc.
Construction
    engine = ALLM.Engine.new(
      adapter: ALLM.Providers.Fake,
      adapter_opts: [
        script: [{:text, "ok"}, {:finish, :stop}]
      ]
    )

For multi-call tests, use :scripts (a list of per-call lists):

    adapter_opts: [
      scripts: [
        [{:tool_call, id: "c0", name: "echo", arguments: %{"x" => 1}}, {:finish, :tool_calls}],
        [{:text, "done"}, {:finish, :stop}]
      ]
    ]

Streaming uses :stream_script with the same shapes (it accepts either
a flat list for a single call or a list-of-lists for multi-call).
Cursor patterns
Multi-call scripts advance a per-process cursor on every call. By default
the cursor lives in the process dictionary keyed by :erlang.phash2(scripts)
— isolated per ExUnit test process (async: true), GC'd on pid-down,
zero-setup for the common case.
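
The default cursor model described above can be sketched in plain Elixir. This is a minimal illustration under stated assumptions, not the Fake's actual implementation: the CursorSketch module, its next_script/1 function, and the {:cursor_sketch, hash} key shape are all hypothetical. It shows only the mechanism, a call index stored in the process dictionary under a key derived from :erlang.phash2/1, and why a separate process starts from index 0 again.

```elixir
defmodule CursorSketch do
  # Hypothetical helper, not part of ALLM: returns the script for the current
  # call and advances a cursor kept in the calling process's dictionary.
  def next_script(scripts) when is_list(scripts) do
    # The key depends only on the content of `scripts`.
    key = {:cursor_sketch, :erlang.phash2(scripts)}
    index = Process.get(key, 0)
    Process.put(key, index + 1)
    Enum.at(scripts, index)
  end
end

scripts = [[{:text, "first"}], [{:text, "second"}]]

# Two calls in the same process advance the cursor.
first = CursorSketch.next_script(scripts)
second = CursorSketch.next_script(scripts)

# A new process has an empty dictionary, so its cursor starts at 0.
from_task = Task.await(Task.async(fn -> CursorSketch.next_script(scripts) end))

IO.inspect({first, second, from_task})
```

Because the key is computed from the list's content, any two lists that compare equal map to the same cursor within one process.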
Footgun: content-equal scripts collide
Two engines built with byte-identical :scripts values in the same
process share the cursor. Workaround:
    cursor = ALLM.Providers.Fake.start_script_cursor()

    engine1 = ALLM.Engine.new(
      adapter: ALLM.Providers.Fake,
      adapter_opts: [scripts: scripts, script_cursor: cursor]
    )

start_script_cursor/0 returns an Agent pid; cursor_index/1 reads it
so a test can assert how many calls have been consumed.
Cross-process cursor sharing
When a test dispatches the adapter call across processes
(Task.async/1), the explicit cursor is load-bearing — process-dict
isolation would otherwise reset the cursor for each Task.
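
To see why an explicit cursor survives process boundaries, here is a small standalone sketch. It assumes nothing about the Fake's internals: a plain Agent stands in for the explicit cursor, and the next_script function is hypothetical. Each call runs in its own Task, yet all of them advance the same index because the state lives in the Agent rather than in any caller's process dictionary.

```elixir
# Hypothetical stand-in for an explicit cursor: an Agent holding the call index.
{:ok, cursor} = Agent.start_link(fn -> 0 end)

scripts = [[{:text, "first"}], [{:text, "second"}], [{:text, "third"}]]

next_script = fn ->
  # get_and_update/2 returns the current index and stores index + 1 in one step.
  index = Agent.get_and_update(cursor, fn i -> {i, i + 1} end)
  Enum.at(scripts, index)
end

# Every call runs in a separate Task process, one after another.
results =
  for _ <- 1..3 do
    Task.await(Task.async(next_script))
  end

# Reading the Agent shows how many calls have been consumed.
consumed = Agent.get(cursor, & &1)

IO.inspect({results, consumed})
```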
The :usage opt (Phase 21.2)
adapter_opts[:usage] materializes a %ALLM.Usage{} on every response
without writing the usage entry per script:
    adapter_opts: [
      script: [{:text, "ok"}, {:finish, :stop}],
      usage: [input_tokens: 12, output_tokens: 4]
    ]

Accepts a pre-built %Usage{} or a keyword list (normalized via
Usage.new/1). The opt wins over any per-script {:usage, _} entry
for the same call.
On streaming, the Usage rides on the :message_completed payload's
metadata.usage key (additive payload-key extension — no new event
variant). ALLM.StreamCollector.apply_event/2 copies it onto
state.usage so non-streaming collection produces a
%Response{usage: _}.
A per-script {:usage, _} entry behaves the same on streaming: it
accumulates into metadata.usage rather than emitting a :raw_chunk.
Real adapters emitting {:raw_chunk, {:usage, _}} keep their existing
path; the change is scoped to Fake's {:usage, _} entry.
The :record opt (Phase 21.2)
adapter_opts[:record] accepts a pid that receives
{:allm_fake_record, %Request{}, opts} verbatim BEFORE the script
interpretation runs. The recording fires once per call — both
generate/2 and stream/2 send before opening the stream.
    test "tool call sends the right schema" do
      me = self()

      engine = ALLM.Engine.new(
        adapter: ALLM.Providers.Fake,
        adapter_opts: [
          script: [{:text, "ok"}, {:finish, :stop}],
          record: me
        ],
        tools: [my_tool]
      )

      {:ok, _} = ALLM.chat(engine, [ALLM.user("trigger")])

      assert_receive {:allm_fake_record, %ALLM.Request{tools: [tool]}, _opts}
      assert tool.schema["properties"]["city"]["type"] == "string"
    end

opts are forwarded verbatim, with no key scrubbing. The caller owns the
opts they passed in; redact via Keyword.delete/2 before asserting if
needed. A dead recording pid raises ArgumentError, because a dead pid is a
test bug.
Cleanup observation
For streaming tests asserting that Stream.resource/3's after_fun
runs:
    ref = :counters.new(1, [:atomics])

    {:ok, stream} =
      ALLM.Providers.Fake.stream(req,
        adapter_opts: [script: [...], cleanup_observer: ref]
      )

    _ = Enum.take(stream, 2)
    assert :counters.get(ref, 1) == 1

The counter increments at most once per stream (on consumer halt,
reducer throws, or Stream.run/1 scope exit). Brutal Process.exit(pid, :kill)
skips cleanup per OTP design, so don't simulate :kill in tests.
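
The underlying mechanism can be reproduced without the Fake. The sketch below uses only Stream.resource/3 and :counters from the standard library; how the Fake wires its cleanup_observer internally is not shown here and is not assumed. It demonstrates that halting a stream early with Enum.take/2 still runs the after_fun exactly once.

```elixir
counter = :counters.new(1, [:atomics])

stream =
  Stream.resource(
    # start_fun: the initial accumulator
    fn -> 0 end,
    # next_fun: emit one integer per step, without ever finishing on its own
    fn n -> {[n], n + 1} end,
    # after_fun: record that cleanup ran by bumping slot 1 of the counter
    fn _n -> :counters.add(counter, 1, 1) end
  )

# Taking two elements halts the stream, which triggers after_fun.
taken = Enum.take(stream, 2)

IO.inspect({taken, :counters.get(counter, 1)})
```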
Retry simulation
adapter_opts[:retry_until_call] makes the first n - 1 calls fail
transiently (with :timeout) and the n-th call succeed:
    adapter_opts: [
      script: [{:text, "ok"}, {:finish, :stop}],
      retry_until_call: 3
    ]

generate/2 retries automatically under the default policy. stream/2
emits the transient failure as a mid-stream {:error, _} event so the
consumer reduces to %Response{finish_reason: :error} per the
mid-stream error contract (ALLM.Runner / chat/3 do not retry the
streaming arm; see spec §6.1).
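
The "fail n - 1 times, then succeed" behavior on the non-streaming path can be pictured with a toy retry loop. Everything in this sketch is hypothetical: RetrySketch, with_retries/3, and the flaky function are illustrative names, and the loop is not ALLM's retry policy. It only shows how a caller that retries on :timeout reaches the successful n-th call, and what happens when its attempt budget is smaller than n.

```elixir
defmodule RetrySketch do
  # Hypothetical retry loop: call_fun receives the 1-based attempt number.
  def with_retries(call_fun, max_attempts, attempt \\ 1) do
    case call_fun.(attempt) do
      {:ok, _} = ok ->
        ok

      # Retry only transient timeouts, and only while attempts remain.
      {:error, :timeout} when attempt < max_attempts ->
        with_retries(call_fun, max_attempts, attempt + 1)

      {:error, _} = error ->
        error
    end
  end
end

retry_until_call = 3

# Fails with :timeout on calls 1 and 2, succeeds on call 3.
flaky = fn attempt ->
  if attempt < retry_until_call, do: {:error, :timeout}, else: {:ok, attempt}
end

enough_budget = RetrySketch.with_retries(flaky, 5)
too_small_budget = RetrySketch.with_retries(flaky, 2)

IO.inspect({enough_budget, too_small_budget})
```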
Cross-process engine injection
When a test fans work out across Task.async/1 and you want the
workers to see the test's engine, use ALLM.Sandbox.set_engine/1:
    test "fan-out workers use the test engine" do
      ALLM.Sandbox.set_engine(fake_engine())

      results =
        ["a", "b", "c"]
        |> Task.async_stream(fn input ->
          ALLM.generate(ALLM.Sandbox.get_engine(), ALLM.request([ALLM.user(input)]))
        end)
        |> Enum.map(fn {:ok, r} -> r end)

      assert length(results) == 3
    end

Sandbox.get_engine/0 walks $callers so worker processes inherit the
registering ancestor's engine, the same idiom as Mox.allow/3 and
Ecto.Adapters.SQL.Sandbox.allow/3.
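
The $callers lookup can be illustrated with a standalone sketch. CallersSketch and its functions are hypothetical and do not reflect how ALLM.Sandbox is implemented; the only library behavior relied on is that Task stores the chain of calling pids under the :"$callers" key of each spawned process's dictionary. A worker that has registered nothing itself finds the value registered by the process that started it.

```elixir
defmodule CallersSketch do
  # Hypothetical registry, not ALLM.Sandbox itself: maps an owner pid to a value.
  def start, do: Agent.start_link(fn -> %{} end, name: __MODULE__)

  # Register a value under the calling process's pid.
  def set(value), do: Agent.update(__MODULE__, &Map.put(&1, self(), value))

  # Return the value registered by this process, or by the nearest ancestor
  # listed under :"$callers", which Task sets in the processes it spawns.
  def get do
    registry = Agent.get(__MODULE__, & &1)
    candidates = [self() | Process.get(:"$callers", [])]
    Enum.find_value(candidates, fn pid -> Map.get(registry, pid) end)
  end
end

{:ok, _} = CallersSketch.start()
CallersSketch.set(:test_engine)

# Workers never call set/1, yet each one resolves the parent's value.
seen =
  [1, 2, 3]
  |> Task.async_stream(fn _ -> CallersSketch.get() end)
  |> Enum.map(fn {:ok, value} -> value end)

IO.inspect(seen)
```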
Where to next
- streaming.md: the event-shape vocabulary the scripts emit.
- tools.md: tool-loop tests against scripted tool calls.
- sessions.md: multi-turn persistence tests.
- ALLM.Providers.Fake and ALLM.Providers.Fake.Script moduledocs: reference-level documentation of every entry tag.