Cantrip is an Elixir/OTP runtime for language-model entities acting through mediums, gates, wards, and looms. It is the canonical package implementation of the Cantrip spellbook lineage: the original ghost-library vocabulary is preserved, while the runtime surface is ordinary Elixir.
Core Shape
A cantrip is a reusable value. It combines:
- an LLM behaviour implementation and provider state
- an identity with system prompt and model-facing options
- a circle describing medium, gates, and wards
- optional loom storage, retry, and folding configuration
Casting a cantrip starts a one-shot entity. Summoning a cantrip starts a supervised entity process that can receive multiple intents. The entity is what emerges from the loop; the cantrip is the configuration that produces it.
The circle is the runtime contract:
A = M ∪ G − W
The medium determines the shape of thought. Gates expose host capabilities. Wards bound runtime behavior. The loom is the durable tree left behind by the entity's turns. The Familiar's default code medium runs trusted Elixir in the host BEAM for operator-local coding work, while plain code-medium circles without a sandbox ward default to the port boundary.
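One way to read the circle formula is as plain set arithmetic. The sketch below is only an illustration under that assumption: the module name `CircleSketch` and the capability atoms are hypothetical, and Cantrip's real circle is not necessarily represented as sets.

```elixir
defmodule CircleSketch do
  # Hypothetical illustration: models A = (M ∪ G) − W with MapSet.
  # Medium affordances and gates are unioned, then wards subtract from them.
  def affordances(medium, gates, wards) do
    medium
    |> MapSet.new()
    |> MapSet.union(MapSet.new(gates))
    |> MapSet.difference(MapSet.new(wards))
  end
end

CircleSketch.affordances([:eval], [:read_file, :done, :compile_and_load], [:compile_and_load])
|> MapSet.to_list()
|> Enum.sort()
|> IO.inspect()
```

Here the warded `:compile_and_load` capability drops out of the result while the medium's `:eval` and the remaining gates stay.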
Runtime Loop
Cantrip.cast/3 starts an internal supervised entity server for one episode.
Cantrip.summon/1 starts a persistent entity; Cantrip.summon/2 starts one
and immediately runs its first intent. Cantrip.send/3 continues it.
Each turn:
- folds prompt context if configured
- presents the selected medium to the LLM
- invokes the provider through the internal provider-call boundary
- classifies the response into the selected medium's input shape
- executes through the medium
- appends the utterance and observations to the loom
- either terminates, truncates, or continues
Errors that belong to the entity's operating environment are observations. They are returned to the loop as data instead of crashing the process.
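The per-turn steps and the errors-as-observations rule can be sketched as a small recursive loop. This is a simplified model, not the entity server: `TurnLoopSketch`, the `llm` and `medium` functions, and the `{:done, answer}` tuple are all hypothetical stand-ins, and folding, classification, and telemetry are omitted.

```elixir
defmodule TurnLoopSketch do
  # Hypothetical sketch. `llm` receives the loom so far and returns an
  # utterance; `medium` executes an utterance and returns an observation.
  def run(llm, medium, max_turns) do
    loop(llm, medium, [], 1, max_turns)
  end

  # Turn budget exhausted: the episode is truncated, not crashed.
  defp loop(_llm, _medium, loom, turn, max_turns) when turn > max_turns do
    {:truncated, Enum.reverse(loom)}
  end

  defp loop(llm, medium, loom, turn, max_turns) do
    utterance = llm.(Enum.reverse(loom))

    # A failure in the operating environment becomes data for the next turn.
    observation =
      try do
        medium.(utterance)
      rescue
        e -> {:error, Exception.message(e)}
      end

    # Append the utterance and its observation to the (in-memory) loom.
    loom = [{turn, utterance, observation} | loom]

    case observation do
      {:done, answer} -> {:terminated, answer, Enum.reverse(loom)}
      _other -> loop(llm, medium, loom, turn + 1, max_turns)
    end
  end
end
```

With a medium that raises on the first utterance and an `llm` that answers differently once the loom is non-empty, the run still terminates normally and the loom keeps the error as the first turn's observation.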
Mediums
The conversation medium projects gates as provider tool definitions.
The code medium evaluates Elixir with persistent bindings. Plain code-medium
circles default to Dune-restricted Elixir in a child BEAM process, equivalent
to sandbox: :port. Add %{port_runner: [...]} to put that child under
deployment-level OS/container controls. sandbox: :port_unrestricted keeps
the child process but evaluates raw Elixir there. sandbox: :dune routes
through the in-process Dune evaluator — a deliberately smaller-surface variant
of the code medium (see docs/port-isolated-runtime.md "Dune Variant");
entity prompts need to fit that surface. sandbox: :unrestricted is the
trusted host-BEAM evaluator, and it is the Familiar default.
The bash medium executes one shell command per turn inside an OS
sandbox. Shell process state does not persist; filesystem effects persist only for
paths admitted by %{bash_writable_paths: [...]}. The medium fails closed when
no sandbox adapter is available (bubblewrap on Linux, sandbox-exec on
macOS, or an explicit deployment adapter later).
The Bash adapter contract is empirical, not aspirational: CI exercises a
representative local shell workload suite under the available OS sandbox. The
suite covers git, make, jq, /dev/null redirects, and common
find/sed/grep pipelines. The workload suite opts into
%{bash_network: :on} because GitHub-hosted Linux runners can install
bubblewrap but cannot reliably create the network namespace bubblewrap uses
for default network denial. Separate tests pin the default network-deny command
shape (--unshare-net) so adapter regressions still fail locally and in
capable CI. New shell workload expectations should land as tests first so
sandbox configuration gaps surface in CI instead of in user sessions.
Bash gates are projected as commands in a per-turn directory placed at the
front of PATH. A circle with read_file can run read_file README.md; a
circle with mix can run mix test test/foo_test.exs. The shell command is
not the gate authority: wrappers call back to the parent BEAM, where the
ordinary gate executor applies dependencies, wards, telemetry, and redaction.
The done gate is exposed as cantrip_done because done is a shell keyword.
SUBMIT: output remains supported for shell-only answers.
The wrapper protocol is filesystem-based by design: a wrapper writes a per-call request directory, the parent runtime polls for ready calls, and the wrapper replays the host response to stdout/stderr. This keeps the protocol portable across Seatbelt and bubblewrap without socket mount policy, at the cost of a small polling latency floor. It is tuned for LLM-rate gate calls, not high-frequency shell RPC.
Gate command names live at the front of PATH. If a gate name collides with a
shell builtin or common command (test, time, read, etc.), the gate command
wins when invoked as an external command; use a non-colliding gate name when the
shell builtin must remain ergonomic.
medium_opts: %{sandbox: :passthrough} exists only for tests. It is rejected
outside Mix.env() == :test and is not a deployment fallback.
Bash-specific wards:
- %{bash_writable_paths: [path, ...]} allows writes under those paths.
- %{bash_network: :on} enables network for adapters that support it; default is network off.
- %{bash_timeout_ms: ms} overrides the per-command timeout.
- %{bash_max_output_bytes: n} bounds the shell observation output.
ACP stdio embedding must start the :cantrip application before sessions
create event bridges. Cantrip.ACP.Server.run/1 does this for the packaged
entrypoint; custom embedders should either call Application.ensure_all_started(:cantrip)
or supervise Cantrip.ACP.EventBridgeSupervisor themselves.
ACP request metadata is also the production trace-correlation boundary. The
handler accepts _meta.trace_id or _meta.cantrip_trace_id on session/new
and session/prompt; the Familiar runtime carries that value into
Cantrip.summon/3 / Cantrip.send/3 so telemetry emitted by the entity can be
joined to an external request, job, or editor operation. Without that metadata,
the entity mints its own trace ID. _meta is not a Familiar configuration
channel: LLM selection, loom paths, turn budgets, and other runtime controls
come from server/runtime configuration, not from editor-supplied request
metadata.
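The trace-ID fallback described above might look roughly like the following. This is a sketch under several assumptions: `TraceIdSketch` is a hypothetical helper, request params are assumed to be string-keyed decoded JSON, the precedence of `trace_id` over `cantrip_trace_id` is a guess, and the minted ID format is invented for illustration.

```elixir
defmodule TraceIdSketch do
  # Hypothetical helper: picks a caller-supplied trace ID from _meta,
  # or mints a local one when the request carries none.
  def resolve(params) do
    meta = Map.get(params, "_meta") || %{}

    # Assumption: trace_id wins when both keys are present.
    meta["trace_id"] || meta["cantrip_trace_id"] || mint()
  end

  # Assumption: any unique string is acceptable as a minted trace ID.
  defp mint do
    "local-" <> Integer.to_string(:erlang.unique_integer([:positive]), 16)
  end
end
```

Nothing else is read from `_meta` in this sketch, which mirrors the rule that request metadata carries correlation data and not runtime configuration.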
Composition
Composition uses the public package API, not special delegation gates.
Code-medium entities call Cantrip.new/1, Cantrip.cast/3, and
Cantrip.cast_batch/2 directly. Parent context supplies inherited child LLM,
wards, root dependencies, cancellation, streaming, and loom grafting.
Child casts are not an escape hatch around the circle: a parent checks its
max_depth before any pre-built child starts, and the child runs under
WardPolicy.compose(parent.circle.wards, child.circle.wards). Numeric wards
tighten with min, boolean wards such as require_done_tool tighten with
or, and cast_batch uses the same path for each child while respecting the
parent's max_concurrent_children.
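The tightening rule can be illustrated with a plain map merge. The module below is a hypothetical stand-in and not the real `WardPolicy.compose/2`: it assumes wards are a single flat map, and the fallback for non-numeric, non-boolean wards is an assumption.

```elixir
defmodule WardComposeSketch do
  # Hypothetical model of the composition rule: numeric wards take the
  # smaller value, boolean wards are OR-ed, so a child can only tighten.
  def compose(parent, child) do
    Map.merge(parent, child, fn
      _key, p, c when is_boolean(p) and is_boolean(c) -> p or c
      _key, p, c when is_number(p) and is_number(c) -> min(p, c)
      # Assumption: for any other ward type the parent's value wins.
      _key, p, _c -> p
    end)
  end
end

parent = %{max_turns: 10, require_done_tool: false}
child = %{max_turns: 25, require_done_tool: true, max_depth: 2}

IO.inspect(WardComposeSketch.compose(parent, child))
```

The child asked for 25 turns but runs under the parent's 10, while its stricter `require_done_tool: true` is kept.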
Parents can also declare constraints on what kinds of children may be spawned. These declaration-time child wards are checked before runtime ward composition:
- %{child_medium_allowlist: [:conversation, :code]}
- %{child_gate_allowlist: [:done, :read_file]}
- %{child_gate_denylist: [:compile_and_load]}
- %{child_max_turns_ceiling: n}
- %{child_max_depth_ceiling: n}
- %{max_children_total: n}
The allow/deny wards constrain the child circle shape. Ceiling wards require
the child to declare the corresponding runtime ward at or below the ceiling;
they do not silently rewrite the child. max_children_total counts accepted
child casts cumulatively across a code-medium entity's state. Rejected child
construction returns {:error, reason}. Rejected child casts produce an error
observation on the parent loom and emit [:cantrip, :ward, :child_rejected].
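A declaration-time check of this kind could be sketched as follows. Everything here is illustrative: `ChildWardSketch` is hypothetical, the child is modeled as a plain map with `:medium`, `:gates`, and `:max_turns` keys, and the error tuples are invented. Only the ward names come from the list above.

```elixir
defmodule ChildWardSketch do
  # Hypothetical validator: rejects a child whose shape violates the
  # parent's child wards, and never rewrites the child to make it fit.
  def check(wards, child) do
    with :ok <- medium_allowed(wards, child),
         :ok <- gates_allowed(wards, child),
         :ok <- turns_under_ceiling(wards, child) do
      {:ok, child}
    end
  end

  defp medium_allowed(%{child_medium_allowlist: allowed}, %{medium: medium}) do
    if medium in allowed, do: :ok, else: {:error, {:medium_not_allowed, medium}}
  end

  defp medium_allowed(_wards, _child), do: :ok

  defp gates_allowed(wards, %{gates: gates}) do
    allow = Map.get(wards, :child_gate_allowlist)
    deny = Map.get(wards, :child_gate_denylist, [])

    # A gate is rejected if it is denied, or if an allowlist exists and omits it.
    bad = Enum.filter(gates, fn g -> g in deny or (allow != nil and g not in allow) end)
    if bad == [], do: :ok, else: {:error, {:gates_not_allowed, bad}}
  end

  # The child must itself declare max_turns at or below the ceiling.
  defp turns_under_ceiling(%{child_max_turns_ceiling: ceiling}, child) do
    case Map.get(child, :max_turns) do
      n when is_integer(n) and n <= ceiling -> :ok
      other -> {:error, {:max_turns_over_ceiling, other}}
    end
  end

  defp turns_under_ceiling(_wards, _child), do: :ok
end
```

Note that a child with no declared `:max_turns` is rejected under a ceiling ward in this sketch, matching the rule that ceilings require a declaration and do not silently supply one.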
This is the RLM pattern in package form: large context lives in the medium, subtasks run as child cantrips, and summaries return upward. Composition is code, not a static workflow graph.
Streaming
Streaming events are delivered as {:cantrip_event, event} messages to the
configured :stream_to process. Consumers that opt into :stream_barrier?
apply backpressure at the event boundary: after each event, the runtime sends
a barrier message and waits until the consumer acknowledges it. cast_stream/2
uses that path by default, and its stream resource acknowledges barriers as it
drains events, so a caller that has not started consuming cannot accumulate an
unbounded mailbox. ACP familiar sessions also use stream barriers so slow ACP
notification delivery slows the entity run instead of allowing bridge mailbox
growth.
Plain stream_to: pid without :stream_barrier? remains fire-and-forget for
compatibility. Use it only when the receiver is known to drain at producer
rate; otherwise its mailbox can grow without bound. Pass
stream_barrier?: true with a receiver that understands
{:cantrip_barrier, from, ref} and replies with {:cantrip_barriered, ref}.
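A receiver that honors the barrier handshake could look like the sketch below. Only the three message shapes come from the text above; `BarrierConsumerSketch`, the `{:stop, reply_to}` message, the stand-in producer, and the assumption that `from` is a pid are all hypothetical.

```elixir
defmodule BarrierConsumerSketch do
  # Consumer side: collect events and acknowledge every barrier.
  # The {:stop, reply_to} message is a hypothetical way to end the demo.
  def loop(acc \\ []) do
    receive do
      {:cantrip_event, event} ->
        loop([event | acc])

      {:cantrip_barrier, from, ref} ->
        # Assumption: `from` is the pid waiting for the acknowledgement.
        send(from, {:cantrip_barriered, ref})
        loop(acc)

      {:stop, reply_to} ->
        send(reply_to, {:events, Enum.reverse(acc)})
    end
  end

  # Stand-in producer: sends one event, then blocks until it is acknowledged,
  # so at most one unacknowledged event sits in the consumer's mailbox.
  def produce(consumer, events) do
    Enum.each(events, fn event ->
      ref = make_ref()
      send(consumer, {:cantrip_event, event})
      send(consumer, {:cantrip_barrier, self(), ref})

      receive do
        {:cantrip_barriered, ^ref} -> :ok
      after
        1_000 -> exit(:barrier_timeout)
      end
    end)
  end
end
```

Because the producer waits on each acknowledgement, a slow consumer slows the producer down instead of letting its mailbox grow.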
Loom
The loom is the durable artifact of the loop. It records intents, turns, utterances, observations, child turns, metadata, and fork lineage.
Backends:
- memory for ephemeral tests and scratch sessions
- JSONL for portable traces. The backend serializes appends through an in-BEAM per-path lock, but it is still a single-writer file format across OS processes. Use one writer per file; use Mnesia when multiple nodes need shared durable state.
- Mnesia for BEAM-native durable workspace state
Folding is a view over prompt context. When the message history grows past
a configured threshold, older turns are summarized into a compact [Folded: turns N..M] marker in the LLM's input. The original turns remain in the
loom unchanged — folding shrinks what the model sees on the next call, not
what was recorded. Configure with the :folding option on Cantrip.new/1.
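The idea that folding is a projection and not a mutation can be shown with a pure function over a turn list. This is a minimal sketch: `FoldingSketch`, the `threshold` and `keep` parameters, and the `%{n: ...}` turn shape are hypothetical, and the real `:folding` option may be configured quite differently. The marker here also carries no summary text.

```elixir
defmodule FoldingSketch do
  # Hypothetical view function: returns what the model would see.
  # The input list (the recorded turns) is never modified.
  def view(turns, threshold, keep)
      when keep >= 0 and keep <= threshold and length(turns) > threshold do
    {older, recent} = Enum.split(turns, length(turns) - keep)
    first = older |> List.first() |> Map.fetch!(:n)
    last = older |> List.last() |> Map.fetch!(:n)
    [%{folded: "[Folded: turns #{first}..#{last}]"} | recent]
  end

  # At or below the threshold the view is the full history.
  def view(turns, _threshold, _keep), do: turns
end

turns = for n <- 1..6, do: %{n: n, text: "turn #{n}"}

IO.inspect(FoldingSketch.view(turns, 4, 2))
IO.inspect(length(turns))
```

The second `IO.inspect/1` still reports all six recorded turns, since only the prompt-side view was shortened.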
Code-medium code_state is kept full in memory so fork/replay can restore the
latest sandbox bindings cheaply. Durable storage writes binding-level deltas
after the first snapshot: unchanged bindings are referenced by key order, while
new or changed bindings are written once in the turn that changed them. JSONL
and Mnesia loaders expand those deltas back into full code_state maps before
returning loom.turns, so callers keep the same in-memory API without paying
O(turns x cumulative_binding_size) storage growth.
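The storage scheme can be approximated with a small encode/decode pair. This is a sketch of the general delta technique, not the on-disk format: `BindingDeltaSketch` and the `%{keys: ..., changed: ...}` record shape are hypothetical, and the real loaders may reference unchanged bindings differently.

```elixir
defmodule BindingDeltaSketch do
  # Hypothetical encoding: each stored turn lists the keys bound in that
  # turn plus only the bindings that are new or changed since the last turn.
  # The first turn has no predecessor, so it stores a full snapshot.
  def encode(states) do
    {stored, _prev} =
      Enum.map_reduce(states, %{}, fn state, prev ->
        changed =
          for {k, v} <- state, Map.fetch(prev, k) != {:ok, v}, into: %{}, do: {k, v}

        {%{keys: Map.keys(state), changed: changed}, state}
      end)

    stored
  end

  # Replays the deltas in order to rebuild every full code_state map.
  def decode(stored) do
    {states, _prev} =
      Enum.map_reduce(stored, %{}, fn %{keys: keys, changed: changed}, prev ->
        state = prev |> Map.merge(changed) |> Map.take(keys)
        {state, state}
      end)

    states
  end
end
```

A large binding that never changes is written once in this scheme, and later turns only mention its key.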
Safety Posture
The controls are explicit and scoped:
- gate root validation constrains filesystem gates
- redaction scrubs observations before they reach the entity
- diagnostic redaction protects protocol/debug output
- loop wards bound turns, depth, timeouts, and selected policies
- Dune-in-port evaluation denies ambient filesystem/system/process authority and keeps LLM-written Elixir out of the host BEAM
- child-BEAM telemetry events are forwarded over the port protocol and re-emitted by the parent with the same trace context
- port_runner lets deployments put the child process inside an OS/container sandbox
- optional Dune routes code evaluation through an in-VM restricted evaluator
- compile/load wards scope hot-loaded modules (exact allow_compile_modules list), paths, hashes, and signers; framework modules under Elixir.Cantrip.* (except Elixir.Cantrip.Hot.*) are rejected even when explicitly allowlisted
The default port sandbox protects the host BEAM and denies ambient language capabilities. Deployment-level OS controls remain useful defense in depth for mounts, network, CPU, memory, and user isolation.
Struct conventions for credential-bearing data
Any struct that holds credential-shaped fields — API keys, bearer tokens,
authorization headers, signed cookies — must declare @derive {Inspect, only: [<non-secret-fields>]} (or @derive {Inspect, except: [<secret-fields>]}).
This prevents accidental leak via default inspect/1 in IEx sessions, error
output, logger calls, or debug dumps. The safe formatting helpers cover the
runtime boundary error surfaces; the @derive Inspect convention covers the
construction-and-debug surface.
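A minimal sketch of the convention follows. The struct name and fields are hypothetical; the `@derive {Inspect, only: [...]}` form is the standard Elixir mechanism and must appear before `defstruct`.

```elixir
defmodule ProviderCredentialSketch do
  # Hypothetical credential-bearing struct. Only :provider and :model are
  # shown by inspect/1; :api_key is omitted from the rendered output.
  @derive {Inspect, only: [:provider, :model]}
  defstruct [:provider, :model, :api_key]
end

cred =
  struct!(ProviderCredentialSketch,
    provider: :example,
    model: "m-1",
    api_key: "sk-secret"
  )

# The secret stays readable through normal field access...
IO.puts(cred.api_key)
# ...but does not appear when the struct is inspected or logged.
IO.puts(inspect(cred))
```

The `except:` form works the same way and may be easier to maintain when a struct has many non-secret fields and one or two secret ones.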
Current durable structs do not hold credentials directly — :llm_state on the
top-level %Cantrip{} is a plain map carrying provider state including
:api_key, and downstream code is expected to either redact at the boundary
via the safe formatting helpers or to not log raw :llm_state. Future structs that
directly hold credentials must adopt the convention above.
Process Inventory
This table lists every process kind Cantrip starts, with its owner, restart strategy, and shutdown semantics. Reference this section when adding a new process.
| Process kind | Started by | Owner | Crash-restart | Shutdown |
|---|---|---|---|---|
| Internal entity server (GenServer) | Cantrip.cast/3, Cantrip.summon/1 via DynamicSupervisor.start_child | entity dynamic supervisor | :temporary (no auto-restart; caller gets error) | default GenServer 5s; terminate/2 sends :stop to runner |
| Per-entity runner Task | entity server runner (lib/cantrip/entity_server.ex) | registered Task.Supervisor named :Cantrip.EntityTaskSupervisor | :temporary (Task.Supervisor default) | :brutal_kill 5s on app shutdown; in-progress episodes interrupted |
| Code-medium child BEAM | port sandbox launcher (lib/cantrip/medium/code/port.ex) | not supervised; linked to eval context | N/A (process-level) | on eval timeout or parent crash: implicit exit via port boundary |
| Port-child protocol loop | spawn_link in port_child.ex:140 | linked to parent (child-side bootstrap) | N/A (linked) | parent exit propagates crash via link |
| ACP EventBridge loop | Task.Supervisor.start_child/2 in acp/event_bridge.ex | registered Task.Supervisor named :Cantrip.ACP.EventBridgeSupervisor | :temporary (Task.Supervisor default) | :DOWN from monitored owner OR explicit :stop message |
| Cantrip.cast_stream/2 task | Task.async (lib/cantrip.ex:696) | linked to caller; caller drains via Stream | N/A (linked task) | stream close calls Task.shutdown(:brutal_kill) on early halt; normal completion drains remaining events |
| Cantrip.cast_batch/2 children | Task.async_stream (lib/cantrip.ex:565) | Task.async_stream context; bounded by max_concurrent_children ward | N/A (bounded enumeration) | killed on max_concurrency overflow or timeout |
| Code/Bash medium eval Tasks | Task.async in medium/code.ex:164, medium/bash.ex:121 | unlinked; timeout-guarded by code_eval_timeout_ms / similar ward | N/A (unlinked) | Task.yield + Task.shutdown(:brutal_kill) on timeout |
This inventory is the contract; any new long-lived or supervised process must extend this table.
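The `Task.yield` plus `Task.shutdown(:brutal_kill)` shutdown listed for the medium eval tasks is a standard OTP idiom, sketched generically below. `EvalTimeoutSketch` is hypothetical and is not the code in the medium modules; in particular, `Task.async/1` links the task to the caller, so this sketch does not reproduce the unlinked ownership noted in the table.

```elixir
defmodule EvalTimeoutSketch do
  # Generic timeout guard: wait up to timeout_ms for the task, and if it
  # has not replied, kill it without waiting for cleanup.
  def run(fun, timeout_ms) do
    task = Task.async(fun)

    case Task.yield(task, timeout_ms) || Task.shutdown(task, :brutal_kill) do
      {:ok, result} -> {:ok, result}
      {:exit, reason} -> {:error, {:exit, reason}}
      # Task.shutdown/2 returns nil when the task was killed before replying.
      nil -> {:error, :timeout}
    end
  end
end

IO.inspect(EvalTimeoutSketch.run(fn -> 1 + 1 end, 1_000))
IO.inspect(EvalTimeoutSketch.run(fn -> Process.sleep(5_000) end, 50))
```

The second call returns after roughly the timeout and not after the full sleep, because the task is killed once `Task.yield/2` gives up.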