Cantrip is an Elixir/OTP runtime for language-model entities acting through mediums, gates, wards, and looms. It is the canonical package implementation of the Cantrip spellbook lineage: the original ghost-library vocabulary is preserved, while the runtime surface is ordinary Elixir.

Core Shape

A cantrip is a reusable value. It combines:

  • an LLM behaviour implementation and provider state
  • an identity with system prompt and model-facing options
  • a circle describing medium, gates, and wards
  • optional loom storage, retry, and folding configuration

Casting a cantrip starts a one-shot entity. Summoning a cantrip starts a supervised entity process that can receive multiple intents. The entity is what emerges from the loop; the cantrip is the configuration that produces it.

The circle is the runtime contract:

A = M ∪ G − W

The medium determines the shape of thought. Gates expose host capabilities. Wards bound runtime behavior. The loom is the durable tree left behind by the entity's turns. The Familiar's default code medium runs trusted Elixir in the host BEAM for operator-local coding work, while plain code-medium circles without a sandbox ward default to the port boundary.
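Read literally as set algebra, the circle formula can be sketched with MapSet. This is only an illustration: the capability names are invented, the grouping (M ∪ G) − W is an assumed reading of the formula, and the real runtime models mediums, gates, and wards with richer structures than bare atoms.

```elixir
# Illustrative capability names; none of these are taken from the package.
medium = MapSet.new([:eval_elixir])
gates = MapSet.new([:read_file, :mix, :done])
warded = MapSet.new([:mix])

# A = (M union G) - W, under the assumed grouping.
action_space =
  medium
  |> MapSet.union(gates)
  |> MapSet.difference(warded)

IO.inspect(Enum.sort(action_space))
# => [:done, :eval_elixir, :read_file]
```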

Runtime Loop

Cantrip.cast/3 starts an internal supervised entity server for one episode. Cantrip.summon/1 starts a persistent entity; Cantrip.summon/2 starts one and immediately runs its first intent. Cantrip.send/3 sends a further intent to an already summoned entity.

Each turn:

  1. folds prompt context if configured
  2. presents the selected medium to the LLM
  3. invokes the provider through the internal provider-call boundary
  4. classifies the response into the selected medium's input shape
  5. executes through the medium
  6. appends the utterance and observations to the loom
  7. terminates, truncates, or continues

Errors that belong to the entity's operating environment are observations. They are returned to the loop as data instead of crashing the process.
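The loop shape and the errors-as-observations rule can be sketched in a few lines of plain Elixir. Everything here is hypothetical: LoopSketch, the {:call, gate, arg} utterance shape, and the scripted "model" are stand-ins chosen for the example, not Cantrip's entity server or provider boundary.

```elixir
defmodule LoopSketch do
  # Runs turns until the fake model calls :done or the turn budget runs out.
  def run(llm, gates, max_turns), do: loop(llm, gates, [], 1, max_turns)

  defp loop(_llm, _gates, loom, turn, max_turns) when turn > max_turns do
    {:truncated, Enum.reverse(loom)}
  end

  defp loop(llm, gates, loom, turn, max_turns) do
    # The model sees the loom so far (oldest first) and returns an utterance.
    utterance = llm.(Enum.reverse(loom))
    observation = execute(utterance, gates)
    loom = [%{turn: turn, utterance: utterance, observation: observation} | loom]

    case observation do
      {:done, answer} -> {:terminated, answer, Enum.reverse(loom)}
      _other -> loop(llm, gates, loom, turn + 1, max_turns)
    end
  end

  defp execute({:call, :done, answer}, _gates), do: {:done, answer}

  # Gate failures become error observations instead of crashing the loop.
  defp execute({:call, name, arg}, gates) do
    case Map.fetch(gates, name) do
      {:ok, fun} ->
        try do
          {:ok, fun.(arg)}
        rescue
          error -> {:error, Exception.message(error)}
        end

      :error ->
        {:error, "unknown gate: #{name}"}
    end
  end
end

gates = %{read_file: fn path -> File.read!(path) end}

# A scripted "model": first asks for a missing file, then finishes.
llm = fn
  [] -> {:call, :read_file, "/definitely/missing/file.txt"}
  [_first | _] -> {:call, :done, "could not read the file"}
end

{:terminated, answer, loom} = LoopSketch.run(llm, gates, 5)
IO.inspect(hd(loom).observation)
```

The first turn's failed read lands in the loom as an {:error, message} observation, and the second turn still runs.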

Mediums

The conversation medium projects gates as provider tool definitions.

The code medium evaluates Elixir with persistent bindings. The sandbox option selects the evaluator:

  • sandbox: :port is the default for plain code-medium circles. It runs Dune-restricted Elixir in a child BEAM process. Add %{port_runner: [...]} to put that child under deployment-level OS/container controls.
  • sandbox: :port_unrestricted keeps the child process but evaluates raw Elixir there.
  • sandbox: :dune routes through the in-process Dune evaluator, a deliberately smaller-surface variant of the code medium (see docs/port-isolated-runtime.md "Dune Variant"); entity prompts need to fit that surface.
  • sandbox: :unrestricted is the trusted host-BEAM evaluator, and it is the Familiar default.

The bash medium executes one shell command per turn inside an OS sandbox. Shell process state does not persist between turns; filesystem effects persist only for paths admitted by %{bash_writable_paths: [...]}. The medium fails closed when no sandbox adapter is available (bubblewrap on Linux, sandbox-exec on macOS, or an explicit deployment adapter later).

The Bash adapter contract is empirical, not aspirational: CI exercises a representative local shell workload suite under the available OS sandbox. The suite covers git, make, jq, /dev/null redirects, and common find/sed/grep pipelines. The workload suite opts into %{bash_network: :on} because GitHub-hosted Linux runners can install bubblewrap but cannot reliably create the network namespace bubblewrap uses for default network denial. Separate tests pin the default network-deny command shape (--unshare-net) so adapter regressions still fail locally and in capable CI. New shell workload expectations should land as tests first so sandbox configuration gaps surface in CI instead of in user sessions.

Bash gates are projected as commands in a per-turn directory placed at the front of PATH. A circle with read_file can run read_file README.md; a circle with mix can run mix test test/foo_test.exs. The shell command is not the gate authority: wrappers call back to the parent BEAM, where the ordinary gate executor applies dependencies, wards, telemetry, and redaction. The done gate is exposed as cantrip_done because done is a shell keyword. SUBMIT: output remains supported for shell-only answers.

The wrapper protocol is filesystem-based by design: a wrapper writes a per-call request directory, the parent runtime polls for ready calls, and the wrapper replays the host response to stdout/stderr. This keeps the protocol portable across Seatbelt and bubblewrap without socket mount policy, at the cost of a small polling latency floor. It is tuned for LLM-rate gate calls, not high-frequency shell RPC.

Gate command names live at the front of PATH. If a gate name collides with a shell builtin or common command (test, time, read, etc.), the gate command wins when invoked as an external command; use a non-colliding gate name when the shell builtin must remain ergonomic.

medium_opts: %{sandbox: :passthrough} exists only for tests. It is rejected outside Mix.env() == :test and is not a deployment fallback.

Bash-specific wards:

  • %{bash_writable_paths: [path, ...]} allows writes under those paths.
  • %{bash_network: :on} enables network for adapters that support it; default is network off.
  • %{bash_timeout_ms: ms} overrides the per-command timeout.
  • %{bash_max_output_bytes: n} bounds the shell observation output.

ACP stdio embedding must start the :cantrip application before sessions create event bridges. Cantrip.ACP.Server.run/1 does this for the packaged entrypoint; custom embedders should either call Application.ensure_all_started(:cantrip) or supervise Cantrip.ACP.EventBridgeSupervisor themselves.

ACP request metadata is also the production trace-correlation boundary. The handler accepts _meta.trace_id or _meta.cantrip_trace_id on session/new and session/prompt; the Familiar runtime carries that value into Cantrip.summon/3 / Cantrip.send/3 so telemetry emitted by the entity can be joined to an external request, job, or editor operation. Without that metadata, the entity mints its own trace ID. _meta is not a Familiar configuration channel: LLM selection, loom paths, turn budgets, and other runtime controls come from server/runtime configuration, not from editor-supplied request metadata.

Composition

Composition uses the public package API, not special delegation gates. Code-medium entities call Cantrip.new/1, Cantrip.cast/3, and Cantrip.cast_batch/2 directly. Parent context supplies inherited child LLM, wards, root dependencies, cancellation, streaming, and loom grafting. Child casts are not an escape hatch around the circle: a parent checks its max_depth before any pre-built child starts, and the child runs under WardPolicy.compose(parent.circle.wards, child.circle.wards). Numeric wards tighten with min, boolean wards such as require_done_tool tighten with or, and cast_batch uses the same path for each child while respecting the parent's max_concurrent_children.
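The tightening rule can be sketched as a map merge. WardComposeSketch is a hypothetical stand-in, not the real WardPolicy module: it models wards as one flat map, and max_turns is used here only as an example of a numeric ward.

```elixir
defmodule WardComposeSketch do
  # Merges child wards into parent wards, tightening on every shared key.
  def compose(parent, child) do
    Map.merge(parent, child, fn _key, p, c -> tighten(p, c) end)
  end

  # Numeric wards tighten with min: the smaller limit wins.
  defp tighten(p, c) when is_number(p) and is_number(c), do: min(p, c)
  # Boolean wards tighten with or: once required, always required.
  defp tighten(p, c) when is_boolean(p) and is_boolean(c), do: p or c
  # Assumption for this sketch: any other ward shape keeps the parent's value.
  defp tighten(p, _c), do: p
end

parent = %{max_turns: 10, max_depth: 2, require_done_tool: true}
child = %{max_turns: 25, max_depth: 1, require_done_tool: false}

composed = WardComposeSketch.compose(parent, child)
IO.inspect(composed)
# => %{max_depth: 1, max_turns: 10, require_done_tool: true}
```

A child can therefore ask for a looser budget than its parent, but the composed wards never grant it.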

Parents can also declare constraints on what kinds of children may be spawned. These declaration-time child wards are checked before runtime ward composition:

  • %{child_medium_allowlist: [:conversation, :code]}
  • %{child_gate_allowlist: [:done, :read_file]}
  • %{child_gate_denylist: [:compile_and_load]}
  • %{child_max_turns_ceiling: n}
  • %{child_max_depth_ceiling: n}
  • %{max_children_total: n}

The allow/deny wards constrain the child circle shape. Ceiling wards require the child to declare the corresponding runtime ward at or below the ceiling; they do not silently rewrite the child. max_children_total counts accepted child casts cumulatively across a code-medium entity's state. Rejected child construction returns {:error, reason}. Rejected child casts produce an error observation on the parent loom and emit [:cantrip, :ward, :child_rejected].
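A minimal sketch of the declaration-time check, assuming the child is a plain map with :medium, :gates, and :wards keys and that parent wards are a list of single-key maps. ChildWardSketch and its error reasons are hypothetical; only the ward keys come from the list above.

```elixir
defmodule ChildWardSketch do
  # Returns :ok, or {:error, reason} for the first ward the child violates.
  def check(parent_wards, child) do
    Enum.find_value(parent_wards, :ok, fn ward ->
      case violation(ward, child) do
        nil -> nil
        reason -> {:error, reason}
      end
    end)
  end

  defp violation(%{child_medium_allowlist: allowed}, child) do
    if child.medium in allowed, do: nil, else: {:medium_not_allowed, child.medium}
  end

  defp violation(%{child_gate_allowlist: allowed}, child) do
    case child.gates -- allowed do
      [] -> nil
      extra -> {:gates_not_allowed, extra}
    end
  end

  defp violation(%{child_gate_denylist: denied}, child) do
    case Enum.filter(child.gates, &(&1 in denied)) do
      [] -> nil
      hit -> {:gates_denied, hit}
    end
  end

  # Ceiling wards: the child must declare the ward itself, at or below the
  # ceiling. A missing declaration is rejected rather than filled in.
  defp violation(%{child_max_turns_ceiling: ceiling}, child) do
    case Map.get(child.wards, :max_turns) do
      n when is_integer(n) and n <= ceiling -> nil
      other -> {:max_turns_not_within_ceiling, other}
    end
  end

  # Wards this sketch does not model are ignored.
  defp violation(_other_ward, _child), do: nil
end

parent_wards = [
  %{child_medium_allowlist: [:conversation, :code]},
  %{child_gate_denylist: [:compile_and_load]},
  %{child_max_turns_ceiling: 8}
]

ok_child = %{medium: :code, gates: [:done, :read_file], wards: %{max_turns: 5}}
undeclared_child = %{medium: :code, gates: [:done], wards: %{}}

IO.inspect(ChildWardSketch.check(parent_wards, ok_child))
IO.inspect(ChildWardSketch.check(parent_wards, undeclared_child))
```

The second child is rejected because it never declares max_turns, which mirrors the rule that ceilings do not silently rewrite the child.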

This is the RLM pattern in package form: large context lives in the medium, subtasks run as child cantrips, and summaries return upward. Composition is code, not a static workflow graph.

Streaming

Streaming events are delivered as {:cantrip_event, event} messages to the configured :stream_to process. Consumers that opt into :stream_barrier? apply backpressure at the event boundary: after each event, the runtime sends a barrier message and waits until the consumer acknowledges it. cast_stream/2 uses that path by default, and its stream resource acknowledges barriers as it drains events, so a caller that has not started consuming cannot accumulate an unbounded mailbox. ACP Familiar sessions also use stream barriers so slow ACP notification delivery slows the entity run instead of allowing bridge mailbox growth.

Plain stream_to: pid without :stream_barrier? remains fire-and-forget for compatibility. Use it only when the receiver is known to drain at producer rate; otherwise its mailbox can grow without bound. Pass stream_barrier?: true with a receiver that understands {:cantrip_barrier, from, ref} and replies with {:cantrip_barriered, ref}.
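A barrier-aware receiver can be sketched as a small receive loop. The three message shapes come from the text above; the rest is assumed for illustration: that `from` is a pid the acknowledgement can be sent to directly, the event names, the :collect message, and the hand-written producer standing in for the runtime.

```elixir
defmodule BarrierConsumerSketch do
  # Collects events and acknowledges each barrier as soon as it is reached.
  def loop(events \\ []) do
    receive do
      {:cantrip_event, event} ->
        loop([event | events])

      {:cantrip_barrier, from, ref} ->
        # Assumption: the acknowledgement is a plain message back to `from`.
        send(from, {:cantrip_barriered, ref})
        loop(events)

      # Hypothetical helper message so the demo can read the result back.
      {:collect, caller} ->
        send(caller, {:events, Enum.reverse(events)})
    end
  end
end

consumer = spawn(fn -> BarrierConsumerSketch.loop() end)

# Stand-in producer: one barrier after each event, as the runtime would do.
for event <- [:turn_started, :utterance, :turn_finished] do
  send(consumer, {:cantrip_event, event})
  ref = make_ref()
  send(consumer, {:cantrip_barrier, self(), ref})

  # Blocking here until the consumer catches up is the backpressure point.
  receive do
    {:cantrip_barriered, ^ref} -> :ok
  after
    1_000 -> raise "consumer did not acknowledge the barrier"
  end
end

send(consumer, {:collect, self()})

events =
  receive do
    {:events, collected} -> collected
  after
    1_000 -> raise "consumer did not reply"
  end

IO.inspect(events)
```

Because the producer waits on every barrier, the consumer's mailbox never holds more than one event plus one barrier at a time.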

Loom

The loom is the durable artifact of the loop. It records intents, turns, utterances, observations, child turns, metadata, and fork lineage.

Backends:

  • memory for ephemeral tests and scratch sessions
  • JSONL for portable traces. The backend serializes appends through an in-BEAM per-path lock, but it is still a single-writer file format across OS processes. Use one writer per file; use Mnesia when multiple nodes need shared durable state.
  • Mnesia for BEAM-native durable workspace state

Folding is a view over prompt context. When the message history grows past a configured threshold, older turns are summarized into a compact [Folded: turns N..M] marker in the LLM's input. The original turns remain in the loom unchanged — folding shrinks what the model sees on the next call, not what was recorded. Configure with the :folding option on Cantrip.new/1.
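The view-not-mutation idea can be sketched as a pure function over the recorded turns. FoldViewSketch, its arguments, and the turn shape (maps with :n and :text) are hypothetical, and the sketch emits only the marker where the runtime would place its summary; the actual :folding option format is not shown here.

```elixir
defmodule FoldViewSketch do
  # Builds the prompt-side view. The input list is never modified.
  # `keep_recent` must be smaller than `threshold` so there is something to fold.
  def view(turns, threshold, keep_recent)
      when keep_recent >= 0 and keep_recent < threshold and length(turns) > threshold do
    {older, recent} = Enum.split(turns, length(turns) - keep_recent)
    marker = "[Folded: turns #{hd(older).n}..#{List.last(older).n}]"
    [marker | Enum.map(recent, & &1.text)]
  end

  # At or below the threshold, the model sees every turn.
  def view(turns, _threshold, _keep_recent), do: Enum.map(turns, & &1.text)
end

loom_turns = for n <- 1..6, do: %{n: n, text: "turn #{n}"}

prompt_view = FoldViewSketch.view(loom_turns, 4, 2)
IO.inspect(prompt_view)
# => ["[Folded: turns 1..4]", "turn 5", "turn 6"]

# The recorded turns are untouched; only the view shrank.
IO.inspect(length(loom_turns))
# => 6
```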

Code-medium code_state is kept full in memory so fork/replay can restore the latest sandbox bindings cheaply. Durable storage writes binding-level deltas after the first snapshot: unchanged bindings are referenced by key order, while new or changed bindings are written once in the turn that changed them. JSONL and Mnesia loaders expand those deltas back into full code_state maps before returning loom.turns, so callers keep the same in-memory API without paying O(turns x cumulative_binding_size) storage growth.
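The storage saving can be illustrated with a small encode/decode pair. The record shape below (a first snapshot, then :changed and :unchanged per turn) is an assumption made for the sketch, and it references unchanged bindings by name; it is not the JSONL or Mnesia on-disk format.

```elixir
defmodule CodeStateDeltaSketch do
  # First turn: full snapshot. Later turns: only new or changed bindings,
  # plus the names of bindings carried over unchanged.
  def encode([]), do: []

  def encode([first | rest]) do
    {records, _last} =
      Enum.map_reduce(rest, first, fn state, prev ->
        changed = for {k, v} <- state, Map.fetch(prev, k) != {:ok, v}, into: %{}, do: {k, v}
        unchanged = Map.keys(state) -- Map.keys(changed)
        {%{changed: changed, unchanged: unchanged}, state}
      end)

    [%{snapshot: first} | records]
  end

  # Expands the records back into one full bindings map per turn.
  def decode([]), do: []

  def decode([%{snapshot: first} | records]) do
    {states, _last} =
      Enum.map_reduce(records, first, fn %{changed: changed, unchanged: keys}, prev ->
        state = prev |> Map.take(keys) |> Map.merge(changed)
        {state, state}
      end)

    [first | states]
  end
end

big = String.duplicate("a", 1_000)

turns = [
  %{x: 1, big: big},
  %{x: 2, big: big},
  %{x: 2, big: big, y: :new}
]

encoded = CodeStateDeltaSketch.encode(turns)
decoded = CodeStateDeltaSketch.decode(encoded)

# The large binding is written once; later turns only name it.
IO.inspect(Enum.at(encoded, 1))
# => %{changed: %{x: 2}, unchanged: [:big]}
IO.inspect(decoded == turns)
# => true
```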

Safety Posture

The controls are explicit and scoped:

  • gate root validation constrains filesystem gates
  • redaction scrubs observations before they reach the entity
  • diagnostic redaction protects protocol/debug output
  • loop wards bound turns, depth, timeouts, and selected policies
  • Dune-in-port evaluation denies ambient filesystem/system/process authority and keeps LLM-written Elixir out of the host BEAM
  • child-BEAM telemetry events are forwarded over the port protocol and re-emitted by the parent with the same trace context
  • port_runner lets deployments put the child process inside an OS/container sandbox
  • the optional in-process Dune sandbox (sandbox: :dune) routes code evaluation through an in-VM restricted evaluator
  • compile/load wards scope hot-loaded modules (exact allow_compile_modules list), paths, hashes, and signers; framework modules under Elixir.Cantrip.* (except Elixir.Cantrip.Hot.*) are rejected even when explicitly allowlisted

The default port sandbox protects the host BEAM and denies ambient language capabilities. Deployment-level OS controls remain useful defense in depth for mounts, network, CPU, memory, and user isolation.

Struct conventions for credential-bearing data

Any struct that holds credential-shaped fields — API keys, bearer tokens, authorization headers, signed cookies — must declare @derive {Inspect, only: [<non-secret-fields>]} (or @derive {Inspect, except: [<secret-fields>]}). This prevents accidental leak via default inspect/1 in IEx sessions, error output, logger calls, or debug dumps. The safe formatting helpers cover the runtime boundary error surfaces; the @derive Inspect convention covers the construction-and-debug surface.
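As a minimal illustration of the convention, the struct below derives Inspect with the secret field excluded. ProviderCredsSketch and its fields are hypothetical; only the @derive line is the convention itself.

```elixir
defmodule ProviderCredsSketch do
  # Exclude the secret field from the derived Inspect implementation.
  @derive {Inspect, except: [:api_key]}
  defstruct [:base_url, :model, :api_key]
end

creds = %ProviderCredsSketch{
  base_url: "https://llm.example.invalid",
  model: "example-model",
  api_key: "sk-not-a-real-key"
}

# inspect/1 renders the non-secret fields and leaves :api_key out.
rendered = inspect(creds)
IO.puts(rendered)

# The field itself is still readable by code that needs it.
IO.puts(String.length(creds.api_key))
```

The `only:` form is the safer default for new structs, since a field added later stays hidden until someone lists it explicitly.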

Current durable structs do not hold credentials in their own fields. The one place to watch is :llm_state on the top-level %Cantrip{}: it is a plain map carrying provider state, including :api_key, so downstream code is expected either to redact at the boundary via the safe formatting helpers or to avoid logging raw :llm_state. Future structs that directly hold credentials must adopt the convention above.

Process Inventory

The table below lists every process kind Cantrip starts, along with its owner, restart strategy, and shutdown semantics. Reference this section when adding a new process.

| Process kind | Started by | Owner | Crash-restart | Shutdown |
| --- | --- | --- | --- | --- |
| Internal entity server (GenServer) | Cantrip.cast/3, Cantrip.summon/1 via DynamicSupervisor.start_child | entity dynamic supervisor | :temporary (no auto-restart; caller gets error) | default GenServer 5s; terminate/2 sends :stop to runner |
| Per-entity runner Task | entity server runner (lib/cantrip/entity_server.ex) | registered Task.Supervisor named :Cantrip.EntityTaskSupervisor | :temporary (Task.Supervisor default) | :brutal_kill 5s on app shutdown; in-progress episodes interrupted |
| Code-medium child BEAM | port sandbox launcher (lib/cantrip/medium/code/port.ex) | not supervised; linked to eval context | N/A (process-level) | on eval timeout or parent crash: implicit exit via port boundary |
| Port-child protocol loop | spawn_link in port_child.ex:140 | linked to parent (child-side bootstrap) | N/A (linked) | parent exit propagates crash via link |
| ACP EventBridge loop | Task.Supervisor.start_child/2 in acp/event_bridge.ex | registered Task.Supervisor named :Cantrip.ACP.EventBridgeSupervisor | :temporary (Task.Supervisor default) | :DOWN from monitored owner OR explicit :stop message |
| Cantrip.cast_stream/2 task | Task.async (lib/cantrip.ex:696) | linked to caller; caller drains via Stream | N/A (linked task) | stream close calls Task.shutdown(:brutal_kill) on early halt; normal completion drains remaining events |
| Cantrip.cast_batch/2 children | Task.async_stream (lib/cantrip.ex:565) | Task.async_stream context; bounded by max_concurrent_children ward | N/A (bounded enumeration) | killed on max_concurrency overflow or timeout |
| Code/Bash medium eval Tasks | Task.async in medium/code.ex:164, medium/bash.ex:121 | unlinked; timeout-guarded by code_eval_timeout_ms / similar ward | N/A (unlinked) | Task.yield + Task.shutdown(:brutal_kill) on timeout |

This inventory is the contract; any new long-lived or supervised process must extend this table.