Cantrip is now an Elixir package first. The implementation and ExUnit suite are the authoritative contract.
Workflow
- Write focused ExUnit coverage before changing behavior.
- Keep changes scoped to the runtime surface being changed.
- Prefer BEAM-native ownership: supervised processes, behaviours at real boundaries, explicit state where possible.
- Treat expected operational failures as observations. Let unexpected bugs crash under supervision.
- Keep durable docs current when public API, deployment posture, or package shape changes.
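The split between expected failures and unexpected bugs can be sketched as follows. This is a minimal illustration, not Cantrip code: the module `ObservationSketch`, the function `read_observation/1`, and the `{:read_failed, path, reason}` tuple shape are all hypothetical.

```elixir
defmodule ObservationSketch do
  # Hypothetical helper; not part of Cantrip's API.
  # An expected operational failure (missing file, permission denied) is
  # returned as data so the caller can record it as an observation.
  def read_observation(path) when is_binary(path) do
    case File.read(path) do
      {:ok, contents} -> {:ok, contents}
      {:error, reason} -> {:error, {:read_failed, path, reason}}
    end
  end

  # There is deliberately no clause for non-binary paths. Calling with one
  # raises FunctionClauseError, which is left for the supervisor to handle.
end
```

A missing file comes back as an `{:error, ...}` tuple the caller can log or feed back to the model, while a call such as `ObservationSketch.read_observation(nil)` crashes the calling process instead of being rescued.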
Runtime Principles
- The circle is the safety boundary.
- The medium determines the shape of thought.
- Errors are observations.
- Folding is a view over prompt context. It must never delete the underlying
loom record, and it must preserve all leading :system messages and the
original user intent in the prompt context the model sees; otherwise the
entity loses its identity or medium physics partway through a session.
- The loom is append-only; reward annotation is the exception.
- Code medium evaluates LLM-emitted Elixir inside a child BEAM via Dune by
default (sandbox: :port); :unrestricted and :port_unrestricted are explicit
escape hatches.
- Safety is layered: gate root validation, redaction, the port/Dune boundary, and deployment isolation.
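The folding rule above can be sketched as a pure function over the loom. This is a sketch under stated assumptions, not the runtime's actual folding code: the module `FoldSketch`, the `keep_recent` parameter, and the message shape `%{role: ..., content: ...}` are hypothetical, and "original user intent" is approximated here as the first user message.

```elixir
defmodule FoldSketch do
  # Hypothetical message shape: %{role: :system | :user | :assistant, content: binary}.
  # `loom` is the full record. The function returns a shorter view for the
  # prompt and never modifies or deletes anything in the loom itself.
  def fold(loom, keep_recent)
      when is_list(loom) and is_integer(keep_recent) and keep_recent >= 0 do
    # Every leading :system message is kept, regardless of how much is folded.
    {leading_system, rest} = Enum.split_while(loom, &(&1.role == :system))

    # Assumption: the first user message stands in for the original intent.
    first_user = Enum.find(rest, &(&1.role == :user))

    # Keep only the most recent `keep_recent` non-leading messages.
    recent = Enum.take(rest, -keep_recent)

    # Pin the first user message unless it is already part of the recent tail.
    pinned =
      if first_user != nil and first_user not in recent do
        [first_user]
      else
        []
      end

    leading_system ++ pinned ++ recent
  end
end
```

Because the function only builds a new list, the loom passed in is still available in full afterwards; the fold affects what the model sees, not what is stored.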
Quality Gates
Run before opening or updating a PR:
mix format --check-formatted
mix compile --warnings-as-errors
mix test
mix credo --ignore refactor
mix verify runs the same gate. Run ./scripts/check_signer_policy.sh when
changing compile_and_load policy, signer configuration, or hot-load wards;
see docs/signer-key-runbook.md for what that policy is for and how to
rotate keys.
Live integration tests
mix verify is unit-test scope. Live tests against real providers exist
under test/real_llm_*, test/familiar_real_llm_*, test/live_anthropic_test.exs,
and test/zed_trace_replay_test.exs. They are gated by Cantrip.Test.RealLLMEnv
(set RUN_REAL_LLM_TESTS=1 plus CANTRIP_LLM_PROVIDER / CANTRIP_MODEL /
provider-specific API key) and skip cleanly otherwise.
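The opt-in gating described above might look roughly like the sketch below. It is a hypothetical stand-in, not the real Cantrip.Test.RealLLMEnv, whose API is not shown in this document. The module name `RealLLMGateSketch` and the function `enabled?/1` are assumptions, and the sketch checks only the opt-in flag and the provider variable; model and API key checks are provider-specific and omitted.

```elixir
defmodule RealLLMGateSketch do
  # Hypothetical stand-in for the real gate; names here are illustrative only.
  # `env` defaults to the process environment so tests can pass a plain map.
  def enabled?(env \\ System.get_env()) do
    env["RUN_REAL_LLM_TESTS"] == "1" and present?(env["CANTRIP_LLM_PROVIDER"])
  end

  # An unset variable (nil) and an empty string both count as missing.
  defp present?(value), do: is_binary(value) and value != ""
end
```

A test helper could call a check like this once and skip the live suite when it returns false, so a plain mix test run never reaches a provider.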
Run before tagging a release, and any time a change touches the LLM adapter, medium dispatch, loom, folding, multi-send behavior, or anything else with a contract between the runtime and a real provider:
RUN_REAL_LLM_TESTS=1 CANTRIP_LLM_PROVIDER=anthropic ANTHROPIC_MODEL=claude-haiku-4-5 \
CANTRIP_TIMEOUT_MS=120000 \
mix test test/live_anthropic_test.exs test/real_llm_integration_test.exs
The class of bugs these catch is "code paths that look fine because the unit mocks return what the production code expects, not what real providers actually return."
CI runs the Anthropic live subset on pushes to main, release/**, and
v* tags. Those refs require the ANTHROPIC_API_KEY repository secret; PRs
run mix verify only so routine review does not spend provider tokens.