Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
1.0.0 - 2026-06-17
Added
Phase 7 — distributed multi-node sessions (Tiers 0/1/2 + eager resume).
- Normandy.Behaviours.SessionStore.Postgres: durable session store over Ecto/Postgres (entries, opaque turn state, config template), with migrations and resume_policy/config_template columns. The Tier-1 durable store.
- Normandy.Behaviours.SessionRegistry.Horde (:via, members: :auto) and Normandy.Agents.Turn.Supervisor.Horde: CRDT-backed distributed registry + dynamic supervisor that route to / own Turn.Servers across a cluster (Tier-2).
- Normandy.Agents.Turn.ResumeReaper: selective eager handoff on :nodedown. Because Horde.DynamicSupervisor does not redistribute a dead node's children, the reaper restarts the eager, unregistered, non-terminal sessions whose server died with the lost node. Lazy rehydrate (route → whereis → rehydrate-on-demand) needs no reaper.
- Normandy.Behaviours.AgentTemplate + a persisted config template (Normandy.Agents.Turn.ConfigTemplate): the non-secret config (model/temperature/behaviour refs/tools) needed to reconstruct an agent on rehydration; a template_provider resolves it. Credentials are never persisted.
- SessionStore gained save_config_template/3, load_config_template/2, and list_resumable/1 (eager session ids); SessionRegistry gained the optional child_name/2 ({:via, …}) for atomic, supervisor-driven start that closes the start-time race. InMemory/ETS/Native impls were extended to match.
- Normandy.Cluster.child_specs/1: one-call wiring of the Horde registry + supervisor + reaper (plus an optional libcluster Cluster.Supervisor when :topologies are supplied and libcluster is loaded).
- Tier model: Tier-0 in-memory/ETS single-node default (unchanged); Tier-1 durable store + lazy rehydrate; Tier-2 distributed registry/supervisor + eager reaper.
- Drop-in backends behind the same SessionStore/SessionRegistry seam: SessionStore.Mnesia (OTP-native distributed store, transactional appends, no external DB), SessionStore.Redis (Redis Streams), SessionRegistry.Redis (:via registry using Redis as the name table), and the Normandy.Cluster.setup_mnesia_store!/1 / redis_child_specs/1 wiring helpers.
Guardrails — pre-charge admission, threaded context, fail-open, semantic scope.
- Normandy.Agents.BaseAgent.admit/2,3 runs input guardrails as a pre-charge filter (no turn, memory, or circuit breaker), returning :ok | {:block, violations} instead of raising, so disallowed input is rejected before paying for a turn.
- Normandy.Guardrails.run/3 threads a caller-supplied context map to guards implementing the optional Guard.check/3 callback (check/2-only guards are unaffected). The context carries host data a guard needs but the framework must not interpret (ids, locale, conversation history).
- Per-guard :on_error policy: :reraise (default; a config bug stays a crash), :open (rescue the guard's raise and treat as a pass, for a guard fronting a flaky external service), :closed (rescue and turn it into a :guard_error violation). Only the check call is rescued; a malformed return always raises.
- Normandy.Guardrails.Builtins.SemanticScope: a provider-agnostic hybrid scope guard, with a cheap injected fast_path in front of an injected classifier ((value, context) -> :allow | {:block, reason}); the :block reason becomes the violation's machine-readable :constraint. (#31)
Phase 6 — AgentProcess durable turn engine (:server mode).
- Normandy.Coordination.AgentProcess opt-in :server mode (turn_engine: :server) routing turns through the durable Turn.Session/Turn.Server engine: approval parking, passivation, and persistence. :inline remains the default and is byte-for-byte unchanged.
- AgentProcess.approve/2 delivers human-approval decisions to a parked turn.
- Non-blocking :server run/3 / cast/3: the GenServer stays responsive while a turn is parked awaiting approval or passivated.
- Store-authoritative get_agent/1: reconstructs agent (including conversation memory) from SessionStore in :server mode.
- Template-only update_agent/2 in :server mode: updates config template (model/temperature/behaviours/tools); memory mutations are ignored because SessionStore is authoritative.
- Owned-or-supplied session infra: :store, :registry, :supervisor may be passed to start_link; if omitted, the process starts and owns in-memory defaults that terminate with it. :subscriber, :handlers, :approval_timeout_ms, and :idle_timeout_ms are forwarded to Turn.Session.
Phase 5 — compaction wiring (:steering boundary).
- Normandy.Behaviours.Compactor behaviour (+ NoOp default, opt-in WindowManager impl) invoked at the :steering turn boundary when the context window is exceeded; compactor slot on Behaviours.Config. (PR #32)
Fixed
- Flaky Turn.Supervisor.Horde test: a start_server racing the :via registration could observe a transient {:error, {:already_started, _}}; the test now retries the start through the via race. (#36)
- convert_turn_output/3 previously returned the empty output-schema struct for tool-using turns with non-chat_message output schemas, dropping the final response content. Non-chat_message-schema agents using tools were affected.
- Normandy.Context.TokenCounter was unusable against the live API: every count_message/2,3, count_conversation/2, and count_detailed/2 call sent max_tokens in the /v1/messages/count_tokens payload, which the endpoint rejects (400 invalid_request_error: "max_tokens: Extra inputs are not permitted"). The field is now omitted. The default model also moved off the retired claude-3-5-sonnet-20241022 to claude-haiku-4-5-20251001. The previously-skipped token-counter tests are now enabled as :integration tests and pass against the live endpoint.
Migration
- No action required: :inline is the default and is byte-for-byte unchanged.
- To adopt the durable engine: AgentProcess.start_link(agent: config, turn_engine: :server), optionally passing shared :store/:registry/:supervisor.
0.9.0 - 2026-06-17
Added
Phase 4a — approval core + chokepoint split (harness decomposition).
- Normandy.Agents.Dispatch.classify/3 (registry → before-hooks → policy → verdict) and Dispatch.execute/4 (budget → execute → record → after). dispatch_one/3 is re-expressed as classify ➞ execute; its observable behavior is unchanged (the existing dispatch suite is the parity oracle).
- Normandy.Agents.Turn core gains real human-approval parking: an :awaiting_approval state, parked_calls/held_results on %Turn.State{}, and the {:needs_approval, held, parked} → {:approval, decisions} → {:approved_results, results} event flow, with the batch-results logic factored into a shared apply_tool_results/2 (one decrement per batch, API-order preserved). The synchronous inline path is unchanged; only the Phase 4b :gen_statem shell will exercise these transitions.
Phase 4b — :gen_statem Turn shell (harness decomposition).
- Normandy.Agents.Turn.Server: an opt-in asynchronous :gen_statem interpreter of the pure Turn FSM (the async analog of the inline Driver). Coarse lifecycle states (:running/:awaiting_approval/:idle) carry monitored Tasks for blocking effects, state_timeouts (approval expiry, passivation idle), persistence at suspend points, and mid-turn message postponement. Real human-approval parking: park on :needs_approval, resume via Turn.Server.approve/2, fail-closed on approval timeout.
- Normandy.Agents.Turn.Session (router: whereis → route | rehydrate), Normandy.Agents.Turn.Supervisor (DynamicSupervisor, restart: :transient).
- Normandy.Behaviours.SessionRegistry (whereis/register/unregister) + Native default over Elixir Registry; session_registry slot on Behaviours.Config.
- Normandy.Components.AgentMemory.from_entries/1 rebuilds memory from stored history for rehydration.
- Four BaseAgent turn helpers exposed @doc false for shell reuse (non_streaming_handlers/0, admit_turn_input/2, base_agent_pipeline/1, turn_response_model/1); visibility-only, no behavior change.
- BaseAgent.run/2's inline path is unchanged; Turn.Server is additive.
Fixed
- Corrected the version stamp: the prior 1.0.0 (Phase 3) was never tagged and 1.0.0 is reserved for the final phase of the harness-decomposition milestone. Phase 3 is re-labeled 0.8.0 (a pre-1.0 breaking change from 0.7.0).
0.8.0 - 2026-06-12
Added
- Branching session memory + SessionStore (Phase 3 of the harness decomposition).
  - Normandy.Components.AgentMemory is now a struct of parent-linked AgentMemory.Entry records (id + parent_id) instead of a linear list. Branching is opt-in via fork/2; a linear conversation is a degenerate single-parent chain and history/1 output is unchanged. New accessors: fork/2, entries/1, get_entry/2, entry_chain/1, messages/1, latest_message/1.
  - Normandy.Behaviours.SessionStore (append_entry/3, history/2, fork/3, save_turn_state/3, load_turn_state/2) with InMemory (default) and ETS impls sharing one contract suite. Both serialize per-session writes (the InMemory impl via its Agent, the ETS impl via a GenServer that owns a private table), so concurrent appends/forks to one session can't clobber each other, a guarantee the shared contract now exercises under concurrency. The turn-state half round-trips an opaque term; its consumer (suspendable turn / passivation) lands in Phase 4. Postgres is deferred.
  - session_store slot on Normandy.Behaviours.Config (default {SessionStore.InMemory, []}): selectable per-agent, not on the dispatch pipeline, not yet consumed by the turn loop.
Changed
- BREAKING: AgentMemory's struct shape and dump/1 / load/1 JSON format changed (entry-based). The dump carries a version key (currently 1) as a forward marker for future format detection; load/1 does not yet branch on it. Code that read the old %{history: [...]} map shape must use the public API or the new accessors. The dump/1 / load/1 format is not backward-compatible with pre-0.8.0 dumps.
- count_messages/1 now returns the total number of stored entries (map_size(entries)). For a linear conversation this is identical to the old active-chain length; after a fork/2 with divergent appends it counts entries across all branches, not just the active one.
Security
AgentMemory.load/1no longer decodes dumps withkeys: :atoms. A blanket atom decode interned every nested content key, so an untrusted/corrupt dump could exhaust the VM atom table.load/1now decodes with string keys and atomizes only known struct field names (viato_existing_atom, which never mints new atoms); raw content round-trips verbatim.
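The decode strategy described above (string keys first, then atomizing only known field names) can be sketched roughly as follows. This is a minimal illustration, not the library's code: EntrySketch, its field list, and from_decoded/1 are hypothetical names, and dropping unknown top-level keys is an assumption of the sketch.

```elixir
defmodule EntrySketch do
  # Hypothetical stand-in for an entry struct; not the library's definition.
  defstruct [:id, :parent_id, :role, :content]

  # String names of the struct fields. The matching atoms already exist
  # because defstruct above references them.
  @field_names ["id", "parent_id", "role", "content"]

  # Builds a struct from a string-keyed decoded map. Only known top-level
  # keys are atomized; unknown keys are dropped (a choice of this sketch)
  # and nested content keeps its string keys untouched.
  def from_decoded(decoded) when is_map(decoded) do
    fields =
      for {key, value} <- decoded, key in @field_names do
        {String.to_existing_atom(key), value}
      end

    struct(__MODULE__, fields)
  end
end
```

Because the allow-list check runs before String.to_existing_atom/1, an attacker-chosen key never reaches an atom-creating call, which is the property the entry above relies on.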
Robustness
AgentMemorygraph walks are cycle-safe. Both the active-branch walk (chain_newest_first/1, behindhistory/1,entry_chain/1,messages/1) and the survivor-rewiring walk (surviving_ancestor, behinddelete_turn/2) track visited ids, so a corrupt dump carrying a parent cycle terminates instead of looping forever.
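A visited-set walk of the kind described above might look like the sketch below. It assumes a simplified model where entries is a plain map of id to parent id; ChainWalkSketch and ancestors/2 are hypothetical names, not the library's functions.

```elixir
defmodule ChainWalkSketch do
  # entries: %{id => parent_id}, with nil marking a root.
  # Returns the ids from leaf_id up to the root, newest first.
  def ancestors(entries, leaf_id), do: walk(entries, leaf_id, MapSet.new(), [])

  # Reached a root: done.
  defp walk(_entries, nil, _seen, acc), do: Enum.reverse(acc)

  defp walk(entries, id, seen, acc) do
    cond do
      # Already visited: the parent links form a cycle, so stop here.
      MapSet.member?(seen, id) -> Enum.reverse(acc)
      # Parent id points at a missing entry: stop rather than crash.
      not Map.has_key?(entries, id) -> Enum.reverse(acc)
      true -> walk(entries, Map.fetch!(entries, id), MapSet.put(seen, id), [id | acc])
    end
  end
end
```

With a well-formed chain the walk ends at the root; with a two-entry cycle it returns each id once and terminates.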
Notes
- Linear-conversation observable behavior is unchanged; the end-to-end suite is the parity oracle. Internal consumers (base_agent iteration counters, window_manager/summarizer memory rebuild) and white-box tests were migrated to the new accessors without behavior changes.
0.7.0 - 2026-06-01
Added
- Pluggable behaviours (Phase 2 of the harness decomposition). The dispatch chokepoint's function slots are now backed by four Elixir @behaviours, each with a default impl that preserves current behavior:
  - Normandy.Behaviours.PolicyEngine (check/2): default AllowAll; plus a shipped Ruleset impl that evaluates ordered in-memory rules (match glob → :allow | :deny | :require_approval, first-match-wins, configurable default_action).
  - Normandy.Behaviours.BudgetTracker (check/2, record/2): default NoOp.
  - Normandy.Behaviours.CredentialProvider (get_token/2): default FromClient (extracts api_key from the client struct). Defined and defaulted; LLM-call consumption deferred.
  - Normandy.Behaviours.ModelCatalog (get/1, supports?/2, context_window/1): default Static, now the single source of truth for WindowManager's context-window limits.
- Normandy.Behaviours.Config bundle + to_pipeline/1, selectable per-agent via the new BaseAgentConfig.behaviours field.
- before/after hooks are now first-class, config-selectable function slots.
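The Ruleset evaluation order mentioned above (ordered rules, first match wins, configurable default action) can be illustrated with a small sketch. Everything here is an assumption for illustration: the {glob, action} rule shape, the RulesetSketch module, and the glob semantics (only "*" is special and matches any run of characters) are not taken from the library.

```elixir
defmodule RulesetSketch do
  # rules: ordered list of {glob, action}. The first matching rule decides;
  # if none match, default_action is returned.
  def decide(rules, tool_name, default_action \\ :allow) do
    Enum.find_value(rules, default_action, fn {glob, action} ->
      if glob_match?(glob, tool_name), do: action
    end)
  end

  # Translates a glob into an anchored regex: literal parts are escaped
  # and each "*" becomes ".*".
  defp glob_match?(glob, name) do
    pattern =
      glob
      |> String.split("*")
      |> Enum.map_join(".*", &Regex.escape/1)

    Regex.match?(Regex.compile!("\\A" <> pattern <> "\\z"), name)
  end
end

rules = [{"fs_delete", :deny}, {"fs_*", :require_approval}, {"*", :allow}]
RulesetSketch.decide(rules, "fs_delete")
# → :deny (the exact rule comes before the broader "fs_*" rule)
```

Rule order is the only precedence mechanism in this sketch, so narrower patterns need to be listed before broader ones.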
Notes
- Additive and default-off: with the default bundle, observable behavior is unchanged. No migration required.
0.6.3 - 2026-05-12
Added
Normandy.LLM.JsonDeserializernow supports opt-in recovery from a specific truncated-JSON failure mode: when an LLM (notably Nemotron-Nano-12B-VL on DigitalOcean Inference) emits a response that ends inside an unclosed top-level string field — typically because the model entered a\n-escape runaway and ran out of output tokens —parse_and_validate/3anddeserialize_with_retry/8now acceptrecover_truncated_strings: true. When the flag is on AND the strict decode fails AND the content looks like a single top-level object AND a byte scanner determines the unclosed string is at the outermost depth, Normandy truncates the string at the last position whose preceding bytes were not part of a\nescape, appends a closing", and appends}/]closers derived from a tracked open-container stack. The recovered payload is re-decoded once through the adapter and run through the same cast pipeline as the happy path; a[:normandy, :json_deserializer, :recovery]telemetry event is emitted on success with%{recovered: 1}measurements and%{strategy: :truncated_string, byte_size_before: _, byte_size_after: _}metadata. Default isfalse— pre-existing callers see no behaviour change. Designed for vision-pipelinepage_texttranscription payloads where the alternative is an empty%Output{}and zero RAG indexing on customer-grade documents; not a general-purpose JSON repair. Nested-string truncation (e.g.{"offerings":[{"name":"Paq) explicitly does NOT recover, since manufacturing a closer there would produce a half-truthful inner record rather than empty top-level data.
0.6.2 - 2026-05-11
Fixed
Normandy.LLM.JsonDeserializernow recovers from tool-use-style response envelopes: some vision/instruction-following LLMs (Nemotron-Nano-12B-VL on DigitalOcean Inference, some Llama variants) wrap their JSON in{"name": "...", "arguments": {...}}even when givenresponse_format: {"type": "json_object"}and a system prompt asking for the bare object shape.parse_and_validate/3previously cast such payloads to an all-defaults struct and returned:ok, so downstream consumers (notablyevent_crew's vendor-doc vision extraction) silently dropped every populated field.parse_and_populate/3now retries the cast once againstparsed["arguments"]when the outer attempt either succeeded with all-defaults OR returned a validation error, and the"arguments"value is itself a map. If the retry yields any populated field it wins; if it still yields all-defaults the original result is preserved so bare-shape responses see no behaviour change. One level only —{"arguments": {"arguments": {...}}}is not unwrapped. Inner cast errors are propagated when the inner map carries at least one permitted key (atom or string form), so e.g.{"arguments":{"count":"not_a_number"}}againstcount: :integersurfaces the validation error instead of returning an empty struct; inner errors are still suppressed when no permitted keys are present, so unrelated envelopes don't manufacture new failures. The{:ok, struct}/{:error, reason}contract is unchanged for every pre-existing shape (#22).get_required_fields/1inJsonDeserializernow reads required fields from the correct source: the helper was filtering__specification__/0entries as if each were a metadata map, butNormandy.Schemastores{name, type}tuples there — so the filter never matched andvalidate_required/2was being called with an empty list for every schema. It now reads__schema__(:required)(the source of truth) and falls back to the old scan for schemas that don't expose that callback. 
Required-field validation now actually fires for schemas declaringfield :foo, _, required: true(#22).
0.6.1 - 2026-05-02
Fixed
- OpenTelemetry context propagation across Normandy.Tools.Executor spawn sites: tool bodies run via Task.async/1 in execute_with_timeout/2 and execute_parallel/3. OTel context lives in the process dictionary, so spawned Tasks started with an empty context, and any span opened inside a tool's run/1 became a root span in a fresh trace instead of nesting under normandy.tool.execute. Symptom in downstream apps: integration spans (e.g. external API calls, blob downloads) appeared as orphans in Tempo, and normandy.tool.execute was an opaque blob with no breakdown. The executor now captures the parent context before each spawn and re-attaches it inside the spawned function. Applied at both Task.async sites: execute_with_timeout/2 (the primary fix) and execute_parallel/3 (so the inner timeout call sees a non-empty context to propagate further). The capture/restore helpers (already present in BaseAgent for Task.async_stream) are extracted into a new internal Normandy.Telemetry.OtelCtx module and shared by both call sites; they no-op when :opentelemetry is not loaded, so consumers without OTel pay nothing (#20).
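The underlying mechanism can be shown without OpenTelemetry at all. The sketch below uses a plain process-dictionary key as a stand-in for the OTel context; CtxSketch, :sketch_ctx, and async_with_ctx/1 are hypothetical names and do not reflect the internal OtelCtx helpers.

```elixir
defmodule CtxSketch do
  # Stand-in for a tracing context stored in the process dictionary.
  @key :sketch_ctx

  # Starts a Task that carries the caller's context across the spawn.
  def async_with_ctx(fun) when is_function(fun, 0) do
    # Capture in the parent, before the new process exists.
    ctx = Process.get(@key)

    Task.async(fn ->
      # Re-attach inside the worker, whose dictionary starts empty.
      Process.put(@key, ctx)
      fun.()
    end)
  end
end

Process.put(:sketch_ctx, "trace-1")
bare = Task.async(fn -> Process.get(:sketch_ctx) end) |> Task.await()
carried = CtxSketch.async_with_ctx(fn -> Process.get(:sketch_ctx) end) |> Task.await()
# bare is nil (fresh process dictionary); carried is "trace-1"
```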
Security
- Secret redaction in Inspect output for Normandy.LLM.ClaudioAdapter and Normandy.A2A.AgentTool: a live API key leaked through default error logging in a downstream project when a Task crashed and the BEAM error logger inspected closure args holding a %ClaudioAdapter{}. Kernel.inspect/2 rendered the secret in plaintext. Both structs now carry @derive {Inspect, except: [...]} covering :api_key (ClaudioAdapter) and :auth_token (AgentTool). Normandy.MCP.ServerConfig already had this protection. Field access (dot syntax, Map.get/2, pattern matching) is unchanged; only the Inspect representation is affected. Locked in with regression tests asserting the secret value never appears in inspect/1 output (#19).
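For readers applying the same protection to their own structs, a minimal sketch of the pattern follows. AdapterSketch and its fields are hypothetical; only the @derive {Inspect, except: [...]} form comes from the entry above.

```elixir
defmodule AdapterSketch do
  # Exclude the secret field from the derived Inspect implementation.
  @derive {Inspect, except: [:api_key]}
  defstruct [:api_key, :model]
end

adapter = struct!(AdapterSketch, api_key: "sk-secret", model: "some-model")

# The rendered form omits :api_key, so crash logs cannot leak it.
rendered = inspect(adapter)

# Field access is unaffected by the derive.
key = adapter.api_key
```

The derive only changes how the struct is rendered; code that reads the field directly still sees the real value, so log statements that interpolate the field itself remain the caller's responsibility.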
Changed
- ExDoc warnings silenced: Normandy.Type.load/1 is now declared as an optional callback (the contract was already documented and exercised by custom-type implementations). Normandy.ParameterizedType.embed_as/2 is also now declared as an optional callback (already in defoverridable with a default :self impl). No behavioural change; doc-build is now warning-clean (#18).
0.6.0 - 2026-05-01
Added
- Typed-struct cache control on multimodal content blocks: each of
Normandy.Components.ContentBlock.{Text,Image,Document}gains an optionalcache_controlfield pluswith_cache/1(ephemeral, the common case) andwith_cache/2(caller-supplied map, e.g.%{"type" => "ephemeral", "ttl" => "1h"}). Atom keys are accepted and stringified at serialization time.to_claudio/1emits thecache_controlkey only when set, so existing callers see no wire-shape change. Closes the gap left in0.5.1where multimodal cache breakpoints required hand-built raw maps. - Conversation-breakpoint auto-cache strategy: when
enable_caching: true,Normandy.LLM.ClaudioAdapternow annotates the last block of the last user message withcache_control: %{"type" => "ephemeral"}, mirroring how Anthropic recommends placing prompt-cache breakpoints on chat conversations. Triggers only for list-form or single-ContentBlock-struct content — plain-string user messages keep their existing wire shape so chat-text callers see no behaviour change. Caller-setcache_control(viawith_cache/1-2or hand-built atom/string-keyedcache_controlon a raw map) is preserved; the adapter never overrides it. Earlier user messages in the history are not annotated. - List-form system prompt caching: the system clause of
add_single_message/3previously short-circuitedenable_caching: truefor list-form content because Claudio'sset_system_with_cache/2only wraps strings. The adapter now annotates the last block of a list-form system prompt and routes it throughset_system/2with pre-shaped wire blocks. Symmetric with the existing string-system caching path. - Normandy.Components.ContentBlock.CacheControl (
@moduledoc false): internal helper that string-normalizes top-level cache_control keys and raisesArgumentErrorwhen an atom and string version of the same key collide post-normalization, so caller intent is never silently lost.
Changed
dispatch_multimodal/3named-helper patterns now requirecache_control: nilon both blocks. Claudio'sadd_message_with_image,add_message_with_image_url, andadd_message_with_documenttake raw args and rebuild blocks internally — anycache_controlon the sourceContentBlockstruct would have been silently dropped on the wire. With this change, cache-annotated blocks always go through the raw-list fallback path that preserves block fields.- Multimodal system prompt with
enable_caching: truenow emitscache_controlon the last system block. Previously this combination was a documented opt-out — the adapter ignoredenable_cachingfor list-form system content and required callers to hand-build annotated block maps. Wire-shape change for callers that hit this exact combination in0.5.x. - Claudio dependency bumped to
~> 0.5.0.
0.5.1 - 2026-04-29
Added
- Multimodal user input via list-shaped content blocks: agents can now
receive a list of content blocks (e.g.
[%{"type" => "text", ...}, %{"type" => "image", ...}]) throughMyAgent.run/2,MyAgent.run/3, andMyAgent.run_with_tools/2. The list flows throughprepare_input/1,AgentMemory, and the Claudio adapter unchanged, whereadd_single_message/3already dispatches it through the existing multimodal path. Two minimal upstream changes make this work:Normandy.Components.BaseIOSchemanow has afor: Listimpl whoseto_json/1returns the list verbatim (mirrors the four-callback shape of the existingBitString/Mapimpls), and Normandy.DSL.Agent.prepare_input/1 passes lists through unchanged. Strings continue to wrap into%{chat_message: ...}and maps continue to pass through (unchanged). Callers that need prompt-cache breakpoints inside multimodal user content can hand-build raw block maps with a"cache_control"key — the adapter's raw-list path preserves them verbatim. Typed-struct caching support onNormandy.Components.ContentBlock.{Text,Image,Document}is deferred to a future release.
0.5.0 - 2026-04-29
Added
- Per-agent
max_tool_concurrency(bounded parallel tool execution):BaseAgentConfiggains amax_tool_concurrencyfield (default1). The tool loop inBaseAgentnow wraps each per-call worker throughTask.async_stream(ordered: true, max_concurrency: config.max_tool_concurrency, timeout: :infinity, on_timeout: :kill_task)in both the non-streaming and streaming branches. Default1preserves pre-0.5.0 sequential behaviour (modulo the worker-process semantics noted under Changed below). Values> 1opt the agent into parallel tool execution — each tool call runs in its ownTaskworker, ordered by the LLM's call sequence, with up to N running at once. OTel parent context is propagated softly (viaCode.ensure_loaded?(OpenTelemetry.Ctx)— Normandy does not add OTel as a hard dep) so consumer-side telemetry handlers continue to nest tool spans under the parentagent.runspan. - DSL macro
max_tool_concurrency/1: sets the compile-time default insideNormandy.DSL.Agent.agent do ... end. Runtime overrides onMyAgent.new/1(top-level keyword, or via:override) take precedence as for any other agent setting. - Input validation for
:max_tool_concurrency: non-integer values ("4",4.0, etc.) now raiseArgumentErrorrather than silently coercing to a default — a config bug should surface, not hide. Integers< 1are clamped to1to match the runtime tool-loop floor. Validation runs at both layers: at compile time inside the DSL__before_compile__(soMyAgent.config().max_tool_concurrencydoesn't lie about the value the agent will actually use), and at runtime insideBaseAgent.init/1fornew/1and:overridecallers. The sharedBaseAgent.normalize_max_tool_concurrency/1helper drives both paths. BaseAgent.unwrap_tool_task_result!/1(@doc false, public for testability): translates aTask.async_streamelement into the underlying tool result. The linkedTask.async_stream/3propagates worker raises to the caller via process-link before yielding, so{:exit, {exception, stacktrace}}is unreachable for raises in the current configuration; the helper still handles it (re-raising with the original stacktrace) along with{:exit, reason}— most importantly{:exit, :timeout}fromon_timeout: :kill_taskand any deliberateexit/1from tool wrapper code — so those fail loudly instead of hittingFunctionClauseErroragainst a{:ok, _}-only pattern.
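The validation rule described above (raise on non-integers, clamp integers below 1 up to 1) is small enough to sketch directly. ConcurrencySketch.normalize/1 is a hypothetical stand-in, and the error message is invented for illustration.

```elixir
defmodule ConcurrencySketch do
  # Integers pass through, with anything below 1 clamped to 1.
  def normalize(value) when is_integer(value), do: max(value, 1)

  # Anything else ("4", 4.0, nil, ...) is a config bug and should surface.
  def normalize(other) do
    raise ArgumentError,
          "max_tool_concurrency must be an integer, got: #{inspect(other)}"
  end
end
```

Note that 4.0 is rejected rather than truncated, because is_integer/1 is false for floats.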
Changed
- Streaming callback process semantics (
stream_with_tools/3): the callback now executes in theTask.async_streamworker process, not the caller — including atmax_tool_concurrency: 1, becauseTask.async_streamalways spawns one worker per closure. Callbacks that referencedself()inside (e.g.fn :tool_result, r -> send(self(), {:tool_result, r}) end) will now target the worker PID. To send messages back to the owner, capture the PID outside the callback first:parent = self(); fn :tool_result, r -> send(parent, ...) end. This is the canonical Elixir pattern for any callback that may run in a worker process. - Streaming
:tool_resultcallback ordering at concurrency > 1:stream_with_tools/3invokescallback.(:tool_result, result)from inside each worker as soon as that tool finishes, so atmax_tool_concurrency > 1callers observe:tool_resultevents in completion order, not LLM-call order. The final list of tool results sent back to the LLM stays in LLM-call order (Task.async_streamis invoked withordered: true). Callers that need call-order callback delivery should keepmax_tool_concurrency: 1or buffer + reorder client-side. - Tool loop refactor (
BaseAgent): extracted the per-tool-call body ofexecute_tool_loop/2andexecute_streaming_tool_loop/3into the private helpersexecute_one_tool_call/2andexecute_one_streaming_tool_call/2. Pure refactor — behaviour, ordering, and process semantics are identical to the previous inlineEnum.mapclosures. Sets up a follow-up change to swapEnum.mapfor an opt-in bounded parallel runner (per-agentmax_tool_concurrency) without churning the closure body again.
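The two streaming changes above (callbacks run in the worker process, and callback order follows completion while the returned list follows call order) can be reproduced with plain Task.async_stream. The sketch below is an illustration only: CallbackSketch.run/3 and the multiply-by-ten "tool" are hypothetical and stand in for the real tool loop.

```elixir
defmodule CallbackSketch do
  # Runs a fake "tool" per input inside Task.async_stream workers and
  # invokes the callback from inside each worker as it finishes.
  def run(inputs, callback, max_concurrency) do
    inputs
    |> Task.async_stream(
      fn input ->
        result = input * 10
        callback.(:tool_result, result)
        result
      end,
      ordered: true,
      max_concurrency: max_concurrency,
      timeout: :infinity
    )
    |> Enum.map(fn {:ok, result} -> result end)
  end
end

# Capture the owner PID outside the callback; self() inside the callback
# would be the worker's PID.
parent = self()
callback = fn :tool_result, result -> send(parent, {:tool_result, result}) end

results = CallbackSketch.run([1, 2, 3], callback, 2)
# results is [10, 20, 30] (ordered: true); the {:tool_result, _} messages
# in the owner's mailbox may arrive in a different order.
```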
Security
- Atom-table hardening (
BaseAgent): replacedString.to_atom/1over LLM-supplied tool input keys withnormalize_tool_field_key/2, which only returns atoms that already exist as fields on the tool struct. LLM tool input is influenced by attacker-controllable prompt content (chat messages, webhooks); the previous code registered every unknown key in the global atom table on the way tostruct/2discarding it, and BEAM never garbage-collects atoms — sustained crafted input could exhaust the table and crash the VM. Unknown keys are now silently dropped, preserving the existing user-visible behaviour ofstruct/2.
Fixed
- Streaming tool input normalisation (
BaseAgent):execute_one_streaming_tool_call/2now routestool_call["input"]throughnormalize_tool_input/1instead of an ad-hoccasethat only acceptednil, maps, and binaries. Streaming tool input is raw LLM JSON, so a list/number/boolean previously raisedCaseClauseErrorand aborted the whole streaming tool loop; unexpected shapes now degrade to%{}. The redundantparse_json_input/1private helper (functionally identical to the binary clause ofnormalize_tool_input/1) is removed.
0.4.0 - 2026-04-25
Added
Multimodal Content Blocks: Image and document support for agent messages
Normandy.Components.ContentBlock.{Text, Image, Document}framework-neutral block types with per-moduleto_claudio/1emitting Anthropic wire shapesClaudioAdapter.add_single_message/3opportunistically dispatches to Claudio's named helpers for the three wrapped shapes (base64 image+text, URL image+text, document+text); other shapes (multi-block, reversed, image-alone, pre-shaped maps withcache_control) fall through to a raw-listadd_message/3Normandy.Components.Message.contentwidened from:structto:anywith extended@type tcoveringString.t() | struct() | [struct()]- Token accounting in
WindowManager,TokenCounter, andSummarizernow handles list content (image blocks ~1600 tokens, documents ~3000) instead of silently zero-counting them
Guardrails: First-class content-level constraint layer for agent I/O, composable across input and output stages
Normandy.Guardrailsrunner with short-circuit semanticsNormandy.Guardrails.Guardbehaviour for custom guardsNormandy.Guardrails.ViolationErrorraised on input violations- Built-in guards:
MaxLength,ForbiddenSubstrings,RegexGuard(:deny/:requiremodes),RequiredFields BaseAgentintegration via new:input_guardrails/:output_guardrailsconfig keys (input violations halt, output violations log and continue, mirroringValidationMiddleware)DSL macro
guardrails(:input | :output, [specs])inNormandy.DSL.Agent- Telemetry event
[:normandy, :agent, :guardrail, :violation]with:stage,:agent_name,:guards, and:violationsmetadata - Works on both non-streaming (
run/2) and streaming paths — see the streaming output guardrails entry below for streaming specifics
Streaming Output Guardrails: Output guardrails now run on streaming paths
:accumulatemode (default) — guards run on the final assistant text after the stream ends; log-and-continue on violation, matching non-streamingrun/2posture:incrementalmode (opt-in) — guards run every:output_guardrails_chunk_sizebytes of accumulated text plus a tail pass when the stream ends with unchecked bytes; on violation halts mid-stream, strips any in-flighttool_usecontent block, and returns with:guardrail_violationspopulated- Three signal channels on both modes:
:guardrail_violationstream callback event,:guardrail_violationsfield on the returned response, and the existing telemetry event (metadata gainsstreaming: trueandmode: :accumulate | :incremental) - New DSL macros inside
agent do … end:streaming_mode/1,streaming_chunk_size/1 - New
BaseAgentConfigfields::output_guardrails_streaming_mode,:output_guardrails_chunk_size
Fixed
- Streaming Cold-Start:
BaseAgent.stream_response/3andstream_with_tools/3no longer fail with"Client does not support streaming"when invoked as the first call through theNormandy.Agents.Modelprotocol. With protocol consolidation enabled (default in:dev/:prod), the consolidated impl module was not auto-loaded, so thefunction_exported?/3capability probe returned false. Now wraps the probe withCode.ensure_loaded/1(#9).
Changed
- Claudio dependency bumped to
~> 0.4.0. Required for streaming SSE events to decode with string-keyed data maps (matches the raw Anthropic JSON convention); earlierkeys: :atomsdecoding silently dropped callback dispatches in Normandy's adapter.
0.3.0 - 2026-04-18
Added
MCP and A2A Protocol Support: New protocols for interoperability
- Normandy.MCP.ToolWrapper for wrapping Model Context Protocol (MCP) tools
- Normandy.MCP.Registry for managing MCP tool collections
- Normandy.A2A.Server for agent-to-agent communication
- Support for cross-agent tool execution and discovery
Structured Agent Lifecycle Logging & Telemetry: Enhanced observability
- Logger calls for agent, LLM, and tool lifecycle events
- Telemetry events for:
  - [:normandy, :agent, :run, :start | :stop | :exception]
  - [:normandy, :llm, :call, :start | :stop | :exception]
  - [:normandy, :tool, :execute, :start | :stop | :exception]
- Automatic duration tracking for all operations
- Metadata enrichment with agent names, models, and tool names
- OpenTelemetry-friendly logging with span context correlation
Telemetry Metadata & Robustness:
- Agent names included in all telemetry metadata
- Improved error handling in LLM adapter calls
- Support for Finch connection pool in ClaudioAdapter
DSL Enhancements:
- Exposed run/3 in DSL for direct streaming support
- Improved agent definition ergonomics
Schema Enhancements
Schema-Based Tool Definition: New SchemaBaseTool mixin for streamlined tool creation
- tool_schema macro providing single source of truth for tool definitions
- Automatic JSON schema generation and validation
- ~60% reduction in boilerplate code compared to manual approach
Tool Registry Metadata Methods: Enhanced introspection capabilities
- get_metadata/2, list_metadata/1, filter_by_required_params/2, etc.
- Find tools by constraints, parameter types, or required fields
Validation Middleware: Automatic validation for agent inputs and outputs
- Type-safe agent execution with path-based error messages
- Fail-fast on invalid inputs, warn on invalid LLM outputs
Changed
- Calculator Tool Migration: Migrated to schema-based approach with improved type safety
- HTTP Client: Added support for custom Finch pools in ClaudioAdapter
- JSON Schema Type Format: Schema types now use atoms (:object) instead of strings ("object")
- CI/CD: Adjusted test coverage threshold to 60% and updated matrix testing
Fixed
- Streaming Stability: Restored tool loop, message conversion, and event shape in streaming responses
- Tool Loop: Fixed unwrapping of double-nested JSON in `chat_message` after tool loop completion
- JSON Deserialization: Return structured content blocks from tool `to_json` instead of raw strings
- Dependency Issues: Added default `Poison` adapter to prevent encoding errors in consuming apps
- Logging: Preserved DSL-defined agent names in lifecycle logs
- Dialyzer: Resolved various type errors and added ignore patterns for clean analysis
- CI: Fixed compilation warnings and intermittent test failures
Test Coverage
- Total tests: 900+ (doctests + property tests + unit tests)
- 0 failures, 100% passing rate
0.2.0 - 2025-10-28
Added
CI/CD Infrastructure
- GitHub Actions Workflow: Comprehensive CI pipeline for automated testing
- Matrix testing across Elixir 1.15, 1.16, 1.17 and OTP 26, 27
- Separate jobs for unit tests, integration tests, Dialyzer, and dependency audits
- Smart caching for dependencies and PLT files
- Conditional integration test execution with API key support
- Documentation in `.github/workflows/README.md`
Examples and Documentation
Comprehensive Examples: Three runnable examples demonstrating key features
- Customer support agent with custom tools and conversational memory
- Multi-agent research workflow with parallel execution
- Structured data extraction with validated output schemas
- Complete examples README with setup instructions and key concepts
Customer Support Example Application: Production-ready multi-agent system
- Four specialized agents (Greeter, Technical, Billing, Order Support)
- Custom tools for knowledge base, order lookup, refunds, and ticket creation
- Interactive CLI interface with session management
- Data stores for orders, tickets, and knowledge base
- Full application architecture documentation
Context Management Improvements
TokenCounter Test Coverage: Comprehensive unit tests for token counting
- 15 tests covering all TokenCounter functionality
- Mock-based testing for unit tests
- Integration tests for real API calls
- Error handling and edge case coverage
Date/Time Context Provider: Dynamic timestamp injection for prompts
- `Normandy.Components.DateTimeProvider` for temporal context
- Configurable timezone support
- Test coverage for provider functionality
Development Tools
- JSON Deserializer: Improved JSON parsing with error handling
- `Normandy.LLM.JsonDeserializer` for robust JSON parsing
- Fallback mechanisms for malformed JSON
- Integration tests for retry scenarios
Fixed
TokenCounter Implementation: Critical bug fixes for production use
- Fixed Claudio client initialization (map format instead of keyword list)
- Fixed agent structure access patterns (direct field access)
- Fixed system prompt extraction (pattern matching instead of get_in/2)
- Added comprehensive error handling for malformed agents
Access Protocol Issues: Resolved struct field access errors
- Replaced get_in/2 with pattern matching for BaseAgentConfig
- Improved error messages for malformed agent structures
Documentation
- Enhanced ExDoc configuration with organized module groups
- Examples directory with comprehensive usage documentation
- CI/CD workflow documentation with local testing commands
- Customer support application architecture guide
Test Coverage
- 443 unit tests (29 doctests + 21 properties + 393 tests)
- 62 integration tests (56 API + 6 comprehensive DSL tests)
- 15 new TokenCounter unit tests
- Total: 505+ tests, all passing
0.1.0 - 2025-10-26
Added
Declarative DSLs (Phase 8.6)
Agent DSL: Define agents with declarative syntax
- `Normandy.DSL.Agent` - `agent do ... end` blocks for agent configuration
- Macro-based configuration for model, temperature, prompts, tools
- Automatic initialization with `new/1` and agent execution
- Background, steps, and output_instructions directives
Workflow DSL: Compose multi-agent workflows
- `Normandy.DSL.Workflow` - `workflow do ... end` blocks
- Sequential execution: `step :name do ... end`
- Parallel execution: `parallel :name do ... end`
- Race patterns: `race :name do ... end`
- Data flow: `input(from: :step_name)` or static values
- Result transformation: `transform fn ... end`
- Conditional execution: `when_result do ... end`
- Automatic step orchestration and error handling
Pattern Matching Helpers: Utilities for result tuples
- `Normandy.Coordination.Pattern` - Ergonomic {:ok, value} | {:error, reason} handling
- Type checking: `ok?/1`, `error?/1`
- Value extraction: `ok!/2`, `error!/2`, `unwrap!/1`
- Filtering lists: `filter_ok/1`, `filter_errors/1`
- Transformations: `map_ok/2`, `map_error/2`
- Composition: `then/2`, `find_ok/1`, `collect_ok/1`, `all_ok/1`, `all_ok_map/1`
- Wrapping utilities: `wrap/1`, `try_wrap/1`
Reactive Coordination Patterns
- `Normandy.Coordination.Reactive` - Concurrent agent execution primitives
- `race/3` - Return first successful result from multiple agents
- `all/3` - Wait for all agents with optional fail-fast mode
- `some/4` - Quorum pattern (wait for N successful results)
- `map/3` - Transform agent results
- `when_result/3` - Conditional execution based on results
Agent Pool Management
- `Normandy.Coordination.AgentPool` - Connection pool pattern for agents
- Transaction-based API with automatic checkout/checkin
- Manual checkout/checkin for advanced use cases
- Configurable pool size with overflow support
- LIFO/FIFO checkout strategies
- Automatic agent replacement on failure
- Pool statistics and monitoring
- Non-blocking checkout with timeout support
Core Foundation (Phases 1-7)
Schema System: Macro-based DSL for defining agent I/O schemas with JSON Schema generation
- `Normandy.Schema` module with `schema` and `io_schema` macros
- Type system with casting, dumping, and loading via `Normandy.Type`
- Changeset-like validation with `Normandy.Validate`
- Support for parameterized and custom types
Agent System: Core agent implementation with LLM integration
- `Normandy.Agents.BaseAgent` with init, run, and get_response methods
- `Normandy.Agents.BaseAgentConfig` for agent state management
- Context provider system for dynamic prompt injection
- Tool/function calling support via `Normandy.Agents.ToolCallResponse`
Memory Management: Conversational history tracking
- `Normandy.Components.AgentMemory` with turn-based organization
- Message serialization and deserialization
- Configurable message history limits
Prompt System: Structured prompt generation
- `Normandy.Components.SystemPromptGenerator` with section-based prompts
- `Normandy.Components.PromptSpecification` for prompt structure
- `Normandy.Components.ContextProvider` protocol for dynamic context
Streaming Responses: Real-time LLM response streaming
- Streaming support in `Normandy.Agents.BaseAgent`
- Callback-based streaming with arity-2 callback support
Resilience Patterns: Fault tolerance and reliability
- `Normandy.Resilience.Retry` with exponential backoff
- `Normandy.Resilience.CircuitBreaker` for preventing cascade failures
- Integration with BaseAgent for automatic retry on failures
Context Window Management: Intelligent conversation management
- `Normandy.Context.WindowManager` for automatic context management
- `Normandy.Context.TokenCounter` for accurate token counting
- `Normandy.Context.Summarizer` for conversation summarization
- Support for Claude's prompt caching (up to 90% cost reduction)
Multi-Agent Coordination (Phase 8)
Agent Communication: Message-based agent-to-agent communication
- `Normandy.Coordination.AgentMessage` for structured messaging
- `Normandy.Coordination.SharedContext` for stateless context sharing
- `Normandy.Coordination.StatefulContext` (GenServer + ETS) for stateful sharing
Orchestration Patterns: Multiple coordination strategies
- `Normandy.Coordination.SequentialOrchestrator` for pipeline execution
- `Normandy.Coordination.ParallelOrchestrator` for concurrent execution
- `Normandy.Coordination.HierarchicalCoordinator` for manager-worker patterns
- Simple and advanced APIs for flexible usage
Agent Processes: OTP-based agent supervision
- `Normandy.Coordination.AgentProcess` (GenServer wrapper)
- `Normandy.Coordination.AgentSupervisor` (DynamicSupervisor)
- Fault tolerance with Elixir/OTP patterns
Batch Processing
- Concurrent Processing: Efficient batch agent execution
- `Normandy.Batch.Processor` for concurrent batch processing
- Configurable concurrency limits
- Result aggregation and error handling
Integration & Testing (Phase 8.5)
Integration Tests: Comprehensive real-world testing
- 56 integration tests with real Anthropic API calls
- Test helpers: `IntegrationHelper` and `NormandyIntegrationHelper`
- Tag-based test exclusion (`@moduletag :api`, `@moduletag :integration`)
- Coverage for multi-agent workflows, resilience, caching, and batch processing
LLM Client Integration: Claudio HTTP client migration
- Updated to Claudio v0.1.1 from hex.pm
- Migrated from Tesla to Req HTTP client
- Streaming error handling for `Req.Response.Async`
Fixed
- Orchestrator APIs: Fixed `extract_result` to return full response maps instead of just chat_message strings
- Function clause matching: Improved pattern matching for simple vs advanced orchestrator APIs
- Streaming callbacks: Fixed arity-2 callback support for streaming responses
Documentation
- Comprehensive README with usage examples
- Project roadmap (ROADMAP.md) tracking implementation phases
- MIT License
- Hex.pm package metadata and documentation configuration
Dependencies
- `elixir_uuid` ~> 1.2 - UUID generation for conversation turns
- `poison` ~> 6.0 - JSON encoding/decoding
- `claudio` ~> 0.1.1 - Anthropic Claude API client
- `dialyxir` ~> 1.4 (dev/test) - Static analysis
- `stream_data` ~> 1.1 (dev/test) - Property-based testing
- `ex_doc` ~> 0.34 (dev) - Documentation generation
Test Coverage
- 443 unit tests (29 doctests + 21 properties + 393 tests)
- 62 integration tests (56 API + 6 comprehensive DSL tests, excluded by default)
- Total: 505 tests, all passing
- New test files:
  - `test/coordination/pattern_test.exs` (13 tests)
  - `test/coordination/reactive_test.exs` (33 tests)
  - `test/coordination/agent_pool_test.exs` (30 tests)
  - `test/dsl/agent_test.exs` (8 tests)
  - `test/dsl/workflow_test.exs` (14 tests)
  - `test/dsl/workflow_transform_integration_test.exs` (4 tests)
  - `test/normandy_integration/dsl_comprehensive_test.exs` (6 comprehensive integration tests)