Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[Unreleased]
0.6.0 - 2026-05-01
Added
- Typed-struct cache control on multimodal content blocks: each of `Normandy.Components.ContentBlock.{Text,Image,Document}` gains an optional `cache_control` field plus `with_cache/1` (ephemeral, the common case) and `with_cache/2` (caller-supplied map, e.g. `%{"type" => "ephemeral", "ttl" => "1h"}`). Atom keys are accepted and stringified at serialization time. `to_claudio/1` emits the `cache_control` key only when set, so existing callers see no wire-shape change. Closes the gap left in `0.5.1`, where multimodal cache breakpoints required hand-built raw maps.
- Conversation-breakpoint auto-cache strategy: when `enable_caching: true`, `Normandy.LLM.ClaudioAdapter` now annotates the last block of the last user message with `cache_control: %{"type" => "ephemeral"}`, mirroring how Anthropic recommends placing prompt-cache breakpoints on chat conversations. Triggers only for list-form or single-`ContentBlock`-struct content; plain-string user messages keep their existing wire shape, so chat-text callers see no behaviour change. Caller-set `cache_control` (via `with_cache/1` or `with_cache/2`, or a hand-built atom- or string-keyed `cache_control` on a raw map) is preserved; the adapter never overrides it. Earlier user messages in the history are not annotated.
- List-form system prompt caching: the system clause of `add_single_message/3` previously short-circuited `enable_caching: true` for list-form content because Claudio's `set_system_with_cache/2` only wraps strings. The adapter now annotates the last block of a list-form system prompt and routes it through `set_system/2` with pre-shaped wire blocks. Symmetric with the existing string-system caching path.
- `Normandy.Components.ContentBlock.CacheControl` (`@moduledoc false`): internal helper that string-normalizes top-level `cache_control` keys and raises `ArgumentError` when an atom and string version of the same key collide post-normalization, so caller intent is never silently lost.
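The key-normalization rule described for the internal `CacheControl` helper can be sketched as follows. This is a minimal illustration, not Normandy's implementation: the module name `CacheControlSketch` and the function `normalize/1` are hypothetical, and the sketch assumes that a "collision" means an atom key and a string key that stringify to the same value.

```elixir
defmodule CacheControlSketch do
  # Hypothetical helper: stringifies top-level keys and raises when two keys
  # (e.g. :type and "type") collapse to the same string.
  def normalize(map) when is_map(map) do
    Enum.reduce(map, %{}, fn {key, value}, acc ->
      string_key = to_string(key)

      if Map.has_key?(acc, string_key) do
        raise ArgumentError,
              "duplicate cache_control key after normalization: #{inspect(string_key)}"
      end

      Map.put(acc, string_key, value)
    end)
  end
end

CacheControlSketch.normalize(%{type: "ephemeral", ttl: "1h"})
# => %{"ttl" => "1h", "type" => "ephemeral"}
```

Passing `%{:type => "a", "type" => "b"}` to this sketch raises `ArgumentError` instead of silently keeping one of the two values.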
Changed
- `dispatch_multimodal/3` named-helper patterns now require `cache_control: nil` on both blocks. Claudio's `add_message_with_image`, `add_message_with_image_url`, and `add_message_with_document` take raw args and rebuild blocks internally, so any `cache_control` on the source `ContentBlock` struct would have been silently dropped on the wire. With this change, cache-annotated blocks always go through the raw-list fallback path that preserves block fields.
- Multimodal system prompt with `enable_caching: true` now emits `cache_control` on the last system block. Previously this combination was a documented opt-out: the adapter ignored `enable_caching` for list-form system content and required callers to hand-build annotated block maps. Wire-shape change for callers that hit this exact combination in `0.5.x`.
- Claudio dependency bumped to `~> 0.5.0`.
0.5.1 - 2026-04-29
Added
- Multimodal user input via list-shaped content blocks: agents can now receive a list of content blocks (e.g. `[%{"type" => "text", ...}, %{"type" => "image", ...}]`) through `MyAgent.run/2`, `MyAgent.run/3`, and `MyAgent.run_with_tools/2`. The list flows through `prepare_input/1`, `AgentMemory`, and the Claudio adapter unchanged, where `add_single_message/3` already dispatches it through the existing multimodal path. Two minimal upstream changes make this work: `Normandy.Components.BaseIOSchema` now has a `for: List` impl whose `to_json/1` returns the list verbatim (mirrors the four-callback shape of the existing `BitString` / `Map` impls), and `Normandy.DSL.Agent.prepare_input/1` passes lists through unchanged. Strings continue to wrap into `%{chat_message: ...}` and maps continue to pass through (unchanged). Callers that need prompt-cache breakpoints inside multimodal user content can hand-build raw block maps with a `"cache_control"` key; the adapter's raw-list path preserves them verbatim. Typed-struct caching support on `Normandy.Components.ContentBlock.{Text,Image,Document}` is deferred to a future release.
0.5.0 - 2026-04-29
Added
- Per-agent `max_tool_concurrency` (bounded parallel tool execution): `BaseAgentConfig` gains a `max_tool_concurrency` field (default `1`). The tool loop in `BaseAgent` now wraps each per-call worker through `Task.async_stream(ordered: true, max_concurrency: config.max_tool_concurrency, timeout: :infinity, on_timeout: :kill_task)` in both the non-streaming and streaming branches. Default `1` preserves pre-0.5.0 sequential behaviour (modulo the worker-process semantics noted under Changed below). Values `> 1` opt the agent into parallel tool execution: each tool call runs in its own `Task` worker, ordered by the LLM's call sequence, with up to N running at once. OTel parent context is propagated softly (via `Code.ensure_loaded?(OpenTelemetry.Ctx)`; Normandy does not add OTel as a hard dep) so consumer-side telemetry handlers continue to nest tool spans under the parent `agent.run` span.
- DSL macro `max_tool_concurrency/1`: sets the compile-time default inside `Normandy.DSL.Agent.agent do ... end`. Runtime overrides on `MyAgent.new/1` (top-level keyword, or via `:override`) take precedence as for any other agent setting.
- Input validation for `:max_tool_concurrency`: non-integer values (`"4"`, `4.0`, etc.) now raise `ArgumentError` rather than silently coercing to a default; a config bug should surface, not hide. Integers `< 1` are clamped to `1` to match the runtime tool-loop floor. Validation runs at both layers: at compile time inside the DSL `__before_compile__` (so `MyAgent.config().max_tool_concurrency` doesn't lie about the value the agent will actually use), and at runtime inside `BaseAgent.init/1` for `new/1` and `:override` callers. The shared `BaseAgent.normalize_max_tool_concurrency/1` helper drives both paths.
- `BaseAgent.unwrap_tool_task_result!/1` (`@doc false`, public for testability): translates a `Task.async_stream` element into the underlying tool result. The linked `Task.async_stream/3` propagates worker raises to the caller via process-link before yielding, so `{:exit, {exception, stacktrace}}` is unreachable for raises in the current configuration; the helper still handles it (re-raising with the original stacktrace) along with `{:exit, reason}` (most importantly `{:exit, :timeout}` from `on_timeout: :kill_task` and any deliberate `exit/1` from tool wrapper code), so those fail loudly instead of hitting `FunctionClauseError` against a `{:ok, _}`-only pattern.
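The bounded, ordered execution and the validation rule described above can be sketched in plain Elixir. This is an illustration under stated assumptions, not the `BaseAgent` code: the module `ToolConcurrencySketch` and its functions are hypothetical, and the unwrap step is simplified to re-exit on any `{:exit, reason}` element.

```elixir
defmodule ToolConcurrencySketch do
  # Non-integers raise; integers below 1 clamp to 1 (the rule stated above).
  def normalize_max(n) when is_integer(n), do: max(n, 1)

  def normalize_max(other) do
    raise ArgumentError,
          "max_tool_concurrency must be an integer, got: #{inspect(other)}"
  end

  # Runs `fun` over each call with at most `max` workers at once.
  # `ordered: true` keeps results in input order regardless of finish order.
  def run_all(calls, fun, max) do
    calls
    |> Task.async_stream(fun,
      ordered: true,
      max_concurrency: normalize_max(max),
      timeout: :infinity,
      on_timeout: :kill_task
    )
    |> Enum.map(&unwrap!/1)
  end

  # Simplified unwrap: anything other than {:ok, _} fails loudly.
  defp unwrap!({:ok, result}), do: result
  defp unwrap!({:exit, reason}), do: exit(reason)
end

ToolConcurrencySketch.run_all(
  [3, 1, 2],
  fn n ->
    Process.sleep(n * 10)
    n * 2
  end,
  3
)
# => [6, 2, 4]
```

Even though the second call finishes first, the result list follows the input order, which is the property the changelog relies on for the tool results sent back to the LLM.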
Changed
- Streaming callback process semantics (`stream_with_tools/3`): the callback now executes in the `Task.async_stream` worker process, not the caller, including at `max_tool_concurrency: 1`, because `Task.async_stream` always spawns one worker per closure. Callbacks that referenced `self()` inside (e.g. `fn :tool_result, r -> send(self(), {:tool_result, r}) end`) will now target the worker PID. To send messages back to the owner, capture the PID outside the callback first: `parent = self(); fn :tool_result, r -> send(parent, ...) end`. This is the canonical Elixir pattern for any callback that may run in a worker process.
- Streaming `:tool_result` callback ordering at concurrency > 1: `stream_with_tools/3` invokes `callback.(:tool_result, result)` from inside each worker as soon as that tool finishes, so at `max_tool_concurrency > 1` callers observe `:tool_result` events in completion order, not LLM-call order. The final list of tool results sent back to the LLM stays in LLM-call order (`Task.async_stream` is invoked with `ordered: true`). Callers that need call-order callback delivery should keep `max_tool_concurrency: 1` or buffer and reorder client-side.
- Tool loop refactor (`BaseAgent`): extracted the per-tool-call body of `execute_tool_loop/2` and `execute_streaming_tool_loop/3` into the private helpers `execute_one_tool_call/2` and `execute_one_streaming_tool_call/2`. Pure refactor: behaviour, ordering, and process semantics are identical to the previous inline `Enum.map` closures. Sets up a follow-up change to swap `Enum.map` for an opt-in bounded parallel runner (per-agent `max_tool_concurrency`) without churning the closure body again.
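The PID-capture pattern from the first entry can be demonstrated without Normandy. In this sketch a bare `Task.async_stream` stands in for the tool loop (an assumption for illustration); the callback shape mirrors the `:tool_result` example above, with the worker PID added to the message so the difference is visible.

```elixir
# Capture the owner PID before building the callback.
parent = self()

# Inside a Task.async_stream worker, self() is the worker's PID,
# so the callback must close over `parent` to reach the owner.
callback = fn :tool_result, result ->
  send(parent, {:tool_result, result, self()})
end

[:a, :b]
|> Task.async_stream(fn item -> callback.(:tool_result, item) end, max_concurrency: 1)
|> Stream.run()

worker_pid =
  receive do
    {:tool_result, :a, pid} -> pid
  after
    1_000 -> nil
  end

IO.inspect(is_pid(worker_pid) and worker_pid != parent)
# prints true
```

The message arrives in the owner's mailbox only because `parent` was bound outside the closure; calling `send(self(), ...)` inside the callback would deliver it to the short-lived worker instead.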
Security
- Atom-table hardening (`BaseAgent`): replaced `String.to_atom/1` over LLM-supplied tool input keys with `normalize_tool_field_key/2`, which only returns atoms that already exist as fields on the tool struct. LLM tool input is influenced by attacker-controllable prompt content (chat messages, webhooks); the previous code registered every unknown key in the global atom table on the way to `struct/2` discarding it, and BEAM never garbage-collects atoms, so sustained crafted input could exhaust the table and crash the VM. Unknown keys are now silently dropped, preserving the existing user-visible behaviour of `struct/2`.
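One way to avoid creating atoms from untrusted keys is to look each key up in a table built from the struct's own fields. The sketch below assumes string-keyed input and uses a hypothetical `EchoTool` struct and `build/2` function; it illustrates the idea and is not the `normalize_tool_field_key/2` implementation.

```elixir
defmodule AtomSafeKeysSketch do
  # Hypothetical stand-in for a tool struct.
  defmodule EchoTool do
    defstruct [:text, :times]
  end

  # Maps string keys to atoms only when the atom is already a field of the
  # struct. Unknown keys are dropped and never converted to atoms.
  def build(module, input) when is_map(input) do
    allowed =
      module.__struct__()
      |> Map.from_struct()
      |> Map.keys()
      |> Map.new(fn field -> {Atom.to_string(field), field} end)

    fields =
      Enum.flat_map(input, fn {key, value} ->
        case Map.fetch(allowed, key) do
          {:ok, field} -> [{field, value}]
          :error -> []
        end
      end)

    struct(module, fields)
  end
end

AtomSafeKeysSketch.build(
  AtomSafeKeysSketch.EchoTool,
  %{"text" => "hi", "bogus_key" => 1}
)
```

The call returns an `EchoTool` struct with `text: "hi"` and `times: nil`; `"bogus_key"` is discarded without ever touching the atom table.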
Fixed
- Streaming tool input normalisation (`BaseAgent`): `execute_one_streaming_tool_call/2` now routes `tool_call["input"]` through `normalize_tool_input/1` instead of an ad-hoc `case` that only accepted `nil`, maps, and binaries. Streaming tool input is raw LLM JSON, so a list/number/boolean previously raised `CaseClauseError` and aborted the whole streaming tool loop; unexpected shapes now degrade to `%{}`. The redundant `parse_json_input/1` private helper (functionally identical to the binary clause of `normalize_tool_input/1`) is removed.
0.4.0 - 2026-04-25
Added
- Multimodal Content Blocks: Image and document support for agent messages
  - `Normandy.Components.ContentBlock.{Text, Image, Document}` framework-neutral block types with per-module `to_claudio/1` emitting Anthropic wire shapes
  - `ClaudioAdapter.add_single_message/3` opportunistically dispatches to Claudio's named helpers for the three wrapped shapes (base64 image+text, URL image+text, document+text); other shapes (multi-block, reversed, image-alone, pre-shaped maps with `cache_control`) fall through to a raw-list `add_message/3`
  - `Normandy.Components.Message.content` widened from `:struct` to `:any` with extended `@type t` covering `String.t() | struct() | [struct()]`
  - Token accounting in `WindowManager`, `TokenCounter`, and `Summarizer` now handles list content (image blocks ~1600 tokens, documents ~3000) instead of silently zero-counting them
- Guardrails: First-class content-level constraint layer for agent I/O, composable across input and output stages
  - `Normandy.Guardrails` runner with short-circuit semantics
  - `Normandy.Guardrails.Guard` behaviour for custom guards
  - `Normandy.Guardrails.ViolationError` raised on input violations
  - Built-in guards: `MaxLength`, `ForbiddenSubstrings`, `RegexGuard` (`:deny` / `:require` modes), `RequiredFields`
  - `BaseAgent` integration via new `:input_guardrails` / `:output_guardrails` config keys (input violations halt, output violations log and continue, mirroring `ValidationMiddleware`)
  - DSL macro `guardrails(:input | :output, [specs])` in `Normandy.DSL.Agent`
  - Telemetry event `[:normandy, :agent, :guardrail, :violation]` with `:stage`, `:agent_name`, `:guards`, and `:violations` metadata
  - Works on both non-streaming (`run/2`) and streaming paths; see the streaming output guardrails entry below for streaming specifics
- Streaming Output Guardrails: Output guardrails now run on streaming paths
  - `:accumulate` mode (default): guards run on the final assistant text after the stream ends; log-and-continue on violation, matching non-streaming `run/2` posture
  - `:incremental` mode (opt-in): guards run every `:output_guardrails_chunk_size` bytes of accumulated text plus a tail pass when the stream ends with unchecked bytes; on violation halts mid-stream, strips any in-flight `tool_use` content block, and returns with `:guardrail_violations` populated
  - Three signal channels on both modes: `:guardrail_violation` stream callback event, `:guardrail_violations` field on the returned response, and the existing telemetry event (metadata gains `streaming: true` and `mode: :accumulate | :incremental`)
  - New DSL macros inside `agent do … end`: `streaming_mode/1`, `streaming_chunk_size/1`
  - New `BaseAgentConfig` fields: `:output_guardrails_streaming_mode`, `:output_guardrails_chunk_size`
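The short-circuit semantics mentioned for the guardrail runner can be illustrated with `Enum.reduce_while/3`. Everything in this sketch is hypothetical (the `GuardrailSketch` module, guards modelled as plain functions, and the shape of the violation tuples); it does not reproduce the `Normandy.Guardrails` or `Guard` behaviour API.

```elixir
defmodule GuardrailSketch do
  # Each guard is a function returning :ok or {:error, reason}.
  # The run stops at the first violation (short-circuit).
  def run(guards, text) do
    Enum.reduce_while(guards, :ok, fn guard, :ok ->
      case guard.(text) do
        :ok -> {:cont, :ok}
        {:error, reason} -> {:halt, {:error, reason}}
      end
    end)
  end

  # Illustrative length guard.
  def max_length(limit) do
    fn text ->
      if String.length(text) <= limit, do: :ok, else: {:error, {:max_length, limit}}
    end
  end

  # Illustrative forbidden-substring guard.
  def forbidden(substrings) do
    fn text ->
      case Enum.find(substrings, &String.contains?(text, &1)) do
        nil -> :ok
        hit -> {:error, {:forbidden_substring, hit}}
      end
    end
  end
end

guards = [GuardrailSketch.max_length(20), GuardrailSketch.forbidden(["secret"])]

GuardrailSketch.run(guards, "hello")
# => :ok

GuardrailSketch.run(guards, "the secret is out")
# => {:error, {:forbidden_substring, "secret"}}
```

Guards later in the list are never invoked once an earlier one fails, which keeps expensive checks from running on input that is already rejected.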
Fixed
- Streaming Cold-Start: `BaseAgent.stream_response/3` and `stream_with_tools/3` no longer fail with `"Client does not support streaming"` when invoked as the first call through the `Normandy.Agents.Model` protocol. With protocol consolidation enabled (default in `:dev` / `:prod`), the consolidated impl module was not auto-loaded, so the `function_exported?/3` capability probe returned false. Now wraps the probe with `Code.ensure_loaded/1` (#9).
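The load-then-probe pattern behind this fix looks roughly like the following. `function_exported?/3` does not load modules, so a probe against a module that has not been loaded yet returns `false`; calling `Code.ensure_loaded/1` first removes that dependence on load order. The module name `CapabilityProbeSketch` and the function `supports?/3` are illustrative assumptions.

```elixir
defmodule CapabilityProbeSketch do
  # Load the module first, then ask whether the function is exported.
  def supports?(module, fun, arity) do
    case Code.ensure_loaded(module) do
      {:module, ^module} -> function_exported?(module, fun, arity)
      {:error, _reason} -> false
    end
  end
end

CapabilityProbeSketch.supports?(String, :upcase, 1)
# => true

CapabilityProbeSketch.supports?(NoSuchModule, :stream, 2)
# => false
```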
Changed
- Claudio dependency bumped to `~> 0.4.0`. Required for streaming SSE events to decode with string-keyed data maps (matches the raw Anthropic JSON convention); earlier `keys: :atoms` decoding silently dropped callback dispatches in Normandy's adapter.
0.3.0 - 2026-04-18
Added
- MCP and A2A Protocol Support: New protocols for interoperability
  - `Normandy.MCP.ToolWrapper` for wrapping Model Context Protocol (MCP) tools
  - `Normandy.MCP.Registry` for managing MCP tool collections
  - `Normandy.A2A.Server` for agent-to-agent communication
  - Support for cross-agent tool execution and discovery
- Structured Agent Lifecycle Logging & Telemetry: Enhanced observability
  - `Logger` calls for agent, LLM, and tool lifecycle events
  - Telemetry events for:
    - `[:normandy, :agent, :run, :start | :stop | :exception]`
    - `[:normandy, :llm, :call, :start | :stop | :exception]`
    - `[:normandy, :tool, :execute, :start | :stop | :exception]`
  - Automatic duration tracking for all operations
  - Metadata enrichment with agent names, models, and tool names
  - OpenTelemetry-friendly logging with span context correlation
- Telemetry Metadata & Robustness:
  - Agent names included in all telemetry metadata
  - Improved error handling in LLM adapter calls
  - Support for `Finch` connection pool in `ClaudioAdapter`
- DSL Enhancements:
  - Exposed `run/3` in DSL for direct streaming support
  - Improved agent definition ergonomics
Schema Enhancements
- Schema-Based Tool Definition: New `SchemaBaseTool` mixin for streamlined tool creation
  - `tool_schema` macro providing single source of truth for tool definitions
  - Automatic JSON schema generation and validation
  - ~60% reduction in boilerplate code compared to manual approach
- Tool Registry Metadata Methods: Enhanced introspection capabilities
  - `get_metadata/2`, `list_metadata/1`, `filter_by_required_params/2`, etc.
  - Find tools by constraints, parameter types, or required fields
- Validation Middleware: Automatic validation for agent inputs and outputs
  - Type-safe agent execution with path-based error messages
  - Fail-fast on invalid inputs, warn on invalid LLM outputs
Changed
- Calculator Tool Migration: Migrated to schema-based approach with improved type safety
- HTTP Client: Added support for custom `Finch` pools in `ClaudioAdapter`
- JSON Schema Type Format: Schema types now use atoms (`:object`) instead of strings (`"object"`)
- CI/CD: Adjusted test coverage threshold to 60% and updated matrix testing
Fixed
- Streaming Stability: Restored tool loop, message conversion, and event shape in streaming responses
- Tool Loop: Fixed unwrap of double-nested JSON in `chat_message` after tool loop completion
- JSON Deserialization: Return structured content blocks from tool `to_json` instead of raw strings
- Dependency Issues: Added default `Poison` adapter to prevent encoding errors in consuming apps
- Logging: Preserved DSL-defined agent names in lifecycle logs
- Dialyzer: Resolved various type errors and added ignore patterns for clean analysis
- CI: Fixed compilation warnings and intermittent test failures
Test Coverage
- Total tests: 900+ (doctests + property tests + unit tests)
- 0 failures, 100% passing rate
0.2.0 - 2025-10-28
Added
CI/CD Infrastructure
- GitHub Actions Workflow: Comprehensive CI pipeline for automated testing
  - Matrix testing across Elixir 1.15, 1.16, 1.17 and OTP 26, 27
  - Separate jobs for unit tests, integration tests, Dialyzer, and dependency audits
  - Smart caching for dependencies and PLT files
  - Conditional integration test execution with API key support
  - Documentation in `.github/workflows/README.md`
Examples and Documentation
- Comprehensive Examples: Three runnable examples demonstrating key features
  - Customer support agent with custom tools and conversational memory
  - Multi-agent research workflow with parallel execution
  - Structured data extraction with validated output schemas
  - Complete examples README with setup instructions and key concepts
- Customer Support Example Application: Production-ready multi-agent system
  - Four specialized agents (Greeter, Technical, Billing, Order Support)
  - Custom tools for knowledge base, order lookup, refunds, and ticket creation
  - Interactive CLI interface with session management
  - Data stores for orders, tickets, and knowledge base
  - Full application architecture documentation
Context Management Improvements
- TokenCounter Test Coverage: Comprehensive unit tests for token counting
  - 15 tests covering all TokenCounter functionality
  - Mock-based testing for unit tests
  - Integration tests for real API calls
  - Error handling and edge case coverage
- Date/Time Context Provider: Dynamic timestamp injection for prompts
  - `Normandy.Components.DateTimeProvider` for temporal context
  - Configurable timezone support
  - Test coverage for provider functionality
Development Tools
- JSON Deserializer: Improved JSON parsing with error handling
  - `Normandy.LLM.JsonDeserializer` for robust JSON parsing
  - Fallback mechanisms for malformed JSON
  - Integration tests for retry scenarios
Fixed
- TokenCounter Implementation: Critical bug fixes for production use
  - Fixed Claudio client initialization (map format instead of keyword list)
  - Fixed agent structure access patterns (direct field access)
  - Fixed system prompt extraction (pattern matching instead of `get_in/2`)
  - Added comprehensive error handling for malformed agents
- Access Protocol Issues: Resolved struct field access errors
  - Replaced `get_in/2` with pattern matching for `BaseAgentConfig`
  - Improved error messages for malformed agent structures
Documentation
- Enhanced ExDoc configuration with organized module groups
- Examples directory with comprehensive usage documentation
- CI/CD workflow documentation with local testing commands
- Customer support application architecture guide
Test Coverage
- 443 unit tests (29 doctests + 21 properties + 393 tests)
- 62 integration tests (56 API + 6 comprehensive DSL tests)
- 15 new TokenCounter unit tests
- Total: 505+ tests, all passing
0.1.0 - 2025-10-26
Added
Declarative DSLs (Phase 8.6)
- Agent DSL: Define agents with declarative syntax
  - `Normandy.DSL.Agent` - `agent do ... end` blocks for agent configuration
  - Macro-based configuration for model, temperature, prompts, tools
  - Automatic initialization with `new/1` and agent execution
  - Background, steps, and output_instructions directives
- Workflow DSL: Compose multi-agent workflows
  - `Normandy.DSL.Workflow` - `workflow do ... end` blocks
  - Sequential execution: `step :name do ... end`
  - Parallel execution: `parallel :name do ... end`
  - Race patterns: `race :name do ... end`
  - Data flow: `input(from: :step_name)` or static values
  - Result transformation: `transform fn ... end`
  - Conditional execution: `when_result do ... end`
  - Automatic step orchestration and error handling
- Pattern Matching Helpers: Utilities for result tuples
  - `Normandy.Coordination.Pattern` - Ergonomic `{:ok, value} | {:error, reason}` handling
  - Type checking: `ok?/1`, `error?/1`
  - Value extraction: `ok!/2`, `error!/2`, `unwrap!/1`
  - Filtering lists: `filter_ok/1`, `filter_errors/1`
  - Transformations: `map_ok/2`, `map_error/2`
  - Composition: `then/2`, `find_ok/1`, `collect_ok/1`, `all_ok/1`, `all_ok_map/1`
  - Wrapping utilities: `wrap/1`, `try_wrap/1`
Reactive Coordination Patterns
- `Normandy.Coordination.Reactive` - Concurrent agent execution primitives
  - `race/3` - Return first successful result from multiple agents
  - `all/3` - Wait for all agents with optional fail-fast mode
  - `some/4` - Quorum pattern (wait for N successful results)
  - `map/3` - Transform agent results
  - `when_result/3` - Conditional execution based on results
Agent Pool Management
- `Normandy.Coordination.AgentPool` - Connection pool pattern for agents
  - Transaction-based API with automatic checkout/checkin
  - Manual checkout/checkin for advanced use cases
  - Configurable pool size with overflow support
  - LIFO/FIFO checkout strategies
  - Automatic agent replacement on failure
  - Pool statistics and monitoring
  - Non-blocking checkout with timeout support
Core Foundation (Phases 1-7)
- Schema System: Macro-based DSL for defining agent I/O schemas with JSON Schema generation
  - `Normandy.Schema` module with `schema` and `io_schema` macros
  - Type system with casting, dumping, and loading via `Normandy.Type`
  - Changeset-like validation with `Normandy.Validate`
  - Support for parameterized and custom types
- Agent System: Core agent implementation with LLM integration
  - `Normandy.Agents.BaseAgent` with init, run, and get_response methods
  - `Normandy.Agents.BaseAgentConfig` for agent state management
  - Context provider system for dynamic prompt injection
  - Tool/function calling support via `Normandy.Agents.ToolCallResponse`
- Memory Management: Conversational history tracking
  - `Normandy.Components.AgentMemory` with turn-based organization
  - Message serialization and deserialization
  - Configurable message history limits
- Prompt System: Structured prompt generation
  - `Normandy.Components.SystemPromptGenerator` with section-based prompts
  - `Normandy.Components.PromptSpecification` for prompt structure
  - `Normandy.Components.ContextProvider` protocol for dynamic context
- Streaming Responses: Real-time LLM response streaming
  - Streaming support in `Normandy.Agents.BaseAgent`
  - Callback-based streaming with arity-2 callback support
- Resilience Patterns: Fault tolerance and reliability
  - `Normandy.Resilience.Retry` with exponential backoff
  - `Normandy.Resilience.CircuitBreaker` for preventing cascade failures
  - Integration with BaseAgent for automatic retry on failures
- Context Window Management: Intelligent conversation management
  - `Normandy.Context.WindowManager` for automatic context management
  - `Normandy.Context.TokenCounter` for accurate token counting
  - `Normandy.Context.Summarizer` for conversation summarization
  - Support for Claude's prompt caching (up to 90% cost reduction)
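For readers unfamiliar with the retry-with-exponential-backoff technique listed under Resilience Patterns, a generic sketch follows. It is not the `Normandy.Resilience.Retry` API: the module `RetrySketch`, the function `with_backoff/4`, and the doubling schedule without jitter are all assumptions made for illustration.

```elixir
defmodule RetrySketch do
  # Retries `fun` until it returns {:ok, _} or attempts run out, doubling the
  # delay after each failure (base_ms, 2 * base_ms, 4 * base_ms, ...).
  def with_backoff(fun, max_attempts, base_ms, attempt \\ 1) do
    case fun.() do
      {:ok, value} ->
        {:ok, value}

      {:error, reason} when attempt >= max_attempts ->
        {:error, reason}

      {:error, _reason} ->
        Process.sleep(base_ms * Integer.pow(2, attempt - 1))
        with_backoff(fun, max_attempts, base_ms, attempt + 1)
    end
  end
end

# A flaky operation that fails twice and succeeds on the third call.
{:ok, counter} = Agent.start_link(fn -> 0 end)

flaky = fn ->
  case Agent.get_and_update(counter, fn n -> {n + 1, n + 1} end) do
    n when n < 3 -> {:error, :unavailable}
    n -> {:ok, n}
  end
end

RetrySketch.with_backoff(flaky, 5, 10)
# => {:ok, 3}
```

Production retry helpers usually add random jitter and a maximum delay cap; both are omitted here to keep the schedule easy to follow.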
Multi-Agent Coordination (Phase 8)
- Agent Communication: Message-based agent-to-agent communication
  - `Normandy.Coordination.AgentMessage` for structured messaging
  - `Normandy.Coordination.SharedContext` for stateless context sharing
  - `Normandy.Coordination.StatefulContext` (GenServer + ETS) for stateful sharing
- Orchestration Patterns: Multiple coordination strategies
  - `Normandy.Coordination.SequentialOrchestrator` for pipeline execution
  - `Normandy.Coordination.ParallelOrchestrator` for concurrent execution
  - `Normandy.Coordination.HierarchicalCoordinator` for manager-worker patterns
  - Simple and advanced APIs for flexible usage
- Agent Processes: OTP-based agent supervision
  - `Normandy.Coordination.AgentProcess` (GenServer wrapper)
  - `Normandy.Coordination.AgentSupervisor` (DynamicSupervisor)
  - Fault tolerance with Elixir/OTP patterns
Batch Processing
- Concurrent Processing: Efficient batch agent execution
  - `Normandy.Batch.Processor` for concurrent batch processing
  - Configurable concurrency limits
  - Result aggregation and error handling
Integration & Testing (Phase 8.5)
- Integration Tests: Comprehensive real-world testing
  - 56 integration tests with real Anthropic API calls
  - Test helpers: `IntegrationHelper` and `NormandyIntegrationHelper`
  - Tag-based test exclusion (`@moduletag :api`, `@moduletag :integration`)
  - Coverage for multi-agent workflows, resilience, caching, and batch processing
- LLM Client Integration: Claudio HTTP client migration
  - Updated to Claudio v0.1.1 from hex.pm
  - Migrated from Tesla to Req HTTP client
  - Streaming error handling for `Req.Response.Async`
Fixed
- Orchestrator APIs: Fixed `extract_result` to return full response maps instead of just `chat_message` strings
- Function clause matching: Improved pattern matching for simple vs advanced orchestrator APIs
- Streaming callbacks: Fixed arity-2 callback support for streaming responses
Documentation
- Comprehensive README with usage examples
- Project roadmap (ROADMAP.md) tracking implementation phases
- MIT License
- Hex.pm package metadata and documentation configuration
Dependencies
- `elixir_uuid` ~> 1.2 - UUID generation for conversation turns
- `poison` ~> 6.0 - JSON encoding/decoding
- `claudio` ~> 0.1.1 - Anthropic Claude API client
- `dialyxir` ~> 1.4 (dev/test) - Static analysis
- `stream_data` ~> 1.1 (dev/test) - Property-based testing
- `ex_doc` ~> 0.34 (dev) - Documentation generation
Test Coverage
- 443 unit tests (29 doctests + 21 properties + 393 tests)
- 62 integration tests (56 API + 6 comprehensive DSL tests, excluded by default)
- Total: 505 tests, all passing
- New test files:
  - `test/coordination/pattern_test.exs` (13 tests)
  - `test/coordination/reactive_test.exs` (33 tests)
  - `test/coordination/agent_pool_test.exs` (30 tests)
  - `test/dsl/agent_test.exs` (8 tests)
  - `test/dsl/workflow_test.exs` (14 tests)
  - `test/dsl/workflow_transform_integration_test.exs` (4 tests)
  - `test/normandy_integration/dsl_comprehensive_test.exs` (6 comprehensive integration tests)