This document records the development milestones for Parapet v0.1–v0.9. These are planning tranches, not Hex release versions — the package was not published to hex.pm during this period.
For the changelog of published Hex releases (v0.10+), see CHANGELOG.md.
v0.9 — Performance, Scale & DX (2026-05-23)
- Shipped proactive TSDB cardinality protection: a
mix parapet.doctor cardinalitystatic analyzer plus a compile-time Parapet.Metrics.Validator enforcing a 10-label ceiling per metric, applied across all built-in metrics and adapter SLIs. - Delivered database scale and pruning: composite indexes for Incident, TimelineEntry, and
ToolAudit at over 100k rows, a Parapet.Evidence.Archiver with resolved-only retention, and
a
mix parapet.archivetask plus Oban cron worker that never prunes active investigating work. - Made the Operator UI responsive under load with bounded queue paging, index-aware Operator queries, and a 50k+ incident benchmark — and repaired the generated resolve flow so the active-to-resolved lifecycle is true again.
- Unified the Day-1 experience under
mix parapet.install, a deterministic Igniter orchestrator that chains spine/prometheus/ui with explicit opt-in extras, backed by severity-aware multi-nodemix parapet.doctorchecks (for example, Oban uniqueness). - Proved multi-node safety with Ecto-backed action claims and circuit breakers under concurrency simulation, plus an environment-conditional peer-node canary that skips cleanly without distributed Erlang.
- Hardened milestone closure: phases 6–14 backfilled milestone-grade verification surfaces, reconciled planning-artifact drift, tightened archive retention, and added a regression-catching closure-proof chain for the generated operator UI.
Stats: ~20,274 LOC (Elixir/EEx, lib+priv+test) · Phases 1–14 (5 core, 9 closure) · 36 plans · 88 commits · 2026-05-19 → 2026-05-23
v0.8 — Deterministic Escalation & Bounded Mitigation (2026-05-19)
- Built a durable Oban-backed escalation engine (Parapet.Escalation.Worker) that routes incidents to next tiers unless acknowledged or resolved.
- Implemented system-identity execution for Bounded Runbooks to safely perform auto-mitigations
using the Parapet.Operator API, logging all actions under the
:systemURN identity. - Created an Ecto-backed circuit breaker leveraging ToolAudit histories to prevent mitigation flap-loops: once a mitigation has run N times, the breaker trips and escalates instead.
- Updated the LiveView Operator UI to visualize escalation chains and distinctively style system-executed mitigations with manual trigger overrides for operator control.
Stats: ~13,900 LOC (Elixir/EEx) · Phases 1–4 · 8 plans · 2026-05-19
v0.7 — Async & Delivery Reliability (2026-05-18)
- Established safe telemetry contracts for Mailglass, Chimeway, and Rindle integrations to emit bounded async and delivery events using normalized event semantics for diverse external providers.
- Implemented out-of-the-box provider-first SLOs for async pipeline health and provider delivery states, including multi-burn-rate PromQL alerts.
- Created explicit fault-domain triage enrichment for async and delivery incidents, leveraging durable evidence — triage snapshot chronology — over UI-derived heuristics.
- Added safe, host-wired recovery runbook templates for stalled async work, covering dead-letter handling, provider outage recovery, stalled job cleanup, and callback delay flows.
Stats: ~13,401 LOC (Elixir/EEx) · Phases 4–7 · 12 plans · 2026-05-18
v0.6 — Change Correlation & Audit Trailing (2026-05-17)
- Implemented OpenTelemetry trace exemplar extraction from events and process dictionaries, appending trace identifiers to generated Prometheus metrics.
- Added trace identifier storage to Ecto Incident schemas and dynamically formatted trace links within the Operator UI for one-click navigation to external trace backends.
- Consumed Rulestead feature flag toggles via telemetry, creating durable timeline entries and suspect change markers to instantly correlate feature flag changes with SLO burn rates.
- Highlighted recent proximate system changes (like flag toggles) on active incidents in the Operator UI, distinguishing them visually from human actions.
- Implemented Parapet.Integrations.Threadline for compliance sync, mirroring Operator audit actions to Threadline event logs.
- Added dual audit modes (
:threadline_deferredand:dual_write) to satisfy strict compliance constraints, including bypassing internal Parapet storage entirely when deferred.
Stats: 8,968 LOC (Elixir/EEx) · Phases 1–3 · 9 plans · 2026-05-17
v0.5 — Proactive Resilience & Copilot Triage (2026-05-16)
- Implemented Parapet.Probe for defining and scheduling active synthetic canaries via NativeScheduler and ObanScheduler, enabling proactive health detection before alerts fire.
- Expanded Sigra and Accrue integrations to emit explicit login, signup, and checkout SLIs for business-critical journey monitoring.
- Built a Parapet MCP server to allow AI agents to safely read incident data and act as triage copilots, providing structured access without write permissions.
- Resolved compilation and type warnings across the project, achieving a clean zero-warning compilation state.
Stats: ~8,500 LOC (Elixir/EEx) · Phases 1–3 · 9 plans · 2026-05-16
v0.4 — Scoria AI Integration (2026-05-15)
- Implemented telemetry translation consuming Scoria.SRE.Telemetry events and producing Parapet Prometheus metrics and durable Ecto Incidents for AI infrastructure observability.
- Built Parapet.SLO.ScoriaEval to define and alert on Eval-Driven SLOs based on Scoria deterministic evaluation scores, with Grafana visualization for SLO error budget correlation.
- Added native tracking of AI Config Changes (scorer_version, baseline_version, model) to correlate configuration drift with SLO degradation.
- Monitored Scoria MCP tool failure modes (timeout, execution_failed, breaker_open, access_denied) as explicit SLIs using bounded atoms to protect Ecto from high-volume telemetry.
- Monitored Scoria workflow approval pauses as durable human-in-the-loop states, triggering alerts on stale requests, and extended the Operator UI with deep-links to Scoria's durable evidence.
Stats: 7,847 LOC (Elixir/EEx) · Phases 1–4 · 9 plans · 2026-05-15
v0.3 — Runbooks & Alert Routing (2026-05-12)
- Implemented a webhook receiver endpoint for Prometheus Alertmanager, automatically routing "firing" and "resolved" alerts to the durable Ecto Incident lifecycle with intelligent deduplication and correlation by alert name and labels.
- Created a structured Parapet.Runbook DSL for defining operator-triggered mitigation steps and attaching them based on SLOs or alert names.
- Extended the Operator UI to interactively display attached runbooks and execute one-click mitigations with complete ToolAudit logging.
- Built a modular Parapet.Notifier system with out-of-the-box Slack (Block Kit) and Microsoft Teams (Adaptive Cards) adapters to broadcast incident state changes and record timeline entries.
- Added UI capabilities for operators to explicitly acknowledge incidents and generate comprehensive Markdown retrospectives automatically.
Stats: 6,667 LOC (Elixir/EEx) · Phases 1–4 · 12 plans · 2026-05-12
v0.2 — Durable Spine and Operator UI (2026-05-11)
- Implemented the Parapet.Evidence context with Incident, TimelineEntry, and ToolAudit Ecto schemas for durable SRE tracking, separating ephemeral telemetry from low-volume Ecto data.
- Created
mix parapet.gen.spinegenerator to scaffold evidence migrations into host applications safely separated from high-volume telemetry. - Defined the Operator API with transactional audited commands and a WorkbenchContract for safe UI derivations.
- Created
mix parapet.gen.uito generate an isolated, secure, and visually responsive Phoenix LiveView Operator Workbench inside the host app. - Automated structural UI tests to guarantee responsive mobile and desktop layout fidelity without relying on human QA or full browser end-to-end tests.
- Implemented optional integration adapters for Mailglass, Chimeway, Accrue, Rindle, Threadline, and Rulestead leveraging a new capability registry, with all adapters compiling out cleanly when their sibling libraries are absent.
Stats: 3,164 LOC (Elixir/EEx) · Phases 1–3 · 11 plans · 2026-05-11
v0.1 — Trustworthy Spine (2026-05-10)
- Established the foundational Parapet telemetry contract, supervisor, and install generator, defining a documented telemetry surface treated as a public API with semver guarantees.
- Built core metrics instrumentation for HTTP, Ecto, and Oban safely via a robust API enforcing low-cardinality by default with explicit label contracts.
- Created an SLO DSL converting standard Elixir definitions to fully functional Prometheus recording and alerting rules.
- Delivered a seamless Day-1 experience with
mix parapet.doctorhealth checks and Grafana dashboard generation.
Stats: 1,992 LOC (Elixir) · Phases 1–4 · 15 plans · 2026-05-10