Squid Mesh is an embedded durable workflow runtime for Elixir applications. Workflows are declared as Elixir modules through a DSL, persisted through Jido journals, and executed by host-owned workers calling SquidMesh.execute_next/1.
The runtime stores workflow state, step attempts, retries, approvals, transitions, audit events, and recovery history in the host application's database through Jido.Storage and the default Ecto adapter. Squid Mesh does not run as a separate service, broker, or orchestration cluster. The host application retains its existing supervision tree, deployment model, repository, schedulers, and queue backend.
Storage portability is defined by the journal storage adapter contract, not by compatibility with arbitrary databases. The production relational implementation uses a Postgres-compatible Ecto adapter. See the storage strategy for adapter guarantees.
Squid Mesh manages workflow progression, transition routing, retry semantics, pause and approval handling, replay and recovery policy, durable execution history, and graph inspection. Queue delivery, worker supervision, and backend leasing remain host-owned concerns.
The runtime builds on Jido for actions, execution, and journaling; Runic for workflow planning; and Spark for the DSL authoring surface.
Adoption status
Squid Mesh provides a supported 0.1.x journal runtime for embedded host-app workflows. Treat production rollout as an application-owned integration: run the host-app smoke and resilience checks, review the operational boundaries, and adopt the queue/leasing strategy that matches your deployment. See Production Readiness for the current baseline.
Start Here
The fastest way to start is the guided Livebook. It demonstrates creating a workflow, starting a journal-backed run, executing work with SquidMesh.execute_next/1, and inspecting the durable result.
| Goal | Resource |
|---|---|
| Run a guided interactive example | Getting Started Livebook |
| Integrate Squid Mesh into an existing application | Getting Started guide |
| Review a complete working example | Minimal host app |
The written guide covers installation, workflow creation, journal execution, run inspection, retries, manual gates, cron triggers, and Bedrock-backed leases.
Jido Primitive Boundary
Squid Mesh uses Jido as an internal runtime foundation while keeping the public workflow API focused on Squid Mesh concepts. The runtime uses these Jido primitives:
| Jido primitive | Squid Mesh use |
|---|---|
| Jido.Agent | Rebuildable workflow and dispatch coordination state |
| Jido.Action | Step execution interop, including raw Jido action modules and the native SquidMesh.Step adapter |
| Jido.Storage | Journal and checkpoint persistence boundary |
| Jido.Thread / Jido.Thread.Entry | Durable journal facts for run, dispatch, index, and catalog threads |
| Jido.Exec | Action execution inside the journal executor |
| Jido.Signal | Interop boundary envelope for Squid Mesh runtime command signals |
Support code uses lower-level primitives such as Jido.Thread.EntryNormalizer and validates built-in storage adapters like Jido.Storage.File and Jido.Storage.Redis. Workflow authors do not need to use these primitives directly.
Runtime command signals use SquidMesh.Runtime.Signal as the stable contract. SquidMesh.Runtime.Signal.JidoAdapter converts between SquidMesh.Runtime.Signal structs and Jido.Signal envelopes for advanced runtime integration. Public callers use Squid Mesh APIs directly and do not need to construct raw Jido.Signal values.
Journal-backed runtime commands are persisted as run-thread command receipts before their lifecycle facts. SquidMesh.inspect_run/2 exposes command history through snapshot.command_history, including signal type, payload, actor and comment when supplied, redacted metadata, idempotency key when relevant, and occurrence time.
Getting Started
Documentation and examples:
| Reference | Description |
|---|---|
| Getting Started | Setup and first workflow run |
| Workflow Authoring | Triggers, steps, transitions, retries, and compensation |
| Host App Integration | Phoenix and OTP integration |
| Reference Workflows | Approval, recovery, saga, and cron examples |
| Minimal Host App | Executable example application |
| Bedrock Minimal Host App | Backend-owned delivery with leases and retry requeue |
| Architecture | Runtime flow and component boundaries |
| Positioning Guide | Comparison with adjacent projects |
Installation
Add Squid Mesh to your dependencies:
defp deps do
[
{:squid_mesh, "~> 0.1.0"}
]
end
If your host application defines raw Jido.Action modules directly, add :jido explicitly as well:
defp deps do
[
{:jido, "~> 2.0"},
{:squid_mesh, "~> 0.1.0"}
]
end
Configure the repo and default queue:
config :squid_mesh,
repo: MiddleEarth.Repo,
queue: "default"
Install and run the migration:
mix deps.get
mix squid_mesh.install
mix ecto.migrate
To keep workflow modules formatted consistently as DSL-style declarations, import Squid Mesh formatter rules in .formatter.exs:
[
import_deps: [:squid_mesh],
inputs: ["{mix,.formatter}.exs", "{config,lib,test}/**/*.{ex,exs}"]
]
Finally, start one supervised worker loop. See Host App Integration for a minimal worker shape.
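As a rough illustration of what a supervised worker loop could look like, the sketch below polls an injected executor function. The module name MyApp.SquidMeshWorker, the :execute, :execute_opts, and :idle_ms options are all hypothetical and not part of Squid Mesh; only the {:ok, :none}, {:ok, snapshot}, and {:error, reason} return shapes are taken from this README.

```elixir
defmodule MyApp.SquidMeshWorker do
  @moduledoc "Hypothetical polling worker sketch; not shipped with Squid Mesh."
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(opts) do
    state = %{
      # A 1-arity function, e.g. &SquidMesh.execute_next/1 in a real host app.
      execute: Keyword.fetch!(opts, :execute),
      execute_opts: Keyword.get(opts, :execute_opts, []),
      idle_ms: Keyword.get(opts, :idle_ms, 1_000)
    }

    {:ok, state, {:continue, :poll}}
  end

  @impl true
  def handle_continue(:poll, state), do: poll(state)

  @impl true
  def handle_info(:poll, state), do: poll(state)

  defp poll(state) do
    case state.execute.(state.execute_opts) do
      {:ok, :none} ->
        # Nothing visible to run: wait before polling again.
        Process.send_after(self(), :poll, state.idle_ms)
        {:noreply, state}

      {:ok, _snapshot} ->
        # Progress was persisted: immediately look for more work.
        {:noreply, state, {:continue, :poll}}

      {:error, _reason} ->
        # Executor failure: back off and let the next poll try again.
        Process.send_after(self(), :poll, state.idle_ms)
        {:noreply, state}
    end
  end
end
```

Under these assumptions, a host application might add a child such as {MyApp.SquidMeshWorker, execute: &SquidMesh.execute_next/1, execute_opts: [owner_id: "my-app-worker-1"]} to its supervision tree. Injecting the function keeps the loop testable without a database.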
Optional: Bedrock Job Runner And Leases
Use Bedrock when the host application needs backend-owned delivery, delayed visibility, job leases, heartbeat/lease extension, retry requeue, and recovery. Keep workflow modules backend-neutral; Bedrock belongs behind host adapter modules.
If a simple supervised process can call SquidMesh.execute_next/1 often enough
for your workload, start there. Add Bedrock only when the host needs a durable
job runner to own payload delivery, delayed visibility, worker leases, and
redelivery after worker or node failure.
At a high level:
- Configure Squid Mesh with the host repo and the journal queue used by the Bedrock payload worker.
- Configure a Bedrock queue for Squid Mesh payload delivery.
- Start the host repo, Bedrock cluster, and Bedrock job queue under the host application's supervision tree.
- Add a delivery adapter that maps cron payloads to Bedrock jobs.
- Add a payload worker that calls SquidMesh.execute_next/1 while the Bedrock job lease is held.
- Configure both lease layers: the Bedrock job lease for payload delivery and heartbeat_interval_ms for Squid Mesh journal attempt claims.
Those leases are separate. The Bedrock lease protects job delivery; the Squid
Mesh heartbeat protects the workflow attempt claimed by execute_next/1.
The payload worker is the executor boundary. Keep these responsibilities separate:
| Concern | Owner |
|---|---|
| Persisted workflow state, step attempts, step retry policy, terminal run status | Squid Mesh |
| Claiming and executing the next visible workflow attempt | SquidMesh.execute_next/1 |
| Keeping a long-running workflow attempt claim alive | heartbeat_interval_ms passed to execute_next/1 |
| Payload delivery, delayed visibility, job leases, and redelivery after worker failure | Bedrock |
Do not enqueue one Bedrock job per workflow step, and do not model workflow step
retries as Bedrock job retries. A normal step failure, retry, or terminal run is
durable Squid Mesh state returned by SquidMesh.execute_next/1. Bedrock should
retry only job-level delivery failures, such as a crashed payload worker or a
transient backend error before the worker can finish draining journal attempts.
The payload worker should usually treat {:ok, snapshot} from
execute_next/1 as successful job progress even when the snapshot describes a
failed workflow run. Return {:error, reason} to Bedrock only when the payload
delivery or journal drain itself failed and should be redelivered.
The host-owned wiring looks like this in shape:
# config/config.exs
config :squid_mesh,
repo: MyApp.Repo,
queue: "tenant_a"
config :my_app, MyApp.SquidMeshDeliveryAdapter,
queue_id: "tenant_a",
topic: "squid_mesh:payload"
config :my_app, MyApp.Jobs.SquidMeshPayload,
journal_heartbeat_interval_ms: 10_000,
max_journal_attempts: 50
defmodule MyApp.Jobs.SquidMeshPayload do
use Bedrock.JobQueue.Job,
topic: "squid_mesh:payload",
# Job retry covers payload delivery only. Step retry lives in the workflow DSL.
max_retries: 3
alias SquidMesh.Runtime.Runner
def perform(payload, _meta) when is_map(payload) do
case Runner.perform(payload) do
:ok -> drain_journal("tenant_a", 0)
{:ok, _snapshot} -> drain_journal("tenant_a", 0)
{:error, reason} -> {:error, reason}
end
end
defp drain_journal(_queue, 50), do: {:error, :journal_drain_limit_exceeded}
defp drain_journal(queue, count) do
case SquidMesh.execute_next(
queue: queue,
owner_id: "my-app-bedrock-worker",
heartbeat_interval_ms: 10_000
) do
{:ok, :none} -> :ok
# The snapshot may be completed, failed, paused, or still running.
# It is still successful job progress because Squid Mesh persisted it.
{:ok, _snapshot} -> drain_journal(queue, count + 1)
# Return an error only for executor/drain failures Bedrock should redeliver.
{:error, reason} -> {:error, reason}
end
end
end
For the concrete setup, see Bedrock Lease Backend Setup and the Bedrock Minimal Host App.
Workflows
Workflows are Elixir modules. A trigger declares the entrypoint and validates the payload before the run is persisted. Steps declare their inputs, outputs, retry policy, and compensation behavior. Transitions wire them together.
This workflow demonstrates manual gates, approval flows, conditional routing, retries, saga compensation, and irreversible steps:
defmodule MiddleEarth.Workflows.RingErrand do
use SquidMesh.Workflow
workflow do
trigger :leave_shire do
manual()
payload do
field :bearer, :string, default: "Frodo"
field :ring_id, :string
field :route_preference, :string, default: "moria"
end
end
step :pack_provisions, Hobbiton.Steps.PackProvisions,
output: :provisions
step :hide_at_prancing_pony, :pause
approval_step :council_vote,
output: :council,
deadline: [within: 300_000, due_soon: 60_000, escalation: :operator_action]
step :choose_path, Rivendell.Steps.ChoosePath,
input: [bearer: [:bearer], decision: [:council, :decision]],
output: :route
step :cross_moria, Fellowship.Steps.CrossMoria,
input: [:bearer, :provisions, :route],
retry: [max_attempts: 3, backoff: [type: :exponential]],
deadline: [within: 30_000, due_soon: 5_000, escalation: :diagnostic]
step :reserve_eagle, Eagles.Steps.ReserveRide,
compensate: Eagles.Steps.CancelRide
step :toss_ring, Mordor.Steps.TossRing,
irreversible: true
transition :pack_provisions, on: :ok, to: :hide_at_prancing_pony
transition :hide_at_prancing_pony, on: :ok, to: :council_vote
transition :council_vote, on: :ok, to: :choose_path
transition :choose_path, on: :ok, to: :cross_moria
transition :cross_moria, on: :ok, to: :reserve_eagle
transition :cross_moria, on: :error, to: :complete, recovery: :undo
transition :reserve_eagle, on: :ok, to: :toss_ring
transition :toss_ring, on: :ok, to: :complete
end
end
Steps and approvals can declare diagnostic deadlines with deadline: [...].
Squid Mesh persists the due timestamps in runnable and manual-control facts and
surfaces evaluated states such as :on_time, :due_soon, :overdue, and
:escalated through list_runs/2, inspect_run/2,
inspect_run_graph/2, and explain_run/2. Alert delivery, paging, and
operator escalation remain host-owned; the runtime only records durable
deadline evidence and safe next actions.
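To make the evaluated states concrete, here is a minimal sketch of how a deadline could be classified. It assumes that within and due_soon are millisecond durations, that within is measured from the moment the step became runnable, and that due_soon is a warning window before the due time. Those semantics, and the DeadlineSketch module itself, are assumptions for illustration; the :escalated state is not modeled because this README does not say when it is reached.

```elixir
defmodule DeadlineSketch do
  @moduledoc "Hypothetical illustration; not the Squid Mesh deadline evaluator."

  # started_at_ms and now_ms are millisecond timestamps from the same clock.
  def state(started_at_ms, deadline, now_ms) do
    due_at = started_at_ms + Keyword.fetch!(deadline, :within)
    warn_at = due_at - Keyword.get(deadline, :due_soon, 0)

    cond do
      now_ms >= due_at -> :overdue
      now_ms >= warn_at -> :due_soon
      true -> :on_time
    end
  end
end

deadline = [within: 30_000, due_soon: 5_000, escalation: :diagnostic]

DeadlineSketch.state(0, deadline, 10_000) #=> :on_time
DeadlineSketch.state(0, deadline, 26_000) #=> :due_soon
DeadlineSketch.state(0, deadline, 31_000) #=> :overdue
```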
Cron-triggered workflows use scheduling declarations:
defmodule Gondor.Workflows.BeaconWatch do
use SquidMesh.Workflow
workflow do
trigger :nightly_beacon_check do
cron "0 21 * * *", timezone: "Etc/UTC"
payload do
field :beacon_count, :integer, default: 7
end
end
step :inspect_hilltops, Gondor.Steps.InspectHilltops,
retry: [max_attempts: 3]
step :light_beacon, Gondor.Steps.LightBeacon,
compensate: Gondor.Steps.ExtinguishBeacon
transition :inspect_hilltops, on: :ok, to: :light_beacon
transition :light_beacon, on: :ok, to: :complete
end
end
Dependency-based workflows use after: [...] for parallel execution:
defmodule Gondor.Workflows.ParallelAttack do
use SquidMesh.Workflow
workflow do
trigger :start do
manual()
end
step :march_to_gate, Gondor.Steps.MarchToGate
step :rally_rohan, Rohan.Steps.RallyArmy
step :distract_sauron, Fellowship.Steps.DistractEnemy
step :declare_victory, Gondor.Steps.DeclareVictory,
after: [:march_to_gate, :rally_rohan, :distract_sauron]
end
end
Running Workflows
Start a workflow run:
{:ok, run} =
SquidMesh.start(
MiddleEarth.Workflows.RingErrand,
:leave_shire,
%{ring_id: "one-ring"}
)
Inspect a run with full history:
SquidMesh.inspect_run(run.run_id, include_history: true)
Get an operator-facing explanation:
{:ok, explanation} = SquidMesh.explain_run(run.run_id)
explanation.reason #=> :waiting_for_retry
explanation.evidence.command_counts #=> %{"start_run" => 1, "cancel_run" => 2}
The explain_run/2 function summarizes the current state, valid next actions, and supporting evidence for dashboards and operational tooling.
Approvals and Manual Gates
Pause steps and approval steps block progression until explicitly resolved:
# Resume a paused step
SquidMesh.resume(run.run_id, %{actor: "strider", reason: "ready to proceed"})
# Approve or reject an approval gate
SquidMesh.approve(run.run_id, %{actor: "elrond", note: "approved"})
SquidMesh.reject(run.run_id, %{actor: "elrond", note: "rejected"})
For idempotent command delivery, use explicit runtime signals:
alias SquidMesh.Runtime.Signal
{:ok, signal} =
Signal.approve_run(run.run_id, %{actor: "elrond", note: "approved"},
idempotency_key: "approval-#{run.run_id}"
)
{:ok, approved_run} = SquidMesh.apply_signal(signal)
Reusing an idempotency key returns the existing result without creating duplicate command receipts. Approval steps persist their resolved targets and output metadata, surviving deploys and restarts.
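The idempotency behavior can be pictured with a small in-memory model. This is only a sketch of the general technique: the IdempotentReceipts module and its map-based store are hypothetical, and the real runtime persists receipts in the journal rather than in a map.

```elixir
defmodule IdempotentReceipts do
  @moduledoc "Hypothetical in-memory model of idempotent command receipts."

  # receipts maps idempotency_key => stored result.
  def apply_command(receipts, key, fun) when is_map(receipts) and is_function(fun, 0) do
    case Map.fetch(receipts, key) do
      {:ok, existing} ->
        # Key already seen: return the stored result and write nothing new.
        {existing, receipts}

      :error ->
        # First delivery: run the command and remember its result under the key.
        result = fun.()
        {result, Map.put(receipts, key, result)}
    end
  end
end

{first, receipts} =
  IdempotentReceipts.apply_command(%{}, "approval-run_1", fn -> {:ok, :approved} end)

# The second delivery never runs its function because the key is already recorded.
{second, receipts} =
  IdempotentReceipts.apply_command(receipts, "approval-run_1", fn -> raise "must not run twice" end)
```

Here second equals first and the store still holds a single receipt, which mirrors the "existing result, no duplicate receipt" contract described above.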
Compensation and Recovery
Workflow authors can mark completed side effects as compensatable so operators and host tools can see the rollback contract when later work fails:
step :borrow_rope, Lothlorien.Steps.BorrowRope,
compensate: Lothlorien.Steps.ReturnRope
step :reserve_eagle, Eagles.Steps.ReserveRide,
compensate: Eagles.Steps.CancelRide
step :cross_moria, Fellowship.Steps.CrossMoria,
retry: [max_attempts: 3]
A failed :cross_moria exposes the completed compensatable steps and their
declared callbacks through inspect_run/2, inspect_run_graph/2, and
explain_run/2. The callback metadata is persisted with each runnable so
dashboards can show rollback availability even if the workflow module changes.
For side effects that cannot be reversed, mark steps as irreversible: true or compensatable: false. Squid Mesh exposes these boundaries during inspection and blocks replay by default after irreversible execution.
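The replay rule can be sketched as a guard over the steps a run has already completed. The ReplayGuard module, the step map shape, and the error tuple are hypothetical; only the irreversible: true marker and the allow_irreversible: true override come from this README.

```elixir
defmodule ReplayGuard do
  @moduledoc "Hypothetical illustration of blocking replay after irreversible steps."

  def check(completed_steps, opts \\ []) do
    # Collect the names of completed steps that were declared irreversible.
    irreversible = for %{irreversible: true, name: name} <- completed_steps, do: name

    cond do
      irreversible == [] -> :ok
      Keyword.get(opts, :allow_irreversible, false) -> :ok
      true -> {:error, {:irreversible_steps_executed, irreversible}}
    end
  end
end

steps = [%{name: :reserve_eagle}, %{name: :toss_ring, irreversible: true}]

ReplayGuard.check(steps)
#=> {:error, {:irreversible_steps_executed, [:toss_ring]}}

ReplayGuard.check(steps, allow_irreversible: true)
#=> :ok
```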
Child Workflows
Steps can spawn child workflow runs for dynamic work expansion:
defmodule Hobbiton.Steps.SendInvites do
use SquidMesh.Step, name: :send_invites
@impl true
def run(%{party_id: party_id, guests: guests}, %SquidMesh.Step.Context{} = context) do
children =
for guest <- guests do
{:ok, child} =
SquidMesh.start_child_run(
context,
Hobbiton.Workflows.DeliverInvite,
%{party_id: party_id, guest_id: guest.id},
child_key: "invite_#{guest.id}"
)
child.run_id
end
{:ok, %{child_run_ids: children}}
end
end
Each child run has independent inspection, retry, replay, and cancellation. Repeating the same child_key returns the existing child instead of creating duplicates.
Inspectable Dynamic Work
Host code can preview, record, or schedule bounded dynamic work for an active run. Preview is read-only, record persists inspection metadata, and schedule persists the same dynamic-work fact while planning executable runnable intents:
registry = %{"digest.deliver" => MyApp.Steps.DeliverDigest}
{:ok, preview} =
SquidMesh.preview_dynamic_work(
run.run_id,
%{
dynamic_key: "subscription_digest_fanout",
origin: %{
runnable_key: "run_123:schedule_digest:1",
step: "schedule_digest",
attempt: 1
},
reason: :runtime_fanout,
nodes: [
%{id: "deliver_digest:chat_1", action: "digest.deliver"}
]
},
action_registry: registry
)
preview.origin_node_id
preview.added_node_ids
preview.added_edge_ids
preview.recordable?
preview.graph.nodes
After previewing, choose one durable write path. Use record_dynamic_work/3
when the dynamic structure should be inspectable only:
{:ok, snapshot} =
SquidMesh.record_dynamic_work(
run.run_id,
%{
dynamic_key: "subscription_digest_fanout",
origin: %{
runnable_key: "run_123:schedule_digest:1",
step: "schedule_digest",
attempt: 1
},
reason: :runtime_fanout,
nodes: [
%{id: "deliver_digest:chat_1", action: "digest.deliver"}
]
},
action_registry: registry
)
Use schedule_dynamic_work/3 instead when the dynamic nodes should execute:
{:ok, snapshot} =
SquidMesh.schedule_dynamic_work(
run.run_id,
%{
dynamic_key: "subscription_digest_fanout",
origin: %{
runnable_key: "run_123:schedule_digest:1",
step: "schedule_digest",
attempt: 1
},
reason: :runtime_fanout,
nodes: [
%{
id: "deliver_digest:chat_1",
action: "digest.deliver",
input: %{subscription_id: "sub_123"}
}
]
},
action_registry: registry
)
preview_dynamic_work/3, record_dynamic_work/3, and
schedule_dynamic_work/3 share validation for stable ids, origin metadata,
nodes, and optional edges against the current run snapshot. Scheduled dynamic
work requires :action_registry; each executable dynamic node must include an
approved action key and may include an :input map for its attempt. The origin
runnable must already be applied before executable dynamic work can be
scheduled.
Preview returns the normalized dynamic work plus a graph overlay without
appending a journal fact. It also exposes stable overlay metadata for visual
editors: the producer node id, added node ids, added edge ids, whether recording
would append a new durable fact, and warnings such as duplicate dynamic work.
Recording appends only the durable inspection fact. Scheduling appends that fact
and planned runnable intents in one run-thread write; the normal
execute_next/1 worker path claims, executes, retries, applies, and inspects the
dynamic attempts. A scheduled dynamic node may opt into persisted retry with
retry: [max_attempts: n]. Dynamic edges are graph-inspection metadata for now;
scheduled dynamic nodes are queued as independent runnable intents. Dynamic
steps are replay-unsafe by default and require manual review before irreversible
replay. Recording and scheduling the same dynamic node are alternatives, not a
promotion flow; scheduling an already-recorded node with the same id is rejected
by duplicate-node validation. Terminal runs reject new dynamic work.
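Two of the validation rules described above, approved action keys and duplicate-node rejection, can be sketched over plain data. The DynamicWorkCheck module and its error atoms are hypothetical and cover only a fraction of what the runtime validates; the node and registry shapes follow the examples in this section.

```elixir
defmodule DynamicWorkCheck do
  @moduledoc "Hypothetical validator sketch; error shapes are illustrative only."

  # nodes: list of %{id: ..., action: ...}
  # existing_ids: MapSet of node ids already present on the run
  # registry: allowlist map of action key => step module
  def validate(nodes, existing_ids, registry) do
    ids = Enum.map(nodes, & &1.id)
    unknown = Enum.find(nodes, &(not Map.has_key?(registry, &1.action)))

    cond do
      ids != Enum.uniq(ids) ->
        {:error, :duplicate_node_in_request}

      Enum.any?(ids, &MapSet.member?(existing_ids, &1)) ->
        {:error, :node_already_recorded}

      unknown != nil ->
        {:error, {:unapproved_action, unknown.action}}

      true ->
        :ok
    end
  end
end

registry = %{"digest.deliver" => MyApp.Steps.DeliverDigest}
node = %{id: "deliver_digest:chat_1", action: "digest.deliver"}

DynamicWorkCheck.validate([node], MapSet.new(), registry)
#=> :ok

DynamicWorkCheck.validate([node], MapSet.new(["deliver_digest:chat_1"]), registry)
#=> {:error, :node_already_recorded}
```

The second call models why recording and scheduling the same node id are alternatives: once the id exists on the run, a later write with that id fails the duplicate check.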
inspect_run_graph/2 also exposes dynamic_work_overlays so dashboards and
visual editors can show producer nodes, added node ids, and added edge ids
without reconstructing them from raw dynamic-work records.
Long-Running Steps
Workers can ask the journal executor to renew the active claim while a step is running:
SquidMesh.execute_next(
owner_id: "billing-worker-1",
lease_for: 30,
heartbeat_interval_ms: 10_000
)
The executor keeps raw claim tokens internal. Durable heartbeat entries store
only the claim-token hash and are fenced by the same claim id and token used for
completion or failure. The minimum heartbeat interval is 50ms; production
workers should choose a much larger interval relative to lease_for.
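The hash-fencing idea can be illustrated in isolation. This sketch assumes SHA-256 and a 32-byte random token, neither of which is confirmed by this README, and the ClaimFence module is hypothetical. It shows only the general pattern: the durable record holds a hash, and a heartbeat or completion is accepted only when the caller presents the matching claim id and raw token.

```elixir
defmodule ClaimFence do
  @moduledoc "Hypothetical illustration of hash-fenced claims; not Squid Mesh internals."

  # Returns the raw token (kept by the executor) and the record safe to persist.
  def new_claim(claim_id) do
    token = :crypto.strong_rand_bytes(32)
    {token, %{claim_id: claim_id, token_hash: hash(token)}}
  end

  # A heartbeat or completion is accepted only for the same claim id and token.
  def authorized?(stored, claim_id, token) do
    stored.claim_id == claim_id and stored.token_hash == hash(token)
  end

  defp hash(token), do: :crypto.hash(:sha256, token)
end

{token, stored} = ClaimFence.new_claim("claim_1")

ClaimFence.authorized?(stored, "claim_1", token)         #=> true
ClaimFence.authorized?(stored, "claim_1", "stale-token") #=> false
```

A production implementation would likely also want a constant-time comparison for the hashes; plain == is used here to keep the sketch short.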
Runtime-Authored Specs
Host-owned editors or databases can activate validated workflow specs without runtime code generation. Use stable action keys, resolve them through an allowlist, then start the resolved spec through the public API:
registry = %{"digest.record_delivery" => MyApp.Steps.RecordDigestDelivery}
:ok = SquidMesh.Workflow.validate_spec(spec, action_registry: registry)
{:ok, run} =
SquidMesh.start_spec(spec, :manual_digest, payload,
action_registry: registry
)
Squid Mesh persists the resolved definition with the run so workers and
inspect_run_graph/2 can inspect and execute it later. Replay for
runtime-authored spec runs is intentionally rejected until that lifecycle is
supported.
Visual-editor JSON can use the same host-owned action allowlist before a draft graph with top-level action keys is accepted:
:ok = SquidMesh.Workflow.EditorSpec.validate_map(editor_map, action_registry: registry)
{:ok, graph} = SquidMesh.Workflow.EditorSpec.preview_graph(editor_map, action_registry: registry)
{:ok, diff} = SquidMesh.Workflow.EditorSpec.diff(source_spec, editor_map, action_registry: registry)
These editor APIs still validate, preview, and compare data only. Starting a
runtime-authored run remains the separate start_spec/3 or start_spec/4
boundary.
Cancellation, Replay, and Listing
{:ok, running_runs} = SquidMesh.list_runs(status: :running)
{:ok, _} = SquidMesh.cancel(run.run_id)
{:ok, _} = SquidMesh.replay(run.run_id)
# Replay past irreversible steps requires an explicit override
{:ok, _} = SquidMesh.replay(run.run_id, allow_irreversible: true)
Graph Inspection
Inspect the workflow graph with execution state:
{:ok, graph} = SquidMesh.inspect_run_graph(run.run_id)
graph
|> SquidMesh.Runs.GraphInspection.to_map()
|> Map.take([:status, :current_node_ids, :nodes, :edges])
The graph includes nodes, edges, and the selected transition path for conditional routing.
Nested workflow starts stay as separate runs; parent graph maps include
child_links so dashboards and visual editors can render subflow links without
treating child workflows as inline executable nodes.
Node Visibility and Redaction
Graph nodes can include host-domain inputs, outputs, errors, manual metadata,
and dynamic-work metadata. By default, inspect_run_graph/2 omits detailed
payload fields; request include_history: true only for trusted operator
surfaces.
Before exposing graph payloads outside a trusted boundary, apply a host-owned visibility policy:
{:ok, graph} = SquidMesh.inspect_run_graph(run.run_id, include_history: true)
{:ok, visible_graph} =
SquidMesh.ReadModel.Visibility.redact(graph, current_actor, MyApp.VisibilityPolicy)
External/operator views preserve node ids, status, current state, recovery availability, dynamic-work shape, and safe edge topology while removing node payloads, errors, attempt internals, command history, and sensitive metadata.
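The effect of redaction on a graph can be pictured with plain maps. The RedactionSketch module and the field names below are hypothetical; the real graph structs and the exact set of preserved fields are defined by Squid Mesh and the host policy. The sketch keeps an allowlist of safe node fields and leaves edge topology untouched.

```elixir
defmodule RedactionSketch do
  @moduledoc "Hypothetical redaction over plain maps; field names are illustrative."

  # Allowlist rather than denylist: unknown fields are dropped by default.
  @safe_node_fields [:id, :status]

  def redact(graph) do
    %{graph | nodes: Enum.map(graph.nodes, &Map.take(&1, @safe_node_fields))}
  end
end

graph = %{
  status: :running,
  nodes: [
    %{id: "cross_moria", status: :failed, input: %{bearer: "Frodo"}, error: "balrog"}
  ],
  edges: [%{from: "choose_path", to: "cross_moria"}]
}

RedactionSketch.redact(graph).nodes
#=> [%{id: "cross_moria", status: :failed}]
```

An allowlist fails closed: a payload field added to nodes later stays hidden until someone explicitly marks it safe.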
Actor Visibility
Squid Mesh provides built-in support for actor-scoped visibility to safely expose workflow data to different users. The runtime tracks actor information in manual actions and provides flexible redaction policies:
# Define a visibility policy
defmodule MyApp.VisibilityPolicy do
@behaviour SquidMesh.ReadModel.Visibility.Policy
def visibility_scope(actor, _view) do
cond do
actor.role == "admin" -> :auditor # Full access
actor.role == "support" -> :operator # Operational details
true -> :external # Minimal information
end
end
end
# Apply redaction at API boundaries
{:ok, snapshot} = SquidMesh.inspect_run(run_id)
safe_view = SquidMesh.ReadModel.Visibility.redact(snapshot, current_user, MyApp.VisibilityPolicy)
The three standard scopes provide appropriate data access:
- :external - High-level status only, all sensitive data redacted
- :operator - Includes operational metrics and debugging information
- :auditor - Complete unredacted access for privileged users
See the Actor Visibility Guide for comprehensive documentation on implementing multi-tenant access patterns, role-based visibility, and security best practices.
Optional Dashboard
SquidSonar is the optional read-only Phoenix LiveView dashboard for Squid Mesh. Mount it inside a Phoenix host application to inspect recent runs, filter by status, search runtime metadata, and view run detail pages with diagnosis, history counts, last error information, and workflow graph visualization.
Contributing
Please review the existing runtime model and workflow semantics before proposing substantial changes. Contributions are most welcome in: runtime reliability, workflow ergonomics, inspection tooling, recovery semantics, documentation improvements, backend integrations, and executable examples.
- Contributing Guide
- Code of Conduct
- Elixir Forum discussion thread
- GitHub Issues
- Squid Mesh channel on the Jido Discord
License
Copyright 2024, released under the Apache 2.0 License.