Elixir-native host infrastructure declarations, planning, and runtime control.

HostKit is intended to be used from a normal Mix project with .exs infrastructure files. The DSL compiles to plain inspectable structs; Mix tasks are wrappers around the runtime API.

Design

  • Core owns systemd/systemdkit persistent units.
  • Core owns unitctl transient runtime primitives.
  • Integrations such as Caddy, Forgejo, object storage, and monitoring are providers.
  • DSL evaluation never applies changes to a host.
  • Planning and rendering are available as runtime APIs.

Example

use HostKit.DSL

project :toys do
  roots source: "/opt/toys/src",
        data: "/srv/toys",
        state: "/var/lib/toys",
        config: "/etc/toys"

  prefixes user: "toys-", unit: "toys-"

  host :elixir_toys do
    hostname "elixir.toys"
    user "dannote"
    sudo true
  end

  service :exograph do
    account "toys-exograph", system: true, home: "/var/lib/toys/exograph/home"
    directory "/srv/toys/exograph", owner: "toys-exograph", group: "toys-exograph", mode: 0o755

    daemon "toys-exograph.service" do
      description "Exograph search"
      after_target :network_online
      wants :network_online
      service_user "toys-exograph"
      working_directory "/opt/toys/src/exograph"
      exec_start ["/usr/local/bin/mix", "exograph.index.hex", "--web", "--port", "4200"]
      restart :on_failure
      restart_sec 10
      hardening :web_service
      read_write_paths ["/srv/toys/exograph", "/var/lib/toys/exograph"]
      wanted_by :multi_user
    end
  end
end

Plans and down plans

Rollback is represented as another HostKit plan. A plan change already carries before and after state, so HostKit can derive a down plan from the exact plan that was applied:

{:ok, plan} = HostKit.plan(project, target: prod)
{:ok, down_plan} = HostKit.down(plan)

HostKit.format_plan(down_plan)
HostKit.apply(down_plan, confirm: true)

Partial rollback uses the same plan model:

{:ok, down_plan} =
  HostKit.down(plan, only: [{:file, "/etc/gatehouse/config.exs"}])
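
As a rough illustration of why before and after state is enough to derive a down plan, the sketch below inverts a list of changes in plain Elixir. The map shape, the DownPlanSketch module, and the :update and :delete action names are assumptions made for illustration; they are not HostKit's actual structs or API.

```elixir
defmodule DownPlanSketch do
  # Hypothetical change shape (not HostKit's real struct):
  #   %{resource: term, action: atom, before: term, after: term}

  # Build a down plan: undo changes in reverse order, optionally limited
  # to the resources listed under :only.
  def down(changes, opts \\ []) do
    only = Keyword.get(opts, :only)

    changes
    |> Enum.filter(fn change -> is_nil(only) or change.resource in only end)
    |> Enum.reverse()
    |> Enum.map(&invert/1)
  end

  # Swap before/after and flip the action so the change restores prior state.
  defp invert(%{action: action, before: before, after: after_state} = change) do
    %{change | action: invert_action(action), before: after_state, after: before}
  end

  defp invert_action(:create), do: :delete
  defp invert_action(:delete), do: :create
  defp invert_action(:update), do: :update
end

up_changes = [
  %{resource: {:directory, "/etc/gatehouse"}, action: :create, before: nil, after: %{mode: 0o755}},
  %{resource: {:file, "/etc/gatehouse/config.exs"}, action: :update, before: "port: 1", after: "port: 2"}
]

IO.inspect(DownPlanSketch.down(up_changes), label: "full down plan")

IO.inspect(
  DownPlanSketch.down(up_changes, only: [{:file, "/etc/gatehouse/config.exs"}]),
  label: "partial down plan"
)
```

Reversing the list matters because later changes may depend on earlier ones, for example a file inside a directory created by the same plan.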

Command-like operations need semantic down steps because HostKit cannot infer the opposite of an arbitrary command:

command :migrate,
  exec: {"bin/app", ["eval", "App.Release.migrate()"]},
  phase: :before_start,
  down: {"bin/app", ["eval", "App.Release.rollback()"]}

command :warm_cache,
  exec: {"bin/app", ["eval", "App.Cache.warm()"]},
  down: :noop

The down command is emitted as an ordinary command change in the down plan. Setting down: :irreversible instead records an explicit warning and omits the command from the down plan.
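
The handling of the different down values can be sketched in plain Elixir. This is a minimal sketch under stated assumptions: the command map shape and the CommandDownSketch module are hypothetical, treating :noop as "nothing to run" is an assumption, and so is reversing the command order.

```elixir
defmodule CommandDownSketch do
  # Hypothetical command shape: %{name: atom, down: term}.

  # Returns {down_steps, warnings} for a list of applied commands.
  def down_steps(commands) do
    commands
    # Assumption: down steps run in the reverse of the up order.
    |> Enum.reverse()
    |> Enum.reduce({[], []}, fn command, {steps, warnings} ->
      case command.down do
        {bin, args} when is_binary(bin) and is_list(args) ->
          # A semantic down step becomes an ordinary command in the down plan.
          {steps ++ [%{name: command.name, exec: {bin, args}}], warnings}

        :irreversible ->
          # No command is emitted, only a warning.
          {steps, warnings ++ ["#{command.name} is irreversible and was omitted"]}

        :noop ->
          # Assumption: :noop means nothing needs to run on the way down.
          {steps, warnings}
      end
    end)
  end
end

commands = [
  %{name: :migrate, down: {"bin/app", ["eval", "App.Release.rollback()"]}},
  %{name: :warm_cache, down: :noop},
  %{name: :purge_queue, down: :irreversible}
]

IO.inspect(CommandDownSketch.down_steps(commands))
```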

Created resources use conservative rollback policies. File-like resources can be deleted by a down plan, but directories are kept unless explicitly opted in:

directory "/tmp/demo", rollback: :delete_if_created
directory "/srv/app", rollback: :keep
account :app, system: true, rollback: :keep
package :caddy, rollback: :keep

CLI usage mirrors this:

mix host_kit.plan infra/config.exs --host prod --out up.plan.json
mix host_kit.down up.plan.json --out down.plan.json
mix host_kit.apply --plan down.plan.json --confirm

Run tracking

Tracked applies write minimal run records under the project-configured HostKit runs root:

mix host_kit.apply --track --plan up.plan.json --confirm
mix host_kit.runs --host prod infra/config.exs
mix host_kit.runs --host prod --verbose infra/config.exs
mix host_kit.runs --host prod --latest --verbose infra/config.exs
mix host_kit.down --host prod --run 20260614-101148-demo-up --out down.plan.json infra/config.exs

Run records are intentionally compact: they identify the run, project, direction, timestamp, and applied change statuses. They do not replace plan artifacts; use plan artifacts for inspectable up/down plan contents. When a tracked apply is started from --plan, HostKit copies that up-plan artifact under the runs root and records the copied path so mix host_kit.down --last can work from the tracked run.

Tracked applies also write backup payloads for previous file-like state when that state was captured in the plan. Backup payloads live under hostkit_backups/<run-id>/ or the --backups-root override.

Both mix host_kit.down --last and mix host_kit.down --run RUN_ID rewrite supported previous file-like state to %HostKit.BackupRef{} entries so generated down plans restore from backup payloads instead of embedding prior content. Backup-backed restore currently covers ordinary files plus rendered file resources such as env files, Caddy sites, proxy config, firewall/egress files, and systemd unit files when their previous rendered content was captured.

Use mix host_kit.runs --verbose, --latest, or --id RUN_ID to inspect copied plan artifacts and backup payload paths.

Source updates are intentionally not inferred as reversible by default: a previous Git remote/ref may no longer be reachable. Treat source rollback as an explicit lifecycle operation or pair it with a backup/source-bundle strategy.

Run retention is explicit. Use mix host_kit.runs --prune --keep N to remove older run records plus their copied plan artifact and backup payload directories.
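
A keep-N retention policy can be sketched as a sort followed by a split. The sketch assumes run IDs begin with a sortable timestamp, as in the 20260614-101148-demo-up example above; the RunRetentionSketch module is hypothetical and does not touch the filesystem.

```elixir
defmodule RunRetentionSketch do
  # Assumption: run IDs start with a sortable timestamp, so a descending
  # string sort puts the newest runs first.
  # Returns {kept, pruned}.
  def partition(run_ids, keep) when is_integer(keep) and keep >= 0 do
    run_ids
    |> Enum.sort(:desc)
    |> Enum.split(keep)
  end
end

run_ids = [
  "20260612-090000-demo-up",
  "20260614-101148-demo-up",
  "20260613-120000-demo-down"
]

{kept, pruned} = RunRetentionSketch.partition(run_ids, 2)
IO.inspect(kept, label: "kept")
IO.inspect(pruned, label: "pruned")
```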

Elixir app lifecycle helpers

The Elixir app recipe can emit lifecycle commands for common BEAM deployment operations. Ecto migrations are represented as normal commands with explicit down commands:

elixir_app :shop do
  source github: "acme/shop", path: ".", ref: "main"
  phoenix host: "shop.example.com", secret_key_base: secret_env("SECRET_KEY_BASE")

  ecto release: "Shop.Release"
end

This emits a :before_start migration command that runs through the built release and a matching down command that calls Shop.Release.rollback().

For multiple repos, HostKit emits one ordered command per repo. Down plans reverse that order:

elixir_app :shop do
  source github: "acme/shop", path: ".", ref: "main"
  phoenix host: "shop.example.com", secret_key_base: secret_env("SECRET_KEY_BASE")

  ecto release: "Shop.Release" do
    repo "Shop.Repo"
    repo "Shop.AnalyticsRepo"
  end
end

For each repo, the default expressions take this form:

Shop.Release.migrate(Shop.Repo)
Shop.Release.rollback(Shop.Repo)

Use the :migrate and :rollback options to name custom release functions when the defaults do not fit.
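
The per-repo ordering can be sketched as follows. The EctoCommandsSketch module is hypothetical and only builds the expression strings; it assumes the default Release.migrate(Repo) and Release.rollback(Repo) forms shown above.

```elixir
defmodule EctoCommandsSketch do
  # Up expressions, one per repo, in declaration order.
  def migrate_exprs(release, repos) do
    Enum.map(repos, fn repo -> "#{release}.migrate(#{repo})" end)
  end

  # Down expressions, one per repo, in reverse declaration order.
  def rollback_exprs(release, repos) do
    repos
    |> Enum.reverse()
    |> Enum.map(fn repo -> "#{release}.rollback(#{repo})" end)
  end
end

repos = ["Shop.Repo", "Shop.AnalyticsRepo"]
IO.inspect(EctoCommandsSketch.migrate_exprs("Shop.Release", repos), label: "up")
IO.inspect(EctoCommandsSketch.rollback_exprs("Shop.Release", repos), label: "down")
```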

Providers

Providers can contribute DSL modules, resource types, renderers, validators, and read/plan/apply lifecycle operations. Systemd and Unitctl are core primitives, not providers; integrations such as Caddy should be providers.

use HostKit.DSL, providers: [HostKit.Providers.Caddy]

project :demo do
  provider :caddy, HostKit.Providers.Caddy do
    set :sites_dir, "/etc/caddy/sites"
  end

  service :web do
    caddy_site :web, "example.com", path: "web.caddy" do
      encode [:zstd, :gzip]
      reverse_proxy "127.0.0.1:4000"
    end
  end
end

Host bootstrap packages and mise-managed runtimes

HostKit can install OS packages through the target package manager. The DSL is distribution-neutral by default and can be pinned to a manager when needed.

service :bootstrap do
  package :ca_certificates
  package :build_essential, as: "build-essential", update: true
end

HostKit can also bootstrap mise and install system-wide tool versions. This is intended for host bootstrap and workspace agents; application services should still prefer packaged release artifacts where possible.

service :bootstrap do
  mise path: "/usr/local/bin/mise", system_data_dir: "/usr/local/share/mise" do
    tool :erlang, "29.0.2"
    tool :elixir, "1.20.1"
  end
end

This applies through the mise CLI contract: it installs the binary with mise.run when missing, then runs mise install --system with MISE_SYSTEM_DATA_DIR set.

Package planning resolves semantic package names through Repology and caches responses in .host_kit/cache/repology for 24 hours by default. Use locks for deterministic apply:

mix host_kit.plan --write-package-lock host_kit.package.lock infra/config.exs
mix host_kit.apply --package-lock host_kit.package.lock --confirm infra/config.exs

Plan artifacts make remote changes inspectable before they are applied. Prefer declaring the remote host in normal .exs HostKit config and selecting it with --host:

use HostKit.DSL

project :infra do
  host :prod do
    hostname "host.example"
    user "root"
    sudo true

    ssh identity_file: Path.expand("~/.ssh/id_ed25519"),
        password: secret_env("HOSTKIT_SSH_PASSWORD"),
        silently_accept_hosts: true
  end
end

mix host_kit.plan --host prod \
  --package-lock host_kit.package.lock \
  --out host_kit.plan.json infra/config.exs

mix host_kit.apply --host prod \
  --plan host_kit.plan.json --confirm infra/config.exs

Plan artifacts are JSON and intended to be inspectable. Secret references are stored as references, not values, for example:

{
  "$type": "struct",
  "module": "Elixir.HostKit.Secret",
  "fields": {
    "source": {
      "$type": "tuple",
      "items": [
        {"$type": "atom", "value": "env"},
        "HOSTKIT_SSH_PASSWORD"
      ]
    }
  }
}
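
One way such a tagged representation could be produced is sketched below. This is not HostKit's encoder: the TaggedEncodeSketch and SecretSketch modules are hypothetical, and the sketch returns JSON-ready maps rather than a JSON string so that it needs no JSON library.

```elixir
defmodule SecretSketch do
  # Hypothetical stand-in for a secret reference struct.
  defstruct [:source]
end

defmodule TaggedEncodeSketch do
  # Structs become {"$type": "struct", "module": ..., "fields": ...}.
  def encode(%{__struct__: module} = struct) do
    fields =
      struct
      |> Map.from_struct()
      |> Map.new(fn {key, value} -> {Atom.to_string(key), encode(value)} end)

    %{"$type" => "struct", "module" => Atom.to_string(module), "fields" => fields}
  end

  # Tuples and atoms have no JSON equivalent, so they are tagged explicitly.
  def encode(tuple) when is_tuple(tuple) do
    %{"$type" => "tuple", "items" => tuple |> Tuple.to_list() |> Enum.map(&encode/1)}
  end

  def encode(atom) when is_atom(atom) and not is_boolean(atom) and not is_nil(atom) do
    %{"$type" => "atom", "value" => Atom.to_string(atom)}
  end

  def encode(list) when is_list(list), do: Enum.map(list, &encode/1)

  # Strings, numbers, booleans, and nil pass through unchanged.
  def encode(other), do: other
end

# Only the reference (the variable name) is encoded, never the secret value.
secret = struct(SecretSketch, source: {:env, "HOSTKIT_SSH_PASSWORD"})
IO.inspect(TaggedEncodeSketch.encode(secret))
```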

secret_env/1 records an environment-backed secret reference and resolves it only at the control-plane boundary that needs the value. Use it for HostKit's own credentials, such as SSH passwords or future provider API tokens. Target application environment files use the env-file DSL, which is backed by the same secret reference type:

env_file "/etc/app/app.env" do
  set :mix_env, :prod
  secret :database_url, env: "DATABASE_URL"
end

Raw SSH flags remain available as an escape hatch: --remote, --user, --port, --identity-file, --password, and --password-env.

For Linux integration testing, use Incus as the lightweight native container/VM backend:

HOSTKIT_INCUS_SUDO=true HOSTKIT_SSH_PUBLIC_KEY=$HOME/.ssh/id_ed25519.pub \
  scripts/incus_integration_vm.sh ensure
HOSTKIT_INCUS_SUDO=true scripts/incus_integration_vm.sh ip

Set HOSTKIT_INCUS_TYPE=vm to launch an Incus VM instead of the default container, and HOSTKIT_INCUS_INSTANCE=name to change the instance name. Run the remote CLI integration against Incus with HOSTKIT_INTEGRATION_TOOL=incus, or against a pre-existing host declared in .exs config with HOSTKIT_INTEGRATION_TOOL=remote HOSTKIT_INTEGRATION_CONFIG=examples/integration_hosts.example.exs.

A real remote validation can use the same host config and a shell-provided secret:

HOSTKIT_SSH_PASSWORD='...' \
HOSTKIT_INTEGRATION_TOOL=remote \
HOSTKIT_INTEGRATION_CONFIG=examples/integration_hosts.example.exs \
mix test test/integration/cli_remote_test.exs --include integration

Project-local DSLs

Use HostKit.ProjectDSL in consuming projects to build local conventions without baking them into HostKit. Load project-local DSL files explicitly through the runtime API or Mix task --require option:

# infra/toys_infra.exs
defmodule ToysInfra do
  use HostKit.ProjectDSL

  root :source, "/opt/toys/src"
  root :data, "/srv/toys"
  root :state, "/var/lib/toys"
  root :config, "/etc/toys"

  prefix :user, "toys-"
  prefix :unit, "toys-"

  defservice :toy_service do
    let :service_user, do: prefixed(:user, service_name())
    let :unit_name, do: prefixed(:unit, service_name()) <> ".service"

    path :source_dir, root(:source), service_name()
    path :data_dir, root(:data), service_name()
    path :state_dir, root(:state), service_name()
    path :config_dir, root(:config), service_name()

    macro :standard_user do
      account service_user(), system: true, home: state_path("home")
    end
  end
end

# infra/config.exs
use HostKit.DSL, providers: [HostKit.Providers.Caddy]
use ToysInfra

project :toys do
  toy_service :exograph do
    standard_user()

    systemd_service unit_name() do
      working_directory source_dir()
      read_write_paths [data_dir(), state_dir(), source_dir()]
    end
  end
end

Runtime API

{:ok, project} = HostKit.load("infra/config.exs", require: ["toys_infra.exs"])
{:ok, plan} = HostKit.plan(project)
#=> %HostKit.Plan{changes: [%HostKit.Change{action: :create, ...}]}

prod = HostKit.Target.ssh(:prod, host: "elixir.toys", user: "dannote", sudo: true)
{:ok, remote_plan} = HostKit.plan(project, target: prod, reader: HostKit.Remote)

HostKit.format_plan(plan)
{:ok, results} = HostKit.apply(plan, dry_run: true)

# Supported apply resources: accounts, directories, files, systemd services, and systemd timers.
{:ok, results} = HostKit.apply(plan, confirm: true, sudo: true)

# Command and filesystem operations are routed through a runner boundary.
{:ok, results} = HostKit.apply(plan, confirm: true, runner: HostKit.Runner.Local)

prod = HostKit.Target.ssh(:prod, host: "elixir.toys", user: "dannote", sudo: true)

{:ok, results} = HostKit.apply(plan, target: prod, confirm: true)

{:ok, conn} = HostKit.Runner.SSH.Connection.open(host: "elixir.toys", user: "dannote")
try do
  prod = HostKit.Target.ssh(:prod, runner: {HostKit.Runner.SSH.Connection, conn: conn}, sudo: true)
  {:ok, remote_plan} = HostKit.plan(project, target: prod, reader: HostKit.Remote)
after
  HostKit.Runner.SSH.Connection.close(conn)
end

{:ok, unit} = HostKit.Render.render(project, {:systemd_service, "toys-exograph.service"})

Storage volumes

HostKit models storage as named metadata instead of repeated path strings:

volume =
  HostKit.Storage.volume(:repositories,
    path: "/srv/toys/forgejo/repositories",
    owner: "toys-forgejo",
    group: "toys-forgejo",
    mode: 0o750,
    backup: true
  )

directory HostKit.Storage.directory(volume)
read_write_paths HostKit.Storage.read_write_paths([volume])

Service conventions can derive these paths without project-specific macros and later reuse the same volume metadata for systemd sandboxing, Unitctl transient runtimes, and backups.

project :toys do
  roots data: "/srv/toys", config: "/etc/toys"
  prefixes user: "toys-", unit: "toys-"

  service :forgejo do
    storage :repositories, under: :data, path: "repositories", mode: 0o750, backup: true
    storage :config, under: :config, owner: "root", group: service_user(), writable: false, secret: true

    daemon unit_name() do
      run user: service_user(), read_write_paths: writable_storage_paths()
    end
  end
end

HostKit agent

HostKit can run as a supervised OTP application. The supervision tree currently starts agent state and a monitor worker:

HostKit.Agent.status()
HostKit.Agent.configure(project: project, target: HostKit.Target.local(:prod))
HostKit.Agent.run_plan()
HostKit.Agent.run_monitor()

HostKit can also declare its own outer systemd supervisor unit:

HostKit.Agent.Systemd.service(
  exec_start: ["/opt/host_kit/bin/host_kit", "agent", "--config", "/etc/host_kit/config.exs"]
)

State snapshots can be written for audit/drift history:

HostKit.State.write(plan, "/var/lib/host_kit/state/latest-plan.json")
HostKit.State.read("/var/lib/host_kit/state/latest-plan.json")

Together, the OTP application and the outer systemd unit give a two-layer supervision model: OTP inside the BEAM and systemd outside it.

Firewall policy

HostKit can declare project- or host-scoped firewall policy:

firewall do
  allow tcp: 22, from: :any
  allow tcp: [80, 443], from: :any
  allow tcp: 9100, from: {10, 44, 0, 0, 24}
  deny :all
end

Host-scoped policy lives inside host:

host :prod, hostname: "elixir.toys" do
  firewall do
    allow tcp: 22, from: :any
    deny :all
  end
end

Extract, render, plan, and apply policies with:

HostKit.Firewall.policies(project)
HostKit.Firewall.Nftables.render(policy)
HostKit.plan(project, reader: HostKit.Local)
HostKit.apply(plan, confirm: true, nft_reload: true)

Firewall policy is written to /etc/nftables.d/hostkit.nft by default and validated with nft -c -f before optional reload.

Workspace inside monitoring

Workspace services can declare checks that are intended to run inside the sandbox later via a workspace agent:

workspace :blog, owner: :alice do
  service :preview do
    inside do
      monitor :mix, task: "test", every: "5m"
      monitor :port, port: 4000
      monitor :git, clean: true
    end
  end
end

Extract them with:

HostKit.Workspace.inside_monitors(project)

Workspace execution and tenants

Tenants can own workspaces:

tenant :alice, quota: [memory: "4G"] do
  agent port: 4173
end

Workspace command specs can be built for transient execution:

HostKit.Workspace.exec_spec(project, :alice, :blog, ["mix", "test"])
HostKit.Workspace.exec(project, :alice, :blog, ["mix", "test"])

Inside monitors currently return :pending_workspace_agent, reserving execution for the sandbox agent boundary.

OpenTelemetry Collector config

Telemetry declarations can be converted to an OpenTelemetry Collector config map:

HostKit.OtelCollector.config(project, endpoint: "otel.example:4317")

Workspace sandbox profiles

Systemd-backed sandbox profiles can be applied inside daemons:

workspace :blog, owner: :alice do
  service :preview do
    daemon unit_name() do
      run exec_start: ["mix", "phx.server"]
      sandbox :vibe_dev
    end
  end
end

Profiles include :vibe_dev, :strict_app, and :untrusted, and can be overridden:

sandbox :untrusted,
  resources: [memory_max: "256M"],
  sandbox: [private_network: false]

Workspace preview helper

Workspace services can expose a preview route with a named listener and Caddy site:

workspace :blog, owner: :alice do
  service :preview do
    daemon unit_name() do
      run exec_start: ["mix", "phx.server"]
    end

    preview :http, port: 4000, domain: "alice-blog.dev.example.com"
  end
end

This expands to listen :http, a Caddy reverse proxy to that listener, an HTTP monitor, telemetry metadata, and Caddy access-log metadata.

Workspace agent helper

Workspaces can declare the default sandbox agent service as ordinary HostKit resources:

workspace :blog, owner: :alice do
  agent port: 4173
end

This expands to a service with an account, workspace directory, systemd daemon, loopback listener, logs, telemetry, systemd monitor, and loopback-only network policy.

Workspace scope

A workspace block scopes the ordinary HostKit DSL to user sandboxes while keeping resources inspectable:

workspace :blog, owner: :alice do
  service :preview do
    directory root_path(:data), mode: :private_dir

    daemon unit_name() do
      run exec_start: ["mix", "phx.server"]
      listen :http, port: 4000, on: :loopback
    end
  end
end

Inside a workspace, services get workspace metadata plus separate path and identity names:

root_path(:data) # .../alice/blog/preview
unit_name()      # prefix-alice-blog-preview.service
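
The naming pattern in those comments can be sketched as plain string and path joins. The WorkspaceNamesSketch module, its argument order, and the "/srv/toys" root and "toys-" prefix in the usage lines are assumptions for illustration.

```elixir
defmodule WorkspaceNamesSketch do
  # Path names nest owner, workspace, and service under a root.
  def root_path(root, owner, workspace, service) do
    Path.join([root, to_string(owner), to_string(workspace), to_string(service)])
  end

  # Identity names join the same parts with dashes after the unit prefix.
  def unit_name(prefix, owner, workspace, service) do
    prefix <> Enum.join([owner, workspace, service], "-") <> ".service"
  end
end

IO.puts(WorkspaceNamesSketch.root_path("/srv/toys", :alice, :blog, :preview))
IO.puts(WorkspaceNamesSketch.unit_name("toys-", :alice, :blog, :preview))
```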

Named listeners

Services can declare named listeners and reuse them from provider declarations:

daemon unit_name() do
  run exec_start: ["/usr/bin/env", "true"]
  listen :http, port: 3000, on: :loopback
end

caddy_site :web, "web.example.com" do
  reverse_proxy listener(:http)
end

Named listeners are stored as service metadata and render Caddy upstreams like 127.0.0.1:3000 at the provider boundary.
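
Resolving a named listener to an upstream string might look like the sketch below. The metadata shape and the ListenerSketch module are hypothetical; only the 127.0.0.1:3000 result comes from the description above.

```elixir
defmodule ListenerSketch do
  # Hypothetical metadata shape: %{listener_name => %{port: integer, on: address}}.
  def upstream(listeners, name) do
    %{port: port, on: on} = Map.fetch!(listeners, name)
    "#{host(on)}:#{port}"
  end

  # Semantic alias and plain IPv4 tuple forms.
  defp host(:loopback), do: "127.0.0.1"
  defp host({a, b, c, d}), do: Enum.join([a, b, c, d], ".")
end

listeners = %{http: %{port: 3000, on: :loopback}}
IO.puts(ListenerSketch.upstream(listeners, :http))
```

Keeping the listener as metadata until the provider boundary means the port is declared once and cannot drift between the daemon and the proxy config.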

Network addresses and policy

Network addresses can use Elixir tuple forms and semantic aliases:

listen 3000, on: :loopback
listen 4000, on: {127, 0, 0, 1}
network_policy deny: :all, allow: [:loopback, {10, 44, 0, 0, 24}]

Systemd services compile network policy to:

IPAddressDeny=any
IPAddressAllow=localhost 10.44.0.0/24
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
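
The mapping from tuple forms to the two IPAddress directives can be sketched as follows. The NetworkPolicySketch module is hypothetical, it handles only IPv4 tuples and the :all and :loopback aliases, and it leaves out RestrictAddressFamilies because the rule for deriving that line is not described here.

```elixir
defmodule NetworkPolicySketch do
  # Semantic aliases.
  def address(:all), do: "any"
  def address(:loopback), do: "localhost"

  # {a, b, c, d} is a single IPv4 address; a fifth element is the prefix length.
  def address({a, b, c, d}), do: Enum.join([a, b, c, d], ".")

  def address({a, b, c, d, prefix}) when prefix in 0..32 do
    Enum.join([a, b, c, d], ".") <> "/#{prefix}"
  end

  # Render the deny and allow lists as systemd unit directives.
  def directives(policy) do
    deny = policy |> Keyword.get(:deny, []) |> List.wrap()
    allow = policy |> Keyword.get(:allow, []) |> List.wrap()

    [
      "IPAddressDeny=" <> Enum.map_join(deny, " ", &address/1),
      "IPAddressAllow=" <> Enum.map_join(allow, " ", &address/1)
    ]
  end
end

[deny: :all, allow: [:loopback, {10, 44, 0, 0, 24}]]
|> NetworkPolicySketch.directives()
|> Enum.each(&IO.puts/1)
```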

Log management intent

Log management can be declared globally, per service, or on individual resources:

observability do
  logs driver: :journald,
       retention: "14d",
       ship: true,
       attributes: [deployment_environment: :prod]
end

Systemd service log declarations also add unit directives:

daemon unit_name() do
  run exec_start: ["/usr/bin/env", "true"]
  logs identifier: service_name(), stdout: :journal, stderr: :journal
end

Extract log intent with:

HostKit.Logs.configs(project)

Read recent journald logs through local or remote targets:

HostKit.Logs.read("toys-forgejo.service", target: prod, since: "1h")
HostKit.Logs.tail("toys-forgejo.service", target: prod, lines: 100)

OpenTelemetry collection intent

Observability defaults can be enabled once at project or service scope and inherited by resources:

observability do
  telemetry logs: true,
            metrics: true,
            traces: false,
            attributes: [deployment_environment: :prod]
end

Resource-level overrides are still available:

daemon unit_name() do
  run exec_start: ["/usr/bin/env", "true"]
  telemetry logs: :journald, metrics: false, service_name: service_name()
end

Extract collection intent with:

HostKit.Telemetry.signals(project)

Systemd services and Caddy sites get default collection intent even without global defaults:

# systemd: logs: :journald, metrics: :systemd
# caddy: logs: :access, metrics: :http

Monitoring metadata

Declarations can carry monitoring intent for a later monitoring service or config generator:

daemon unit_name() do
  run exec_start: ["/usr/bin/env", "true"]
  monitor :systemd, expect: [state: :active], severity: :critical
end

caddy_site :web, "web.example.com" do
  reverse_proxy "127.0.0.1:4000"
  monitor :http, url: "https://web.example.com", expect: [status: 200]
end

Extract or run checks with:

HostKit.Monitor.checks(project)
HostKit.Monitor.run(project, target: prod)

Initial execution supports systemd state, HTTP status, and filesystem existence checks.

File modes

Mode values can be raw octal, semantic aliases, tuples, keywords, or capability lists:

mode: :secret_group_file
mode: {:rw, :r, nil}
mode: [owner: :rw, group: :r]
mode: [:setgid, :owner_rwx, :group_rwx, :other_rx]

Resources store normalized integer modes, so plan/apply remains simple.
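
Normalization of the tuple and keyword forms can be sketched like this. The ModeSketch module is hypothetical; it does not cover semantic aliases such as :secret_group_file or capability lists, since their definitions are not given here.

```elixir
defmodule ModeSketch do
  # Raw integers pass through unchanged.
  def normalize(mode) when is_integer(mode), do: mode

  # Tuple form: {owner, group, other}, each an atom such as :r, :rw, :rwx, or nil.
  def normalize({owner, group, other}) do
    bits(owner) * 64 + bits(group) * 8 + bits(other)
  end

  # Keyword form: classes that are not listed get no permissions.
  def normalize(list) when is_list(list) do
    if Keyword.keyword?(list) do
      normalize({list[:owner], list[:group], list[:other]})
    else
      raise ArgumentError, "capability lists are not covered by this sketch"
    end
  end

  defp bits(nil), do: 0

  # Sum the permission bits for each distinct letter in the atom.
  defp bits(perm) when is_atom(perm) do
    perm
    |> Atom.to_charlist()
    |> Enum.uniq()
    |> Enum.map(fn
      ?r -> 4
      ?w -> 2
      ?x -> 1
    end)
    |> Enum.sum()
  end
end

# Print the normalized modes in octal.
IO.puts(Integer.to_string(ModeSketch.normalize({:rw, :r, nil}), 8))
IO.puts(Integer.to_string(ModeSketch.normalize(owner: :rwx, group: :rx, other: :rx), 8))
```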

Env files and secrets

HostKit has a Dotenvy-validated env file resource. Secret values are resolved at apply time. Drift detection compares metadata and non-secret set entries; secret entry values are not read into plan artifacts for comparison.

env_file root_path(:config, "env"), owner: "root", group: service_user(), mode: 0o640 do
  set :MIX_ENV, :prod
  set :PORT, 4000
  secret :SECRET_KEY_BASE, env: "SECRET_KEY_BASE"
end
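
Drift detection that skips secret values can be sketched as a comparison over set entries only. The entry tuples, the current-state map, and the EnvDriftSketch module are assumptions for illustration, not HostKit's internal representation.

```elixir
defmodule EnvDriftSketch do
  # Hypothetical entry shapes: {:set, key, value} and {:secret, key, reference}.

  # Only plain set entries are comparable; secret entries are skipped
  # because the pattern in the generator does not match them.
  def comparable(entries) do
    for {:set, key, value} <- entries, into: %{} do
      {to_string(key), to_string(value)}
    end
  end

  # Returns %{key => {current, wanted}} for every non-secret entry that differs.
  def drift(declared, current_env) do
    for {key, wanted} <- comparable(declared),
        Map.get(current_env, key) != wanted,
        into: %{} do
      {key, {Map.get(current_env, key), wanted}}
    end
  end
end

declared = [
  {:set, :MIX_ENV, :prod},
  {:set, :PORT, 4000},
  {:secret, :SECRET_KEY_BASE, {:env, "SECRET_KEY_BASE"}}
]

current = %{"MIX_ENV" => "prod", "PORT" => "4001", "SECRET_KEY_BASE" => "on-host-value"}
IO.inspect(EnvDriftSketch.drift(declared, current))
```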

Runtime isolation

HostKit uses shared runtime isolation structs for persistent systemd units and future transient Unitctl workloads:

sandbox = HostKit.Runtime.Sandbox.new(:strict_web)
resources = HostKit.Runtime.Resources.new(memory_max: "512M", cpu_quota: "50%")

service sandbox |> HostKit.Runtime.Sandbox.to_systemd_service_options()
service resources |> HostKit.Runtime.Resources.to_systemd_service_options()

Built-in profiles include :web_service, :strict_web, :small, :medium, and :large.

Runtime controls

HostKit exposes Unitctl as its core transient runtime layer:

{:ok, spec} =
  HostKit.Runtime.Spec.new(
    name: "demo-check",
    command: ["/usr/bin/env", "true"],
    sandbox: %{no_new_privileges: true, private_tmp: true}
  )

{:ok, instance} = HostKit.Runtime.start(spec)
{:ok, state} = HostKit.Runtime.status(instance)
:ok = HostKit.Runtime.stop(instance)

Mix tasks

mix host_kit.dump --require toys_infra.exs infra/config.exs
mix host_kit.plan --require toys_infra.exs infra/config.exs
mix host_kit.plan --require toys_infra.exs infra/config.exs --local
mix host_kit.plan --require toys_infra.exs infra/config.exs --local --ignore systemd_service:toys-exograph.service
mix host_kit.plan --require toys_infra.exs infra/config.exs --remote elixir.toys --user dannote --sudo
mix host_kit.apply --require toys_infra.exs infra/config.exs --local --dry-run
mix host_kit.render --require toys_infra.exs infra/config.exs systemd_service toys-exograph.service