cgroup v2 primitives — create a cgroup, place processes into it, set resource limits, read counters, freeze and thaw.
Why a separate subsystem
cgroups are a coherent kernel concept (per-process resource
accounting and limits) with their own filesystem-shaped interface
under /sys/fs/cgroup. Linx.Process spawns workloads, but the
question of "constrain this workload to 256 MiB of memory and at
most 100 processes" is cgroup-shaped, not clone-shaped — and these
primitives are useful even when no clone is involved (the BEAM's
own OS process can itself be placed in a cgroup, for instance).
cgroupfs is the API
cgroup v2 exposes its entire interface as a read/write filesystem
under /sys/fs/cgroup. Every operation here is plain
File.read/1 / File.write/2 against an interface file. No NIF,
no Port, no :os.cmd("cgcreate ...") — just the filesystem the
kernel already exposes.
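To make "the filesystem is the API" concrete, here is a minimal sketch of the read and write paths. It is not Linx's implementation: the module name `CgroupfsSketch` and its two functions are hypothetical, and a temporary directory stands in for a real cgroup directory so the snippet runs without root.

```elixir
defmodule CgroupfsSketch do
  # Hypothetical helper module illustrating the idea, not the library's source.

  # Render any value (atom, integer, binary) as text and write it to the
  # interface file `file` inside the directory `cg`.
  def write(cg, file, value) do
    File.write(Path.join(cg, file), to_string(value))
  end

  # Read an interface file and strip surrounding whitespace, including the
  # trailing newline the kernel appends.
  def read(cg, file) do
    with {:ok, raw} <- File.read(Path.join(cg, file)) do
      {:ok, String.trim(raw)}
    end
  end
end

# Assumption: a temp directory plays the role of /sys/fs/cgroup/<path>.
cg = Path.join(System.tmp_dir!(), "cgroupfs_sketch_#{System.unique_integer([:positive])}")
File.mkdir_p!(cg)

# Integers and atoms both go through to_string/1 before hitting the file.
:ok = CgroupfsSketch.write(cg, "memory.max", 256 * 1024 * 1024)
:ok = CgroupfsSketch.write(cg, "pids.max", :max)

# Simulate a kernel-written counter that ends in a newline.
File.write!(Path.join(cg, "memory.current"), "4096\n")

{:ok, limit} = CgroupfsSketch.read(cg, "memory.max")
{:ok, current} = CgroupfsSketch.read(cg, "memory.current")
IO.puts("memory.max=#{limit} memory.current=#{current}")
```

On a real host the same two calls would target a directory under /sys/fs/cgroup, and a failed write would carry an errno instead of succeeding unconditionally.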
v2 only
Linx targets modern Linux. cgroup v1 (the legacy
controller-per-mount hierarchy) is not supported.
supported?/0 returns true iff the unified hierarchy is
mounted at /sys/fs/cgroup.
Primitives, not policy
The caller chooses the path. Linx does not bake in
/sys/fs/cgroup/linx/<name> as a parent. A container engine built
on Linx picks /sys/fs/cgroup/myengine/...; a workload supervisor
picks something else. Naming convention is the consumer's choice.
Composition with Linx.Process
Place a workload into a cgroup at the checkpoint — the same window
Linx.Netlink uses to configure a child's netns from the host
before proceed/1:
{:ok, c} = Linx.Process.spawn(argv: [...], namespaces: [...])
host_pid = receive do {:linx_process, :ready, p} -> p end
{:ok, cg} = Linx.Cgroup.create("/sys/fs/cgroup/myorg/web-42")
:ok = Linx.Cgroup.set_memory_max(cg, 256 * 1024 * 1024)
:ok = Linx.Cgroup.add_process(cg, host_pid)
:ok = Linx.Process.proceed(c)
Linx.Process itself has no awareness of cgroups; the checkpoint
is the integration surface and that is enough.
Forward compatibility
stats/1 reads the curated counters it knows; an unrecognised line in
a *.stat file (a counter a newer kernel added) is silently dropped,
so the returned %Stats{} stays valid. Reach for read/2 to get any
raw field without a typed reader.
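The drop-unknown-lines behaviour can be sketched as an allow-list parser over the flat "key value" format that *.stat files use. The module name `StatSketch` and the two keys in its allow-list are assumptions chosen for illustration; the actual fields of %Stats{} are not shown here.

```elixir
defmodule StatSketch do
  # Hypothetical allow-list of counters this parser understands.
  @known ~w(anon file)

  # Parse flat "key value" lines, keeping only allow-listed keys whose value
  # is a clean integer. Anything else is silently skipped.
  def parse(text) do
    text
    |> String.split("\n", trim: true)
    |> Enum.reduce(%{}, fn line, acc ->
      case String.split(line, " ", parts: 2) do
        [key, value] when key in @known ->
          case Integer.parse(value) do
            {n, ""} -> Map.put(acc, key, n)
            _ -> acc
          end

        _ ->
          acc
      end
    end)
  end
end

# "brand_new_counter" plays the role of a line a newer kernel added.
sample = "anon 4096\nfile 8192\nbrand_new_counter 7\n"
IO.inspect(StatSketch.parse(sample))
```

Because unknown keys fall through to the catch-all clause, a newer kernel can add lines without breaking the parse; the cost is that those counters are only reachable through the raw read path.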
Functions
@spec add_process(cgroup(), pos_integer()) :: :ok | {:error, Linx.Cgroup.Error.t()}
Moves OS process pid (and so its future children) into cg by
writing the pid's decimal text to <cg>/cgroup.procs.
The classic checkpoint composition with Linx.Process:
host_pid = receive do {:linx_process, :ready, p} -> p end
:ok = Linx.Cgroup.add_process(cg, host_pid)
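:ok = Linx.Process.proceed(c)
The kernel resolves the written pid in the PID namespace of the
writing process (here, the BEAM on the host), which is why the
example passes the host-side pid from the :ready message. For a
workload in its own :pid namespace this matters, because the pid it
sees for itself differs; outside one, it is simply the global pid.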
@spec create(Path.t()) :: {:ok, cgroup()} | {:error, Linx.Cgroup.Error.t()}
Creates a cgroup at path.
Idempotent: an already-existing cgroup (EEXIST) is treated as
success — calling create/1 twice in a row is safe. Other
failures (e.g. parent missing, no permission) return
{:error, %Linx.Cgroup.Error{}}.
Returns {:ok, path} so the path can flow into the rest of the
API by piping: Linx.Cgroup.create(path) |> elem(1) |> Linx.Cgroup.add_process(pid).
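The idempotency rule amounts to folding EEXIST into the success branch of a mkdir. A minimal sketch, assuming a hypothetical `CreateSketch` module and returning the bare reason atom where the real function wraps it in %Linx.Cgroup.Error{}:

```elixir
defmodule CreateSketch do
  # Hypothetical stand-in for create/1: a plain mkdir where "already exists"
  # counts as success.
  def create(path) do
    case File.mkdir(path) do
      :ok -> {:ok, path}
      {:error, :eexist} -> {:ok, path}
      {:error, reason} -> {:error, reason}
    end
  end
end

# Assumption: a temp directory stands in for a path under /sys/fs/cgroup.
base = Path.join(System.tmp_dir!(), "create_sketch_#{System.unique_integer([:positive])}")

# Calling twice in a row is safe.
{:ok, ^base} = CreateSketch.create(base)
{:ok, ^base} = CreateSketch.create(base)

# A missing parent is still an error.
IO.inspect(CreateSketch.create(Path.join([base, "missing", "child"])))
```

One caveat on the one-line pipe: `elem(1)` also extracts the error struct from an `{:error, _}` tuple, so when creation can fail, a `with {:ok, cg} <- Linx.Cgroup.create(path), do: Linx.Cgroup.add_process(cg, pid)` keeps the failure visible.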
@spec destroy(cgroup()) :: :ok | {:error, Linx.Cgroup.Error.t()}
Removes the cgroup at path.
Succeeds only once the cgroup is empty — the kernel returns
EBUSY while any process is still in the cgroup, surfaced as
{:error, %Linx.Cgroup.Error{errno: :ebusy}}. Pattern-match on
that to handle "still has live processes" without surprise.
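One way to act on that pattern is a bounded retry that waits for the last process to exit. This is a sketch under assumptions: `DestroySketch` and `destroy_when_empty/3` are hypothetical names, the destroy call is injected as a zero-arity function (in real use, `fn -> Linx.Cgroup.destroy(cg) end`), and a plain map with an `errno` key stands in for the error struct.

```elixir
defmodule DestroySketch do
  # Retry `destroy_fun` while it reports EBUSY, at most `attempts` times,
  # sleeping `delay_ms` between tries. Any other error is returned at once.
  def destroy_when_empty(destroy_fun, attempts, delay_ms \\ 50)

  def destroy_when_empty(_destroy_fun, 0, _delay_ms), do: {:error, :still_busy}

  def destroy_when_empty(destroy_fun, attempts, delay_ms) when attempts > 0 do
    case destroy_fun.() do
      :ok ->
        :ok

      # Matches any map or struct whose errno field is :ebusy.
      {:error, %{errno: :ebusy}} ->
        Process.sleep(delay_ms)
        destroy_when_empty(destroy_fun, attempts - 1, delay_ms)

      {:error, _} = other ->
        other
    end
  end
end

# Stub that reports EBUSY twice, then succeeds, to exercise the retry path.
{:ok, calls} = Agent.start_link(fn -> 0 end)

stub = fn ->
  n = Agent.get_and_update(calls, fn n -> {n, n + 1} end)
  if n < 2, do: {:error, %{errno: :ebusy}}, else: :ok
end

IO.inspect(DestroySketch.destroy_when_empty(stub, 5, 1))
```

The retry only makes sense when something else is already terminating the workload; otherwise the cgroup stays busy and the helper returns `{:error, :still_busy}`.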
@spec enable_controllers(cgroup(), [atom()]) :: :ok | {:partial, [{atom(), Linx.Cgroup.Error.t()}]}
Enables controllers on cg so its children can use them.
Each controller in controllers is written individually as
"+<name>" to <cg>/cgroup.subtree_control. Writing
controllers one at a time means a single rejected name doesn't
lose the controllers that did take — the partial state is
surfaced to the caller for them to act on.
Returns:
:ok if every controller in the list was accepted (or the list was empty).
{:partial, failures} if one or more controllers were rejected. failures is a non-empty list of {controller_atom, %Linx.Cgroup.Error{}} tuples for the ones that failed.
Controllers not in the list are not touched. Common failures: the controller is not available in <cg>/cgroup.controllers (not delegated from the parent, giving EINVAL/ENOENT), or the kernel doesn't recognize the name.
Accepts standard cgroup v2 controller atoms: :cpu, :cpuset,
:io, :memory, :pids, :rdma, :hugetlb, :misc. The
atom is rendered with to_string/1 so any new controller a
future kernel adds is reachable without code changes here.
Why one-at-a-time
The kernel rejects the whole write if any controller in a
space-separated "+a +b +c" blob is invalid. Writing one at a
time lets us tell the caller exactly which controllers landed
and which didn't, instead of all-or-nothing.
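The one-at-a-time strategy can be sketched as a loop that issues a separate write per controller and collects only the failures. The names here are assumptions: `SubtreeSketch.enable/2` is hypothetical, and `write_fun` is an injected one-arity function standing in for a single write to cgroup.subtree_control.

```elixir
defmodule SubtreeSketch do
  # `write_fun` receives one token such as "+memory" and returns :ok or
  # {:error, reason}. Each controller gets its own call, so one rejection
  # cannot undo the others.
  def enable(controllers, write_fun) do
    failures =
      for c <- controllers,
          # The pattern on the left keeps only failed writes.
          {:error, reason} <- [write_fun.("+" <> to_string(c))],
          do: {c, reason}

    case failures do
      [] -> :ok
      _ -> {:partial, failures}
    end
  end
end

# Stub writer: pretends the parent offers only cpu and memory.
available = ~w(+cpu +memory)
writer = fn token -> if token in available, do: :ok, else: {:error, :enoent} end

IO.inspect(SubtreeSketch.enable([:cpu, :bogus, :memory], writer))
```

With a single combined write, the `:bogus` entry would have caused `:cpu` and `:memory` to be rejected as well; here they land and only the bad name is reported.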
@spec freeze(cgroup()) :: :ok | {:error, Linx.Cgroup.Error.t()}
Freezes every process in cg by writing "1" to
<cg>/cgroup.freeze.
All processes in the cgroup (and its descendants) are suspended
by the kernel: they stop being scheduled but stay resident. Pair
with thaw/1. No controller needs to be enabled; cgroup.freeze is
present in every non-root cgroup on kernels that ship the v2
freezer (Linux 5.2 and later).
@spec read(cgroup(), String.t()) :: {:ok, String.t()} | {:error, Linx.Cgroup.Error.t()}
Reads cgroup interface file file (e.g. "memory.current") under
cg. Returns {:ok, trimmed_string} — cgroupfs interface files
end in newlines that the caller almost never wants — or
{:error, %Linx.Cgroup.Error{}}.
Raw escape hatch for fields without a typed reader.
@spec set_cpu_max(cgroup(), {pos_integer(), pos_integer()} | :max) :: :ok | {:error, Linx.Cgroup.Error.t()}
Sets the CPU bandwidth limit for cg (cpu.max).
Accepts either:
{quota_us, period_us}, both in microseconds. The cgroup may use quota_us of CPU time per period_us of wall time. {50_000, 100_000} is "half a CPU".
:max to clear the limit (the kernel default).
Requires the cpu controller to be enabled in the parent.
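For reference, the kernel expects cpu.max as "<quota> <period>" with both numbers in microseconds, or the literal "max". The sketch below shows that rendering plus the quota-to-CPU arithmetic; `CpuMaxSketch` and its functions are hypothetical names, not part of Linx.

```elixir
defmodule CpuMaxSketch do
  # Render the text written to cpu.max. Writing "max" alone clears the quota.
  def render(:max), do: "max"

  def render({quota_us, period_us})
      when is_integer(quota_us) and quota_us > 0 and
             is_integer(period_us) and period_us > 0 do
    "#{quota_us} #{period_us}"
  end

  # How many CPUs' worth of time the pair allows (quota divided by period).
  def cpus({quota_us, period_us}), do: quota_us / period_us
end

half = {50_000, 100_000}
two = {200_000, 100_000}

IO.puts(CpuMaxSketch.render(half))
IO.puts(CpuMaxSketch.cpus(half))
IO.puts(CpuMaxSketch.cpus(two))
IO.puts(CpuMaxSketch.render(:max))
```

A quota larger than the period is valid and grants more than one CPU, which is why `{200_000, 100_000}` works out to two.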
@spec set_memory_max(cgroup(), non_neg_integer() | :max) :: :ok | {:error, Linx.Cgroup.Error.t()}
Sets the memory limit for cg (memory.max).
Accepts an integer (bytes — the kernel's memory.max unit) or
the atom :max to clear the limit.
Requires the memory controller to be enabled in the parent's
cgroup.subtree_control (see enable_controllers/2). If the
controller isn't delegated, the kernel returns
ENOENT on the write because the interface file doesn't exist.
@spec set_pids_max(cgroup(), non_neg_integer() | :max) :: :ok | {:error, Linx.Cgroup.Error.t()}
Sets the pids limit for cg (pids.max).
Accepts an integer (maximum number of processes) or the atom
:max to clear the limit. Requires the pids controller to be
enabled in the parent.
@spec stats(cgroup()) :: {:ok, Linx.Cgroup.Stats.t()} | {:error, Linx.Cgroup.Error.t()}
Reads a curated snapshot of cg's resource counters as a
Linx.Cgroup.Stats struct.
Returns {:ok, %Linx.Cgroup.Stats{}} if the cgroup exists. Each
field is nil if its source isn't available, either because
the controller isn't enabled for this cgroup by its parent (interface
file missing) or because the kernel is too old to expose it.
Returns {:error, %Linx.Cgroup.Error{operation: :stats}} if the
cgroup directory itself doesn't exist or isn't readable.
@spec supported?() :: boolean()
Returns true iff the cgroup v2 unified hierarchy is mounted.
Canonical check: /sys/fs/cgroup/cgroup.controllers only exists
on the v2 hierarchy (on v1, /sys/fs/cgroup is a tmpfs with
per-controller subdirectories instead). A true return here is
the prerequisite for everything else in this module.
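The check described above reduces to a single existence test. A minimal sketch, assuming a hypothetical `SupportedSketch` module; the `root` parameter exists only so the check can be exercised against temporary directories instead of the real mount point.

```elixir
defmodule SupportedSketch do
  # True when `root` looks like a cgroup v2 mount, i.e. it contains a
  # cgroup.controllers file at its top level.
  def supported?(root \\ "/sys/fs/cgroup") do
    File.exists?(Path.join(root, "cgroup.controllers"))
  end
end

tmp = Path.join(System.tmp_dir!(), "supported_sketch_#{System.unique_integer([:positive])}")

# Fake v2 layout: cgroup.controllers sits at the root of the mount.
v2 = Path.join(tmp, "v2")
File.mkdir_p!(v2)
File.write!(Path.join(v2, "cgroup.controllers"), "cpu io memory pids\n")

# Fake v1 layout: per-controller subdirectories, no cgroup.controllers.
v1 = Path.join(tmp, "v1")
File.mkdir_p!(Path.join(v1, "memory"))

IO.inspect({SupportedSketch.supported?(v2), SupportedSketch.supported?(v1)})
```

Calling `SupportedSketch.supported?()` with no argument probes the conventional /sys/fs/cgroup path on the machine it runs on.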
@spec thaw(cgroup()) :: :ok | {:error, Linx.Cgroup.Error.t()}
Thaws a previously-frozen cgroup by writing "0" to
<cg>/cgroup.freeze. Idempotent on an already-thawed cgroup.
@spec write(cgroup(), String.t(), term()) :: :ok | {:error, Linx.Cgroup.Error.t()}
Writes value to cgroup interface file file (e.g.
"memory.max") under cg. value is rendered via
to_string/1, so atoms (:max), integers, and binaries all work
directly.
Raw escape hatch for fields without a typed setter.