Linux seccomp ("SECure COMPuting") primitives — per-thread cBPF syscall-filter facilities exposed as Elixir verbs.
What seccomp is
A seccomp filter is a small cBPF program the kernel runs on every
syscall entry. Its return value tells the kernel whether to allow
the syscall, return an errno, kill the calling process or thread,
raise SIGSYS, or log and proceed. Filters install per-thread; they
never come off once on; and stacking another filter can only make
the policy tighter, never looser. Together those properties let a workload
drop its syscall envelope to a small documented set before
execve, so a 0-day in the kernel's capability check still can't
reach the relevant code path if the syscall is gated.
See seccomp(2) and the kernel's
Documentation/userspace-api/seccomp_filter.rst for the canonical
reference.
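The "stacked filters only tighten" property can be modelled in a few lines. The sketch below is a toy model, not Linx code: the module name and the representation of a filter as a plain function are assumptions made for illustration. The precedence order is the one listed in seccomp(2), most restrictive first.

```elixir
defmodule VerdictModelSketch do
  # Verdict precedence as documented in seccomp(2), most restrictive first.
  @precedence [:kill_process, :kill_thread, :trap, :errno, :user_notif, :trace, :log, :allow]

  # A "filter" in this model is a function from a syscall atom to a verdict.
  # Every installed filter runs; the most restrictive verdict wins.
  # `filters` must be non-empty (Enum.min_by/2 raises on an empty list).
  def combine(filters, syscall) do
    filters
    |> Enum.map(& &1.(syscall))
    |> Enum.min_by(&rank/1)
  end

  # {:errno, code} verdicts rank as :errno regardless of the code.
  defp rank({:errno, _code}), do: rank(:errno)
  defp rank(verdict), do: Enum.find_index(@precedence, &(&1 == verdict))
end
```

In this model, adding a second filter that answers :allow for a syscall the first filter kills leaves the combined verdict at :kill_process, which is why a later filter cannot widen an earlier one.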
What Linx exposes — and what it doesn't
This module is a primitive. It exposes:
Detection. supported?/0 (whether the kernel has the facility at all) and arch/0 (which architecture we're building filters for).
Filter construction. Two layers:
  Sugar: allow_list/2 ("only these syscalls"), deny_list/2 ("not these"), and the fluent Linx.Seccomp.Builder DSL.
  Data: from_rules/1 for consumers that translate external policies (Docker seccomp.json, custom DSLs, runtime policy) into a plain [{action, syscall_atom}, ...] Elixir list and hand it to Linx. to_rules/1 is the inverse for filters Linx itself built.
Install. install/2 is checkpoint-bound, the same shape as Linx.Capabilities.drop_bounding/2 (the same commit pattern), because the kernel forbids cross-thread seccomp installation. The child agent in linx_process.c does the actual seccomp(2) call at the parked checkpoint.
Higher-level concerns — parsing JSON profiles, looking up which syscalls nginx 1.24 needs, tracking workload-to-filter mappings — are policy and orchestration. Those live in consumers that build on Linx.
Motivating composition
{:ok, c} = Linx.Process.spawn(argv: ["/usr/sbin/nginx"],
                              no_new_privs: true)
receive do {:linx_process, :ready, _} -> :ok end
{:ok, filter} = Linx.Seccomp.allow_list(
  ~w(read write openat close fstat brk mmap munmap mprotect
     accept4 bind listen socket connect setsockopt
     rt_sigaction rt_sigprocmask rt_sigreturn exit_group)a,
  default: :kill_process
)
:ok = Linx.Seccomp.install(c, filter)
:ok = Linx.Process.proceed(c)
After proceed/1, nginx runs with that exact syscall envelope.
A bug that tries execve(2) (not on the list) kills the process;
the kernel never enters do_execve.
Forward compatibility
Linx.Seccomp.Syscalls.from_number/2 returns :unknown for a syscall
number outside Linx's per-arch table rather than crashing, so decoding
a filter that references a newer syscall degrades gracefully.
Construction is strict the other way: an unknown syscall atom is
rejected at build time, since a typo must never silently widen a filter.
Per-argument matching (allow_if/3), multi-arch routing, and
SECCOMP_USER_NOTIF are deferred to future work.
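The asymmetry described above (lenient when decoding, strict when building) can be sketched with a tiny lookup table. The module and function names below are hypothetical and do not mirror the real Linx.Seccomp.Syscalls signatures; the five numbers are the x86_64 values for those syscalls.

```elixir
defmodule SyscallTableSketch do
  # A deliberately tiny excerpt of the x86_64 syscall table.
  @x86_64 %{read: 0, write: 1, close: 3, execve: 59, openat: 257}
  @by_number Map.new(@x86_64, fn {name, nr} -> {nr, name} end)

  # Decoding is lenient: a number outside the table maps to :unknown.
  def from_number(nr), do: Map.get(@by_number, nr, :unknown)

  # Building is strict: an atom outside the table is an error, so a typo
  # can never silently drop a rule from a filter.
  def to_number(name) do
    case Map.fetch(@x86_64, name) do
      {:ok, nr} -> {:ok, nr}
      :error -> {:error, {:unknown_syscall, name}}
    end
  end
end
```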
Summary
Types
arch()
An architecture atom. Linx v1 supports :x86_64 and :aarch64;
any other host arch yields :unsupported and the filter-build
verbs reject it.
Functions
allow_list/2
Build an allow-list filter: every listed syscall gets :allow,
every other syscall gets the default action.
arch/0
The current host architecture as an atom: :x86_64, :aarch64,
or :unsupported.
builder/0
Convenience for Linx.Seccomp.Builder.new/0: start an empty
builder pipeline.
deny_list/2
Build a deny-list filter: every listed syscall gets the deny action, every other syscall gets the default action.
from_rules/1
Build a filter from a normalised rules list (the data-layer API).
install/2
Install a compiled filter on a parked Linx.Process session.
supported?/0
Returns true iff the running kernel exposes seccomp filtering,
i.e. /proc/self/status contains a Seccomp: line.
to_rules/1
Inverse of from_rules/1: extract the rules list from a filter
Linx itself built.
Functions
@spec allow_list(Enumerable.t(), keyword()) :: {:ok, Linx.Seccomp.Filter.t()} | {:error, term()}
Build an allow-list filter: every listed syscall gets :allow,
every other syscall gets the default action.
Options:
:default is the action for non-listed syscalls. Defaults to :kill_process, because allow-lists are contracts ("I have enumerated what's safe"); a syscall outside the list is a bug or an attack and should fail loudly.
Errors
Same shape as from_rules/1. See its docs for the full list.
Examples
{:ok, filter} = Linx.Seccomp.allow_list(
  ~w(read write openat close exit_group)a,
  default: :kill_process
)
# Looser default, useful when the goal is to log unlisted
# syscalls for profiling rather than killing the workload.
{:ok, filter} = Linx.Seccomp.allow_list([:read, :write],
                                        default: :log)
@spec arch() :: arch()
The current host architecture as an atom — :x86_64, :aarch64,
or :unsupported.
Resolved on first call from
:erlang.system_info(:system_architecture) and cached in
:persistent_term for the rest of the VM's life (the host arch
can't change). Cheap on every subsequent call.
Examples
iex> Linx.Seccomp.arch() in [:x86_64, :aarch64, :unsupported]
true
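A minimal sketch of the resolve-once-then-cache pattern described above. The module name, the cache key, and the prefix matching are assumptions; the real implementation may classify the architecture string differently.

```elixir
defmodule ArchSketch do
  # Hypothetical cache key; any unique term works for :persistent_term.
  @key {__MODULE__, :arch}

  def arch do
    case :persistent_term.get(@key, nil) do
      nil ->
        # system_architecture is a charlist such as ~c"x86_64-pc-linux-gnu".
        resolved =
          :erlang.system_info(:system_architecture)
          |> to_string()
          |> classify()

        :persistent_term.put(@key, resolved)
        resolved

      cached ->
        cached
    end
  end

  # Classify by the leading CPU component of the target triple.
  def classify("x86_64" <> _rest), do: :x86_64
  def classify("aarch64" <> _rest), do: :aarch64
  def classify(_other), do: :unsupported
end
```

Reads from :persistent_term avoid copying the term, which is what makes the cached path cheap; writes are comparatively expensive, so the pattern suits values that are set once.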
@spec builder() :: Linx.Seccomp.Builder.t()
Convenience for Linx.Seccomp.Builder.new/0 — start an empty
builder pipeline.
Example
Linx.Seccomp.builder()
|> Linx.Seccomp.Builder.allow(:read)
|> Linx.Seccomp.Builder.deny(:ptrace)
|> Linx.Seccomp.Builder.build(default: :kill_process)
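To make the fluent shape concrete, here is a toy builder that accumulates rules and emits the {rules, default} pair. It is a sketch under assumptions, not the Linx.Seccomp.Builder implementation: in particular, the verdict that deny/2 attaches ({:errno, :eperm}, borrowed from the deny_list/2 default) and the build default are guesses.

```elixir
defmodule BuilderSketch do
  # Rules are kept newest-first and reversed once in build/2.
  defstruct rules: []

  def new, do: %__MODULE__{}

  def allow(%__MODULE__{} = b, syscall), do: add(b, :allow, syscall)

  # Assumed default deny verdict; the real builder may choose another.
  def deny(%__MODULE__{} = b, syscall, action \\ {:errno, :eperm}),
    do: add(b, action, syscall)

  # Emits the same {rules, default} shape that from_rules/1 is documented to accept.
  def build(%__MODULE__{rules: rules}, opts \\ []) do
    {Enum.reverse(rules), Keyword.get(opts, :default, :kill_process)}
  end

  defp add(b, action, syscall), do: %{b | rules: [{action, syscall} | b.rules]}
end
```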
@spec deny_list(Enumerable.t(), keyword()) :: {:ok, Linx.Seccomp.Filter.t()} | {:error, term()}
Build a deny-list filter: every listed syscall gets the deny action, every other syscall gets the default action.
Options:
:default is the action for non-listed syscalls. Defaults to :allow, since deny-lists are graceful-degradation shapes (Docker's default profile).
:deny_action is the action for listed syscalls. Defaults to {:errno, :eperm}.
Errors
Same shape as from_rules/1.
Examples
# Docker-style: deny the dangerous syscalls, allow the rest.
{:ok, filter} = Linx.Seccomp.deny_list(
  ~w(kexec_load init_module delete_module ptrace mount)a
)
# Same denies but with a sharper edge: kill instead of EPERM.
{:ok, filter} = Linx.Seccomp.deny_list(
  [:kexec_load, :init_module],
  deny_action: :kill_process
)
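Both list verbs can be read as sugar over the {rules, default} data shape. The sketch below shows one plausible expansion using the defaults documented above; the module and function names are hypothetical, and whether Linx expands them exactly this way is an assumption.

```elixir
defmodule ListSugarSketch do
  # Allow-list: every listed syscall is allowed, everything else
  # falls through to the default (documented default: :kill_process).
  def allow_rules(syscalls, opts \\ []) do
    default = Keyword.get(opts, :default, :kill_process)
    {Enum.map(syscalls, &{:allow, &1}), default}
  end

  # Deny-list: every listed syscall gets the deny action (documented
  # default: {:errno, :eperm}), everything else the default (:allow).
  def deny_rules(syscalls, opts \\ []) do
    default = Keyword.get(opts, :default, :allow)
    deny = Keyword.get(opts, :deny_action, {:errno, :eperm})
    {Enum.map(syscalls, &{deny, &1}), default}
  end
end
```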
@spec from_rules({[Linx.Seccomp.Filter.rule()], Linx.Seccomp.Filter.action()}) :: {:ok, Linx.Seccomp.Filter.t()} | {:error, term()}
Build a filter from a normalised rules list — the data-layer API.
Accepts {rules, default_action} where rules is a list of
{action, syscall_atom} tuples and default_action is the
fallthrough verdict. This is the seam that external consumers (a
seccomp.json adapter, custom DSLs, runtime policy) use to hand
fully-resolved policy to Linx: the consumer's job is "translate
JSON to this list shape"; Linx's job starts here.
The filter targets the current host architecture (see arch/0).
Filters built for one arch don't install on another; multi-arch
filters are deferred.
Returns
{:ok, %Linx.Seccomp.Filter{}} on success. The filter's :rules field carries the normalised {rules, default} so to_rules/1 can introspect it later.
{:error, {:unsupported_arch, arch}}: the host arch isn't in Linx's supported list (:x86_64, :aarch64).
{:error, {:bad_action, term}}: the default or one of the per-rule actions isn't a recognised verdict.
{:error, {:unknown_syscall, atom}}: a rule names a syscall atom that isn't in the per-arch table. See Linx.Seccomp.Syscalls "Extending this table" for how to add one.
{:error, {:duplicate_rule, atom}}: the same syscall appears in more than one rule.
{:error, {:bad_rule, term}}: an element of the rules list isn't a {action, syscall_atom} tuple.
{:error, %Linx.Seccomp.Error{operation: :build, errno: :e2big}}: the filter would need a jump of more than 255 instructions (jump-trampoline support is deferred; the current ~150-syscall table fits comfortably under this limit).
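A consumer can pre-validate a rules list against the error shapes listed above before handing it over. The sketch below covers three of them (:bad_action, :bad_rule, :duplicate_rule); it is not Linx's validator. The module name is hypothetical, and the set of recognised actions is limited to the ones this page mentions, which may be narrower than what Linx accepts. The per-arch :unknown_syscall check is left out.

```elixir
defmodule RuleCheckSketch do
  # Assumption: only the plain actions named on this page.
  @known_actions [:allow, :log, :kill_process]

  def check({rules, default}) when is_list(rules) do
    with :ok <- check_action(default) do
      rules
      |> Enum.reduce_while({:ok, MapSet.new()}, fn
        {action, syscall}, {:ok, seen} when is_atom(syscall) ->
          cond do
            check_action(action) != :ok -> {:halt, {:error, {:bad_action, action}}}
            MapSet.member?(seen, syscall) -> {:halt, {:error, {:duplicate_rule, syscall}}}
            true -> {:cont, {:ok, MapSet.put(seen, syscall)}}
          end

        other, _acc ->
          # Anything that is not an {action, syscall_atom} tuple.
          {:halt, {:error, {:bad_rule, other}}}
      end)
      |> case do
        {:ok, _seen} -> :ok
        error -> error
      end
    end
  end

  defp check_action(action) when action in @known_actions, do: :ok
  defp check_action({:errno, code}) when is_atom(code), do: :ok
  defp check_action(other), do: {:error, {:bad_action, other}}
end
```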
Examples
rules = [
  {:allow, :read},
  {:allow, :write},
  {{:errno, :eperm}, :ptrace},
  {:kill_process, :kexec_load}
]
{:ok, filter} = Linx.Seccomp.from_rules({rules, :allow})
# Errors are caller-actionable atoms:
Linx.Seccomp.from_rules({[{:allow, :not_a_real_syscall}], :allow})
# => {:error, {:unknown_syscall, :not_a_real_syscall}}
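The consumer side of this seam might look like the following sketch, which turns an already-decoded Docker-style profile map into the {rules, default} shape. Everything here is illustrative: the module is hypothetical, only the "defaultAction" and "syscalls" / "names" / "action" keys are handled, only four SCMP_ACT_* names are mapped, and SCMP_ACT_ERRNO is assumed to mean EPERM. JSON decoding is left to whatever library the consumer uses.

```elixir
defmodule DockerProfileSketch do
  # Assumed mapping from profile action names to the verdict atoms
  # that appear on this page. Unmapped names are reported, not guessed.
  @actions %{
    "SCMP_ACT_ALLOW" => :allow,
    "SCMP_ACT_ERRNO" => {:errno, :eperm},
    "SCMP_ACT_KILL_PROCESS" => :kill_process,
    "SCMP_ACT_LOG" => :log
  }

  def to_rules(%{"defaultAction" => default, "syscalls" => groups}) do
    with {:ok, default_action} <- action(default),
         {:ok, rules} <- collect(groups, []) do
      {:ok, {rules, default_action}}
    end
  end

  defp collect([], acc), do: {:ok, Enum.reverse(acc)}

  defp collect([%{"names" => names, "action" => name} | rest], acc) do
    with {:ok, act} <- action(name) do
      # String.to_atom/1 is acceptable for a trusted profile; untrusted
      # input should be checked against a known syscall list first.
      new = for n <- names, do: {act, String.to_atom(n)}
      collect(rest, Enum.reverse(new) ++ acc)
    end
  end

  defp action(name) do
    case Map.fetch(@actions, name) do
      {:ok, act} -> {:ok, act}
      :error -> {:error, {:bad_action, name}}
    end
  end
end
```

The resulting tuple is what a caller would then pass to from_rules/1, which still performs its own validation.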
@spec install(Linx.Process.t(), Linx.Seccomp.Filter.t()) :: :ok | {:error, :not_ready | :running | :no_process}
Install a compiled filter on a parked Linx.Process session.
Checkpoint-bound — the same shape as
Linx.Capabilities.drop_bounding/2. The kernel forbids
cross-thread seccomp(2), so the child agent in linx_process.c
does the actual install at the checkpoint window before
execve.
The kernel only accepts a filter from a thread that has
PR_SET_NO_NEW_PRIVS set or holds CAP_SYS_ADMIN. If NNP isn't
already on (typically because the caller didn't pass
no_new_privs: true to Linx.Process.spawn/1), the agent sets it
automatically before the seccomp(2) call: the "be helpful" path.
Callers who want the principled posture should still pass the
spawn opt; the auto-set is just a fallback so an unprivileged
caller who forgot doesn't get a confusing EACCES.
Errors
{:error, :not_ready}: the session hasn't reached the checkpoint yet. Wait for {:linx_process, :ready, _} first.
{:error, :running}: past proceed/1, the child has execve'd; installing now is too late.
{:error, :no_process}: the session emitted its terminal event.
Kernel-level install failures arrive asynchronously as
{:linx_process, :error, errno, :seccomp_install} or
{:linx_process, :error, errno, :seccomp_no_new_privs} on the
session's owner mailbox, the same shape as other pre-execve
failures.
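Because kernel-level failures arrive as messages rather than as the return value of install/2, the owner process has to look in its mailbox. A minimal sketch of such a check follows; the module name, the timeout, and the returned tuple shape are assumptions, and only the two message shapes quoted above are matched.

```elixir
defmodule InstallWatchSketch do
  # The two stages this page documents for seccomp install failures.
  @stages [:seccomp_install, :seccomp_no_new_privs]

  # Must run in the session's owner process. Returns :ok if no failure
  # message shows up within `timeout` ms; that is only evidence that
  # nothing has failed so far, not proof that the install succeeded.
  def await_install_error(timeout \\ 100) do
    receive do
      {:linx_process, :error, errno, stage} when stage in @stages ->
        {:error, {stage, errno}}
    after
      timeout -> :ok
    end
  end
end
```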
Examples
{:ok, c} = Linx.Process.spawn(argv: ["/usr/sbin/nginx"],
                              no_new_privs: true)
receive do {:linx_process, :ready, _} -> :ok end
{:ok, filter} = Linx.Seccomp.allow_list(~w(read write …)a)
:ok = Linx.Seccomp.install(c, filter)
:ok = Linx.Process.proceed(c)
@spec supported?() :: boolean()
Returns true iff the running kernel exposes seccomp filtering —
i.e. /proc/self/status contains a Seccomp: line.
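True on any kernel new enough to report that line (Linux 3.8 and
later, built with seccomp support); filter mode itself dates from
Linux 3.5. In practice that covers every kernel Linx targets.
Useful as a precondition guard in setup checks; this module's
build verbs don't gate on it themselves (a missing line would
manifest as an install-time ENOSYS from the agent).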
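The probe described above amounts to a line scan of /proc/self/status. A minimal sketch, assuming a hypothetical module name and treating an unreadable file (for example on a non-Linux host) as "not supported":

```elixir
defmodule SupportProbeSketch do
  # The path is a parameter only so the function can be exercised
  # against a missing file.
  def supported?(path \\ "/proc/self/status") do
    case File.read(path) do
      {:ok, text} -> has_seccomp_line?(text)
      {:error, _reason} -> false
    end
  end

  # True when some line starts with the "Seccomp:" field name.
  def has_seccomp_line?(text) do
    text
    |> String.split("\n")
    |> Enum.any?(&String.starts_with?(&1, "Seccomp:"))
  end
end
```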
@spec to_rules(Linx.Seccomp.Filter.t()) :: {:ok, {[Linx.Seccomp.Filter.rule()], Linx.Seccomp.Filter.action()}} | {:error, :no_rules}
Inverse of from_rules/1 — extract the rules list from a filter
Linx itself built.
Filters whose :rules field is nil (which would arise from a
consumer path that loads externally-supplied raw BPF blobs)
return {:error, :no_rules}. The current build verbs always
populate :rules, so this is reliable for any filter Linx
itself produced.
Examples
iex> {:ok, f} = Linx.Seccomp.allow_list([:read, :write])
iex> {:ok, {rules, default}} = Linx.Seccomp.to_rules(f)
iex> rules
[{:allow, :read}, {:allow, :write}]
iex> default
:kill_process