Linx.User (Linx v0.1.0)

Linux user-namespace configuration primitives — /proc/<pid>/uid_map, /proc/<pid>/gid_map, /proc/<pid>/setgroups.

Why a separate subsystem

User namespaces are a coherent kernel concept (per-namespace uid/gid mappings + capability translation) with their own procfs surface for configuration. Linx.Process creates user namespaces via clone(CLONE_NEWUSER); what the workload's identity looks like inside that namespace — root vs unprivileged, mapped vs the kernel-default "nobody" — is configured by writing the mapping files this module wraps.

procfs is the API

Every operation here is plain File.read/1 / File.write/2 against:

  • /proc/<pid>/uid_map: write-once user-id mapping
  • /proc/<pid>/gid_map: write-once group-id mapping
  • /proc/<pid>/setgroups: "allow" / "deny" gate

No NIF, no Port, no setns(2) dance — the kernel handles all the namespace targeting based on the path. The write-once semantics are a kernel rule, not a Linx choice: once a map has been set for a user ns, subsequent writes return EPERM.
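
As a rough illustration of "procfs is the API", the sketch below renders mapping tuples into one line per entry and hands the whole text to a single File.write/2 call. MapRenderSketch, render/1, write_map/4 and the proc_root parameter are hypothetical names introduced here, not part of Linx; proc_root exists only so the code can be pointed at a scratch directory.

```elixir
defmodule MapRenderSketch do
  # Hypothetical helper, not part of Linx: renders {inside, outside, length}
  # tuples into one "inside outside length" line per entry.
  def render(mappings) do
    Enum.map_join(mappings, "", fn {inside, outside, length} ->
      "#{inside} #{outside} #{length}\n"
    end)
  end

  # Hands the whole rendered map to one File.write/2 call.
  # `proc_root` is an assumption of this sketch; it defaults to "/proc".
  def write_map(pid, file, mappings, proc_root \\ "/proc")
      when file in ["uid_map", "gid_map"] do
    path = Path.join([proc_root, Integer.to_string(pid), file])
    File.write(path, render(mappings))
  end
end
```

For the canonical rootless mapping, MapRenderSketch.render([{0, 1000, 1}]) yields "0 1000 1\n".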

No :in option

Unlike Linx.Mount (where the syscall must be called from inside the target's mount namespace), uid/gid map writes happen via the host's view of procfs. Verbs take a pid as their first argument — the target child's host pid, typically obtained from {:linx_process, :ready, host_pid}.

The setgroups order

When an unprivileged caller (no CAP_SETGID in the parent user ns) writes gid_map, the kernel requires that /proc/<pid>/setgroups already contain "deny"; otherwise the gid_map write fails with EPERM. deny_setgroups/1 is the primitive; the setup_maps/2 convenience performs both steps in the right order automatically.
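
The ordering can be sketched with plain file writes. This is a minimal illustration under stated assumptions, not the Linx implementation: SetgroupsOrderSketch and its proc_root parameter are hypothetical, and proc_root is only there so the sequence can be exercised against a scratch directory instead of the real /proc.

```elixir
defmodule SetgroupsOrderSketch do
  # Hypothetical sketch. `proc_root` defaults to "/proc" and is a parameter
  # only so the sequence can be tried without touching a real process.
  def write_gid_map_unprivileged(pid, gid_map_text, proc_root \\ "/proc") do
    dir = Path.join(proc_root, Integer.to_string(pid))

    # "deny" goes to setgroups first; the gid_map write is attempted only
    # if that succeeded. The first failing write is returned unchanged.
    with :ok <- File.write(Path.join(dir, "setgroups"), "deny"),
         :ok <- File.write(Path.join(dir, "gid_map"), gid_map_text) do
      :ok
    end
  end
end
```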

Composition with Linx.Process

The canonical rootless flow:

{:ok, c} = Linx.Process.spawn(
             argv: ["/bin/bash"],
             namespaces: [:user, :mount, :pid, :uts, :ipc],
             stdio: :pty)

host_pid = receive do {:linx_process, :ready, p} -> p end

# "root inside ↔ me outside" -- the canonical rootless mapping.
:ok = Linx.User.deny_setgroups(host_pid)
:ok = Linx.User.set_uid_map(host_pid, [{0, my_host_uid, 1}])
:ok = Linx.User.set_gid_map(host_pid, [{0, my_host_gid, 1}])

:ok = Linx.Process.proceed(c)

Linx.Process has zero awareness of user-namespace mappings; the checkpoint between :ready and proceed/1 is the only coupling, exactly the way Linx.Netlink / Linx.Cgroup / Linx.Mount integration works.
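
The flow above leaves my_host_uid and my_host_gid undefined. One way to obtain them, assuming id(1) is available on PATH, is sketched below; HostIdSketch and host_id/1 are hypothetical names, not part of Linx.

```elixir
defmodule HostIdSketch do
  # Hypothetical helper: asks id(1) for the caller's effective uid ("-u")
  # or effective gid ("-g") and parses the decimal output.
  def host_id(flag) when flag in ["-u", "-g"] do
    {out, 0} = System.cmd("id", [flag])
    out |> String.trim() |> String.to_integer()
  end
end

my_host_uid = HostIdSketch.host_id("-u")
my_host_gid = HostIdSketch.host_id("-g")
```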

Forward compatibility

read_uid_map/1 / read_gid_map/1 parse the map files defensively — a line that isn't three non-negative integers is skipped rather than crashing the read.
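
A defensive parse of this kind might look like the sketch below. It is an illustration only: MapParseSketch is a hypothetical module, and it returns plain {inside, outside, length} tuples rather than the %Linx.User.Map{} structs the real functions produce.

```elixir
defmodule MapParseSketch do
  # Hypothetical parser, not the Linx implementation.
  def parse(contents) do
    contents
    |> String.split("\n", trim: true)
    |> Enum.flat_map(fn line ->
      # String.split/1 splits on runs of whitespace, so the column padding
      # the kernel emits is tolerated.
      with [a, b, c] <- String.split(line),
           {inside, ""} <- Integer.parse(a),
           {outside, ""} <- Integer.parse(b),
           {length, ""} <- Integer.parse(c),
           true <- inside >= 0 and outside >= 0 and length >= 0 do
        [{inside, outside, length}]
      else
        # Any line that is not three non-negative integers is skipped.
        _ -> []
      end
    end)
  end
end
```

With this sketch, an unexpected line in the middle of the file is dropped and the surrounding entries are still returned.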

Summary

Types

mapping()

One mapping entry: {inside_id, outside_id, length}. All three are non-negative integers; length > 0.

pid_target()

Host pid of a target process. Typically the value carried in {:linx_process, :ready, host_pid} from a Linx.Process session spawned with the :user namespace.

Functions

deny_setgroups(pid)

Writes "deny" to /proc/<pid>/setgroups.

read_gid_map(pid)

Reads and parses /proc/<pid>/gid_map into a list of %Linx.User.Map{} entries. Same shape as read_uid_map/1.

read_uid_map(pid)

Reads and parses /proc/<pid>/uid_map into a list of %Linx.User.Map{} entries.

set_gid_map(pid, mappings)

Writes a gid mapping to /proc/<pid>/gid_map. Same shape and write-once semantics as set_uid_map/2.

set_uid_map(pid, mappings)

Writes a uid mapping to /proc/<pid>/uid_map.

setup_maps(pid, opts)

Applies the canonical map-setup sequence in one call: deny_setgroups/1 → set_uid_map/2 → set_gid_map/2.

supported?()

Returns true iff user namespaces are configurable on this host.

Types

mapping()

@type mapping() :: {non_neg_integer(), non_neg_integer(), pos_integer()}

One mapping entry: {inside_id, outside_id, length} — all non-negative integers; length > 0.
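
To make the tuple concrete: an entry covers length consecutive ids, starting at inside_id in the namespace and at outside_id on the host. The sketch below translates an inside id to its host-side id; MappingTranslateSketch and to_outside/2 are hypothetical names used for illustration only.

```elixir
defmodule MappingTranslateSketch do
  # Hypothetical helper: returns {:ok, host_id} for an id covered by one of
  # the entries, or :unmapped when no entry covers it.
  def to_outside(mappings, inside_id) do
    Enum.find_value(mappings, :unmapped, fn {inside, outside, length} ->
      if inside_id >= inside and inside_id < inside + length do
        {:ok, outside + (inside_id - inside)}
      end
    end)
  end
end
```

With [{0, 0, 1}, {1, 100_000, 65_536}], inside id 1 translates to 100000 and inside id 65536 to 165535; 65537 is unmapped.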

pid_target()

@type pid_target() :: pos_integer()

Host pid of a target process. Typically the value carried in {:linx_process, :ready, host_pid} from a Linx.Process session spawned with the :user namespace.

Functions

deny_setgroups(pid)

@spec deny_setgroups(pid_target()) :: :ok | {:error, Linx.User.Error.t()}

Writes "deny" to /proc/<pid>/setgroups.

Required before set_gid_map/2 for unprivileged callers (no CAP_SETGID in the parent user ns): the kernel rejects the gid_map write otherwise. Privileged callers may skip it. Re-writing "deny" over an already-denied setgroups returns :ok, so the call is idempotent as long as gid_map has not been written yet.

Common failure modes:

  • {:error, %Linx.User.Error{errno: :enoent}}: the target pid no longer exists.
  • {:error, %Linx.User.Error{errno: :eperm}}: the write came at the wrong moment, for example after gid_map has already been written for this namespace (rare in a correctly ordered setup).

read_gid_map(pid)

@spec read_gid_map(pid_target()) ::
  {:ok, [Linx.User.Map.t()]} | {:error, Linx.User.Error.t()}

Reads and parses /proc/<pid>/gid_map into a list of %Linx.User.Map{} entries. Same shape as read_uid_map/1.

read_uid_map(pid)

@spec read_uid_map(pid_target()) ::
  {:ok, [Linx.User.Map.t()]} | {:error, Linx.User.Error.t()}

Reads and parses /proc/<pid>/uid_map into a list of %Linx.User.Map{} entries.

A user namespace whose maps haven't been written yet returns {:ok, []} — the file exists but is empty.

Examples

iex> Linx.User.read_uid_map(host_pid)
{:ok, [#Linx.User.Map<0 -> 1000>]}

iex> Linx.User.read_uid_map(host_pid)  # multi-range
{:ok, [
  #Linx.User.Map<0 -> 0>,
  #Linx.User.Map<1..65536 -> 100000..165535>
]}

set_gid_map(pid, mappings)

@spec set_gid_map(pid_target(), [mapping()]) ::
  :ok | {:error, Linx.User.Error.t() | {:bad_map, term()}}

Writes a gid mapping to /proc/<pid>/gid_map. Same shape and write-once semantics as set_uid_map/2.

Unprivileged callers must call deny_setgroups/1 first — the kernel returns EPERM otherwise. The Linx.User.setup_maps/2 convenience does this sequence automatically.

set_uid_map(pid, mappings)

@spec set_uid_map(pid_target(), [mapping()]) ::
  :ok | {:error, Linx.User.Error.t() | {:bad_map, term()}}

Writes a uid mapping to /proc/<pid>/uid_map.

mappings is a non-empty list of {inside_id, outside_id, length} tuples of non-negative integers (with length > 0). The kernel expects one line per entry, and the whole map must arrive in a single write(2), so the list is rendered and written as one blob.

The map is write-once per user namespace: a second call returns EPERM. Plan your mapping fully before calling.

Examples

# The canonical rootless "root inside ↔ me outside" mapping:
:ok = Linx.User.set_uid_map(host_pid, [{0, my_uid, 1}])

# A multi-range map: host root plus a subordinate range (needs
# CAP_SETUID in the parent user ns, or newuidmap(1)):
:ok = Linx.User.set_uid_map(host_pid,
  [{0, 0, 1}, {1, 100_000, 65_536}])

Errors

  • {:error, {:bad_map, reason}} — caller-side input mistake (not a list, wrong-arity tuple, negative id, zero length).
  • {:error, %Linx.User.Error{}} — kernel-level rejection. Common: :eperm (write-once already done; map too broad for an unprivileged caller), :einval (overlapping or invalid range), :enoent (target pid is gone).
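
The caller-side checks behind {:bad_map, reason} could be sketched as below. This is an assumption-laden illustration: MapValidateSketch is hypothetical, and the reason terms (:invalid_entry, :not_a_non_empty_list) are invented for the example and may not match what Linx actually returns.

```elixir
defmodule MapValidateSketch do
  # Hypothetical validator; the reason terms are illustrative, not Linx's.
  def validate(mappings) when is_list(mappings) and mappings != [] do
    Enum.reduce_while(mappings, :ok, fn entry, :ok ->
      case entry do
        {i, o, l}
        when is_integer(i) and is_integer(o) and is_integer(l) and
               i >= 0 and o >= 0 and l > 0 ->
          {:cont, :ok}

        other ->
          # Wrong arity, negative id, zero length, or a non-tuple entry.
          {:halt, {:error, {:bad_map, {:invalid_entry, other}}}}
      end
    end)
  end

  # Not a list, or an empty list.
  def validate(other), do: {:error, {:bad_map, {:not_a_non_empty_list, other}}}
end
```

Checks like these only catch shape mistakes; overlapping ranges or maps too broad for the caller's privileges still surface as kernel-level errors.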

setup_maps(pid, opts)

@spec setup_maps(
  pid_target(),
  keyword()
) ::
  :ok
  | {:error,
     Linx.User.Error.t() | {:bad_map, term()} | {:bad_setgroups, term()}}

Applies the canonical map-setup sequence in one call: deny_setgroups/1 → set_uid_map/2 → set_gid_map/2.

Options

  • :uid (required) — mappings list for uid_map; same shape as set_uid_map/2.
  • :gid (required) — mappings list for gid_map.
  • :setgroups (default :deny) — whether to write "deny" to /proc/<pid>/setgroups before the gid_map write. :skip is for privileged callers who don't need the kernel's setgroups gate.

Returns :ok if every step succeeded, or the first error encountered (with the failing step's :operation):

:ok                                                 # everything worked
{:error, %Linx.User.Error{operation: :deny_setgroups, ...}}
{:error, %Linx.User.Error{operation: :set_uid_map, ...}}
{:error, %Linx.User.Error{operation: :set_gid_map, ...}}
{:error, {:bad_map, _}}                             # bad uid/gid input
{:error, {:bad_setgroups, value}}                   # bad :setgroups opt

Steps that ran successfully before a later step failed are not rolled back — the kernel's write-once semantics mean uid_map / gid_map can't be undone, and deny_setgroups is idempotent anyway. The error tells you exactly where the sequence stopped.
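
A caller might turn these result shapes into a log message along the following lines. The sketch assumes the error struct exposes the :operation and :errno keys shown on this page; SetupResultSketch and describe/1 are hypothetical. It uses bare map patterns, which also match structs, so it does not depend on Linx being compiled.

```elixir
defmodule SetupResultSketch do
  # Hypothetical helper for reporting where the sequence stopped.
  def describe(:ok), do: "maps configured"

  def describe({:error, {:bad_map, reason}}),
    do: "bad mapping input: #{inspect(reason)}"

  def describe({:error, {:bad_setgroups, value}}),
    do: "bad :setgroups option: #{inspect(value)}"

  # Assumes the error carries both :operation and :errno.
  def describe({:error, %{operation: op, errno: errno}}),
    do: "#{op} failed with #{errno}"
end
```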

Example

:ok = Linx.User.setup_maps(host_pid,
  uid: [{0, my_uid, 1}],
  gid: [{0, my_gid, 1}]
)

supported?()

@spec supported?() :: boolean()

Returns true iff user namespaces are configurable on this host.

Canonical check: /proc/self/uid_map exists, which is true on every Linux kernel ≥ 3.8 built with CONFIG_USER_NS=y (the default in most mainstream distribution kernels).
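
The check described above amounts to a file-existence test. A minimal stand-in, with the hypothetical module SupportedSketch and a proc_root parameter added only for experimentation, could read:

```elixir
defmodule SupportedSketch do
  # Hypothetical stand-in for the check described above.
  # `proc_root` is an assumption of this sketch and defaults to "/proc".
  def supported?(proc_root \\ "/proc") do
    File.exists?(Path.join([proc_root, "self", "uid_map"]))
  end
end
```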