Linx.Mount (Linx v0.1.0)

Linux filesystem-mount primitives — mount(2), umount2(2), pivot_root(2), and the read-side /proc/.../mountinfo parser.

Why a separate subsystem

Mounts are a coherent kernel concept (the filesystem hierarchy a process sees) with their own syscalls, their own configuration via /proc/.../mountinfo, and per-namespace semantics that compose cleanly with Linx.Process's :mount namespace. Like Linx.Cgroup, mount primitives are useful even outside the cloned-child case — bind-mounting host paths, propagating mount changes between namespaces, debugging mount tables.

The classic mount API

Linx wraps the classic syscalls — mount(2), umount2(2), pivot_root(2) — not the newer fsopen/fsmount/move_mount family (Linux ≥ 5.2). The classic calls are universally documented, map one-to-one onto the tools operators already know, and are single-shot calls on the calling thread (no fork), so a NIF wraps them safely. The fd-based API is deferred to a future revision.

Cross-namespace via :in

Every mutating verb takes an :in option naming the mount namespace to operate on:

  • :self (default) — the BEAM's mount namespace.
  • {:pid, n} — the mount namespace of pid n.
  • {:path, p} — an explicit path to a namespace file (typically /proc/<n>/ns/mnt).

The mechanism is the same throwaway-thread + setns(2) trick Linx.Netlink uses for opening sockets in another netns. It works for any process whose namespace files exist — parked at a Linx.Process checkpoint, fully running after proceed/1, or any other live pid. The :in option is lifecycle-agnostic.
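As a rough illustration of how the three :in shapes might be normalised before any setns(2) call, here is a minimal sketch. The module name InSketch, the function ns_target/1, and the :current marker are hypothetical and not part of Linx; only the three input shapes and the {:bad_in, term} error shape come from this page.

```elixir
defmodule InSketch do
  @moduledoc "Hypothetical helper, not part of Linx."

  # :self needs no namespace switch; :current is an invented marker for that case.
  def ns_target(:self), do: {:ok, :current}

  # {:pid, n} names pid n's mount-namespace file under /proc.
  def ns_target({:pid, n}) when is_integer(n) and n > 0,
    do: {:ok, "/proc/#{n}/ns/mnt"}

  # {:path, p} is used verbatim as the namespace file to open.
  def ns_target({:path, p}) when is_binary(p), do: {:ok, p}

  # Anything else mirrors the {:bad_in, term} error shape from the specs.
  def ns_target(other), do: {:error, {:bad_in, other}}
end
```

Under these assumptions, InSketch.ns_target({:pid, 42}) yields the path that a worker thread would open and pass to setns(2).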

Composition with Linx.Process

Mount /proc inside a child's fresh :mount namespace at the checkpoint, then proceed:

{:ok, c} = Linx.Process.spawn(argv: ["/bin/bash"], namespaces: [:mount, :pid])
host_pid = receive do {:linx_process, :ready, p} -> p end
:ok = Linx.Mount.mount("proc", "/proc", "proc", in: {:pid, host_pid})
:ok = Linx.Process.proceed(c)

The same call works post-proceed/1 against a running container for hot-mounting volumes or remounting paths.

Forward compatibility

list/0 and list/1 parse /proc/.../mountinfo defensively: a line that doesn't match the expected shape, or that carries an optional-field tag Linx doesn't recognise, is silently skipped rather than crashing the whole parse. A future kernel that adds optional fields therefore can't break a mount-table read.
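The skip-on-mismatch behaviour described above can be sketched as follows. This is an illustrative parser, not Linx's implementation: the module name MountinfoSketch, the plain-map result, and the list of known optional-field tags are assumptions, and the real Linx.Mount.Entry struct may carry different fields.

```elixir
defmodule MountinfoSketch do
  @moduledoc "Illustrative defensive mountinfo parser; not Linx's code."

  # Optional-field tags documented in proc(5); assumed to be the recognised set.
  @known_tags ~w(shared master propagate_from unbindable)

  # Parses a whole mountinfo document, dropping every line that does not fit.
  def parse(text) do
    text
    |> String.split("\n", trim: true)
    |> Enum.flat_map(fn line ->
      case parse_line(line) do
        {:ok, entry} -> [entry]
        :skip -> []
      end
    end)
  end

  # Returns {:ok, map} for a well-formed line and :skip for anything else.
  def parse_line(line) do
    with [pre, post] <- String.split(line, " - ", parts: 2),
         [id_s, parent_s, _dev, root, mount_point, opts | optional] <-
           String.split(pre, " ", trim: true),
         [fstype, source, super_opts | _] <- String.split(post, " ", trim: true),
         {id, ""} <- Integer.parse(id_s),
         {parent, ""} <- Integer.parse(parent_s),
         true <- Enum.all?(optional, &known_optional?/1) do
      {:ok,
       %{
         id: id,
         parent: parent,
         root: root,
         mount_point: mount_point,
         options: String.split(opts, ","),
         optional: optional,
         fstype: fstype,
         source: source,
         super_options: String.split(super_opts, ",")
       }}
    else
      _ -> :skip
    end
  end

  # A field is known if its tag (the part before ":") is in the known set.
  defp known_optional?(field) do
    [tag | _] = String.split(field, ":", parts: 2)
    tag in @known_tags
  end
end
```

Every failed match in the with chain falls through to :skip, so one malformed or unfamiliar line costs only that entry.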

Types

list_target()

@type list_target() :: {:pid, pos_integer()} | {:path, Path.t()}

Target of a list/1 call — either a pid (reads /proc/<pid>/mountinfo) or an explicit path to a mountinfo file.

Functions

bind(source, target, opts \\ [])

@spec bind(String.t(), String.t(), keyword()) ::
  :ok | {:error, Linx.Mount.Error.t() | {:bad_flag, atom()} | {:bad_in, term()}}

Bind-mounts source at target — makes the contents of source visible at target as well, like a hardlink for directories.

Equivalent to mount/4 with flags: [:bind | user_flags] and an empty fstype. The kernel ignores fstype for bind mounts; the filesystem is whatever already lives at source.

Options

  • :flags — extra flag atoms to OR with :bind. Useful values:
    • :rec — recursive bind, descending into any submounts underneath source.
    • :ro — read-only at the target. Combining :bind and :ro on the initial call still creates a read-write mount, because the kernel ignores :ro on the initial bind; apply it with a follow-up remount/2 instead (Linux ≥ 2.6.26).
  • :data — filesystem-specific options string (rare for bind mounts).
  • :create — create an empty file at target before binding if it's missing (see mount/4). Intended for binding device nodes (/dev/null, …) onto a freshly-mounted /dev tmpfs.
  • :in — the target mount namespace (see mount/4).

Returns :ok or {:error, %Linx.Mount.Error{operation: :mount}}.

list()

@spec list() :: {:ok, [Linx.Mount.Entry.t()]} | {:error, atom()}

Returns the BEAM's mount table by parsing /proc/self/mountinfo.

Returns {:ok, [%Linx.Mount.Entry{}, ...]} on success or {:error, posix_atom} if the file can't be read (extremely unusual on a healthy host).

list(arg)

@spec list(list_target()) :: {:ok, [Linx.Mount.Entry.t()]} | {:error, atom()}

Returns the mount table for target's mount namespace.

The argument is a list_target(): either {:pid, n} (reads /proc/<n>/mountinfo) or {:path, p} (reads p directly, typically an already-constructed path such as /proc/<n>/mountinfo).

Returns {:ok, [%Linx.Mount.Entry{}, ...]} or {:error, posix_atom}; common failures: :enoent (pid no longer exists), :eacces (BEAM can't read that pid's /proc).

Note that list/1 does not enter the target's mount namespace via setns — it just reads the target's mountinfo file from the BEAM's namespace, which is sufficient. The mutating verbs (which do need setns) are the ones that operate on a separate throwaway thread.

mount(source, target, fstype, opts \\ [])

@spec mount(String.t(), String.t(), String.t(), keyword()) ::
  :ok | {:error, Linx.Mount.Error.t() | {:bad_flag, atom()} | {:bad_in, term()}}

Mounts source at target with filesystem type fstype.

Options

  • :flags — a list of flag atoms (see the table below). Mapped to the OR'd MS_* integer the kernel expects.
  • :data — a filesystem-specific options string (e.g. "size=64M,mode=755" for tmpfs). Defaults to "".
  • :create — when true, create an empty file at target (inside the target namespace) before mounting, if it doesn't already exist. Intended for device-node bind mounts onto a fresh /dev tmpfs, where the placeholder must live on the tmpfs itself. Defaults to false.

Flag atoms

| atom | MS_* constant | Notes |
| --- | --- | --- |
| :ro | MS_RDONLY | |
| :nosuid | MS_NOSUID | |
| :nodev | MS_NODEV | |
| :noexec | MS_NOEXEC | |
| :sync | MS_SYNCHRONOUS | |
| :remount | MS_REMOUNT | driven by remount/2 |
| :mandlock | MS_MANDLOCK | |
| :dirsync | MS_DIRSYNC | |
| :noatime | MS_NOATIME | |
| :nodiratime | MS_NODIRATIME | |
| :bind | MS_BIND | driven by bind/3 |
| :move | MS_MOVE | driven by move/2 |
| :rec | MS_REC | recursive variant |
| :silent | MS_SILENT | |
| :private | MS_PRIVATE | propagation |
| :shared | MS_SHARED | propagation |
| :slave | MS_SLAVE | propagation |
| :unbindable | MS_UNBINDABLE | propagation |
| :relatime | MS_RELATIME | |
| :strictatime | MS_STRICTATIME | |
| :lazytime | MS_LAZYTIME | |

Returns :ok or {:error, %Linx.Mount.Error{operation: :mount}} on failure. Common errnos: :eperm (no CAP_SYS_ADMIN), :enoent (source or target missing), :einval (incompatible flags), :ebusy (target is busy), :enodev (unknown fstype).
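To make the "flag atoms mapped to an OR'd MS_* integer" step concrete, here is a minimal sketch. The module name FlagSketch and the function to_integer/1 are hypothetical, and only a subset of the flag catalog is included; the numeric values are the ones defined in Linux's <sys/mount.h>. How Linx actually performs the mapping is not shown on this page.

```elixir
defmodule FlagSketch do
  @moduledoc "Hypothetical flag mapper; not Linx's implementation."
  import Bitwise

  # Subset of the MS_* constants from <sys/mount.h>.
  @flags %{
    ro: 0x1,
    nosuid: 0x2,
    nodev: 0x4,
    noexec: 0x8,
    remount: 0x20,
    bind: 0x1000,
    move: 0x2000,
    rec: 0x4000,
    private: 0x40000
  }

  # ORs the bits for each atom; stops at the first atom that is not in the map,
  # mirroring the {:bad_flag, atom} error shape from the specs.
  def to_integer(atoms) when is_list(atoms) do
    Enum.reduce_while(atoms, {:ok, 0}, fn atom, {:ok, acc} ->
      case Map.fetch(@flags, atom) do
        {:ok, bit} -> {:cont, {:ok, acc ||| bit}}
        :error -> {:halt, {:error, {:bad_flag, atom}}}
      end
    end)
  end
end
```

With this sketch, a recursive bind ([:bind, :rec]) becomes MS_BIND ||| MS_REC, and an unknown atom is rejected before any syscall would be attempted.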

Cross-namespace

The :in option chooses which mount namespace to operate on:

  • :self (default) — the BEAM's own mount namespace.
  • {:pid, n} — pid n's mount namespace (reads /proc/<n>/ns/mnt). Works whether n is parked at a Linx.Process checkpoint or fully running.
  • {:path, p} — an explicit path to a namespace file.
:ok = Linx.Mount.mount("proc", "/proc", "proc", in: {:pid, host_pid})

proc and the PID namespace

A proc filesystem is bound to the PID namespace of the task that mounts it, not to the mount namespace. When fstype is "proc" and :in is {:pid, n}, mount/4 therefore also enters pid n's PID namespace and performs the mount from a forked child, because setns(2) on a PID namespace only takes effect for children created afterwards. The mounted /proc then reflects the container's processes rather than the host's, with no extra option needed. (For :self or {:path, _}, the caller's PID namespace is used.)

Cross-namespace failures surface with stage-tagged operations in %Linx.Mount.Error{}: :open_ns / :setns / :thread, plus :create (the :create placeholder) and :open_pidns / :setns_pid / :pipe / :fork (the proc pidns path). See Linx.Mount.Error's @moduledoc.
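A caller may want to group these stage tags when logging or retrying. The sketch below matches only on the operation field, the one field of the error struct shown on this page, so it accepts any map or struct carrying that key. The module name ErrorStageSketch and the returned stage names are invented for illustration.

```elixir
defmodule ErrorStageSketch do
  @moduledoc "Hypothetical classifier for the operation tags listed above."

  # Failures while switching into the target mount namespace.
  def stage(%{operation: op}) when op in [:open_ns, :setns, :thread],
    do: :enter_namespace

  # Failure while creating the :create placeholder file.
  def stage(%{operation: :create}), do: :create_placeholder

  # Failures on the proc-specific PID-namespace path.
  def stage(%{operation: op}) when op in [:open_pidns, :setns_pid, :pipe, :fork],
    do: :proc_pidns

  # The mount(2) call itself failed.
  def stage(%{operation: :mount}), do: :syscall

  # Anything not listed on this page.
  def stage(_), do: :unknown
end
```

A result of :enter_namespace usually points at the target pid (gone, or not readable), while :syscall points at the mount arguments themselves.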

move(source, target, opts \\ [])

@spec move(String.t(), String.t(), keyword()) ::
  :ok | {:error, Linx.Mount.Error.t() | {:bad_flag, atom()} | {:bad_in, term()}}

Atomically relocates an existing mount from source to target.

Equivalent to mount/4 with flags: [:move]. The mount itself is unchanged (same filesystem, same contents); only the mount point changes. Processes that hold files open under the old path keep working through their still-valid fds; new lookups go through the new path.

Returns :ok or {:error, %Linx.Mount.Error{operation: :mount}}.

Common errors: :einval (source isn't a mount point, or the parent mount of source has shared propagation, which MS_MOVE does not allow), :enoent (target's parent doesn't exist).

pivot_root(new_root, put_old, opts \\ [])

@spec pivot_root(String.t(), String.t(), keyword()) ::
  :ok | {:error, Linx.Mount.Error.t() | {:bad_in, term()}}

Swaps the mount-namespace's root: makes new_root the new / and stashes the old root at put_old.

Wraps pivot_root(2). After a successful call, processes in the target mount namespace see new_root's contents as /; the former root tree is accessible at put_old. The standard next step in container init is to umount("/old_root", flags: [:detach]) to discard the old root entirely.

Options

  • :in — the same shape as mount/4 (:self / {:pid, n} / {:path, p}). Picks which mount namespace's root to swap.

Kernel constraints

pivot_root(2) is one of the pickiest syscalls in Linux. The call fails, most often with :einval, unless all of these hold:

  • new_root is a directory and a mount point. The typical setup is a bind-mount-to-self: Linx.Mount.bind(new_root, new_root).
  • put_old is a directory under new_root. By convention: Path.join(new_root, "old_root"), created beforehand.
  • No other filesystem is mounted on put_old.
  • The propagation of new_root's mount and the current root's mount are not both shared. Usually: mark new_root private before calling pivot_root.

See pivot_root(2) for the full list.
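Two of the constraints above (both paths are directories, and put_old sits under new_root) can be checked from the filesystem before calling pivot_root/3. The following pre-flight sketch is hypothetical (PivotPreflightSketch is not part of Linx); it checks paths as seen from the caller's own mount namespace, does not resolve symlinks, and cannot verify the mount-point or propagation constraints.

```elixir
defmodule PivotPreflightSketch do
  @moduledoc "Hypothetical pre-flight check; covers only path-level constraints."

  # Returns :ok, or {:error, reason} naming the first constraint that fails.
  def check(new_root, put_old) do
    new_root = Path.expand(new_root)
    put_old = Path.expand(put_old)

    cond do
      not File.dir?(new_root) -> {:error, :new_root_not_dir}
      not File.dir?(put_old) -> {:error, :put_old_not_dir}
      not under?(put_old, new_root) -> {:error, :put_old_outside_new_root}
      true -> :ok
    end
  end

  # True when path is strictly below root, compared component-wise via a
  # trailing "/" so that "/rootfs2" is not treated as being under "/rootfs".
  defp under?(path, root) do
    String.starts_with?(path, String.trim_trailing(root, "/") <> "/")
  end
end
```

A check like this only turns some :einval results into more readable errors; the kernel remains the final authority on whether the call succeeds.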

CWD handling

pivot_root requires the calling thread's CWD to be inside new_root. The NIF runs on a worker thread that unshares its fs_struct and chdirs into new_root before the syscall, so the BEAM's CWD stays at whatever it was. The chdir is a worker-thread concern; the caller doesn't observe it.

Composition

The headline use case is rootfs swapping inside a freshly-spawned container at the checkpoint, before proceed/1:

{:ok, c} = Linx.Process.spawn(argv: ["/init"], namespaces: [:mount, ...])
host_pid = receive do {:linx_process, :ready, p} -> p end

:ok = Linx.Mount.bind(rootfs, rootfs, in: {:pid, host_pid})
:ok = Linx.Mount.mount("", rootfs, "", flags: [:private], in: {:pid, host_pid})
:ok = Linx.Mount.pivot_root(rootfs, Path.join(rootfs, "old_root"), in: {:pid, host_pid})
:ok = Linx.Mount.umount("/old_root", flags: [:detach], in: {:pid, host_pid})

:ok = Linx.Process.proceed(c)

After proceed/1, the workload execves /init from inside the new rootfs.

Returns :ok or {:error, %Linx.Mount.Error{operation: :pivot_root | :chdir | :open_ns | :unshare | :setns | :thread}}.

remount(target, opts \\ [])

@spec remount(
  String.t(),
  keyword()
) ::
  :ok | {:error, Linx.Mount.Error.t() | {:bad_flag, atom()} | {:bad_in, term()}}

Remounts the filesystem at target with new flags.

Equivalent to mount/4 with flags: [:remount | user_flags] and empty source + fstype. The kernel knows what's mounted there and applies the new flags in place.

Typical use: making a bind mount read-only after the fact:

:ok = Linx.Mount.bind(source, target)
:ok = Linx.Mount.remount(target, flags: [:ro, :bind])

The :bind flag is required when remounting a bind mount with new flags — without it, the kernel tries to remount the underlying filesystem instead.

Options

  • :flags — flag atoms (:ro, :nosuid, :nodev, etc.) to apply. Reuses the catalog from mount/4.
  • :data — filesystem-specific options string.

Returns :ok or {:error, %Linx.Mount.Error{operation: :mount}}.

umount(target, opts \\ [])

@spec umount(
  String.t(),
  keyword()
) ::
  :ok | {:error, Linx.Mount.Error.t() | {:bad_flag, atom()} | {:bad_in, term()}}

Unmounts the filesystem at target.

Options

  • :flags — a list of flag atoms:
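    • :force (MNT_FORCE). Try harder when the filesystem is busy (only meaningful for NFS-style network filesystems).
    • :detach (MNT_DETACH). Lazy unmount: detach from the namespace immediately, clean up when the last user is done.
    • :expire (MNT_EXPIRE). Mark the mount as expired. The first call on an unused mount fails with :eagain but sets the mark; a second :expire call unmounts it if it has not been used in between. Cannot be combined with :force or :detach.
    • :no_follow (UMOUNT_NOFOLLOW). Don't follow symlinks at target.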

Returns :ok or {:error, %Linx.Mount.Error{operation: :umount}}.

Cross-namespace

Same :in option as mount/4. To unmount a path inside a running container:

:ok = Linx.Mount.umount("/proc", in: {:pid, container_pid})