The Problem
Erlang's built-in Port mechanism copies all data from the child's stdout into the BEAM's mailbox immediately. If a child produces data faster than Elixir code consumes it, the mailbox grows unbounded → OOM.
NetRunner's Solution
NetRunner uses NIF-based I/O with enif_select to implement demand-driven backpressure. Data stays in the OS pipe buffer until explicitly read.
sequenceDiagram
participant E as Elixir Consumer
participant GS as GenServer
participant NIF as NIF (dirty IO)
participant Pipe as OS Pipe Buffer
participant Child as Child Process
E->>GS: Process.read(p)
GS->>NIF: nif_read(fd, 65535)
alt Data available
NIF->>Pipe: read(fd, buf, 65535)
Pipe-->>NIF: bytes
NIF-->>GS: {:ok, binary}
GS-->>E: {:ok, binary}
else Pipe empty (EAGAIN)
NIF->>NIF: enif_select(fd, READ)
NIF-->>GS: {:error, :eagain}
GS->>GS: Park caller in operations queue
Note over Pipe,Child: Child writes, pipe fills
Pipe-->>GS: {:select, fd, ref, :ready_input}
GS->>NIF: nif_read(fd, 65535) [retry]
NIF-->>GS: {:ok, binary}
GS-->>E: {:ok, binary}
endHow It Works
Read Path
NetRunner.Process.read/2callsGenServer.call(pid, {:read, :stdout, max_bytes}, :infinity)- GenServer tries
Pipe.read(pipe, max_bytes)→ callsNif.nif_read(resource, max_bytes) - NIF runs on dirty IO scheduler:
- Calls
read(fd, buf, max_bytes) - If data available: returns
{:ok, binary}immediately - If
EAGAIN: callsenif_select(fd, ERL_NIF_SELECT_READ), returns{:error, :eagain}
- Calls
- On
EAGAIN, GenServer parks the caller in the operations queue - When data arrives, BEAM's event loop detects fd readiness via epoll/kqueue
- BEAM sends
{:select, resource, ref, :ready_input}to GenServer - GenServer retries all parked read operations
Write Path
NetRunner.Process.write/2callsGenServer.call(pid, {:write, data}, :infinity)- GenServer enters
write_loop:Pipe.write(pipe, data)→Nif.nif_write(resource, data)- If fully written: returns
:ok - If partial write: retries immediately with remaining data
- If
EAGAIN: parks caller, waits for{:select, ..., :ready_output}
- Partial writes are retried immediately because the kernel may have room for more
Why Partial Write Retry Matters
Without immediate retry, a partial write would park the caller, but enif_select might not fire again because the pipe buffer isn't actually full — the NIF just happened to write less than requested. The write loop ensures we keep writing until we either:
- Complete the write (all bytes sent)
- Get
EAGAIN(pipe buffer truly full →enif_selectregistered → will get notified)
Pipe Buffer Sizes
The OS pipe buffer acts as the natural flow control mechanism:
| Platform | Default Pipe Buffer | Effect |
|---|---|---|
| Linux | 64 KB (configurable up to 1 MB via fcntl(F_SETPIPE_SZ)) | Child blocks on write() when buffer full |
| macOS | 64 KB | Same blocking behavior |
When the Elixir consumer stops reading:
- OS pipe buffer fills up
- Child's
write()call blocks (kernel-level backpressure) - Child naturally slows down or stops producing
- No memory growth on the BEAM side
Comparison with Alternatives
| Approach | Backpressure | Memory Safety |
|---|---|---|
System.cmd / Ports | None — mailbox flooding | OOM on fast producers |
Exile | Yes — NIF + enif_select | Safe |
MuonTrap | None — Port-based | OOM on fast producers |
erlexec | Limited — single port bottleneck | Bottleneck limits throughput |
| NetRunner | Yes — NIF + enif_select | Safe |