Hex.pm Docs CI License

OTP-native Chrome DevTools Protocol browser automation for Elixir. Launch headless Chrome and drive it directly over a Mint.WebSocket connection — no ChromeDriver, no Node.js.

CDPEx.with_page([], fn page ->
  {:ok, _} = CDPEx.Page.navigate(page, "https://example.com")
  CDPEx.Page.html(page)
end)
#=> {:ok, "<html>…</html>"}

Why CDPEx?

It drives Chrome over CDP the way Puppeteer and Playwright do — but it's pure Elixir: the browser and each page's CDP connection are supervised OTP processes (a page is a lightweight handle over its connection). A Chrome crash or a dropped socket surfaces to the caller as {:error, reason} instead of a hung session, and terminate/2 guarantees the OS process is reaped (no zombie Chromes).

CDPExchrome_remote_interfaceChromicPDFWallaby
TransportCDP (WebSocket)CDP (WebSocket)CDP (WebSocket)WebDriver / ChromeDriver
Runtime depsmint_web_socket, jasonhackney + othersa fewChromeDriver process
Supervised lifecycle✅ (PDF pool)partial
Scopegeneral automationlow-level clientPDF / screenshotstesting
Node.js requirednononono

If you want a small, dependency-light CDP client with proper OTP supervision — and you don't want a ChromeDriver process or a Node sidecar — that's the gap CDPEx fills.

Status

Transports: pages default to one WebSocket each (strong crash isolation). Opt into sessionId multiplexing — many pages over the one browser socket — with new_page(browser, transport: :session); the trade-off is shared fate (a dropped browser connection drops all of its session pages).

Stealth / anti-fingerprinting presets remain out of scope for now (evidence-gated).

Installation

Add cdp_ex to your deps in mix.exs:

def deps do
  [
    {:cdp_ex, "~> 0.1"}
  ]
end

You also need Chrome or Chromium installed. CDPEx finds it via, in order: the :chrome_binary option, CDP_EX_CHROME_BINARY, CHROME_BINARY, then an OS default. For reproducible setups, point it at a Chrome for Testing binary.

Sandbox

CDPEx launches Chrome with --no-sandbox by default, since the sandbox can't start in many CI/container environments. If you run as root or drive untrusted pages, re-enable it by overriding :args — see CDPEx.Chrome.

Running in containers

CDPEx is validated on macOS, but its defaults already target Linux containers: --no-sandbox, --disable-dev-shm-usage, and --disable-setuid-sandbox are on by default, so headless Chrome starts out of the box on a constrained host (e.g. 2 vCPU / 2 GB). A few things smooth the path further:

  • Tune :launch_timeout for cold starts. Chrome's first launch in a fresh container is slower than on a warm dev machine. :launch_timeout is a ceiling, not a fixed wait (readiness is polled and returns as soon as Chrome is reachable), so a generous value costs nothing on a fast launch:

    CDPEx.launch(launch_timeout: 30_000)
  • Fresh-profile cost. Each launch creates a new --user-data-dir (removed on stop), so there's no warm disk cache between launches and the first navigation pays a cold-start cost. For throughput, prefer one long-lived browser (launch/1

    • reuse) over a throwaway with_page([...]) per request, or pass a persistent :user_data_dir.
  • /dev/shm sizing. Docker defaults /dev/shm to 64 MB, which Chrome can exhaust (crashing tabs). CDPEx ships --disable-dev-shm-usage by default (Chrome writes to /tmp instead), so it works on a small /dev/shm as-is. To size it up instead (--shm-size=1g), drop that flag via a custom :args.

  • Memory. On a 2 GB host a single browser with a few pages is comfortable; each open page and large screenshots/PDFs add transient memory. Close pages promptly, or use transport: :session to cut per-page socket/process overhead when you don't need crash isolation.

  • --remote-allow-origins. Some Chrome builds enforce an Origin check on the DevTools WebSocket upgrade and may reject a CDP client with a 403. CDPEx doesn't set this by default; if you hit {:error, {:ws_upgrade, _}} at connect, add it:

    CDPEx.launch(launch_timeout: 30_000, extra_args: ["--remote-allow-origins=*"])

Usage

with_page/3 opens a page, runs your function, and always tears everything down — even if the function raises:

# Throwaway browser + page for one job:
{:ok, title} =
  CDPEx.with_page([], fn page ->
    {:ok, _} = CDPEx.Page.navigate(page, "https://example.com")
    CDPEx.Page.evaluate(page, "document.title")
  end)

Explicit lifecycle

{:ok, browser} = CDPEx.launch(headless: true)
{:ok, page}    = CDPEx.new_page(browser)

{:ok, _page} = CDPEx.Page.navigate(page, "https://example.com")
:ok          = CDPEx.Page.wait_for_selector(page, "h1")
{:ok, html}  = CDPEx.Page.html(page)
{:ok, "Example Domain"} = CDPEx.Page.evaluate(page, "document.querySelector('h1').textContent")
{:ok, _png}  = CDPEx.Page.screenshot(page, path: "example.png")

:ok = CDPEx.close_page(browser, page)
:ok = CDPEx.stop(browser)

Under your supervision tree

Because terminate/2 reaps Chrome, supervise the browser with a :shutdown timeout (not :brutal_kill):

children = [
  {CDPEx.Browser, name: MyBrowser, headless: true}
]
Supervisor.start_link(children, strategy: :one_for_one)

Pooling

Reuse warm browsers instead of paying a cold launch per job with CDPEx.Pool:

children = [
  {CDPEx.Pool, name: MyPool, size: 4, launch_opts: [headless: true]}
]
Supervisor.start_link(children, strategy: :one_for_one)

# Borrow a warm browser for one fetch (a pooled drop-in for with_page/3):
CDPEx.Pool.with_page(MyPool, fn page ->
  {:ok, _} = CDPEx.Page.navigate(page, "https://example.com")
  CDPEx.Page.html(page)
end)

Browsers launch lazily up to :size and are reused; checkout/2 blocks (up to :checkout_timeout) when all are busy. Launches are asynchronous, so the pool stays responsive during a cold start and warms multiple browsers concurrently under load. A caller that crashes returns its browser automatically, and a crashed browser is relaunched on demand.

Page operations

FunctionDescription
navigate/3Go to a URL, waiting for networkAlmostIdle (configurable)
wait_for_selector/3Poll until a CSS selector matches
wait_for_response/3Block until a network response URL matches (fn / Regex / substring)
wait_for_network_idle/2Block until the network settles (Puppeteer "networkidle")
evaluate/3Run JS and return the value (returnByValue)
click/3Synthetic .click() on the first match
html/2Full serialized DOM (document.documentElement.outerHTML)
screenshot/2PNG bytes, or write to :path
observe_network/2Stream Network request/response events to the caller
response_body/3Fetch a response body by requestId (Network.getResponseBody)
enable_request_interception/2Pause matching requests for the caller to resolve
continue_request/3 / fulfill_request/3 / fail_request/3Resolve a paused request (proceed / synthetic response / fail)
authenticate/4Answer a proxy / HTTP Basic auth challenge (call before navigate/3)

Full API: hexdocs.pm/cdp_ex.

Error handling

Operations return {:error, reason} on failure. Rather than hard-code the reason shapes, classify them to drive retries:

case CDPEx.Page.navigate(page, url) do
  {:ok, page} ->
    {:ok, page}

  {:error, reason} ->
    if CDPEx.transient?(reason), do: retry(), else: {:error, reason}
end

CDPEx.classify_error/1 buckets a reason as :transient (connection dropped or couldn't be established, timeout, Chrome died or was slow to start, an internal helper crashed, or a connection-layer net::ERR_* navigation error — a fresh attempt may succeed), :terminal (selector miss, JS exception, usage/validation error — it won't), or :unknown (payload-dependent, e.g. an ambiguous net::ERR_* navigation error or a CDP error code — you decide). The library tracks the error surface, so the transient/terminal decision lives in one place instead of drifting across callers. The reason shapes are documented as t:CDPEx.error_reason/0.

Retries are yours to bound: cap attempts, back off, and on a :transient result re-establish the resource (open a fresh page/browser) rather than reusing a dead handle — a dead page keeps returning :noproc.

Telemetry

CDPEx emits :telemetry events and attaches no handlers — attach your own to record them (emitting with nothing attached is a no-op). Events: [:cdp_ex, :launch, …] and [:cdp_ex, :navigate, …] spans, [:cdp_ex, :page, :start | :stop], and [:cdp_ex, :error]. See CDPEx.Telemetry for the full taxonomy (measurements + metadata).

:telemetry.attach(
  "cdp-nav",
  [:cdp_ex, :navigate, :stop],
  fn _event, %{duration: d}, %{url: url, status: status}, _config ->
    ms = System.convert_time_unit(d, :native, :millisecond)
    IO.puts("#{url} -> #{inspect(status)} in #{ms}ms")
  end,
  nil
)

status (and the post-redirect final_url) are nil unless the navigation used response: true — see CDPEx.Page.navigate/3.

Development

mix deps.get
mix test                         # unit tests (no Chrome needed)
mix test --include integration   # real-Chrome tests (set CDP_EX_CHROME_BINARY)
mix ci                           # format, credo, dialyzer, unit tests

Integration tests are tagged :integration and excluded by default; they launch a real Chrome and drive it against a local fixture HTTP server.

Acknowledgements

Built on mint_web_socket. Inspired by the production CDP work in ChromicPDF and by Puppeteer's protocol layer.

License

MIT — see LICENSE.