OTP-native Chrome DevTools Protocol (CDP) browser automation for Elixir.
CDPEx launches a headless Chrome process and drives it directly over the
Chrome DevTools Protocol on a Mint.WebSocket connection — no ChromeDriver
and no Node.js. Browsers and their WebSocket connections are supervised
processes, so a Chrome crash surfaces to callers as {:error, reason} rather
than a hung session.
This module is the high-level facade. See CDPEx.Page for page operations.
Example
    {:ok, browser} = CDPEx.launch()
    {:ok, page} = CDPEx.new_page(browser)
    {:ok, _page} = CDPEx.Page.navigate(page, "https://example.com")
    {:ok, html} = CDPEx.Page.html(page)
    :ok = CDPEx.stop(browser)

Or, resource-safe, with with_page/3:

    CDPEx.with_page([], fn page ->
      {:ok, _} = CDPEx.Page.navigate(page, "https://example.com")
      CDPEx.Page.html(page)
    end)

Observability is via :telemetry; see CDPEx.Telemetry for the event taxonomy
(launch / navigate spans, page open/close, and error events). Silent by default.
Error handling
Every operation returns {:error, reason} on failure; error_reason/0 documents
the reason shapes. To drive retries without hard-coding that list, classify the
reason instead of matching it:
    case CDPEx.Page.navigate(page, url) do
      {:ok, page} ->
        {:ok, page}

      {:error, reason} ->
        if CDPEx.transient?(reason), do: retry(), else: {:error, reason}
    end

classify_error/1 buckets a reason as :transient (a fresh attempt may succeed),
:terminal (it won't), or :unknown (payload-dependent, so you decide). It tracks
the error surface as the library evolves, so the transient/terminal decision stays
in one place rather than being reimplemented (and re-drifting) downstream. Retries
stay yours to bound: cap attempts, back off, and on :transient re-establish the
resource (a fresh page/browser) rather than reusing a dead handle.
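
The bounding advice above can be sketched as a small helper. This is a minimal illustration and not part of CDPEx: the module name RetrySketch, its options (:max_attempts, :base_delay_ms), and the fetch/1 callback in the usage comment are all hypothetical. The attempt function is expected to build its own fresh resource on every call, and the predicate is passed in so the caller can supply CDPEx.transient?/1.

```elixir
defmodule RetrySketch do
  @moduledoc "Hypothetical bounded-retry helper; not part of CDPEx."

  # attempt: zero-arity fun returning {:ok, value} | {:error, reason}. It should
  #   create a fresh page/browser on every call instead of reusing a handle.
  # transient_fun: one-arity predicate over the reason, e.g. &CDPEx.transient?/1.
  def run(attempt, transient_fun, opts \\ []) do
    max_attempts = Keyword.get(opts, :max_attempts, 3)
    base_delay_ms = Keyword.get(opts, :base_delay_ms, 100)
    do_run(attempt, transient_fun, max_attempts, base_delay_ms)
  end

  defp do_run(attempt, transient_fun, attempts_left, delay_ms) do
    case attempt.() do
      {:ok, _value} = ok ->
        ok

      {:error, reason} = error ->
        if attempts_left > 1 and transient_fun.(reason) do
          # Back off, doubling the delay before each further attempt.
          Process.sleep(delay_ms)
          do_run(attempt, transient_fun, attempts_left - 1, delay_ms * 2)
        else
          error
        end
    end
  end
end

# Usage sketch (fetch/1 is a hypothetical fun returning {:ok, _} | {:error, _}):
#
#   RetrySketch.run(fn -> CDPEx.with_page([], &fetch/1) end, &CDPEx.transient?/1)
```

Passing launch options to with_page/3 inside the attempt means each retry gets a throwaway browser and page, which matches the "fresh resource" guidance; a caller that reuses one browser would open a new page per attempt instead.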
Status
Pages default to one WebSocket each (strong crash isolation); opt into
sessionId multiplexing (many pages over the one browser socket) with
new_page(browser, transport: :session), trading isolation for fewer sockets.
Connection pooling, network interception, and stealth remain out of scope.
Types
@type error_classification() :: :transient | :terminal | :unknown
The result of classify_error/1.
Intentionally open: match :transient (or :terminal) explicitly and fall through
with a catch-all rather than enumerating all three atoms, so a future bucket can be
added without breaking exhaustive matches.
@type error_reason() ::
        CDPEx.Connection.call_error()
        | CDPEx.Chrome.launch_error()
        | {:ws_connect, term()}
        | {:ws_upgrade, term()}
        | :timeout
        | :unknown_page
        | :already_authenticated
        | :already_intercepting
        | {:timeout, :await_event}
        | {:conflict, :authenticated | :intercepting}
        | {:navigate, String.t()}
        | {:no_document_response, String.t()}
        | {:capture_failed, term()}
        | {:idle_wait_failed, term()}
        | {:selector_not_found, String.t()}
        | {:evaluate_exception, term()}
        | {:unserializable_value, String.t()}
        | {:unexpected_evaluate, term()}
        | {:invalid_args, term()}
        | {:invalid_source, term()}
        | {:invalid_error_reason, term()}
        | {:invalid_transport, term()}
        | {:invalid_proxy, term()}
        | {:unsupported_transport, term()}
        | {:invalid_response_body, String.t()}
        | {:invalid_pdf_data, String.t()}
        | {:invalid_screenshot_data, String.t()}
        | {:write_failed, term()}
The reason shapes that appear in {:error, reason} across CDPEx.
Error reasons are part of the public contract — pattern-match the tagged kinds
({:cdp_error, …}, {:timeout, …}, {:ws_closed, …}, …); their payloads (a CDP
method, an exit status, a stderr/contents excerpt) are open and may gain detail.
The only bare, context-free reasons are :noproc, the high-level :timeout,
:unknown_page, :already_authenticated, and :already_intercepting —
self-describing control-flow outcomes with no payload to carry, the way GenServer
uses :noproc. Validation failures that do have offending data to surface are
tagged instead ({:invalid_response_body, excerpt}, {:invalid_pdf_data, excerpt},
{:invalid_screenshot_data, excerpt}).
To act on a failure without hard-coding this list, use classify_error/1 — it
buckets any reason as :transient / :terminal / :unknown and tracks this union,
so retry logic isn't reimplemented (and re-drifted) downstream.
Two sub-unions are machine-checked: CDPEx.Connection.call_error/0 and
CDPEx.Chrome.launch_error/0 are precisely specced on call/5 / launch/1, so
Dialyzer catches a shape change in those at the source. The remaining members —
the page-level tagged kinds and bare atoms — are hand-maintained (kinds such as
{:cdp_error, method, payload} also wrap arbitrary CDP data), kept honest by a
compile-time coverage test that fails if any member here lacks a classify_error/1
test exemplar — so a member can't be added to this type without being classified.
That guard is one-directional (type → classified): the reverse — an error a producer
returns but never adds here — still relies on review, and such a stray reason would
fall through classify_error/1 to :unknown.
Two timeout shapes, by layer: the low-level CDPEx.Connection.call/5 and
await_event/4 return {:timeout, context} (a CDP method, or :await_event),
while the high-level CDPEx.Page wait_for_* functions and CDPEx.Pool.checkout/2
return a bare :timeout ("the awaited condition didn't happen in time").
A WebSocket frame that fails to decode is not a standalone reason: the connection
stops on the decode failure, so callers observe it nested, as
{:ws_closed, {:ws_decode, _}}.
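
The layering described above (bare :timeout from high-level waits, {:timeout, context} from low-level calls, and decode failures nested inside {:ws_closed, _}) can be told apart with ordinary pattern matching. The sketch below is illustrative only: ReasonShapeSketch and the atoms it returns are hypothetical, and only the matched reason shapes come from the text above.

```elixir
defmodule ReasonShapeSketch do
  @moduledoc "Hypothetical helper; only the matched reason shapes come from the docs."

  # High-level waits report a bare atom.
  def layer(:timeout), do: :high_level_timeout
  # Low-level calls carry context: a CDP method, or :await_event.
  def layer({:timeout, context}), do: {:low_level_timeout, context}
  # A frame decode failure arrives nested inside the close reason.
  def layer({:ws_closed, {:ws_decode, _detail}}), do: :decode_failure
  def layer({:ws_closed, _other}), do: :socket_closed
  # Catch-all keeps the match open to reasons not listed here.
  def layer(_reason), do: :other
end
```

The more specific {:ws_closed, {:ws_decode, _}} clause has to come before the general {:ws_closed, _} clause, since Elixir tries function clauses in order.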
Functions
@spec classify_error(term()) :: error_classification()
Classifies an error reason as :transient, :terminal, or :unknown.
reason is the value from any {:error, reason} this library returns (see
error_reason/0). The classification answers one question — might a fresh
attempt succeed? — so you drive retries from one place instead of reimplementing
the decision (and re-drifting it) in every caller:
    case CDPEx.Page.navigate(page, url) do
      {:ok, page} ->
        {:ok, page}

      {:error, reason} ->
        case CDPEx.classify_error(reason) do
          :transient -> retry_with_fresh_page()
          _ -> {:error, reason}
        end
    end

The buckets:

- :transient (environmental or timing failures): the connection dropped or
  couldn't be established ({:ws_closed, _}, {:ws_connect, _}, {:ws_upgrade, _},
  :noproc), a wait or call timed out (:timeout, {:timeout, _}), Chrome died or
  was slow to start ({:chrome_exited, _, _}, {:debug_url_not_found, _},
  {:devtools_file_malformed, _}), an internal capture/idle helper crashed
  ({:capture_failed, _}, {:idle_wait_failed, _}), or a navigation hit a
  connection/network-layer net::ERR_* (e.g.
  {:navigate, "net::ERR_CONNECTION_REFUSED"}).
- :terminal (deterministic outcomes): a selector didn't match, JS threw, a
  usage/validation error, or a missing Chrome binary. Retrying the same call
  yields the same error. (:already_authenticated / :already_intercepting are
  terminal for the ordinary double-call; the narrow post-timeout teardown race
  authenticate/4 documents, where a retry can still succeed, is signalled by the
  preceding {:timeout, _}, which is itself :transient.)
- :unknown (the outcome depends on a payload or timing this function does not
  crack): an ambiguous navigation net::ERR_* (DNS ERR_NAME_NOT_RESOLVED,
  ERR_ABORTED, ERR_BLOCKED_BY_*, unlike the connection-layer codes above), the
  CDP error code ({:cdp_error, _, _}), the file-write posix reason
  ({:write_failed, _}), or whether a {:no_document_response, _} was a
  same-document hop or a slow miss. Also covers any term CDPEx doesn't produce.
  Decide the retry policy yourself.
Retries are the caller's responsibility: bound the attempts and back off. A
:transient result means re-establish the resource — open a fresh page/browser
or call CDPEx.Pool.checkout/2 again — not retry the same handle (a dead page keeps
returning :noproc). Some :transient reasons are still a judgment call for your
environment — retrying a :timeout / net::ERR_TIMED_OUT multiplies wall-time and
browser memory, so a resource-constrained caller may reasonably treat timeouts as
terminal. The input is typed term() so the catch-all stays reachable;
routing through this instead of matching error_reason/0 directly trades Dialyzer
exhaustiveness for a stable, library-maintained dispatch point.
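
As one way to apply the judgment call above, a resource-constrained caller might layer its own policy on top of the classification and refuse to retry timeouts. This is a sketch under assumptions: ConstrainedPolicySketch and its return atoms are hypothetical, and matching "ERR_TIMED_OUT" inside the {:navigate, _} string is a guess at how a timed-out navigation is reported.

```elixir
defmodule ConstrainedPolicySketch do
  @moduledoc "Hypothetical retry policy for a resource-constrained caller."

  # classification: the atom returned by classify_error/1 for `reason`.
  def decide(:transient, reason) do
    if timeout?(reason), do: :give_up, else: :retry_with_fresh_resource
  end

  # :terminal, :unknown, and any future bucket fall through to the catch-all.
  def decide(_classification, _reason), do: :give_up

  defp timeout?(:timeout), do: true
  defp timeout?({:timeout, _context}), do: true

  # Assumption: a timed-out navigation carries the net error text in the string.
  defp timeout?({:navigate, text}) when is_binary(text),
    do: String.contains?(text, "ERR_TIMED_OUT")

  defp timeout?(_reason), do: false
end

# Usage sketch:
#
#   ConstrainedPolicySketch.decide(CDPEx.classify_error(reason), reason)
```

Keeping the override in a separate function leaves classify_error/1 as the single dispatch point while the environment-specific exception stays visible in caller code.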
@spec close_page(pid(), CDPEx.Page.t()) :: :ok | {:error, :unknown_page}
Closes a page opened with new_page/2.
Returns {:error, :unknown_page} if page was not opened on browser.
@spec launch(keyword()) :: GenServer.on_start()
Launches a headless Chrome browser and returns its process pid.
Accepts the launch options documented in CDPEx.Chrome (e.g. :headless,
:chrome_binary, :extra_args, :window_size, :launch_timeout). On slow
cold-start hosts (e.g. headless Chrome in a constrained container) raise
:launch_timeout — it is a ceiling, not a fixed wait. For long-lived use, prefer
putting CDPEx.Browser under your own supervisor with a :shutdown timeout.
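
For the supervised setup suggested above, a child specification with an explicit :shutdown value might look like the following. This is a sketch under a loud assumption: it supposes CDPEx.Browser exposes start_link/1 taking the launch options, which the text here does not confirm, so check the CDPEx.Browser documentation for the real entry point. BrowserChildSpecSketch and the 10-second default are hypothetical.

```elixir
defmodule BrowserChildSpecSketch do
  @moduledoc "Hypothetical child-spec builder; not part of CDPEx."

  # ASSUMPTION: CDPEx.Browser exposes start_link/1 taking the launch options.
  # Check the CDPEx.Browser docs for the real entry point before using this.
  def build(launch_opts, shutdown_ms \\ 10_000) do
    %{
      id: CDPEx.Browser,
      start: {CDPEx.Browser, :start_link, [launch_opts]},
      # Time the supervisor waits for a clean stop before killing the process.
      shutdown: shutdown_ms,
      type: :worker
    }
  end
end

# Usage sketch inside an application supervisor:
#
#   children = [BrowserChildSpecSketch.build([headless: true], 15_000)]
#   Supervisor.start_link(children, strategy: :one_for_one)
```

The :shutdown value gives the browser process time to terminate cleanly before the supervisor resorts to a brutal kill.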
Proxy
Pass :proxy to route the browser through a proxy — a URL or a keyword list:
    CDPEx.launch(proxy: "http://user:pass@host:8080")
    CDPEx.launch(proxy: [server: "host:8080", username: "u", password: "p"])

It sets Chrome's --proxy-server and, when credentials are given, automatically
answers the proxy auth challenge on each page, so you just call new_page/2 and
CDPEx.Page.navigate/3, with no manual CDPEx.Page.authenticate/4. See CDPEx.Proxy for
the accepted forms (the keyword form avoids percent-encoding special-character
passwords).
A credentialed proxy requires the default :dedicated transport:
new_page(browser, transport: :session) on such a browser returns
{:error, {:unsupported_transport, :session}}, and an auto-armed page can't also use
enable_request_interception/2 (both drive the Fetch domain). A malformed :proxy, or
combining :proxy with a full :args override, fails the launch with
{:error, {:invalid_proxy, _}}. (:proxy appends to :extra_args, which an :args
override discards, so the two are mutually exclusive; use one.) Don't set
--proxy-server in :extra_args yourself when using :proxy.
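
To illustrate why the keyword form is easier with awkward passwords, the sketch below builds both :proxy forms from the same credentials. ProxyOptionSketch is hypothetical, and it assumes the URL form expects standard percent-encoding of the user-info part; CDPEx.Proxy remains the reference for what is actually accepted.

```elixir
defmodule ProxyOptionSketch do
  @moduledoc "Hypothetical helpers that build the two :proxy forms."

  # URL form: reserved characters in the credentials must be percent-encoded,
  # otherwise "@", ":" or "/" in a password would change how the URL parses.
  def url_form(server, username, password) do
    "http://" <> encode(username) <> ":" <> encode(password) <> "@" <> server
  end

  # Keyword form: credentials are passed through untouched.
  def keyword_form(server, username, password) do
    [server: server, username: username, password: password]
  end

  # Encode everything except RFC 3986 unreserved characters.
  defp encode(text), do: URI.encode(text, &URI.char_unreserved?/1)
end

# Usage sketch:
#
#   CDPEx.launch(proxy: ProxyOptionSketch.keyword_form("host:8080", "user", "p@ss:w/rd"))
```

With a password such as "p@ss:w/rd", the URL form needs the "@", ":" and "/" escaped, while the keyword form carries the string as-is.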
@spec new_page(pid(), keyword()) :: {:ok, CDPEx.Page.t()} | {:error, term()}
Opens a new page. See CDPEx.Browser.new_page/2 for options.
@spec stop(pid()) :: :ok
Stops a browser started with launch/1, closing all pages and killing Chrome.
transient?/1 is a convenience over classify_error/1: it returns true only when the
error is :transient.
Conservative by design — :unknown is not transient, so an unrecognized or
payload-dependent error won't be auto-retried. Match classify_error/1 directly
when you want to treat :unknown specially, and see its note on bounded,
resource-re-establishing retries — this classifies, it does not retry.
@spec with_page(pid() | keyword(), (CDPEx.Page.t() -> result), keyword()) :: result | {:error, term()} when result: var
Runs fun with a fresh page, guaranteeing the page (and, when given launch
options, the browser) is cleaned up afterwards — even if fun raises.
Pass an existing browser pid to reuse it, or a keyword list of launch options
to spin up a throwaway browser for the duration of the call. Returns whatever
fun returns, or {:error, reason} if the page/browser could not be created.
With launch options, the throwaway browser is linked but contained: if it
crashes during the call (e.g. its connection drops) with_page returns
{:error, reason} instead of letting the crash propagate to the caller. To do
that it briefly traps exits in the calling process for the duration of the call.
Only the browser's own {:EXIT, _, _} is drained — a foreign process linked
to the caller that exits during this window has its exit delivered as a message
left in the caller's mailbox, so a caller that links other processes and relies
on un-trapped exit propagation should pass a pre-launched browser pid instead.
On slow cold-start hosts, raise :launch_timeout (a ceiling, not a fixed wait).
    # against an existing browser
    CDPEx.with_page(browser, fn page ->
      {:ok, _} = CDPEx.Page.navigate(page, "https://example.com")
      CDPEx.Page.html(page)
    end)

    # throwaway browser + page
    CDPEx.with_page([headless: true], &CDPEx.Page.html/1)
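
The "linked but contained" behaviour described above for throwaway browsers can be pictured with plain OTP primitives. The sketch below is a generic illustration and not CDPEx's actual implementation: ContainedLinkSketch is hypothetical, and a trivial spawned process stands in for the browser.

```elixir
defmodule ContainedLinkSketch do
  @moduledoc "Generic OTP illustration of a linked-but-contained helper process."

  # Runs fun.(worker) while linked to a throwaway worker process. If the worker
  # dies abnormally, the caller gets {:error, reason} instead of crashing.
  def run(fun) do
    # Trap exits for the duration of the call, remembering the previous flag.
    was_trapping = Process.flag(:trap_exit, true)
    worker = spawn_link(fn -> receive do :stop -> :ok end end)

    try do
      result = fun.(worker)

      case stop_and_drain(worker) do
        :normal -> result
        reason -> {:error, reason}
      end
    catch
      kind, payload ->
        # Clean up before re-raising so the worker never outlives the call.
        stop_and_drain(worker)
        :erlang.raise(kind, payload, __STACKTRACE__)
    after
      # Restore the caller's original flag, whatever happened above.
      Process.flag(:trap_exit, was_trapping)
    end
  end

  defp stop_and_drain(worker) do
    send(worker, :stop)

    # Selective receive: only this worker's exit is consumed. An exit from any
    # other linked process stays in the caller's mailbox as a message.
    receive do
      {:EXIT, ^worker, reason} -> reason
    end
  end
end
```

The selective receive is the part that matters for callers who link other processes: while exits are trapped, a foreign exit is turned into a message rather than propagating, which is why passing a pre-launched browser pid is the safer choice in that situation.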