CDPEx.Page (CDPEx v0.9.0)

A page (tab) handle and the operations you run against it.

A CDPEx.Page is a lightweight struct — not a process — holding the page's CDPEx.Connection pid and target id. Operations are functions over that connection, so the OTP properties (supervision, crash isolation) live in the connection/browser layer while page calls stay ergonomic.

Obtain one with CDPEx.new_page/2. If the underlying page dies (navigation to a new target, a crash), operations return {:error, :noproc} and you should open a fresh page.

Operations

Summary

Functions

attribute/4

Returns the value of attribute name for the first element matching css, or nil when the element or attribute is absent.

authenticate/4

Arms HTTP/proxy authentication on this page with username/password.

call_function/4

Calls a JavaScript function with args and returns its value.

clear_cookies/2

Clears all browser cookies. Lazily enables Network. Options: :timeout.

click/3

Clicks the first element matching css with a real, trusted mouse event.

continue_request/3

Lets a paused request proceed (Fetch.continueRequest), optionally rewriting it.

cookies/2

Returns all browser cookies as a list of CDP cookie maps.

disable_request_interception/2

Disables request interception: unsubscribes the caller from Fetch.requestPaused and disables the Fetch domain. Resolve any still-paused requests first.

enable_request_interception/2

Enables request interception: pauses matching requests and delivers a Fetch.requestPaused event to the calling process for each one. You must then resolve every paused request with continue_request/3, fulfill_request/3, or fail_request/3 (keyed by its "requestId"); an unresolved request stalls the page.

evaluate/3

Evaluates a JavaScript expression and returns its value (returnByValue).

fail_request/3

Fails a paused request (Fetch.failRequest).

fulfill_request/3

Answers a paused request with a synthetic response (Fetch.fulfillRequest), so the page never hits the network for it.

html/2

Returns the page's full serialized HTML (document.documentElement.outerHTML).

navigate/3

Navigates to url and (by default) waits until the network is almost idle.

observe_network/2

Starts observing network traffic, delivering CDP Network events to the calling process.

pdf/2

Renders the page to PDF (Page.printToPDF).

press/4

Presses a single named key, dispatching real keyDown/keyUp events.

response_body/3

Returns a response's body by its request_id (from a Network.responseReceived event), via Network.getResponseBody.

screenshot/2

Captures a PNG screenshot.

set_cookies/3

Sets cookies. Each is a CDP CookieParam map with at least "name", "value", and a "url" or "domain". Lazily enables Network. Options: :timeout.

set_extra_headers/3

Sets extra HTTP headers sent with every subsequent request on this page.

set_user_agent/3

Overrides the page's User-Agent (Emulation.setUserAgentOverride).

set_viewport/4

Overrides the viewport via Emulation.setDeviceMetricsOverride.

stop_observing_network/2

Stops observing: unsubscribes the caller from the network :events. Leaves the Network domain enabled.

text/3

Returns the textContent of the first element matching css, or nil when no element matches.

type/4

Types text into the first element matching css.

visible?/3

Returns {:ok, true} when the first element matching css is rendered and visible (has layout boxes, not display: none / visibility: hidden), {:ok, false} otherwise, including when no element matches.

wait_for_function/3

Polls a JavaScript expression until it is truthy, or timeout elapses.

wait_for_navigation/2

Waits for a navigation lifecycle milestone, without issuing a navigation.

wait_for_network_idle/2

Blocks until the network has been idle (at most :max_inflight in-flight requests) for :idle_time ms continuously, or timeout.

wait_for_response/3

Blocks until a network response whose URL matches matcher arrives, or timeout.

wait_for_selector/3

Polls until css matches an element, or timeout elapses.

Types

t()

@type t() :: %CDPEx.Page{
  browser: pid(),
  conn: pid(),
  session_id: String.t() | nil,
  target_id: String.t()
}

Functions

attribute(page, css, name, opts \\ [])

@spec attribute(t(), String.t(), String.t(), keyword()) ::
  {:ok, String.t() | nil} | {:error, term()}

Returns the value of attribute name for the first element matching css, or nil when the element or attribute is absent.

authenticate(page, username, password, opts \\ [])

@spec authenticate(t(), String.t(), String.t(), keyword()) :: :ok | {:error, term()}

Arms HTTP/proxy authentication on this page with username/password.

Headless Chrome launched with --proxy-server=host:port can't send proxy credentials, so an authenticated proxy rejects the connection (net::ERR_INVALID_AUTH_CREDENTIALS). Call this after new_page/2 and before navigate/3: it answers proxy auth challenges with the given credentials, and HTTP Basic challenges from Basic-auth-gated origins as well.

This enables the CDP Fetch domain for the page, which pauses (and auto-continues) every request — measurable overhead on heavy pages.

Only :dedicated pages (the new_page/2 default) are supported. The call is rejected with:

  • {:error, {:unsupported_transport, :session}} for a :session page;
  • {:error, :unknown_page} for a page that isn't one of this browser's open pages;
  • {:error, :already_authenticated} for a page that is already authenticated;
  • {:error, {:conflict, :intercepting}} for a page that already has request interception enabled (auth and interception both drive the Fetch domain, so use one per page).

If an earlier authenticate/4 on this page was abandoned while still arming (e.g. its call timed out under heavy load), the orphaned handler is torn down automatically and the page becomes re-authenticatable — a retry succeeds once that teardown completes (it may briefly still see {:error, :already_authenticated} in the meantime).

The bad-credentials loop guard keys on the request id, so a single request that must answer both a proxy and an origin challenge isn't supported — the second challenge is cancelled (Puppeteer-parity).

Options:

  • :source — which challenges to answer: :any (default), :proxy, :server. An unknown value returns {:error, {:invalid_source, value}}.
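The result handling described above can be sketched as follows. This is a minimal illustration, not part of CDPEx: the ProxyAuth module, arm/3, classify/1, and the atoms they return are hypothetical names; only authenticate/4, its :source option, and the error tuples come from the text.

```elixir
defmodule ProxyAuth do
  alias CDPEx.Page

  # Hypothetical helper: arm proxy-only credentials on a freshly opened page,
  # before the first navigation.
  def arm(page, username, password) do
    page
    |> Page.authenticate(username, password, source: :proxy)
    |> classify()
  end

  # Pure mapping from the documented results to the caller's next step.
  def classify(:ok), do: :armed
  # Already armed, or an abandoned earlier attempt is still being torn down.
  def classify({:error, :already_authenticated}), do: :retry_later
  def classify({:error, {:conflict, :intercepting}}), do: {:fatal, :interception_enabled}
  def classify({:error, {:unsupported_transport, :session}}), do: {:fatal, :needs_dedicated_page}
  def classify({:error, reason}), do: {:fatal, reason}
end
```

Keeping classify/1 free of side effects makes the policy easy to unit-test without a browser.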

call_function(page, function_declaration, args \\ [], opts \\ [])

@spec call_function(t(), String.t(), [term()], keyword()) ::
  {:ok, term()} | {:error, term()}

Calls a JavaScript function with args and returns its value.

function_declaration is a JS function expression (e.g. "(a, b) => a + b"). args are JSON-encoded (not string-interpolated) before being applied, so passing data values through them is safe. A thrown exception is {:error, {:evaluate_exception, details}}; non-serializable args return {:error, {:invalid_args, reason}}.

Trusted input

function_declaration is interpolated into the page script verbatim — treat it as trusted code and never build it from untrusted input.

Options: :timeout (default 15_000), :await_promise (default false).
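To illustrate the split between trusted code and untrusted data, here is a small sketch. The SafeCall module and count_matches/2 are hypothetical names; the JavaScript source is a fixed string, and the caller-supplied selector travels only through args.

```elixir
defmodule SafeCall do
  alias CDPEx.Page

  # Fixed, trusted function source. Nothing is interpolated into it.
  @count_js "(selector) => document.querySelectorAll(selector).length"

  # `selector` may come from user input: it is passed as an argument
  # (JSON-encoded by call_function/4), never spliced into the JS source.
  def count_matches(page, selector) when is_binary(selector) do
    Page.call_function(page, @count_js, [selector], timeout: 5_000)
  end
end
```

An invalid selector makes querySelectorAll throw, which per the text above would surface as {:error, {:evaluate_exception, details}} rather than executing anything the caller supplied.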

clear_cookies(page, opts \\ [])

@spec clear_cookies(
  t(),
  keyword()
) :: :ok | {:error, term()}

Clears all browser cookies. Lazily enables Network. Options: :timeout.

click(page, css, opts \\ [])

@spec click(t(), String.t(), keyword()) :: :ok | {:error, term()}

Clicks the first element matching css with a real, trusted mouse event.

Scrolls the element into view, then dispatches Input.dispatchMouseEvent mousePressed/mouseReleased at its center, so event.isTrusted is true.

The click is dispatched at the element's center but is not verified against what is actually on top — if an overlay covers the center, the overlay receives the event and the call still returns :ok. {:error, {:not_clickable, css}} reflects the layout at call time; on pages that animate or lazy-render content in, the box may not be ready yet, so retry at the caller (e.g. after wait_for_selector/3) rather than treating it as permanent.

Options:

  • :trusted (default true): false falls back to a synthetic JS .click() (no hit-testing, event.isTrusted == false); the escape hatch for elements not at a clickable point.
  • :timeout (default 15_000).

Returns :ok, {:error, {:selector_not_found, css}} when nothing matches, or {:error, {:not_clickable, css}} when the element has no usable box (zero-size or not visible even after scroll).
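One possible way to combine these return values at the caller is sketched below. Clicker, click/2, and fallback/2 are hypothetical names, and falling back to trusted: false is only one policy; retrying the trusted click later, as suggested above, is another. The sketch does not detect overlays, since a covered element still returns :ok.

```elixir
defmodule Clicker do
  alias CDPEx.Page

  # Hypothetical helper: wait for the element, try a trusted click, and fall
  # back to the synthetic click only when the element has no usable box.
  def click(page, css) do
    with :ok <- Page.wait_for_selector(page, css, timeout: 5_000) do
      page
      |> Page.click(css)
      |> fallback(fn -> Page.click(page, css, trusted: false) end)
    end
  end

  # Pure decision: only {:error, {:not_clickable, _}} triggers the fallback;
  # :ok and every other error are returned unchanged.
  def fallback({:error, {:not_clickable, _css}}, untrusted_click), do: untrusted_click.()
  def fallback(result, _untrusted_click), do: result
end
```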

continue_request(page, request_id, opts \\ [])

@spec continue_request(t(), String.t(), keyword()) :: :ok | {:error, term()}

Lets a paused request proceed (Fetch.continueRequest), optionally rewriting it.

:url, :method, and :headers are verbatim overrides, not merges. In particular :headers replaces the entire request header set, so passing it to set one header drops everything Chrome would otherwise send (User-Agent, Accept, Cookie, …). Omit :headers to leave the original request headers intact (the same gotcha as Puppeteer's continueRequest({headers})).

Options (all optional): :url, :method, :headers (a name => value map or keyword list), :post_data (a binary or iodata, base64-encoded for you), :timeout.
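Because :headers replaces the whole header set, adding a single header means merging it over the original ones yourself. The sketch below assumes the Fetch.requestPaused params carry the original request headers under params["request"]["headers"], as the CDP event does; HeaderOverride and its functions are hypothetical names.

```elixir
defmodule HeaderOverride do
  alias CDPEx.Page

  # Merges `extra` over the headers the paused request already carries,
  # so the override adds or changes headers instead of replacing the set.
  def merged_headers(params, extra) do
    original = get_in(params, ["request", "headers"]) || %{}
    Map.merge(original, extra)
  end

  # `params` is the payload of a Fetch.requestPaused event.
  def continue_with(page, params, extra) do
    Page.continue_request(page, params["requestId"],
      headers: merged_headers(params, extra)
    )
  end
end
```

Map.merge/2 compares keys exactly, so a header given in a different case than the original (for example "accept" versus "Accept") would be added alongside it rather than replace it.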

cookies(page, opts \\ [])

@spec cookies(
  t(),
  keyword()
) :: {:ok, [map()]} | {:error, term()}

Returns all browser cookies as a list of CDP cookie maps.

Lazily enables the Network domain. Options: :timeout (default 10_000).

disable_request_interception(page, opts \\ [])

@spec disable_request_interception(
  t(),
  keyword()
) :: :ok | {:error, term()}

Disables request interception — unsubscribes the caller from Fetch.requestPaused and disables the Fetch domain. Resolve any still-paused requests first.

Call this from the same process that called enable_request_interception/2: the unsubscribe is keyed to self(), so a disable from a different process leaves the original subscriber still receiving (now-unresolvable) pauses.

enable_request_interception(page, opts \\ [])

@spec enable_request_interception(
  t(),
  keyword()
) :: :ok | {:error, term()}

Enables request interception: pauses matching requests and delivers a Fetch.requestPaused event to the calling process for each one. You must then resolve every paused request with continue_request/3, fulfill_request/3, or fail_request/3 (keyed by its "requestId") — an unresolved request stalls the page.

Each pause arrives as {:cdp_event, conn, "Fetch.requestPaused", params, session_id}; handle it in a handle_info. The caller is subscribed before the domain is enabled, so no paused request is missed.

Drive interception from one long-lived process

Use the same process for enable_request_interception/2, the pause handling, and disable_request_interception/2 — the subscription is keyed to its pid. That process is registered with the browser as the interception owner: if it exits without disabling, the browser auto-Fetch.disables the page, so a crashed or forgetful caller can't leave it bricked (every request paused with no resolver). While interception is enabled you must still resolve every pause.

Only :dedicated pages are supported; a :session-transport page is rejected with {:error, {:unsupported_transport, :session}} (mirroring authenticate/4) — its subscription and owner-monitor would outlive close_page, which never stops the shared browser connection.

Mutually exclusive with authenticate/4 on the same page — both drive the Fetch domain. The conflict is enforced: enabling interception on an authenticated page returns {:error, {:conflict, :authenticated}}, and authenticate/4 on an intercepting page returns {:error, {:conflict, :intercepting}}. Re-enabling interception on a page that already has it returns {:error, :already_intercepting}.

Options:

  • :patterns — CDP RequestPatterns (default [%{"urlPattern" => "*"}], all requests)
  • :timeout — ms for the enable call (default 10_000)
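The "one long-lived process" advice above might look like the following GenServer. It is a sketch under assumptions: ImageBlocker and decide/1 are hypothetical, and the "resourceType" key with the value "Image" is taken from the CDP Fetch.requestPaused event rather than from this page.

```elixir
defmodule ImageBlocker do
  use GenServer
  alias CDPEx.Page

  def start_link(page), do: GenServer.start_link(__MODULE__, page)

  @impl true
  def init(page) do
    # Enabling here keys the subscription (and ownership) to this process.
    case Page.enable_request_interception(page) do
      :ok -> {:ok, page}
      {:error, reason} -> {:stop, reason}
    end
  end

  @impl true
  def handle_info({:cdp_event, _conn, "Fetch.requestPaused", params, _session_id}, page) do
    request_id = params["requestId"]

    # Every pause is resolved one way or the other.
    case decide(params) do
      :block -> Page.fail_request(page, request_id, reason: :blocked_by_client)
      :continue -> Page.continue_request(page, request_id)
    end

    {:noreply, page}
  end

  def handle_info(_other, page), do: {:noreply, page}

  @impl true
  def terminate(_reason, page) do
    # Best effort on a graceful stop, from the same process that enabled it.
    Page.disable_request_interception(page)
  end

  # Pure policy: block images, let everything else through.
  def decide(%{"resourceType" => "Image"}), do: :block
  def decide(_params), do: :continue
end
```

terminate/2 is not guaranteed to run on a crash; the owner registration described above is what covers that case.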

evaluate(page, js, opts \\ [])

@spec evaluate(t(), String.t(), keyword()) :: {:ok, term()} | {:error, term()}

Evaluates a JavaScript expression and returns its value (returnByValue).

A thrown JS exception is {:error, {:evaluate_exception, details}}.

The value must be JSON-serializable. Under returnByValue, three results are worth knowing (return a serializable projection like el.outerHTML / el.id to avoid them):

  • a DOM node or function serializes lossily to {:ok, %{}} — an empty map, not the object and not an error;
  • an unserializable number Chrome reports only as an unserializableValue (NaN, Infinity, -0, a BigInt) has no by-value value and surfaces as {:error, {:unserializable_value, uv}} (the raw string, e.g. "NaN");
  • a value Chrome can't serialize at all — a self-referential object like window, a circular structure, or a Symbol — fails the call as {:error, {:cdp_error, "Runtime.evaluate", _}}.

Options: :timeout (default 15_000), :await_promise (default false).
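As an illustration of handling these result shapes, the sketch below normalizes an expression that is expected to yield a number. EvalResult, number/2, normalize/1, and the atoms :nan, :infinity, and :neg_infinity are hypothetical; the "Infinity" and "-Infinity" strings are assumed to follow CDP's unserializableValue spelling, as "NaN" does above.

```elixir
defmodule EvalResult do
  alias CDPEx.Page

  # Hypothetical wrapper for expressions expected to yield a number.
  def number(page, js) do
    page |> Page.evaluate(js) |> normalize()
  end

  # Maps the documented result shapes onto plain Elixir values.
  def normalize({:ok, n}) when is_number(n), do: {:ok, n}
  def normalize({:error, {:unserializable_value, "NaN"}}), do: {:ok, :nan}
  def normalize({:error, {:unserializable_value, "Infinity"}}), do: {:ok, :infinity}
  def normalize({:error, {:unserializable_value, "-Infinity"}}), do: {:ok, :neg_infinity}
  # A lossy %{} (DOM node, function) or any other non-number lands here.
  def normalize({:ok, other}), do: {:error, {:not_a_number, other}}
  def normalize({:error, _reason} = error), do: error
end
```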

fail_request(page, request_id, opts \\ [])

@spec fail_request(t(), String.t(), keyword()) :: :ok | {:error, term()}

Fails a paused request (Fetch.failRequest).

:reason (default :failed) is one of :failed, :aborted, :timed_out, :access_denied, :connection_closed, :connection_reset, :connection_refused, :name_not_resolved, :internet_disconnected, :address_unreachable, :blocked_by_client, :blocked_by_response. An unknown value returns {:error, {:invalid_error_reason, value}}.

fulfill_request(page, request_id, opts \\ [])

@spec fulfill_request(t(), String.t(), keyword()) :: :ok | {:error, term()}

Answers a paused request with a synthetic response (Fetch.fulfillRequest) — the page never hits the network for it.

Options: :status (response code, default 200), :headers (a name => value map or keyword list), :body (a binary or iodata, base64-encoded for you), :timeout.

html(page, opts \\ [])

@spec html(
  t(),
  keyword()
) :: {:ok, String.t()} | {:error, term()}

Returns the page's full serialized HTML (document.documentElement.outerHTML).

observe_network(page, opts \\ [])

@spec observe_network(
  t(),
  keyword()
) :: :ok | {:error, term()}

Starts observing network traffic, delivering CDP Network events to the calling process.

Subscribes the caller to :events (default the request + response lifecycle), then enables the Network domain (idempotent). Each event arrives as {:cdp_event, conn, method, params, session_id} — handle them in a handle_info. Call stop_observing_network/2 to unsubscribe.

Start observing before navigating: requests already in flight when you call this are not captured. Delivery is scoped to this page's session, so on a :session-transport connection you receive only this page's Network events, not those of other pages sharing the socket. Observing several pages from one process accumulates — you receive each observed page's events, told apart by the session_id element.

Options:

  • :events, the Network.* method names to subscribe to (default: the request + response lifecycle)
  • :timeout — ms for the enable call (default 10_000)
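A rough sketch of observing from a plain process follows. It assumes :events takes a list of method-name strings and that the response URL sits under params["response"]["url"] as in CDP; NetworkLog, record/3, and drain/2 are hypothetical names.

```elixir
defmodule NetworkLog do
  alias CDPEx.Page

  @events ["Network.responseReceived"]

  # Observes responses while `action` runs, then returns {url, status} pairs.
  def record(page, action, settle_ms \\ 500) when is_function(action, 0) do
    with :ok <- Page.observe_network(page, events: @events) do
      action.()
      responses = drain(settle_ms, [])
      # Repeat the same custom :events list so the subscription is removed.
      Page.stop_observing_network(page, events: @events)
      {:ok, responses}
    end
  end

  # Collects delivered events until none arrives for `settle_ms`.
  def drain(settle_ms, acc) do
    receive do
      {:cdp_event, _conn, "Network.responseReceived", params, _session_id} ->
        response = params["response"]
        drain(settle_ms, [{response["url"], response["status"]} | acc])
    after
      settle_ms -> Enum.reverse(acc)
    end
  end
end
```

The action passed to record/3 would typically be the navigation or click that triggers the traffic, so that observing starts before the requests do.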

pdf(page, opts \\ [])

@spec pdf(
  t(),
  keyword()
) :: {:ok, binary()} | {:error, term()}

Renders the page to PDF (Page.printToPDF).

Returns {:ok, data} where data is the PDF bytes — or, when :path is given, the written file path (also a binary). Options: :path, :landscape (default false), :print_background (default true), :timeout (default 30_000).

press(page, css, key, opts \\ [])

@spec press(t(), String.t() | nil, String.t(), keyword()) :: :ok | {:error, term()}

Presses a single named key, dispatching real keyDown/keyUp events.

When css is a selector, the element is focused first; when css is nil, the key goes to the currently-focused element. Supported keys: Enter, Tab, Escape, Backspace, Delete, ArrowUp/ArrowDown/ArrowLeft/ArrowRight, Home, End.

Returns :ok, {:error, {:selector_not_found, css}}, or {:error, {:unknown_key, key}}. Option: :timeout (default 15_000).

response_body(page, request_id, opts \\ [])

@spec response_body(t(), String.t(), keyword()) :: {:ok, binary()} | {:error, term()}

Returns a response's body by its request_id (from a Network.responseReceived event), via Network.getResponseBody.

Returns {:ok, body} (decoding base64 when Chrome sends it that way) or {:error, reason}. The Network domain must have been enabled (e.g. via observe_network/2) when the request was captured — unlike the other Network ops this does not lazily enable it, since enabling now can't recover a past body. If it wasn't enabled, the call surfaces as {:error, {:cdp_error, "Network.getResponseBody", _}}.

The body is only retrievable once the request has finished loading. Calling this in the window between Network.responseReceived (what wait_for_response/3 resolves on) and Network.loadingFinished can return {:error, {:cdp_error, "Network.getResponseBody", %{"code" => -32000}}} ("No data found …"). Wait for the network to settle (e.g. wait_for_network_idle/2) before reading, or retry on that transient error.

Options: :timeout (default 10_000).
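The "retry on that transient error" option can be sketched as a bounded retry. BodyReader, read/3, and retry/3 are hypothetical names, and the attempt count and pause are arbitrary. The retry matches only the -32000 shape quoted above and returns every other result immediately.

```elixir
defmodule BodyReader do
  alias CDPEx.Page

  # Reads a body, retrying only on the transient -32000 error.
  def read(page, request_id, attempts \\ 5) do
    retry(attempts, fn -> Page.response_body(page, request_id) end)
  end

  # Calls `fetch` up to `attempts` times, pausing between tries.
  def retry(attempts, fetch, pause_ms \\ 100) when attempts >= 1 do
    case fetch.() do
      {:error, {:cdp_error, "Network.getResponseBody", %{"code" => -32000}}}
      when attempts > 1 ->
        Process.sleep(pause_ms)
        retry(attempts - 1, fetch, pause_ms)

      result ->
        result
    end
  end
end
```

The bound matters: if the body can never be retrieved (for instance because the Network domain was not enabled when the request was captured), the last error is returned instead of looping forever.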

screenshot(page, opts \\ [])

@spec screenshot(
  t(),
  keyword()
) :: {:ok, binary()} | {:error, term()}

Captures a PNG screenshot.

Returns {:ok, data} where data is the PNG bytes — or, when :path is given, the written file path (also a binary).

Options: :path (write to file), :full_page (capture beyond the viewport, default false), :timeout (default 30_000).

set_cookies(page, cookies, opts \\ [])

@spec set_cookies(t(), [map()], keyword()) :: :ok | {:error, term()}

Sets cookies. Each is a CDP CookieParam map — at least "name", "value", and a "url" or "domain". Lazily enables Network. Options: :timeout.

set_extra_headers(page, headers, opts \\ [])

@spec set_extra_headers(t(), %{optional(String.t()) => String.t()}, keyword()) ::
  :ok | {:error, term()}

Sets extra HTTP headers sent with every subsequent request on this page.

headers is a map of header name => value; set them before navigating for them to apply to that navigation. Lazily enables Network. Options: :timeout.

set_user_agent(page, user_agent, opts \\ [])

@spec set_user_agent(t(), String.t(), keyword()) :: :ok | {:error, term()}

Overrides the page's User-Agent (Emulation.setUserAgentOverride).

Options:

  • :user_agent_metadata — a CDP Emulation.UserAgentMetadata map (string keys, e.g. %{"platform" => "macOS", "mobile" => false, "brands" => [...]}). Sets the UA Client Hints surface (navigator.userAgentData, the Sec-CH-UA* headers) so it stays consistent with the UA string. Overriding only the string leaves Client Hints at Chrome's defaults — a visible navigator.userAgent ↔ Client-Hints mismatch. Passed through verbatim, so it must match the CDP shape: a partial or empty map is not dropped — it's sent as-is and Chrome rejects the whole setUserAgentOverride call. Omit the option entirely to leave Client Hints at Chrome's default.
  • :accept_language — value for the Accept-Language header and navigator.language (e.g. "en-US").
  • :timeout (default 10_000).

Example

Page.set_user_agent(page, "Mozilla/5.0 (Macintosh…) Chrome/120.0.0.0 Safari/537.36",
  user_agent_metadata: %{
    "platform" => "macOS",
    "platformVersion" => "14.0.0",
    "architecture" => "arm",
    "model" => "",
    "mobile" => false,
    "brands" => [
      %{"brand" => "Chromium", "version" => "120"},
      %{"brand" => "Not=A?Brand", "version" => "24"}
    ]
  },
  accept_language: "en-US"
)

set_viewport(page, width, height, opts \\ [])

@spec set_viewport(t(), pos_integer(), pos_integer(), keyword()) ::
  :ok | {:error, term()}

Overrides the viewport via Emulation.setDeviceMetricsOverride.

width/height are CSS pixels. Options: :device_scale_factor (default 1), :mobile (default false), :timeout. Returns :ok.

stop_observing_network(page, opts \\ [])

@spec stop_observing_network(
  t(),
  keyword()
) :: :ok

Stops observing — unsubscribes the caller from the network :events. Leaves the Network domain enabled.

Pass the same :events you gave observe_network/2. Both default to the request + response lifecycle, but if you observed with a custom list you must repeat it here — otherwise the original subscriptions are never removed and the caller keeps receiving those events.

text(page, css, opts \\ [])

@spec text(t(), String.t(), keyword()) :: {:ok, String.t() | nil} | {:error, term()}

Returns the textContent of the first element matching css, or nil when no element matches.

type(page, css, text, opts \\ [])

@spec type(t(), String.t(), String.t(), keyword()) :: :ok | {:error, term()}

Types text into the first element matching css.

Focuses the element, then sends the text via Input.insertText (fires input and change; does not emit per-character keydown/keyup).

Returns :ok or {:error, {:selector_not_found, css}}. Option: :timeout (default 15_000).

visible?(page, css, opts \\ [])

@spec visible?(t(), String.t(), keyword()) :: {:ok, boolean()} | {:error, term()}

Returns {:ok, true} when the first element matching css is rendered and visible (has layout boxes, not display: none / visibility: hidden), {:ok, false} otherwise — including when no element matches.

wait_for_function(page, js, opts \\ [])

@spec wait_for_function(t(), String.t(), keyword()) :: :ok | {:error, term()}

Polls a JavaScript expression until it is truthy, or timeout elapses.

The expression is coerced with !!(...), so JS truthiness applies. Returns :ok, {:error, :timeout}, or {:error, reason} if a non-transient evaluate error occurs (e.g. a thrown exception or a dropped connection). Options: :timeout (default 5_000), :interval (poll interval ms, default 100).

wait_for_navigation(page, opts \\ [])

@spec wait_for_navigation(
  t(),
  keyword()
) :: :ok | {:error, term()}

Waits for a navigation lifecycle milestone, without issuing a navigation.

Useful after a click/3 (or other in-page action) that triggers navigation.

Options:

  • :wait_until, one of :network_almost_idle (default), :load, or :none
  • :timeout — ms (default 30_000)

Returns :ok, {:error, :timeout}, or {:error, reason} if the connection drops while waiting.

wait_for_network_idle(page, opts \\ [])

@spec wait_for_network_idle(
  t(),
  keyword()
) :: :ok | {:error, term()}

Blocks until the network has been idle — at most :max_inflight in-flight requests — for :idle_time ms continuously, or timeout.

The Puppeteer "networkidle" primitive: use it after a click/3 (or other action) that kicks off XHR/fetch hydration to wait for the page to settle. Idleness is measured from the call onward — requests already in flight when you call are not counted — so call it right after triggering the work.

Returns :ok once idle, {:error, :timeout} if it never settles within timeout (e.g. a streaming / SSE / long-poll connection that never closes), or {:error, reason} if the connection drops. Lazily enables the Network domain.

The idle wait runs in a short-lived helper process with its own subscription, so it composes safely with observe_network/2 on the same page — the caller's subscription and any buffered events are left untouched. A normal connection drop surfaces as {:error, :noproc} / {:error, {:ws_closed, _}}; only an abnormal crash of that internal helper returns {:error, {:idle_wait_failed, reason}}.

Options:

  • :idle_time — ms of continuous idleness required (default 500)
  • :max_inflight — in-flight requests still considered idle (default 0; 2 is Puppeteer's networkidle2)
  • :timeout — overall ceiling in ms (default 30_000)

wait_for_response(page, matcher, opts \\ [])

@spec wait_for_response(
  t(),
  (String.t() -> boolean()) | Regex.t() | String.t(),
  keyword()
) ::
  {:ok, map()} | {:error, term()}

Blocks until a network response whose URL matches matcher arrives, or timeout.

Useful after a click/3 (or other in-page action) that triggers an XHR/fetch: wait for the specific response, then read it with response_body/3 (the returned params carry the "requestId").

matcher selects on the response URL and may be:

  • a function (url :: String.t() -> boolean()),
  • a Regex (matched against the URL), or
  • a binary substring (matched with String.contains?/2).

A function matcher runs inside the connection process for each candidate response, so keep it fast and side-effect-free (a slow matcher stalls the socket, and every page on a :session-transport connection with it).

Returns {:ok, params} — the full Network.responseReceived params (HTTP status under params["response"]["status"], request id under params["requestId"]) — or {:error, :timeout} if nothing matched in time, or {:error, reason} if the connection drops. Lazily enables the Network domain. Only responses observed after this call are considered, so call it before triggering the request.

Returns {:ok, params}, not a bare :ok

Unlike wait_for_navigation/2 / wait_for_selector/3 / wait_for_network_idle/2 (which return :ok), this returns the matched event — match on {:ok, params}.

Options: :timeout — ms (default 30_000).

wait_for_selector(page, css, opts \\ [])

@spec wait_for_selector(t(), String.t(), keyword()) :: :ok | {:error, term()}

Polls until css matches an element, or timeout elapses.

Returns :ok, {:error, :timeout}, or {:error, reason} if a non-transient evaluate error occurs (e.g. the connection drops). Options: :timeout (default 5_000), :interval (poll interval ms, default 100).