CDPEx.Page (CDPEx v0.4.0)


A page (tab) handle and the operations you run against it.

A CDPEx.Page is a lightweight struct — not a process — holding the page's CDPEx.Connection pid and target id. Operations are functions over that connection, so the OTP properties (supervision, crash isolation) live in the connection/browser layer while page calls stay ergonomic.

Obtain one with CDPEx.new_page/2. If the underlying page dies (navigation to a new target, a crash), operations return {:error, :noproc} and you should open a fresh page.
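The "open a fresh page" advice can be wrapped in a small retry helper. The sketch below is illustrative only: `FreshPageSketch`, `open_page`, and `operation` are hypothetical names, and the helper is deliberately independent of CDPEx so that the control flow is easy to follow.

```elixir
defmodule FreshPageSketch do
  @moduledoc false

  # open_page: zero-arity fun returning {:ok, page} | {:error, reason} (hypothetical).
  # operation: one-arity fun that takes a page and returns the operation's result.
  def run(page, open_page, operation, retries \\ 1) do
    case operation.(page) do
      {:error, :noproc} when retries > 0 ->
        # The page behind this handle is gone: open a fresh one and retry.
        with {:ok, fresh} <- open_page.() do
          run(fresh, open_page, operation, retries - 1)
        end

      other ->
        # Success, or an error that a fresh page would not fix.
        other
    end
  end
end
```

With CDPEx this might be called as `FreshPageSketch.run(page, fn -> CDPEx.new_page(browser, []) end, &CDPEx.Page.html/1)`, assuming `new_page/2` takes the browser plus options and returns `{:ok, page}`; that signature is not shown on this page.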

Operations

Summary

Functions

attribute/4
Returns the value of the attribute name on the first element matching css, or nil when the element or attribute is absent.

authenticate/4
Arms HTTP/proxy authentication on this page with username/password.

call_function/4
Calls a JavaScript function with args and returns its value.

clear_cookies/2
Clears all browser cookies. Lazily enables Network. Options: :timeout.

click/3
Clicks the first element matching css (a synthetic JS .click()).

continue_request/3
Lets a paused request proceed (Fetch.continueRequest), optionally rewriting it.

cookies/2
Returns all browser cookies as a list of CDP cookie maps.

disable_request_interception/2
Disables request interception: unsubscribes the caller from Fetch.requestPaused and disables the Fetch domain. Resolve any still-paused requests first.

enable_request_interception/2
Enables request interception: pauses matching requests and delivers a Fetch.requestPaused event to the calling process for each one. You must then resolve every paused request with continue_request/3, fulfill_request/3, or fail_request/3 (keyed by its "requestId"); an unresolved request stalls the page.

evaluate/3
Evaluates a JavaScript expression and returns its value (returnByValue).

fail_request/3
Fails a paused request (Fetch.failRequest).

fulfill_request/3
Answers a paused request with a synthetic response (Fetch.fulfillRequest), so the page never hits the network for it.

html/2
Returns the page's full serialized HTML (document.documentElement.outerHTML).

navigate/3
Navigates to url and (by default) waits until the network is almost idle.

observe_network/2
Starts observing network traffic, delivering CDP Network events to the calling process.

pdf/2
Renders the page to PDF (Page.printToPDF).

response_body/3
Returns a response's body by its request_id (from a Network.responseReceived event), via Network.getResponseBody.

screenshot/2
Captures a PNG screenshot.

set_cookies/3
Sets cookies. Each is a CDP CookieParam map with at least "name", "value", and a "url" or "domain". Lazily enables Network. Options: :timeout.

set_extra_headers/3
Sets extra HTTP headers sent with every subsequent request on this page.

set_user_agent/3
Overrides the page's User-Agent (Emulation.setUserAgentOverride).

set_viewport/4
Overrides the viewport via Emulation.setDeviceMetricsOverride.

stop_observing_network/2
Stops observing: unsubscribes the caller from the network :events. Leaves the Network domain enabled.

text/3
Returns the textContent of the first element matching css, or nil when no element matches.

visible?/3
Returns {:ok, true} when the first element matching css is rendered and visible (has layout boxes, not display: none / visibility: hidden), {:ok, false} otherwise, including when no element matches.

wait_for_function/3
Polls a JavaScript expression until it is truthy, or timeout elapses.

wait_for_navigation/2
Waits for a navigation lifecycle milestone, without issuing a navigation.

wait_for_network_idle/2
Blocks until the network has been idle (at most :max_inflight in-flight requests) for :idle_time ms continuously, or timeout.

wait_for_response/3
Blocks until a network response whose URL matches matcher arrives, or timeout.

wait_for_selector/3
Polls until css matches an element, or timeout elapses.

Types

t()

@type t() :: %CDPEx.Page{
  browser: pid(),
  conn: pid(),
  session_id: String.t() | nil,
  target_id: String.t()
}

Functions

attribute(page, css, name, opts \\ [])

@spec attribute(t(), String.t(), String.t(), keyword()) ::
  {:ok, String.t() | nil} | {:error, term()}

Returns the value of the attribute name on the first element matching css, or nil when the element or attribute is absent.

authenticate(page, username, password, opts \\ [])

@spec authenticate(t(), String.t(), String.t(), keyword()) :: :ok | {:error, term()}

Arms HTTP/proxy authentication on this page with username/password.

Headless Chrome launched with --proxy-server=host:port can't send proxy credentials, so an authenticated proxy rejects the connection (net::ERR_INVALID_AUTH_CREDENTIALS). Call this after new_page/2 and before navigate/3: it answers the proxy (or HTTP Basic) auth challenge with the given credentials. It also covers Basic-auth-gated origins.

This enables the CDP Fetch domain for the page, which pauses (and auto-continues) every request — measurable overhead on heavy pages.

Only :dedicated pages (the new_page/2 default) are supported; a :session page returns {:error, {:unsupported_transport, :session}}. A page that isn't one of this browser's open pages returns {:error, :unknown_page}, a page that is already authenticated returns {:error, :already_authenticated}, and a page that already has request interception enabled returns {:error, {:conflict, :intercepting}} (auth and interception both drive the Fetch domain — use one per page).

The bad-credentials loop guard keys on the request id, so a single request that must answer both a proxy and an origin challenge isn't supported — the second challenge is cancelled (Puppeteer-parity).

Options:

  • :source — which challenges to answer: :any (default), :proxy, :server. An unknown value returns {:error, {:invalid_source, value}}.
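The return shapes listed above lend themselves to a single pattern match. The following is a minimal sketch that maps each documented result of authenticate/4 to a human-readable outcome; `AuthResultSketch` and its messages are hypothetical, and only the tuples themselves come from the text above.

```elixir
defmodule AuthResultSketch do
  @moduledoc false

  # Each clause matches one result shape documented for authenticate/4.
  def explain(:ok), do: :ok

  def explain({:error, {:unsupported_transport, :session}}),
    do: {:error, "authenticate/4 needs a :dedicated page"}

  def explain({:error, :unknown_page}),
    do: {:error, "page is not one of this browser's open pages"}

  def explain({:error, :already_authenticated}),
    do: {:error, "credentials were already armed on this page"}

  def explain({:error, {:conflict, :intercepting}}),
    do: {:error, "request interception is enabled; use one Fetch consumer per page"}

  def explain({:error, {:invalid_source, value}}),
    do: {:error, "unknown :source value: " <> inspect(value)}

  # Anything else (timeouts, dropped connections, ...) is passed through as text.
  def explain({:error, other}),
    do: {:error, "unexpected error: " <> inspect(other)}
end

# Intended use, with a real page and before navigating:
#   page
#   |> CDPEx.Page.authenticate("user", "secret", source: :proxy)
#   |> AuthResultSketch.explain()
```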

call_function(page, function_declaration, args \\ [], opts \\ [])

@spec call_function(t(), String.t(), [term()], keyword()) ::
  {:ok, term()} | {:error, term()}

Calls a JavaScript function with args and returns its value.

function_declaration is a JS function expression (e.g. "(a, b) => a + b"). args are JSON-encoded (not string-interpolated) before being applied, so passing data values through them is safe. A thrown exception is {:error, {:evaluate_exception, details}}; non-serializable args return {:error, {:invalid_args, reason}}.

Trusted input

function_declaration is interpolated into the page script verbatim — treat it as trusted code and never build it from untrusted input.

Options: :timeout (default 15_000), :await_promise (default false).
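To make the difference between interpolating data and passing it through args concrete, here is a small sketch. `CallFunctionArgsSketch` is a hypothetical helper that only builds the inputs you would hand to call_function/4; it does not talk to a browser.

```elixir
defmodule CallFunctionArgsSketch do
  @moduledoc false

  # Risky: the caller's text becomes part of the JavaScript source itself.
  def interpolated(text), do: "() => document.title.includes('#{text}')"

  # Safer: the source is a constant and the text travels as an argument,
  # which the docs above say is JSON-encoded before being applied.
  def with_args(text), do: {"(needle) => document.title.includes(needle)", [text]}
end

hostile = "') || alert('pwned"

# The interpolated form now contains a second, attacker-chosen expression.
IO.puts(CallFunctionArgsSketch.interpolated(hostile))

# The args form keeps the declaration fixed no matter what the text is.
{declaration, args} = CallFunctionArgsSketch.with_args(hostile)
IO.inspect({declaration, args})
```

The resulting pair would then be passed as `CDPEx.Page.call_function(page, declaration, args)`.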

clear_cookies(page, opts \\ [])

@spec clear_cookies(
  t(),
  keyword()
) :: :ok | {:error, term()}

Clears all browser cookies. Lazily enables Network. Options: :timeout.

click(page, css, opts \\ [])

@spec click(t(), String.t(), keyword()) :: :ok | {:error, term()}

Clicks the first element matching css (a synthetic JS .click()).

Returns :ok, or {:error, {:selector_not_found, css}} when nothing matches.

continue_request(page, request_id, opts \\ [])

@spec continue_request(t(), String.t(), keyword()) :: :ok | {:error, term()}

Lets a paused request proceed (Fetch.continueRequest), optionally rewriting it.

:url, :method, and :headers are verbatim overrides, not merges. In particular :headers replaces the entire request header set, so passing it to set one header drops everything Chrome would otherwise send (User-Agent, Accept, Cookie, …). Omit :headers to leave the original request headers intact (the same gotcha as Puppeteer's continueRequest({headers})).

Options (all optional): :url, :method, :headers (a name => value map or keyword list), :post_data (a binary or iodata, base64-encoded for you), :timeout.
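Because :headers replaces rather than merges, one way to add or change a single header is to start from the headers carried by the paused request. The sketch below assumes the Fetch.requestPaused params expose the original headers under `params["request"]["headers"]` (as the CDP event does); whether that map contains every header Chrome would eventually send is not confirmed here. `ContinueHeadersSketch` is a hypothetical helper.

```elixir
defmodule ContinueHeadersSketch do
  @moduledoc false

  # params: the "Fetch.requestPaused" event params.
  # extra:  a name => value map of headers to add or override.
  def merged_headers(params, extra) when is_map(extra) do
    original = get_in(params, ["request", "headers"]) || %{}

    # Header names are case-insensitive, so compare them downcased.
    overridden = MapSet.new(Map.keys(extra), &String.downcase/1)

    original
    |> Enum.reject(fn {name, _value} -> String.downcase(name) in overridden end)
    |> Map.new()
    |> Map.merge(extra)
  end
end

# Intended use inside a pause handler:
#   headers = ContinueHeadersSketch.merged_headers(params, %{"X-Trace" => "1"})
#   CDPEx.Page.continue_request(page, params["requestId"], headers: headers)
```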

cookies(page, opts \\ [])

@spec cookies(
  t(),
  keyword()
) :: {:ok, [map()]} | {:error, term()}

Returns all browser cookies as a list of CDP cookie maps.

Lazily enables the Network domain. Options: :timeout (default 10_000).

disable_request_interception(page, opts \\ [])

@spec disable_request_interception(
  t(),
  keyword()
) :: :ok | {:error, term()}

Disables request interception — unsubscribes the caller from Fetch.requestPaused and disables the Fetch domain. Resolve any still-paused requests first.

Call this from the same process that called enable_request_interception/2: the unsubscribe is keyed to self(), so a disable from a different process leaves the original subscriber still receiving (now-unresolvable) pauses.

enable_request_interception(page, opts \\ [])

@spec enable_request_interception(
  t(),
  keyword()
) :: :ok | {:error, term()}

Enables request interception: pauses matching requests and delivers a Fetch.requestPaused event to the calling process for each one. You must then resolve every paused request with continue_request/3, fulfill_request/3, or fail_request/3 (keyed by its "requestId") — an unresolved request stalls the page.

Each pause arrives as {:cdp_event, conn, "Fetch.requestPaused", params, session_id}; handle it in a handle_info. The caller is subscribed before the domain is enabled, so no paused request is missed.

Drive interception from one long-lived process

Use the same process for enable_request_interception/2, the pause handling, and disable_request_interception/2 — the subscription is keyed to its pid. That process is registered with the browser as the interception owner: if it exits without disabling, the browser auto-Fetch.disables the page, so a crashed or forgetful caller can't leave it bricked (every request paused with no resolver). While interception is enabled you must still resolve every pause.

Only :dedicated pages are supported; a :session-transport page is rejected with {:error, {:unsupported_transport, :session}} (mirroring authenticate/4) — its subscription and owner-monitor would outlive close_page, which never stops the shared browser connection.

Mutually exclusive with authenticate/4 on the same page — both drive the Fetch domain. The conflict is enforced: enabling interception on an authenticated page returns {:error, {:conflict, :authenticated}}, and authenticate/4 on an intercepting page returns {:error, {:conflict, :intercepting}}. Re-enabling interception on a page that already has it returns {:error, :already_intercepting}.

Options:

  • :patterns — CDP RequestPatterns (default [%{"urlPattern" => "*"}], all requests)
  • :timeout — ms for the enable call (default 10_000)
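The "one long-lived process" advice maps naturally onto a GenServer that enables interception in init/1 and resolves every pause in handle_info/2. This is a sketch under assumptions, not the library's recommended implementation: `InterceptorSketch` and its `decide/1` policy are hypothetical, and the module needs the CDPEx dependency at runtime.

```elixir
defmodule InterceptorSketch do
  @moduledoc false
  use GenServer

  def start_link(page), do: GenServer.start_link(__MODULE__, page)

  @impl true
  def init(page) do
    # Runs in the GenServer process, so the subscription is keyed to this pid.
    case CDPEx.Page.enable_request_interception(page, patterns: [%{"urlPattern" => "*"}]) do
      :ok -> {:ok, page}
      {:error, reason} -> {:stop, reason}
    end
  end

  @impl true
  def handle_info({:cdp_event, _conn, "Fetch.requestPaused", params, _session_id}, page) do
    id = params["requestId"]
    url = get_in(params, ["request", "url"]) || ""

    # Every pause is resolved exactly once; an unresolved one stalls the page.
    _result =
      case decide(url) do
        :continue -> CDPEx.Page.continue_request(page, id)
        {:fail, reason} -> CDPEx.Page.fail_request(page, id, reason: reason)
        {:fulfill, opts} -> CDPEx.Page.fulfill_request(page, id, opts)
      end

    {:noreply, page}
  end

  def handle_info(_other, page), do: {:noreply, page}

  # Hypothetical policy: block images, stub a health endpoint, pass the rest.
  def decide(url) do
    cond do
      String.ends_with?(url, [".png", ".jpg"]) -> {:fail, :blocked_by_client}
      String.contains?(url, "/api/health") -> {:fulfill, [status: 200, body: "ok"]}
      true -> :continue
    end
  end
end
```

Keeping the policy in a pure function such as `decide/1` makes it testable without a browser. The sketch relies on the documented owner cleanup if the process exits; a production version would likely also call disable_request_interception/2 from the same process on orderly shutdown.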

evaluate(page, js, opts \\ [])

@spec evaluate(t(), String.t(), keyword()) :: {:ok, term()} | {:error, term()}

Evaluates a JavaScript expression and returns its value (returnByValue).

A thrown JS exception is {:error, {:evaluate_exception, details}}.

Options: :timeout (default 15_000), :await_promise (default false).

fail_request(page, request_id, opts \\ [])

@spec fail_request(t(), String.t(), keyword()) :: :ok | {:error, term()}

Fails a paused request (Fetch.failRequest).

:reason (default :failed) is one of :failed, :aborted, :timed_out, :access_denied, :connection_closed, :connection_reset, :connection_refused, :name_not_resolved, :internet_disconnected, :address_unreachable, :blocked_by_client, :blocked_by_response. An unknown value returns {:error, {:invalid_error_reason, value}}.

fulfill_request(page, request_id, opts \\ [])

@spec fulfill_request(t(), String.t(), keyword()) :: :ok | {:error, term()}

Answers a paused request with a synthetic response (Fetch.fulfillRequest) — the page never hits the network for it.

Options: :status (response code, default 200), :headers (a name => value map or keyword list), :body (a binary or iodata, base64-encoded for you), :timeout.

html(page, opts \\ [])

@spec html(
  t(),
  keyword()
) :: {:ok, String.t()} | {:error, term()}

Returns the page's full serialized HTML (document.documentElement.outerHTML).

observe_network(page, opts \\ [])

@spec observe_network(
  t(),
  keyword()
) :: :ok | {:error, term()}

Starts observing network traffic, delivering CDP Network events to the calling process.

Subscribes the caller to :events (default the request + response lifecycle), then enables the Network domain (idempotent). Each event arrives as {:cdp_event, conn, method, params, session_id} — handle them in a handle_info. Call stop_observing_network/2 to unsubscribe.

Start observing before navigating: requests already in flight when you call this are not captured. On a session-transport page the caller receives every session's events on the shared connection (subscriptions are keyed by method, not session); match on the session_id element to filter to this page.

Options:

  • :events - Network.* method names (default request + response lifecycle)
  • :timeout — ms for the enable call (default 10_000)
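The event tuple shape and the session filtering described above can be sketched with a plain receive loop. `NetworkEventsSketch` is a hypothetical helper that drains `:cdp_event` messages from the caller's mailbox for a fixed window and keeps only those for one session; in a GenServer the same pattern match would sit in handle_info/2 instead.

```elixir
defmodule NetworkEventsSketch do
  @moduledoc false

  # Collects {method, params} pairs for `session_id` until `window_ms` elapses.
  def collect(session_id, window_ms) do
    deadline = System.monotonic_time(:millisecond) + window_ms
    do_collect(session_id, deadline, [])
  end

  defp do_collect(session_id, deadline, acc) do
    remaining = max(deadline - System.monotonic_time(:millisecond), 0)

    receive do
      # Same session as this page: keep the event.
      {:cdp_event, _conn, method, params, ^session_id} ->
        do_collect(session_id, deadline, [{method, params} | acc])

      # Another session on a shared connection: drop it.
      {:cdp_event, _conn, _method, _params, _other_session} ->
        do_collect(session_id, deadline, acc)
    after
      remaining -> Enum.reverse(acc)
    end
  end
end

# Intended use, observing before navigating:
#   :ok = CDPEx.Page.observe_network(page)
#   # ... trigger navigation ...
#   events = NetworkEventsSketch.collect(page.session_id, 2_000)
```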

pdf(page, opts \\ [])

@spec pdf(
  t(),
  keyword()
) :: {:ok, binary()} | {:error, term()}

Renders the page to PDF (Page.printToPDF).

Returns {:ok, data} where data is the PDF bytes — or, when :path is given, the written file path (also a binary). Options: :path, :landscape (default false), :print_background (default true), :timeout (default 30_000).

response_body(page, request_id, opts \\ [])

@spec response_body(t(), String.t(), keyword()) :: {:ok, binary()} | {:error, term()}

Returns a response's body by its request_id (from a Network.responseReceived event), via Network.getResponseBody.

Returns {:ok, body} (decoding base64 when Chrome sends it that way) or {:error, reason}. The Network domain must have been enabled (e.g. via observe_network/2) when the request was captured — unlike the other Network ops this does not lazily enable it, since enabling now can't recover a past body. If it wasn't enabled, the call surfaces as {:error, {:cdp_error, "Network.getResponseBody", _}}. Options: :timeout (default 10_000).
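The "decoding base64 when Chrome sends it that way" step refers to the shape of the raw Network.getResponseBody result, which carries a "body" string and a "base64Encoded" flag. The sketch below illustrates that decoding in isolation; `ResponseBodySketch` and the `:invalid_base64` reason are hypothetical and not taken from the library.

```elixir
defmodule ResponseBodySketch do
  @moduledoc false

  # Binary payloads (images, fonts, ...) arrive base64-encoded.
  def decode(%{"body" => body, "base64Encoded" => true}) do
    case Base.decode64(body) do
      {:ok, bytes} -> {:ok, bytes}
      :error -> {:error, :invalid_base64}
    end
  end

  # Text payloads arrive as-is.
  def decode(%{"body" => body}), do: {:ok, body}
end

# With CDPEx the decoding is already done for you:
#   {:ok, body} = CDPEx.Page.response_body(page, params["requestId"])
```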

screenshot(page, opts \\ [])

@spec screenshot(
  t(),
  keyword()
) :: {:ok, binary()} | {:error, term()}

Captures a PNG screenshot.

Returns {:ok, data} where data is the PNG bytes — or, when :path is given, the written file path (also a binary).

Options: :path (write to file), :full_page (capture beyond the viewport, default false), :timeout (default 30_000).

set_cookies(page, cookies, opts \\ [])

@spec set_cookies(t(), [map()], keyword()) :: :ok | {:error, term()}

Sets cookies. Each is a CDP CookieParam map — at least "name", "value", and a "url" or "domain". Lazily enables Network. Options: :timeout.

set_extra_headers(page, headers, opts \\ [])

@spec set_extra_headers(t(), %{optional(String.t()) => String.t()}, keyword()) ::
  :ok | {:error, term()}

Sets extra HTTP headers sent with every subsequent request on this page.

headers is a map of header name => value; set them before navigating for them to apply to that navigation. Lazily enables Network. Options: :timeout.

set_user_agent(page, user_agent, opts \\ [])

@spec set_user_agent(t(), String.t(), keyword()) :: :ok | {:error, term()}

Overrides the page's User-Agent (Emulation.setUserAgentOverride).

Options: :timeout (default 10_000).

set_viewport(page, width, height, opts \\ [])

@spec set_viewport(t(), pos_integer(), pos_integer(), keyword()) ::
  :ok | {:error, term()}

Overrides the viewport via Emulation.setDeviceMetricsOverride.

width/height are CSS pixels. Options: :device_scale_factor (default 1), :mobile (default false), :timeout. Returns :ok.

stop_observing_network(page, opts \\ [])

@spec stop_observing_network(
  t(),
  keyword()
) :: :ok

Stops observing — unsubscribes the caller from the network :events. Leaves the Network domain enabled.

Pass the same :events you gave observe_network/2. Both default to the request + response lifecycle, but if you observed with a custom list you must repeat it here — otherwise the original subscriptions are never removed and the caller keeps receiving those events.

text(page, css, opts \\ [])

@spec text(t(), String.t(), keyword()) :: {:ok, String.t() | nil} | {:error, term()}

Returns the textContent of the first element matching css, or nil when no element matches.

visible?(page, css, opts \\ [])

@spec visible?(t(), String.t(), keyword()) :: {:ok, boolean()} | {:error, term()}

Returns {:ok, true} when the first element matching css is rendered and visible (has layout boxes, not display: none / visibility: hidden), {:ok, false} otherwise — including when no element matches.

wait_for_function(page, js, opts \\ [])

@spec wait_for_function(t(), String.t(), keyword()) :: :ok | {:error, term()}

Polls a JavaScript expression until it is truthy, or timeout elapses.

The expression is coerced with !!(...), so JS truthiness applies. Returns :ok, {:error, :timeout}, or {:error, reason} if a non-transient evaluate error occurs (e.g. a thrown exception or a dropped connection). Options: :timeout (default 5_000), :interval (poll interval ms, default 100).
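The polling contract described here (and shared by wait_for_selector/3) can be illustrated with a generic loop. This is a sketch of the technique only, not the library's implementation: `PollSketch` and its `check` callback are hypothetical, and unlike the real function it treats every error as non-transient.

```elixir
defmodule PollSketch do
  @moduledoc false

  # check: zero-arity fun returning {:ok, boolean} | {:error, reason}.
  def poll(check, opts \\ []) do
    timeout = Keyword.get(opts, :timeout, 5_000)
    interval = Keyword.get(opts, :interval, 100)
    deadline = System.monotonic_time(:millisecond) + timeout
    loop(check, interval, deadline)
  end

  defp loop(check, interval, deadline) do
    case check.() do
      {:ok, true} ->
        :ok

      {:ok, _not_yet} ->
        if System.monotonic_time(:millisecond) >= deadline do
          {:error, :timeout}
        else
          Process.sleep(interval)
          loop(check, interval, deadline)
        end

      {:error, reason} ->
        # Simplification: stop on the first error of any kind.
        {:error, reason}
    end
  end
end

# A check built on evaluate/3 might look like:
#   PollSketch.poll(fn -> CDPEx.Page.evaluate(page, "!!(window.appReady)") end)
```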

wait_for_navigation(page, opts \\ [])

@spec wait_for_navigation(
  t(),
  keyword()
) :: :ok | {:error, term()}

Waits for a navigation lifecycle milestone, without issuing a navigation.

Useful after a click/3 (or other in-page action) that triggers navigation.

Options:

  • :wait_until - :network_almost_idle (default), :load, or :none
  • :timeout — ms (default 30_000)

Returns :ok, {:error, :timeout}, or {:error, reason} if the connection drops while waiting.

wait_for_network_idle(page, opts \\ [])

@spec wait_for_network_idle(
  t(),
  keyword()
) :: :ok | {:error, term()}

Blocks until the network has been idle — at most :max_inflight in-flight requests — for :idle_time ms continuously, or timeout.

The Puppeteer "networkidle" primitive: use it after a click/3 (or other action) that kicks off XHR/fetch hydration to wait for the page to settle. Idleness is measured from the call onward — requests already in flight when you call are not counted — so call it right after triggering the work.

Returns :ok once idle, {:error, :timeout} if it never settles within timeout (e.g. a streaming / SSE / long-poll connection that never closes), or {:error, reason} if the connection drops. Lazily enables the Network domain.

Don't combine with observe_network/2 on the same page

This subscribes the calling process to the Network request-lifecycle events for the duration of the call, then unsubscribes on the way out. If the same process is also running observe_network/2 on this page, that subscription is torn down. Drive the idle wait from a process that isn't also observing the network.

Options:

  • :idle_time — ms of continuous idleness required (default 500)
  • :max_inflight — in-flight requests still considered idle (default 0; 2 is Puppeteer's networkidle2)
  • :timeout — overall ceiling in ms (default 30_000)
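One way to follow the advice about not mixing the idle wait with observe_network/2 is to run the wait in a short-lived Task, so the temporary subscription belongs to the task's pid rather than the observer's. A minimal sketch, assuming that per-process keying works as described above; `IdleWaitSketch` is hypothetical, and the `wait_fun` parameter exists only so the helper can be exercised without a browser.

```elixir
defmodule IdleWaitSketch do
  @moduledoc false

  # wait_fun defaults to the documented wait_for_network_idle/2.
  def await_idle(page, opts \\ [], wait_fun \\ &CDPEx.Page.wait_for_network_idle/2) do
    timeout = Keyword.get(opts, :timeout, 30_000)

    # The wait runs in the task process, so its subscribe/unsubscribe
    # cycle does not touch the caller's own subscriptions.
    task = Task.async(fn -> wait_fun.(page, opts) end)

    # Give the task slightly longer than the wait's own ceiling.
    Task.await(task, timeout + 1_000)
  end
end

# Intended use from a process that is also observing the network:
#   :ok = CDPEx.Page.click(page, "#load-more")
#   :ok = IdleWaitSketch.await_idle(page, idle_time: 500, max_inflight: 2)
```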

wait_for_response(page, matcher, opts \\ [])

@spec wait_for_response(
  t(),
  (String.t() -> boolean()) | Regex.t() | String.t(),
  keyword()
) ::
  {:ok, map()} | {:error, term()}

Blocks until a network response whose URL matches matcher arrives, or timeout.

Useful after a click/3 (or other in-page action) that triggers an XHR/fetch: wait for the specific response, then read it with response_body/3 (the returned params carry the "requestId").

matcher selects on the response URL and may be:

  • a function (url :: String.t() -> boolean()),
  • a Regex (matched against the URL), or
  • a binary substring (matched with String.contains?/2).

Returns {:ok, params} — the full Network.responseReceived params (HTTP status under params["response"]["status"], request id under params["requestId"]) — or {:error, :timeout} if nothing matched in time, or {:error, reason} if the connection drops. Lazily enables the Network domain. Only responses observed after this call are considered, so call it before triggering the request.

Returns {:ok, params}, not a bare :ok

Unlike wait_for_navigation/2 / wait_for_selector/3 / wait_for_network_idle/2 (which return :ok), this returns the matched event — match on {:ok, params}.

Options: :timeout — ms (default 30_000).
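The three matcher forms can be summarised as one dispatch function. The sketch below mirrors the documented semantics for illustration; `MatcherSketch` is hypothetical and is not the library's internal code.

```elixir
defmodule MatcherSketch do
  @moduledoc false

  # A function matcher decides for itself.
  def matches?(matcher, url) when is_function(matcher, 1), do: matcher.(url) == true

  # A Regex is matched against the URL.
  def matches?(%Regex{} = matcher, url), do: Regex.match?(matcher, url)

  # A binary is treated as a substring.
  def matches?(matcher, url) when is_binary(matcher), do: String.contains?(url, matcher)
end

# Intended use with the real API:
#   {:ok, params} = CDPEx.Page.wait_for_response(page, ~r/\/api\/items/, timeout: 5_000)
#   status = params["response"]["status"]
#   {:ok, body} = CDPEx.Page.response_body(page, params["requestId"])
```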

wait_for_selector(page, css, opts \\ [])

@spec wait_for_selector(t(), String.t(), keyword()) :: :ok | {:error, term()}

Polls until css matches an element, or timeout elapses.

Returns :ok, {:error, :timeout}, or {:error, reason} if a non-transient evaluate error occurs (e.g. the connection drops). Options: :timeout (default 5_000), :interval (poll interval ms, default 100).